Hadoop is an open-source framework for storing, managing, and analyzing massive datasets across clusters of commodity hardware.
In this post, we’ll go through Hadoop’s three operational modes.
- Standalone mode: This is the default mode. All Hadoop components run in a single Java process against the local file system, so HDFS is not used at all. Standalone mode is therefore used mainly for debugging and development: both job input and job output live on the local file system.
- Pseudo-distributed mode: All Hadoop daemons run on a single machine, each in its own Java process. Because the NameNode and the DataNode live on the same host, this setup is also known as a single-node cluster.
With every daemon active on one node, this configuration is typically used for testing, where you don't have to worry about cluster resources or other users competing for them.
- Fully-distributed mode: This is the production mode, in which Hadoop runs across multiple nodes. Data is split among the nodes and processed locally on each one, and the master daemons (NameNode, ResourceManager) run on separate nodes from the worker daemons (DataNode, NodeManager).
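Switching from standalone to pseudo-distributed mode is mostly a matter of configuration. A sketch of the two key settings, following the official single-node setup (the `localhost:9000` address is the conventional example value; adjust it for your install):

```xml
<!-- etc/hadoop/core-site.xml: point the default FileSystem at a local HDFS NameNode -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- etc/hadoop/hdfs-site.xml: with a single DataNode, only one replica is possible -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

In a fully-distributed cluster the same two files change again: `fs.defaultFS` names the dedicated NameNode host instead of `localhost`, and `dfs.replication` goes back to a value greater than one so blocks are copied across DataNodes.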
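To make the standalone case concrete: because it needs no daemons and no HDFS, a freshly unpacked Hadoop can run a MapReduce job directly against local directories. A minimal sketch, run from the Hadoop installation directory (the examples jar ships with Hadoop; the wildcard stands in for whatever release version your install carries):

```shell
# Standalone mode: no daemons to start, no HDFS to format.
mkdir input
cp etc/hadoop/*.xml input       # use the stock config files as sample input

# Run the bundled word-count example; input and output are plain local directories.
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    wordcount input output

cat output/part-r-*             # results land on the local file system
```

Everything here executes inside one JVM, which is exactly why this mode is convenient for stepping through a job in a debugger.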