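This blog post compares HDFS Federation and HDFS High Availability, two features that address the limitations of running a single NameNode. High Availability removes the NameNode as a single point of failure, while Federation mainly adds namespace scalability and confines a NameNode failure to its own namespace.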
Relation between NameNodes
HDFS Federation places no fixed limit on the number of NameNodes. The NameNodes are independent and do not coordinate with one another.
HDFS High Availability uses two related NameNodes, an Active NameNode (the primary) and a Standby NameNode, which serve the same namespace. Both processes run at all times, but only the active one serves client requests.
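The difference in how the NameNodes relate shows up in `hdfs-site.xml`. The Python sketch below only builds the relevant property names and values; the helper functions, host names, and IDs such as `ns1`, `mycluster`, and `nn1` are assumptions made for illustration, and a real HA setup needs further properties (shared edits directory, fencing, failover proxy provider) that are left out here.

```python
def federation_properties(namenodes):
    """Properties for independent, federated NameNodes.

    `namenodes` maps a nameservice ID to that NameNode's RPC address.
    """
    props = {"dfs.nameservices": ",".join(namenodes)}
    for nameservice, rpc_address in namenodes.items():
        # One NameNode per nameservice; no property links one to another.
        props[f"dfs.namenode.rpc-address.{nameservice}"] = rpc_address
    return props


def ha_properties(nameservice, namenodes):
    """Properties for one nameservice served by an HA pair.

    `namenodes` maps a NameNode ID (for example "nn1") to its RPC address.
    """
    props = {
        "dfs.nameservices": nameservice,
        # Both NameNodes are tied together under a single logical nameservice.
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
    }
    for nn_id, rpc_address in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = rpc_address
    return props


# Hypothetical hosts, for illustration only.
fed = federation_properties({"ns1": "nn-a.example.com:8020",
                             "ns2": "nn-b.example.com:8020"})
ha = ha_properties("mycluster", {"nn1": "nn-a.example.com:8020",
                                 "nn2": "nn-b.example.com:8020"})
```

In the Federation case each nameservice resolves to exactly one NameNode. In the HA case a single nameservice lists two NameNode IDs, which is what lets clients fail over between them.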
In Federation, each NameNode has its own dedicated block pool, while the DataNodes are shared by all NameNodes and store blocks for every pool.
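In High Availability, only one NameNode is active at a time. The standby NameNode does not serve clients; it periodically applies the active NameNode's edits to keep its metadata up to date, so it can take over when needed.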
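The active/standby relationship can be pictured with a small toy model. This is not how HDFS is implemented: `ToyNameNode`, `ToyHAPair`, and the `shared_edits` list (a stand-in for whatever shared edit log the cluster uses) are hypothetical names used only to show the idea of one active node plus a standby that replays edits.

```python
class ToyNameNode:
    """Toy stand-in for a NameNode; not the real HDFS implementation."""

    def __init__(self, name):
        self.name = name
        self.state = "standby"
        self.namespace = {}     # path -> metadata
        self.applied_txid = 0   # number of edits applied so far


class ToyHAPair:
    def __init__(self, active, standby):
        self.shared_edits = []  # stand-in for the shared edit log
        self.active, self.standby = active, standby
        active.state = "active"

    def write(self, path, meta):
        """Clients only ever talk to the active NameNode."""
        self.shared_edits.append((path, meta))
        self.active.namespace[path] = meta
        self.active.applied_txid = len(self.shared_edits)

    def tail_edits(self):
        """The standby replays the edits it has not applied yet."""
        for path, meta in self.shared_edits[self.standby.applied_txid:]:
            self.standby.namespace[path] = meta
        self.standby.applied_txid = len(self.shared_edits)

    def failover(self):
        """Catch the standby up, then swap roles so only one node is active."""
        self.tail_edits()
        self.active.state = "standby"
        self.active, self.standby = self.standby, self.active
        self.active.state = "active"


nn1, nn2 = ToyNameNode("nn1"), ToyNameNode("nn2")
pair = ToyHAPair(active=nn1, standby=nn2)
pair.write("/data/a", {"replication": 3})
pair.tail_edits()   # standby now knows about /data/a
pair.failover()     # nn2 becomes active, nn1 becomes standby
```

Because the standby has already applied most edits before a failover, it only needs to replay the last few to take over.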
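HDFS Federation offers fault isolation: if one NameNode goes down, the namespaces and data managed by the other NameNodes are not impacted. The failed NameNode's own namespace remains unavailable until it recovers, unless High Availability is also configured for it.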
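A toy model can make the Federation behavior concrete: separate NameNodes, each with its own pool of blocks on shared DataNodes, and a failure that affects only one namespace. The class `ToyFederatedCluster` and the simplified pool IDs of the form `BP-<nameservice>` are assumptions for illustration; real block pool IDs are generated by HDFS and look different.

```python
class ToyFederatedCluster:
    """Toy model: independent NameNodes over a shared set of DataNodes."""

    def __init__(self, nameservices):
        # Each NameNode owns one namespace and is either up or down.
        self.up = {ns: True for ns in nameservices}
        # The shared DataNodes store blocks for every pool, keyed by pool ID.
        self.datanode_blocks = {f"BP-{ns}": [] for ns in nameservices}

    def add_block(self, ns, block_id):
        if not self.up[ns]:
            raise RuntimeError(f"NameNode for {ns} is down")
        self.datanode_blocks[f"BP-{ns}"].append(block_id)

    def fail(self, ns):
        """Mark one NameNode as down; the others are untouched."""
        self.up[ns] = False

    def available_namespaces(self):
        return [ns for ns, is_up in self.up.items() if is_up]


cluster = ToyFederatedCluster(["ns1", "ns2"])
cluster.add_block("ns1", "blk_1")
cluster.add_block("ns2", "blk_2")
cluster.fail("ns1")
cluster.add_block("ns2", "blk_3")  # ns2 keeps working
```

After `ns1` fails, `ns2` still accepts writes, while the blocks of `ns1` stay on the DataNodes but cannot be reached through its NameNode.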
HDFS High Availability requires two separate machines, one for the Active (primary) NameNode and one for the Standby NameNode. The primary NameNode is configured first, followed by the standby NameNode on the other machine.
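The setup order can be sketched as an ordered list of commands. The helper `ha_bootstrap_plan` and the host names are hypothetical, and the plan rests on several assumptions: a fresh cluster, Hadoop 3 style `hdfs --daemon` commands, shared edits storage that is already running, and manual failover (with automatic failover the last step would be handled by the failover controller instead). The function only returns the plan; it does not execute anything.

```python
def ha_bootstrap_plan(first_host, second_host, first_id="nn1"):
    """Return (host, command) pairs in the order they would be run."""
    return [
        # Format and start the first NameNode.
        (first_host, "hdfs namenode -format"),
        (first_host, "hdfs --daemon start namenode"),
        # Copy the formatted namespace over to the second machine.
        (second_host, "hdfs namenode -bootstrapStandby"),
        (second_host, "hdfs --daemon start namenode"),
        # With manual failover, one NameNode is promoted explicitly.
        (first_host, f"hdfs haadmin -transitionToActive {first_id}"),
    ]


# Hypothetical host names, for illustration only.
plan = ha_bootstrap_plan("nn-a.example.com", "nn-b.example.com")
for host, command in plan:
    print(f"{host}: {command}")
```

The second NameNode is bootstrapped from the first rather than formatted on its own, so both start from the same namespace.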