Hadoop components are rack-aware. For example, HDFS block placement uses rack awareness for fault tolerance by placing one block replica on a different rack. This keeps data available in the event of a network switch failure or partition within the cluster.

Use the following instructions to configure rack awareness on an HDFS cluster.

Create a Rack Topology Script

Hadoop uses a topology script to determine the rack location of each node, and uses that information to place block replicas on separate racks.

  1. Create a topology script and data file. The topology script must be executable.

    Sample Topology Script Named rack-topology.sh

    #!/bin/bash
    
    # Add the property "net.topology.script.file.name" to core-site.xml,
    # pointing at the absolute path of this file. Ensure the file is
    # executable.
    
    # Supply an appropriate rack prefix
    RACK_PREFIX=default
    
    # To test, supply a hostname as script input:
    if [ $# -gt 0 ]; then
    
      CTL_FILE=${CTL_FILE:-"rack_topology.data"}
    
      HADOOP_CONF=${HADOOP_CONF:-"/etc/hadoop/conf"}
    
      # Fall back to the default rack if the data file is missing
      if [ ! -f ${HADOOP_CONF}/${CTL_FILE} ]; then
        echo -n "/$RACK_PREFIX/rack "
        exit 0
      fi
    
      # Look up each argument in the data file and emit its rack path
      while [ $# -gt 0 ] ; do
        nodeArg=$1
        exec < ${HADOOP_CONF}/${CTL_FILE}
        result=""
        while read line ; do
          ar=( $line )
          if [ "${ar[0]}" = "$nodeArg" ] ; then
            result="${ar[1]}"
          fi
        done
        shift
        # Hosts not found in the data file fall back to the default rack
        if [ -z "$result" ] ; then
          echo -n "/$RACK_PREFIX/rack "
        else
          echo -n "/$RACK_PREFIX/rack_$result "
        fi
      done
    
    else
      echo -n "/$RACK_PREFIX/rack "
    fi

    Sample Topology Data File Named rack_topology.data

    # This file should be:
    # - Placed in the /etc/hadoop/conf directory
    # - On the NameNode (and any standby/failover NameNodes)
    # - On the JobTracker or ResourceManager (and any failover JTs/RMs)
    
    # Add hostnames to this file. Format: <host ip> <rack_location>
    192.168.2.1 01
    192.168.2.2 02
    192.168.2.3 03
  2. Copy both of these files to the /etc/hadoop/conf directory on all cluster nodes.

  3. Run the rack-topology.sh script to ensure that it returns the correct rack information for each host.
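
    For example, with the sample data file above in place and the script made executable, a listed host should return its rack, while an unknown host falls back to the default rack:

    chmod 755 /etc/hadoop/conf/rack-topology.sh
    /etc/hadoop/conf/rack-topology.sh 192.168.2.1
    # prints: /default/rack_01
    /etc/hadoop/conf/rack-topology.sh 192.168.9.9
    # prints: /default/rack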

Add the Topology Script Property to core-site.xml

  1. Stop HDFS using the following command:

    stop-dfs.sh
  2. Add the following property to core-site.xml:

    <property>
      <name>net.topology.script.file.name</name> 
      <value>/etc/hadoop/conf/rack-topology.sh</value>
    </property>
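
    You can optionally confirm that the property is visible to the Hadoop client configuration; hdfs getconf reads the same core-site.xml:

    hdfs getconf -confKey net.topology.script.file.name
    # expected output: /etc/hadoop/conf/rack-topology.sh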

By default, Hadoop passes the topology script up to 100 requests per invocation. To specify a different number of requests, set the net.topology.script.number.args property. For example:

<property> 
  <name>net.topology.script.number.args</name> 
  <value>75</value>
</property>
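
Because requests are batched, the script may receive several hosts in a single invocation and must print one rack path per argument, which the sample script above does. A quick test:

/etc/hadoop/conf/rack-topology.sh 192.168.2.1 192.168.2.2 192.168.2.3
# prints: /default/rack_01 /default/rack_02 /default/rack_03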

Restart HDFS and MapReduce

Restart HDFS, and MapReduce if it runs on the cluster (on Hadoop 2, start-yarn.sh starts the YARN services that run MapReduce), using the following commands:

start-dfs.sh
start-yarn.sh

Verify Rack Awareness

After the services have started, you can use the following methods to verify that rack awareness has been activated:

  1. Look in the NameNode logs located in /var/log/hadoop/hdfs/. For example: hadoop-hdfs-namenode-sandbox.log. You should see an entry like this:

    2016-12-13 00:07:31,275 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default/rack_01/192.168.0.1:50010
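
    A quick way to find these entries (log location as above; adjust the path for your installation):

    grep "Adding a new node" /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log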
  2. The hdfs fsck command should return something like the following (if there are two racks):

    hdfs fsck / -racks
    Connecting to namenode via http://192.168.0.1:50070/fsck?ugi=hadoop&racks=1&path=%2F
    FSCK started by hadoop (auth:SIMPLE) from /192.168.0.1 for path / at Tue Dec 13 00:11:19 PST 2016
    ......................................................................
    ......................................................................
    ......................................................................
    ....................................................Status: HEALTHY
     Total size:    136317447128 B
     Total dirs:    17
     Total files:    562
     Total symlinks:        0
     Total blocks (validated):    1361 (avg. block size 100159770 B)
     Minimally replicated blocks:    1361 (100.0 %)
     Over-replicated blocks:    0 (0.0 %)
     Under-replicated blocks:    0 (0.0 %)
     Mis-replicated blocks:        0 (0.0 %)
     Default replication factor:    2
     Average block replication:    1.9764879
     Corrupt blocks:        0
     Missing replicas:        0 (0.0 %)
     Number of data-nodes:        3
     Number of racks:        2
    FSCK ended at Tue Dec 13 00:11:19 PST 2016 in 126 milliseconds
  3. The hdfs dfsadmin -report command returns a report that includes the rack name next to each machine. The report should look something like the following excerpt:

    $ hdfs dfsadmin -report
    Configured Capacity: 138556755984384 (126.02 TB)
    Present Capacity: 138546270113792 (126.01 TB)
    DFS Remaining: 138275126161408 (125.76 TB)
    DFS Used: 271143952384 (252.52 GB)
    DFS Used%: 0.20%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 2

    -------------------------------------------------
    Live datanodes (3):

    Name: 192.168.0.3:50010 (datanode2)
    Hostname: datanode2
    Rack: /default/rack_02
    Decommission Status : Normal
    Configured Capacity: 6594653847552 (6.00 TB)
    DFS Used: 133752802199 (124.57 GB)
    ...
    Last contact: Tue Dec 13 00:14:02 PST 2016


    Name: 192.168.0.2:50010 (datanode1)
    Hostname: datanode1
    Rack: /default/rack_01
    ...
    Last contact: Tue Dec 13 00:14:01 PST 2016


    Name: 192.168.0.1:50010 (namenode)
    Hostname: namenode
    Rack: /default/rack_01
    ...
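
  4. The hdfs dfsadmin -printTopology command prints the rack-to-DataNode mapping as seen by the NameNode. For the example cluster above, the output would look something like this:

    $ hdfs dfsadmin -printTopology
    Rack: /default/rack_01
       192.168.0.1:50010 (namenode)
       192.168.0.2:50010 (datanode1)
    
    Rack: /default/rack_02
       192.168.0.3:50010 (datanode2)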
