In Hadoop's hdfs-site.xml, there is a very important parameter called dfs.datanode.data.dir (named dfs.data.dir in older releases; that name is now deprecated).

This property determines where on the local filesystem a DFS DataNode should store its blocks. If it is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. The directories should be tagged with the corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS storage policies. If a directory is not explicitly tagged with a storage type, the default storage type is DISK. Directories that do not exist will be created if the local filesystem permissions allow it.

The recognized format is a file:// URI. The default value is:

file://${hadoop.tmp.dir}/dfs/data

For multiple entries, separate the entries with commas. Here is an example:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///disk/c0t2,file:///disk/c0t3,file:///disk/c0t4,file:///disk/c0t5</value>
</property>
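
Storage type tags are written as a prefix on each entry. The following is only a sketch: the /grid/dn/... mount points are hypothetical and chosen for illustration. The last entry carries no tag, so it is treated as DISK:

<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Hypothetical paths; the untagged last entry defaults to DISK -->
  <value>[SSD]file:///grid/dn/ssd0,[ARCHIVE]file:///grid/dn/archive0,file:///grid/dn/disk0</value>
</property>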

How does dfs.datanode.data.dir work with multiple entries?

When dfs.datanode.data.dir has multiple values, the DataNode writes new blocks to the directories in a round-robin fashion. If the disk behind one of the directories is full, the round-robin writes continue across the remaining directories.
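
The round-robin behavior corresponds to the DataNode's volume choosing policy. In recent Hadoop releases round-robin is the default, so this property normally does not need to be set. As a sketch, spelling it out explicitly would look roughly like this:

<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <!-- Default policy; shown explicitly only for illustration -->
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.RoundRobinVolumeChoosingPolicy</value>
</property>

If the disks differ a lot in free space, the same property can instead name org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy, which takes available disk space into account when choosing a directory. Check the hdfs-default.xml shipped with your release before relying on either class name.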