Check Directory Size in HDFS

On a Cloudera virtual machine, use the following command to get the size of a directory in HDFS:

hadoop fs -du -s -h /directory

Examples:
hadoop fs -du -s -h /user/cloudera
hadoop fs -du -s -h /user
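
When the goal is to find which child directory takes the most space, it can help to drop -s and -h so that each entry is printed on its own line in raw bytes, then sort the result. The sketch below is only an illustration: largest_children is a hypothetical helper name, the sample numbers are made up, and the three-column layout (size, space consumed across replicas, path) is assumed from recent Hadoop releases. Older releases may print only the size and the path, which the helper also handles because it reads the first and last fields. Paths containing spaces are not handled.

```shell
#!/bin/sh
# largest_children: hypothetical helper, not part of Hadoop.
# Reads `-du` output on stdin (size in the first field, path in the last)
# and prints the N largest entries, biggest first. N defaults to 5.
largest_children() {
  sort -k1,1nr | head -n "${1:-5}" | awk '{ print $1, $NF }'
}

# Made-up sample standing in for the output of: hadoop fs -du /user
largest_children 2 <<'EOF'
1048576 3145728 /user/hive
832319138 2496957414 /user/cloudera
2048 6144 /user/hdfs
EOF
# prints:
# 832319138 /user/cloudera
# 1048576 /user/hive
```

On a real cluster the same helper would be fed from the command itself, for example: hadoop fs -du /user | largest_children 5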

In other environments and in recent versions, the equivalent hdfs dfs form is commonly used:
hdfs dfs -du -s -h /path/to/dir
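
Because hadoop fs and hdfs dfs both accept the same -du options, a script that has to run in more than one environment can pick whichever client is installed. The following is a minimal sketch under that assumption; dir_size is a hypothetical name, and the function only forwards the path to the command shown in this article.

```shell
#!/bin/sh
# dir_size: hypothetical wrapper, not part of Hadoop.
# Prefers `hdfs dfs` and falls back to `hadoop fs`.
dir_size() {
  if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -du -s -h "$1"
  elif command -v hadoop >/dev/null 2>&1; then
    hadoop fs -du -s -h "$1"
  else
    echo "neither hdfs nor hadoop was found" >&2
    return 127
  fi
}

# Usage on a machine with a configured Hadoop client:
#   dir_size /user/cloudera
```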

Options:

  • The -s option displays an aggregate summary of file lengths instead of the individual files. Without -s, the sizes are calculated one level deep from the supplied path, with one line per entry.
  • The -h option formats file sizes in a human-readable way, for example 793.8 MB rather than 832319138.
  • The -v option adds a header line with the names of the columns.
  • The -x option excludes snapshots from the result calculation. Without -x (the default), the result is always calculated from all INodes, including all snapshots under the supplied path.
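
To make the effect of -h concrete, the conversion can be approximated in plain shell. This is a sketch, not Hadoop's own code: human_size is a hypothetical helper that assumes binary units (powers of 1024) and one decimal place, which matches the 832319138 example quoted in the list above.

```shell
#!/bin/sh
# human_size: hypothetical helper that approximates the -h formatting.
# Divides by 1024 until the value drops below 1024, then prints it
# with one decimal place and a unit letter. Plain bytes print as-is.
human_size() {
  awk -v bytes="$1" 'BEGIN {
    split("B K M G T P", unit, " ")
    i = 1
    while (bytes >= 1024 && i < 6) { bytes /= 1024; i++ }
    if (i == 1) printf "%d\n", bytes
    else printf "%.1f %s\n", bytes, unit[i]
  }'
}

human_size 832319138   # prints: 793.8 M
```

The options can also be combined on one command line, for example hadoop fs -du -s -h -x /user/cloudera, provided the installed Hadoop version supports each of them.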

Hope you find this article helpful.
