Check Directory Size in HDFS

On Cloudera virtual machines, use the following command to get the size of a directory in HDFS:

hadoop fs -du -s -h /directory

Examples:
hadoop fs -du -s -h /user/cloudera
hadoop fs -du -s -h /user

In other environments and recent Hadoop versions, the equivalent hdfs dfs command can be used:
hdfs dfs -du -s -h /path/to/dir
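Since the right front end depends on the environment, a small wrapper can pick whichever one is installed. The following is only a minimal sketch: hdfs_shell and dir_size are hypothetical helper names invented for this example, not part of Hadoop, and it assumes the default filesystem is HDFS so that both front ends report the same sizes.

```shell
# Print the HDFS shell front end found on PATH: "hdfs dfs" if the hdfs
# launcher exists, otherwise "hadoop fs". Returns 1 if neither is installed.
hdfs_shell() {
  if command -v hdfs >/dev/null 2>&1; then
    echo "hdfs dfs"
  elif command -v hadoop >/dev/null 2>&1; then
    echo "hadoop fs"
  else
    return 1
  fi
}

# dir_size is a hypothetical helper: it runs "-du -s -h" on the given
# path with whichever front end hdfs_shell found.
dir_size() {
  dir_size_cli=$(hdfs_shell) || {
    echo "no hdfs or hadoop command on PATH" >&2
    return 1
  }
  # Left unquoted on purpose so "hdfs dfs" splits into command + subcommand.
  $dir_size_cli -du -s -h "$1"
}
```

With the functions loaded, dir_size /user/cloudera would run the same check as the examples above.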

Options:

  • The -s option displays a single aggregate summary of file lengths instead of listing individual files. Without -s, the sizes are calculated one level deep from the supplied path, so each direct child gets its own line.
  • The -h option formats file sizes in a human-readable way. For example, a size of 832319138 bytes is displayed as roughly 793.8 MB.
  • The -v option adds a header line with the names of the columns.
  • The -x option excludes snapshots from the result calculation. Without -x (the default), the result is always calculated from all INodes, including all snapshots under the supplied path.
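To make the -s and -h options more concrete, here is a rough sketch of what they do, written as plain shell and awk so it runs without a cluster. The function names (sum_du_bytes, human_size, sample_du_output) and the sample listing are made up for illustration, and the rounding only approximates what Hadoop prints.

```shell
# Sum the first column (size in bytes) of plain "-du" output read from stdin.
# Assumes -h was NOT used, so the first column is a raw byte count.
sum_du_bytes() {
  awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Convert a byte count into a short human-readable form using 1024 steps,
# similar in spirit to what -h does.
human_size() {
  awk -v b="$1" 'BEGIN {
    split("B K M G T P", u, " ")
    i = 1
    while (b >= 1024 && i < 6) { b /= 1024; i++ }
    if (i == 1) printf "%d\n", b
    else        printf "%.1f %s\n", b, u[i]
  }'
}

# Hypothetical sample of a per-child listing (size, space used, path).
sample_du_output() {
  cat <<'EOF'
1048576  3145728  /user/cloudera/a
2097152  6291456  /user/cloudera/b
EOF
}

sample_du_output | sum_du_bytes   # prints 3145728
human_size 832319138              # prints 793.8 M
```

On a real cluster, piping hadoop fs -du /user/cloudera into sum_du_bytes should give about the same figure as the first column of hadoop fs -du -s /user/cloudera, although snapshots may make the two differ.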

Hope you find this article helpful.

