...
Again, that's not a one-size-fits-all formula, but it is a one-size-fits-MOST one that has been used rather universally with great success. Obviously, there are many things to consider when setting this up, and just as many strategies for reducing the number of files in the first place, such as concatenating files together, compressing many files into a single binary (most useful when the files are purely for long-term storage), leveraging Hadoop Archives, HDFS Federation, and so on. All that said, it is not uncommon to see a NN with a 96 GB heap managing 100+ million files.
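If it helps to see that ratio as numbers, here is a minimal sketch (Python, purely illustrative); the function name, the simple linear scaling from that 96 GB / 100+ MM files data point, and the 4 GB floor (suggested just below) are my assumptions, not the actual sizing formula referenced above.

```python
# Purely illustrative: linearly extrapolates from the "96 GB heap for
# 100+ million files" data point above. The real sizing formula considers
# more than just file count, so treat this as a ballpark, not a recommendation.
GB_PER_MILLION_FILES = 96 / 100.0   # assumed ratio taken from the example above

def ballpark_nn_heap_gb(nbr_files_in_millions):
    # Never go below the ~4 GB floor suggested for even small POC clusters.
    return max(4.0, nbr_files_in_millions * GB_PER_MILLION_FILES)

for files_mm in (1, 10, 50, 100):
    print(f"{files_mm:>4} MM files -> ~{ballpark_nn_heap_gb(files_mm):.0f} GB NN heap")
```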
Not to prescribe a setting for your cluster, but even for early POC efforts, increase your NN heap size to at least 4096 MB. If you are using Ambari, the following screenshot shows where to find this setting.
...
I'll save a richer description of inodes, the pros and cons of increasing this value, and the actual procedure(s) to use for a stronger Linux administrator than I, but again, the question raised to me was: "will the inode size matter on the (underlying) HDFS file system mounts". The short answer is that, in a cluster with more than a trivial number of DataNodes (DNs), inodes should not cause an issue. Basically, this is because *all* of the files will be spread across all the DNs *and* each of these will have multiple physical spindles to further spread the love.
Of course, things are often more complicated when we dive deeper. First, each file will be broken up into appropriately sized blocks, and each of those blocks will (with the default replication factor of 3) have three copies. So, our example file above with 3 blocks will need 9 separate physical files stored by the DNs. As you peel back the onion, you'll see there really are two files stored for each block: the block itself and a second file that contains metadata and checksum values. In fact, it gets a tiny bit more complicated than that because the DNs need to maintain a directory structure so they don't overrun a flat directory. Chapter 10 of Hadoop: The Definitive Guide (3rd Edition) has a good write-up on this, as you can see here, which is further visualized by the abbreviated directory listing from a DN's block data below.
...
To generalize (and let's be clear... that's what I'm doing here -- just creating a rough formula to make sure we are in the right ballpark), we can say that we'll need roughly 2.1 inodes for each copy of a block (the block file, its metadata/checksum file, and a small allowance for the directory structure); with a replication factor of 3, that works out to 6.3 inodes for every block we'd expect to load into HDFS.
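To make those two fudge factors concrete, here is a tiny sketch; the constant names are mine, and the 0.1 directory allowance is intentionally rough.

```python
# Illustrative constants for the rough factors described above. The 0.1
# "directory overhead" allowance is a deliberate fudge, not a measurement.
INODES_PER_BLOCK_REPLICA = 1 + 1 + 0.1   # block file + metadata/checksum file + dir overhead
REPLICATION_FACTOR = 3                   # HDFS default

INODES_PER_LOGICAL_BLOCK = INODES_PER_BLOCK_REPLICA * REPLICATION_FACTOR

print(INODES_PER_BLOCK_REPLICA)   # 2.1
print(INODES_PER_LOGICAL_BLOCK)   # ~6.3
```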
To start the math, let's calculate NbrOfAvailBlocksPerCluster, which is simply the inode limit per disk TIMES the number of disks in a DN TIMES the number of DNs in the cluster, DIVIDED by the 6.3 value described above. For example, the following values surface for a cluster whose DNs each have 10 disks and whose JBOD file system partitions can support 1 million inodes apiece.
...
Now, if you are immediately worried that these calculations suggest a cluster with 30 DNs can only hold 20 MM files (assuming 2.5 blocks per file on average), then simply raise the inode setting to something larger. If the inode setting were raised to 100 MM, that same 30-DN cluster would allow for 1.9 BILLION files. At a 128 MB block size, those files would represent roughly twice as much data as the cluster could actually hold, so the inodes would not be the limiting factor.
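If you want to double-check those numbers yourself, here is a quick back-of-the-envelope sketch using the same assumptions as the example (30 DNs, 10 disks per DN, replication factor 3, and ~2.5 blocks per file).

```python
# Back-of-the-envelope check of the two scenarios above:
# 30 DNs, 10 disks per DN, ~6.3 inodes per logical block (replication
# factor 3 x ~2.1 inodes per replica), and ~2.5 blocks per file on average.
for inodes_per_disk in (1_000_000, 100_000_000):
    total_inodes = inodes_per_disk * 10 * 30
    avail_blocks = total_inodes / 6.3
    files = avail_blocks / 2.5
    print(f"{inodes_per_disk:>11,} inodes/disk -> ~{files:,.0f} files")
# -> roughly 19 MM files at 1 MM inodes/disk and 1.9 billion at 100 MM inodes/disk
```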
So, let's see if we can write that down as an algebraic formula. Remember, we're just thinking about inodes here -- the disk sizing calculation is MUCH easier.
NbrOfFilesInodeSettingCanSupport = ( ( InodeSettingPerDisk * NbrDisksPerDN * NbrDNs ) / ( ReplFactor * 2.1 ) ) / AvgNbrBlocksPerFile
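And, if you'd rather run it than read it, here is the same formula as a small Python function; the parameter names simply mirror the variables above, and the defaults are just this post's example values.

```python
# Straight transcription of the formula above; 2.1 is the rough
# inodes-per-block-replica factor (block file + metadata/checksum file
# + a little directory overhead).
def nbr_of_files_inode_setting_can_support(inode_setting_per_disk,
                                           nbr_disks_per_dn,
                                           nbr_dns,
                                           repl_factor=3,
                                           avg_nbr_blocks_per_file=2.5):
    total_inodes = inode_setting_per_disk * nbr_disks_per_dn * nbr_dns
    return (total_inodes / (repl_factor * 2.1)) / avg_nbr_blocks_per_file

# The 30-DN example from above
print(nbr_of_files_inode_setting_can_support(1_000_000, 10, 30))   # ~19 MM files
```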
...