Exploring the HDFS Default Value Behaviour

In this post I’ll explore how HDFS default values really work. I found the behaviour quite surprising and non-intuitive, so there’s a good lesson here.

My setup is a local virtual Hadoop cluster plus a separate client machine outside the cluster. I’ll copy a file from the external client into the cluster under two different client configurations.

All the nodes of my Hadoop (1.2.1) cluster have the same hdfs-site.xml file with the same (non-default) value for dfs.block.size (renamed to dfs.blocksize in Hadoop 2.x) of 134217728, which is 128MB. My external client node also has the Hadoop executables, but only a minimal hdfs-site.xml.
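
For reference, the relevant part of the cluster-side hdfs-site.xml looks roughly like this (property name as in Hadoop 1.x; the client experiments below set the same property to different values):

<configuration>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
  </property>
</configuration>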

First, I set dfs.block.size to 268435456 (256MB) in my client’s hdfs-site.xml and copied a 400MB file to HDFS:

 ./hadoop fs -copyFromLocal /sw/400MB.file /user/ofir

Checking its block size from the NameNode:

./hadoop fsck /user/ofir/400MB.file -files -blocks -racks
FSCK started by root from /10.0.1.111 for path /user/ofir/400MB.file at Thu Jan 30 22:21:29 UTC 2014
/user/ofir/400MB.file 419430400 bytes, 2 block(s):  OK
0. blk_-5656069114314652598_27957 len=268435456 repl=3 [/rack03/10.0.1.133:50010, /rack03/10.0.1.132:50010, /rack02/10.0.1.122:50010]
1. blk_3668240125470951962_27957 len=150994944 repl=3 [/rack03/10.0.1.133:50010, /rack03/10.0.1.132:50010, /rack01/10.0.1.113:50010]

Status: HEALTHY
 Total size:    419430400 B
 Total dirs:    0
 Total files:    1
 Total blocks (validated):    2 (avg. block size 209715200 B)
 Minimally replicated blocks:    2 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:    0 (0.0 %)
 Mis-replicated blocks:        0 (0.0 %)
 Default replication factor:    1
 Average block replication:    3.0
 Corrupt blocks:        0
 Missing replicas:        0 (0.0 %)
 Number of data-nodes:        6
 Number of racks:        3
FSCK ended at Thu Jan 30 22:21:29 UTC 2014 in 1 milliseconds

So far, so good – the first block is indeed 256MB.

Now, let’s remove the dfs.block.size entry from the hdfs-site.xml file on the client. I expected that, since the client no longer specifies a value, it would not request any particular block size, so the actual block size would be whatever is configured on the NameNode and DataNodes – 128MB. However, this is what I got:

./hadoop fs -copyFromLocal /sw/400MB.file /user/ofir/400MB.file2

Checking from the NameNode:

# ./hadoop fsck /user/ofir/400MB.file2 -files -blocks -racks
FSCK started by root from /10.0.1.111 for path /user/ofir/400MB.file2 at Thu Jan 30 22:27:52 UTC 2014
/user/ofir/400MB.file2 419430400 bytes, 7 block(s):  OK
0. blk_1787803769905799654_27959 len=67108864 repl=3 [/rack03/10.0.1.132:50010, /rack03/10.0.1.133:50010, /rack01/10.0.1.113:50010]
1. blk_3875160333629322317_27959 len=67108864 repl=3 [/rack01/10.0.1.112:50010, /rack01/10.0.1.113:50010, /rack03/10.0.1.132:50010]
2. blk_-3214658865520475261_27959 len=67108864 repl=3 [/rack03/10.0.1.132:50010, /rack03/10.0.1.133:50010, /rack02/10.0.1.122:50010]
3. blk_-3454099016147726286_27959 len=67108864 repl=3 [/rack03/10.0.1.132:50010, /rack03/10.0.1.133:50010, /rack01/10.0.1.113:50010]
4. blk_-4422811025086071415_27959 len=67108864 repl=3 [/rack01/10.0.1.112:50010, /rack01/10.0.1.113:50010, /rack02/10.0.1.122:50010]
5. blk_8134056524051282216_27959 len=67108864 repl=3 [/rack02/10.0.1.123:50010, /rack02/10.0.1.122:50010, /rack03/10.0.1.132:50010]
6. blk_2811491227269477239_27959 len=16777216 repl=3 [/rack02/10.0.1.122:50010, /rack02/10.0.1.123:50010, /rack03/10.0.1.133:50010]

Status: HEALTHY
 Total size:    419430400 B
 Total dirs:    0
 Total files:    1
 Total blocks (validated):    7 (avg. block size 59918628 B)
 Minimally replicated blocks:    7 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:    0 (0.0 %)
 Mis-replicated blocks:        0 (0.0 %)
 Default replication factor:    1
 Average block replication:    3.0
 Corrupt blocks:        0
 Missing replicas:        0 (0.0 %)
 Number of data-nodes:        6
 Number of racks:        3
FSCK ended at Thu Jan 30 22:27:52 UTC 2014 in 2 milliseconds

The filesystem under path '/user/ofir/400MB.file2' is HEALTHY

Well, the file definitely has a block size of 64MB (67108864 bytes), not 128MB! 64MB happens to be the default value of dfs.block.size in Hadoop 1.x.

It seems that the value of dfs.block.size is dictated directly by the client, regardless of the cluster setting. If the client doesn’t specify a value, it simply falls back to its own built-in default rather than asking the NameNode. This behaviour is not specific to this parameter – the same thing happens with dfs.replication and others.
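
Since the client has the final say, you can also override these values per command with the generic -D option instead of editing hdfs-site.xml. A rough sketch (the property name is the Hadoop 1.x one, and the target path 400MB.file3 is just a hypothetical example):

./hadoop fs -D dfs.block.size=134217728 -copyFromLocal /sw/400MB.file /user/ofir/400MB.file3
./hadoop fs -stat %o /user/ofir/400MB.file3

If the override takes effect, the second command should report 134217728.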

So, if you ever wondered why you must keep a full copy of the Hadoop config files on every client instead of just pointing it at the NameNode and JobTracker, now you know. Go double-check your client setup!
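
If you are on a Hadoop 2.x client, one quick sanity check (a sketch, assuming the hdfs script is on your path) is to ask the client-side configuration directly:

hdfs getconf -confKey dfs.blocksize
hdfs getconf -confKey dfs.replication

Whatever these print on the client machine is what new files written from that client will get, regardless of what the cluster nodes have configured.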

Corrections and comments are welcome!

P.S. Just as I was publishing this post, I found a shorter way to check the block size of a file, which also works from any client:

# ./hadoop fs -stat %o /user/ofir/400MB.file
268435456
# ./hadoop fs -stat %o  /user/ofir/400MB.file2
67108864

One thought on “Exploring the HDFS Default Value Behaviour”

  1. Nice post!
     One even simpler way to verify this on a machine in the cluster is to use the generic options:
     1. Run: hadoop fs -Ddfs.blocksize=67108864 -put logs.tar.gz file2  (copies logs.tar.gz to HDFS using dfs.blocksize = 64MB)
     2. Run: hadoop fs -stat %o file2 to get the block size of the file
        Output: 67108864

     Note that since Hadoop 2.2.0, the default value of dfs.blocksize has changed to 128MB.
