Post on 25-Jan-2017
HDFS Internals
Bhupesh Chawda
bhupesh@apache.org
DataTorrent
What are Blocks?
● A physical storage disk has a block size: the minimum amount of data it can read or write. Normally 512 bytes.
● File systems for a single disk also deal with data in blocks, normally a few kilobytes (4 KB).
● Hadoop has a much larger block size. By default it is 64 MB.
● Files in HDFS are broken down into block-sized chunks, which are stored as independent units.
● However, a file smaller than the block size does not occupy an entire block's worth of storage.
  ○ Should I care?
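To make the chunking concrete, here is a minimal sketch of how a file's bytes would map onto blocks. It is an illustration only, not HDFS code: the function name `split_into_blocks` is hypothetical, and the 64 MB default is simply the figure quoted above.

```python
from typing import List

MB = 1024 * 1024


def split_into_blocks(file_size: int, block_size: int = 64 * MB) -> List[int]:
    """Return the size in bytes of each block a file would be stored as.

    Hypothetical helper for illustration; it only does the arithmetic.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block holds only the leftover bytes, not a full block.
        sizes.append(remainder)
    return sizes


# A 150 MB file becomes two full blocks plus one partial block.
print([s // MB for s in split_into_blocks(150 * MB)])
# A 1 MB file occupies a single 1 MB block, not 64 MB.
print([s // MB for s in split_into_blocks(1 * MB)])
```

The second call shows the point of the last bullet: the small file takes up 1 MB of storage, even though the block size is 64 MB.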
Why such large blocks?
● To minimize the cost of disk seeks relative to data transfer.
● Assuming 10 ms of seek time and 100 MB/s as the disk transfer rate, a 100 MB block takes about 1 s to transfer, so the seek time is 1% of the transfer time, which is small enough to ignore.
● Hence the default is 64 MB, while many production environments also use 128 MB.
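The seek-time argument is plain arithmetic, sketched below. The 10 ms seek time and 100 MB/s transfer rate are the slide's own assumptions; the function name `seek_overhead` is hypothetical.

```python
def seek_overhead(block_size_mb: float,
                  seek_ms: float = 10.0,
                  transfer_mb_per_s: float = 100.0) -> float:
    """Return seek time as a fraction of the time needed to transfer one block."""
    transfer_ms = block_size_mb / transfer_mb_per_s * 1000.0
    return seek_ms / transfer_ms


for size_mb in (4 / 1024, 64, 100, 128):  # 4 KB, 64 MB, 100 MB, 128 MB
    print(f"{size_mb:10.4f} MB block: seek is {seek_overhead(size_mb):.2%} of transfer time")
```

With a 4 KB block the seek dominates the transfer many times over, while at 64 MB and above it drops to roughly one or two percent.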
Namenode and Datanode
● Master - Namenode
  ○ Manages the file system namespace
  ○ Maintains the file system tree and the metadata for all files and directories
  ○ Stores this information in:
    ■ Namespace image
    ■ Edit log
  ○ Knows, for a given file, which datanodes have the corresponding blocks. This mapping is not stored persistently; it is reconstructed at startup from the datanodes' block reports.
● Worker - Datanode
  ○ Stores and retrieves blocks as requested by clients
  ○ Periodically reports back to the namenode with the list of blocks it is storing
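The split of responsibilities above can be sketched with a toy in-memory model. This is not how the real namenode is implemented: the class `ToyNamenode`, its method names, and the block and datanode identifiers are all hypothetical. It only shows that the file-to-block mapping belongs to the namespace, while block locations are rebuilt from datanode reports.

```python
from collections import defaultdict


class ToyNamenode:
    """Toy model: namespace metadata plus block locations learned from reports."""

    def __init__(self):
        # Part of the namespace (what the image and edit log would record).
        self.file_blocks = {}
        # Held in memory only; rebuilt from block reports after a restart.
        self.block_locations = defaultdict(set)

    def add_file(self, path, block_ids):
        self.file_blocks[path] = list(block_ids)

    def receive_block_report(self, datanode, block_ids):
        # A report replaces whatever this datanode reported previously.
        for holders in self.block_locations.values():
            holders.discard(datanode)
        for block_id in block_ids:
            self.block_locations[block_id].add(datanode)

    def locate(self, path):
        """Return (block_id, sorted datanodes) for each block of the file."""
        return [(b, sorted(self.block_locations.get(b, ())))
                for b in self.file_blocks[path]]


nn = ToyNamenode()
nn.add_file("/logs/a.log", ["blk_1", "blk_2"])
print(nn.locate("/logs/a.log"))  # no locations known before any report arrives
nn.receive_block_report("dn1", ["blk_1"])
nn.receive_block_report("dn2", ["blk_1", "blk_2"])
print(nn.locate("/logs/a.log"))
```

Before any report arrives, the toy namenode knows which blocks make up the file but not where they live, which mirrors the "reconstructed at startup" behaviour.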
Secondary Namenode
● Not a backup namenode
● Periodically merges the namespace image with the edit log, so that the edit log does not become too large
● Usually runs on a different machine than the namenode
● The secondary, however, always lags behind the primary, and hence its merged copy cannot be used as is in case of primary failure
● In the event of primary failure, copy the primary's namespace image to the secondary and run it as the new primary
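The merge performed by the secondary can be pictured as replaying the edit log on top of the last namespace image. The sketch below is a toy model under stated assumptions: the operation names (`create`, `delete`, `rename`), the tuple layout of an edit, and the function `apply_edits` are hypothetical and do not reflect the real on-disk formats.

```python
def apply_edits(image: dict, edits: list) -> dict:
    """Replay edit-log entries over a namespace image and return the merged image.

    image maps a file path to its list of block IDs (hypothetical layout).
    """
    merged = dict(image)  # leave the old image untouched
    for op, path, *args in edits:
        if op == "create":
            merged[path] = list(args[0])
        elif op == "delete":
            merged.pop(path, None)
        elif op == "rename":
            merged[args[0]] = merged.pop(path)
        else:
            raise ValueError(f"unknown edit operation: {op}")
    return merged


image = {"/old.txt": ["blk_0"]}
edit_log = [
    ("create", "/a.txt", ["blk_1", "blk_2"]),
    ("rename", "/a.txt", "/b.txt"),
    ("delete", "/old.txt"),
]
checkpoint = apply_edits(image, edit_log)
print(checkpoint)
```

After the merge, the new image replaces the old one and the replayed edits are no longer needed. Any edits written after the merge started are missing from the checkpoint, which is why the secondary lags behind the primary.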
Small File Problem?
Each file occupies space in the namenode's namespace irrespective of its size!!
Image Source: http://www.bodhtree.com/blog/2012/09/28/hadoop-how-to-manage-huge-numbers-of-small-files-in-hdfs/
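A rough estimate shows why many small files hurt. The sketch below assumes that each file and each block costs the namenode about 150 bytes of memory; that figure is a commonly quoted rule of thumb, not a number from this deck, and the function name `namenode_objects` is hypothetical.

```python
MB = 1024 * 1024
BYTES_PER_OBJECT = 150  # assumption: rough rule of thumb, not an exact figure


def namenode_objects(file_sizes, block_size=64 * MB):
    """Count namespace objects: one per file plus one per block of that file."""
    total = 0
    for size in file_sizes:
        blocks = -(-size // block_size)  # integer ceiling division
        total += 1 + blocks
    return total


one_big_file = [1024 * MB]         # 1 GB stored as a single file
many_small_files = [1 * MB] * 1024  # the same 1 GB stored as 1 MB files

for name, sizes in (("one 1 GB file", one_big_file),
                    ("1024 files of 1 MB", many_small_files)):
    objects = namenode_objects(sizes)
    print(f"{name}: {objects} objects, ~{objects * BYTES_PER_OBJECT} bytes of namenode memory")
```

Both layouts store the same amount of data, yet under these assumptions the small-file layout needs over a hundred times more namenode memory.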
Further Reading
HDFS Comics :-)
https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1
Thank You!!
Please send your questions to: bhupesh@apache.org