Vu Pham
Preface
Content of this Lecture:
In this lecture, we will discuss the design goals of HDFS, the
read/write process in HDFS, and the main configuration
tuning parameters that control HDFS performance and
robustness.
Big Data Computing Hadoop Distributed File System (HDFS)
Introduction
Hadoop provides a distributed file system and a framework for
the analysis and transformation of very large data sets using
the MapReduce paradigm.
An important characteristic of Hadoop is the partitioning of
data and computation across many (thousands) of hosts, and
executing application computations in parallel close to their
data.
A Hadoop cluster scales computation capacity, storage capacity
and IO bandwidth by simply adding commodity servers.
Hadoop clusters at Yahoo! span 25,000 servers, and store 25
petabytes of application data, with the largest cluster being
3500 servers. One hundred other organizations worldwide
report using Hadoop.
Introduction
Hadoop is an Apache project; all components are available via
the Apache open source license.
Yahoo! has developed and contributed to 80% of the core of
Hadoop (HDFS and MapReduce).
HBase was originally developed at Powerset, now a department
at Microsoft.
Hive was originated and developed at Facebook.
Pig, ZooKeeper, and Chukwa were originated and developed at
Yahoo!
Avro was originated at Yahoo! and is being co-developed with
Cloudera.
Hadoop Project Components
HDFS: Distributed file system
MapReduce: Distributed computation framework
HBase: Column-oriented table service
Pig: Dataflow language and parallel execution framework
Hive: Data warehouse infrastructure
ZooKeeper: Distributed coordination service
Chukwa: System for collecting management data
Avro: Data serialization system
HDFS Design Concepts
Scalable distributed file system: As you add disks, you get
scalable performance. Adding more nodes adds a lot of disks,
and that scales out the performance.
Distributed data on local disks on several nodes.
Low cost commodity hardware: You get a lot of performance out of it
because you are aggregating performance across many machines.
[Figure: data blocks B1, B2, …, Bn stored on the local disks of Node 1, Node 2, …, Node n]
HDFS Design Goals
Hundreds/Thousands of nodes and disks:
It means there's a higher probability of hardware failure. So the design
needs to handle node/disk failures.
Portability across heterogeneous hardware/software:
Implementation across lots of different kinds of hardware and software.
Handle large data sets:
Need to handle terabytes to petabytes.
Enable processing with high throughput.
Techniques to meet HDFS design goals
Simplified coherency model:
The idea is to write once and then read many times. That reduces
the number of operations required to commit a write.
Data replication:
Helps to handle hardware failures.
Spread copies of the same piece of data across different nodes.
Move computation close to the data:
So you're not moving data around. That improves your performance and
throughput.
Relax POSIX requirements to increase the throughput.
Basic architecture of HDFS
HDFS Architecture: Key Components
Single NameNode: A master server that manages the file system
namespace and regulates access to files from
clients. It also keeps track of where the data is on the
DataNodes and how the blocks are distributed.
Multiple DataNodes: Typically one per node in a cluster, so
the storage being used is local to each node.
Basic functions:
Manage the storage on the DataNode.
Serve read and write requests from clients.
Block creation, deletion, and replication are all based on instructions from
the NameNode.
Original HDFS Design
Single NameNode
Multiple DataNodes
Manage storage: blocks of data
Serving read/write requests from clients
Block creation, deletion, replication
Big Data Computing Big Data Enabling Technologies
HDFS in Hadoop 2
HDFS Federation: The idea is to have multiple DataNodes and
multiple NameNodes, so that the namespace can scale. If you recall,
in the first design a single node handles all the namespace
responsibilities. You can imagine that as you start having thousands of
nodes, that will not scale, and if you have billions of files, you will
have scalability issues. To address that, federation was
brought in. It also brings performance improvements.
Benefits:
Increased namespace scalability
Performance
Isolation
HDFS in Hadoop 2
How it's done:
Multiple NameNode servers
Multiple namespaces
Data is now stored in block pools
There is a pool associated with each NameNode or
namespace.
These pools are spread out over all the DataNodes.
HDFS in Hadoop 2
High Availability: redundant NameNodes
Heterogeneous Storage and Archival Storage: ARCHIVE, DISK, SSD, RAM_DISK
Federation: Block Pools
So, if you remember the original design, you have one namespace and a bunch of
DataNodes. The structure here looks similar.
You have a bunch of NameNodes instead of one NameNode. Each of those
NameNodes is essentially writing into these pools, but the pools are spread out over the
DataNodes just like before. This is where the data is spread out. You can gloss over
the different DataNodes. So, the block pool is essentially the main thing that's
different.
HDFS Performance Measures
Determine the number of blocks for a given file size.
Key HDFS and system components that are affected
by the block size.
The impact of using a lot of small files on HDFS and
the system.
Recall: HDFS Architecture
Distributed data on local disks on several nodes
[Figure: data blocks B1, B2, …, Bn stored on the local disks of Node 1, Node 2, …, Node n]
HDFS Block Size
Default block size is 64 megabytes.
Good for large files!
So a 10 GB file will be broken into: 10 x 1024 / 64 = 160 blocks
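The block count above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not part of HDFS itself: the helper name `num_blocks` is hypothetical, and it assumes a file occupies a whole number of blocks, with the last block possibly only partly full.

```python
def num_blocks(file_size_bytes, block_size_bytes=64 * 1024**2):
    """Number of HDFS blocks needed for a file (hypothetical helper).

    Uses integer ceiling division: a partly filled last block
    still counts as one block.
    """
    if file_size_bytes <= 0:
        return 0
    return -(-file_size_bytes // block_size_bytes)


ten_gb = 10 * 1024**3
blocks_64mb = num_blocks(ten_gb)                   # 64 MB blocks, as on the slide
blocks_128mb = num_blocks(ten_gb, 128 * 1024**2)   # a larger block size halves the count
print(blocks_64mb, blocks_128mb)
```

Doubling the block size halves the number of blocks, which is why the block size matters for the memory and task counts discussed next.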
[Figure: a file split into blocks B1, B2, …, Bn stored on Node 1, Node 2, …, Node n]
Importance of No. of Blocks in a file
NameNode memory usage: Every file can consist of a lot of blocks,
as we saw in the previous case (160 blocks). If you have millions
of files, that's millions of objects. Each object uses a bit of
memory on the NameNode, so NameNode memory is directly affected
by the number of blocks. And if you have replication, then you have
3 times the number of blocks stored.
Number of map tasks: The number of maps typically depends on the
number of blocks being processed.
Large No. of small files: Impact on Name node
Memory usage: Typically, the usage is around 150 bytes per
object. Now, if you have a billion small files, each needing a file
object and a block object, that's going to be about 300 GB of memory.
Network load: The number of checks with DataNodes is proportional
to the number of blocks.
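The memory figure can be checked with a back-of-the-envelope estimate. This sketch rests on an assumption that is not stated on the slide: each small file costs the NameNode one file object plus one object per block, at roughly 150 bytes each. The function name and the object-counting rule are illustrative only.

```python
BYTES_PER_OBJECT = 150  # rule-of-thumb figure quoted on the slide


def namenode_memory_bytes(num_files, blocks_per_file=1):
    """Rough NameNode heap estimate (hypothetical helper).

    Assumption: every file contributes one file object plus one
    object per block, each costing about BYTES_PER_OBJECT bytes.
    """
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


estimate = namenode_memory_bytes(1_000_000_000)  # a billion single-block files
print(estimate / 1e9, "GB")
```

Under this assumption a billion single-block files give two billion objects, which is where a figure of about 300 GB comes from; a billion objects alone would be about 150 GB.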
Large No. of small files: Performance Impact
Number of map tasks: Suppose we have 10 GB of data to
process and it is all stored in lots of 32 KB files. Then we
will end up with 327,680 map tasks.
Huge list of tasks that are queued.
The other impact is that each time a map task spins up
and spins down, there's a latency involved, because you
are starting up Java processes and stopping them.
Inefficient disk I/O with small sizes.
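The task count can be worked out the same way. This is a simplified sketch assuming one map task per block, equally sized files, and no input format that combines files; `map_tasks` is a hypothetical helper, not a Hadoop API.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def map_tasks(total_bytes, file_size_bytes, block_size_bytes=64 * 1024**2):
    """Approximate number of map tasks (hypothetical helper).

    Assumes one map task per block and that blocks are never
    shared between files, so every file needs at least one task.
    """
    n_files = ceil_div(total_bytes, file_size_bytes)
    blocks_per_file = ceil_div(file_size_bytes, block_size_bytes)
    return n_files * blocks_per_file


ten_gb = 10 * 1024**3
small_files = map_tasks(ten_gb, 32 * 1024)  # 10 GB stored as 32 KB files
one_big_file = map_tasks(ten_gb, ten_gb)    # 10 GB stored as a single file
print(small_files, one_big_file)
```

The same 10 GB needs 160 tasks as one large file but hundreds of thousands as small files, and each of those tasks pays the startup latency described above.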
HDFS optimized for large files
Lots of small files is bad!
Solution:
Merge/Concatenate files
Sequence files
HBase, Hive configuration
CombineFileInputFormat
Read/Write Processes in HDFS
Read Process in HDFS
Write Process in HDFS
HDFS Tuning Parameters
Overview
Tuning parameters
Specifically DFS block size
NameNode, DataNode system/dfs parameters.
HDFS XML configuration files
Tuning of the environment is typically done in the HDFS XML configuration files,
for example, in hdfs-site.xml.
This is more for system administrators of Hadoop clusters, but
it's good to know which changes impact performance,
and especially if you're trying things out on your own there are some
important parameters to keep in mind.
Commercial vendors have GUI-based management consoles.
HDFS Block Size
Recall: the block size impacts how much NameNode memory is used and the number
of map tasks that show up, and so it also impacts
performance.
Default 64 megabytes: Typically bumped up to 128 megabytes,
and can be changed based on workloads.
The parameter that changes this is dfs.blocksize (dfs.block.size in older releases).
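As an illustration of what such a setting looks like, the sketch below generates an hdfs-site.xml style fragment for a 128 MB block size. The helper `hdfs_site_xml` is hypothetical and uses only Python's standard library. The value is given in bytes here, which is an assumption about how an administrator would choose to write it.

```python
import xml.etree.ElementTree as ET


def hdfs_site_xml(properties):
    """Render a dict of settings as a Hadoop-style <configuration> document.

    Hypothetical helper for illustration; a real cluster's
    hdfs-site.xml is usually edited by hand or by a management tool.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


block_size_bytes = 128 * 1024**2  # 128 MB expressed in bytes
xml_text = hdfs_site_xml({"dfs.blocksize": block_size_bytes})
print(xml_text)
```

Each setting becomes one `<property>` element with a `<name>` and a `<value>`, which is the layout used by the Hadoop XML configuration files.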
HDFS Replication
Default replication is 3.
Parameter: dfs.replication
Tradeoffs:
Lower it to reduce replication cost
Lower replication is less robust
Higher replication can make data local to more workers
Lower replication ➔ More space
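The space side of this trade-off is simple to quantify. This is a minimal sketch with hypothetical helper names; it assumes every block is stored exactly `replication` times and ignores metadata overhead.

```python
def raw_storage_bytes(data_bytes, replication=3):
    """Disk space consumed across the cluster for the given logical data size."""
    return data_bytes * replication


def tolerated_replica_losses(replication):
    """How many copies of a block can be lost while one copy still survives."""
    return max(replication - 1, 0)


one_tb = 1024**4
for factor in (1, 2, 3):
    used_tb = raw_storage_bytes(one_tb, factor) / one_tb
    print(factor, used_tb, tolerated_replica_losses(factor))
```

With a replication factor of 1 the data uses the least space, but the loss of a single copy cannot be recovered from.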
Lot of other parameters
Various tunables for DataNode, NameNode.
Examples:
dfs.datanode.handler.count (10): Sets the number of server
threads on each DataNode
dfs.namenode.fs-limits.max-blocks-per-file: Maximum number
of blocks per file.
Full list:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
HDFS Performance and Robustness
Common Failures
DataNode failures: A server can fail, a disk can crash, or data can
become corrupted.
Network failures: Sometimes there's data corruption because
of network issues or disk issues, and all of that can lead to a
failure in the DataNode aspect of HDFS. You could also have network
failures: a network link to the NameNode could go down,
which can affect a lot of DataNodes
at the same time.
NameNode failures: There could be a disk failure
on the NameNode itself, or the NameNode process itself could
become corrupted.
HDFS Robustness
NameNode receives heartbeat and block reports from
DataNodes
Mitigation of common failures
Periodic heartbeat: from DataNode to NameNode.
DataNodes without recent heartbeat:
The NameNode marks the DataNode as dead, and any new I/O that comes up is not going to be sent to
that DataNode. Also remember that the NameNode has the
replication information for all the files on the file system. So, if it knows
that a DataNode has failed, it knows which blocks will fall below the replication factor.
This replication factor is set for the entire system, and you can also
set it for a particular file when you're writing the file. Either way, the
NameNode knows which blocks fall below the replication factor, and it will
start the process to re-replicate them.
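The heartbeat logic described here can be sketched as a small simulation. This is not the NameNode's actual implementation: the function names, the DataNode and block identifiers, the timestamps, and the 30-second timeout are all hypothetical values chosen for illustration.

```python
def live_datanodes(last_heartbeat, now, timeout_s):
    """DataNodes whose most recent heartbeat is within the timeout."""
    return {dn for dn, seen in last_heartbeat.items() if now - seen <= timeout_s}


def under_replicated(block_locations, live, replication=3):
    """Map each block to the number of replicas it is missing.

    Only replicas on live DataNodes count toward the target.
    """
    missing = {}
    for block, nodes in block_locations.items():
        alive = sum(1 for node in nodes if node in live)
        if alive < replication:
            missing[block] = replication - alive
    return missing


# Hypothetical cluster state: dn3 last reported 60 seconds ago.
last_heartbeat = {"dn1": 100.0, "dn2": 98.0, "dn3": 40.0, "dn4": 99.0}
block_locations = {
    "B1": ["dn1", "dn2", "dn3"],
    "B2": ["dn1", "dn2", "dn4"],
}

live = live_datanodes(last_heartbeat, now=100.0, timeout_s=30.0)
to_replicate = under_replicated(block_locations, live)
print(sorted(live), to_replicate)
```

In this example dn3 has missed its heartbeat window, so block B1 is one replica short and would be scheduled for re-replication, while B2 is unaffected.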
Mitigation of common failures
Checksum computed on file creation.
Checksums stored in HDFS namespace.
Used to check retrieved data.
Re-read from alternate replica
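The checksum-and-retry idea can be illustrated as follows. This sketch uses `zlib.crc32` from the Python standard library as a stand-in checksum and an in-memory list as a stand-in for replicas on different DataNodes. Both are assumptions for illustration, and `read_verified` is a hypothetical helper.

```python
import zlib


def read_verified(replicas, expected_crc):
    """Return the first replica whose checksum matches the stored one.

    Replicas are tried in order; a mismatch means the copy is
    treated as corrupt and the next replica is read instead.
    """
    for data in replicas:
        if zlib.crc32(data) == expected_crc:
            return data
    raise IOError("all replicas failed checksum verification")


original = b"hdfs block contents"
stored_crc = zlib.crc32(original)   # computed when the data is created
corrupted = b"hdfs block c0ntents"  # one byte differs from the original

result = read_verified([corrupted, original], stored_crc)
print(result == original)
```

If there is only one replica and it is corrupt, the read fails, which is the robustness cost of running without replication.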
Mitigation of common failures
Multiple copies of central metadata structures.
Failover to standby NameNode: manual by default.
Performance
Changing block size and replication factor can improve
performance.
Example: Distributed copy
Hadoop distcp allows parallel transfer of files.
Replication trade off with respect to robustness
One performance trade-off is that when you run
MapReduce jobs, having replicas
gives additional locality possibilities, but the big
trade-off is the robustness. Suppose, in this case, we set no replicas.
Might lose a node or a local disk: can't recover because
there is no replication.
Similarly, with data corruption, if you get a checksum
that's bad, you can't recover because you don't
have a replica.
Other parameter changes can have similar effects.
Conclusion
In this lecture, we have discussed the design goals of HDFS,
the read/write process in HDFS, and the main configuration
tuning parameters that control HDFS performance and
robustness.