Multithreading
abshinde
Apr 13, 2017
About This Presentation
Multithreading concepts and its use in computer architectures.
Slide Content
Slide 1
Multithreading
Mr. A. B. Shinde
Assistant Professor,
Electronics Engineering,
P.V.P.I.T., Budhgaon
Slide 2
Contents…
Using ILP support to exploit thread-level parallelism
Performance and efficiency in advanced multiple-issue processors
Slide 3
Threads
A thread is a basic unit of CPU utilization.
From the hardware's point of view, a thread is a separate process with its own instructions and data.
A thread may represent a process that is part of a parallel program consisting of multiple processes, or it may represent an independent program.
Slide 4
Threads
A thread comprises a thread ID, a program counter, a register set, and a stack.
It shares its code section, data section, and other operating-system resources, such as open files and signals, with the other threads belonging to the same process.
A traditional process has a single thread of control. If a process has multiple threads of control, it can perform more than one task at a time.
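As a rough illustration of the sharing described above, the sketch below starts four threads that all update one shared object while each keeps a private local variable on its own stack. The names (`shared_counter`, `worker`, `local_sums`) and the step count are assumptions made for this example only.

```python
import threading

shared_counter = {"value": 0}      # process-wide data: visible to every thread
lock = threading.Lock()            # guards the shared value against lost updates
local_sums = {}                    # records each thread's private result, keyed by name


def worker(name: str, steps: int) -> None:
    private_total = 0              # local variable: lives on this thread's own stack
    for _ in range(steps):
        private_total += 1
        with lock:
            shared_counter["value"] += 1
    with lock:
        local_sums[name] = private_total


threads = [threading.Thread(target=worker, args=(f"t{i}", 1000)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_counter["value"])     # every thread updated the same object
print(sorted(local_sums.items()))  # each thread counted only its own steps
```

The lock is needed precisely because the data is shared; the per-thread `private_total` needs no protection because no other thread can see it.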
Slide 5
Threads
Many software packages that run on modern desktop PCs are multithreaded.
For example, a word processor may have:
- a thread for displaying graphics,
- another thread for responding to keystrokes from the user, and
- a third thread for performing spelling and grammar checking in the background.
Slide 6
Threads
Threads also play a vital role in remote procedure call (RPC) systems.
RPCs allow interprocess communication by providing a communication mechanism similar to ordinary function or procedure calls.
Many operating-system kernels are multithreaded: several threads operate in the kernel, and each thread performs a specific task, such as managing devices or handling interrupts.
Slide 7
Multithreading
Benefits:
1. Responsiveness: Multithreading an interactive application may allow the program to continue running even if part of it is blocked, thereby increasing responsiveness to the user.
For example: A multithreaded web browser could still allow user interaction in one thread while an image is being loaded in another thread.
2. Resource sharing: By default, threads share the memory and the resources of the process to which they belong. The benefit of sharing code and data is that it allows an application to have several different threads of activity within the same address space.
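The responsiveness benefit can be sketched with a background thread that blocks while the main thread keeps working. This is a minimal illustration, not a real browser: `load_image`, `release_loader`, and the logged strings are hypothetical, and an event stands in for a slow network read.

```python
import threading

image_ready = threading.Event()     # set when the simulated download finishes
release_loader = threading.Event()  # lets the demo decide when the "network" responds
log = []


def load_image() -> None:
    release_loader.wait()           # blocks this thread only, like a slow network read
    log.append("image loaded")
    image_ready.set()


loader = threading.Thread(target=load_image)
loader.start()

# The main ("UI") thread keeps handling input while the loader is blocked.
for key in ["a", "b", "c"]:
    log.append(f"handled key {key}")

release_loader.set()                # the simulated network finally answers
loader.join()
print(log)
```

Because only the loader thread waits on the event, the keystrokes are handled before the image arrives; a single-threaded version would have to wait for the load first.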
Slide 8
Multithreading
Benefits:
3. Economy: Allocating memory and resources for process creation is costly. Because threads share the resources of the process to which they belong, creating and switching between threads is more economical than doing the same with processes.
4. Utilization of multiprocessor architectures: In a multiprocessor architecture, threads may run in parallel on different processors. A single-threaded process can run on only one CPU, no matter how many are available.
Multithreading on a multi-CPU machine increases concurrency.
Slide 9
Multithreading Models
Support for threads may be provided either at the user level or at the kernel level.
User threads are supported above the kernel and are managed without kernel support, whereas kernel threads are supported and managed directly by the operating system.
Slide 10
Multithreading Models
Many-to-One Model:
The many-to-one model maps many user-level threads to one kernel thread.
Thread management is done by the thread library in user space, so it is efficient.
Only one thread can access the kernel at a time, so multiple threads cannot run in parallel on multiprocessors.
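One way to picture the many-to-one model is a tiny user-space scheduler that multiplexes several "user threads" onto the single thread running the interpreter. In this sketch, generators stand in for user-level threads and `run_many_to_one` is a hypothetical round-robin scheduler; it is an analogy, not how any particular thread library is implemented.

```python
from collections import deque
from typing import Generator, Iterable, List


def user_thread(name: str, steps: int, trace: List[str]) -> Generator[None, None, None]:
    for i in range(steps):
        trace.append(f"{name}:{i}")
        yield                        # voluntarily hand control back to the scheduler


def run_many_to_one(threads: Iterable[Generator[None, None, None]]) -> None:
    ready = deque(threads)           # run queue kept entirely in user space
    while ready:
        current = ready.popleft()
        try:
            next(current)            # run the user thread until its next yield
            ready.append(current)    # still runnable: back of the queue
        except StopIteration:
            pass                     # thread finished, drop it


trace: List[str] = []
run_many_to_one([user_thread("A", 2, trace), user_thread("B", 3, trace)])
print(trace)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

All scheduling happens without kernel involvement, which is why it is cheap, but only one user thread ever runs at a time, matching the limitation noted on the slide.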
Slide 11
Multithreading Models
One-to-One Model:
The one-to-one model maps each user thread to a kernel thread.
It provides more concurrency than the many-to-one model and allows multiple threads to run in parallel on multiprocessors.
The only drawback of this model is that creating a user thread requires creating the corresponding kernel thread.
The overhead of creating kernel threads can burden the performance of an application.
Slide 12
Multithreading Models
Many-to-Many Model:
The many-to-many model multiplexes many user-level threads onto a smaller or equal number of kernel threads.
The number of kernel threads may be specific to either a particular application or a particular machine.
Developers can create as many user threads as necessary, and the corresponding kernel threads can run in parallel on a multiprocessor.
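A loose analogy for the many-to-many idea is a thread pool, where many units of work are multiplexed onto a smaller, fixed number of worker threads. The sketch below uses Python's `ThreadPoolExecutor`; the pool size `KERNEL_THREADS` and the task count `USER_TASKS` are assumptions chosen for illustration, and the tasks are not true user-level threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

KERNEL_THREADS = 3                   # hypothetical number of worker threads
USER_TASKS = 12                      # hypothetical number of units of work


def task(task_id: int) -> str:
    # Report which worker thread carried this task.
    return threading.current_thread().name


with ThreadPoolExecutor(max_workers=KERNEL_THREADS) as pool:
    carriers = list(pool.map(task, range(USER_TASKS)))

distinct_carriers = set(carriers)
print(len(carriers), "tasks ran on", len(distinct_carriers), "worker threads")
```

However many tasks are submitted, at most `KERNEL_THREADS` workers ever carry them, which mirrors the "smaller or equal number" constraint of the model.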
Slide 13
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
Slide 14
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
Although ILP increases system performance, it can be quite limited or hard to exploit in some applications.
Furthermore, there may be parallelism occurring naturally at a higher level in the application.
For example: An online transaction-processing system has parallelism among the multiple queries and updates. These queries and updates can be processed mostly in parallel, since they are largely independent of one another.
Slide 15
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
This higher-level parallelism is called thread-level parallelism (TLP) because it is logically structured as separate threads of execution.
ILP exploits parallel operations within a loop or straight-line code segment.
TLP is represented explicitly by the use of multiple threads of execution that are inherently parallel.
Slide 16
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
Thread-level parallelism is an important alternative to instruction-level parallelism.
In many applications, such as many server applications, thread-level parallelism occurs naturally.
If software is written from scratch, expressing the parallelism is much easier.
If established applications were written without parallelism in mind, rewriting them to exploit thread-level parallelism can pose significant challenges and be extremely costly.
Slide 17
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
TLP and ILP exploit two different kinds of parallel structure.
The crucial question is: can we exploit TLP on a processor designed for ILP?
The answer is yes.
A datapath designed to exploit ILP often has many functional units sitting idle because of stalls or dependences in the code.
Threads can then serve as a source of independent instructions that keep the processor busy, which is how TLP is exploited.
Slide 18
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
Multithreading allows multiple threads to share the functional units of a single processor in an overlapping fashion.
To permit this sharing, the processor must duplicate the independent state of each thread.
For example: A separate copy of the register file, a separate PC, and a separate page table are required for each thread.
In addition, the hardware must support the ability to switch to a different thread relatively quickly.
Slide 19
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
There are two main approaches to multithreading:
- Fine-grained multithreading
- Coarse-grained multithreading
Slide 20
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
Fine-grained multithreading:
It switches between threads on each instruction, causing the execution of multiple threads to be interleaved.
This interleaving is often done in a round-robin fashion.
To make fine-grained multithreading practical, the CPU must be able to switch threads on every clock cycle.
Advantage: It can hide the throughput losses that arise from both short and long stalls.
Disadvantage: It slows down the execution of the individual threads.
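The interleaving can be made concrete with a toy single-issue simulator. This is a sketch under stated assumptions: each program is a list in which every entry is the number of stall cycles that follow that instruction, the scheduler is round-robin and skips threads that are currently stalled, and the function name `fine_grained` and the sample programs are invented for the example.

```python
from typing import Dict, List, Optional


def fine_grained(programs: Dict[str, List[int]]) -> List[Optional[str]]:
    """Return, per cycle, the thread that issued (None marks an idle cycle)."""
    names = list(programs)
    pc = {n: 0 for n in names}          # per-thread program counter
    ready_at = {n: 0 for n in names}    # first cycle in which the thread may issue again
    schedule: List[Optional[str]] = []
    turn = 0                            # round-robin pointer
    cycle = 0
    while any(pc[n] < len(programs[n]) for n in names):
        issued = None
        for offset in range(len(names)):
            n = names[(turn + offset) % len(names)]
            if pc[n] < len(programs[n]) and ready_at[n] <= cycle:
                stall = programs[n][pc[n]]
                pc[n] += 1
                ready_at[n] = cycle + 1 + stall   # thread is busy until the stall ends
                issued = n
                turn = (turn + offset + 1) % len(names)
                break
        schedule.append(issued)
        cycle += 1
    return schedule


together = fine_grained({"A": [0, 2, 0], "B": [0, 0, 0]})
alone = fine_grained({"A": [0, 2, 0]})
print(together)  # ['A', 'B', 'A', 'B', 'B', 'A']
print(alone)     # ['A', 'A', None, None, 'A']
```

Run alone, thread A wastes two cycles on its stall and finishes in five cycles. Interleaved with B, no cycle is idle, but A now finishes in the sixth cycle, which shows both the advantage and the disadvantage listed above.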
Slide 21
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
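Coarse-grained multithreading:
It was invented as an alternative to fine-grained multithreading.
Coarse-grained multithreading switches threads only on costly (long) stalls.
Advantage: Thread switching no longer needs to be essentially free, and an individual thread is much less likely to be slowed down, since instructions from other threads are issued only when a thread encounters a costly stall.
Disadvantage: It is limited in its ability to hide throughput losses, especially from shorter stalls, because each switch incurs a pipeline start-up cost.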
Slide 22
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
A CPU with coarse-grained multithreading issues instructions from a single thread.
When a stall occurs, the pipeline must be emptied or frozen.
The new thread that begins executing after the stall must fill the pipeline.
Because of this start-up overhead, coarse-grained multithreading is much more useful for reducing the penalty of high-cost stalls, where the pipeline refill time is negligible compared to the stall time.
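The trade-off above can be written as a rough cycle count. This is only a sketch: the symbols $S$ (stall length in cycles) and $R$ (pipeline refill cost in cycles) and the sample numbers are assumptions introduced here, not values from the slides, and it assumes another thread is always ready to run.

```latex
\begin{align*}
\text{idle cycles without switching} &= S \\
\text{idle cycles with switching}    &= R \\
\text{fraction of the stall hidden}  &= \frac{S - R}{S} = 1 - \frac{R}{S} \\[4pt]
S = 100,\ R = 5 &:\quad 1 - \tfrac{5}{100} = 0.95 \\
S = 4,\ R = 5   &:\quad 1 - \tfrac{5}{4} = -0.25 \quad \text{(switching costs more than waiting)}
\end{align*}
```

When $R \ll S$ almost the whole stall is hidden; when the stall is short, the refill cost can exceed the stall itself, so switching is not worthwhile.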
Slide 23
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
Simultaneous multithreading (SMT) is a variation on multithreading that uses the resources of a multiple-issue, dynamically scheduled processor to exploit TLP.
Multiple-issue processors often have more functional-unit parallelism available than a single thread can use, which motivates SMT.
With register renaming and dynamic scheduling, multiple instructions from independent threads can be issued without regard to the dependences among them; the dynamic scheduling hardware resolves those dependences.
Slide 24
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
The figure illustrates the differences in a processor’s ability to exploit the resources of a superscalar for the following configurations:
- A superscalar with no multithreading support
- A superscalar with coarse-grained multithreading
- A superscalar with fine-grained multithreading
- A superscalar with simultaneous multithreading
Slide 25
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
In the superscalar without multithreading support, the use of issue slots is limited by a lack of ILP.
In addition, a major stall, such as an instruction cache miss, can leave the entire processor idle.
Figure legend: An empty (white) box indicates that the corresponding issue slot is unused in that clock cycle. Black indicates an occupied issue slot.
Slide 26
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
In the coarse-grained multithreaded superscalar, long stalls are partially hidden by switching to another thread that uses the resources of the processor.
Although this reduces the number of completely idle clock cycles, the ILP limitations still leave issue slots idle within each clock cycle.
Because thread switching occurs only when there is a stall, and the new thread has a start-up period, some fully idle cycles are still likely to remain.
Figure legend: The shades of grey and black correspond to different threads in the multithreading processors.
Slide 27
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
In the fine-grained multithreaded superscalar, the interleaving of threads eliminates fully empty slots.
Because only one thread issues instructions in a given clock cycle, ILP limitations still lead to a significant number of idle slots within individual clock cycles.
Figure legend: An empty (white) box indicates that the corresponding issue slot is unused in that clock cycle. The shades of grey and black correspond to four different threads in the multithreading processors.
Slide 28
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
In SMT, TLP and ILP are exploited simultaneously.
Ideally, the issue-slot usage is limited by imbalances in the resource needs and resource availability over multiple threads.
In practice, other factors can also restrict how many slots are used:
- how many active threads are considered,
- finite limitations on buffers,
- the ability to fetch enough instructions from multiple threads, and
- practical limitations on what instruction combinations can issue from one thread and from multiple threads.
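The difference in issue-slot usage between the configurations can be approximated with a small counting model. Everything here is an assumption made for illustration: a 4-wide issue stage, a table `ready[t][c]` giving how many independent instructions thread `t` could issue in cycle `c` (treated as fixed input rather than simulated), and the three hypothetical helper functions. Coarse-grained multithreading is left out because it would need a stall model.

```python
from typing import List

ISSUE_WIDTH = 4  # hypothetical 4-issue superscalar


def used_slots_superscalar(ready: List[List[int]]) -> int:
    # Only thread 0 runs; slots it cannot fill stay empty.
    return sum(min(ISSUE_WIDTH, r) for r in ready[0])


def used_slots_fine_grained(ready: List[List[int]]) -> int:
    # One thread owns each cycle, chosen round-robin.
    cycles = len(ready[0])
    return sum(min(ISSUE_WIDTH, ready[c % len(ready)][c]) for c in range(cycles))


def used_slots_smt(ready: List[List[int]]) -> int:
    # Every thread may contribute instructions to the same cycle.
    cycles = len(ready[0])
    return sum(min(ISSUE_WIDTH, sum(t[c] for t in ready)) for c in range(cycles))


# Three threads over four cycles; a 0 means the thread is stalled in that cycle.
ready = [
    [2, 0, 1, 2],
    [1, 2, 0, 1],
    [2, 1, 1, 0],
]
total_slots = ISSUE_WIDTH * len(ready[0])
print(used_slots_superscalar(ready), "of", total_slots)   # 5 of 16
print(used_slots_fine_grained(ready), "of", total_slots)  # 7 of 16
print(used_slots_smt(ready), "of", total_slots)           # 12 of 16
```

In this toy input, fine-grained multithreading removes the fully idle cycle but still leaves slots empty within each cycle, while SMT fills slots from several threads at once and is limited mainly by the issue width.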
Slide 29
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
Design Challenges in SMT
Because a dynamically scheduled superscalar processor is likely to have a deep pipeline, a coarse-grained implementation of SMT would be unlikely to gain much in performance.
Since SMT makes sense only in a fine-grained implementation, we should think about the impact of fine-grained scheduling on single-thread performance.
This effect can be minimized by having a preferred thread, which still permits multithreading to preserve some of its performance advantage with a smaller compromise in single-thread performance.
Slide 30
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
Design Challenges in SMT
Other design challenges for an SMT processor:
- Dealing with a larger register file needed to hold multiple contexts.
- Not affecting the clock cycle, particularly in instruction issue, where more candidate instructions need to be considered, and in instruction completion, where choosing which instructions to commit may be challenging.
- Ensuring that the cache and TLB conflicts generated by the simultaneous execution of multiple threads do not cause significant performance degradation.
Slide 31
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
Design Challenges in SMT
In many cases, the potential performance overhead due to multithreading is small.
The efficiency of current superscalars is low enough that there is scope for significant improvement, even at the cost of some overhead.
Slide 32
Performance and Efficiency in Advanced
Multiple-Issue Processors
Slide 33
Performance and Efficiency in Advanced
Multiple-Issue Processors
The question of efficiency in terms of silicon area and power is equally critical.
Power is the major constraint on modern processors.
The Itanium 2 is the most inefficient processor for both floating-point and integer code.
The Athlon and Pentium 4 both make good use of transistors and area in terms of efficiency.
The IBM Power5 is the most effective user of energy.
None of the processors offers a great advantage in efficiency.
Slide 34
Performance and Efficiency in Advanced
Multiple-Issue Processors
What Limits Multiple-Issue Processors?
Power is a function of both static power (proportional to the transistor count, whether or not the transistors are switching) and dynamic power (proportional to the product of the number of switching transistors and the switching rate).
Although static power is certainly a design concern, dynamic power is usually the dominant energy consumer.
A microprocessor trying to achieve both a low CPI and a high clock rate (CR) must switch more transistors and switch them faster.
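The proportionalities stated above can be summarized in symbols. The notation ($N_{\text{transistors}}$, $N_{\text{switching}}$, $f_{\text{switch}}$) is introduced here for convenience and does not come from the slides; constants of proportionality are omitted.

```latex
\begin{align*}
P_{\text{total}}   &= P_{\text{static}} + P_{\text{dynamic}} \\
P_{\text{static}}  &\propto N_{\text{transistors}} \\
P_{\text{dynamic}} &\propto N_{\text{switching}} \times f_{\text{switch}}
\end{align*}
```

A lower CPI tends to raise $N_{\text{switching}}$ and a higher clock rate raises $f_{\text{switch}}$, so pursuing both increases the dynamic term through both factors at once.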
Slide 35
Performance and Efficiency in Advanced
Multiple-Issue Processors
What Limits Multiple-Issue Processors?
Most techniques for increasing performance, including multiple cores and multithreading, will increase power consumption.
The key question is whether a technique is energy efficient: does it increase power consumption faster than it increases performance?
Slide 36
Performance and Efficiency in Advanced
Multiple-Issue Processors
What Limits Multiple-Issue Processors?
The energy inefficiency of multiple-issue techniques arises from two primary characteristics.
First, issuing multiple instructions incurs some overhead in logic that grows faster than the issue rate grows.
This logic is responsible for instruction issue analysis, including dependence checking, register renaming, and similar functions.
The combined result is that lower CPIs are likely to lead to lower ratios of performance per watt, simply due to overhead.
Slide 37
Performance and Efficiency in Advanced
Multiple-Issue Processors
What Limits Multiple-Issue Processors?
Second, there is a growing gap between peak issue rates and sustained performance.
The number of transistors switching will be proportional to the peak issue rate, and the performance is proportional to the sustained rate.
For example: If we want to sustain four instructions per clock, we must fetch more, issue more, and initiate execution on more than four instructions.
The power will be proportional to the peak rate, but performance will be at the sustained rate.
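Taking the two proportionalities above at face value, performance per watt scales with the ratio of sustained to peak issue rate. The sketch below works through that arithmetic; the function name and the two sample designs are hypothetical, and the same proportionality constants are assumed for both designs, which real processors need not satisfy.

```python
def relative_perf_per_watt(sustained_ipc: float, peak_ipc: float) -> float:
    """Performance per unit power, up to a constant factor.

    Assumes performance is proportional to sustained_ipc and power to peak_ipc.
    """
    if peak_ipc <= 0:
        raise ValueError("peak_ipc must be positive")
    return sustained_ipc / peak_ipc


narrow = relative_perf_per_watt(sustained_ipc=1.5, peak_ipc=2)  # hypothetical 2-issue design
wide = relative_perf_per_watt(sustained_ipc=4.0, peak_ipc=8)    # hypothetical design sustaining 4 IPC
print(narrow, wide)  # 0.75 0.5
```

In this made-up comparison the wide design is faster in absolute terms, yet it delivers less performance per unit of power because its peak capacity grew faster than its sustained rate.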
Slide 38
Performance and Efficiency in Advanced
Multiple-Issue Processors
WhatLimitsMultiple-IssueProcessors?
ImportanttechniqueforincreasingtheexploitationofILP(speculation)
—isinefficient…becauseitcanneverbeperfect.
Ifspeculationwereperfect,itcouldsavepower,sinceitwould
reducetheexecutiontimeandsavestaticpower.
Whenspeculationisnotperfect,itrapidlybecomesenergy
inefficient,sinceitrequiresadditionaldynamicpower.
38
Slide 39
Performance and Efficiency in Advanced
Multiple-Issue Processors
What Limits Multiple-Issue Processors?
Focusing on improving clock rate:
Increasing the clock rate will increase transistor switching frequency and directly increase power consumption.
To achieve a faster clock rate, we would need to increase pipeline depth.
Deeper pipelines incur additional overhead penalties and also cause higher switching rates.
Slide 40
This presentation is published only for educational purposes.