
High-Performance and Cloud Computing
(23AID304)
Dr. K Dinesh Kumar
Assistant Professor (Sr. Gd.)
School of Computing
Amrita Vishwa Vidyapeetham

Unit I
Introduction to basic architecture and OS concepts – architecture of parallel computing – shared and distributed memory in parallel computing – parallel algorithms – performance metrics of parallel algorithms.

What is Computing? (Intro to basic architectures)
• The utilization of computers to complete a task.
• It involves both hardware and software functions performing some sort of task with a computer.
• Examples of computing being used in everyday life: sending an email, swiping credit/debit cards, etc.
• Computing is any activity that uses computers, from designing and building software and hardware to analysing data, processing data, communicating, and solving complex problems to help push humanity forward.

Paradigm shift?

Distributed Computing
"A distributed computing system is a collection of independent computers that appears to its users as a single coherent system."
It solves a single large problem by breaking it down into several tasks, where each task is computed on the individual computers of the distributed system.
A distributed computing system consists of more than one self-directed computer that communicates through a network.
All the computers connected in a network communicate with each other to attain a common goal by making use of their own local memory.
Supports fault-tolerance mechanisms.

Distributed Computing…
• Composed of multiple independent components that are perceived as a single entity by users.
• The primary purpose of distributed systems is to share resources and utilize them better.
• The goal of distributed computing is to distribute a single task among multiple computers and to solve it quickly by maintaining coordination between them.

Distributed Computing

Distributed Computing System -Examples
1.World Wide Web
2.Social Media Giant Facebook
3.Hadoop's Distributed File System (HDFS)
4.ATM
5. Cloud Network Systems (specialized form of Distributed Computing Systems)
6. Google Bots, Google Web Server, Indexing Server
7. Netflix, Amazon Prime, etc.

Advantages of Distributed Computing
Distributed computing makes all computers in the cluster work together as if they were one computer.
1. Scalability. The system can easily be expanded by adding more machines as needed. Higher loads can be handled by simply adding new hardware (versus replacing existing hardware).
2. Performance. Through parallelism, in which each computer in the cluster simultaneously handles a subset of an overall task, the cluster can achieve high levels of performance through a divide-and-conquer approach.
3. Redundancy. Duplication of critical resources to improve system reliability. Several machines can provide the same services, so if one is unavailable, work does not stop.
4. Resilience. How well a system recovers from failure. Distributed computing clusters typically copy or "replicate" data across all computer servers to ensure there is no single point of failure.
5. Cost-effectiveness. Distributed computing typically leverages low-cost, commodity hardware, making initial deployments as well as cluster expansions very economical.
6. High availability.
7. Reliability: the probability that a system will function as expected.
8. Transparency.
9. Fault tolerance: able to continue providing a service in the event of a failure.

1. Clusters

2. Grid Computing
A computer network consisting of a number of computer systems connected in a grid topology.
Gives access to large computational power, huge storage facilities, and a variety of services.
Users can "consume" resources in the same way as they use other utilities such as power, gas, and water.
Geographically dispersed clusters are connected by means of Internet connections. These clusters belong to different organizations, and arrangements are made among them to share computational power.

3. Cloud Computing
[Figure: the cloud user requests and consumes computing resources to maximize performance (finishing time, cost), while the cloud provider allocates resources to maximize revenue (utilization).]

4. Fog Computing

5. Edge Computing

High Performance Computing
• High performance computing (HPC) is the practice of aggregating computing resources to gain performance greater than that of a single workstation, server, or computer.
• HPC is a technology that uses clusters of powerful processors that work in parallel to process massive, multidimensional data sets and solve complex problems at extremely high speeds.
• HPC can take the form of custom-built supercomputers or groups of individual computers called clusters.
• HPC solves some of today's most complex computing problems in real time. HPC systems typically run at speeds more than one million times faster than the fastest commodity desktop, laptop or server systems.

How does HPC work?
• Massively parallel computing: parallel computing runs multiple tasks simultaneously on numerous computer servers or processors.
• HPC clusters: an HPC cluster comprises multiple high-speed computer servers networked with a centralized scheduler that manages the parallel computing workload.
• High-performance components: all the other computing resources in an HPC cluster, such as networking, memory, storage and file systems, are high speed and high throughput.
• Message passing interface (MPI): HPC workloads rely on a message passing interface (MPI), a standard library and protocol for parallel programming that allows processes to communicate between nodes in a cluster or across a network. A minimal example appears below.
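As a hedged, minimal sketch of the MPI model (not taken from the slides): each process in the job learns its rank and the total process count and prints them. The file name and launch commands are illustrative, and an MPI implementation such as Open MPI or MPICH is assumed to be installed.

```c
/* hello_mpi.c - minimal MPI sketch: each process reports its rank. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);                 /* start the MPI runtime */

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id within the job */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut the runtime down */
    return 0;
}
```

A typical run might be: mpicc hello_mpi.c -o hello_mpi && mpirun -np 4 ./hello_mpi, which starts four cooperating processes, possibly spread across several cluster nodes.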

Benefits of HPC in the Cloud
HPC in the cloud allows organizations to apply many compute assets to solve
complex problems and provides the following benefits:
•Quickly configure and deploy intensive workloads.
•Reduce time to results through scaling with on-demand capacity.
•Gain cost-efficiency by harnessing technology to meet your needs and pay
only for the compute power you use.
•Use cloud provider management tools and support to architect your specific
HPC workloads.

HPC Use Cases
• HPC applications have become synonymous with AI, particularly machine learning (ML) and deep learning apps. Today, most HPC systems are designed with these workloads in mind.
• From data analysis to cutting-edge research, HPC is driving continuous innovation in use cases across the following industries:
• Healthcare, genomics and life sciences
• Media and entertainment
• Banking and financial services
• Government and defense
• Energy and automotive industry
• Cybersecurity, etc.

OS Concepts
• An OS is a persistent program that controls the execution of application programs.
• It is the primary interface between user applications and system hardware.
• The primary functionality of the OS is to exploit the hardware resources of one or more processors, provide a set of services to system users, and manage secondary memory and input/output (I/O) devices, including the file system.
• The OS objectives are convenience for end users, efficiency of system resource utilization, reliability through protection between concurrent jobs, and extensibility for effective development, testing, and introduction of new system functions without interfering with ongoing service.

The OS is one of the key layers in the total computer system stack.

The OS incorporates all services and facilities of a supercomputer:
• It holds the local directory and file system.
• It controls the allocation of the hardware resources.
• It governs the scheduling of user jobs.
• It stores temporary results.
• It provides many of the high-level programming tools and functions.
• It exports the user interface to the system for all user commands.
• It protects user programs from errors caused by other running applications.
• It supports user access to system I/O, networks, and remote sites.
• It provides firewalls for security of operation and data storage.

Operating System Structures and Services
• System components: OSs are complex software packages consisting of many separate but interrelated components. These components, individually or in combination, achieve the functions and deliver the services required by the users directly or for system management and control.
• Process management: user and system programs that are executing are made up of instances called "processes", which are instantiations of program procedures (code text).
• Memory management and file management
• I/O system management
• Secondary storage management

Process Management
• A process is a program or subprogram in execution. It is a unit of work within the system that is performed to completion.
OS process management tasks are:
• Process states: ready, running, wait, terminated.
• Process control block: each process being managed by an OS is represented by a dedicated data structure referred to as a "process control block" (PCB); a minimal sketch follows this list.
• Process management activities
• Scheduling
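As a hedged illustration (not from the slides), the C struct below sketches the kind of fields a PCB might hold; real kernels (for example Linux's task_struct) keep far more state, so every field here is a simplification chosen for teaching.

```c
/* Hypothetical, heavily simplified process control block (illustration only). */
typedef enum { READY, RUNNING, WAITING, TERMINATED } proc_state_t;

typedef struct pcb {
    int            pid;             /* process identifier */
    proc_state_t   state;           /* ready / running / wait / terminated */
    unsigned long  program_counter; /* where execution resumes */
    unsigned long  registers[16];   /* saved CPU register contents */
    void          *page_table;      /* memory-management information */
    int            open_files[16];  /* open file descriptors */
    int            priority;        /* scheduling information */
    struct pcb    *next;            /* link in a scheduler queue */
} pcb_t;
```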

Threads
• Threads present another level of parallelism control within a process.
• Threads are also called lightweight processes, as they possess some of the properties of processes. Each thread belongs to exactly one process.
Types of thread in operating systems:
• User-level thread: these are not directly managed by the OS but rather by a runtime software system in user space.
• Kernel-level thread: threads directly managed by the OS; the OS allocates specific kernel threads to underlying hardware processor cores. A short POSIX-threads sketch follows.
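A minimal sketch, assuming a POSIX system with pthreads (not from the slides): one process creates four threads, each of which runs the same function; the thread count and messages are illustrative. On Linux these map onto kernel-level threads that the OS can schedule across cores.

```c
/* threads_demo.c - create a few POSIX threads within one process.
 * Compile with: gcc threads_demo.c -o threads_demo -lpthread        */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

/* Work done by each thread: report which logical thread it is. */
static void *worker(void *arg) {
    long id = (long)arg;
    printf("thread %ld running inside the same process\n", id);
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];

    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&tid[i], NULL, worker, (void *)i);   /* spawn threads */

    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);                          /* wait for them */

    return 0;
}
```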

Memory Management
• Memory management controls what data is in the main memory and registers at any particular time.
• Virtual memory: virtual memory is a powerful abstraction for naming memory blocks independently of their physical location, supported by the OS. It is implemented by the memory-management part of the OS.
• Virtual page addresses: there are a number of ways to organize data for use and storage. Among these is the simple concept of a "page", a fixed-length contiguous block of data that can be mapped onto an equivalent-sized block of physical memory or similar space on secondary storage.
• Virtual address translation: the OS incorporates a page table, which is central to the method for associating virtual page addresses (or page numbers) with physical locations within the main memory or secondary storage. A small worked sketch of this translation follows.
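A hedged, simplified sketch of the translation step (the page size, the tiny in-memory page table, and the example address are all made up for illustration): a virtual address is split into a page number and an offset, the page table maps the page number to a frame number, and the physical address is the frame base plus the offset.

```c
/* vaddr_demo.c - simplified virtual-to-physical translation (illustration only). */
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096u                 /* assumed 4 KiB pages */

/* Hypothetical page table: index = virtual page number, value = physical frame. */
static uint32_t page_table[8] = {5, 2, 7, 0, 3, 1, 6, 4};

int main(void) {
    uint32_t vaddr  = 0x2ABC;                        /* example virtual address   */
    uint32_t page   = vaddr / PAGE_SIZE;             /* virtual page number  = 2  */
    uint32_t offset = vaddr % PAGE_SIZE;             /* offset within the page    */
    uint32_t frame  = page_table[page];              /* page-table lookup    = 7  */
    uint32_t paddr  = frame * PAGE_SIZE + offset;    /* physical address          */

    printf("virtual 0x%X -> page %u, offset 0x%X -> frame %u -> physical 0x%X\n",
           vaddr, page, offset, frame, paddr);
    return 0;
}
```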

Architecture of Parallel Computing
• What are the drawbacks of the serial computing paradigm?
• Why do we need parallel computing?
• How does parallel computing overcome the issues of serial computing?

Serial Computing Paradigm
• Computer software was conventionally written for serial computing. This meant that to solve a problem, an algorithm divides the problem into smaller instructions.
• These discrete instructions are then executed on the central processing unit of a computer one by one. Only after one instruction is finished does the next one start.
• Serial computing, also known as sequential computing, is a type of computing where instructions for solving compute problems are followed one at a time, or sequentially.
• The fundamentals of serial computing require systems to use only one processor rather than distributing problems across multiple processors.

Serial Computing Approach
•A problem is broken into a discrete series of instructions
•Instructions are executed sequentially one after another
•Executed on a single processor
•Only one instruction may execute at any moment in time
•Drawback: huge waste of hardware resources, as only one part of the hardware will be running for a particular instruction at any point in time.

Parallel Computing Paradigm
• Parallel computing, also known as parallel programming, is a process where large compute problems are broken down into smaller problems that can be solved simultaneously by multiple processors.
• The processors communicate using shared memory, and their solutions are combined using an algorithm.
• As computer science evolved, parallel computing was introduced because serial computing had slow speeds.
• Operating systems using parallel programming allow computers to run processes and perform calculations simultaneously, a technique known as parallel processing.

Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
• A problem is broken into discrete parts that can be solved concurrently
• Each part is further broken down to a series of instructions
• Instructions from each part execute simultaneously on different processors
• An overall control/coordination mechanism is employed

Parallel Computers
• Virtually all stand-alone computers today are parallel from a hardware perspective:
• Multiple functional units (L1 cache, L2 cache, branch, prefetch, decode, floating-point, graphics processing (GPU), integer, etc.)
• Multiple execution units/cores
• Multiple hardware threads

Networks connect multiple stand-alone computers (nodes) to make larger parallel computer clusters.

Advantages of Parallel Computing
• Save time and/or money
• Solve larger/more complex problems, increased efficiencies
• Provide concurrency, faster analytics
• Take advantage of non-local resources
• Make better use of underlying parallel hardware
• Etc.

Parallel Computing Use cases & Applications
•Atmosphere, Earth, Environment
•Physics -applied, nuclear, particle, condensed matter, high pressure, fusion, photonics
•Bioscience, Biotechnology, Genetics
•Chemistry, Molecular Sciences
•Geology, Seismology
•Mechanical Engineering -from prosthetics to spacecraft
•Electrical Engineering, Circuit Design, Microelectronics
•Computer Science, Mathematics
•"Big Data," databases, data mining
•Artificial Intelligence (AI)
•Web search engines, web based business services
•Medical imaging and diagnosis
•Defense, Weapons, etc.

Types of Parallelism
• Bit-level parallelism: the form of parallel computing based on increasing the processor's word size. It reduces the number of instructions the system must execute in order to perform a task on large-sized data.
• Instruction-level parallelism (ILP): a type of parallel computing where the processor overlaps or reorders independent instructions from a single instruction stream so that several execute at the same time.
• Task parallelism: task parallelism employs the decomposition of a task into subtasks and then allocates each of the subtasks for execution.
• Data-level parallelism (DLP): instructions from a single stream operate concurrently on several data items. Limited by non-regular data-manipulation patterns and by memory bandwidth. A small data-parallel sketch follows.
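A minimal data-parallel sketch, assuming a compiler with OpenMP support (not from the slides): the same arithmetic is applied to every array element, and the loop iterations are split across threads.

```c
/* dlp_demo.c - data-level parallelism: same operation over many elements.
 * Compile with: gcc -fopenmp dlp_demo.c -o dlp_demo                      */
#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static double a[N], b[N];

    for (int i = 0; i < N; i++) b[i] = i;        /* initialise input */

    /* Each thread handles a contiguous chunk of the index range. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * b[i] + 1.0;                 /* identical work per element */

    printf("a[N-1] = %.1f (up to %d threads available)\n",
           a[N - 1], omp_get_max_threads());
    return 0;
}
```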

Different Parallel Computing Architectures
• Shared memory
Shared memory architecture is used in common, everyday applications of parallel computing, such as laptops or smartphones. In a shared memory architecture, parallel computers rely on multiple processors contacting the same shared memory resource.
• Distributed memory
Distributed memory is used in cloud computing architectures, making it common in many enterprise applications. In a distributed system for parallel computing, multiple processors with their own memory resources are linked over a network.

Different Parallel Computing Architectures
• Hybrid memory
Today's supercomputers rely on hybrid memory architectures, a parallel computing system that combines shared-memory computers on distributed-memory networks. Connected CPUs in a hybrid memory environment can access shared memory and tasks assigned to other units on the same network.
• Specialized architectures
In addition to the three main architectures, other less common parallel computer architectures are designed to tackle larger problems or highly specialized tasks. These include vector processors, for arrays of data called "vectors", and processors for general-purpose computing on graphics processing units (GPGPU). One of these, CUDA, a proprietary GPGPU application programming interface (API) developed by Nvidia, is critical to deep learning (DL), the technology that underpins most AI applications.

Shared Memory in Parallel Computing
• In a shared-memory system, the cores can share access to the computer's memory; in principle, each core can read and write each memory location.
• In a shared-memory system, we can coordinate the cores by having them examine and update shared-memory locations (a short sketch of this follows).
• Multiple processors can operate independently but share the same memory resources.
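A minimal sketch of that coordination, assuming OpenMP (not from the slides): four threads update one shared counter, and the atomic directive keeps the concurrent updates from interfering.

```c
/* shared_counter.c - threads coordinating through a shared memory location.
 * Compile with: gcc -fopenmp shared_counter.c -o shared_counter           */
#include <stdio.h>
#include <omp.h>

int main(void) {
    long counter = 0;                     /* shared memory location */

    #pragma omp parallel num_threads(4)
    {
        for (int i = 0; i < 100000; i++) {
            #pragma omp atomic            /* serialize each individual update */
            counter++;
        }
    }

    /* All threads examined and updated the same variable. */
    printf("counter = %ld (expected %d)\n", counter, 4 * 100000);
    return 0;
}
```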

Shared Memory in Parallel Computing
• Shared memory machines have been classified as Uniform Memory Access (UMA) and Non-Uniform Memory Access (NUMA), based upon memory access times.
• UMA is commonly represented today by symmetric multiprocessor (SMP) machines: identical processors and equal access times to memory.
• NUMA is often made by physically linking two or more SMPs. One SMP can directly access the memory of another SMP. Not all processors have equal access time to all memories.

Distributed Memory in Parallel Computing
• In a distributed-memory system, each core has its own private memory, and the cores must communicate explicitly, for example by sending messages across a network (see the sketch below).
• Processors have their own local memory. Memory addresses in one processor do not map to another processor, so there is no concept of a global address space across all processors.
• When a processor needs access to data in another processor, it is usually the task of the programmer to explicitly define how and when data is communicated. Synchronization between tasks is likewise the programmer's responsibility.
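A hedged sketch of such explicit communication, using MPI point-to-point calls (not from the slides): rank 0 holds a value in its private memory and must send it before rank 1 can use it; the value and tag are illustrative.

```c
/* sendrecv.c - explicit message passing between private memories.
 * Run with: mpirun -np 2 ./sendrecv                                 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double x = 3.14;                                   /* lives only in rank 0's memory */
        MPI_Send(&x, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD); /* explicit transfer to rank 1   */
    } else if (rank == 1) {
        double y;
        MPI_Recv(&y, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %.2f from rank 0\n", y);
    }

    MPI_Finalize();
    return 0;
}
```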

Hybrid Distributed-Shared Memory
• A parallel computing system that combines shared-memory computers on distributed-memory networks (a hybrid MPI + OpenMP sketch follows).
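A hedged sketch of the hybrid style, assuming both MPI and OpenMP are available (not from the slides): MPI provides one process per distributed-memory node, and OpenMP threads exploit the shared memory inside each process.

```c
/* hybrid.c - MPI across nodes, OpenMP threads within each node (sketch).
 * Compile with: mpicc -fopenmp hybrid.c -o hybrid                        */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Shared-memory parallelism inside this MPI process. */
    #pragma omp parallel
    {
        printf("MPI rank %d, OpenMP thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```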

Parallel Algorithms
• A parallel algorithm is one where the tasks could all be performed in parallel at the same time due to their data independence.
• Example: a web server where each incoming request can be processed independently from other requests.
• Example: multitasking in operating systems, where the operating system deals with several applications like a web browser, a word processor, and so on.

A parallel algorithm for a given problem may be developed using one or more of the following approaches:
• Detect and exploit the inherent parallelism available in the existing sequential algorithm.
• Independently invent a new parallel algorithm.
• Adapt an existing parallel algorithm that solves a similar problem.

Types of Parallel Algorithms –Few Examples
• Divide and conquer: breaks the problem into subproblems, solves them in parallel, then combines the results (a small sketch follows this list).
• MapReduce model: common in big-data processing. Applies a map function and then a reduce function in parallel. A high-level model for distributed data processing using key-value pairs.
• Parallel search algorithms: search elements in large data sets using multiple threads.
• Graph algorithms: BFS, DFS, and shortest-path algorithms adapted for parallel environments.
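A hedged divide-and-conquer sketch using OpenMP tasks (the array contents and the cutoff are illustrative, not from the slides): the sum of an array is split into two halves that may be computed in parallel and then combined.

```c
/* dc_sum.c - divide-and-conquer array sum with OpenMP tasks (sketch).
 * Compile with: gcc -fopenmp dc_sum.c -o dc_sum                       */
#include <stdio.h>
#include <omp.h>

/* Recursively sum a[lo..hi): split, solve halves in parallel, combine. */
static long sum(const int *a, int lo, int hi) {
    if (hi - lo < 10000) {                 /* small subproblem: solve serially */
        long s = 0;
        for (int i = lo; i < hi; i++) s += a[i];
        return s;
    }
    int mid = lo + (hi - lo) / 2;
    long left, right;
    #pragma omp task shared(left)          /* left half as a parallel task  */
    left = sum(a, lo, mid);
    #pragma omp task shared(right)         /* right half as a parallel task */
    right = sum(a, mid, hi);
    #pragma omp taskwait                   /* wait, then combine results    */
    return left + right;
}

int main(void) {
    enum { N = 1000000 };
    static int a[N];
    for (int i = 0; i < N; i++) a[i] = 1;

    long total;
    #pragma omp parallel
    #pragma omp single                     /* one thread starts the recursion */
    total = sum(a, 0, N);

    printf("total = %ld (expected %d)\n", total, N);
    return 0;
}
```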

Challenges in Designing Parallel Algorithms
• Load balancing (equal work distribution)
• Communication overhead between processors
• Synchronization cost
• Data dependency and race conditions
• Scalability limitations

Performance Metrics of Parallel Algorithm
• Parallel algorithms are designed to improve the computation speed of a computer. For analyzing a parallel algorithm, we normally consider the following parameters:
  Time complexity (execution time),
  Total number of processors used,
  Total cost.
Key metrics used in performance analysis:
• Execution time (Tₚ): the time taken by a parallel algorithm to execute on p processors.

• Speedup (Sₚ): how much faster a parallel algorithm is compared to its sequential counterpart.

  Sₚ = T₁ / Tₚ

  T₁ = execution time of the best-known sequential algorithm.
  Tₚ = execution time of the parallel algorithm on p processors.

• Efficiency (Eₚ): measures the utilization of the processors.

  Eₚ = Sₚ / p = T₁ / (p · Tₚ)

• Scalability: the ability of a parallel algorithm to maintain efficiency as the number of processors increases.
• Cost (Cₚ): total computational work done using p processors (a short worked example follows this list).

  Cₚ = p · Tₚ

• Load balancing: how evenly the workload is distributed among processors.
• Communication overhead: time spent on communication between processors (data exchange, synchronization).
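A quick worked example with hypothetical numbers (not from the slides): suppose the best-known sequential algorithm takes T₁ = 100 s and the parallel version on p = 4 processors takes T₄ = 30 s. Then the speedup is S₄ = T₁ / T₄ = 100 / 30 ≈ 3.33, the efficiency is E₄ = S₄ / 4 ≈ 0.83 (about 83% processor utilization), and the cost is C₄ = 4 · 30 = 120 s; the 20 s beyond the 100 s of sequential work is parallel overhead.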
