Distributed Memory Architecture / Non-Shared MIMD Architecture
HBukhary
About This Presentation
Details of distributed MIMD architectures (non-shared MIMD architecture, loosely coupled MIMD architecture) and the message passing model. Different types of distributed MIMD systems are clusters and MPPs.
Size: 2 MB
Language: en
Added: Dec 24, 2017
Slides: 37 pages
Slide Content
Distributed Memory Architecture
MS(CS) - I
Hafsa Habib, Syeda Haseeba Khanam, Amber Azhar, Zainab Khalid
Lahore College for Women University, Department of Computer Science
Content
- MIMD processor classification
- Distributed MIMD architecture
- Basic difference between DM-MIMD and SM-MIMD
- Communication techniques of DM-MIMD
- Major classification of DM-MIMD: NUMA, MPP, Cluster
- Pros and cons of DM-MIMD over SM-MIMD architecture
- Scalability
- Issues in scalability
MIMD Architecture: Classification
Non-Shared MIMD Architecture (DM-MIMD)
Non-Shared MIMD Architecture
- Also called Distributed Memory MIMD, Message Passing MIMD, or Loosely Coupled MIMD computers.
- Each processor has its own local memory.
- A memory address on one processor does not map onto other processors; there is no concept of a global address space.
- Each processor operates independently because it has its own local memory. Changes in one processor's local memory have no effect on another processor's local memory; therefore cache synchronization and cache coherency do not apply.
- Inter-process communication is done by message passing.
DM-MIMD vs SM-MIMD
DM-MIMD vs SM-MIMD
DM-MIMD:
- Private physical address space for each processor
- Data must be explicitly assigned to the private address space
- Communication/synchronization over the network by message passing
- The concept of cache coherency does not apply because there is no global address space
SM-MIMD:
- Global address space shared by all processors
- Data is implicitly assigned to the address space
- Processors cooperate by reading/writing the same shared variables
- Communication through a bus
- The concept of cache coherency applies due to the shared global address space
Content
- MIMD processor classification
- Distributed MIMD architecture
- Basic difference between DM-MIMD and SM-MIMD
- Communication technique
- Major classification of DM-MIMD: NUMA, MPP, Cluster
- Pros and cons of DM-MIMD over SM-MIMD architecture
- Scalability
- Issues in scalability
Communication Technique
Communication in DM Architecture
- Requires a communication network to connect inter-processor memory.
- Communication and synchronization are done through the message passing model.
- Processors share data by explicitly sending and receiving information.
- Coordination is built into the message passing primitives: message SEND and message RECEIVE.
Why does DM architecture use the message passing model? In a distributed memory architecture there is no global memory, so it is necessary to move data from one local memory to another by means of message passing.
Message Passing Model
- Communication via Send/Receive through an interconnection network.
- Data is packed into larger packets (messages).
- Send transmits a message to a destination processor.
- Receive indicates that a processor is ready to accept a message from a source processor.
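To make the SEND/RECEIVE primitives above concrete, here is a minimal sketch in C using MPI (the Message Passing Interface mentioned later as cluster middleware). The value sent, the message tag, and the two-process setup are illustrative assumptions, not part of the slides.

/* Compile with: mpicc send_recv.c -o send_recv ; run with: mpirun -np 2 ./send_recv */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                       /* data lives only in rank 0's local memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* explicit SEND to rank 1 */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE); /* explicit RECEIVE */
        printf("Rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}

Rank 0's data exists only in its own local memory until the explicit send, and rank 1 sees it only after the explicit receive, which is exactly the "no global address space" property described above.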
Message Passing Model (cont'd)
- When one process interacts with another, two requirements have to be satisfied: synchronization and communication.
- Synchronization in the message passing model is either asynchronous or synchronous.
- If asynchronous, no acknowledgement is required at either end (sender or receiver); the sender and receiver do not wait for each other and can carry on their own computations while the message transfer takes place.
- If synchronous, an acknowledgement is required; both processes have to wait for each other while the message is transferred (one blocks until the other is ready).
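As a rough illustration of the synchronous vs asynchronous distinction, the sketch below (an assumption on my part, not taken from the slides) uses MPI_Ssend, which blocks until the receiver is ready, and MPI_Isend, which returns immediately so the sender can continue computing and completes the transfer later with MPI_Wait.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, data = 7, recv_data;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Synchronous send: blocks until rank 1 has posted the matching receive. */
        MPI_Ssend(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

        /* Asynchronous send: returns at once; computation can overlap the transfer. */
        MPI_Isend(&data, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &req);
        /* ... the sender could do useful local work here ... */
        MPI_Wait(&req, MPI_STATUS_IGNORE);   /* complete the transfer before reusing 'data' */
    } else if (rank == 1) {
        MPI_Recv(&recv_data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(&recv_data, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 got %d twice\n", recv_data);
    }

    MPI_Finalize();
    return 0;
}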
Pros and Cons of the Message Passing Model
Pros:
- Communication is explicit, so programmers see fewer "performance surprises" than with the implicit communication in cache-coherent SMPs.
- Synchronization is naturally associated with sending messages, reducing the possibility of errors introduced by incorrect synchronization.
- Much easier for hardware designers to design.
Cons:
- Message sending and receiving is much slower.
- It is harder to port a sequential program to a message passing multiprocessor.
Communication in DM Architecture Vs Communication in SM Architecture
Difference: Distributed Memory Architecture vs Shared Memory Architecture
1. Explicit vs implicit communication: DM - explicit, via messages; SM - implicit, via memory operations.
2. Who is responsible for carrying out the communication task: DM - the programmer is responsible for sending and receiving data; SM - sending and receiving are automatic, the system is responsible for placing data in the cache, and the programmer just loads from and stores to memory.
3. Synchronization: DM - automatic; SM - can be achieved using different mechanisms.
4. Protocols: DM - fully under programmer control; SM - hidden within the system.
Content
- MIMD processor classification
- Distributed MIMD architecture
- Basic difference between DM-MIMD and SM-MIMD
- Communication techniques of DM-MIMD
- Major classification of DM-MIMD: NUMA, MPP, Cluster
- Pros and cons of DM-MIMD over SM-MIMD architecture
- Scalability
- Issues in scalability
Classification of Distributed Memory Architecture
Types of Distributed Memory (DM-MIMD) Architecture
- NUMA
- Clusters
- MPP
NUMA (Non-Uniform Memory Access)
NUMA (Non-Uniform Memory Access)
- NUMA is a computer memory design used in multiprocessing where the memory access time depends on the memory location relative to the processor.
- Under NUMA, a processor can access its own local memory faster than non-local memory (memory local to another processor or memory shared between processors).
- The benefits of NUMA are limited to particular workloads, notably on servers where data is often associated strongly with certain tasks or users.
- There are two morals to this performance story. The first is that even a single 32-bit, but already commonplace, processor is starting to push the limits of standard memory performance. The second is that differences even among conventional memory types play a role in overall system performance.
- So it should come as no surprise that NUMA support is now in server operating systems, e.g., Microsoft's Windows Server 2003 and the Linux 2.6 kernel.
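For a concrete, hedged illustration of placing data on a particular NUMA node, the sketch below uses Linux's libnuma (compile and link with -lnuma). The node number and allocation size are illustrative assumptions; the point is only that memory bound to a node is faster for processors on that node than for remote ones.

#include <numa.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    int nodes = numa_num_configured_nodes();
    printf("Configured NUMA nodes: %d\n", nodes);

    /* Allocate 4 MB bound to node 0: accesses from CPUs on node 0 are local (fast);
     * accesses from other nodes go over the interconnect (slower). */
    size_t size = 4 * 1024 * 1024;
    void *buf = numa_alloc_onnode(size, 0);
    if (buf == NULL) {
        fprintf(stderr, "numa_alloc_onnode failed\n");
        return 1;
    }

    /* ... use the buffer from threads pinned to node 0 for best locality ... */

    numa_free(buf, size);
    return 0;
}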
What is a Cluster
- A network of independent computers; each has private memory and its own OS.
- Connected using an I/O system, e.g., Ethernet/switch, Internet.
- The independent computers in a cluster are called nodes (master and computing nodes).
- Cluster middleware is required, e.g., a Message Passing Interface implementation.
- Node management must be considered.
- The cluster appears as a single system to the user.
Clusters
- Clusters split a problem into smaller tasks that are executed concurrently (see the sketch after this slide).
- Why? Absolute physical limits of hardware components; economical reasons (more complex = more expensive); performance limits (doubling the frequency does not double performance); large applications demand too much memory and time.
- Advantages: increased speed and optimized resource utilization, largely independent of the hardware.
- Disadvantages: complex programming models, difficult development.
- Applications: suitable for applications with independent tasks, e.g., supercomputers, web servers, databases, simulations.
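As a small sketch of the "split the problem into smaller tasks" idea, the following MPI program (an illustrative assumption, not taken from the slides) has each node sum its own slice of a series in its private memory and then combines the partial results with a single collective operation.

/* Run with e.g.: mpirun -np 4 ./partial_sum */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, nprocs;
    const long N = 1000000;          /* total number of terms to sum (assumed) */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Each rank works only on its own chunk -- no shared memory involved. */
    long chunk = N / nprocs;
    long start = rank * chunk;
    long end   = (rank == nprocs - 1) ? N : start + chunk;

    double local_sum = 0.0, global_sum = 0.0;
    for (long i = start; i < end; i++)
        local_sum += 1.0 / (i + 1);

    /* Combine the partial sums; communication happens only here. */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Harmonic sum over %ld terms = %f\n", N, global_sum);

    MPI_Finalize();
    return 0;
}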
Clusters
Clusters vs MPP
Similar to MPPs:
- Commodity processors and memory
- Processor performance must be maximized
- Memory hierarchy includes remote memory
- Non-uniform memory access
- No shared memory; message passing
Clusters vs MPPs
Clusters:
- In a cluster, each machine is largely independent of the others in terms of memory, disk, etc.
- The machines are interconnected using some variation on normal networking.
- The cluster exists mostly in the mind of the programmer and in how he or she chooses to distribute the work.
- Best used in servers with multiple independent tasks.
MPPs:
- In a massively parallel processor, there really is only one machine, with thousands of CPUs tightly interconnected with the I/O subsystem.
- MPPs have exotic memory architectures to allow extremely high-speed exchange of intermediate results with neighboring processors.
- MPPs are of use only on algorithms that are embarrassingly parallel.
Content
- MIMD processor classification
- Distributed MIMD architecture
- Basic difference between DM-MIMD and SM-MIMD
- Communication technique
- Major classification of DM-MIMD: NUMA, MPP, Cluster
- Pros and cons of DM-MIMD over SM-MIMD architecture
- Issues in DM architecture
- Scalability
Pros & Cons of DM-MIMD over SM-MIMD
Pros of DM-MIMD over SM-MIMD
DM-MIMD:
- Memory is scalable with the number of processors: increase the number of processors and the size of memory increases proportionately.
- Each processor can rapidly access its own memory without interference and without the overhead incurred in trying to maintain global cache coherency.
- Cost effectiveness: can use commodity, off-the-shelf processors and networking.
SM-MIMD:
- Lack of scalability between memory and CPUs: adding more CPUs geometrically increases traffic on the shared memory-CPU path and geometrically increases traffic associated with cache memory management.
- Expense: it becomes increasingly difficult and expensive to design and produce shared memory machines with ever-increasing numbers of processors.
Cons of DM-MIMD over SM-MIMD
DM-MIMD:
- Non-uniform memory access times: data residing on a remote node takes longer to access than local data.
- The programmer is responsible for many of the details associated with data communication between processors.
SM-MIMD (by comparison):
- Data sharing between tasks is both fast and uniform due to the proximity of memory to CPUs.
- The global address space provides a user-friendly programming perspective on memory.
Issues of DM Architecture
- Latency and bandwidth for accessing distributed memory are the main memory performance issues.
- Efficiency in parallel processing is usually related to the ratio of time spent on calculation versus time spent on communication: the higher the ratio, the higher the performance.
- The problem is even more severe when access to distributed memory is needed, since there is an extra level in the memory hierarchy, with latency and bandwidth that can be much worse than for local memory access.
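One hedged way to see the calculation-to-communication ratio in practice is to time the two phases separately with MPI_Wtime, as in the sketch below; the workload size and the single MPI_Allreduce are illustrative assumptions.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Compute phase: touches only local memory. */
    double t0 = MPI_Wtime();
    double local = 0.0;
    for (long i = 0; i < 10000000L; i++)
        local += (double)i * 0.5;
    double t_calc = MPI_Wtime() - t0;

    /* Communication phase: goes through the interconnect. */
    double t1 = MPI_Wtime();
    double total = 0.0;
    MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double t_comm = MPI_Wtime() - t1;

    if (rank == 0)
        printf("t_calc = %.6f s, t_comm = %.6f s, ratio = %.1f\n",
               t_calc, t_comm, t_comm > 0.0 ? t_calc / t_comm : 0.0);

    MPI_Finalize();
    return 0;
}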
Scalability and its Issues
- A scalable architecture is an architecture that can scale up to meet increased workloads. In other words, if the workload suddenly exceeds the capacity of your existing software + hardware combination, you can scale the system (software + hardware) up to meet the increased workload.
- Scalability to more processors is the key issue.
- Access times to "distant" processors should not be very much slower than access to "nearby" processors, since non-local and collective (all-to-all) communication is important for many programs. This can be a problem for large parallel computers (hundreds or thousands of processors).
- Many different approaches to network topology and switching have been tried in attempts to alleviate this problem.
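Since the slide highlights collective (all-to-all) communication as a scalability pressure point, here is a hedged sketch of that pattern using MPI_Alltoall; the buffer contents and process count are illustrative assumptions. The total traffic grows roughly with the square of the number of processes, which is why network topology matters at scale.

/* Run with e.g.: mpirun -np 4 ./alltoall */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int *sendbuf = malloc(nprocs * sizeof(int));
    int *recvbuf = malloc(nprocs * sizeof(int));
    for (int i = 0; i < nprocs; i++)
        sendbuf[i] = rank * 100 + i;   /* a distinct value for each destination */

    /* Every process exchanges one int with every other process. */
    MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    printf("Rank %d received:", rank);
    for (int i = 0; i < nprocs; i++)
        printf(" %d", recvbuf[i]);
    printf("\n");

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}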
We welcome your questions, suggestions, and comments! Thank you for your attention.