TASK AND DATA PARALLELISM in Computer Science
Slide Content
TASK AND DATA PARALLELISM
Tendero Manyonga, m210887
Task parallelism and data parallelism are two fundamental concepts in parallel computing, each addressing different types of problems and optimizing performance in different ways.
Task Parallelism
Task parallelism is where different tasks are performed simultaneously. Each task can be independent and may perform different operations or computations. For example, the GZU web server handles multiple requests from students and staff members; each request can be processed in a separate thread, allowing the server to handle many users at once.
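As a rough illustration, here is a minimal Python sketch of task parallelism using a thread pool. The handler functions and request data are hypothetical stand-ins for the different kinds of work a server might do; they are not part of the original slides.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical handlers: each task performs a different operation (task parallelism).
def handle_login(user):
    return f"login processed for {user}"

def handle_download(filename):
    return f"download started for {filename}"

def handle_search(query):
    return f"search results for '{query}'"

if __name__ == "__main__":
    # Each request is handed to a separate worker thread,
    # much like a web server serving many users at once.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            pool.submit(handle_login, "student_01"),
            pool.submit(handle_download, "notes.pdf"),
            pool.submit(handle_search, "parallel computing"),
        ]
        for f in futures:
            print(f.result())
```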
Data Parallelism
In data parallelism, the same operation is applied to different pieces of distributed data simultaneously. This approach is highly effective when working with large datasets. For example, when processing a large array, each element can be processed in parallel by a different thread, applying the same function to each element (e.g. adding a constant value to each element).
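A minimal Python sketch of data parallelism, assuming a process pool that applies the same function to every element of an array; the constant and the array contents are purely illustrative.

```python
from multiprocessing import Pool

def add_constant(x, c=10):
    # The same operation is applied to every element (data parallelism).
    return x + c

if __name__ == "__main__":
    data = list(range(1_000))
    with Pool(processes=4) as pool:
        # Each worker process applies add_constant to its own chunk of the array.
        result = pool.map(add_constant, data)
    print(result[:5])  # [10, 11, 12, 13, 14]
```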
Parallel Algorithm Design Strategies
Parallel algorithms are computational methods that can execute multiple processes simultaneously to solve a problem more efficiently.
Types of Parallelism
1. Task Parallelism: different tasks are executed in parallel.
2. Data Parallelism: the same operation is applied to different pieces of data simultaneously.
Divide and Conquer
The divide and conquer strategy involves dividing the problem into sub-problems, solving them recursively, and then recombining their solutions for the final answer. Divide and conquer is efficient for large-scale problems with a predictable structure.
Steps
1. Divide: split the original problem into smaller sub-problems.
2. Conquer: solve each sub-problem recursively; if the sub-problems are small enough, solve them directly.
3. Combine: merge the solutions of the sub-problems to form a solution to the original problem.
Advantages
Reduced complexity: by tackling smaller problems, it can simplify the solution process.
Parallelism: sub-problems can often be solved in parallel, leveraging modern multi-core processors for improved performance.
Disadvantages
Overhead: recursive calls can introduce overhead, especially if not optimized (e.g. excessive stack usage).
Base case requirement: it requires a clear base case to stop the recursion, which can sometimes complicate the design.
Examples
1. Merge Sort
Divide: split the array into two halves.
Conquer: recursively sort each half.
Combine: merge the two sorted halves into a single sorted array.
2. Quick Sort
Divide: select a pivot element and partition the array into elements less than and greater than the pivot.
Conquer: recursively sort the sub-arrays.
Combine: the sorted sub-arrays are concatenated with the pivot to form the final sorted array.
3. Binary Search
Divide: split the sorted array in half.
Conquer: determine whether the target value lies in the left or right half and recurse into that half.
Combine: trivial, since the result of the chosen half is the final answer.
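Building on the merge sort example above, here is a minimal Python sketch of divide and conquer with the two halves sorted in parallel. The use of a two-worker process pool and the sample array are illustrative choices, not prescribed by the slides.

```python
from concurrent.futures import ProcessPoolExecutor

def merge(left, right):
    # Combine: merge two sorted lists into one sorted list.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def merge_sort(arr):
    # Divide and conquer, sequentially.
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    return merge(merge_sort(arr[:mid]), merge_sort(arr[mid:]))

def parallel_merge_sort(arr):
    # Divide once at the top level and sort the two halves in separate processes.
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    with ProcessPoolExecutor(max_workers=2) as pool:
        left, right = pool.map(merge_sort, [arr[:mid], arr[mid:]])
    return merge(left, right)

if __name__ == "__main__":
    print(parallel_merge_sort([5, 2, 9, 1, 7, 3, 8, 4]))
```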
Pipeline Parallelism
Pipelining processes data through a series of stages, where each stage performs a specific task. For example, in image processing, one stage might filter the image while another stage enhances colours. Pipelining increases throughput by allowing multiple data items to be processed simultaneously at different stages. It is effective in applications involving streaming data, such as video processing, audio processing and real-time data analysis.
Key Characteristics
Stages: the overall process is divided into distinct stages, each responsible for a part of the computation.
Concurrent processing: while one stage processes data, other stages can concurrently process different data items.
Throughput improvement: this approach enhances throughput by allowing a continuous flow of data through the pipeline, reducing idle time for processing units.
Advantages
Increased efficiency: by overlapping the processing of multiple data items, pipeline parallelism can significantly improve the overall processing speed.
Modularity: each stage can be developed and optimized independently, allowing for easier maintenance and enhancements.
Scalability: new stages can be added or existing stages can be modified without disrupting the entire system.
Disadvantages
Latency: if one stage is slower than the others, it can create a bottleneck, delaying the entire pipeline.
Complexity in design: designing a pipeline requires careful consideration of data dependencies and synchronization between stages.
Buffer requirements: buffers may be needed to hold data between stages, increasing memory usage.
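A minimal Python sketch of a two-stage pipeline, based on the image-processing example above; the stage functions and frame names are placeholders, and the queues act as the buffers between stages mentioned under disadvantages.

```python
import queue
import threading

def stage(in_q, out_q, func):
    # Each stage repeatedly takes an item, transforms it, and passes it on.
    while True:
        item = in_q.get()
        if item is None:      # sentinel: end of the stream, shut the stage down
            out_q.put(None)
            break
        out_q.put(func(item))

def filter_image(img):
    return f"filtered({img})"

def enhance_colours(img):
    return f"enhanced({img})"

if __name__ == "__main__":
    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=stage, args=(q1, q2, filter_image)),
        threading.Thread(target=stage, args=(q2, q3, enhance_colours)),
    ]
    for t in threads:
        t.start()
    for img in ["frame1", "frame2", "frame3"]:
        q1.put(img)           # stream data into the first stage
    q1.put(None)              # signal end of the stream
    while (result := q3.get()) is not None:
        print(result)
    for t in threads:
        t.join()
```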
Master-Worker Paradigm
The master-worker paradigm is a model for parallel computing that organizes tasks between a central control unit (the master) and multiple worker nodes (workers) that execute the tasks. It is widely used in distributed systems and high-performance computing.
Key Characteristics
Master node: the master node is responsible for managing the overall computation, including task distribution, coordination and result collection.
Worker nodes: workers are responsible for executing the assigned tasks; they operate independently and report back to the master once their tasks are completed.
Task distribution: the master distributes tasks to workers based on their availability and workload, optimizing resource utilization.
Advantages
Scalability: easily scales by adding more workers to handle increased workloads without major changes to the master.
Simplicity: simplifies programming models by separating task management from execution.
Fault tolerance: if a worker fails, the master can reassign its task to other available workers, improving system reliability.
Disadvantages
Bottleneck at the master: the master can become a performance bottleneck if it is overloaded with task management or communication.
Communication overhead: frequent communication between the master and workers can introduce latency, especially in large systems.
Load imbalance: if tasks are not evenly distributed, some workers may finish early while others are still processing, leading to inefficiency.
Use Cases
Distributed data processing (e.g. MapReduce)
Load balancing in web servers
Scientific simulations that require parallel computing
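A minimal Python sketch of the master-worker paradigm using a task queue and a result queue; the squaring task is a placeholder for real work, and the number of workers is an arbitrary choice.

```python
from multiprocessing import Process, Queue

def worker(task_q, result_q):
    # Worker: repeatedly take a task from the master, execute it, report back.
    while True:
        task = task_q.get()
        if task is None:           # sentinel from the master: stop
            break
        result_q.put((task, task * task))

if __name__ == "__main__":
    task_q, result_q = Queue(), Queue()
    workers = [Process(target=worker, args=(task_q, result_q)) for _ in range(3)]
    for w in workers:
        w.start()

    tasks = list(range(10))
    for t in tasks:                # master distributes tasks
        task_q.put(t)
    for _ in workers:              # one sentinel per worker
        task_q.put(None)

    # Master collects one result per task.
    results = dict(result_q.get() for _ in tasks)
    print(results)

    for w in workers:
        w.join()
```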
Load Balancing Techniques
Load balancing is a critical technique in computer science used to distribute workloads across multiple computing resources, such as servers, clusters, network links, or CPUs.
1. Round Robin: requests are distributed evenly across all servers in a circular order (see the sketch after this list). It is simple and effective for servers with similar capabilities.
2. Least Connections: new requests are sent to the server with the fewest active connections. This is useful when servers may have varying loads.
3. Least Response Time: directs traffic to the server that has the lowest response time, optimizing for performance.
4. IP Hashing: requests are distributed based on a hash of the client's IP address, ensuring that a user is consistently routed to the same server.
5. Weighted Load Balancing: servers are assigned weights based on their capacity; more powerful servers receive a larger proportion of the requests.
6. Random: requests are distributed randomly among the servers. This can be effective in certain scenarios but may lead to uneven loads.
7. Content-Based Routing: requests are routed based on the content they are requesting. This can help optimize performance for specific types of requests.
8. Health Checks: load balancers regularly check the health of servers and only route traffic to those that are operational.
9. Session Persistence (Sticky Sessions): ensures that a user is always directed to the same server for the duration of their session, which is useful for applications that rely on session state.
10. Global Server Load Balancing (GSLB): distributes traffic across multiple geographical locations, optimizing for location, latency, and availability.
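As referenced above, here is a minimal Python sketch of three of these selection strategies (round robin, least connections, and random); the server names and connection counts are invented for illustration, with no real networking involved.

```python
import itertools
import random

servers = ["server_a", "server_b", "server_c"]      # hypothetical backend pool
active_connections = {s: 0 for s in servers}

# 1. Round robin: cycle through the servers in a fixed circular order.
_rr = itertools.cycle(servers)
def round_robin():
    return next(_rr)

# 2. Least connections: pick the server with the fewest active connections.
def least_connections():
    return min(servers, key=lambda s: active_connections[s])

# 6. Random: pick any server.
def random_choice():
    return random.choice(servers)

if __name__ == "__main__":
    for _ in range(5):
        chosen = least_connections()
        active_connections[chosen] += 1   # simulate assigning a new request
        print("round robin:", round_robin(),
              "| least connections:", chosen,
              "| random:", random_choice())
```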
Parallel Programming Paradigm
Parallel programming is a paradigm that enables the simultaneous execution of multiple computations, enhancing performance and efficiency in computing tasks.
Key Concepts
Concurrency vs. Parallelism: concurrency refers to the ability to manage multiple tasks at once, which may not necessarily run simultaneously; parallelism involves executing multiple tasks simultaneously, typically using multiple processors.
Granularity: refers to the size of the tasks being executed in parallel. Fine granularity involves many small tasks, while coarse granularity involves fewer, larger tasks.
Synchronization: mechanisms used to manage the execution order of concurrent tasks to avoid conflicts such as race conditions. Common synchronization techniques include locks, semaphores, and barriers.
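A minimal Python sketch of synchronization with a lock; the shared counter and thread counts are illustrative. Without the lock, the concurrent increments would race and the final count would be unpredictable.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # Without the lock, this read-modify-write would be a race condition.
        with lock:
            counter += 1

if __name__ == "__main__":
    threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)   # always 400000 with the lock in place
```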
Parallel Programming Models
1. Shared Memory Model: all processors share a single memory space. Threads communicate by reading and writing to shared variables. This model is common in multi-threaded applications. Examples: OpenMP, Pthreads.
2. Distributed Memory Model: each processor has its own private memory. Processes communicate by passing messages. This model is often used in clusters and supercomputers. Example: MPI (Message Passing Interface).
3. Data Parallelism: focuses on distributing subsets of data across multiple processors, where the same operation is performed on each subset. This is common in scientific computing and image processing.
4. Task Parallelism: involves distributing tasks (functions or methods) across multiple processors. Each task can operate on different data or perform different operations. Example: the fork/join model.
5. Pipeline Parallelism: tasks are divided into stages, with each stage executed in parallel. The output of one stage is the input to the next, creating a pipeline of processing.
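A minimal Python sketch of the message-passing style used in the distributed memory model, with a pipe between two processes in the spirit of MPI's send and receive; the data and the summing task are illustrative, and this is plain multiprocessing rather than MPI itself.

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # Each process has its own private memory; data is exchanged only via messages.
    data = conn.recv()               # receive a message from the parent process
    conn.send(sum(data))             # send a result back
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send([1, 2, 3, 4])   # message passing, akin to an MPI send
    print(parent_conn.recv())        # 10, received back from the worker
    p.join()
```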
Tools and Libraries
OpenMP: a widely used API for multi-platform shared-memory multiprocessing programming in C, C++, and Fortran.
MPI: a standard for message passing used in distributed memory systems.
CUDA: a parallel computing platform and application programming interface model created by NVIDIA for general computing on GPUs.
Threading libraries: such as std::thread in C++ and threading in Python, which facilitate multi-threaded programming.
Applications
Data science
Machine learning
Scientific computing
Graphics rendering