TASK AND DATA PARALLELISM in Computer Science (pptx)

About This Presentation

Data parallelism


Slide Content

TASK AND DATA PARALLELISM, by Tendero Manyonga (m210887)

Task parallelism and data parallelism are two fundamental concepts in parallel computing; each addresses a different class of problems and optimizes performance in a different way.

Task Parallelism In task parallelism, different tasks are performed simultaneously. Each task can be independent and may perform different operations or computations. For example, the GZU web server handles multiple requests from students and staff members; each request can be processed in a separate thread, allowing the server to serve many users at once.
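
A minimal sketch of this idea in Python using concurrent.futures; the request-handling and logging functions here are invented for illustration (the slides' GZU server is only the motivating example). Two unrelated tasks run in separate threads:

from concurrent.futures import ThreadPoolExecutor
def handle_request(user):
    # hypothetical per-request work, e.g. serving one page to one user
    return f"served {user}"
def log_access(entry):
    # a different, independent task that can run at the same time
    return f"logged {entry}"
with ThreadPoolExecutor(max_workers=4) as pool:
    f1 = pool.submit(handle_request, "student_01")  # task 1
    f2 = pool.submit(log_access, "GET /timetable")  # task 2, concurrent with task 1
    print(f1.result(), "|", f2.result())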

Data Parallelism In data parallelism, the same operation is applied to different pieces of distributed data simultaneously. This approach is highly effective when working with large datasets. For example, when processing a large array, each element can be processed in parallel by a different thread, applying the same function to each element (e.g. adding a constant value to each element).
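
A minimal data-parallel sketch, assuming a four-worker process pool; the same function is applied to every element of a list:

from multiprocessing import Pool
def add_ten(x):
    return x + 10                  # the single operation applied to every element
if __name__ == "__main__":
    data = list(range(1000))
    with Pool(processes=4) as pool:
        result = pool.map(add_ten, data)  # same function, different pieces of data
    print(result[:5])              # [10, 11, 12, 13, 14]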

Parallel Algorithm Design Strategies Parallel algorithms are computational methods that can execute multiple processes simultaneously to solve a problem more efficiently.

Types of parallelism: 1. Task Parallelism: different tasks are executed in parallel. 2. Data Parallelism: the same operation is applied to different pieces of data simultaneously.

Divide and Conquer The divide and conquer strategy involves dividing a problem into subproblems, recursively solving them, and then recombining the results for the final answer. Divide and conquer is efficient for large-scale problems with a predictable structure.

Steps: 1. Divide: split the original problem into smaller subproblems. 2. Conquer: solve each subproblem recursively; if the subproblems are small enough, solve them directly. 3. Combine: merge the solutions of the subproblems to form a solution to the original problem.

Advantages: Reduced complexity: by tackling smaller problems, it can simplify the solution process. Parallelism: subproblems can often be solved in parallel, leveraging modern multi-core processors for improved performance.

Disadvantages: Overhead: recursive calls can introduce overhead, especially if not optimized (e.g. excessive stack usage). Base case requirement: a clear base case is needed to stop the recursion, which can sometimes complicate the design.

Examples: 1. Merge Sort. Divide: split the array into two halves. Conquer: recursively sort each half. Combine: merge the two sorted halves into a single sorted array.
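
A plain Python version of merge sort that follows these three steps directly:

def merge_sort(arr):
    if len(arr) <= 1:                  # base case: already sorted
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])       # divide + conquer the left half
    right = merge_sort(arr[mid:])      # divide + conquer the right half
    merged, i, j = [], 0, 0            # combine: merge the two sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
print(merge_sort([5, 2, 9, 1, 7]))     # [1, 2, 5, 7, 9]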

2. Quick Sort. Divide: select a pivot element and partition the array into elements less than and greater than the pivot. Conquer: recursively sort the subarrays. Combine: concatenate the sorted subarrays with the pivot to form the final sorted array.
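
A simple, not-in-place sketch that uses the first element as the pivot (pivot choice is a design decision; production implementations usually pick it more carefully):

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot, rest = arr[0], arr[1:]      # divide: partition around the pivot
    less = [x for x in rest if x < pivot]
    greater = [x for x in rest if x >= pivot]
    return quick_sort(less) + [pivot] + quick_sort(greater)  # combine
print(quick_sort([5, 2, 9, 1, 7]))     # [1, 2, 5, 7, 9]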

3. Binary Search. Divide: split the sorted array in half. Conquer: determine whether the target value lies in the left or right half and search only that half; the combine step is trivial, since the result is returned directly once the target is found.
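
A minimal iterative version in Python:

def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2           # divide: split the range in half
        if arr[mid] == target:
            return mid                 # found: return the index
        elif arr[mid] < target:
            lo = mid + 1               # conquer: search the right half
        else:
            hi = mid - 1               # conquer: search the left half
    return -1                          # target not present
print(binary_search([1, 2, 5, 7, 9], 7))  # prints 3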

Pipeline Parallelism Pipelining processes data through a series of stages, where each stage performs a specific task. For example, in image processing, one stage might filter the image while another stage enhances colours. Pipelining increases throughput by allowing multiple data items to be processed simultaneously at different stages. It is effective in applications involving streaming data, such as video processing, audio processing, and real-time data analysis.
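
A minimal two-stage sketch in Python, with invented stage names standing in for the "filter" and "enhance" stages above; a queue acts as the buffer between the stages, so stage 2 can work on earlier items while stage 1 produces new ones:

import queue, threading
q12 = queue.Queue()       # buffer between stage 1 and stage 2
DONE = object()           # sentinel marking the end of the stream
def stage_filter(items):
    for x in items:
        q12.put(x * 2)    # stand-in for a "filtering" step
    q12.put(DONE)
def stage_enhance():
    while (x := q12.get()) is not DONE:
        print("enhanced:", x + 1)  # stand-in for an "enhancement" step
t1 = threading.Thread(target=stage_filter, args=(range(5),))
t2 = threading.Thread(target=stage_enhance)
t1.start(); t2.start()
t1.join(); t2.join()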

Key Characteristics Stages: the overall process is divided into distinct stages, each responsible for a part of the computation. Concurrent processing: while one stage processes data, other stages can concurrently process different data items. Throughput improvement: this approach enhances throughput by allowing a continuous flow of data through the pipeline, reducing idle time for processing units.

Advantages: Increased efficiency: by overlapping the processing of multiple data items, pipeline parallelism can significantly improve overall processing speed. Modularity: each stage can be developed and optimized independently, allowing easier maintenance and enhancement. Scalability: new stages can be added, or existing stages modified, without disrupting the entire system.

Disadvantages: Latency: if one stage is slower than the others, it can create a bottleneck that delays the entire pipeline. Design complexity: designing a pipeline requires careful consideration of data dependencies and synchronization between stages. Buffer requirements: buffers may be needed to hold data between stages, increasing memory usage.

Master-Worker Paradigm The master-worker paradigm is a model for parallel computing that organizes tasks between a central control unit (the master) and multiple worker nodes (the workers) that execute those tasks. It is widely used in distributed systems and high-performance computing.

KEY CHARACTERISTICS Master node: the master is responsible for managing the overall computation, including task distribution, coordination, and result collection. Worker nodes: workers execute the assigned tasks; they operate independently and report back to the master once their tasks are completed. Task distribution: the master distributes tasks to workers based on their availability and workload, optimizing resource utilization.
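
One minimal way to sketch the pattern in Python is with multiprocessing.Pool, where the main process plays the master, handing out work items and collecting results (the squaring task is invented for illustration):

from multiprocessing import Pool
def worker_task(n):
    return n * n                   # each worker computes independently
if __name__ == "__main__":
    tasks = range(10)              # work items the master hands out
    with Pool(processes=4) as workers:
        results = workers.map(worker_task, tasks)  # master collects the results
    print(results)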

Advantages: Scalability: the system scales easily by adding more workers to handle increased workloads, without major changes to the master. Simplicity: it simplifies the programming model by separating task management from execution. Fault tolerance: if a worker fails, the master can reassign its task to another available worker, improving system reliability.

Disadvantages: Bottleneck at the master: the master can become a performance bottleneck if it is overloaded with task management or communication. Communication overhead: frequent communication between the master and workers can introduce latency, especially in large systems. Load imbalance: if tasks are not evenly distributed, some workers may finish early while others are still processing, leading to inefficiency.

Use cases: distributed data processing (e.g. MapReduce), load balancing in web servers, and scientific simulations that require parallel computing.

Load Balancing Techniques Load balancing is a critical technique in computer science used to distribute workloads across multiple computing resources, such as servers, clusters, network links, or CPUs.

1. Round Robin Requests are distributed evenly across all servers in a circular order. It’s simple and effective for servers with similar capabilities. 2. Least Connections New requests are sent to the server with the fewest active connections. This is useful when servers carry varying loads. 3. Least Response Time Directs traffic to the server that has the lowest response time, optimizing for performance. 4. IP Hashing Requests are distributed based on a hash of the client’s IP address, ensuring that a user is consistently routed to the same server. (A short code sketch of several of these policies follows the list below.)

5. Weighted Load Balancing Servers are assigned weights based on their capacity, so more powerful servers receive a larger proportion of the requests. 6. Random Requests are distributed randomly among the servers. This can be effective in certain scenarios but may lead to uneven loads. 7. Content-Based Routing Requests are routed based on the content they are requesting, which can optimize performance for specific types of requests. 8. Health Checks Load balancers regularly check the health of servers and route traffic only to those that are operational.

9. Session Persistence (Sticky Sessions) Ensures that a user is always directed to the same server for the duration of their session, which is useful for applications that rely on session state. 10. Global Server Load Balancing (GSLB) Distributes traffic across multiple geographical locations, optimizing for location, latency, and availability.
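
A simplified Python sketch of a few of these selection policies (server names and connection counts are invented; a real load balancer tracks live state and health, which this omits):

import itertools, random
servers = ["srv-a", "srv-b", "srv-c"]           # invented server names
rr = itertools.cycle(servers)                   # 1. round robin: circular order
print([next(rr) for _ in range(5)])             # srv-a, srv-b, srv-c, srv-a, srv-b
active = {"srv-a": 7, "srv-b": 2, "srv-c": 5}   # 2. least connections
print(min(active, key=active.get))              # srv-b (fewest active connections)
def pick_by_ip(ip):                             # 4. IP hashing (stable within one run;
    return servers[hash(ip) % len(servers)]     #    Python randomizes str hashes per process)
print(pick_by_ip("192.0.2.10") == pick_by_ip("192.0.2.10"))  # True
weights = {"srv-a": 5, "srv-b": 1, "srv-c": 2}  # 5. weighted: capacity-proportional choice
print(random.choices(list(weights), weights=weights.values(), k=5))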

Applications: web servers (e.g. nginx, Apache), cloud computing environments (e.g. AWS Elastic Load Balancing), distributed databases, and microservices architectures.

Parallel Programming Paradigm Parallel programming enables the simultaneous execution of multiple computations, enhancing performance and efficiency in computing tasks.

KEY CONCEPTS Concurrency vs. Parallelism: concurrency refers to the ability to manage multiple tasks at once, which may not necessarily run simultaneously, while parallelism involves executing multiple tasks simultaneously, typically using multiple processors. Granularity: the size of the tasks being executed in parallel; fine granularity involves many small tasks, while coarse granularity involves fewer, larger tasks.

Synchronization Mechanisms used to manage the execution order of concurrent tasks to avoid conflicts, such as race conditions. Common synchronization techniques include locks, semaphores, and barriers.
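
A small Python illustration of why synchronization matters: four threads increment a shared counter, and the lock prevents interleaved updates from being lost (a race condition):

import threading
counter = 0
lock = threading.Lock()
def increment(times):
    global counter
    for _ in range(times):
        with lock:                 # only one thread mutates the counter at a time
            counter += 1
threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)                     # 400000 with the lock; possibly less without it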

Parallel Programming Models 1. Shared Memory Model: all processors share a single memory space, and threads communicate by reading and writing shared variables; this model is common in multi-threaded applications. Examples: OpenMP, Pthreads.

2. Distributed Memory Model: each processor has its own private memory, and processes communicate by passing messages; this model is often used in clusters and supercomputers. Example: MPI (Message Passing Interface). 3. Data Parallelism: focuses on distributing subsets of data across multiple processors, where the same operation is performed on each subset; this is common in scientific computing and image processing. 4. Task Parallelism: involves distributing tasks (functions or methods) across multiple processors; each task can operate on different data or perform different operations. Example: the Fork/Join model. 5. Pipeline Parallelism: tasks are divided into stages, with each stage executed in parallel; the output of one stage is the input to the next, creating a pipeline of processing.

Tools and Libraries OpenMP: a widely used API for multi-platform shared-memory multiprocessing programming in C, C++, and Fortran. MPI: a standard for message passing used in distributed-memory systems. CUDA: a parallel computing platform and application programming interface model created by NVIDIA for general computing on GPUs. Threading libraries: such as std::thread in C++ and threading in Python, which facilitate multi-threaded programming.

Applications: data science, machine learning, scientific computing, and graphics rendering.