Parallel and distributed computing lecture 4

ahsanraees576 2 views 57 slides Oct 28, 2025


INTRODUCTION TO DISTRIBUTED SYSTEMS Engr. Muhammad Ahsan Raees

INTRODUCTION TO DISTRIBUTED SYSTEMS
• Definition
• Motivation for distributed systems
• Architectural categories
• Characteristics, issues, goals
• Advantages and disadvantages

DEFINITION
A distributed system is a collection of independent computers, interconnected via a network, that can collaborate on a task. It can be characterized as a collection of multiple autonomous computers that communicate over a communication network and have the following features:
• No common physical clock
• Enhanced reliability
• Increased performance/cost ratio
• Access to geographically remote data and resources
• Scalability

DEFINITION (continued)
A distributed system is a collection of independent entities that cooperate to solve a problem that cannot be solved individually. In essence, it is a collection of computers. The computers in a distributed computing system (DCS) share neither a common memory nor a common physical clock. The only way they can communicate is by message passing, which requires a communication network.

Definition of a Distributed System
• Tanenbaum: a distributed system is a collection of independent computers that appears to its users as a single coherent system.
• Lamport: a distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.

Overview
• A distributed system connects autonomous processors by a communication network.
• The software components that run on each computer use the local operating system and network protocol stack.
• The distributed software is termed middleware.
• A distributed execution is the execution of processes across the distributed system to collectively achieve a common goal.

Centralized system
In a centralized system, all data and computational resources are kept and controlled in a single central place, such as a server. Applications and users connect to this hub to access and handle data. This configuration is easy to maintain and secure, but the central server can become a bottleneck if too many users access it simultaneously, and the whole system is affected if that server malfunctions.

Motivation for Distributed Systems
• Inherently distributed computation: in many applications, such as money transfer in banking or reaching consensus among geographically distant parties, the computation is inherently distributed.
• Resource sharing: sharing resources such as peripherals and complete data sets.
• Access to geographically remote data and resources, such as a bank database or a supercomputer.
• Enhanced reliability: resources and executions can be replicated to improve reliability.

Architectures of Distributed Systems
• Client-server architecture
• Peer-to-peer (P2P) architecture
• Three-tier architecture
• Microservices architecture
• Service-oriented architecture (SOA) (software architecture)
• Event-driven architecture (software architecture)

Client-Server Architecture
In this setup, servers provide resources or services, and clients request them. Clients and servers communicate over a network.
Example: web applications, where browsers (clients) request pages from web servers.
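The request/reply pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not a real web server: `socket.socketpair()` stands in for the network link, and the `PAGES` table, its paths, and the function names are hypothetical.

```python
import socket
import threading

# Hypothetical in-memory "site"; paths and page bodies are made up for illustration.
PAGES = {"/index.html": "Welcome", "/about.html": "About us"}


def serve_one_request(conn: socket.socket) -> None:
    """Server side: read one request, reply with the page or an error text."""
    with conn:
        path = conn.recv(1024).decode().strip()
        body = PAGES.get(path, "404 Not Found")
        conn.sendall(body.encode())


def request_page(path: str) -> str:
    """Client side: send a request and wait for the server's reply."""
    # socketpair() returns two connected endpoints, standing in for a network link.
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_one_request, args=(server_end,))
    server.start()
    with client_end:
        client_end.sendall((path + "\n").encode())
        reply = client_end.recv(1024).decode()
    server.join()
    return reply
```

The client only knows the request format and waits for a reply, while the server owns the resource, which is the division of roles the slide describes.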

Peer-to-Peer (P2P) Architecture
Each node, or "peer," in the network acts as both a client and a server, and peers share resources directly with each other.
Example: file-sharing networks like BitTorrent, where files are shared between users without a central server.

Three-Tier Architecture
This model has three layers: presentation (user interface), application (business logic), and data (database). The layers are separated to allow easier scaling and maintenance.
Example: many web applications use this model to separate user interfaces, logic processing, and data storage.

Microservices Architecture
The application is split into small, independent services, each handling specific functions. These services communicate over a network, often using REST APIs or messaging.
Example: modern web applications like Netflix or Amazon, where different services handle user accounts, orders, and recommendations independently.

Service-Oriented Architecture (SOA)
Similar to microservices, SOA organizes functions as services. However, SOA typically uses an enterprise service bus (ESB) to manage communication between services.
Example: large enterprise applications in finance or government, where different services handle various aspects of business processes.

Event-Driven Architecture
Components interact by sending and responding to events rather than by direct requests. An event triggers specific actions or processes in various parts of the system.
Example: real-time applications like IoT systems, where sensors trigger actions based on detected events.
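As a rough sketch of the idea, the Python snippet below implements a tiny in-process event bus. The `EventBus` class, the `temperature_high` event name, and the handlers are all hypothetical; a real system would typically deliver events through a message broker across machines.

```python
from collections import defaultdict


class EventBus:
    """Routes published events to every handler subscribed to that event type."""

    def __init__(self):
        # Maps an event type (str) to the list of handler callables for it.
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The publisher does not know who reacts; it only announces the event.
        for handler in self._handlers[event_type]:
            handler(payload)


actions = []
bus = EventBus()
bus.subscribe("temperature_high", lambda e: actions.append(f"fan on in {e['room']}"))
bus.subscribe("temperature_high", lambda e: actions.append(f"alert sent for {e['room']}"))

# A (simulated) sensor detects an event; both subscribers react independently.
bus.publish("temperature_high", {"room": "lab", "celsius": 41})
```

The sensor code never calls the fan or the alerting component directly, so new reactions can be added by subscribing another handler.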

Architectural Categories
Computer architectures consisting of multiple interconnected processors are basically of two types:
1) Tightly coupled systems
2) Loosely coupled systems

TIGHTLY COUPLED SYSTEMS
In these systems, there is a single system-wide primary memory (address space) that is shared by all the processors. Tightly coupled systems are usually referred to as parallel processing systems.
[Figure: several CPUs connected through interconnection hardware to a system-wide shared memory.]

LOOSELY COUPLED SYSTEMS
In these systems, the processors do not share memory, and each processor has its own local memory. Loosely coupled systems are referred to as distributed computing systems, or simply distributed systems.
[Figure: several CPUs, each with its own local memory, connected by a communication network.]

CHARACTERISTICS OF DISTRIBUTED SYSTEMS
• Concurrency
• No global clock
• Independent failures
• More reliable
• Fault tolerant
• Scalable

EXAMPLES OF DISTRIBUTED SYSTEMS
• Database management system
• Automatic teller machine network
• Internet/World-Wide Web
• Mobile and ubiquitous computing

DATABASE MANAGEMENT SYSTEM

AUTOMATIC TELLER MACHINE NETWORK

INTERNET
[Figure: a portion of the Internet. Labels: intranet, ISP, backbone, satellite link, desktop computer, server, network link.]

WORLD-WIDE-WEB

WEB SERVERS AND WEB BROWSERS
[Figure: browsers reach web servers (www.google.com, www.uu.se, www.w3c.org) over the Internet. Example URLs: http://www.google.com/search?q=lyu, http://www.uu.se/, and http://www.w3c.org/Protocols/Activity.html, which refers to Protocols/Activity.html in the file system of www.w3c.org.]

MOBILE AND UBIQUITOUS COMPUTING
[Figure: portable devices in a distributed system. Labels: Internet, host intranet, home intranet, host site, wireless LAN, GSM/GPRS gateway, laptop, mobile phone, printer, camera.]

Distributed System
A distributed system can be organized as middleware. The middleware layer extends over multiple machines and offers each application the same interface.

GOALS: COMMON CHARACTERISTICS
• Making resources accessible
• Openness
• Transparency
• Security
• Scalability
• Failure handling
• Concurrency
• Heterogeneity

Making resources accessible
The main goal of a distributed system is to make it easy for users (and applications) to access remote resources and to share them in a controlled and efficient way. Resources can be just about anything, but typical examples include printers, computers, storage facilities, data, files, Web pages, and networks. One reason to share resources is economics.

OPENNESS
An open distributed system is a system that offers services according to standard rules that describe the syntax and semantics of those services.
• Detailed interfaces of components need to be published.
• New components have to be integrated with existing components.
• An open distributed system should also be extensible.
• Differences in data representation of interface types on different processors (of different vendors) have to be resolved.

TRANSPARENCY
Distributed systems should be perceived by users and application programmers as a whole rather than as a collection of cooperating components. Transparency is the ability to hide the fact that processes and resources are distributed. It has several aspects, each representing a property that distributed systems should have.

Transparency in a Distributed System

ACCESS TRANSPARENCY
Enables local and remote information objects to be accessed using identical operations.
Examples: file system operations in NFS, navigation in the Web, SQL queries.

LOCATION TRANSPARENCY
Enables information objects to be accessed without knowledge of their location.
Examples: file system operations in NFS, pages in the Web, tables in distributed databases.

CONCURRENCY TRANSPARENCY
Enables several processes to operate concurrently using shared information objects without interference between them.
Examples: automatic teller machine network, database management system.

REPLICATION TRANSPARENCY
Enables multiple instances of information objects to be used to increase reliability and performance without users or application programs knowing about the replicas.
Examples: distributed DBMS, mirroring of Web pages.

FAILURE TRANSPARENCY
Enables the concealment of faults, allowing users and applications to complete their tasks despite the failure of other components. Partial failure transparency is achievable, but complete failure transparency is not possible.
Example: database management system.

MIGRATION TRANSPARENCY
Allows the movement of information objects within a system without affecting the operations of users or application programs.
Relocation transparency: a system supports relocation transparency when resources can be relocated while they are being accessed without the user or application noticing anything.

PERFORMANCE TRANSPARENCY
Allows the system to be reconfigured to improve performance as loads vary. Load should be evenly distributed among all the machines.
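One simple way to picture "evenly distributed load" is a greedy placement that always gives the next task to the least-loaded machine. The Python sketch below is only an illustration of that heuristic, with hypothetical machine names and task costs; it is not a description of how any particular system rebalances.

```python
import heapq


def assign_tasks(task_costs, machines):
    """Greedy balancing: place each task (largest first) on the least-loaded machine."""
    # Heap entries are (current_load, machine_name); the smallest load pops first.
    heap = [(0, m) for m in machines]
    heapq.heapify(heap)
    placement = {m: [] for m in machines}
    for cost in sorted(task_costs, reverse=True):
        load, machine = heapq.heappop(heap)
        placement[machine].append(cost)
        heapq.heappush(heap, (load + cost, machine))
    return placement


# Hypothetical workload: six tasks with made-up costs, two machines.
placement = assign_tasks([5, 3, 3, 2, 2, 1], ["m1", "m2"])
loads = {machine: sum(costs) for machine, costs in placement.items()}
```

Applications submit tasks without naming a machine, so the placement can change as loads vary without any change on the application side.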

SCALING TRANSPARENCY
Allows the system and applications to expand in scale without change to the system structure or the application algorithms.
Examples: World-Wide Web, distributed database.

HETEROGENEITY
Variety and differences in:
• Networks
• Computer hardware
• Operating systems
• Programming languages
• Implementations by different developers

SECURITY
In a distributed system, clients send requests to access data managed by servers and resources in the network, for example:
• Doctors requesting records from hospitals
• Users purchasing products through electronic commerce
Security is required for:
• Concealing the contents of messages (security and privacy)
• Identifying a remote user or other agent correctly (authentication)
New challenges:
• Denial-of-service attacks
• Security of mobile code

FAILURE HANDLING (FAULT TOLERANCE)
Hardware, software, and networks fail! Distributed systems must maintain availability even at low levels of hardware, software, and network reliability. Fault tolerance is achieved by:
• Recovery
• Redundancy
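Redundancy can be illustrated with a read that falls back to another replica when one is down. The Python sketch below is a minimal example under stated assumptions: replicas are modelled as plain functions, and `ReplicaDown`, `make_replica`, and `read_with_failover` are hypothetical names introduced for this illustration.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve the request."""


def make_replica(data, up=True):
    """Build a stand-in replica: a function that returns data[key] or fails."""
    def read(key):
        if not up:
            raise ReplicaDown("replica unavailable")
        return data[key]
    return read


def read_with_failover(replicas, key):
    """Try each replica in turn; succeed as long as at least one is up."""
    last_error = None
    for read in replicas:
        try:
            return read(key)
        except ReplicaDown as err:
            last_error = err  # remember the failure and try the next copy
    raise ReplicaDown("all replicas failed") from last_error


# Hypothetical data: the first replica is down, the second still answers.
records = {"patient-7": "blood type O+"}
replicas = [make_replica(records, up=False), make_replica(records)]
value = read_with_failover(replicas, "patient-7")
```

The caller sees a normal result even though one component failed, which is the partial masking of faults that redundancy buys. If every replica is down, the failure can no longer be hidden and is reported.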

CONCURRENCY
• Components in distributed systems are executed in concurrent processes.
• Components access and update shared resources (e.g., variables, databases, device drivers).
• The integrity of the system may be violated if concurrent updates are not coordinated.
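The need to coordinate concurrent updates can be shown on a single machine with threads. In this Python sketch, a lock makes each update to a shared balance atomic; the `Account` class and `run_deposits` helper are hypothetical, and a distributed system would need a distributed equivalent (for example, transactions) rather than an in-process lock.

```python
import threading


class Account:
    """A shared resource whose updates are coordinated with a lock."""

    def __init__(self, balance=0):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # The read-modify-write below must not interleave with another thread's.
        with self._lock:
            self.balance += amount


def run_deposits(account, workers=4, deposits_each=10_000):
    """Run several threads that each make many small deposits."""
    def work():
        for _ in range(deposits_each):
            account.deposit(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance
```

With the lock in place, the final balance always equals the number of deposits made. Without coordination, interleaved read-modify-write steps could lose updates.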

SCALABILITY
Scalability of a system can be measured along at least three different dimensions:
• Size: we can easily add more users and resources to the system.
• Geography: a geographically scalable system is one in which the users and resources may lie far apart.
• Administration: an administratively scalable system remains easy to manage even if it spans many independent administrative organizations.

SCALING TECHNIQUES
• Hiding communication latencies: use asynchronous communication, and allocate more of the work to the client machine.
• Distribution: take a component, split it into smaller parts, and spread those parts across the system. An excellent example of distribution is the Internet Domain Name System (DNS).
• Replication
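Hiding latency through asynchronous communication can be sketched with Python's `asyncio`. Here `remote_call` is a hypothetical stand-in for a network request, and the sleep models its latency; because all requests are issued before any reply is awaited, the total wait is roughly one latency rather than the sum of all of them.

```python
import asyncio


async def remote_call(name, delay=0.05):
    """Stand-in for a remote request; the sleep models network latency."""
    await asyncio.sleep(delay)
    return f"reply:{name}"


async def call_all(names):
    # All requests are in flight at once; gather returns replies in request order.
    return await asyncio.gather(*(remote_call(n) for n in names))


def fetch_all(names):
    """Synchronous entry point that runs the concurrent requests to completion."""
    return asyncio.run(call_all(names))
```

For example, `fetch_all(["a", "b", "c"])` issues three simulated requests concurrently and returns their replies as a list.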

4. BASIC DESIGN ISSUES
Specific issues for distributed systems:
• Naming
• Communication
• Software structure
• System architecture
• Workload allocation
• Consistency maintenance

NAMING
• A name is resolved when it is translated into an interpretable form for referencing a resource or object, such as a communication identifier (IP address + port number).
• Name resolution involves several translation steps.
Design considerations:
• Choice of name space for each resource type
• A name service to resolve resource names to communication identifiers
• Name services include naming context resolution, hierarchical structure, and resource protection
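The translation steps can be pictured as a walk through a hierarchy of naming contexts that ends at a communication identifier. The Python sketch below is a toy resolver: the `NAME_TREE` contents, the host names, and the addresses (taken from the 192.0.2.0/24 documentation range) are all made up for illustration.

```python
# Hypothetical name space: each nested dict is a naming context, and each
# leaf is a communication identifier (IP address, port number).
NAME_TREE = {
    "org": {
        "example": {
            "printer": ("192.0.2.10", 631),
            "files": ("192.0.2.20", 2049),
        }
    }
}


def resolve(name):
    """Resolve a dotted name one label at a time, from the rightmost label inward."""
    context = NAME_TREE
    for label in reversed(name.split(".")):
        if not isinstance(context, dict) or label not in context:
            raise KeyError(f"cannot resolve {name!r} at label {label!r}")
        context = context[label]  # one translation step
    if isinstance(context, dict):
        raise KeyError(f"{name!r} names a context, not a resource")
    return context
```

Resolving `"printer.example.org"` takes three steps (`org`, `example`, `printer`) and yields an address and port that a client could then connect to.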

COMMUNICATION
Separated components communicate through sending and receiving processes, for data transfer and synchronization.
• Message passing: send and receive primitives, which may be synchronous (blocking) or asynchronous (non-blocking)
• Abstractions defined: channels, sockets, ports
• Communication patterns: client-server communication (e.g., RPC, function shipping) and group multicast
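The difference between blocking and non-blocking receive can be sketched with a queue acting as the channel. This Python example runs inside one process for simplicity; `receive`, `try_receive`, and `demo` are hypothetical helper names, and a real distributed system would use sockets or a messaging layer in place of `queue.Queue`.

```python
import queue
import threading


def try_receive(channel):
    """Non-blocking receive: return a message if one is waiting, else None."""
    try:
        return channel.get_nowait()
    except queue.Empty:
        return None


def receive(channel, timeout=2.0):
    """Blocking receive: wait until a message arrives (bounded by a timeout)."""
    return channel.get(timeout=timeout)


def demo():
    channel = queue.Queue()
    first = try_receive(channel)  # nothing sent yet, returns immediately
    # The sender delivers its message a little later, from another thread.
    sender = threading.Timer(0.05, channel.put, args=("hello",))
    sender.start()
    second = receive(channel)  # waits here until the message arrives
    sender.join()
    return first, second
```

The non-blocking call lets the receiver continue with other work when no message is available, while the blocking call also synchronizes the receiver with the sender.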

SOFTWARE STRUCTURE
Layers in centralized computer systems:
• Applications
• Middleware
• Operating system
• Computer and network hardware

SOFTWARE STRUCTURE
Layers and dependencies in distributed systems:
• Applications
• Distributed programming support
• Open services
• Open system kernel services
• Computer and network hardware

Challenges
• Performance
• Concurrency
• Failures
• Scalability
• System updates/growth
• Heterogeneity
• Openness
• Multiplicity of ownership and authority
• Security
• Quality of service/user experience
• Transparency
• Debugging

ADVANTAGES OF DISTRIBUTED SYSTEMS
• Information sharing among distributed users
• Resource sharing
• Extensibility and incremental growth
• Shorter response time and higher throughput
• Higher reliability
• Greater flexibility in meeting users' needs
• Better price/performance ratio
• Scalability
• Transparency

DISADVANTAGES OF DISTRIBUTED SYSTEMS
• Difficulties of developing distributed software
• Networking problems
• Security problems
• Performance
• Openness
• Reliability and fault tolerance

Next Lecture
• Introduction to Big Data
• Big Data sources
• 5 V's of Big Data
• Big Data processing frameworks (Hadoop, Spark, and NoSQL databases)
• Introduction to the Apache Hadoop stack (HDFS, MapReduce, Sqoop, ZooKeeper, HBase, Hive, Pig)

References
• Tanenbaum, Andrew S., and Maarten van Steen. Distributed Systems: Principles and Paradigms. Prentice-Hall, 2007.
• Sinha, Pradeep K. Distributed Operating Systems: Concepts and Design. PHI Learning Pvt. Ltd., 1998.
• NOC: Distributed Systems, NPTEL.