Distributed Database
- Database: an organized collection of structured data.
- Distributed: shared and spread out.
- A distributed database is a collection of shared data that is physically distributed across different sites of a computer network.
- Logically it is a single database, divided into a number of pieces called fragments.
- Distributed databases use a client/server architecture to process information requests.
Types of Distributed Database
- Homogeneous distributed database systems
- Heterogeneous distributed database systems
  - Generic connectivity: ODBC and OLE DB protocols
Concepts: Database Links
- A database link defines a one-way communication path from one database server to another.
- Types of database links: Public, Private, Connected
Distributed Data Storage
- Replication: data is stored redundantly at two or more sites.
  - Increased availability
  - Parallel processing
  - Constant updates add overhead
- Fragmentation: data is divided into smaller parts.
  - Consistent
  - No redundancy
Data Allocation
- Intelligent distribution of data pieces across sites, for performance and availability.
- Types: centralized, partitioned, and replicated.
- Strategies:
  - Data fragmentation: dividing the database into parts or sub-tables (horizontal, vertical, and mixed or hybrid fragmentation).
  - Data replication: copying data to multiple locations.
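The three allocation types can be illustrated with a minimal Python sketch. The fragment names, site names, and the `allocate` function are hypothetical and chosen only for illustration; the round-robin rule for the partitioned case is one possible choice, not a prescribed method.

```python
# Hypothetical fragments and sites, used only for illustration.
FRAGMENTS = ["T1", "T2", "T3"]
SITES = ["site_a", "site_b", "site_c"]


def allocate(strategy):
    """Return a mapping of site -> list of fragments stored at that site."""
    placement = {site: [] for site in SITES}
    if strategy == "centralized":
        # Every fragment lives on a single site; the others only send requests.
        placement[SITES[0]] = list(FRAGMENTS)
    elif strategy == "partitioned":
        # Each fragment is stored exactly once, spread round-robin over the sites.
        for i, fragment in enumerate(FRAGMENTS):
            placement[SITES[i % len(SITES)]].append(fragment)
    elif strategy == "replicated":
        # Every site holds its own copy of every fragment.
        for site in SITES:
            placement[site] = list(FRAGMENTS)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return placement


centralized = allocate("centralized")
partitioned = allocate("partitioned")
replicated = allocate("replicated")
```

Centralized allocation is the easiest to keep consistent, partitioned allocation stores each piece once, and replicated allocation favors availability at the cost of update overhead.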
Horizontal Fragmentation

student
ID  Name  Age  Marks
1   A     21   20
2   B     22   25
3   C     23   30
4   D     24   35

T1: SELECT * FROM student WHERE marks < 35;
ID  Name  Age  Marks
1   A     21   20
2   B     22   25
3   C     23   30

T2: SELECT * FROM student WHERE marks >= 35;
ID  Name  Age  Marks
4   D     24   35

- Types: Primary, Derived, Complete
- Reconstruction: T = T1 ∪ T2 ∪ … ∪ TN
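Horizontal fragmentation of the student table can be checked with a short sketch using Python's built-in `sqlite3` module. The fragment table names `t1` and `t2` are assumptions, and the predicates `marks < 35` and `marks >= 35` are chosen to be complementary so that every row lands in exactly one fragment and the union rebuilds the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, marks INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "A", 21, 20), (2, "B", 22, 25), (3, "C", 23, 30), (4, "D", 24, 35)],
)

# Each fragment keeps whole rows, selected by a predicate on marks.
conn.execute("CREATE TABLE t1 AS SELECT * FROM student WHERE marks < 35")
conn.execute("CREATE TABLE t2 AS SELECT * FROM student WHERE marks >= 35")

t1 = conn.execute("SELECT * FROM t1 ORDER BY id").fetchall()
t2 = conn.execute("SELECT * FROM t2 ORDER BY id").fetchall()

# Reconstruction: the union of the fragments gives back the original table.
rebuilt = conn.execute(
    "SELECT * FROM t1 UNION SELECT * FROM t2 ORDER BY id"
).fetchall()
original = conn.execute("SELECT * FROM student ORDER BY id").fetchall()
```

If the predicates left a gap (for example `marks < 35` and `marks > 35`), the row with marks equal to 35 would belong to no fragment and the union would no longer equal the original table.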
Vertical Fragmentation

student
ID  Name  Age  Marks
1   A     21   20
2   B     22   25
3   C     23   30
4   D     24   35

SELECT Name FROM student;
Name
A
B
C
D

SELECT Age FROM student;
Age
21
22
23
24
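A vertical fragment keeps a subset of columns. As a minimal sketch with Python's `sqlite3` module, the example below stores the name and age columns in separate fragments. One assumption goes beyond the slide: each fragment also carries the key column `id`, because without a shared key the fragments could not be joined back together. The fragment table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, marks INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "A", 21, 20), (2, "B", 22, 25), (3, "C", 23, 30), (4, "D", 24, 35)],
)

# Each fragment is a projection of the table; id is repeated in both
# (an assumption added here) so the rows can be matched up again.
conn.execute("CREATE TABLE student_name AS SELECT id, name FROM student")
conn.execute("CREATE TABLE student_age AS SELECT id, age FROM student")

names = conn.execute("SELECT name FROM student_name ORDER BY id").fetchall()
ages = conn.execute("SELECT age FROM student_age ORDER BY id").fetchall()

# Reconstruction: join the fragments on the shared key.
rebuilt = conn.execute(
    "SELECT n.id, n.name, a.age "
    "FROM student_name AS n JOIN student_age AS a ON a.id = n.id "
    "ORDER BY n.id"
).fetchall()
```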
Hybrid Fragmentation

student
ID  Name  Age  Marks
1   A     21   20
2   B     22   25
3   C     23   30
4   D     24   35

SELECT Name FROM student WHERE age = 22;
Name
B
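Hybrid fragmentation combines a row selection with a column projection. The sketch below, again using Python's `sqlite3` module, applies the two steps one after the other and compares the result with the single combined query. The intermediate table name `age_22` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, marks INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "A", 21, 20), (2, "B", 22, 25), (3, "C", 23, 30), (4, "D", 24, 35)],
)

# Step 1 (horizontal): keep only the rows where age = 22.
conn.execute("CREATE TABLE age_22 AS SELECT * FROM student WHERE age = 22")
# Step 2 (vertical): keep only the name column of those rows.
two_step = conn.execute("SELECT name FROM age_22").fetchall()

# The same fragment expressed as one query.
one_step = conn.execute("SELECT name FROM student WHERE age = 22").fetchall()
```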
Types of Data Replication
- Transactional replication
  - Starts from a complete copy of the database
  - Then copies new data changes as they occur
  - Databases are synced in real time
- Snapshot replication
  - Simplest type of data replication
  - Copies the current state at a specific point in time
- Merge replication
  - Tracks subsequent data changes and schema modifications
  - Synchronizes using merge agents
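The difference between snapshot and transactional replication can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not how any particular database product implements replication: the `Primary`, `snapshot_replica`, and `TransactionalReplica` names are hypothetical, and merge replication is not modelled.

```python
import copy


class Primary:
    """Toy primary database: a dict plus an ordered log of changes."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


def snapshot_replica(primary):
    """Snapshot replication: copy the full state as of this moment."""
    return copy.deepcopy(primary.data)


class TransactionalReplica:
    """Starts from a complete copy, then applies only the changes made since."""

    def __init__(self, primary):
        self.data = copy.deepcopy(primary.data)
        self.applied = len(primary.log)  # log position already reflected

    def sync(self, primary):
        for key, value in primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(primary.log)


primary = Primary()
primary.write("x", 1)

snap = snapshot_replica(primary)     # frozen at this point in time
txn = TransactionalReplica(primary)  # will follow later changes

primary.write("x", 2)
primary.write("y", 3)
txn.sync(primary)
```

After the later writes, `snap` still shows the state at the moment it was taken, while `txn.data` matches the primary again once `sync` has replayed the new log entries.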
Structure of Distributed Database
Architecture Models of Distributed Database Systems
- Client-server architecture
- Peer-to-peer architecture
- Multi-DBMS architecture
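Trade-offs in Distributed Databases: CAP Theorem
- Consistency: every read receives the most recent write or an error.
- Availability: every request receives a response.
- Partition tolerance: the system continues functioning despite network partitions.
- Trade-off: when a network partition occurs, the system must give up either consistency or availability.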
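The CAP trade-off can be made concrete with a toy two-replica store in Python. This is a deliberately simplified sketch: the `TwoNodeStore` class and its `"CP"` and `"AP"` modes are hypothetical names, and real systems use far more elaborate protocols. In CP mode the store refuses writes while partitioned, so replicas never diverge. In AP mode it keeps accepting writes, so the replicas can disagree.

```python
class TwoNodeStore:
    """Toy store with two replicas that normally copy every write to each other."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = [None, None]  # one value per replica
        self.partitioned = False    # True when the replicas cannot communicate

    def write(self, node, value):
        if self.partitioned and self.mode == "CP":
            # Prefer consistency: answer with an error instead of diverging.
            raise RuntimeError("unavailable: other replica unreachable")
        self.values[node] = value
        if not self.partitioned:
            # Normal operation: propagate the write to the other replica.
            self.values[1 - node] = value

    def read(self, node):
        return self.values[node]


# AP: stays available during the partition, but replica 1 serves stale data.
ap = TwoNodeStore("AP")
ap.write(0, "a")
ap.partitioned = True
ap.write(0, "b")
ap_reads = (ap.read(0), ap.read(1))

# CP: rejects the write during the partition, so both replicas still agree.
cp = TwoNodeStore("CP")
cp.write(0, "a")
cp.partitioned = True
try:
    cp.write(0, "b")
    cp_write_rejected = False
except RuntimeError:
    cp_write_rejected = True
cp_reads = (cp.read(0), cp.read(1))
```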
Practical Implications
- Understand the requirements
- Choose the appropriate replication scheme
- Use appropriate data structures
- Plan for failures
"The fate of your distributed system rests on your ability to make the right trade-offs. Choose wisely!"
Objectives of the Design of Data Distribution
- Processing locality: placing data as close as possible to the applications that use it.
- Availability and reliability of distributed data.
- Workload distribution.
- Storage costs and availability.
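Processing locality can be illustrated with a small Python sketch that places each fragment at the site that accesses it most often and then counts the accesses that remain remote. The access counts, site names, and function names are hypothetical, and this greedy rule ignores the other objectives (availability, workload balance, storage cost).

```python
# Hypothetical access counts: how often applications at each site use each fragment.
ACCESS_COUNTS = {
    "T1": {"site_a": 120, "site_b": 15, "site_c": 5},
    "T2": {"site_a": 10, "site_b": 90, "site_c": 40},
}


def place_by_locality(access_counts):
    """Store each fragment at the site that accesses it most often."""
    return {
        fragment: max(counts, key=counts.get)
        for fragment, counts in access_counts.items()
    }


def remote_accesses(access_counts, placement):
    """Count accesses that come from a site other than the one holding the fragment."""
    return sum(
        count
        for fragment, counts in access_counts.items()
        for site, count in counts.items()
        if site != placement[fragment]
    )


placement = place_by_locality(ACCESS_COUNTS)
remote = remote_accesses(ACCESS_COUNTS, placement)
```

With these counts, T1 is placed at `site_a` and T2 at `site_b`, leaving 70 of the 280 accesses remote.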
Distributed Database Design: Concept
- Centralized DB issues
  - Designing the "conceptual schema": a high-level description of the main concepts and relationships.
  - Designing the "physical database", i.e., mapping the conceptual schema to storage areas and determining appropriate access methods.
- Distributed DB issues
  - Designing the fragmentation.
  - Designing the allocation of fragments: mapping them to physical images.
Two Strategies
Top Down Approach
- Used when designing systems from scratch
- Mostly in homogeneous systems
Bottom Up Approach
- Used when the databases already exist at a number of sites
Design of Distributed Database
Top Down Design
1. Designing the global schema.
2. Designing the fragmentation of the database.
3. Allocating the fragments to the sites.
4. Creating the physical images.
Bottom Up
1. Selecting a common database model for describing the global schema of the database.
2. Translating each local schema into the common data model.
3. Integrating the local schemata into a common global schema.
Loosely Coupled System
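The bottom-up steps (translate each local schema into a common model, then integrate) can be sketched in Python. Everything here is an assumption made for illustration: the two local schemas, the `TYPE_MAP` from local type names to common types, and the rule that names are lower-cased during translation. Real schema integration also has to resolve naming and semantic conflicts that this sketch ignores.

```python
# Hypothetical local schemas at two existing sites, each in its own style.
LOCAL_SCHEMAS = {
    "site_a": {"STUDENT": {"ID": "NUMBER", "NAME": "VARCHAR2"}},
    "site_b": {"student": {"id": "int", "age": "int"}},
}

# Assumed mapping from local type names to the types of the common data model.
TYPE_MAP = {"NUMBER": "integer", "int": "integer", "VARCHAR2": "text"}


def translate(local_schema):
    """Express one local schema in the common data model."""
    return {
        table.lower(): {col.lower(): TYPE_MAP[typ] for col, typ in cols.items()}
        for table, cols in local_schema.items()
    }


def integrate(translated_schemas):
    """Merge translated schemas into a single global schema."""
    global_schema = {}
    for schema in translated_schemas:
        for table, cols in schema.items():
            merged = global_schema.setdefault(table, {})
            for col, typ in cols.items():
                # The same column must have the same common type at every site.
                if merged.get(col, typ) != typ:
                    raise ValueError(f"type conflict on {table}.{col}")
                merged[col] = typ
    return global_schema


translated = [translate(schema) for schema in LOCAL_SCHEMAS.values()]
global_schema = integrate(translated)
```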