Database that consists of two or more files located in different sites
Added: May 20, 2021
Slides: 20 pages
Slide Content
Distributed Databases A distributed database is a set of interconnected databases distributed over a computer network or the internet. A distributed database management system (DDBMS) manages the distributed database and provides mechanisms that make the distribution transparent to users.
Distributed Databases Features: Databases in the collection are logically interrelated with each other. Often they represent a single logical database. Data is physically stored across multiple sites. The processors at the sites are connected via a network. A distributed database is not a loosely connected file system.
Distributed Databases Advantages: Fast data processing; reliability and availability; reduced operating cost; easier to expand; improved sharing ability and local autonomy.
Distributed Databases Disadvantages: Complex to manage and control; security issues must be carefully managed; the system requires deadlock handling during transaction processing; standardization is needed.
Distributed Databases Homogeneous Distributed Database: In this, all sites have identical database management system software. In such a system, local sites surrender a portion of their autonomy in terms of their right to change schemas or database management system software.
Distributed Databases Homogeneous Distributed Database: The database management system software must also cooperate with other sites in exchanging information about transactions, to make transaction processing possible across multiple sites. The system appears to the user as a single system.
Distributed Databases Heterogeneous Distributed Database: In this, different sites may use different schemas, and different database management system software. The sites may not be aware of one another, and they may provide only limited facilities for cooperation in transaction processing.
Distributed Databases Data Storage: Replication: The system maintains multiple copies of data, stored at different sites, for faster retrieval and fault tolerance. Fragmentation: A relation is partitioned into several fragments stored at distinct sites.
Distributed Databases Data Replication: The process of storing separate copies of the database at two or more sites. Full Replication: The entire relation is stored at all the sites. Partial Replication: Only some fragments of the relation are replicated across the sites.
Distributed Databases Data Replication – Advantages: Availability; parallelism; faster access; fault tolerance; reduction in network load.
Distributed Databases Data Replication – Disadvantages: Increased storage requirements; increased cost and complexity of data updating.
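The trade-off between the replication advantages and disadvantages can be sketched with a toy read-one/write-all scheme. This is only an illustration under stated assumptions: the `Site` and `ReplicatedStore` classes are hypothetical, the sites are in-memory dictionaries rather than networked databases, and real systems use more elaborate update protocols.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical site holding a local copy of some data items."""
    name: str
    data: dict = field(default_factory=dict)


class ReplicatedStore:
    """Toy read-one / write-all replication over in-memory sites."""

    def __init__(self, sites):
        self.sites = list(sites)

    def write(self, key, value):
        # An update must reach every replica; the return value counts the
        # copies written, which is the extra update cost of replication.
        for site in self.sites:
            site.data[key] = value
        return len(self.sites)

    def read(self, key, unavailable=()):
        # Any available replica can answer, so one failed site does not
        # block reads (availability and fault tolerance).
        for site in self.sites:
            if site.name not in unavailable:
                return site.data[key]
        raise RuntimeError("no replica available")


sites = [Site("A"), Site("B"), Site("C")]
store = ReplicatedStore(sites)
copies_written = store.write("student:1", {"Marks": 72, "City": "chennai"})
value_with_a_down = store.read("student:1", unavailable={"A"})
```

Here a single logical write touches three copies, while the read still succeeds with site A marked unavailable.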
Distributed Databases Data Fragmentation: A division of relation r into fragments r1, r2, ..., rn that together contain sufficient information to reconstruct relation r.
Distributed Databases Data Fragmentation – Vertical Fragmentation: The fields or columns of a table are grouped into fragments. So that the original table can be reconstructed, each fragment should contain the primary key field(s) of the table.
Distributed Databases Data Fragmentation – Vertical Fragmentation: Example: Student(RollNo, Marks, City), assuming RollNo is the primary key. Fragment 1: select RollNo, Marks from Student. Fragment 2: select RollNo, City from Student.
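A minimal sketch of vertical fragmentation on the Student table, using Python's standard `sqlite3` module and a single in-memory database to stand in for separate sites. The fragment table names `Student_Marks` and `Student_City`, the sample rows, and the choice of `RollNo` as primary key are assumptions made for illustration. Each fragment keeps `RollNo`, so a join on that key rebuilds the original relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (RollNo INTEGER PRIMARY KEY, Marks INTEGER, City TEXT);
    INSERT INTO Student VALUES (1, 72, 'chennai'), (2, 45, 'mumbai'), (3, 88, 'chennai');

    -- Each vertical fragment keeps the primary key RollNo.
    CREATE TABLE Student_Marks AS SELECT RollNo, Marks FROM Student;
    CREATE TABLE Student_City  AS SELECT RollNo, City  FROM Student;
""")

original = conn.execute(
    "SELECT RollNo, Marks, City FROM Student ORDER BY RollNo"
).fetchall()

# Reconstruct the relation by joining the fragments on the shared key.
rebuilt = conn.execute("""
    SELECT m.RollNo, m.Marks, c.City
    FROM Student_Marks AS m
    JOIN Student_City AS c ON m.RollNo = c.RollNo
    ORDER BY m.RollNo
""").fetchall()
```

Without the key in both fragments there would be no column to join on, and rows could not be matched back together.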
Distributed Databases Data Fragmentation – Horizontal Fragmentation: In this approach, each tuple of r is assigned to one or more fragments. If relation r is fragmented into fragments r1 and r2, then r is reconstructed by taking the union of r1 and r2.
Distributed Databases Data Fragmentation – Horizontal Fragmentation: Example: select * from student where marks > 50 and city = 'chennai'
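A minimal sketch of horizontal fragmentation, again using Python's standard `sqlite3` module with one in-memory database standing in for separate sites. The first fragment uses the selection predicate from the example; the complementary second fragment, the table names `student_f1` and `student_f2`, the `RollNo` column, and the sample rows are assumptions added so that the union can be demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (RollNo INTEGER PRIMARY KEY, marks INTEGER, city TEXT);
    INSERT INTO student VALUES (1, 72, 'chennai'), (2, 45, 'mumbai'),
                               (3, 88, 'chennai'), (4, 30, 'chennai');

    -- Fragment 1: rows matching the predicate from the slide.
    CREATE TABLE student_f1 AS
        SELECT * FROM student WHERE marks > 50 AND city = 'chennai';
    -- Fragment 2: all remaining rows (the sample data has no NULLs,
    -- so NOT(...) covers every row outside fragment 1).
    CREATE TABLE student_f2 AS
        SELECT * FROM student WHERE NOT (marks > 50 AND city = 'chennai');
""")

original = conn.execute("SELECT * FROM student ORDER BY RollNo").fetchall()
fragment1 = conn.execute("SELECT * FROM student_f1 ORDER BY RollNo").fetchall()

# Reconstruct the relation as the union of its horizontal fragments.
rebuilt = conn.execute(
    "SELECT * FROM student_f1 UNION SELECT * FROM student_f2 ORDER BY RollNo"
).fetchall()
```

Every row lands in exactly one fragment here, so the union returns the original relation with no duplicates to eliminate.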
Distributed Databases Transaction Processing: A transaction may access data at several sites. Local transactions access data only at the site where they originate; global transactions access data at several sites.
Distributed Databases Transaction Processing – Transaction Manager: Maintaining a log for recovery purposes; participating in coordinating the concurrent execution of the transactions executing at that site.
Distributed Databases Transaction Processing – Transaction Coordinator: Starting the execution of transactions that originate at the site; distributing subtransactions to the appropriate sites for execution.
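The division of work between the two roles can be sketched as follows. This is a simplified illustration, not a real protocol: the class names, the `placement` map from data item to site, and the write-only transaction format are all hypothetical, and the sketch omits concurrency control, failure handling, and any commit protocol.

```python
from collections import defaultdict


class TransactionManager:
    """Hypothetical per-site manager: applies local operations and logs them."""

    def __init__(self, site):
        self.site = site
        self.data = {}
        self.log = []  # log kept for recovery purposes

    def execute(self, txn_id, operations):
        for key, value in operations:
            # Record (transaction, item, old value, new value) before writing.
            self.log.append((txn_id, key, self.data.get(key), value))
            self.data[key] = value


class TransactionCoordinator:
    """Hypothetical coordinator: splits a transaction into subtransactions."""

    def __init__(self, managers, placement):
        self.managers = managers    # site name -> TransactionManager
        self.placement = placement  # data item -> site that stores it

    def run(self, txn_id, writes):
        # Group the writes by the site that holds each data item.
        subtransactions = defaultdict(list)
        for key, value in writes:
            subtransactions[self.placement[key]].append((key, value))
        # Hand each subtransaction to the manager at the appropriate site.
        for site, operations in subtransactions.items():
            self.managers[site].execute(txn_id, operations)
        return sorted(subtransactions)


managers = {"A": TransactionManager("A"), "B": TransactionManager("B")}
placement = {"x": "A", "y": "B", "z": "A"}
coordinator = TransactionCoordinator(managers, placement)
sites_used = coordinator.run("T1", [("x", 1), ("y", 2), ("z", 3)])
```

In this run the global transaction T1 is split into one subtransaction for site A (items x and z) and one for site B (item y), and each site's manager logs the old and new values it wrote.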