Module 4_PART1.pptx

Haso12 57 views 83 slides Feb 01, 2023
Slide 1
Slide 1 of 83
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56
Slide 57
57
Slide 58
58
Slide 59
59
Slide 60
60
Slide 61
61
Slide 62
62
Slide 63
63
Slide 64
64
Slide 65
65
Slide 66
66
Slide 67
67
Slide 68
68
Slide 69
69
Slide 70
70
Slide 71
71
Slide 72
72
Slide 73
73
Slide 74
74
Slide 75
75
Slide 76
76
Slide 77
77
Slide 78
78
Slide 79
79
Slide 80
80
Slide 81
81
Slide 82
82
Slide 83
83

About This Presentation

Dbms


Slide Content

www.cambridge.edu.in Department of Information Science & Engineering MODULE 4

Module 4 Normalization: Database Design Theory

Introduction to Normalization using Functional and Multivalued Dependencies So for, we learned database design using ER-DIAGRAM, SCHEMA DIAGRAM, and RELATIONSHIP MODEL but all will not measure of appropriateness or goodness to measure the QUALITY OF THE DESIGN. In this module we will learn some of the theory that has been developed with the goal of evaluating relational schema for design quality

T wo levels at which we can discuss the goodness of relation schemas. The logical (or Conceptual ) Level: How users interpret the relation schemas and the meaning of their attributes. Having good relation schemas at this level enables users to understand clearly the meaning of the data in the relations, and Hence to formulate their queries correctly . The implementation (or physical storage) level : How the tuples in a base relation are stored and updated . The RELATIONAL DATABASE DESIGN THEORY developed mainly to BASE RELATION

DATABASE DESIGN MAY BE PERFORMED USING TWO APPROACHES

Key difference between bottom up design and top down desgin Sr. No. Key Bottom-Up Model Top-Down Model 1 Redundancy Bottom-Up model is better suited as it ensures minimum data redundancy and focus is on re-usability. Top-down model has high ratio of redundancy as the size of project increases. 2 Interaction Bottom-Up model have high interactivity between various modules. Top-down model has tight coupling issues and low interactivity between various modules. 3 Approach Bottom-up model is based on composition approach. Top-down model is based on decomposition approach.

Relational database design ultimately produces a set of relations. The implicit goals of the design activity are information preservation and minimum redundancy . Information preservation : maintaining all concepts, including Attribute types Entity type Relationship types Minimizing Redundancy: Implies minimizing redundant storage of the same information MOTIVATION OF RELATIONAL DATABASE DESIGN

EXAMPLE: CORRECT SCHEMA: RollNo StudName Gender StudDept Student DeptName OfficePhone HOD Department RollNo StudName Gender DeptName Office Phone HOD INCORRECT SCHEMA: Problem with bad schema Office Phone & HOD information stored redundantly once with each student that belongs to the department Wastage of disk space A program that updates Office Phone of a Department must change it at several places More running time Error prone

Informal Design Guidelines for Relation Schemas Before discussing the formal theory of relational database design, we discuss four informal guidelines that may be used as measures to determine the quality of relation schema design : Making sure that the semantics of the attributes is clear in the schema Reducing the redundant information in tuples Reducing the NULL values in tuples Disallowing the possibility of generating spurious tuples

Making sure that the semantics of the attributes is clear in the schema Imparting Clear Semantics to Attributes in Relations Whenever we group attributes to form a relation schema, we assume that attributes belonging to one relation have certain real-world meaning and a proper interpretation associated with them The semantics of a relation refers to its meaning resulting from the interpretation of attribute values in a tuple

GUIDELINE 1: Design a relation schema so that it is easy to explain its meaning . Do not combine attributes from multiple entity types and relationship types into a single relation. It is straightforward to explain its meaning . Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation Only foreign keys should be used to refer to other entities  Entity and relationship attributes should be kept apart as much as possible.

CORRECT SCHEMA DESIGN The meaning of the EMPLOYEE relation schema is simple: Each tuple represents an employee, with values for the employee’s name ( Ename ), Social Security number ( Ssn ), birth date Bdate ), and address ( Address), and the number of the department that the employee works for ( Dnumber ). The Dnumber attribute is a foreign key that represents an implicit relationship between EMPLOYEE and DEPARTMENT.

Examples of Violating Guideline 1 or INCORRECT RELATION SCHEMA DESIGN SSN ENAME BDATE ADDRESS DNUMBER DNAME DMGR_SSN EMP_DEPT SSN PNUMBER HOURS ENAME PNAME PLOCATION EMP_PROJ_WORKSON FD1 FD2 FD3 Note : A relation schema should always correspond to one entity type or  onerelationship type, for the meaning to be clear

2. Redundant Information in Tuples and Update Anomalies If Redundant values are not reduced . We face the following problems . Redundant information increases storage space . Redundant information in Tuples introduces “UPDATE ANAMOLIES ”.

EXAMPLE: let us consider BASE TABLE EMP_DEPT ( Redundancy data )

Redundancy Data

Storing natural joins of base relations leads to an additional problem referred to as update anomalies . These can be classified into insertion anomalies, deletion anomalies , and modification anomalies.

INSERTION ANOMALIES Insertion anomalies can be differentiated into two types, illustrated by the following examples based on the EMP_DEPT relation: CONSISTENTCY INCONSISTENCY IF WE HAVE TWO SEPARATE TABLE IF WE HAVE SINGLE TABLE LIKE EMP_DEPT we do not have to worry about this inconsistency problem because we enter only the department number in the employee tuple; all other attribute values of department 5 are recorded only once in the database, as a single tuple in the DEPARTMENT relation. To insert a new employee tuple into EMP_DEPT, we must include either the attribute values for the department that the employee works for , or NULLs (if the employee does not work for a department as yet). This problem does not occur in the design of SEPARATE TABLE. because a department is entered in the DEPARTMENT relation whether or not any employees work for it, and whenever an employee is assigned to that department, a corresponding tuple is inserted in EMPLOYEE It is difficult to insert a new department that has no employees as yet in the EMP_DEPT relation. The only way to do this is to place NULL values in the attributes for employee. This violates the entity integrity for EMP_DEPT because its primary key Ssn cannot be null.

Deletion Anomalies If we delete from EMP_DEPT an employee tuple that happens to represent the last employee working for a particular department , the information concerning that department is lost inadvertently from the database.

Modification Anomalies In EMP_DEPT, if we change the value of one of the attributes of a particular department—say, the manager of department 5—we must update the tuples of all employees who work in that department; otherwise, the database will become inconsistent. If we fail to update some tuples, the same department will be shown to have two different values for manager in different employee tuples, which would be wrong.

GUIDELINE 2: Guideline for Redundant Information in Tuples and Update Anomalies Design a schema that does not suffer from the insertion, deletion and update anomalies. If there are any anomalies present, then note them so that applications can be made to take them into account.

3. NULL Values in Tuples GUIDELINE 3: If possible avoid placing attributes in a base relation whose values may frequently be null. If nulls are unavoidable , make sure they apply in exceptional cases only and not to majority of tuples in a relation . „ Problems with null values: „ Waste of disk space „ Problem of understanding the meaning of attributes „ Problems in specifying JOIN operations „ Problems in applying some aggregate functions „ May have multiple interpretations (not applicable, unknown, unavailable)

4 .Generation of Spurious Tuples – avoid at any cost Consider the two relation schemas EMP_LOCS and EMP_PROJ1 in Figure 14.5(a), which can be used instead of the single EMP_PROJ relation in Figure 14.3(b). Figure 14.3(b).

Using natural join will get spurious data

GUIDLINE 4: Design relation schemas so that they can be joined with equality conditions on attributes, that are(primary key, foreign key) pairs in a way that guarantees that no spurious Avoid table that contain matching attributes that are not (foreign key, primary key) combination because joining on such attribute may produce spurious tuples

Normalization of Relational schema Normalization of a Relation schema refers to the process of: Identifying the data dependencies in the schema; basically there are three types of data dependencies Functional Dependencies(FDs) Multi-valued Dependencies(MVDs) Join Dependencies(JDs) Identifying those data dependencies, that would cause anomalies during insert, update and delete operations on the tables created on the schema.

To avoid this anomalies from the table, we can follow the decomposing the schemas Attribute Preservation : Each attribute of the schema must be represented in at least one of the sub-schemas.(mandatory requirement) The decomposition must be a LOSS-LESS-JOIN DECOMPOSITION (also known as NOD-ADDITIVE DECOMPOSITION). It implies that the decomposition must not result in SPURIOUS TUPLES. (mandatory requirement) Preferably. The decomposition should be dependency-preserving , that is each functional dependency holding on the parent schema should be preserved in one of the sub-schemas( Desirable requirement)

Functional Dependencies The functional dependency is a relationship that exists between two attributes. It typically that exists between the primary key attribute and non key attributes within a relation or table. To understand the concept thoroughly, let us consider R is a table with attribute A and B. Functional dependency is represented by (arrow sign) Example AB Functional dependancy : B – Functionally dependent on A or A determines B A- Determinant set B- dependent attribute

social security number determines employee name SSN -> ENAME project number determines project name and location PNUMBER -> {PNAME, PLOCATION} employee ssn and project number determines the hours per week that the employee works on the project {SSN, PNUMBER} -> HOURS

Definition . the values of the X component of a tuple uniquely (or functionally) determine the values of the Y component. The set of attributes X is called the left-hand side of the FD, and Y is called the right-hand side. Note: {legal relation states (or legal extensions)] X is a candidate key of R—this implies that X → Y for any subset of attributes Y of R (because the key constraint implies that no two tuples in any legal state r(R) will have the same value of X). If X is a candidate key of R, then X → Y. If X → Y in R, this does not say whether or not Y → X in R. Mathematical notation and definition

Normal Forms Based on Primary Keys REDUNDANT DATA NORMALIZATION Same piece of data is held in two separate places Aims to reduce data redundancy Various problem if redundancy data INSERT ANOMALIES DELETE ANOMALIES UPDATE ANOMALIES

WHAT IS NORMALIZATION? Normalization of Data [ “ filtering” or “ purification] : Analyzing the given relation schemas based on their FDs and primary keys to achieve the desirable properties of (1) minimizing redundancy and (2 ) minimizing the insertion, deletion, and update anomalies

RECALL KEY CONCEPT Definitions of Keys and Attributes Participating in Keys CANDIDATE KEY: If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily designated to be the primary key and the others are called secondary keys. EXAMPLE: { Ssn } is the only candidate key for EMPLOYEE, so it is also the primary key . Super key: { Ssn }, { Ssn , Ename }, { Ssn , Ename , Bdate }, and any set of attributes that includes Ssn are all superkeys .

First Normal Form (1 ST NF) 1NF (First Normal Form) Rules Each table cell should contain a single value or atomic value. Each record needs to be unique.  Example  can be easily understood with the help of a case study . Assume, a video library maintains a database of movies rented out . Without any normalization in database, all information is stored in one table as shown below. Let’s understand Normalization database with normalization example with solution: it was defined to disallow multivalued attributes, composite attributes, and their combinations.

Here you see  Movies Rented column has multiple values.  Now let’s move into 1st Normal Forms:

What is Composite Key? A composite key is a primary key composed of multiple columns used to identify a record uniquely In our database, we have two people with the same name Robert Phil, but they live in different places. Hence, we require both Full Name and Address to identify a record uniquely. That is a composite key.

2NF (Second Normal Form) Second normal form (2NF) is based on the concept of full functional dependency or A relation schema R is in 2NF, if every non-prime attribute A in R is fully functionally dependent an the Primary Key of R 2NF (Second Normal Form) Rules Rule 1- Be in 1NF Rule 2- Single Column Primary Key that does not functionally dependent on any subset of candidate key relation Stud_id Proj_id Stud_name Proj_name Student_Project

TABLE 1 TABLE 2

What are transitive functional dependencies? A transitive  functional dependency  is when changing a non-key column, might cause any of the other non-key columns to change

3NF (Third Normal Form)  Relation should not have a non key attribute functionally determined by another non key attribute (or by a set of non key attributes). 3NF (Third Normal Form) Rules Rule 1- Be in 2NF Rule 2- Has no transitive functional dependencies

To move our 2NF table into 3NF, we again need to again divide our table.

BCNF (Boyce- Codd Normal Form ) OR 3.5 Normal Form Advance version of 3 rd N ormal form RULE: It is in 3NF For every FD X Y, X should be super key of the table. H ow to know, we have to apply BCNF on table Whenever a NON KEY prime attribute DETERMINE prime attribute, then we have to apply BCNF

EXAMPLE: Stud-ID SUBJECT Professor 123 physics Faculty 1 123 Music Faculty 2 456 Biology Faculty 3 789 Physics Faculty 4 999 Physics Faculty 1 Super key Non key attribute or non prime attribute In this table there are two FDs in this relation { Stud_ID,Subject } professor Professor  subject (non key attribute determine prime attribute) In this case insertion anomalies , delete anomalies will be effect. Due to this we have to apply BCNF RULE. By split the mother table into two table

Stud_ID Professor 123 Faculty 1 123 Faculty 2 456 Faculty 3 789 Faculty 4 999 Faculty 1 Stud_Professor Professor Subject Faculty 1 Physics Faculty 2 Music Faculty 3 Biology Faculty 4 Physics Professor_Subject Stud_id professor Professor Subject

4 th NF(4 th NORMAL FORM) Multivalued Dependencies and fourth Normal form Rules: 4 th NF eliminate independent many-to-one relationship between columns A relation must first ben in BCNF A given relation may not contain more than one multivalued attributes

EXAMPLE: Stud_id Stud_Course Stud_hobby 1 DMBS,CN Swimming, Tennis 2 Java , C# Cricket, Foot ball Super key Above table does not  holds the conditions of 4NF because it Holds the Multivalued Dependency. Explanation FOR std_id =1 there are two values of Std_course (i.e. DBMS and CN) and same for Std_id =2 there are two values of Std_course (i.e. Java and C#).  Columns “ Std_course ” and “ Std_hobby ” are independent to each other but depend on “ Std_id ”.  Atleast three values also exist in the given table.  X Y Z X →→ Y X →→Z

Stud_id Stud_Course Stud_hobby 1 DMBS,CN Swimming, Tennis 2 Java , C# Cricket, Foot ball Stud_id Stud_Course 1 DBMS 1 CN 2 JAVA 2 C# Student_Course table Stud_id Stud_Hobby 1 Swimming 1 Tennis 2 Cricket 2 Foot ball Student_Hobby table

5 th Normal Form (5NF) Join dependencies and fifth Normal form or Project Join Normal Form(PJ/NF ) A relation R is in Fifth Normal Form (5NF) if and only if the following conditions are satisfied simultaneously: 1. R is already in 4NF. 2. It cannot be further non-loss decomposed. R(mother table) (R1 R2) R3 R1 (R2 R3)

5 th normal form

Example 2: company database

NOTE: There are three main techniques to achieve first normal form for such a relation: 1. Remove the attribute Dlocations that violates 1NF and place it in a separate relation DEPT_LOCATIONS along with the primary key Dnumber of DEPARTMENT. The primary key of this newly formed relation is the combination { Dnumber , Dlocation }, as shown in Figure 14.2. A distinct tuple in DEPT_LOCATIONS exists for each location of a department. This decomposes the non-1NF relation into two 1NF relation

2. Expand the key so that there will be a separate tuple in the original DEPARTMENT relation for each location of a DEPARTMENT, as shown in Figure 14.9(c). In this case, the primary key becomes the combination { Dnumber , Dlocation }. This solution has the disadvantage of introducing redundancy in the relation and hence is rarely adopted.

3. If a maximum number of values is known for the attribute—for example, if it is known that at most three locations can exist for a department—replace the Dlocations attribute by three atomic attributes: Dlocation1, Dlocation2, and Dlocation3. This solution has the disadvantage of introducing NULL values if most departments have fewer than three locations. It further introduces spurious semantics about the ordering among the location values; Dname Dnumber Dmgr_ssn Dlocation1 Dlocation2 Dlocation3 Research 5 333445555 Bellaire NULL NULL Research 5 333445555 NULL Sugarland NULL Research 5 333445555 NULL NULL Houston Administration 4 987654321 Stafford NULL NULL Headquarters 1 888665555 Houston NULL NULL

First normal form also disallows multivalued attributes that are themselves composite . These are called nested relations because each tuple can have a relation within it. Figure 14.10 shows how the EMP_PROJ relation could appear if nesting is allowed. COMPOSITE ATTRIBUTE EMP_PROJ( Ssn , Ename , {PROJS( Pnumber , Hours)})

EXAMPLE: Second Normal Form

Example: Third Normal Form NOTE: A NATURAL JOIN operation on ED1 and ED2 will recover the original relation EMP_DEPT without generating spurious tuples.

Normalization Algorithms Inference Rule (IR) in DBMS: Armstrong's axioms in database management systems were developed by William w. Armstrong in 1974. Armstrong's axioms are the basic inference rule, used to conclude functional dependencies on a relational database. It provides a set of rules for a simple reasoning technique in functional dependencies. An inference rule is an assertion that can apply a user on a set of functional dependencies to derive other FD (functional dependencies).

Type of functional dependencies Trivial FD Non Trivial FD Multi valued FD Transitive FD

Trivial functional Dependency 1. FD: X  X EXAMPLE Roll.No  Roll.No 2. FD: X Y if Y SUBSET OF X (Y ⊆ X) Example: ( Roll.No,Name )  Name if t1(x) =t2(x) then t1(y) = t2(y) X Y Non Trivial functional Dependency FD: X Y X ∩Y= Φ R.No Name Semi Trivial functional Dependency X⊄ Y R.No,Name Name,Marks The Trivial Functional Dependency is a set of attributes or columns that are known as trivial if the non-key-dependent attribute is a subset of the determinant attribute, which makes jointly a primary key attribute.

Armstrong Axioms/Inference Rules(IR) example R.No Name Marks Dept Course 1 a 78 CS C1 2 b 60 EE C1 3 a 78 CS C2 4 b 60 EE C3 5 c 80 IT C2 1 Reflexive Rule (IR1 ) Proof: If X is a set of attributes and Y is the subset of X, then X functionally determines Y. In the reflexive rule, if Y is a subset of X, then X determines Y. 1. FD: X  X 2 . FD: X Y if (Y ⊆ X) R.No R.No NAMENAME R.No,NameName

Armstrong Axioms/Inference Rules(IR) Example: STUDENT R.No Name Marks Dept Course 1 a 78 CS C1 2 b 60 EE C1 3 a 78 CS C2 4 b 60 EE C3 5 c 80 IT C2 2 . Augmentation Rule ( IR 2 ) Proof: The augmentation is also called a partial dependency. If X determines Y, then XZ determines YZ for any Z. if X determines Y and Z is any attribute set, then XZ determines YZ. It is also called a partial dependency. 1 .    If  (X     →   Y)  then  XZ   →   YZ    R.NoName ( R.No,Marks )( Name,Marks ) Roll No 1 tell me your Name Roll No 1 and Marks is 78 tell me your Name and Marks

Armstrong Axioms/Inference Rules(IR) Example: STUDENT R.No Name Marks Dept Course 1 a 78 CS C1 2 b 60 EE C1 3 a 78 CS C2 4 b 60 EE C3 5 c 80 IT C2 6 d 80 EC C4 3 . Transitive Rule ( IR 3 ) Proof: In the transitive rule, if X determines Y and Y determines Z, then X must also determine Z. if X determines Y and Y determines Z, then X also determines Z. 1.    If  (X    →   Y  &  Y  →   Z)  then X  →   Z   NAMEMarks MarksDept Name Dept Ram is sibling of sham Sham is sibling of mohan Ram is Sibling of Mohan As per rule of X Y If t1(X) = t2(x) then t1(y)=t2(y) as per tirivial FD)

Armstrong Axioms/Inference Rules(IR) Example: STUDENT R.No Name Marks Dept Course 1 a 78 CS C1 2 b 60 EE C1 3 a 78 CS C2 4 b 60 EE C3 5 c 80 IT C2 4. Union Rule (IR 4 ) This rule is also known as the  additive rule or secondary rule . if X determines Y and X determines Z, then X also determines both Y and Z. Union rule says, if X determines Y and X determines Z, then X must also determine Y and Z. 1 .    If  (X     →  Y and X   →   Z)  then X  →    YZ      if R.NoName & R.NoMarks then R.No  Name,Marks Roll No 1 tell me your Name and Roll No 1 tell me your Mark Or Roll No 1 tell me your name and marks

Proof: 1. X → Y (given) 2. X → Z (given) 3. X → XY (using IR2 on 1 by augmentation with X. Where XX = X) 4. XY → YZ (using IR2 on 2 by augmentation with Y) 5. X → YZ (using IR3 on 3 and 4)

Armstrong Axioms/Inference Rules(IR) Example: STUDENT R.No Name Marks Dept Course 1 a 78 CS C1 2 b 60 EE C1 3 a 78 CS C2 4 b 60 EE C3 5 c 80 IT C2 5. Decomposition Rule (IR 5 ) This rule is the reverse of the Union rule and is also known as the  project rule . if X determines Y and Z together, then X determines Y and Z separately 1 .    If X   →   YZ then X   →   Y and X  →    Z       X Y Z if ( Name,Marks ) Dept , Course then ( Name,Marks ) Dept ( Name,Marks ) Course If XY Z Then XZ THIS TYPE OF SPLITE NOT YZ ALLOWED Note: We can not split left side only suppose to split on right side *

Proof: 1. X → YZ (given) 2. YZ → Y (using IR1 Rule) 3. X → Y (using IR3 on 1 and 2)

Armstrong Axioms/Inference Rules(IR) Example: STUDENT R.No Name Marks Dept Course 1 a 78 CS C1 2 b 60 EE C1 3 a 78 CS C2 4 b 60 EE C3 5 c 80 IT C2 6 d 80 EC C4 6. Pseudo transitive Rule (IR 6 ) In the  pseudo transitive rule , if X determines Y, and YZ determines W, then XZ also determines W. 1.     If(  X   →   Y and YZ    →   W)  then XZ   →   W  if R.NoName and Name,MarksDept then ( R.No,Marks ) Dept

Proof: 1. X → Y (given) 2. WY → Z (given) 3. WX → WY (using IR2 on 1 by augmenting with W) 4. WX → Z (using IR3 on 3 and 2)

Armstrong Axioms/Inference Rules(IR) Example: STUDENT R.No Name Marks Dept Course 1 a 78 CS C1 2 b 60 EE C1 3 a 78 CS C2 4 b 60 EE C3 5 c 80 IT C2 6. Composition Rule ( IR 7 ) 1 .     If(  X   →   Y and  A    →  B )  then  XA    →   YB   if R.NoName and ( Marks,Dept )Course then ( R.No,Marks,Dept ) Name, Course

Equivalence of sets of Functional Dependencies Definition: A set of functional dependencies F is said to cover another set of functional dependencies E , if every FD in E is also in F + FD1:E FD2: F FD1 SUBSET 0F FD2 FD2:F FD1: E FD1: E FD2:F = FD2 SUBSET 0F FD1 FD1=FD2

To check the FDs are equivalent sets of R(A,B,C,D) F={A B, BC, ABD} E={AB,BC,AC,AD} SOLUTION: STEP1: [CHECK Whether all FDs of F are present in E] F E A B AB BC BC ABD - - AC - AD

STEP 2: [NOW CHECK E IS SUBSET OF F ] F E A B AB BC BC ABD - - AC - AD Now A  C is present in Functional dependencies F or not F={A B, BC, ABD} Find the closure of A attribute notation is A + A + ={ AB CD} A  A AB BC AB-D A Determine C There for A  C is present in F indirectly ( inference conclude)

STEP 2: [NOW CHECK E IS SUBSET OF F ] F E A B AB BC BC ABD - - AC - AD Now A  D is present in Functional dependencies F or not F={A B, BC, ABD} Find the closure of A attribute notation is A + A + ={ AB CD} A  A AB BC AB-D A Determine D There for A  D is present in F indirectly ( inference conclude) E is the subset of F

STEP 3: [NOW CHECK F IS SUBSET OF E ] F E A B AB BC BC ABD - - AC - AD Now AB  D is present in Functional dependencies F or not E ={ A B, BC, AC,AD} Find the closure of A attribute notation is A + AB + ={ AB C D } AB  AB AB BC AC AD There for A  D is present in F indirectly ( inference conclude) F is the subset of E AB determine D

Inference conclusion There for F = E IS Semantically equivalent

Minimal Sets of Functional Dependencies if we could shrink or reduce the set F to its minimal form so that the minimal set is still equivalent to the original set F. Functional dependencies F Functional dependencies E Shrink or Reduce Both F and E are equal

Algorithm : Finding a Minimal Cover F for a Set of Functional Dependencies E Step1: [ split the right hand attribute of all FDs in single attribute] A XY == > A->X , A->Y Step 2: [Remove all redundant FDs] {A B, BC, AC} Here A C is redundant since it can already be achieved using the transitivity property Step 3: [Find the extraneous attribute and remove it] AB C, either A or B or none can be extraneous If A closure contain B then B is extraneous and it can be removed If B closure contain A then A is extraneous and it can be removed

Algorithm: Finding a Key K for R Given a Set F of Functional Dependencies Input : A relation R and a set of functional dependencies F on the attributes of R. Set K := R. 2 . For each attribute A in K {compute (K − A) + with respect to F ; if (K − A) + contains all the attributes in R, then set K := K − {A} };
Tags