transection mangemanrtin data bse mangement system .pdf

jitborawan330 4 views 43 slides Sep 24, 2024
Slide 1
Slide 1 of 43
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43

About This Presentation

notes best


Slide Content

UNIT 4: Data Normalization: Anomalies in relational database design.
Decomposition. Functional dependencies. Normalization. First normal
form,
Second normal form, Third normal form. Boyce-Codd normal form.
Dr Hamela K
Dept of Computer Science
GFGC, Malur
Prepared by Dr Hamela, GFGC, Malur

Types of Anomalies in DBMS
Insert Anomaly: The term "insertion anomaly" is used to describe when a
new row is added to a table and it causes an inconsistency.
Update Anomaly: If there are some changes in the database, we have to
apply that change in all the rows. And if we miss any row, we will have one
more field, creating an update anomaly in the database.
Delete Anomaly: The term "deletion anomaly in the database" is used
when we delete some rows from a table and any necessary additional
information or data is also lost from the database.
Prepared by Dr Hamela, GFGC, Malur

Assume a manufacturing company stores employee details in a table called Employee
having four attributes:
Prepared by Dr Hamela, GFGC, Malur

Insert anomaly
If there is a new row inserted in the table and it creates the inconsistency in the table
then it is called the insertion anomaly.
Example
Assume that a new employee is joining the company under training and not assigned to
any department. Then, we would not insert the data into the table if the emp_deptfield
doesn't allow nulls.
Prepared by Dr Hamela, GFGC, Malur

Updateanomaly
Whenweupdatesomerowsinthetable,andifitleadstothe
inconsistencyofthetablethenthisanomalyoccurs.Thistype
ofanomalyisknownasanupdationanomaly.
Example
Inthegiventable,wehavetworowsforanemployeenamed
Rick,andhebelongstotwodifferentdepartmentsofthe
company.IfweneedtoupdateRick'saddress,wemustupdate
thesameaddressintworows.Otherwise,thedatawillbecome
inconsistent.
If,insomeway,wecanupdatethecorrectaddressinone
departmentbutnottheother,thenaccordingtothedatabase,
Rickwillhavetwodifferentaddresses,whichisnotcorrectand
wouldleadtoinconsistentdata.
Prepared by Dr Hamela, GFGC, Malur

Deleteanomaly
Ifwedeletesomerowsfromthetableandif
anyotherinformationordatawhichisrequired
isalsodeletedfromthedatabase,thisiscalled
thedeletionanomalyinthedatabase.
Example
Assumethatifthecompanyclosesthe
departmentD890,thendeletingtherowsthat
haveemp_deptasD890wouldalsodeletethe
informationofemployeeMaggiesincesheis
assignedonlytothisdepartment.
Prepared by Dr Hamela, GFGC, Malur

DECOMPOSITION
•Decomposition can be defined as a database management
system process for dividing a single relation into multiple sub-
relations.
•Its main purpose is to break down the functions of a company
into fine levels of detail.
•It eliminates the anomalies and redundancy from the database
by breaking it up into many different tables.
Prepared by Dr Hamela, GFGC, Malur

Types of decomposition
Prepared by Dr Hamela, GFGC, Malur

In a Lossy Decomposition, the relation needs to be decomposed into two or
more relational schemas. There is no way that loss of information can be
avoided during the retrieval of the original relation.
Now, we won’t be
able to join the above
tables, since Emp_ID
isn’t part of the
DeptDetailsrelation.
Therefore, the above
relation has lossy
decomposition.
Prepared by Dr Hamela, GFGC, Malur

Lossless Decomposition
Decomposition is lossless if it is feasible to reconstruct relation R from decomposed tables
using Joins. This is the preferred choice. The information will not lose from the relation
when decomposed. The join would result in the same original relation.
Prepared by Dr Hamela, GFGC, Malur

Prepared by Dr Hamela, GFGC, Malur

Advantages of decomposition in DBMS?
Easy use of Codes
The availability of decomposition makes it easier for programs to copy and reuse important codes for
other works in DBMS. It only not helps in saving lots of time but also makes things convenient for the
users.
Finding Mistakes
Another reason the programmers opt for decomposition is to allow them conveniently complete complex
programs. The mistakes are quite easier to find with this sort of programming.
Problem-Solving Approach
It is considered a perfect problem-solving strategy using which complex computer programs can be
written easily. The users can precisely join tons of code together for adequate results.
Eliminating Errors
The biggest advantage of having decomposition in DBMS is eliminating the inconsistencies and
duplication to a greater extent. The data can be easily identified in cases when decomposition happens
in DBMS.
Prepared by Dr Hamela, GFGC, Malur

Properties of decomposition
The programmers must be aware of the main properties of decomposition in DBMS.
We have mentioned the major ones in detail below:
Attribute Preservation
The functional dependencies decompose the universal relation that attributes
preservation of decomposition.
Dependency Preservation
Dependency preservation can be defined as the functionality that features directly
in the relation schemas. There is a chance of dependency loss if the decomposition
is not preserved.
•No Redundancy
It is used for removing a few of the issues related to improper design, such as
redundancy, anomalies, and inconsistencies.
Prepared by Dr Hamela, GFGC, Malur

Issues of decomposition in DBMS?
There are many problems regarding the decomposition in DBMS mentioned below:
Redundant Storage
Many instances where the same information gets stored in a single place can confuse
the programmers. It will take lots of space in the system.
Insertion Anomalies
It isn’t essential for storing important details unless some kind of information is stored
in a consistent manner.
Deletion Anomalies
It isn’t possible to delete some details without eliminating any sort of information.
Prepared by Dr Hamela, GFGC, Malur

FunctionalDependency
Thefunctionaldependencyisarelationshipthatexistsbetween
twoattributes.Ittypicallyexistsbetweentheprimarykeyand
non-keyattributewithinatable.
X→Y
TheleftsideofFDisknownasadeterminant,therightsideof
theproductionisknownasadependent.
Forexample:
Assumewehaveanemployeetablewithattributes:Emp_Id,
Emp_Name,Emp_Address.
HereEmp_IdattributecanuniquelyidentifytheEmp_Name
attributeofemployeetablebecauseifweknowtheEmp_Id,we
cantellthatemployeenameassociatedwithit.
Functionaldependencycanbewrittenas:
Emp_Id→Emp_Name
WecansaythatEmp_Nameisfunctionallydependenton
Emp_Id.

Types of Functional dependency
Types of Functional dependency

1. Trivial functional dependency
A → B has trivial functional dependency if B is a subset of A.
The following dependencies are also trivial like: A → A, B → B
Example:
Consider a table with two columns Employee_Id and
Employee_Name.
{Employee_id, Employee_Name} → Employee_Id is a trivial
functional dependency as
Employee_Id is a subset of {Employee_Id, Employee_Name}.
Also, Employee_Id → Employee_Id and Employee_Name →
Employee_Name are trivial dependencies too.

2. Non-trivial functional dependency
A → B has a non-trivial functional dependency if B is not a
subset of A.
When A intersection B is NULL, then A → B is called as
complete non-trivial.
Example:
ID → Name,
Name → DOB

Armstrong's Axioms
If F is a set of functional dependencies then the closure
of F, denoted as F+, is the set of all functional
dependencies logically implied by F. Armstrong's
Axioms are a set of rules, that when applied
repeatedly, generates a closure of functional
dependencies.
Reflexive rule− If alpha is a set of attributes and beta
is_subset_of alpha, then alpha holds beta.
Augmentation rule− If a → b holds and y is attribute
set, then ay → by also holds. That is adding attributes
in dependencies, does not change the basic
dependencies.
Transitivity rule− Same as transitive rule in algebra, if
a → b holds and b → c holds, then a → c also holds. a
→ b is called as a functionally that determines b.

•Trivial Functional Dependency
•Trivial− If a functional dependency (FD) X → Y holds,
where Y is a subset of X, then it is called a trivial FD.
Trivial FDs always hold.
•Non-trivial− If an FD X → Y holds, where Y is not a
subset of X, then it is called a non-trivial FD.
•Completely non-trivial− If an FD X → Y holds, where
x intersect Y = Φ, it is said to be a completely non-
trivial FD.

The Purpose of Normalization
Normalizationis a technique for producing a set of
relations with desirable properties, given the data
requirements of an enterprise.
The process of normalization is a formal method that
identifies relations based on their primary or candidate
keys and the functional dependencies among their
attributes.

Problem Without Normalization
Without Normalization, it becomes difficult to handle and update the database,
without facing data loss. Insertion, Updation and Deletion Anamolies are very
frequent if Database is not Normalized.

Purpose
•Normalizationistheprocessoforganizingthedatain
thedatabase.
•Normalizationisusedtominimizetheredundancy
fromarelationorsetofrelations.Itisalsousedto
eliminateundesirablecharacteristicslikeInsertion,
Update,andDeletionAnomalies.
•Normalizationdividesthelargertableintosmaller
andlinksthemusingrelationships.
•Thenormalformisusedtoreduceredundancyfrom
thedatabasetable.

Advantages of Normalization
•Normalization helps to minimize data
redundancy.
•Greater overall database organization.
•Data consistency within the database.
•Much more flexible database design.
•Enforces the concept of relational integrity.

Disadvantages of Normalization
We cannot start building the database before knowing
what the user needs.
The performance degrades when normalizing the
relations to higher normal forms, i.e., 4NF, 5NF.
It is very time-consuming and difficult to normalize
relations of a higher degree.
Careless decomposition may lead to a bad database
design and serious problems.

FIRST NORMAL FORM
•First Normal Formis a relation in which the
intersection of each row and column contains one
and only one value.

As per First Normal Form, no two Rows of data must contain repeating group
of information i.e each set of column must have a unique value, such that
multiple columns cannot be used to fetch the same row. Each table should be
organized into rows, and each row should have a primary key that
distinguishes it as unique.
ThePrimary keyis usually a single column, but sometimes more than one
column can be combined to create a single primary key. For example consider
a table which is not in First normal form
Student Table :
Student
AgeSubject
Adam 15 Biology, Maths
Alex 14 Maths
Stuart 17 Maths

In First Normal Form, any row must not have a column in which more
than one value is saved, like separated with commas. Rather than that,
we must separate such data into multiple rows.
Student Table following 1NF will be :
Student
Age Subject
Adam 15 Biology
Adam 15 Maths
Alex 14 Maths
Stuart 17 Maths
Using the First Normal Form, data redundancy increases, as there will be many
columns with same data in multiple rows but each row as a whole will be unique.

Full functional dependency
Full functional dependency indicates that if A and B are
attributes of a relation, B is fully functionally dependent on
A if B is functionally dependent on A, but not on any proper
subset of A.
A functional dependency A→B is partially dependentif there
is some attributes that can be removed from A and the
dependency still holds.

Second Normal Form (2NF)
Second normal form (2NF)is a relation that is in first
normal form and every non-primary-key attribute is fully
functionally dependent on the primary key.
The normalization of 1NF relations to 2NF involves the
removal of partial dependencies. If a partial dependency
exists, we remove the function dependent attributes from
the relation by placing them in a new relation along with
a copy of their determinant.

•In example of First Normal Form there are two rows for Adam, to include
multiple subjects that he has opted for. While this is searchable, and follows
First normal form, it is an inefficient use of space. Also in the above Table in
First Normal Form, while the candidate key is {Student,Subject},Ageof
Student only depends on Student column, which is incorrect as per Second
Normal Form. To achieve second normal form, it would be helpful to split
out the subjects into an independent table, and match them up using the
student names as foreign keys.
New Student Table following
2NF will be :
Student
Age
Adam 15
Alex 14
Stuart 17
In Student Table the candidate key will
beStudentcolumn, because all other
column i.eAgeis dependent on it.
New Subject Table introduced for 2NF
will be :
Student
Subject
Adam Biology
Adam Maths
Alex Maths
Stuart Maths

•In Subject Table the candidate key will be {Student,Subject} column. Now,
both the above tables qualifies for Second Normal Form and will never
suffer from Update Anomalies. Although there are a few complex cases in
which table in Second Normal Form suffers Update Anomalies, and to
handle those scenarios Third Normal Form is there.

Third Normal Form (3NF)
Transitive dependency
A condition where A, B, and C are attributes of a relation such that
if A →B and B →C, then C is transitively dependent on A via B
(provided that A is not functionally dependent on B or C).
Third normal form (3NF)
A relation that is in first and second normal form, and in which
no non-primary-key attribute is transitivelydependent on the
primary key.
The normalization of 2NF relations to 3NF involves the removal
of transitive dependencies by placing the attribute(s) in a new
relation along with a copy of the determinant.

•Student_Detail Table :
In this table Student_id is Primary key, but street, city and state depends upon Zip. The
dependency between zip and other fields is calledtransitive dependency. Hence to
apply3NF, we need to move the street, city and state to new table, withZipas primary key.
Third Normal formapplies that every non-prime attribute of table must be
dependent on primary key, or we can say that, there should not be the case
that a non-prime attribute is determined by another non-prime attribute. So
thistransitive functional dependencyshould be removed from the table and
also the table must be inSecond Normal form. For example, consider a
table with following fields.

New Student_Detail Table :
Student_id Student_name DOB Zip
Address Table :
Zip Street citystate
The advantage of removing transtive dependency is,
Amount of data duplication is reduced.
Data integrity achieved.

Boyce Codd normal form (BCNF)
•BoyceCoddnormalform(BCNF)
•BCNFistheadvanceversionof3NF.Itisstricterthan
3NF.
•AtableisinBCNFifeveryfunctionaldependencyX→
Y,Xisthesuperkeyofthetable.
•ForBCNF,thetableshouldbein3NF,andforevery
FD,LHSissuperkey.

EMP_ID EMP_COUNT
RY
EMP_DEPT DEPT_TYPE EMP_DEPT_N
O
264 India Designing D394 283
264 India Testing D394 300
364 UK Stores D283 232
364 UK Developing D283 549
Example:Let's assume there is a company where employees work in more than
one department.
EMPLOYEE table:
In the above table Functional dependencies are as follows:
1.EMP_ID→EMP_COUNTRY
2.EMP_DEPT→{DEPT_TYPE,EMP_DEPT_NO}

Candidate key: {EMP-ID, EMP-DEPT}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys.
To convert the given table into BCNF, we decompose it into three tables:
EMP_ID EMP_COUN
TRY
264 India
264 India
EMP_COUNTRYtable:
EMP_DEPT DEPT_TYP
E
EMP_DEPT
_NO
DesigningD394 283
Testing D394 300
Stores D283 232
DevelopingD283 549
EMP_DEPTtable:
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
EMP_DEPT_MAPPINGtable:

Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
Now, this is in BCNF because the left side part of both the
functional dependencies is key.