databasedatabasedatabasedatabasedatabase.pptx

rajarahul567 8 views 50 slides May 13, 2024
Slide 1
Slide 1 of 50
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50

About This Presentation

database


Slide Content

Data Modelling for Databases (4COM2004) Data Modelling for Databases – 4 15:22

Learning Objectives Normalisation Functional Dependence Data Modelling for Databases – 4 15:22 Lecture 4

Developed for Relational Model ( Codd , 1970) To define a simple structure for storing data To eliminate certain kinds of redundancy To avoid certain update anomalies. Data Modelling for Databases – 4 15:22 Normalisation

Process of breaking down or decomposing a relation (table) into other relations (tables) Must be reversible, so no data is lost during the normalisation process Data Modelling for Databases – 4 15:22 Normalisation

Consider the activities listed and determine how changing the data could be undertaken more efficiently by reorganizing the data. a) Kathy Lewis leaves on maternity leave, replaced by Andrea Smith. b) Senior Management agrees to increase the Salaries System budget by 5% c) The Database department is renamed Information Services d) Employee Numbers are to be changed, the leading 5 is no longer required . Data Modelling for Databases – 4 15:22 Normalisation Project Code Project Title Project Manager Project Budget Employee No Employee Name Department No Department Name Hourly Rate PC010 Pensions System M. Phillips 24500 510001 A. Smith L004 IT 22.00 PC010 Pensions System M. Phillips 24500 510030 L. Jones L023 Pensions 18.50 PC010 Pensions System M. Phillips 24500 521010 P. Gold L004 IT 22.00 PC045 Salaries System H. Martin 17400 510010 B. Jones L004 IT 21.75 PC045 Salaries System H. Martin 17400 510001 A. Smith L004 IT 18.00 PC045 Salaries System H. Martin 17400 531002 T. Gilbert L028 Database 25.50 PC045 Salaries System H. Martin 17400 513210 W. Richards L008 Salary 17.00 PC064 HR System K. Lewis 12250 531002 T. Gilbert L028 Database 23.25 PC064 HR System K. Lewis 12250 521010 P. Gold L004 IT 17.50 PC064 HR System K. Lewis 12250 510034 B. Jeffries L009 HR 16.50

UNF Data Modelling for Databases – 4 15:22

Data Modelling for Databases – 4 15:22 Un- normalised Form Student ID Name Date of Birth Subject Grade 960100 Smith, J 14/11/1977 Databases Software Development ISDE C A D 960105 White, A 10/05/1975 Software Development ISDE B B 960120 Moore, T 11/03/1970 Databases Software Development Workshop A B C 960145 Smith, J 09/01/1972 Databases B 960150 Black, D 21/08/1973 Databases Software Development ISDE Workshop B D C D

We describe this state as below: (StudentID, Name, DoB, (Subject, Grade)) Data Modelling for Databases – 4 15:22 Un-normalised Form Student ID Name Date of Birth Subject Grade 960100 Smith, J 14/11/1977 Databases Software Development ISDE C A D 960105 White, A 10/05/1975 Software Development ISDE B B 960120 Moore, T 11/03/1970 Databases Software Development Workshop A B C 960145 Smith, J 09/01/1972 Databases B 960150 Black, D 21/08/1973 Databases Software Development ISDE Workshop B D C D List all columns Nest repeated sets of columns inside parentheses

We will consider 3 stages in the normalisation process Through 1 st , 2 nd and 3 rd Normal Form There are other Normal Forms, but we will not be considering these on this course Why? Data Modelling for Databases – 4 15:22 Normalisation

1 st NF Data Modelling for Databases – 4 15:22

All attributes in a table must be atomic (no repeating groups. A Primary Key is defined from all Candidate Keys Data Modelling for Databases – 4 15:22 First Normal Form – Rules

Data Modelling for Databases – 4 15:22 Normalisation Non Atomic Fields Student ID Name Date of Birth Subject Grade 960100 Smith, J 14/11/1977 Databases Software Development ISDE C A D 960105 White, A 10/05/1975 Software Development ISDE B B 960120 Moore, T 11/03/1970 Databases Software Development Workshop A B C 960145 Smith, J 09/01/1972 Databases B 960150 Black, D 21/08/1973 Databases Software Development ISDE Workshop B D C D

Data Modelling for Databases – 4 15:22 Normalisation Which Column(s) should act as Primary Key – Candidate Key Student ID Name Date of Birth Subject Grade 960100 Smith, J 14/11/1977 Databases C 960100 Smith, J 14/11/1977 Software Development A 960100 Smith, J 14/11/1977 ISDE D 960105 White, A 10/05/1975 Software Development B 960105 White, A 10/05/1975 ISDE B 960120 Moore, T 11/03/1970 Databases A 960120 Moore, T 11/03/1970 Software Development B 960120 Moore, T 11/03/1970 Workshop C 960145 Smith, J 09/01/1972 Databases B 960150 Black, D 21/08/1973 Databases B 960150 Black, D 21/08/1973 Software Development D 960150 Black, D 21/08/1973 ISDE C 960150 Black, D 21/08/1973 Workshop D

Data Modelling for Databases – 4 15:22 Normalisation Table is in First Normal Form Student ID Name Date of Birth Subject Grade 960100 Smith, J 14/11/1977 Databases C 960100 Smith, J 14/11/1977 Software Development A 960100 Smith, J 14/11/1977 ISDE D 960105 White, A 10/05/1975 Software Development B 960105 White, A 10/05/1975 ISDE B 960120 Moore, T 11/03/1970 Databases A 960120 Moore, T 11/03/1970 Software Development B 960120 Moore, T 11/03/1970 Workshop C 960145 Smith, J 09/01/1972 Databases B 960150 Black, D 21/08/1973 Databases B 960150 Black, D 21/08/1973 Software Development D 960150 Black, D 21/08/1973 ISDE C 960150 Black, D 21/08/1973 Workshop D

We have eliminated repeating groups Decided on a key for the table We describe this state as below: Enrolment ( StudentID , Name, DoB, Subject , Grade) Table Name provided - Enrolment No repeating attribute – no nested round brackets Primary Key defined - (Composite) Primary Key indicated by Underline Data Modelling for Databases – 4 15:22 @ First Normal Form

2 nd NF Data Modelling for Databases – 4 15:22

Table must be in First Normal Form Every non-key attribute must be fully functionally dependent on the Primary Key. Data Modelling for Databases – 4 15:22 Second Normal Form – Rules ?

The role of a key is as follows: Define uniqueness Determination of non-key attributes Key also can be called a Determinant Data Modelling for Databases – 4 15:22 The Role of Keys

If the value of an attribute (B) can be determined by the value of another (A), then B is functionally dependant on A. Data Modelling for Databases – 4 15:22 Functional Dependence A is the determinant … it determines the value of B B is the dependant … it depends upon the value of A Therefore, A is a key

Example: Data Modelling for Databases – 4 15:22 Functional Dependence Country is unique identifier, there are no two countries with the same name. Thus, the country determines the other four attributes COUNTRY  ( CAPITAL, AREA, POPULATION, ISO_CODE ) Country Capital City Area Population Iso Code United Kingdom London 242,900 63,489,234 GB Switzerland Berne 41,284 8,157,896 CH South Africa Pretoria 1,221,037 53,139,528 ZA

Example: Data Modelling for Databases – 4 15:22 Functional Dependence The Iso_Code is also unique, So: … the Iso_Code also determines the other four attributes ISO_CODE  ( COUNTRY, CAPITAL, AREA, POPULATION ) Country Capital City Area Population Iso Code United Kingdom London 242,900 63,489,234 GB Switzerland Berne 41,284 8,157,896 CH South Africa Pretoria 1,221,037 53,139,528 ZA

Tutorial 2, Task 7 Data Modelling for Databases – 4 15:22 Normalisation What are the Functional dependencies here? Student ID Name Date of Birth Subject Grade 960100 Smith, J 14/11/1977 Databases C 960100 Smith, J 14/11/1977 Software Development A 960100 Smith, J 14/11/1977 ISDE D 960105 White, A 10/05/1975 Software Development B 960105 White, A 10/05/1975 ISDE B 960120 Moore, T 11/03/1970 Databases A 960120 Moore, T 11/03/1970 Software Development B 960120 Moore, T 11/03/1970 Workshop C 960145 Smith, J 09/01/1972 Databases B 960150 Black, D 21/08/1973 Databases B 960150 Black, D 21/08/1973 Software Development D 960150 Black, D 21/08/1973 ISDE C 960150 Black, D 21/08/1973 Workshop D

Enrolment ( StudentID , Name, DoB, Subject , Grade) Data Modelling for Databases – 4 15:22 Functional Dependency StudentID Name DoB Subject Grade

Enrolment ( StudentID , Name, DoB, Subject , Grade ) Data Modelling for Databases – 4 15:22 Functional Dependency Student ID Student ID, Subject Name DoB Grade Functional Dependence FULL Functional Dependence

Student ID Name Date of Birth Subject Grade 960100 Smith, J 14/11/1977 Databases C 960100 Smith, J 14/11/1977 Software Development A 960100 Smith, J 14/11/1977 ISDE D 960105 White, A 10/05/1975 Software Development B 960105 White, A 10/05/1975 ISDE B 960120 Moore, T 11/03/1970 Databases A 960120 Moore, T 11/03/1970 Software Development B 960120 Moore, T 11/03/1970 Workshop C 960145 Smith, J 09/01/1972 Databases B 960150 Black, D 21/08/1973 Databases B 960150 Black, D 21/08/1973 Software Development D 960150 Black, D 21/08/1973 ISDE C 960150 Black, D 21/08/1973 Workshop D Table must be in First Normal Form Every non-key attribute must be fully functionally dependant on the Primary Key. Are they for the Enrolment table? Data Modelling for Databases – 4 15:22 Second Normal Form – Rules

Data Modelling for Databases – 4 15:22 Functional Dependencies Student ID Name Date of Birth Subject Grade 960100 Smith, J 14/11/1977 Databases C 960100 Smith, J 14/11/1977 Software Development A 960100 Smith, J 14/11/1977 ISDE D 960105 White, A 10/05/1975 Software Development B 960105 White, A 10/05/1975 ISDE B 960120 Moore, T 11/03/1970 Databases A 960120 Moore, T 11/03/1970 Software Development B 960120 Moore, T 11/03/1970 Workshop C 960145 Smith, J 09/01/1972 Databases B 960150 Black, D 21/08/1973 Databases B 960150 Black, D 21/08/1973 Software Development D 960150 Black, D 21/08/1973 ISDE C 960150 Black, D 21/08/1973 Workshop D

Data Modelling for Databases – 4 15:22 NOT in Second Normal Form How do we get the data into Second Normal Form? If an attribute is only functionally dependent on part of the key : Make a separate table including the part of the key on which the attribute is fully functionally dependant, and that attribute Set the determinant as the primary key for that new table. The determinant becomes a foreign key in the original table

Data Modelling for Databases – 4 15:22 @ Second Normal Form Student ID Subject Grade 960100 Databases C 960100 Software Development A 960100 ISDE D 960105 Software Development B 960105 ISDE B 960120 Databases A 960120 Software Development B 960120 Workshop C 960145 Databases B 960150 Databases B 960150 Software Development D 960150 ISDE C 960150 Workshop D Student ID Name Date of Birth 960100 Smith, J 14/11/1977 960105 White, A 10/05/1975 960120 Moore, T 11/03/1970 960145 Smith, J 09/01/1972 960150 Black, D 21/08/1973

All non key attributes are fully functionally dependent on the Primary Key We describe this state as below: Enrolment ( StudentID *, Subject , Grade) Student ( StudentID , Name, DoB ) Data Modelling for Databases – 4 15:22 @ Second Normal Form

Data Modelling for Databases – 4 15:22 @ Second Normal Form Note: A table in 1NF with a Simple Primary Key, MUST be in 2NF

3 rd NF Data Modelling for Databases – 4 15:22

Table must be in Second Normal Form No non-key attribute must be dependent on any other attribute(s) other than the Primary Key. Known as a Transitive Dependency Data Modelling for Databases – 4 15:22 Third Normal Form – Rules

Data Modelling for Databases – 4 15:22 Are we in Third Normal Form? Student ID Subject Grade 960100 Databases C 960100 Software Development A 960100 ISDE D 960105 Software Development B 960105 ISDE B 960120 Databases A 960120 Software Development B 960120 Workshop C 960145 Databases B 960150 Databases B 960150 Software Development D 960150 ISDE C 960150 Workshop D Student ID Name Date of Birth 960100 Smith, J 14/11/1977 960105 White, A 10/05/1975 960120 Moore, T 11/03/1970 960145 Smith, J 09/01/1972 960150 Black, D 21/08/1973 Yes

Data Modelling for Databases – 4 15:22 What about this table? Project Code Project Title Project Manager Project Budget Employee No Employee Name Department No Department Name Hourly Rate PC010 Pensions System M. Phillips 24500 510001 A. Smith L004 IT 22.00 PC010 Pensions System M. Phillips 24500 510030 L. Jones L023 Pensions 18.50 PC010 Pensions System M. Phillips 24500 521010 P. Gold L004 IT 22.00 PC045 Salaries System H. Martin 17400 510010 B. Jones L004 IT 21.75 PC045 Salaries System H. Martin 17400 510001 A. Smith L004 IT 18.00 PC045 Salaries System H. Martin 17400 531002 T. Gilbert L028 Database 25.50 PC045 Salaries System H. Martin 17400 513210 W. Richards L008 Salary 17.00 PC064 HR System K. Lewis 12250 531002 T. Gilbert L028 Database 23.25 PC064 HR System K. Lewis 12250 521010 P. Gold L004 IT 17.50 PC064 HR System K. Lewis 12250 510034 B. Jeffries L009 HR 16.50 Is it in First Normal Form (1NF)? No Is it in Second Normal Form (2NF)?

Project Code Project Title Project Manager Project Budget Employee No Employee Name Department No Department Name Hourly Rate PC010 Pensions System M. Phillips 24500 510001 A. Smith L004 IT 22.00 PC010 Pensions System M. Phillips 24500 510030 L. Jones L023 Pensions 18.50 PC010 Pensions System M. Phillips 24500 521010 P. Gold L004 IT 22.00 PC045 Salaries System H. Martin 17400 510010 B. Jones L004 IT 21.75 PC045 Salaries System H. Martin 17400 510001 A. Smith L004 IT 18.00 PC045 Salaries System H. Martin 17400 531002 T. Gilbert L028 Database 25.50 PC045 Salaries System H. Martin 17400 513210 W. Richards L008 Salary 17.00 PC064 HR System K. Lewis 12250 531002 T. Gilbert L028 Database 23.25 PC064 HR System K. Lewis 12250 521010 P. Gold L004 IT 17.50 PC064 HR System K. Lewis 12250 510034 B. Jeffries L009 HR 16.50

Project ( ProjectCode , ProjectTitle, ProjectManager, ProjectBudget, EmployeeNo , EmployeeName, DepartmentNo, DepartmentName, HourlyRate ) Data Modelling for Databases – 4 15:22 Functional Dependency ProjectCode, EmployeeNo HourlyRate Functional Dependence FULL Functional Dependence ProjectCode ProjectTitle Project Manager Project Budget EmployeeNo EmployeeName DepartmentNo DepartmentName

Data Modelling for Databases – 4 15:22 Which are not in Third Normal Form? Project Code Employee No Hourly Rate PC010 510001 22.00 PC010 510030 18.50 PC010 521010 21.00 PC045 510010 21.75 PC045 510001 18.00 PC045 531002 25.50 PC045 513210 17.00 PC064 531002 23.25 PC064 521010 17.50 PC064 510034 16.50 Project Code Project Title Project Manager Project Budget PC010 Pensions System M. Phillips 24500 PC045 Salaries System H. Martin 17400 PC064 HR System K. Lewis 12250 Employee No Employee Name Department No Department Name 510001 A. Smith L004 IT 510030 L. Jones L023 Pensions 521010 P. Gold L004 IT 510010 B. Jones L004 IT 531002 T. Gilbert L028 Database 513210 W. Richards L008 Salary 510034 B. Jeffries L009 HR   

Project ( ProjectCode , ProjectTitle, ProjectManager, ProjectBudget, EmployeeNo , EmployeeName, DepartmentNo, DepartmentName, HourlyRate ) Data Modelling for Databases – 4 15:22 Functional Dependency ProjectCode, EmployeeNo HourlyRate ProjectCode ProjectTitle Project Manager Project Budget EmployeeNo EmployeeName DepartmentNo DepartmentName

Table must be in Second Normal Form No non-key attribute must be dependent on any other attribute(s) other than the Primary Key. Data Modelling for Databases – 4 15:22 Third Normal Form – Rules Employee No Employee Name Department No Department Name 510001 A. Smith L004 IT 510030 L. Jones L023 Pensions 521010 P. Gold L004 IT 510010 B. Jones L004 IT 531002 T. Gilbert L028 Database 513210 W. Richards L008 Salary 510034 B. Jeffries L009 HR Department Name is dependent on Department No

Data Modelling for Databases – 4 15:22 NOT in Third Normal Form How do we get the data into Third Normal Form? If an attribute is dependent on another non-key attribute: Make a new table formed from the non-key determinant and dependant attribute The determinant remains as a non-key attribute in the original table as a Foreign Key The determinant becomes the primary key for the new table.

Data Modelling for Databases – 4 15:22 Conversion to 3NF Employee No Employee Name Department No Department Name 510001 A. Smith L004 IT 510030 L. Jones L023 Pensions 521010 P. Gold L004 IT 510010 B. Jones L004 IT 531002 T. Gilbert L028 Database 513210 W. Richards L008 Salary 510034 B. Jeffries L009 HR Department No Department Name L004 IT L023 Pensions L028 Database L008 Salary L009 HR Non-key determinant Dependent Attribute Department No Department Name L004 IT L023 Pensions L028 Database L008 Salary L009 HR Employee No Employee Name Department No 510001 A. Smith L004 510030 L. Jones L023 521010 P. Gold L004 510010 B. Jones L004 531002 T. Gilbert L028 513210 W. Richards L008 510034 B. Jeffries L009

Data Modelling for Databases – 4 15:22 Notation - UNF Project Code Project Title Project Manager Project Budget Employee No Employee Name Department No Department Name Hourly Rate PC010 Pensions System M. Phillips 24500 510001 A. Smith L004 IT 22.00 PC010 Pensions System M. Phillips 24500 510030 L. Jones L023 Pensions 18.50 PC010 Pensions System M. Phillips 24500 521010 P. Gold L004 IT 21.00 PC045 Salaries System H. Martin 17400 510010 B. Jones L004 IT 21.75 PC045 Salaries System H. Martin 17400 510001 A. Smith L004 IT 18.00 PC045 Salaries System H. Martin 17400 531002 T. Gilbert L028 Database 25.50 PC045 Salaries System H. Martin 17400 513210 W. Richards L008 Salary 17.00 PC064 HR System K. Lewis 12250 531002 T. Gilbert L028 Database 23.25 PC064 HR System K. Lewis 12250 521010 P. Gold L004 IT 17.50 PC064 HR System K. Lewis 12250 510034 B. Jeffries L009 HR 16.50 Rates ( ProjCode, ProjTitle, ProjManager, ProjBudget, EmpNo , EmpName, DeptNo, DeptName , Hrate ) ( )

Data Modelling for Databases – 4 15:22 Notation - 1NF Project Code Project Title Project Manager Project Budget Employee No Employee Name Department No Department Name Hourly Rate PC010 Pensions System M. Phillips 24500 510001 A. Smith L004 IT 22.00 PC010 Pensions System M. Phillips 24500 510030 L. Jones L023 Pensions 18.50 PC010 Pensions System M. Phillips 24500 521010 P. Gold L004 IT 21.00 PC045 Salaries System H. Martin 17400 510010 B. Jones L004 IT 21.75 PC045 Salaries System H. Martin 17400 510001 A. Smith L004 IT 18.00 PC045 Salaries System H. Martin 17400 531002 T. Gilbert L028 Database 25.50 PC045 Salaries System H. Martin 17400 513210 W. Richards L008 Salary 17.00 PC064 HR System K. Lewis 12250 531002 T. Gilbert L028 Database 23.25 PC064 HR System K. Lewis 12250 521010 P. Gold L004 IT 17.50 PC064 HR System K. Lewis 12250 510034 B. Jeffries L009 HR 16.50 Rates ( ProjCode , ProjTitle, ProjManager, ProjBudget, EmpNo , EmpName, DeptNo, DeptName, Hrate )

Data Modelling for Databases – 4 15:22 Notation - 2NF Rates ( ProjCode *, EmpNo *, Hrate ) Project Code Employee No Hourly Rate PC010 510001 22.00 PC010 510030 18.50 PC010 521010 21.00 PC045 510010 21.75 PC045 510001 18.00 PC045 531002 25.50 PC045 513210 17.00 PC064 531002 23.25 PC064 521010 17.50 PC064 510034 16.50 Employee No Employee Name Department No Department Name 510001 A. Smith L004 IT 510030 L. Jones L023 Pensions 521010 P. Gold L004 IT 510010 B. Jones L004 IT 531002 T. Gilbert L028 Database 513210 W. Richards L008 Salary 510034 B. Jeffries L009 HR Project Code Project Title Project Manager Project Budget PC010 Pensions System M. Phillips 24500 PC045 Salaries System H. Martin 17400 PC064 HR System K. Lewis 12250 Employee ( EmpNo , EmpName, DeptNo, DeptName ) Project ( ProjCode , ProjTitle, ProjManager, ProjBudget )

Data Modelling for Databases – 4 15:22 Notation - 3NF Rates ( ProjCode *, EmpNo *, Hrate ) Project Code Employee No Hourly Rate PC010 510001 22.00 PC010 510030 18.50 PC010 521010 21.00 PC045 510010 21.75 PC045 510001 18.00 PC045 531002 25.50 PC045 513210 17.00 PC064 531002 23.25 PC064 521010 17.50 PC064 510034 16.50 Project Code Project Title Project Manager Project Budget PC010 Pensions System M. Phillips 24500 PC045 Salaries System H. Martin 17400 PC064 HR System K. Lewis 12250 Employee ( EmpNo , EmpName, DeptNo * ) Project ( ProjCode , ProjTitle, ProjManager, ProjBudget ) Employee No Employee Name Department No 510001 A. Smith L004 510030 L. Jones L023 521010 P. Gold L004 510010 B. Jones L004 531002 T. Gilbert L028 513210 W. Richards L008 510034 B. Jeffries L009 Department No Department Name L004 IT L023 Pensions L028 Database L008 Salary L009 HR Department ( DeptNo , DeptName )

Reduces anomalies by reorganising the storage of data Reducing the need for repeated data Update only one occurrence Insert just relevant data Data Modelling for Databases – 4 15:22 Normalisation

The Downside to Normalisation Queries requiring to rebuild the data Joining tables via Primary and Foreign Key pairs Data Modelling for Databases – 4 15:22 Rebuilding the Dataset

Every time we split a table in Normalisation, the new tables share Key values Data Modelling for Databases – 4 15:22 Rebuilding the Dataset Enrolment ( StudentID *, Subject , Grade) Student ( StudentID , Name, DoB ) The StudentID value in Enrolment can reference the StudentID value in Student for other information about that student

To rebuild the original table: Use the StudentID value in Enrolment, to reference the Student ID value in Student The referencing value is known as the FOREIGN KEY, the reference key is an existing PRIMARY KEY. Foreign keys are denoted by an * Data Modelling for Databases – 4 15:22 Rebuilding the Dataset Enrolment ( StudentID *, Subject , Grade) Student ( StudentID , Name, DoB )

Data Modelling for Databases (4COM2004) Data Modelling for Databases – 4 15:22