References
Lecture notes and Papers provided in the class.
A. Silberschatz,H. KorthS. Sudarshan,”Database
System Concepts”
C.J. Date, “An Introduction to Database Systems”
Overview of
Database Management
What is a Database System?
What is a Database?
Why Database?
Data Independence
Relational Systems, and Others
Figure A simplified database system environment
Database System
Computerized record-keeping system
Supports operations
Add or delete files to the database
Insert, retrieve, remove, or change data in database
Components
Data, hardware, software, users
Drawbacks of using file systems
to store data
Data redundancy and inconsistency
Multiple file formats, duplication of information in
different files
Difficulty in accessing data
Need to write a new program to carry out each
new task
Data isolation —multiple files and formats
Integrity problems
Integrity constraints (e.g., account balance > 0)
become “buried” in program code rather than
being stated explicitly
Hard to add new constraints or change existing
ones
Drawbacks of using file systems to
store data (Cont.)
Atomicity of updates
Failures may leave database in an inconsistent state with partial
updates carried out
Example: Transfer of funds from one account to another should
either complete or not happen at all
Concurrent access by multiple users
Concurrent access needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
•Example: Two people reading a balance (say 100) and
updating it by withdrawing money (say 50 each) at the same
time
Security problems
Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems
Database System -Components
Data
May support single or many users
Many users in organizations
•Data is integrated
•Data is shared
Different users will require different views
Hardware
Data is stored on Disk
Data operated on in main memory
Database System -Components
Software
Database server
Database management system (DBMS)
DBMS provided by specific vendor
DBMS is not ( but may come with)
•Application Development Tools
•Application Software
•Report Writer
•System utilities
Users
Application Programmers
End Users
Database Administrators
Database administrators(DBAs): Responsible for
managing the database system, authorizing access,
coordinating & monitoring uses, acquiring resources.
Database designers: Responsible for designing the
database, identifying the data to be stored, choosing the
structures to represent and store this data.
End Users: The persons that use the database for
querying, updating, generating reports, etc. ( Casual,
Parametric and Sophisticated)
System Analysts/Application programmers: Design
and implement canned transactions for parametric users.
Persons involved in large database
Persons involved in design, development, operation, and
maintenance of the DBMS software and system environment.
DBMS designers and implementers:
Design and implement the DBMS software package itself.
Tool developers:
Design and implement tools that facilitate the use of the
DBMS software. Tools include design tools, performance tools,
special interfaces, etc.
Operators and maintenance personnel:
Work on running and maintaining the hardware and software
environment for the database system.
Persons behind the scene
Database Users and Administrators
Database
What is a Database?
Collection of persistentdata
Collection of true propositions
Made up of entities, relationships, properties
Implements a data model
What is a Database?
Stores enterprise information over time
Outlasts the running of a computer program
Updated and retrieved in OLTP
operational/production system
Offers decision support via data warehouse
Entities and Relationships
Entity is a person, place, event or thing,
about which we wish to store information
Relationship is a connection between
entities, about which we wish to store
information
A relationship can be considered a special
case of entity
Properties
Entities have properties
Properties are the characteristics of an
entity
Properties can be simple or complex
Data and Data Models
Database is a collection of true propositions
Data model is an abstract, self-contained,
logical representation
Implementation of the data model on a
specific platform
Why Database?
Shared data
Reduced redundancy
Reduced inconsistent data
Transaction support
Support for data integrity
Security enforcement
Support for standards
Conflicting requirements can be met
Data Independence
Database separates logical and physical
representation of data
Allows changes to application programs without
changing the structure of the underlying data
and vice versa
Relational Systems
Most important innovation in database history
Based on logic and mathematics
Data is perceived as tables, only
Operators derive new tables from existing
A table is a “relation,” mathematically
Not Relational Systems
Hierarchic
Network
Object etc..
Main costs of using a DBMS
High initial investment in hardware, software,
training and possible need for additional hardware.
Overhead for providing generality, security,
recovery, integrity, and concurrency control.
When DBMS may be unnecessary
If the database and applications are simple, well
defined, and not expected to change.
If there are stringent real-time requirements that
may not be met because of DBMS overhead.
If access to data by multiple users is not
required
History of Database Systems
1950s and early 1960s:
Data processing using magnetic tapes for storage
•Tapes provided only sequential access
Punched cards for input
Late 1960s and 1970s:
Hard disks allowed direct access to data
Network and hierarchical data models in widespread
use
Ted Codd defines the relational data model
•Would win the ACM Turing Award for this work
•IBM Research begins System R prototype
•UC Berkeley begins Ingres prototype
High-performance (for the era) transaction processing
History (cont.)
1980s:
Research relational prototypes evolve into commercial
systems
•SQL becomes industrial standard
Parallel and distributed database systems
Object-oriented database systems
1990s:
Large decision support and data-mining applications
Large multi-terabyte data warehouses
Emergence of Web commerce
Early 2000s:
XML and XQuery standards
Automated database administration
Later 2000s:
Giant data storage systems
•Google, Yahoo,, Amazon, ..
Database Systems
Architecture
DataBase Management System
(DBMS)
High-level
Query Q
DBMS
Data
Answer
Translates Q into
best execution plan
for current conditions,
runs plan
Keeps data safe
and correct
despite failures,
concurrent
updates, online
processing, etc.
Database Systems Architecture
Three levels of architecture ( External,
Conceptual, Internal)
Mappings
Database Administrator (DBA)
Database Management System (DBMS)
Database Communications
Client/Server Architecture
Utilities
Distributed Processing
Database Architecture
Internal Level Conceptual Level External or View Level
Three Tier Architecture
External Level
Concerned with the way individual users see the data.
A user is anyone who needs to access some portion of
the data.
May support single or groups of users
The application programmer may use a high level
language while the casual user will probably use a query
language.
Implemented via SQL views
a data definition language (DDL) -provides for the definition or
description of database objects
a data manipulation language (DML) -supports the manipulation
or processing of database objects.
External Level-II
Different users will require different views
Examples:
Students should not see faculty salaries.
Faculty should not see billing or payment data.
Information that can be derived from stored data might be
viewed as if it were stored.
GPA not stored, calculated when needed.
Conceptual Level
An abstract representation of the entire information
content of the database.(contains the view of the whole
enterprise)
Hides details of the physical level, no concern for the
physical implementation.
Storage structure is ignored
Access strategy is ignored
Shared by all users
Foundation for database design
Defined by conceptual schema
Internal Level
The internal view is the view about the actual physical
storage of data.
Physical layer
Blocks, pages, I/O
Described by internal schema, DDL
Hardware dependent
Following aspects are considered at this level:
Storage allocation e.g. B-trees, hashing etc.
Access paths e.g. specification of primary and secondary keys,
indexes and pointers and sequencing.
Miscellaneous e.g. data compression and encryption techniques,
optimization of the internal structures.
Working of 3 Layers
Assuming the three level view of the database, a number
of mappings are needed to enable the users working
with one of the external views.
For example, the payroll office may have an external
view of the database that consists of the following
information only:
Staff number, name and address.
Staff tax information e.g. number of dependents.
Staff bank information where salary is deposited.
Staff employment status, salary level, leave information etc.
Mappings
The conceptual view of the database may contain
academic staff, general staff, casual staff etc.
Mapping will need to be created where all the staff in the
different categories are combined into one category for
the payroll office.
The conceptual view would include information about
each staff's position, the date employment started, full-
time or part-time, etc`. This will need to be mapped to
the salary level for the salary office.
Also, if there is some change in the conceptual view, the
external view can stay the same if the mapping is
changed.
Mappings-II
Mappings-III
Conceptual/internal
Implementation of logical design
Corresponds between conceptual view and stored
database
Specifies how conceptual records and fields are
represented at internal level.
External/conceptual
Defines the correspondence between a particular
external view and conceptual view.
Example field and record name change
External/external
Sometime it is needed to some external views to
be defined of others
DBMS Interfaces
Menu-Based Interfaces
Use system of menus to provide user with possible
choices instead of using commands in certain language
Form-Based Interfaces
Displays a form to each user. User can only enter data
provided by the form. used for naïve users and canned
transactions.
Graphical User Interfaces
Displays GUI schema (windows, list boxes, buttons,
combo boxes ) and users select from possible choices
using mouse.
DBMS Interfaces
Natural language Interfaces
Accept instructions in "English".
Interfaces for Parametric Users
A special interface for naive users such as bank tellers
that repeatedly do few operations. Grey keys may be
used for this few commands to minimize the number of
keystrokes.
Interfaces for the DBA.
Only DBA can do some privileged commands ( creating
user account, setting system parameters, granting
account authorization, reorganizing the storage structure
of a DB)
Database System Utilities
loading ( load files into DB and providing conversion
from file format into DB format)
backup (usually done by dumping the entire DB onto
tape)
file reorganization ( reorganize DB file into a different file
organization to improve performance)
performance monitoring (monitors using of a DB and
provides statistics)
other utilities
sorting files
handling data compression
monitoring access by users
Database Administrator
Participates in conceptual database design
Determines how to implement conceptual schema
Teach users, and help them report
Implement security and integrity
Implement unload/reload utilities
Monitor and tune database performance
Database Management System
DBMS is a software that handles all access to
the database.
DDL processor / compiler
DML processor / compiler
Handle scheduled and ad hocqueries
Optimizer and run-time manager
Security and integrity
Recovery and concurrency
Data dictionary
Performance tuning utilities
Data Dictionary
The DBMS must provide a data dictionary function.
The data dictionary is a system database (not user
database)
It contains “data about the data” called “meta-data
or descriptor” –that is, definitions of those objects
in the system, instead of just “raw data”.
It will keep all of various schemas and mapping
and all of the various security and integrity
constraints, both source and object form.
Other terms are “dictionary”, “catalog”, “data
repository”.
Support for System Processes
Data Communications interface
Client Server Architecture
External tool support: query, reports, graphics,
spreadsheets, statistics
Utilities: unload/reload, stats, reorg
Distributed processing
Centralized and Client/Server
Architecture for DBMSs
Centralized DBMSs Architecture
Centralized DBMS => all DBMS function
All processing performed remotely on the
computer system, only display information
and controls were sent from the computer to
display terminal
Basic Client/Server
Idea is to define specialized servers with specific
functionalities.
Client machines provide the user with the
appropriate interfaces to utilize the server, and local
processing power to run local application.
Client is a user machine that provide user interface
capabilities and local processing. When requires
process not in that machine it connects to a server
that provide the needed functions.
Server is a machine that can provide service to the
client machine.
Architecture on
client/server frame work
Two-tier Client/Server Architecture
Three-tier Client/Server Architecture
Two-tier Client/Server
Client
site1
Client
Site2
Server
Site3
Server
Client
Site n
……
Communication Network
Three-tier Client/Server
GUI
Web Interface
Application
Program
Web Pages
Database
Management
System
Client
Application Server
Or
Web server
Database
Server
This server plays an
intermediate role by
storing business rule
(procedure/constraints)
that are used to access
data from the database
server. Also improve
database security.
Client contain GUI interface
and some additional business
rule
Classification of DBMS
By Data Model
Relational Data Model
Object Data Model
Hierarchical and Network Data Models
Relational databases were the most commonly used (comprising about
three-quarters of all databases).
Classification of DBMS
By numbers of users supported by the system
Single User system
Multi User system
Number of Sites
Centralized => the data stored at a single computer site
Distributed DBMS (DDBMS) => have the actual database and
DBMS software distributed over many sites
•Homogenous DDBMSs
•Heterogeneous DBMSs
Purpose
General Purpose
Special Purpose