This chapter covers: Basic Concepts of Databases, Sources of Data, Evolution of Database,
Database Benefits, Types of Database, Database Design, Characteristics of Database, and Jobs with Database.
§Basic Concepts of Databases
§Sources of data
§Evolution of Database
§Database Benefits
§Types of Database
§Database Design
§Characteristics of Database
§Jobs with Database
3/5/2024ECEg 4181 - By: Eyob S. 2
§Data: raw facts
§It must be formatted for processing and storage
§Big data: refers to huge amounts of data
§Can be structured, unstructured or semistructured
§Database: a collection of data organized to enable the
creation, reading, updating, and deletion (CRUD) of data
§Information: generated from processed data
§It requires context to determine meaning
§It can become knowledge used for decision making
§Metadata: data about data, or description of the data
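The four CRUD operations named above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed method: the in-memory database, the students table, and its columns are all hypothetical.

```python
import sqlite3

# In-memory database; the "students" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, gpa REAL)")

# Create: insert two records
conn.execute("INSERT INTO students (name, gpa) VALUES (?, ?)", ("Abebe", 3.4))
conn.execute("INSERT INTO students (name, gpa) VALUES (?, ?)", ("Sara", 3.8))

# Read: fetch the stored records
rows_after_insert = conn.execute(
    "SELECT name, gpa FROM students ORDER BY id"
).fetchall()

# Update: change one field of one record
conn.execute("UPDATE students SET gpa = ? WHERE name = ?", (3.6, "Abebe"))

# Delete: remove one record
conn.execute("DELETE FROM students WHERE name = ?", ("Sara",))

rows_final = conn.execute("SELECT name, gpa FROM students ORDER BY id").fetchall()
print(rows_after_insert)
print(rows_final)
```

The raw values stored in the table are data; the table definition (column names and types) is an example of metadata.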
§Data can be collected from different sources
§You can collect from
§Internet searching: e.g. Google Dataset, Kaggle, & Earthdata
§Web scraping: the process of using bots to extract content
and data from a website.
§Transactions: from finances, banks, etc.
§Requesting data from servers through APIs: e.g. music data from the Spotify API
§Querying a database,… etc.
§Nowadays, almost every company uses data
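As a rough illustration of web scraping, the sketch below extracts link targets from a page using only Python's standard library. To stay self-contained it parses an inline HTML string instead of downloading a real page; the page content and the LinkCollector class are hypothetical.

```python
from html.parser import HTMLParser

# Inline HTML stands in for a downloaded page (hypothetical content).
PAGE = """
<html><body>
  <a href="/datasets/weather.csv">Weather</a>
  <a href="/datasets/traffic.csv">Traffic</a>
  <p>No link here</p>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


collector = LinkCollector()
collector.feed(PAGE)
links = collector.links
print(links)
```

A real scraper would first fetch the page over HTTP and should respect the site's terms of use.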
§Traditional
§Manual System: Files, folders, file cabinets
§Computerized: Apple Numbers, Google Sheets, MS Excel
– Poor structure: changing a structure breaks the system
– Data dependency, redundancy, and inconsistency
– Data insecurity, integrity issues, decentralized data, etc.
§Hierarchical and Network Models were introduced in the mid-1960s
§In 1970, Edgar F. Codd introduced the Relational Model, which
provided a more flexible and intuitive way to organize and access
data using tables and structural
§ Scale
§Spreadsheets can hold thousands of records, whereas
databases can hold millions or even billions of records
§ Frequency
§Databases are designed to manage and process frequent
data operations efficiently (real-time updates and queries)
§ Speed
§Databases can perform queries, updates, and other
transactions much faster than spreadsheets
§Centralized data
§The data is centrally located: a collection of persistent
data that can be shared and interrelated
§Persistent – the data resides on stable storage because it
is used repeatedly
§Shared – the database can have multiple users
§Interrelated – data stored as a separate unit can be
connected to provide a whole picture
§A database contains a large volume of data about many aspects,
which is useful for decision making
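The "interrelated" property can be sketched with a join: data kept in separate tables is connected at query time to give a whole picture. The customers and orders tables and their contents are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Hana'), (2, 'Dawit');
INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# Connect the two separately stored units through the shared customer id
totals = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()
print(totals)
```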
§Single-user (PCs) or Multi-user (Workgroups, Enterprises)
§Centralized or Distributed (multi-location)
§Cloud database (MS Azure, Amazon AWS, IBM, Oracle, etc.)
§General purpose / Discipline-specific
§Operational (OLTP, Transactional, Production) or Analytical
§Structured or Unstructured (or Semi-structured)
§eXtensible Markup Language (XML) databases
§Not only SQL (NoSQL)
§Logical data format
§How you visualize your data
§Draw the entity relationship diagram (ERD)
§Physical data format
§Actual database
§Database management system (DBMS)
§Database system environments
§Hardware: electronic devices (server side, client side,…)
§Software: OS (operating systems), DBMS, applications, etc.
§Information: information lives in a database.
§Procedure: how data get into a database or come out from it.
§People: database expert, programmers, end user, etc.
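One way to picture the step from logical design to physical design is to turn a small ERD into table definitions. The sketch below assumes a hypothetical ERD with two entities, department and employee, in a one-to-many relationship; the names and columns are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled
conn.execute("PRAGMA foreign_keys = ON")

# Physical design: each entity becomes a table, the relationship a foreign key
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
""")

conn.execute("INSERT INTO department VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employee VALUES (100, 'Liya', 1)")

# An employee pointing at a non-existent department violates the design
try:
    conn.execute("INSERT INTO employee VALUES (101, 'Noah', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Here the DBMS, not the application, rejects the row that breaks the relationship drawn in the ERD.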
§DBMS (Database Management System)
§Software through which you can interact with a database
§A DBMS is a software system that enables users to define, create,
maintain, and control access to the database.
§Examples:
§Relational (SQL): MS Access, Oracle, MySQL, Ingres, MariaDB, PostgreSQL, Snowflake, SQLite, etc.
§NoSQL: MongoDB, DynamoDB, ScyllaDB, Redis, Neo4j, ArangoDB, HBase, Cassandra, etc.
§The most common query language is the Structured Query
Language (SQL, pronounced “S-Q-L” or sometimes “See-Quel”),
which is now both the formal and de facto standard
language for relational DBMSs.
§JSON (JavaScript Object Notation) plays a crucial role in NoSQL
databases, especially document-oriented databases like
MongoDB, Couchbase, and Firebase.
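To show what a JSON document looks like in a document-oriented store, the sketch below builds one record, serializes it, and reads a nested field back. It uses only Python's json module and does not talk to any real database; the field names (including "_id") are hypothetical.

```python
import json

# A single document: nested objects and arrays live inside one record,
# instead of being split across several relational tables.
document = {
    "_id": "user-1",
    "name": "Meron",
    "emails": ["meron@example.com"],
    "address": {"city": "Addis Ababa", "country": "Ethiopia"},
}

serialized = json.dumps(document, sort_keys=True)  # text form sent or stored
restored = json.loads(serialized)                  # back to Python objects
city = restored["address"]["city"]
print(serialized)
print(city)
```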
§Self-describing nature of the database system
§A database system contains not only the data itself but also metadata
§Metadata is a complete definition (description) of the database
structure and its constraints
§Insulation between data and program
§Also called program-data independence
§Metadata is stored in the DBMS catalog separately from the
access program
§The characteristic that allows program-data independence is
called data abstraction
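The self-describing nature can be observed by asking a DBMS for its own catalog. The sketch below uses SQLite, where the catalog is exposed through the sqlite_master table and the table_info pragma; other DBMSs expose their catalogs differently. The course table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

# The catalog lists the tables the database contains
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# table_info rows are (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]
print(tables)
print(columns)
```

Because programs can look up the structure in the catalog, the structure does not have to be hard-coded in every access program.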
§Support for multiple user views of the data
§A view may be a subset of the database, or it may contain virtual data
that is derived from the database files, but not explicitly stored
§A database has different users and each of them may require a
different perspective (view) of the database
§Sharing of data and multiple user transaction processing
§Concurrency control software ensures that multiple users
trying to update the same data do so in a controlled manner, so
that the result of the updates is correct
§Isolation property ensures that each transaction appears to
execute in isolation from other transactions, even though
hundreds of transactions may be executing concurrently
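The idea of multiple user views can be sketched with two SQL views over one table: one exposes a subset of the columns, the other exposes derived (virtual) data that is not stored explicitly. The table, the view names, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL, dept TEXT);
INSERT INTO employee VALUES
    (1, 'Liya', 5000, 'HR'), (2, 'Noah', 7000, 'IT'), (3, 'Sami', 9000, 'IT');

-- View 1: a subset of the database (salary is hidden)
CREATE VIEW employee_directory AS
    SELECT id, name, dept FROM employee;

-- View 2: virtual data derived from the stored rows
CREATE VIEW dept_payroll AS
    SELECT dept, SUM(salary) AS total_salary FROM employee GROUP BY dept;
""")

directory = conn.execute("SELECT * FROM employee_directory ORDER BY id").fetchall()
payroll = conn.execute("SELECT * FROM dept_payroll ORDER BY dept").fetchall()
print(directory)
print(payroll)
```

A general staff user might be given only the directory view, while a payroll user works with the aggregated one.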
§DBMS has built-in facilities to support concurrent or
parallel execution of database programs
§A transaction is a sequence of read/write operations treated
as an atomic unit: either all operations are
executed or none at all
§Read/write operations can be executed at the same time
by the DBMS
§DBMS should avoid any inconsistencies
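The all-or-nothing behaviour described above can be sketched with a money transfer. The account table, the CHECK constraint, and the transfer function are hypothetical; the example relies on sqlite3's documented behaviour that using a connection as a context manager commits on success and rolls back on an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between two accounts as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative, so the credit is undone too
        return False


ok = transfer(conn, 1, 2, 30.0)
failed = transfer(conn, 1, 2, 500.0)
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 had already been applied when the debit was rejected; the rollback removes it, so no inconsistent intermediate state survives.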
BACKUP AND RECOVERY FACILITIES
§Backup and recovery facilities can be used to deal with
the effect of loss of data due to hardware or network
errors, or bugs in system or application software
§Backup facilities can perform either a full or an incremental
backup
§Recovery facilities allow restoration of the data to a
previous state after loss or damage occurs
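A full backup followed by a recovery can be sketched with the backup method of Python's sqlite3 connections. Everything here runs in memory, and the notes table and the simulated data loss are hypothetical; a production setup would write the backup to separate, durable storage.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
source.execute("INSERT INTO notes (body) VALUES ('first'), ('second')")
source.commit()

# Full backup: copy the whole database into another connection
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate loss of data in the live database
source.execute("DELETE FROM notes")
source.commit()
lost = len(source.execute("SELECT id FROM notes").fetchall())

# Recovery: restore the previous state from the backup into a fresh database
restored = sqlite3.connect(":memory:")
backup_copy.backup(restored)
recovered = [row[0] for row in restored.execute("SELECT body FROM notes ORDER BY id")]
print(lost, recovered)
```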
DATA SECURITY
§Data security can be enforced by the DBMS
§Some users have read access, while others have write
access to the data (role-based functionality)
§Fine-grained control is possible (e.g., per table or per column)
§Data access can be managed via logins and passwords
assigned to users or user accounts
§Each account has its own authorization rules that can be
stored in the catalog
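Read versus write access can be sketched with SQLite's authorizer callback, which Python exposes as set_authorizer. This is only an approximation of role-based security: server DBMSs typically use user accounts with GRANT and REVOKE instead, and the "reader" and "writer" roles, the grades table, and the helper functions below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (student TEXT, score INTEGER)")
conn.execute("INSERT INTO grades VALUES ('Abel', 90)")
conn.commit()

WRITE_ACTIONS = {sqlite3.SQLITE_INSERT, sqlite3.SQLITE_UPDATE, sqlite3.SQLITE_DELETE}


def authorizer_for(role):
    """Build an authorizer callback for a hypothetical role name."""
    def authorizer(action, arg1, arg2, db_name, source):
        # Readers may not modify data; every other action is allowed
        if role == "reader" and action in WRITE_ACTIONS:
            return sqlite3.SQLITE_DENY
        return sqlite3.SQLITE_OK
    return authorizer


def try_write(conn):
    """Attempt an insert and report whether the DBMS allowed it."""
    try:
        conn.execute("INSERT INTO grades VALUES ('Beza', 85)")
        conn.commit()
        return True
    except sqlite3.Error:
        return False


conn.set_authorizer(authorizer_for("reader"))
reader_can_read = len(conn.execute("SELECT student FROM grades").fetchall()) == 1
reader_can_write = try_write(conn)

conn.set_authorizer(authorizer_for("writer"))
writer_can_write = try_write(conn)
print(reader_can_read, reader_can_write, writer_can_write)
```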
PERFORMANCE UTILITIES
§There are three key performance indicators (KPIs) of a DBMS
§Response time, denoting the time elapsed between issuing a
database request and the successful termination thereof
§Throughput rate, representing the transactions a DBMS can
process per unit of time
§Space utilization, referring to the space utilized by the DBMS to
store both raw data and metadata
§DBMSs come with various types of utilities aimed at
improving these KPIs
§e.g., utilities to distribute and optimize data storage, to tune
indexes for faster query execution, to tune queries to
improve application performance, or to optimize buffer
management
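Index tuning, one of the utilities listed above, can be sketched by comparing SQLite's query plan before and after an index is created. The orders table, its size, and the index name are hypothetical, and the exact plan wording differs between SQLite versions, so no expected plan text is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = 42"


def plan(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(conn, QUERY)  # without an index the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, QUERY)   # the planner can now use the new index

matches = conn.execute(QUERY).fetchall()
print(plan_before)
print(plan_after)
print(len(matches))
```

Using an index instead of a full scan reduces response time for this query, at the cost of extra space for the index, which shows how the three KPIs trade off against each other.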