More Complex SQL and Concurrency ControlModule 4.pptx
bgscseise
8 views
139 slides
Feb 25, 2025
Slide 1 of 139
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
About This Presentation
dsdfdsfds
Size: 9.72 MB
Language: en
Added: Feb 25, 2025
Slides: 139 pages
Slide Content
Module 4 Chapter – 1
COMPARISONS INVOLVING NULL AND THREE-VALUED LOGIC SQL offers several ways to check if an attribute value is NULL. Using the IS NULL operator : This operator is specifically designed to test for NULL values . I t can be used with WHERE clause like this : This query will return all rows where the your_attribute column has a NULL value . It filters the results to include only rows where the email column has a NULL value.
2) Equality comparison with NULL : You can also directly compare an attribute with NULL. However, it's important to note that this method might not work consistently across all database systems . This query attempts to select rows where your_attribute is equal to NULL. While it might work in some databases, it's generally safer and more portable to use the IS NULL operator.
3) IS NOT NULL :- This operator checks if an attribute value is NOT NULL. It essentially means the opposite of IS NULL .
IN and NOT IN operators IN operator Used to select rows where a column's value matches any value in a specified list . Improves readability compared to writing multiple OR conditions . Example:- Imagine a table Colors with a color_name column. You want to find all records where color_name is either 'red', 'green', or 'blue'.
NOT IN operator Used to select rows where a column's value does not match any value in a specified list . Useful for excluding specific values . Example:- Using the same Colors table, you want to find all records where color_name is not 'red' or 'blue'.
Both IN and NOT IN can be used with lists of values or subqueries that return sets of values . They work with various data types like numbers, strings, and dates . Duplicate values within the list are ignored during evaluation.
NESTED QUERIES, TUPLES AND SET/MULTISET COMPARISONS Nested queries Nested queries, also known as subqueries , are queries within a query . They allow for more complex database operations and can be used in various SQL statements like SELECT, INSERT, UPDATE, and DELETE . Types of Nested Queries Single-row subquery : Returns a single row and is used with comparison operators such as =, >, <, etc. Multiple-row subquery : Returns multiple rows and is used with operators like IN, ANY, ALL . Multiple-column subquery : Returns multiple columns and can be used with comparison operators or in the FROM clause .
Single row subquery Find the employee with the highest salary. EmployeeID EmployeeName DepartmentID Salary 1 Alice 1 50000 2 Bob 2 60000 3 Charlie 1 55000 4 David 3 70000 DepartmentID DepartmentName 1 HR 2 IT 3 Finance
2) Multiple-row Subquery Find employees who work in departments with a name starting with 'H '. EmployeeID EmployeeName DepartmentID Salary 1 Alice 1 50000 2 Bob 2 60000 3 Charlie 1 55000 4 David 3 70000 DepartmentID DepartmentName 1 HR 2 IT 3 Finance
3) Multiple-column Subquery Find the employee details who have the same salary and department as Alice. EmployeeID EmployeeName DepartmentID Salary 1 Alice 1 50000 2 Bob 2 60000 3 Charlie 1 55000 4 David 3 70000 DepartmentID DepartmentName 1 HR 2 IT 3 Finance
Example:- Find the employee details who have the same salary and department as Alice.
4) Subquery in the FROM Clause. Get the average salary of each department.
5) Correlated Subquery Find employees who are earning more than the average salary of all employees in the company.
Example :- ( Use a nested query for complex join conditions where direct joins are insufficient or too complex . ) Find employees whose salary is higher than the salary of any employee in the 'Finance' department. SELECT EmployeeName , Salary FROM Employees e1 WHERE Salary > ( SELECT MAX(Salary) FROM Employees e2 WHERE e2.DepartmentID = ( SELECT DepartmentID FROM Departments WHERE DepartmentName = 'Finance')); first get the department ID for 'Finance', find the maximum salary in that department, compare other employees' salaries to this maximum .
When to use nested queries?
‘ALL’ keyword in SQL The ALL keyword in SQL is used to compare a value to a set of values returned by a subquery . It’s typically used with comparison operators such as =, !=, >, <, >=, and <=. The ALL keyword ensures that the condition holds true for all values in the result set of the subquery.
Example:- Find employees with a salary greater than all salaries in the 'HR' department. EmployeeID EmployeeName DepartmentID Salary 1 Alice 1 50000 2 Bob 2 60000 3 Charlie 1 55000 4 David 3 70000 5 Eve 2 65000 DepartmentID DepartmentName 1 HR 2 IT 3 Finance SELECT EmployeeName , Salary FROM Employees WHERE Salary > ALL ( SELECT Salary FROM Employees WHERE DepartmentID = ( SELECT DepartmentID FROM Departments WHERE DepartmentName = 'HR‘ ) ); Resultant Table EmployeeName Salary Bob 60000 Eve 65000 David 70000
Example:- Find employees with a salary equal to the maximum salary of the 'Finance' department. EmployeeID EmployeeName DepartmentID Salary 1 Alice 1 50000 2 Bob 2 60000 3 Charlie 1 55000 4 David 3 70000 5 Eve 2 65000 DepartmentID DepartmentName 1 HR 2 IT 3 Finance SELECT EmployeeName , Salary FROM Employees WHERE Salary = ALL ( SELECT MAX(Salary) FROM Employees WHERE DepartmentID = ( SELECT DepartmentID FROM Departments WHERE DepartmentName = 'Finance')); EmployeeName Salary David 70000 Resultant Table
RENAMING IN SQL In SQL, it is possible to rename any attribute that appears in the result of a query by adding the qualifier AS followed by the desired new name . Hence, the AS construct can be used to alias both attribute and relation names in general, and it can be used in appropriate parts of a query .
EXISTS, NOT EXISTS, and UNIQUE 1) Exists :- The EXISTS keyword checks if a subquery returns any rows .
Find customers who have placed at least one order. The query returns customers Alice, Bob, and Charlie because they have corresponding entries in the ORDERS table. EXISTS clause only checks for the existence of rows and does not use the actual data from those rows , using SELECT 1 can be slightly more efficient than selecting actual columns. SELECT 1 makes it clear that the purpose of the subquery is merely to check for the existence of rows that meet the specified condition , not to retrieve data .
Find customers who have placed orders in 2023.
SELECT ID, NAME FROM CUSTOMERS c WHERE EXISTS ( SELECT 1 FROM ORDERS o WHERE o.CUSTOMER_ID = c.ID AND YEAR( o.ORDER_DATE ) = 2023); O.Cid C.cid Year 1 1 2023 2 2 2023 1 1 2023 3 3 2022 4 4 2023 5 5 2023 Not considered
Find products that have been sold.
2) NOT EXISTS :- The NOT EXISTS keyword in SQL is used to check if a subquery returns no rows . It's particularly useful for finding records that do not have corresponding entries in related tables . NOT EXISTS is used to check if a subquery returns no rows . It's particularly useful for finding records without corresponding entries in related tables . It can help identify gaps or missing relationships in your data . NOT EXISTS is often used in scenarios such as finding records without related entries, optimizing performance, and ensuring data integrity.
Finding Customers with No Orders
Finding Products Not Sold
The query returns the product Headphones because it has no corresponding entries in the SALES table.
3) UNIQUE The UNIQUE keyword in SQL is used to ensure that all values in a column or a set of columns are unique across all rows in the table . It helps maintain data integrity by preventing duplicate values in specified columns .
JOINED TABLES IN SQL AND OUTER JOINS Joins in SQL are a fundamental concept for working with relational databases. It allows to combine data from multiple tables based on a shared relationship between them. This is incredibly useful for retrieving comprehensive information that might be spread across different tables .
1) NATURAL join A NATURAL JOIN is a type of join that automatically joins tables based on columns with the same name and compatible data types in both tables .
The NATURAL JOIN combines the CUSTOMERS and ORDERS tables based on the common column CUSTOMER_ID . It automatically identifies the common column and joins the tables on this column . The result includes all columns from both tables, with rows matched based on the common column values.
2) Inner Join The INNER JOIN keyword selects records that have matching values in both tables . JOIN and INNER JOIN will return the same result . INNER is the default join type for JOIN , so when you write JOIN it actually writes INNER JOIN.
Studentcourse : Write a query to show the names and age of students enrolled in different courses.
3) Left Join The LEFT JOIN keyword returns all records from the left table (table1), and the matching records from the right table (table2 ) . For the rows for which there is no matching row on the right side , the result-set will contain null . LEFT JOIN is also known as LEFT OUTER JOIN .
Write a query to fetch all customers along with their order details if they have any.
4) Right Join It returns all the rows of the table on the right side of the join and matching rows for the table on the left side of the join . It is very similar to LEFT JOIN, for the rows for which there is no matching row on the left side , the result-set will contain null .
Resultant Table
5) Full Join It creates the result-set by combining results of both LEFT JOIN and RIGHT JOIN. The result-set will contain all the rows from both tables . For the rows for which there is no matching , the result-set will contain NULL values .
6) Multiway join A multiway join involves joining more than two tables in a single SQL query. This can be done using various types of joins such as INNER JOIN, LEFT JOIN, RIGHT JOIN, or FULL JOIN.
Views(Virtual Tables)in SQL In SQL, a view is a virtual table based on the result-set of an SQL statement . A view contains rows and columns, just like a real table. The columns in a view are columns from one or more real tables in the database . A view is created with the CREATE VIEW statement . Syntax to create a view:
Any modifications that are done in student table will be reflected all the view tables.
Any modifications that are done in view tables will be reflected to the main student table. Before updating of view table After updating of view table
Working with multiple tables and creating views
Specifying General Constraints as Assertions in SQL Assertions in SQL are constraints that enforce rules on the data across multiple tables or columns , ensuring the database remains in a consistent state . They are typically used to enforce complex business rules that cannot be implemented using standard constraints like PRIMARY KEY, FOREIGN KEY, CHECK, or UNIQUE. We want to enforce a rule that the total salary of employees in any department must not exceed $500,000. This kind of rule could be enforced with an assertion.
This is the command used to create a new assertion . Assertions are used to specify integrity constraints that apply to the entire table. name of the assertion This keyword introduces the condition that the assertion needs to verify. Condition
When to Use Assertions Cross-table constraints: Enforcing rules that involve multiple tables . Complex conditions: Rules that involve calculations or comparisons that aren't feasible with standard constraints . Support for Assertions in MySQL MySQL does not support CREATE ASSERTION. However, you can achieve similar results using: Triggers Stored Procedures Views with CHECK OPTION
Actions as triggers
AGGREGATE FUNCTIONS IN SQL Aggregate functions are used to summarize information from multiple tuples into a single-tuple summary . Grouping is used to create subgroups of tuples before summarization. Grouping and aggregation are required in many database applications, and we will introduce their use in SQL through examples . A number of built-in aggregate functions exist: COUNT, SUM, MAX, MIN, and AVG . The COUNT function returns the number of tuples or values as specified in a query . The functions SUM, MAX, MIN, and AVG can be applied to a set or multiset of numeric values and return, respectively, the sum, maximum value, minimum value, and average (mean) of those values . These functions can be used in the SELECT clause or in a HAVING clause .
Example:- Find the sum of the salaries of all employees, the maximum salary, the minimum salary, and the average salary . Example:-
“Group by” and “Having” clauses Group by clause :-
Having clause :- The HAVING clause in SQL acts like a refinement tool specifically for data grouped using GROUP BY . It allows you to filter these groups based on additional conditions applied to the summary values calculated by aggregate functions .
Orders table customer_id count(*) 1 6 Customer1 -> has placed 6 orders Customer 2 -> has placed 3 orders Customer3 -> has placed 3 orders
Resultant Table
Resultant Table
Other SQL Constructs: WITH and CASE WITH clause : The WITH clause, also known as Common Table Expressions (CTE ) the WITH clause enhances the clarity, efficiency, and manageability of SQL queries, making it a valuable tool for database developers and analysts .
Resultant Table
CTEs can be referenced multiple times within the main query, avoiding the need to repeat the same subquery code.
CASE in SQL The CASE statement in SQL is a conditional expression that can be used to add if-else logic to your queries . It allows you to return different values based on certain conditions .
1) Simple CASE expression
Query to give a salary adjustment based on the department
2) Searched CASE Expression You want to give a bonus based on salary:
Query to assign grades based on the score
CHAPTER 2 TRANSACTIONS
Transaction A set of logically related operations is known as transactions. Collection of operations that form a single logical unit of work are called transaction . or Transaction is defined as a logical unit of database processing that includes one or more database access operations.
Introduction to Transaction processing Single user versus multiuser systems One criterion for classifying a database system is according to the number of users who can use the system concurrently . A DBMS is single-user if at most one user at a time can use the system, and it is multiuser if many users can use the system and hence access the database concurrently . Single-user DBMSs are mostly restricted to personal computer systems ; most other DBMSs are multiuser . For example, an airline reservations system is used by hundreds of travel agents and reservation clerks concurrently. Database systems used in banks, insurance agencies, stock exchanges, supermarkets, and many other applications are multiuser systems. In these systems, hundreds or thousands of users are typically operating on the database by submitting transactions concurrently to the system .
Multiple users can access databases and use computer systems simultaneously because of the concept of multiprogramming , which allows the operating system of the computer to execute multiple programs or processes at the same time . A single central processing unit (CPU) can only execute at most one process at a time . M ultiprogramming operating systems execute some commands from one process, then suspend that process and execute some commands from the next process, and so on. A process is resumed at the point where it was suspended whenever it gets its turn to use the CPU again.
Figure shows two processes, A and B , executing concurrently in an interleaved fashion . Interleaving keeps the CPU busy when a process requires an input or output (I/O) operation, such as reading a block from disk. The CPU is switched to execute another process rather than remaining idle during I/O time . If the computer system has multiple hardware processors (CPUs) , parallel processing of multiple processes is possible, as illustrated by processes C and D.
Transactions, Database Items, Read and Write Operations, and DBMS Buffers A transaction is an executing program that forms a logical unit of database processing. A transaction includes one or more database access operations these can include insertion, deletion, modification, or retrieval operations . The database operations that form a transaction can either be embedded within an application program or they can be specified interactively via a high-level query language such as SQL . One way of specifying the transaction boundaries is by specifying explicit begin transaction and end transaction statements in an application program; in this case, all database access operations between the two are considered as forming one transaction.
A single application program may contain more than one transaction if it contains several transaction boundaries. If the database operations in a transaction do not update the database but only retrieve data , the transaction is called a read-only transaction ; otherwise it is known as a read-write transaction . A database is basically represented as a collection of named data items . The size of a data item is called its granularity . A data item can be a database record , but it can also be a larger unit such as a whole disk block , or even a smaller unit such as an individual field (attribute) value of some record in the database.
.
Transaction States and Additional Operations A transaction is an atomic unit of work that should either be completed in its entirety or not done at all. For recovery purposes, the system needs to keep track of when each transaction starts, terminates, and commits or aborts. BEGIN TRANSACTION :- This marks the beginning of transaction execution. READ or WRITE :- These specify read or write operations on the database items that are executed as part of a transaction. END TRANSACTION :- This specifies that READ and WRITE transaction operations have ended and marks the end of transaction execution . At this point it may be necessary to check whether the changes introduced by the transaction can be permanently applied to the database (committed) or whether the transaction has to be aborted because it violates serializability or for some other reason . COMMIT TRANSACTION :- This signals a successful end of the transaction so that any changes (updates) executed by the transaction can be safely committed to the database and will not be undone . ROLLBACK :- This signals that the transaction has ended unsuccessfully, so that any changes or effects that the transaction may have applied to the database must be undone.
Active :- The transaction is currently executing operations. In this state, the transaction is reading or writing data and has not yet finished all its intended operations . Partially Committed :- The transaction has completed its final operation, but the changes are not yet permanent. This state occurs right after the transaction executes its last statement but before the changes are made permanent . Committed :- The transaction has successfully completed all operations, and all changes are now permanent. In this state, the transaction's effects are written to the database, and they are now durable and visible to other transactions.
Failed :- The transaction cannot continue due to some error or issue. This can happen due to system crashes, errors, or violations of constraints. The transaction will not complete successfully . Aborted :- The transaction has been rolled back and undone. All changes made by the transaction are undone, and the database is returned to its previous state before the transaction began. The transaction can be restarted or terminated . Terminated :- The transaction has completed its execution (either committed or aborted) and is no longer active. This is the final state, indicating that the transaction has ended.
Desirable properties of transactions There are three properties that are involved: Atomicity Consistency Isolation Durability
What is concurrency control? Problems that encounter during concurrency control of transactions Concurrency control in a Database Management System (DBMS) refers to the techniques and mechanisms used to ensure that database transactions are executed concurrently without violating the integrity of the database . It ensures that the database remains consistent( standard ) despite the simultaneous execution of multiple transactions. Problems that encounter during concurrency control are as follows: 1) The lost update problem 2) The temporary update (or dirty read) problem 3) The incorrect summary problem 4) The unrepeatable read problem
1) Lost update problem The "lost update problem" is a concurrency control issue in database systems that occurs when two or more transactions read the same data and then update it based on the value they read . If proper concurrency control mechanisms are not in place, one of the updates may be overwritten by another, leading to the loss of data . This problem can compromise the consistency and correctness of the database.
2) The temporary update ( or dirty read) problem
3) The incorrect summary problem "The incorrect summary problem" is a concurrency control issue that occurs when a transaction calculates an aggregate summary of data while other transactions are concurrently modifying the same data . This can lead to inaccurate or inconsistent summary results . Example: AccountID Balance 1 500 2 300 3 700
4) The unrepeatable read problem
Why recovery is needed? Whenever a transaction is submitted to a DBMS for execution , the system is responsible for making sure that either All the operations in the transaction are completed successfully and their effect is recorded permanently in the database . the transaction does not have any effect on the database or any other transactions . In the first case , the transaction is said to be committed , where as in the second case , the transaction is aborted . If the transaction fails after executing some of its operations but before executing all of them, the operations already executed must be undone and have no lasting effect.
Types of failures A computer failure (System crash) :- A hardware, software, or network error occurs in the computer system during transaction execution. Hardware crashes are usually media failures—for example, main memory failure . A transaction or system error :- Some operation in the transaction may cause it to fail, such as integer overflow or division by zero . Transaction failure may also occur because of erroneous parameter values or because of a logical programming error . Local errors or exception conditions detected by the transaction :- During transaction execution, certain conditions may occur that necessitate cancellation of the transaction. For example, data for the transaction may not be found . An exception condition, such as insufficient account balance in a banking database , may cause a transaction, such as a fund withdrawal, to be canceled . This exception could be programmed in the transaction itself.
4) Concurrency control enforcement :- The concurrency control method may abort a transaction because it violates serializability or it may abort one or more transactions to resolve a state of deadlock among several transactions. Transactions aborted because of serializability violations or deadlocks are typically restarted automatically at a later time . 5) Disk failure :- Some disk blocks may lose their data because of a read or write malfunction. This may happen during a read or a write operation of the transaction . 6) Physical problems and catastrophes :- This refers to an endless list of problems that includes power or air-conditioning failure , fire, theft, sabotage(destroy/damage), overwriting disks or tapes by mistake, and mounting of a wrong tape by the operator .
Transaction support in SQL Transaction initiation will be done implicitly. Transaction will end explicitly using commit/rollback. Every transaction has certain characteristics attributed to it. These characteristics are specified by a SET TRANSACTION statement in SQL. Characteristics include access mode, diagnostic area size, isolation level. Access mode :- Access mode can be read-only or read-write. The default access mode can be read-write. When it is in read-write, then select, update, insert, delete and create commands to be executed. When it is in read-only, then it can perform data retrieval. Diagnostic area size :- Sets the size of the diagnostics area to n, which determines how many error or warning conditions can be stored . Isolation level :- This isolation level ensures the highest level of isolation between transactions, preventing dirty reads, non-repeatable reads, and phantom reads .