nullChapter 5 The Database Language SQLChapter 5 The Database Language SQLSQL (sometimes pronounced “sequel”)
--- stands for “Structured Query Language”
Evolution of SQL standard:
SQL89 SQL92 (SQL2) SQL3
SQL is based on set and relational operations with certain modifications and enhancements
Components of SQL:
DDL: Data Definition Language
-CREATE TABLE, DROP TABLE, etc.
DML: Data Manipulation Language
-Query: SELECT
-Modification: INSERT, DELETE, UPDATE
5.1 Simple Queries in SQL5.1 Simple Queries in SQLPrincipal form: SELECT A1, A2, ..., An FROM r1, r2, ..., rm WHERE P
Ais represent attributes
ris represent relations
P is a predicate.
This query is equivalent to the relational algebra expression.
A1, A2, ..., An(P (r1 x r2 x ... x rm))
Formal Semantics of Single-Relation SQL Query
Start with the relation in the FROM clause.
Apply (bag) , using condition in WHERE clause.
Apply (extended, bag) using attributes in SELECT clause.
The result of an SQL query is a relation.5.1 Simple Queries in SQL (cont.)5.1 Simple Queries in SQL (cont.)Example:
What movies are produced by Disney Studios in 1990?
Movie(title, year, length, inColor, studioName, producerC#)
SELECT *
FROM Movie
WHERE studioName = ’Disney’ AND year = 1990
Note:
single quotes for strings.
An asterisk in the select clause denotes “all attributes”
Equivalent Operational Semantics
Imagine a tuple variable ranging over all tuples of the relation. For each tuple:
Check if it satisfies the WHERE clause.
Print the values of terms in SELECT, if so.5.1.1 Projection in SQL -- The select clause 5.1.1 Projection in SQL -- The select clause Star as List of All Attributes
The select clause corresponds to the projection operation of the relational algebra. It is used to list the attributes desired in the result of a query.
Find the titles and length of all movies are produced by Disney Studios in 1990?
Movie(title, year, length, inColor, studioName, producerC#)
SELECT title, length
FROM Movie
WHERE studioName = ’Disney’ AND year = 1990
In the “pure” relational algebra syntax, what would the query be?
Renaming columns
The SQL allows renaming relations and attributes using the as clause: old-name AS new-name
Example:
SELECT title AS name, length AS duration
FROM Movie
WHERE studioName = ’Disney’ AND year = 1990
5.1.1 Projection in SQL -- The select clause (cont.)5.1.1 Projection in SQL -- The select clause (cont.)Expressions AS Values in Columns
Example:
SELECT title AS name, length*0.01667 AS lengthInHours
FROM Movie
WHERE studioName = ’Disney’ AND year = 1990
Trick: If you want an answer with a particular string in each row, use that constant as an expression.
SELECT title AS name, length*0.01667 AS length, ‘hrs.’ AS inHours
FROM Movie
WHERE studioName = ’Disney’ AND year = 1990
Summary: The select clause can contain arithmetic expressions involving the operation, +, –, *, and /, and operating on constants or attributes of tuples.
Remember: SQL is case insensitive, meaning you can use upper/lower case as you like.
Only inside quoted strings does case matter.
5.1.2 Selection in SQL -- The where clause 5.1.2 Selection in SQL -- The where clause The where clause corresponds to the selection predicate of the relational algebra, if consists of a predicate involving attributes of the relations that appear in the from clause.
Conditions in WHERE clause can use logical operators AND, OR, NOT and parentheses in the usual way.
Comparison results can be combined using the six common comparison connectives =, <>, <, >, <=, and >= .
Comparisons can be applied to results of arithmetic expressions.
Example:
Find the titles of all movies made by MGM studios that were either made after 1970 or were less than 90 minutes long?
Movie(title, year, length, inColor, studioName, producerC#)
SELECT title
FROM Movie
WHERE (year > 1970 OR length < 90) AND studioName = ’MGM’
5.1.2 Selection in SQL -- The where clause (cont.)5.1.2 Selection in SQL -- The where clause (cont.)SQL Includes a between comparison operator in order to simplify where clauses that specify that a value be less than or equal to some value and greater than or equal to some other value.
Find the titles of those movies with the length between 90 and 120 (that is, 90 and 120). select title from movie where length between 90 and 1205.1.3 Comparison of strings5.1.3 Comparison of stringsSQL includes a string-matching operator for comparisons on character strings.
“Attribute LIKE pattern” is a condition that is true if the string value of the attribute matches the pattern.
Also NOT LIKE for negation.
Patterns are described using two special characters:
percent % : stands for any string.
Underscore _ : stands for any one character.
Example1
We remember a movie “Star something”, and we remember that the something has four letters. What could this movie be?
SELECT title
FROM movie
WHERE title LIKE ’Star _ _ _ _’
Note patterns must be quoted, like strings.5.1.3 Comparison of strings (cont.)5.1.3 Comparison of strings (cont.)Example2
Find all movies with a possessive (‘s) in their titles.
SELECT title
FROM movie
WHERE title LIKE ’%’’s%’
Note: two single-quotes in a character string represent one single quote.
SQL supports a variety of string operations such as
concatenation (using “||”)
converting from upper to lower case (and vice versa)
finding string length, extracting substrings, etc.
5.1.4 Comparing Dates and Times5.1.4 Comparing Dates and TimesImplementations of SQL generally support dates and times as special data types.
5/4/1948 or 14 May 1948 or DATE ’1948-05-14’
TIME ‘15:00:02.5’
We can compare dates or times using the same comparison operators (<, >, <=, >=, <>, =) we use for numbers or strings.5.1.5 Ordering the output5.1.5 Ordering the outputTo get the movies listed by length, shortest first, and among movies of equal length, alphabetically, we can say:
select * from movie where studioName = ‘Disney’ and year=1990 order by length, title
We may specify desc for descending order or asc for ascending order, for each attribute; ascending order is the default.
E.g. order by length desc
We can use the number of the attributes instead of the names.
order by 3,1
5.2 Queries involving more than one relation5.2 Queries involving more than one relationSometimes, queries involving more than one relation.
List of relations in FROM clause.
Relation-dot-attribute disambiguates attributes from several relations.
Same as for single relation, but start with the product of all the relations mentioned in the FROM clause.5.2.1 Products and Joins in SQL5.2.1 Products and Joins in SQLList each relations in the FROM clause. Then the SELECT and WHERE clauses can refer to the attributes of any the relations in the FROM clause.
Example: Movie(title, year, length, inColor, studioName, producerC#)
MovieExec(name, address, cert#, netWorth)
Find the name of the producer of Star Wars.
SELECT name
FROM Movie, MovieExec
WHERE title = ’Star Wars’ AND producerC# = cert#
Operational Semantics
Consider a tuple variable for each relation in the FROM.
Imagine these tuple variables each pointing to a tuple of their relation, in all combinations (e.g., nested loops).
If the current assignment of tuple-variables to tuples makes the WHERE true, then output the attributes of the SELECT.5.2.2 Disambiguating attributes5.2.2 Disambiguating attributesList of relations in FROM clause.
Relation-dot-attribute disambiguates attributes from several relations.
Example
Find pairs consisting of a star and an executive with the same address.
MovieStar(name, address, gender, birthdate)
MovieExec(name, address, cert#, netWorth)
SELECT MovieStar.name, MovieExec.name
FROM MovieStar, MovieExec
WHERE MovieStar.address = MovieExec.address5.2.3 Tuple Variables5.2.3 Tuple VariablesSometimes we need to refer to two or more copies of a relation.
Use tuple variables as aliases of the relations.
Tuple variables are defined in the from clause via the use of the as clause.
Example: Find pairs of movie stars by the same address.
MovieStar(name, address, gender, birthdate)
SELECT Star1.name, Star2.name
FROM MovieStar AS Star1, MovieStar AS Star2
WHERE Star1.address = Star2.address AND Star1.name < Star2.name
SQL permits AS between relation and its tuple variable; Oracle does not.
Note that Star1.name < Star2.name is needed to avoid producing (Sue, Sue) and to avoid producing a pair in both orders.
-if we used <> as the comparison operator, then we would produce pairs of stars twice.
star1.name star2.name
Alec Baldwin Kim Basinger
Kim Basinger Alec Baldwin
5.2.4 Interpreting Multirelation Queries5.2.4 Interpreting Multirelation QueriesThere are several ways to define the meaning of the select-from-where expressions.
Nested loops.
Consider a tuple variable for each relation in the FROM.
Imagine these tuple variables each pointing to a tuple of their relation, in all combinations (e.g., nested loops).
If the current assignment of tuple-variables to tuples makes the WHERE true, then output the attributes of the SELECT.
For each tuple t1 in relation R1 do
For each tuple t2 in relation R2 do
……
For each tuple tn in relation Rn do
if the where clause is satisfied when the values from t1,t2…tn
are substituted for all attribute references then
evaluate the attributes of the select clause according to
t1,t2…tn and produce the tuple of values that results.null2. Parallel Assignment
We consider in arbitrary order or in parallel all possible assignments of tuples from the appropriate relations to the tuple variables.null3. Conversion to Relational Algebra
We start with the tuple variables in the FROM clause and take their Cartesian produce.
Use a selection operator to it by converting the WHERE clause to a selection condition.
Finally, we create from the select clause a list of attributes for a final projection operation.
Example:
Select star1.naem, star2.name
From Moviestar AS star1, Moviestar AS star2
Where star1.address = star2.address AND star1.name < star2.name
A1,A5 (A2=A6 AND A1
10000000 )
Find the names and addresses of movie stars who are not also movie executives:
(select name, address from MovieStar )
except (select name, address from MovieExec )5.3 Subqueries5.3 Subqueries Result of a select-from-where query can be used in the where-clause of another query.5.3.1 Subqueries that produce scalar values5.3.1 Subqueries that produce scalar valuesSimplest Case: Subquery Returns a Single, Unary Tuple
Example: Movie(title, year, length, inColor, studioName, producerC#)
MovieExec(name, address, cert#, netWorth)
Find the name of the producer of Star Wars.
SELECT name
FROM Movie, MovieExec
WHERE title = ’Star Wars’ AND producerC# = cert#
SELECT name
FROM MovieExec
WHERE cert# =
(SELECT producer#
FROM Movie
WHERE title = ’Star Wars’ )
Notice the scoping rule: an attribute refers to the most closely nested relation with that attribute.
Parentheses around subquery are essential.5.3.2-3 Conditions involving relations and tuples5.3.2-3 Conditions involving relations and tuplesThe IN Operator
“Tuple IN relation” is true iff the tuple is in the relation.
Example
Find the name and manufacturer of beers that Fred likes.
Beers(name, manf)
Likes(drinker, beer)
SELECT *
FROM Beers
WHERE name IN
(SELECT beer
FROM Likes
WHERE drinker = 'Fred’)
Also: NOT IN.5.3.2-3 Conditions involving relations and tuples (cont.)5.3.2-3 Conditions involving relations and tuples (cont.)EXISTS
“EXISTS(relation)” is true iff the relation is nonempty.
Example
Find the beers that are the unique beer by their manufacturer.
Beers(name, manf)
SELECT name
FROM Beers b1
WHERE NOT EXISTS
(SELECT *
FROM Beers
WHERE manf = b1.manf AND
name <> b1.name);
Note scoping rule: to refer to outer Beers in the inner subquery, we need to give the outer a tuple variable, b1 in this example.
A subquery that refers to values from a surrounding query is called a correlated subquery.
5.3.2-3 Conditions involving relations and tuples (cont.)5.3.2-3 Conditions involving relations and tuples (cont.)Quantifiers: ANY and ALL behave as existential and universal quantifiers, respectively.
Beware: in common parlance, “any” and “all” seem to be synonyms, e.g., “I am fatter than any of you” vs. “I am fatter than all of you.” But in SQL:
Example
Find the beer(s) sold for the highest price.
Sells(bar, beer, price)
SELECT beer
FROM Sells
WHERE price >= ALL(
SELECT price
FROM Sells);
Class Problem
Find the beer(s) not sold for the lowest price.5.3.4 Correlated subqueries5.3.4 Correlated subqueriesFind the titles that have been used for two or more movies.
Movie(title, year, length, inColor, studioName, producerC#)
SELECT title
FROM Movie AS Old
WHERE year < ANY
(SELECT year
FROM Movie
WHERE title = Old.title )5.4 Duplicates (Forcing Set/Bag Semantics)5.4 Duplicates (Forcing Set/Bag Semantics)Default for select-from-where is bag; default for union, intersection, and difference is set.
Why? Saves time of not comparing tuples as we generate them.
But we need to sort anyway when we take intersection or difference. (Union seems to be thrown in for good measure!)5.4.1 Eliminating duplicatesForce set semantics with DISTINCT after SELECT.
But make sure the extra time is worth it.
Example:
Find the different prices charged for beers.
Sells(bar, beer, price)
SELECT DISTINCT price
FROM Sells
5.4.1 Eliminating duplicates5.4.2 Duplicates in Unions, Intersection, and Difference5.4.2 Duplicates in Unions, Intersection, and DifferenceForce bag semantics with ALL after UNION, etc.
Example: MovieStar (name, address, gender, birthdate)
MovieExec (name, address, cert#, netWorth)
Find the names all movie stars and movie executives:
(select name from MovieStar)
union all
(select name from MovieExec)
Suppose a tuple occurs m times in r and n times in s, then, it occurs:
m + n times in r union all s
min(m,n) times in r intersect all s
max(0, m – n) times in r except all s5.5 Aggregations5.5 AggregationsSum, avg, min, max, and count apply to attributes/columns. Also, count(*) applies to tuples.
sum: sum of values avg: average value min: minimum value max: maximum value count: number of values
Use these in lists following SELECT.
Example: Find the average price of Bud.
Sells(bar, beer, price)
SELECT AVG(price)
FROM Sells
WHERE beer = 'Bud'5.5.1 Aggregation operators5.5.1 Aggregation operators (cont.)Eliminating Duplicates Before Aggregation
Example:
Find the number of different prices at which Bud is sold.
Sells(bar, beer, price)
SELECT COUNT(DISTINCT price)
FROM Sells
WHERE beer = 'Bud';
DISTINCT may be used in any aggregation, but typically only makes sense with COUNT.5.5.1 Aggregation operators (cont.)5.5.2 Grouping5.5.2 GroupingFollow select-from-where by GROUP BY and a list of attributes.
The relation that is the result of the FROM and WHERE clauses is grouped according to the values of these attributes, and aggregations take place only within a group.
Example
Find the average sales price for each beer.
Sells(bar, beer, price)
SELECT beer, AVG(price)
FROM Sells
GROUP BY beer5.5.2 Grouping (cont.)5.5.2 Grouping (cont.)Example:
Print the bars in the relation Sells.
Sells(bar, beer, price)
SELECT bar
FROM Sells
GROUP BY bar
This query has the same effect as:
SELECT DISTINCT bar
FROM Sells
5.5.3 HAVING Clauses5.5.3 HAVING ClausesHAVING clauses are selections on groups, just as WHERE clauses are selections on tuples.
Example: Find the average price of those beers that are either served in at least 3 bars or manufactured by Anheuser-Busch.
Beers(name, manf) Sells(bar, beer, price)
SELECT beer, AVG(price)
FROM Sells
GROUP BY beer
HAVING COUNT(*) >= 3 OR
beer IN (
SELECT name
FROM Beers
WHERE manf = 'Anheuser-Busch’ )nullOrder of clauses in SQL queries
(1) SELECT
(2) FROM
(3) WHERE
(4) GROUP BY
(5) HAVING
(6) ORDER BY
Only the first two are required.5.6 DB Modifications5.6 DB ModificationsModification = insert + delete + update.5.6.1 InsertionInsertion of a Tuple
INSERT INTO relation VALUES (list of values)
Inserts the tuple = list of values, associating values with attributes in the order the attributes were declared.
Forget the order? List the attributes as arguments of the relation.
Example: Insert the fact that Sally likes Bud.
Likes(drinker, beer)
INSERT INTO Likes(drinker, beer)
VALUES('Sally', 'Bud');5.6.1 Insertion (cont.)5.6.1 Insertion (cont.)Insertion of the Result of a Query
INSERT INTO relation (subquery)
Example: Suppose we want to add to the relation
Likes (drinker, beer)
all beers that are mentioned in the relation
Sells (bar, beer, price)
INSERT INTO Likes (beer)
(SELECT DISTINCT beer
FROM Sells
WHERE beer NOT IN
(SELECT beer
FROM Likes)5.6.2 Deletion5.6.2 DeletionDE