The Relational Algebra & Relational Calculus With Examples From All Languages
1. Example Queries of the Company Database.
Q1. Retrieve the name and address of all employees who work for the 'Research' department.
a. The Mathematical Set.
q1 ≐ {e.lname, e.address | e ∈ employee and e.dno = 5}
q1 = {e.lname, e.address | e ∈ employee and ∃ d ∈ department ∋: d.dname = 'research' and e.dno = d.dnumber}.
The second form is preferred because it doesn’t rely on the constant 5. Constants, we’re told, could be changed in the future, and we should write queries to prepare for such possibilities. This sounds like nonsense, because 'research' is itself a constant standing in for 5, and it could change as well.
b. Tuple Relational Calculus. The transition to TRC from mathematical sets is straightforward. The variables, which in TRC represent tuples, must be capitalized in WinRDBI.
q1 = {E.lname, E.address | employee(E) and (∃D)(department(D) and D.dname = 'research' and E.dno = D.dnumber)}.
c. Domain Relational Calculus. Here the variables are attributes and, in WinRDBI, they must be capitalized.
q1 = {Ln, Addr | (∃Dnum)(employee(_, _, Ln, _, _, Addr, _, _, Dnum) and department('Research', Dnum, _, _))}.
A product of O R B E • an extension of TEX • by R J S
d. Relational Algebra. Three approaches:
er:=select dno = 5 (employee);
dn:=project dnumber (select dname = 'Research' (department));
q1:=select dnumber = dno (dn product employee);
dnp(dno):=project dnumber (select dname = 'Research' (department));
q1p:= dnp njoin employee;
e. SQL. The general form of a query is
SELECT x.a1, y.a2, ···
FROM r1 AS x, r2 AS y, ···
WHERE selection conditions AND / OR join conditions
GROUP BY x′′.a′′1, y′′.a′′2, ···
HAVING p1 AND / OR p2 ···
ORDER BY x′.a′1, y′.a′2, ···
The following command, issued from the mysql> prompt, worked without any capitalization, in spite of the fact that the attributes of the database were named using capitals and the character constants were written using capitals.
SELECT e.fname, e.lname, e.address
FROM employee as e, department as d
WHERE d.dname = 'Research'
  and d.dnumber = e.dno;
Another option, to be used if the results of the query are to be used elsewhere, is this:
CREATE VIEW q1 AS
SELECT e.fname, e.lname, e.address
FROM employee as e, department as d
WHERE d.dname = 'Research'
  and d.dnumber = e.dno;
SELECT * FROM q1;
Use DROP VIEW q1; to delete the view so created.
Q2. For every project located in 'Stafford', list the project number, the controlling department number, and the department manager’s last name, address, and birthdate.
a. The Mathematical Set.
b. Tuple Relational Calculus. I think the lesson here is that any variable not used to the left of | must be quantified by either ∀ or ∃.
q2:={P.pnumber, D.dnumber, E.lname | projects(P) and department(D) and employee(E)
and P.plocation = 'Stafford' and P.dnum = D.dnumber and D.mgrssn = E.ssn};
Another option is this,
q2p:= {P.pnumber, P.dnum, E.lname | projects(P) and employee(E) and
P.plocation= 'Stafford' and (exists D)(department(D) and
P.dnum=D.dnumber and D.mgrssn=E.ssn)};
A tuple variable is bound if it is quantified by ∀ or ∃; otherwise, it’s free. The first expression, q2, contains free variables P, D, E and no bound variables. The second expression, q2p, contains free variables P and E and bound variable D. P.plocation = 'Stafford' is an example of a selection condition, and P.dnum = D.dnumber and D.mgrssn = E.ssn are join conditions.
c. Domain Relational Calculus.
d. Relational Algebra.
ps(pnumber, dnumber):=project pnumber, dnum (select plocation = 'Stafford' (projects));
psmgr(pnumber, dnumber, ssn):=project pnumber, dnumber, mgrssn(ps njoin department);
q2:=project pnumber, dnumber, lname(psmgr njoin employee);
e. SQL.
SELECT
pnumber,
dnum,
lname
FROM
projects,
department,
employee
WHERE
dnum = dnumber
and
mgr_ssn = ssn
and
plocation = 'Stafford'
;
mysql> SELECT p.pnumber, p.dnum, e.lname FROM project as p, employee as e,
department as d WHERE p.dnum=d.dnumber and d.Mgr_ssn=e.ssn and p.plocation='Stafford';
mysql> SELECT pnumber, dnum, lname FROM ((project JOIN department
ON dnumber=dnum) JOIN employee ON Mgr_ssn = ssn) WHERE plocation =
'Stafford';
Q′′3 . Find the maximum salary of all employees.
a. The Mathematical Set. The maximum is given by the following mathematical set,
q′′3 ≐ {e.salary | e ∈ employee ∋: e′ ∈ employee ⇒ e′.salary ≤ e.salary}.
There is an implicit ∀ preceding e′ in this statement, but I think it reads better without ∀.
b. Tuple Relational Calculus. This set is given in the tuple relational calculus by the following set,
q′′3 ≐ {e.salary | employee(e) and (∀e′)(employee(e′) and e′.salary ≤ e.salary)}. (a)
which, according to the first transformation on p. 180, is logically equivalent to this set,
q′′3 ≐ {e.salary | employee(e) and not (∃e′)(not (employee(e′)) or (e′.salary > e.salary))}. (b)
The domain of an expression in the relational calculus is the set of values of any tuples of relations referenced by the expression or any constant values appearing in the expression. An expression is safe if all values of its results are from its domain, unsafe if its result begs consideration of values from outside that domain. The latter expression (b) is unsafe; an attempt to run its equivalent in WinRDBI’s TRC crashed my system, and my OS forced me to close WinRDBI. The first expression (a) for q′′3, the one with ∀, met simply with a “check input” error, but it too must be regarded as unsafe, since it’s logically equivalent to an unsafe expression. So, how does one arrive at a working expression? The answer to this question is everyone’s first lesson in first-order predicate logic. Suppose f(a) and g(a) are statements, one such statement for each a of some ambient space called the domain or universe of discourse. The following may be viewed as truth-preserving transformations of logical expressions,
(∀a)(f(a) ⇒ g(a)) is false ⇐⇒ (∃a)(f(a) and not g(a)) (Lf)
(∀a)(f(a) ⇒ g(a)) is true ⇐⇒ not (∃a)(f(a) and not g(a)) (Lt)
Expression (a) is meant to say that every employee tuple e′ satisfies e′.salary ≤ e.salary, so the and inside its ∀ stands for ⇒. Read that way, (a) transforms under Lt to the following,
q′′3 ≐ {e.salary | employee(e) and not (∃e′)(employee(e′) and e′.salary > e.salary)}. (c)
This expression is safe and WinRDBI evaluates it, returning the desired maximum. What we’ve accomplished here is this. Starting with an affirmative expression containing ∀, such as (a), we’ve arrived at a safe expression, namely (c), that WinRDBI likes. (Affirmative expressions are a good place to start because one would expect the introduction of not into them to go most routinely.) Let’s proceed now to the infamous Q3.
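The passage from the ∀ form to the not ∃ form can be checked mechanically on a small example. The following Python sketch is not part of the original notes: the employee list is made-up data, and Python's all and any stand in for ∀ and ∃ ranging over the employee relation only, which is what keeps both forms safe.

```python
# Hypothetical toy data: (lname, salary) pairs standing in for the employee relation.
employee = [("Smith", 30000), ("Wong", 40000), ("Borg", 55000), ("Wallace", 43000)]

# The "for all" reading: every e2 in employee satisfies e2.salary <= e.salary.
forall_form = {e[1] for e in employee
               if all(e2[1] <= e[1] for e2 in employee)}

# The "not exists" reading (c): no e2 in employee has e2.salary > e.salary.
not_exists_form = {e[1] for e in employee
                   if not any(e2[1] > e[1] for e2 in employee)}

print(forall_form, not_exists_form)  # {55000} {55000}
```

Both comprehensions restrict e2 to tuples of employee, so neither needs to consider values outside the relation.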
c. Domain Relational Calculus.
d. Relational Algebra. q′′3 = ℱ_{MAXIMUM salary}(employee)
e. SQL.
mysql> SELECT lname, salary FROM employee WHERE salary >= ALL (SELECT salary FROM
employee);
Q3. Find the employees who work on all the projects controlled by department number 5. The query has been simplified slightly, eliminating a projection that is both easily handled and irrelevant to the issue at hand.
a. The Mathematical Set. As before, let’s start with an affirmative expression in the language of mathematical set theory,
q3 ≐ {e ∈ employee | (p ∈ projects ∋: p.dnum = 5) ⇒ (∃w ∈ works_on ∋: w.essn = e.ssn and w.pno = p.pnumber)}.
There is an implicit ∀ preceding (p . . .) in this statement, but I think it reads better without ∀.
b. Tuple Relational Calculus. The direct equivalent of this set in the tuple relational calculus is this,
q3 ≐ {e | employee(e) and (∀p)(projects(p) and p.dnum = 5 and (∃w)(works_on(w) and w.essn = e.ssn and w.pno = p.pnumber))}.
The expression contains (∀p)(f(p) ⇒ g(p)), where ⇒ is the and preceding ∃ w, and so, according to (Lt), the set transforms to this one,
q3 ≐ {e | employee(e) and not (∃p)(projects(p) and p.dnum = 5 and not (∃w)(works_on(w) and w.essn = e.ssn and w.pno = p.pnumber))}.
This expression (compare Q3a, p. 182) is safe and WinRDBI evaluates it, returning the desired list of employees (capitalize the names of variables, which, for the tuple relational calculus, represent tuples),
q3:={E | employee(E) and not (exists P)( ( projects (P) and P.dnum = 5 ) and
not (exists W)( works_on(W) and W.essn = E.ssn and W.pno = P.pnumber ) )};
There is just one more step to go to capture everything in section 6.6.6 on p. 180. The expression in the previous set is, according to the fourth transformation of p. 180 (read backwards), equivalent to the following set,
q3 ≐ {e | employee(e) and (∀p)(not (projects(p) and p.dnum = 5) or (∃w)(works_on(w) and w.essn = e.ssn and w.pno = p.pnumber))},
and by De Morgan’s law, this set is equivalent to the following set,
q3 ≐ {e | employee(e) and (∀p)(not (projects(p)) or not (p.dnum = 5) or (∃w)(works_on(w) and w.essn = e.ssn and w.pno = p.pnumber))}.
This expression is safe, it’s equivalent to the expression at the bottom of p. 180 (minus a few unnecessary pairs of parentheses), and WinRDBI evaluates it, returning the desired list of employees,
q3:={E | employee(E) and (forall P) ( not (projects (P)) or not (P.dnum = 5)
or (exists W)( works_on(W) and W.essn = E.ssn and W.pno = P.pnumber ) ) };
c. Domain Relational Calculus.
d. Relational Algebra. Queries involving ∀ are implemented in the relational algebra with a division. WinRDBI doesn’t support the division operator, so you have to carry it out in terms of elementary operations.
r:=project essn, pno(works_on);
s(pno):=project pnumber(select dnum=5(projects));
%now evaluate r division s, using the layout of my notes.
%R={essn, pno}, S={pno}, R-S={essn}
x:=project essn(r);
y:=(x product s) difference r;
z:=project essn(y);
w:=x difference z;
q3:=project lname, ssn(select essn=ssn(w product employee));
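The four intermediate relations x, y, z, w of the division can be traced with ordinary sets. The sketch below is an illustration only: the tuples in r and s are made-up data, not the contents of the company database, and Python sets play the part of WinRDBI relations.

```python
# Hypothetical toy relations; the names follow the WinRDBI script above.
r = {("111", 1), ("111", 2), ("222", 1), ("333", 1), ("333", 2), ("333", 3)}  # (essn, pno)
s = {1, 2}  # pno of the projects assumed to be controlled by department 5

x = {essn for (essn, _) in r}                      # project essn(r)
y = {(essn, pno) for essn in x for pno in s} - r   # (x product s) difference r
z = {essn for (essn, _) in y}                      # project essn(y)
w = x - z                                          # x difference z

print(sorted(y))  # [('222', 2)]
print(sorted(w))  # ['111', '333']
```

Here y collects the pairs (essn, pno) that are missing from r, z the employees with at least one missing project, and w the employees with none missing, which is the quotient.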
e. SQL.
q3 ≐ {e | employee(e) and not (∃p)(projects(p) and p.dnum = 5 and not (∃w)(works_on(w) and w.essn = e.ssn and w.pno = p.pnumber))}.
To display a query with nested queries, define a macro to hold the SELECT FROM WHERE . . . clauses of each sub-statement. This corresponds to using CREATE VIEW to create a table of the results for each sub-query. Then, within such a macro, use a (rows) construction to match the SQL query to its corresponding query in the tuple relational calculus,
SELECT lname FROM employee E WHERE NOT EXISTS
  (SELECT * FROM project P
   WHERE P.dnum = 5
     and NOT EXISTS
       (SELECT * FROM works_on W
        WHERE W.essn = E.ssn
          and W.pno = P.pnumber));
mysql> SELECT e.lname FROM employee e WHERE NOT EXISTS
(SELECT * FROM project p
WHERE p.dnum = 5 AND NOT EXISTS (SELECT * FROM works_on w WHERE
w.essn=e.ssn AND w.pno=p.pnumber));
As with WinRDBI’s version of the company database, I inserted the following tuple so that the result of this query is a nonempty set,
INSERT INTO works_on VALUES ('123456789',3,15.0);
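The doubly nested NOT EXISTS query can be tried without a MySQL server. The sketch below is an assumption-laden illustration: it uses SQLite through Python's standard sqlite3 module rather than MySQL, and the three tiny tables hold made-up rows, keeping only the columns the query touches.

```python
import sqlite3

# In-memory SQLite database with hypothetical rows (not the real company data).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, lname TEXT);
CREATE TABLE project  (pnumber INTEGER PRIMARY KEY, dnum INTEGER);
CREATE TABLE works_on (essn TEXT, pno INTEGER, hours REAL);
INSERT INTO employee VALUES ('111', 'Smith'), ('222', 'Wong'), ('333', 'Zelaya');
INSERT INTO project  VALUES (1, 5), (2, 5), (3, 4);
INSERT INTO works_on VALUES ('111', 1, 10.0), ('111', 2, 5.0),
                            ('222', 1, 20.0), ('333', 3, 8.0);
""")

# "There is no department-5 project that the employee does not work on."
rows = con.execute("""
SELECT e.lname FROM employee e WHERE NOT EXISTS
  (SELECT * FROM project p
    WHERE p.dnum = 5 AND NOT EXISTS
      (SELECT * FROM works_on w
        WHERE w.essn = e.ssn AND w.pno = p.pnumber))
""").fetchall()
q3 = [lname for (lname,) in rows]
print(q3)  # ['Smith']
```

Only Smith works on both department-5 projects in this toy data; Wong misses project 2 and Zelaya works only on a department-4 project.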
Q′3. List [the name] of each employee who works on a project controlled by department 5.
a. The Mathematical Set.
b. Tuple Relational Calculus.
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL. This is an example of a non-correlated nested query,
mysql> SELECT e.lname, e.ssn FROM employee AS e WHERE e.ssn IN (SELECT w.essn
FROM works_on AS w, project AS p where w.pno=p.pnumber and p.dnum=5);
Q4. Make a list of project numbers for projects that involve an employee whose last name is 'Smith', either as a worker or as a manager of the department that controls the project.
a. The Mathematical Set.
b. Tuple Relational Calculus.
q4:={P.pnumber | projects(P) and ( (exists E, W)( employee(E) and
works_on(W) and E.lname = 'Smith' and E.ssn = W.essn and W.pno = P.pnumber)
or (exists Ep, D)(employee(Ep) and department(D) and Ep.lname = 'Smith'
and Ep.ssn = D.mgrssn and D.dnumber = P.dnum) ) };
Smith is not a manager, so the following would suffice,
q4p:={P.pnumber | projects(P) and (exists E, W)( employee(E) and works_on(W)
and E.lname = 'Smith' and E.ssn = W.essn and W.pno = P.pnumber) };
The works_on relation does not record managers against the projects their departments control, so in general the two queries differ. Choose, for instance, Wong, who is a manager, to see the difference.
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL.
1. Solution.
SELECT DISTINCT pnumber
FROM project, department, employee
WHERE dnum = dnumber
  and mgr_ssn = ssn
  and lname = 'wong'
union
SELECT DISTINCT pnumber
FROM project, works_on, employee
WHERE pno = pnumber
  and essn = ssn
  and lname = 'wong';
2. Solution.
mysql> SELECT * FROM project WHERE
pnumber IN (SELECT pnumber FROM project,department,employee
WHERE dnum = dnumber and mgr_ssn = ssn and lname='Smith')
OR pnumber IN (SELECT pno FROM works_on, employee
WHERE essn = ssn and lname='Smith');
Q5. List [the names] of all employees with two or more dependents.
a. The Mathematical Set.
b. Tuple Relational Calculus.
q5:={E | employee (E) and (exists D, Dp)( dependent(D) and dependent(Dp)
and E.ssn = D.essn and E.ssn = Dp.essn and
D.dependent_name <> Dp.dependent_name )};
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL.
mysql> SELECT ssn FROM employee WHERE (SELECT count(*) FROM dependent WHERE ssn=essn) >= 2;
Q6. Retrieve [the names of] employees who have no dependents.
a. The Mathematical Set.
b. Tuple Relational Calculus.
q6:={E | employee(E) and not (exists D)(dependent(D) and E.ssn = D.essn)};
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL.
mysql> select lname from employee where NOT EXISTS (select * from dependent
where ssn = essn);
Q7. List [the names of] managers who have at least one dependent.
a. The Mathematical Set.
b. Tuple Relational Calculus.
q7:={E | employee (E) and (exists D, Dp)( department(D) and dependent(Dp)
and E.ssn = Dp.essn and D.mgrssn = E.ssn)};
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL.
mysql> select lname from employee where EXISTS (select * from dependent
where ssn = essn) AND EXISTS (select * from department where ssn =
mgr_ssn);
mysql> select ssn,lname from employee, department where dno=dnumber and
ssn=mgr_ssn and exists (select * from dependent where ssn = essn);
Q8. For each employee, retrieve his/her name and the name of his/her immediate supervisor.
a. The Mathematical Set.
b. Tuple Relational Calculus.
q8:={E.lname,Es.lname | employee(E) and employee(Es) and Es.ssn=E.superssn};
The result of this query is a relation with one fewer tuple than the employee relation because one employee, the top guy, does not have a supervisor. The value of his superssn attribute is NULL.
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL. The first query includes a tuple with a null value,
mysql> SELECT e.ssn, e.super_ssn FROM employee AS e;
and the second one excludes this tuple,
SELECT e.lname, s.lname
FROM employee as e, employee as s
WHERE e.super_ssn = s.ssn;
A left outer join keeps the tuple, pairing the top employee with NULL,
mysql> SELECT e.ssn as essn, s.ssn as sssn FROM (employee as e LEFT OUTER JOIN
employee as s ON e.super_ssn=s.ssn);
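The difference between the plain self-join and the left outer join shows up on a three-row table. This is a sketch under stated assumptions: it runs on SQLite through Python's standard sqlite3 module rather than on MySQL, and the employee rows are made up, with one employee whose super_ssn is NULL.

```python
import sqlite3

# Hypothetical three-employee hierarchy: Borg has no supervisor.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, lname TEXT, super_ssn TEXT);
INSERT INTO employee VALUES ('888', 'Borg', NULL),
                            ('333', 'Wong', '888'),
                            ('123', 'Smith', '333');
""")

# Self-join with the join condition in WHERE: the NULL supervisor matches nothing.
inner = con.execute("""
SELECT e.lname, s.lname FROM employee AS e, employee AS s
WHERE e.super_ssn = s.ssn ORDER BY e.lname
""").fetchall()

# LEFT OUTER JOIN: every employee appears, padded with NULL when unmatched.
outer = con.execute("""
SELECT e.lname, s.lname FROM employee AS e
LEFT OUTER JOIN employee AS s ON e.super_ssn = s.ssn ORDER BY e.lname
""").fetchall()

print(inner)  # [('Smith', 'Wong'), ('Wong', 'Borg')]
print(outer)  # [('Borg', None), ('Smith', 'Wong'), ('Wong', 'Borg')]
```

The inner result has one fewer tuple than the employee table, as described for q8, while the outer result has exactly as many.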
Q9.
a. The Mathematical Set.
b. Tuple Relational Calculus.
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL. In the absence of a WHERE clause, the result contains a tuple for each tuple of the table in the FROM clause,
SELECT ssn FROM employee;
gives the ssn of each employee.
Q10.
a. The Mathematical Set.
b. Tuple Relational Calculus.
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL. You can generate a product with a comma. The following is a meaningless example,
SELECT ssn, dname FROM employee, department;
Q11.
a. The Mathematical Set.
b. Tuple Relational Calculus.
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL. Compare the following,
SELECT salary FROM employee;
SELECT DISTINCT salary FROM employee;
Q12. List all employees whose address is in Houston, TX. List all employees born in the 1950s.
a. The Mathematical Set.
b. Tuple Relational Calculus.
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL. SQL uses the characters % and _ for pattern matching; % replaces any number of characters and _ replaces a single character. The second pattern below assumes that bdate is stored in the form YYYY-MM-DD.
SELECT lname FROM employee WHERE address like '%Houston, TX%';
SELECT lname FROM employee WHERE bdate like '__5_______';
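The two wildcard characters, % for any run of characters and _ for exactly one, can be exercised on a few rows. The sketch below makes several assumptions: it uses SQLite through Python's standard sqlite3 module instead of MySQL, the rows are made up, and bdate is stored as a ten-character YYYY-MM-DD string, so the 1950s pattern is two underscores, a 5, and seven more underscores.

```python
import sqlite3

# Hypothetical rows; bdate is kept as text in the form YYYY-MM-DD.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (lname TEXT, address TEXT, bdate TEXT);
INSERT INTO employee VALUES
  ('Smith',  '731 Fondren, Houston, TX', '1965-01-09'),
  ('Wong',   '638 Voss, Houston, TX',    '1955-12-08'),
  ('Zelaya', '3321 Castle, Spring, TX',  '1968-01-19');
""")

# % matches any number of characters on either side of the city.
houston = [n for (n,) in con.execute(
    "SELECT lname FROM employee WHERE address LIKE '%Houston, TX%' ORDER BY lname")]

# _ matches exactly one character, so the third character must be 5.
fifties = [n for (n,) in con.execute(
    "SELECT lname FROM employee WHERE bdate LIKE '__5_______' ORDER BY lname")]

print(houston)  # ['Smith', 'Wong']
print(fifties)  # ['Wong']
```

Smith's birth year 1965 does contain a 5, but in the fourth position, so the fixed-width pattern correctly rejects it.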
Q13. Show the salaries that would result from a 10% increase for each employee working on ProductX.
a. The Mathematical Set.
b. Tuple Relational Calculus. The set of employees working on 'ProductX' is given by
q13 := {E | employee(E) and (exists W, P)(works_on(W) and projects(P) and E.ssn = W.essn and W.pno = P.pnumber and P.pname = 'ProductX')};
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL.
SELECT E.lname, 1.1*E.salary
FROM employee as E, works_on as W, project as P
WHERE E.ssn = W.essn
  and W.pno = P.pnumber
  and P.pname = 'ProductX';
Q14. List all employees in department 5 whose salary is between $30,000 and $40,000.
a. The Mathematical Set.
b. Tuple Relational Calculus.
q14 := {E | employee(E) and E.salary < 40000 and E.salary > 30000 and (exists W, P)(works_on(W) and projects(P) and W.essn=E.ssn and P.pnumber=W.pno and P.dnum = 5)};
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL.
SELECT * FROM employee AS e WHERE
e.salary > 30000
and
e.salary < 40000
and
e.dno = 5
;
Q15. Retrieve a list of employees and the projects they’re working on, ordered by department and, within each department, ordered alphabetically by last name and then first name.
a. The Mathematical Set.
b. Tuple Relational Calculus.
c. Domain Relational Calculus.
d. Relational Algebra.
e. SQL.
SELECT d.dnumber, e.dno, e.lname, e.fname, p.pname
FROM department d, employee e, project p, works_on w
WHERE e.ssn = w.essn
  and w.pno = p.pnumber
  and p.dnum = d.dnumber
ORDER BY d.dnumber, e.lname, e.fname;
1. Relation Notation.
Relations. A domain D is a set whose elements, called values, are described in terms of a data type, possibly a formatting construction, and constraints on the values. An attribute a of a domain D is an activity, a concept or a role for D. An attribute a of a domain D is also a variable ranging over D and representing the activity, concept or role for D. A relation schema R of degree or arity n is a collection of n domains D_{a_i} together with an attribute a_i for each D_{a_i}. A relation schema R is thought of as the set of its attributes,
R = {a_1, a_2, . . . , a_n}.
Let’s use the same notation for both D_{a_i} and D_{a_i} ∪ {∅̂} and let
D_p(R) = ∏_{a ∈ R} D_a.
An element
t = ⟨v_1, v_2, . . . , v_n⟩, v_i ∈ D_{a_i} or ∅̂,
of D_p(R) is called an n-tuple or record. A relation instance or relation state r of a relation schema R is any subset of the product D_p(R),
r ≐ {t_1, t_2, . . . , t_µ}.
A table is a relation r with an ordering of its elements. A relation schema R may be regarded as the name and headers of a table, and a relation state r is then thought of as populating the table. The ∅̂ value is introduced for situations in which the value is unknown, a value exists but cannot be determined, a value does not exist, or there is no appropriate value, such as an attribute that does not apply to a particular record.
Alternate Definition of a Relation. Let R = {a_1, a_2, . . . , a_n} be a relation schema and let D_u(R) be the union of domains D_{a_i},
D_u(R) = ∪_{a ∈ R} {a} × D_a.
Let p_1 be the natural projection onto the first factor, p_1(a, v) = a. A tuple t is a section of R; that is, a mapping of R into D_u(R) such that p_1 ∘ t is the identity of R onto itself,
t : R → D_u(R) ∋: t.a ≐ t(a) ∈ D_a ∀ a ∈ R.
Let T(R) be the set of all such tuples. Then T(R) ≡ D_p(R) and a relation state or relation instance r is any subset of T(R),
r ≐ {t_1, t_2, . . . , t_µ};
the number µ of tuples, being the number of rows of any table over the relation, is called the order of r, and the number n of attributes is called its degree (or arity),
µ = |r| and n = deg(r).
This definition does not attribute any significance to an ordering of the columns of a table.
Dual Functions. Each attribute a ∈ R defines a function
a : T(R) → D_u(R), a(t) ≐ t(a).
Then the preimage a^{-1}(v), where v ∈ D_a, is the set of all tuples t such that t(a) = v. Let A ⊆ R and define
π_A : T(R) → P(A) by π_A(t) ≐ t|_A.
Then a^{-1}(v) = π_a^{-1}(v).
Relational Database. A relational database schema is a collection
S = {R_1, R_2, . . . , R_m} of relation schema, R_i ≐ {a_{i,1}, a_{i,2}, . . . , a_{i,n_i}},
together with a set C_S of constraints on the data, called the integrity constraints of the schema [1]. A relational database state of S is a collection
D_S = {r_1, r_2, . . . , r_m} where each r_i ≐ {t_{i,1}, t_{i,2}, . . . , t_{i,µ_i}}
is a relation state of the relation schema R_i. A relational database state D_S is either valid or invalid depending on whether or not it satisfies the integrity constraints C_S.
[1] SQL is the data definition language (DDL) used by most database management systems (DBMS) for defining relational database schema.
2. Relational Algebra. Operations are applied to relations r and the result or return of an operation is another relation. The following set of operations forms a complete set (in some unspecified sense),
{σ, π, ∪, ∩, ∖, ×}.
a. The SELECT Operation.
R = {a_1, a_2, . . . , a_n}.
tuples t : R → D_u ⇒ t_p = ⟨t(a_1), t(a_2), . . . , t(a_n)⟩ ∈ D_p
r ≐ {t_1, t_2, . . . , t_µ} ⊆ P(R) relation
Let a be an attribute and let v ∈ D_a,
σ_{a=v}(r) = {t ∈ r | t(a) = v} = r ∩ a^{-1}(v) (SELECT)
A selection condition c is a set of conditions more general than a = v. The result σ_c(r) of a selection operation is a relation with these properties,
|σ_c(r)| ≤ |r| and deg(σ_c(r)) = deg(r).
|σ_c(r)| / |r| (selectivity)
SELECT * FROM r WHERE c; (SQL)
b. The PROJECT Operation. Let R′ be a subset of R.
π_{R′}(r) = {restricted tuples t|_{R′} | t ∈ r}/= (PROJECT)
R′ is called the projection or attribute list. The reduction mod = is used here to indicate that if two or more restrictions agree on R′ then they are represented by a single tuple in π_{R′}(r). With redundant tuples, π_{R′}(r) would be a multiset. For a singleton a you have
π_a = a and a^{-1}(v) = π_a^{-1}(v) where v ∈ D_a.
|π_{R′}(r)| ≤ |r| and deg(π_{R′}(r)) = |R′|.
If R′ is a superkey of R then |π_{R′}(r)| = |r|.
SELECT [DISTINCT] R′ FROM r; (SQL)
c. The RENAME Operation. Aliases. To rename r to r′ and the a’s to b’s,
ρ_{(b_1, b_2, ..., b_n)}(r), ρ_{r′}(r), ρ_{r′(b_1, b_2, ..., b_n)}(r)
SELECT * FROM r AS r′; SELECT a_1 AS b_1, . . ., a_n AS b_n FROM r; (SQL)
SELECT r′.a_3 AS b_1, r′.a_2 AS b_2 FROM r AS r′; (SQL)
It’s often best to rename intermediate results and to rename the attributes of a new relation. This is done by assigning a name whose arguments are the desired new names,
r′ ← π_{R′}(r) or r′(b_1, b_2) ← π_{a_3, a_5}(r).
d. Compositions.
r′(b_1, b_2) ← π_{a_3, a_5} ∘ σ_{a_7=5}(r)
SELECT DISTINCT r′.a_3 AS b_1, r′.a_5 AS b_2 FROM r AS r′ WHERE r′.a_7 = 5; (SQL)
Compositions are often best broken down into single operations. This is done by assigning intermediate names.
π_S ∘ σ_c = σ_c ∘ π_S (when c involves only attributes of S), σ_{c1} σ_{c2} = σ_{c1 AND c2} = σ_{c2} σ_{c1}
d. The UNION Operation. r = r_1 ∪ r_2
e. The INTERSECTION Operation. r = r_1 ∩ r_2
f. The SET MINUS Operation. r = r_1 − r_2
g. The CARTESIAN PRODUCT Operation. This operation combines relations of two relation schema,
R = {a_1, a_2, . . . , a_{n_1}} and S = {b_1, b_2, . . . , b_{n_2}}.
(t, u) : R ∪ S → D_u(R ∪ S), (t, u)(x) ≐ t(x) if x ∈ R, u(x) if x ∈ S,
where t ∈ P(R) and u ∈ P(S).
The product of two relations r and s is the following relation,
r × s ≐ {(t, u) : t ∈ r and u ∈ s}.
|r × s| = |r|·|s|, deg(r × s) = deg(r) + deg(s), P(R ∪ S) = P(R) × P(S)
SELECT * FROM r, s; (SQL)
h. The JOIN Operation. Let R, S, r, s be as before and suppose further that the relations have attributes a and b that share the same domain. The tables are said to be joined along this domain. The join of tables r and s along a and b is defined to be the following relation,
r ⋈_{a=b} s ≐ σ_{a=b}(r × s) and in general r ⋈_c s ≐ σ_c(r × s).
SELECT * FROM r JOIN s ON a = b; (SQL p. 157)
This is often followed by a projection to eliminate redundant and unnecessary attributes. The following operation retrieves names of department heads,
Departments ⋈_{mgr_ssn = ssn} Employees,
[In the tables of the company database, an employee is identified by his/her ssn, the primary key of the Employees table. You may have occasion to attach names to the data of these tables. This is accomplished using a JOIN operation.]
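The definition of the join as a selection applied to a product, followed by a projection, can be spelled out directly. The sketch below is illustrative only: the helper functions product, select and project are hypothetical names written for this example, tuples are modelled as Python dicts from attribute name to value, and the two small relations are made-up data.

```python
# Hypothetical toy relations, each tuple a dict from attribute name to value.
departments = [{"dname": "Research", "mgr_ssn": "333"},
               {"dname": "Headquarters", "mgr_ssn": "888"}]
employees = [{"ssn": "333", "lname": "Wong"},
             {"ssn": "888", "lname": "Borg"},
             {"ssn": "123", "lname": "Smith"}]

def product(r, s):
    """Cartesian product; assumes the attribute names of r and s are disjoint."""
    return [{**t, **u} for t in r for u in s]

def select(cond, r):
    """Selection: keep the tuples of r that satisfy cond."""
    return [t for t in r if cond(t)]

def project(attrs, r):
    """Projection onto attrs, with duplicate tuples removed."""
    seen = []
    for t in r:
        row = {a: t[a] for a in attrs}
        if row not in seen:
            seen.append(row)
    return seen

# The join along mgr_ssn = ssn is a selection over the product.
joined = select(lambda t: t["mgr_ssn"] == t["ssn"], product(departments, employees))
# A projection then drops the redundant ssn columns, leaving the department heads.
heads = project(["dname", "lname"], joined)
print(heads)
```

The product has 2 × 3 = 6 tuples with 2 + 2 = 4 attributes each, the selection keeps the two that satisfy the join condition, and the projection leaves one (dname, lname) pair per department.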
i. The NATURAL JOIN Operation. Suppose relations r and s have a common attribute a and a’s name is the same under both relations. The redundant column of a join involving = is eliminated using the natural join, which is the relation whose notation and definition are given by
r ∗ s = r ⋈_{a=a} s.
This is implemented in SQL using the following projection,
SELECT A FROM r, s WHERE a = b; (SQL)
where A ⊂ R ∪ S is an ordered list of attributes, possibly renamed, and including a AS b or b AS a if a and b have different names. Alternatively, rename the appropriate column of one of the tables so the names are the same. Then apply a natural join.
j. The DIVISION Operation. Let r and s be relations with relation schema R and S respectively. Suppose S ⊆ R. Let q be the relation with schema R − S defined in the following way,
q ≐ r ÷ s = {u ∈ P(R − S) | for each v ∈ s ∃ t ∈ r ∋: t|_{R−S} = u and t|_S = v}
= {u ∈ P(R − S) | (u, v) ∈ r for each v ∈ s} (reordering attributes in a convenient way)
= {u ∈ P(R − S) | (u, s) ⊆ r}.
r ÷ s is the set of all tuples u on R − S such that for each v ∈ s the extension (u, v) belongs to r.
The relation q = r ÷ s has these properties: (r ÷ s) × s ⊆ r and, for nonempty s, r ÷ s is the largest relation q on R − S with q × s ⊆ r. Thus the division operation is designed to handle queries with ∀,
“Find tuples of r which are related to every tuple of s.”
In terms of elementary operations, the quotient is given by
r ÷ s = π_{R−S}(r) − π_{R−S}((π_{R−S}(r) × s) − r).
Proof:
π_{R−S}(r) × s = {(u, v) | u ∈ π_{R−S}(r), v ∈ s}
(π_{R−S}(r) × s) − r = {(u, v) | u ∈ π_{R−S}(r), v ∈ s, (u, v) ∉ r}
π_{R−S}((π_{R−S}(r) × s) − r) = {u ∈ π_{R−S}(r) | (u, v) ∉ r for some v ∈ s}
π_{R−S}(r) − π_{R−S}((π_{R−S}(r) × s) − r) = {u ∈ π_{R−S}(r) | (u, v) ∈ r ∀ v ∈ s}
QED
An Alternative Approach. There is another natural projection,
℘_S : D_p(R) → D_p(S) ∋: v_p = ℘_S(t_p) ⇐⇒ v = π_S(t) ⇐⇒ v = t|_S.
With this definition we have
v_p ∈ ℘_S(r ∩ ℘_{R−S}^{-1}(u_p)) ⇐⇒ ∃ t ∈ r ∋: t|_{R−S} = u and t|_S = v
and therefore
u ∈ r ÷ s ⇐⇒ s_p ⊆ ℘_S(r ∩ ℘_{R−S}^{-1}(u_p)).
k. Generalized Projections. Algebraic Functions of the Attributes.
l. Mathematical Aggregate Functions.
m. Recursive Closure Operations.
n. Outer Join Operations.
o. Outer Union Operation.
3. The Tuple Relational Calculus. Let r be a relation instance, let R = {a_1, . . . , a_n} be its relation schema and let θ, called a selection condition, be a t-dependent function evaluating to true or false. A query of the tuple relational calculus takes the following form,
{ t, u, v, . . . | r(t), s(u), q(v), . . . and θ(t, u, v, . . .) }
in which t, u, v, . . . are expressions in the tuple variables, r(t), s(u), q(v), . . . give the range of each tuple variable, and θ(t, u, v, . . .) is the combination of selection conditions. The expressions and θ may involve the attributes of each relation instance.
a. Selection.
b. Projection.
4. The Domain Relational Calculus.
BIBLIOGRAPHY
[1] R. Elmasri & S. B. Navathe, “Fundamentals of Database Systems,” 6th Ed., Addison Wesley, 2016.
[2] P. Dubois, “MySQL Developer’s Library,” 4th Ed., Addison Wesley, 2009.
[3] S. Dietrich, “Understanding Relational Database Query Languages,” Prentice Hall, 2001.