Relational Model and Relational Algebra Rose-Hulman Institute of Technology Curt Clifton.
The Relational Algebra
What operations can be done?
Copyright © 2003-2012 Curt Hill

Algebra
• There exists a relational algebra
• What is an algebra?
• What most of us know as Algebra is actually the Algebra of Real Numbers
• There is also Boolean Algebra, among others

Algebra of Real Numbers
• What does this Algebra consist of?
• A set of objects to work on
– This is the infinite set of real numbers
• A set of operators
– These include +, -, ×, ÷
– The - may be subtraction or negation
– Each operator takes one or two real numbers and computes another real
– There are rules on how they must be applied

Relational Algebra
• Mostly the same
• The objects that it operates on are not real numbers but tables
– The set of relations
• It also possesses operators
– Each takes one or two tables and produces another table
• These are considered next

Relational Operations
• Selection
• Projection
• Cartesian product
• Set union
• Set intersection
• Set difference
• The first two take one table, the rest take two
• The above are primitive operations
– There are also composite operations

Operations
• Relational operations are like arithmetic operations
• They are unary or binary
• They take operands and produce a result
• The operands and results are tables

Selection
• Choose or eliminate tuples based on a comparison
• Unary operation
• There is a boolean test which determines whether a row is eliminated or not
• The symbol is sigma (σ)

Boolean test of selection
• The test may include
– Fields compared with constants
– Fields compared with other fields in the same record
• Tests may be combined with AND, OR and NOT

Selection Example

σ Cnt > 2 (T1)

T1:
ID  Cnt  Src  Use  Frq
A   1    X    18   5
B   5    X    4    5
C   2    W    16   4
E   3    Z    12   9

Result:
ID  Cnt  Src  Use  Frq
B   5    X    4    5
E   3    Z    12   9

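The selection example can be sketched in Python. This is only an illustration: modelling a relation as a list of dicts and the helper name `select` are assumptions, not something the slides prescribe.

```python
# Assumption: a relation is a list of dicts, one dict per tuple.
T1 = [
    {"ID": "A", "Cnt": 1, "Src": "X", "Use": 18, "Frq": 5},
    {"ID": "B", "Cnt": 5, "Src": "X", "Use": 4, "Frq": 5},
    {"ID": "C", "Cnt": 2, "Src": "W", "Use": 16, "Frq": 4},
    {"ID": "E", "Cnt": 3, "Src": "Z", "Use": 12, "Frq": 9},
]


def select(relation, test):
    """Hypothetical helper: keep only the rows for which the boolean test holds."""
    return [row for row in relation if test(row)]


# sigma Cnt > 2 (T1)
result = select(T1, lambda row: row["Cnt"] > 2)
for row in result:
    print(row)
```

The test is an ordinary function of one row, so it can compare fields with constants or with other fields of the same row, and combine comparisons with `and`, `or` and `not`.
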
Projection
• Choose or eliminate columns
• Unary operation
• We may also rearrange the columns that are left
• The symbol is pi (π)

Projection Example

π Src, Cnt, ID (T1)

T1:
ID  Cnt  Src  Use  Frq
A   1    X    18   5
B   5    X    4    5
C   2    W    16   4
E   3    Z    12   9

Result:
Src  Cnt  ID
X    1    A
X    5    B
W    2    C
Z    3    E

More Projection Notes
• Usually projection is used to change the degree of a relation
• A relation is a set
– It must have unique entries before a projection
– This may not be so when an attribute is removed
• Projection may eliminate duplicates
– Not all systems actually do this
– Why not?

Second Projection Example

π Src, Cnt, ID (T1)

T1:
ID  Cnt  Src  Use  Frq
A   1    X    18   5
A   1    X    4    5
C   2    W    16   4
C   2    W    12   9

Result:
Src  Cnt  ID
X    1    A
W    2    C

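A minimal Python sketch of projection with duplicate elimination, using the data of the second example. The list-of-dicts representation and the helper name `project` are assumptions made for illustration.

```python
# Assumption: a relation is a list of dicts, one dict per tuple.
T1 = [
    {"ID": "A", "Cnt": 1, "Src": "X", "Use": 18, "Frq": 5},
    {"ID": "A", "Cnt": 1, "Src": "X", "Use": 4, "Frq": 5},
    {"ID": "C", "Cnt": 2, "Src": "W", "Use": 16, "Frq": 4},
    {"ID": "C", "Cnt": 2, "Src": "W", "Use": 12, "Frq": 9},
]


def project(relation, attributes):
    """Hypothetical helper: keep the named attributes, in the order given,
    and drop rows that become duplicates once the other columns are gone."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[name] for name in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


# pi Src, Cnt, ID (T1)
result = project(T1, ["Src", "Cnt", "ID"])
for row in result:
    print(row)
```

Removing the `seen` check would give the behaviour of systems that skip duplicate elimination, which saves the cost of comparing every output row against the others.
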
Cartesian Product
• AKA cross product
• Binary operation
• Append to each row of the first table each row of the second
• The number of tuples in the result is the product of the number of rows in the operands
– This could be large and expensive
– Seldom done without optimization

Cartesian Product Example

T1 × T2

T1:
ID  Cnt
A   1
B   5

T2:
F1  F2  F3
A   6   1
X   4   2
B   3   9

Result:
ID  Cnt  F1  F2  F3
A   1    A   6   1
B   5    A   6   1
A   1    X   4   2
B   5    X   4   2
A   1    B   3   9
B   5    B   3   9

Cartesian Addendum
• Only primitive operator that deals with two different schemas
• If the two tables have a common field, one of the fields must be renamed
– A tuple must be a set with unique field names

Cartesian Product Example

T1 × T2

T1:
ID  Cnt
A   1
B   5

T2:
ID  Cnt  Src
A   6    1
X   4    2
B   3    9

Result:
ID  Cnt  ID2  Cnt2  Src
A   1    A    6     1
B   5    A    6     1
A   1    X    4     2
B   5    X    4     2
A   1    B    3     9
B   5    B    3     9

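The product with renaming can be sketched as follows. The helper name `product` and the rule of appending "2" to a clashing field name are assumptions chosen to match the ID2 and Cnt2 columns of the example.

```python
# Assumption: a relation is a list of dicts, one dict per tuple.
T1 = [{"ID": "A", "Cnt": 1}, {"ID": "B", "Cnt": 5}]
T2 = [
    {"ID": "A", "Cnt": 6, "Src": 1},
    {"ID": "X", "Cnt": 4, "Src": 2},
    {"ID": "B", "Cnt": 3, "Src": 9},
]


def product(left, right, suffix="2"):
    """Hypothetical helper: pair every left row with every right row.
    A right-hand field whose name already occurs on the left is renamed
    by appending the suffix, so each result row has unique field names."""
    left_names = set(left[0]) if left else set()
    result = []
    for l_row in left:
        for r_row in right:
            row = dict(l_row)
            for name, value in r_row.items():
                new_name = name + suffix if name in left_names else name
                row[new_name] = value
            result.append(row)
    return result


result = product(T1, T2)
print(len(result))       # number of rows is len(T1) * len(T2)
print(list(result[0]))   # field names of the result
```

The rows come out in a different order from the slide, which does not matter because a relation is a set.
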
Set Union
• Binary operation
• Two relations must be union compatible
– They must have the same schema, that is, the same attributes
• New relation has all the tuples of both tables with duplicates removed

Union Example

T1 ∪ T2

T1:
ID  Cnt
A   1
D   8
B   5

T2:
ID  Cnt
B   5
E   2
A   1

Result:
ID  Cnt
A   1
D   8
E   2
B   5

Set Intersection
• Binary operation
• Two relations must be union compatible
– They must have the same schema, that is, the same attributes
• New relation has only the tuples that are in both tables

Intersection Example

T1 ∩ T2

T1:
ID  Cnt
A   1
D   8
B   5

T2:
ID  Cnt
B   5
E   2
A   1

Result:
ID  Cnt
A   1
B   5

Set Difference
• Binary operation
• Two relations must be union compatible
– They must have the same schema, that is, the same attributes
• New relation has the tuples of the first table, with those that also appear in the second table removed
• Not symmetrical or commutative

Set Difference Example

T1:
ID  Cnt
A   1
D   8
B   5

T2:
ID  Cnt
B   5
E   2
A   1

T1 - T2:
ID  Cnt
D   8

T2 - T1:
ID  Cnt
E   2

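For union-compatible relations, the three set operations map directly onto Python's built-in set operators. The sketch below assumes each tuple is stored as a Python tuple in (ID, Cnt) order.

```python
# Assumption: both relations share the schema (ID, Cnt) and are stored as sets of tuples.
T1 = {("A", 1), ("D", 8), ("B", 5)}
T2 = {("B", 5), ("E", 2), ("A", 1)}

union = T1 | T2             # all tuples of both tables, duplicates removed
intersection = T1 & T2      # only the tuples found in both tables
t1_minus_t2 = T1 - T2       # tuples of T1 that are not in T2
t2_minus_t1 = T2 - T1       # difference is not commutative

print(sorted(union))
print(sorted(intersection))
print(sorted(t1_minus_t2), sorted(t2_minus_t1))
```
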
Relational Algebra
• The algebra only uses these operations
– All of our queries translate into these
• Each operation produces a relation
– Starts with one or two relations
• The algebra is closed
– Maps from the set of relations back to the set of relations
• There are also composite operations

Join
• Binary composite operation
• It is the composite of three operations
– Cartesian product
– Selection
– Projection (optional)
• Often the only way Cartesian products are done
– Thus the DBMS may optimize it

Join
• A join always connects the two tables through a common field (or fields) in each
• Thus we join on one or more fields that the two tables have in common
– The fields must have the same format and often have the same name

Join Process
• Take the product of the two tables
• Use select to eliminate all records where the two fields are not equal
• Eliminate one of the redundant fields
• The resulting table has as many fields as the two tables together, minus one
• The number of rows depends on the data

Natural Join Example

T1 ⋈ ID T2

T1:
ID  Cnt
A   1
B   5

T2:
ID  Src  Dst
A   6    1
X   4    2
A   3    2
B   3    9

Result:
ID  Cnt  Src  Dst
A   1    6    1
A   1    3    2
B   5    3    9

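The join process (product, then selection, then projection) can be sketched in Python on the data of the natural join example. The helper name `natural_join` and the list-of-dicts representation are assumptions for illustration.

```python
# Assumption: a relation is a non-empty list of dicts, one dict per tuple.
T1 = [{"ID": "A", "Cnt": 1}, {"ID": "B", "Cnt": 5}]
T2 = [
    {"ID": "A", "Src": 6, "Dst": 1},
    {"ID": "X", "Src": 4, "Dst": 2},
    {"ID": "A", "Src": 3, "Dst": 2},
    {"ID": "B", "Src": 3, "Dst": 9},
]


def natural_join(left, right):
    """Hypothetical helper: equijoin on every field name the two tables share."""
    if not left or not right:
        return []
    common = [name for name in left[0] if name in right[0]]
    result = []
    for l_row in left:                 # Cartesian product: every pairing of rows
        for r_row in right:
            # Selection: keep the pairing only if all common fields are equal
            if all(l_row[name] == r_row[name] for name in common):
                # Projection: copy the right-hand fields except the redundant ones
                merged = dict(l_row)
                merged.update({k: v for k, v in r_row.items() if k not in common})
                result.append(merged)
    return result


result = natural_join(T1, T2)
for row in result:
    print(row)
```

A real DBMS would avoid building the full product, for example by hashing or sorting on the join field; the nested loop here only mirrors the definition.
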
Types of Joins
• The relationship between the two joined fields may be anything
– We specify the fields and comparison
– Called a Condition Join
– Same schema as product
• When the comparison is equality the join is called an Equijoin
– Project on equijoin to eliminate redundant column
• If the join is an equijoin on all common fields then it is called a Natural Join

Condition Join Example

T1 ⋈ T1.Cnt<T2.Cnt T2

T1:
ID  Cnt
A   5
B   3

T2:
ID  Cnt  Dst
A   6    1
X   4    2
B   3    9

Result:
ID  Cnt  ID2  Cnt2  Dst
A   5    A    6     1
B   3    A    6     1
B   3    X    4     2

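A condition join is a product followed by a selection with an arbitrary comparison. The sketch below assumes rows are stored as plain tuples, so the result schema (ID, Cnt, ID2, Cnt2, Dst) is implied by position; the helper name `condition_join` is hypothetical.

```python
# Assumption: rows are tuples; T1 has schema (ID, Cnt), T2 has (ID, Cnt, Dst).
T1 = [("A", 5), ("B", 3)]
T2 = [("A", 6, 1), ("X", 4, 2), ("B", 3, 9)]


def condition_join(left, right, condition):
    """Hypothetical helper: concatenate each pair of rows that satisfies the condition."""
    return [l_row + r_row for l_row in left for r_row in right
            if condition(l_row, r_row)]


# T1.Cnt < T2.Cnt, where Cnt is at position 1 in both tables
result = condition_join(T1, T2, lambda l_row, r_row: l_row[1] < r_row[1])
for row in result:
    print(row)
```
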
Join Importance
• The Cartesian product is the only primitive that may take two different relation types
• Cartesian products are usually inside a join
• Usually only one table in a database has a particular schema
• Almost every multiple-table query will use a join

Division
• Division is a composite, not a primitive, operation
• Deals with three relations of different degree
– First table of degree m+n
– Second of degree n
– Result of degree m
• Columns in the second table are eliminated from the first

Division Process
• The columns in the second table correspond to columns in the first
• The remaining columns of the first table form the candidate rows of the result
• A candidate is copied to the result only if the first table pairs it with every row of the second table
• The common columns are eliminated in the result
• Duplicates are then eliminated

Division Example

T1 / T2

T1:
ID  Src  Cnt  Dst
A   2    4    X
B   2    4    X
A   3    4    Y
B   3    4    Y
B   3    5    Y
B   2    7    Y
B   3    9    Y
B   2    5    X
B   3    8    X
B   2    9    X
C   3    8    Z

T2:
Src  Dst
2    X
3    Y

Result:
ID  Cnt
A   4
B   4
B   5
B   9

Division Implementation
• Do an equijoin on the two tables' common columns
• Project away the common columns to get the candidate rows
• Keep only the candidates that were matched with every row of the second table

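A Python sketch of division on the example data. It uses the textbook definition, in which a combination of the remaining columns is kept only when it is paired with every row of the second table. The tuple representation, the explicit schema lists and the helper name `divide` are assumptions for illustration.

```python
# Assumption: rows are tuples and each table carries an explicit list of field names.
T1_SCHEMA = ["ID", "Src", "Cnt", "Dst"]
T1 = [
    ("A", 2, 4, "X"), ("B", 2, 4, "X"), ("A", 3, 4, "Y"), ("B", 3, 4, "Y"),
    ("B", 3, 5, "Y"), ("B", 2, 7, "Y"), ("B", 3, 9, "Y"), ("B", 2, 5, "X"),
    ("B", 3, 8, "X"), ("B", 2, 9, "X"), ("C", 3, 8, "Z"),
]
T2_SCHEMA = ["Src", "Dst"]
T2 = [(2, "X"), (3, "Y")]


def divide(dividend, dividend_schema, divisor, divisor_schema):
    """Hypothetical helper: return the rows over the non-shared columns that
    appear in the dividend together with every row of the divisor."""
    keep = [i for i, name in enumerate(dividend_schema) if name not in divisor_schema]
    match = [dividend_schema.index(name) for name in divisor_schema]
    required = set(divisor)
    found = {}
    for row in dividend:
        candidate = tuple(row[i] for i in keep)
        shared = tuple(row[i] for i in match)
        found.setdefault(candidate, set()).add(shared)
    # Storing the result as a set removes duplicates automatically.
    return {candidate for candidate, shared in found.items() if required <= shared}


result = divide(T1, T1_SCHEMA, T2, T2_SCHEMA)
print(sorted(result))
```

Rows such as (B, 7) are dropped because the first table never pairs them with both (2, X) and (3, Y).
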
Algebra and Calculus
• The algebra is procedural
– You specify how to do what needs to be done
– Must use the operations
• The calculus is declarative
– You say what you want without saying how to obtain it
• SQL has elements of both
• Calculus must be translated into the algebra before execution

Algebra Shortcomings
• Algebra is a theoretical support for database implementation
• It lacks most of the niceties needed for an actual implementation
– Reports
– Formatting
– Counts
– Averages
• All it does is deliver the data as a table

Queries using relational algebra
• Consider the college schema tables:
– Course
– Students
– Grade
– Faculty
– Faculty_teach
– Department
– Division

Example
• Suppose we want to produce a grade report for students
• This should include information from two or three relations:
– Students
– Grade
– Courses (depending on how much information is needed)

Find student grades: Tables

Students:
naid  name              address
2156  Betty Reynoldson  315 4th Ave

Grades:
dept  number  naid  score
CS    160     2067  86
CIS   385     2156  94

Find student grades: Operations
• Equijoin on ID
– Students ⋈ NAID Grades
• Project away unwanted fields
– π name, dept, course, score
– Include address if it is to be mailed
• Outside of the algebra
– Format score as grade
– Sort by zip code
– Format into pages

Find all the courses taught by faculty members
• Equijoin on ID
– Faculty ⋈ NAID Faculty_teach
• Project away unwanted fields
– π name, dept, course

Find the departmental chairs for each faculty member
• Equijoin on ID
– Faculty ⋈ NAID Departments
• This connects departments and chairs
– π name, dept (call this Temp)
• Equijoin on dept
– Faculty ⋈ Dept Temp
• Project out what is not needed
• Two more joins are needed to include divisional chairs

Find all the students who got a B or better in any CS class
• Use selection to trim the grades relation to rows with the CS acronym and a score greater than or equal to 80
• Join this with students based on student ID equality
• Eliminate those columns that you do not want

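The three steps of this query (select, join, project) can be sketched in Python. The Betty Reynoldson row and the two grade rows marked as coming from the slides are taken from the tables shown earlier; the rows marked hypothetical are invented so the query has something to filter.

```python
# Assumption: a relation is a list of dicts, one dict per tuple.
students = [
    {"naid": 2156, "name": "Betty Reynoldson"},   # from the slides
    {"naid": 2067, "name": "Student 2067"},       # hypothetical name
]
grades = [
    {"dept": "CS", "number": 160, "naid": 2067, "score": 86},    # from the slides
    {"dept": "CIS", "number": 385, "naid": 2156, "score": 94},   # from the slides
    {"dept": "CS", "number": 160, "naid": 2156, "score": 72},    # hypothetical row
]

# Selection: only CS grades with a score of 80 or more
cs_b_or_better = [g for g in grades if g["dept"] == "CS" and g["score"] >= 80]

# Join: match each student with the selected grades on naid
joined = [{**s, **g} for s in students for g in cs_b_or_better
          if s["naid"] == g["naid"]]

# Projection: keep only the name, with duplicates removed
names = sorted({row["name"] for row in joined})
print(names)
```

Doing the selection before the join keeps the intermediate table small, which is the kind of reordering a query optimizer looks for.
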
Find all the students that each faculty member has
• Join faculty with faculty_teach on NAID
• Join this with the grades table based on equality of both dept acronym and course number
• Project out what you don't want

Find all the students who got an A in Calculus and an A in CIS 385
• Select from the grades table, leaving only Calc As
• Join this with students on NAID and call it T1
• Select from the grades table, leaving only CIS 385 As
• Join this with students on NAID and call it T2
• Intersect T1 and T2

Find this semester's GPA of all Math students
• What is a Math student?
– A Math major or
– Any student taking any math course
• Is this the average score of math courses?
– If so, do a selection on the grades table
• Is this the average score of all students taking any math course?
– Select grades to find just Math
– Join this with student on NAID
– Join this with the original grades table
– Use a report program to sort by NAID and then compute and summarize the GPA