lt13-relalg


  • 7/29/2019 lt13-relalg


    Database Systems & Applications

    Lec 13: Extended Operators of Relational Algebra


    Relational Algebra on Bags

    A bag (or multiset) is like a set, but an element may appear more than once.

    Example: {1,2,1,3} is a bag.

    Example: {1,2,3} is also a bag that happens to be a set.


    Why Bags?

    SQL, the most important query language for relational databases, is actually a bag language.

    Some operations, like projection, are much more efficient on bags than on sets, because no duplicate-elimination pass is needed.


    Operations on Bags

    Selection applies to each tuple, so its effect on bags is like its effect on sets.

    Projection also applies to each tuple, but as a bag operator, it does not eliminate duplicates.

    Products and joins are done on each pair of tuples, so duplicates in bags have no effect on how we operate.


    Example: Bag Selection

    R( A, B )
       1  2
       5  6

    SELECT A+B


    Example: Bag Projection

    R( A, B )
       1  2
       5  6
       1  3

    PROJECT_A(R) =  A
                    1
                    5
                    1
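
    The bag projection above can be sketched in Python as follows. This is a minimal illustration, assuming a relation is stored as a list of tuples so that duplicates survive; the variable names are made up for this sketch.

```python
# Relation R(A, B) from the example, stored as a list so duplicates are kept.
R = [(1, 2), (5, 6), (1, 3)]

# Bag projection onto A: one output tuple per input tuple, no deduplication.
bag_projection = [(a,) for (a, b) in R]

# Set projection needs an extra duplicate-elimination step on top of that.
set_projection = sorted(set(bag_projection))

print(bag_projection)  # [(1,), (5,), (1,)]
print(set_projection)  # [(1,), (5,)]
```

    The bag version is a single pass over R, which is why projection is cheaper on bags than on sets.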


    Example: Bag Product

    R( A, B )    S( B, C )
       1  2         3  4
       5  6         7  8

    R * S =  A  R.B  S.B  C
             1   2    3   4
             1   2    7   8
             5   6    3   4
             5   6    7   8
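
    As a rough sketch, the product above pairs every tuple of R with every tuple of S. The list-of-tuples representation is an assumption of this example, not something the slides prescribe.

```python
from itertools import product

R = [(1, 2), (5, 6)]  # R(A, B)
S = [(3, 4), (7, 8)]  # S(B, C)

# Bag product: concatenate each pair of tuples; duplicates would simply be paired again.
R_times_S = [r + s for r, s in product(R, S)]

print(R_times_S)
# [(1, 2, 3, 4), (1, 2, 7, 8), (5, 6, 3, 4), (5, 6, 7, 8)]
```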


    Example: Bag Theta-Join

    R( A, B )    S( B, C )
       1  2         3  4
       5  6         7  8

    R JOIN R.B


    Bag Union

    An element appears in the union of two bags as many times as the sum of the number of times it appears in each bag.

    Example: {1,2,1} UNION {1,1,2,3,1} = {1,1,1,1,1,2,2,3}
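
    One way to check the example is with Python's `collections.Counter`, which behaves like a bag: its `+` operator adds the counts of each element. Modelling a bag as a `Counter` is a choice made for this sketch.

```python
from collections import Counter

left = Counter([1, 2, 1])
right = Counter([1, 1, 2, 3, 1])

# Bag union: the count of each element is the sum of its counts in both bags.
bag_union = left + right

print(sorted(bag_union.elements()))  # [1, 1, 1, 1, 1, 2, 2, 3]
```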


    Bag Intersection

    An element appears in the intersection of two bags the minimum of the number of times it appears in either.

    Example: {1,2,1,1} INTER {1,2,1,3} = {1,1,2}.
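
    The same example can be reproduced with `collections.Counter`, whose `&` operator takes the minimum count per element. Again, using `Counter` as the bag representation is an assumption of this sketch.

```python
from collections import Counter

left = Counter([1, 2, 1, 1])
right = Counter([1, 2, 1, 3])

# Bag intersection: the count of each element is the minimum of its two counts.
bag_intersection = left & right

print(sorted(bag_intersection.elements()))  # [1, 1, 2]
```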


    Bag Difference

    An element appears in the difference A - B of bags as many times as it appears in A, minus the number of times it appears in B, but never fewer than zero times.

    Example: {1,2,1,1} - {1,2,3} = {1,1}.
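
    A short sketch of the difference example, again modelling bags with `collections.Counter`. Its `-` operator subtracts counts and drops anything that would fall to zero or below, which is why the element 3 does not show up with a negative count.

```python
from collections import Counter

A = Counter([1, 2, 1, 1])
B = Counter([1, 2, 3])

# Bag difference: subtract counts, keeping only elements with a positive result.
bag_difference = A - B

print(sorted(bag_difference.elements()))  # [1, 1]
```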


    Beware: Bag Laws != Set Laws

    Some, but not all, algebraic laws that hold for sets also hold for bags.

    Example: the commutative law for union (R UNION S = S UNION R) does hold for bags.

    Since addition is commutative, adding the number of times x appears in R and S doesn't depend on the order of R and S.


    Example of the Difference

    Set union is idempotent, meaning that S UNION S = S.

    However, for bags, if x appears n times in S, then it appears 2n times in S UNION S.

    Thus S UNION S != S in general.
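
    Both laws can be checked on small inputs. The sketch below uses Python sets for set semantics and `collections.Counter` for bag semantics; the sample values are made up for illustration.

```python
from collections import Counter

S_set = {1, 2}
S_bag = Counter([1, 2, 1])
R_bag = Counter([2, 3])

# Commutativity of union holds for bags, because adding counts is commutative.
commutative = (R_bag + S_bag) == (S_bag + R_bag)

# Idempotence holds for sets but fails for bags: every count doubles.
set_idempotent = (S_set | S_set) == S_set
bag_idempotent = (S_bag + S_bag) == S_bag

print(commutative, set_idempotent, bag_idempotent)  # True True False
print(S_bag + S_bag)  # Counter({1: 4, 2: 2})
```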


    The Extended Algebra

    1. δ: eliminate duplicates from bags.
    2. τ: sort tuples.
    3. Extended projection: arithmetic, duplication of columns.
    4. γ: grouping and aggregation.
    5. OUTERJOIN: avoids losing dangling tuples, i.e., tuples that do not join with anything.


    Example: Duplicate Elimination

    R =  A  B
         1  2
         3  4
         1  2

    δ(R) =  A  B
            1  2
            3  4

    R1 := δ(R2)

    R1 consists of one copy of each tuple that appears in R2 one or more times.
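
    A minimal sketch of duplicate elimination on the example relation, assuming a relation is a list of tuples. `dict.fromkeys` is used here only because it keeps the first occurrence of each tuple in order; converting to a `set` would give the same tuples in no particular order.

```python
R = [(1, 2), (3, 4), (1, 2)]

# Duplicate elimination: keep exactly one copy of each distinct tuple.
deduplicated = list(dict.fromkeys(R))

print(deduplicated)  # [(1, 2), (3, 4)]
```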


    Sorting

    R1 := τ_L(R2).

    L is a list of some of the attributes of R2.

    R1 is the list of tuples of R2 sorted first on the value of the first attribute on L, then on the second attribute of L, and so on. Ties that remain are broken arbitrarily.

    τ is the only operator whose result is neither a set nor a bag.


    Example: Sorting

    R = ( A  B )
          1  2
          3  4
          5  2

    τ_B(R) = [(5,2), (1,2), (3,4)]
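
    The sorting example can be sketched with Python's built-in `sorted`, assuming tuples are (A, B) pairs and the sort key is B. Python's sort is stable, so the two tuples with B = 2 come out in their input order, which differs from the order shown above; both are valid results because the tuples tie on B.

```python
R = [(1, 2), (3, 4), (5, 2)]

# Sort on attribute B (index 1); the result is a list, not a set or a bag.
sorted_on_b = sorted(R, key=lambda t: t[1])

print(sorted_on_b)  # [(1, 2), (5, 2), (3, 4)]
```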


    Example: Extended Projection

    R =  A  B
         1  2
         3  4

    PROJECT_{A+B→C, A→A1, A→A2}(R) =

         C  A1  A2
         3  1   1
         7  3   3

    Using the same PROJECT_L operator, we allow the list L to contain arbitrary expressions involving attributes, for example:

    1. Arithmetic on attributes, e.g., A+B.

    2. Duplicate occurrences of the same attribute.
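
    A small sketch of the extended projection in this example: each input tuple (A, B) produces one output tuple (C, A1, A2) with C = A+B and A copied twice. The list-of-tuples representation is an assumption of this sketch.

```python
R = [(1, 2), (3, 4)]  # R(A, B)

# Extended projection: C = A + B, A1 = A, A2 = A.
extended = [(a + b, a, a) for (a, b) in R]

print(extended)  # [(3, 1, 1), (7, 3, 3)]
```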


    Aggregation Operators

    Aggregation operators are not operators of relational algebra.

    Rather, they apply to entire columns of a table and produce a single result.

    The most important examples: SUM, AVG, COUNT, MIN, and MAX.


    Example: Aggregation

    R = ( A, B )
          1  3
          3  4
          3  2

    SUM(A) = 7
    COUNT(A) = 3
    MAX(B) = 4
    AVG(B) = 3
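
    The four aggregates in this example can be recomputed with Python built-ins. This is only a sketch; the column lists `col_a` and `col_b` are names introduced here for illustration.

```python
R = [(1, 3), (3, 4), (3, 2)]  # R(A, B)

# An aggregation works on a whole column, so pull the columns out first.
col_a = [a for (a, _) in R]
col_b = [b for (_, b) in R]

sum_a = sum(col_a)                 # SUM(A)
count_a = len(col_a)               # COUNT(A), duplicates are counted
max_b = max(col_b)                 # MAX(B)
avg_b = sum(col_b) / len(col_b)    # AVG(B)

print(sum_a, count_a, max_b, avg_b)  # 7 3 4 3.0
```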


    Grouping Operator

    R1 := γ_L(R2)

    L is a list of elements that are either:

    1. Individual (grouping) attributes.

    2. AGG(A), where AGG is one of the aggregation operators and A is an attribute.


    Applying γ_L(R)

    Group R according to all the grouping attributes on list L, i.e., form one group for each distinct list of values for those attributes in R.

    Within each group, compute AGG(A) for each aggregation on list L.

    Result has one tuple for each group:

    1. The grouping attributes and

    2. Their group's aggregations.


    Example: Grouping/Aggregation

    R =  A  B  C
         1  2  3
         4  5  6
         1  2  5

    γ_{A,B,AVG(C)}(R) = ??

    First, group R:

         A  B  C
         1  2  3
         1  2  5
         4  5  6

    Then, average C within groups:

         A  B  AVG(C)
         1  2  4
         4  5  6
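
    The two steps of this example (group on A and B, then average C within each group) can be sketched as follows. The dictionary-of-lists grouping is one possible implementation, chosen here for readability.

```python
from collections import defaultdict

R = [(1, 2, 3), (4, 5, 6), (1, 2, 5)]  # R(A, B, C)

# Step 1: form one group per distinct (A, B) value, collecting the C values.
groups = defaultdict(list)
for a, b, c in R:
    groups[(a, b)].append(c)

# Step 2: one output tuple per group, carrying AVG(C) for that group.
grouped = [(a, b, sum(cs) / len(cs)) for (a, b), cs in groups.items()]

print(grouped)  # [(1, 2, 4.0), (4, 5, 6.0)]
```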


    Example: Grouping/Aggregation

    SpreePL (Sport, year, Player)

    For each player who has participated in at least three sports, give the earliest year in which he or she participated.

    First we group, using Player as a grouping attribute.

    Then, we compute the MIN(year) for each group.

    Also, we need to compute the COUNT(Sport) aggregate for each group, for filtering out those Players with fewer than three Sports.

    PROJECT_{Player, minYear}(SELECT_{ctSport >= 3}(γ_{Player, MIN(year)→minYear, COUNT(Sport)→ctSport}(SpreePL)))
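
    The query can be traced on a few rows. The data below is hypothetical, invented only for this sketch. Note that COUNT(Sport) counts tuples, so a player listed for the same sport in two years would be counted twice.

```python
from collections import defaultdict

# Hypothetical SpreePL(Sport, year, Player) rows, invented for illustration.
spree_pl = [
    ("Cricket", 2016, "Asha"),
    ("Football", 2015, "Asha"),
    ("Chess", 2017, "Asha"),
    ("Cricket", 2016, "Ravi"),
    ("Tennis", 2014, "Ravi"),
]

# Grouping on Player, computing MIN(year) and COUNT(Sport) per group.
groups = defaultdict(lambda: {"minYear": None, "ctSport": 0})
for sport, year, player in spree_pl:
    g = groups[player]
    g["minYear"] = year if g["minYear"] is None else min(g["minYear"], year)
    g["ctSport"] += 1

# Selection ctSport >= 3, then projection onto (Player, minYear).
result = [(p, g["minYear"]) for p, g in groups.items() if g["ctSport"] >= 3]

print(result)  # [('Asha', 2015)]
```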


    Outerjoin

    Motivation

    Suppose we compute R JOIN S.

    A tuple of R which doesn't join with any tuple of S is said to be dangling. Similarly for a tuple of S.

    Problem: We lose dangling tuples.

    Outerjoin

    Preserves dangling tuples by padding them with a special NULL symbol in the result.


    Example: Outerjoin

    R =  A  B        S =  B  C
         1  2             2  3
         4  5             6  7

    (1,2) joins with (2,3), but the other two tuples are dangling.

    R OUTERJOIN S =  A     B  C
                     1     2  3
                     4     5  NULL
                     NULL  6  7
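
    A minimal sketch of the full outerjoin for this example, assuming R(A, B) and S(B, C) are lists of tuples joined on the shared attribute B, with Python's `None` standing in for NULL. The function name `full_outerjoin` is hypothetical.

```python
def full_outerjoin(R, S):
    """Natural full outerjoin of R(A, B) and S(B, C) on the shared attribute B."""
    result = []
    matched_s = set()  # positions of S tuples that joined with some R tuple
    for a, b in R:
        joined = False
        for i, (b2, c) in enumerate(S):
            if b == b2:
                result.append((a, b, c))
                matched_s.add(i)
                joined = True
        if not joined:
            result.append((a, b, None))  # dangling R tuple, padded with NULL
    for i, (b2, c) in enumerate(S):
        if i not in matched_s:
            result.append((None, b2, c))  # dangling S tuple, padded with NULL
    return result


R = [(1, 2), (4, 5)]
S = [(2, 3), (6, 7)]

print(full_outerjoin(R, S))  # [(1, 2, 3), (4, 5, None), (None, 6, 7)]
```

    Dropping the final loop would give a left outerjoin, and dropping the `if not joined` branch would give a right outerjoin.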


    R OUTERJOIN S: full outerjoin. Pads dangling tuples from both tables R and S.

    R OUTERJOIN_L S: left outerjoin. Pads only dangling tuples from the left table.

    R OUTERJOIN_R S: right outerjoin. Pads only dangling tuples from the right table.


    Outer Join Example

    Relation loan

    Relation borrower

    customer_name  loan_number
    Abhiram        L-170
    Kavitha        L-230
    Madhu          L-155


    Outer Join Example

    Natural Join

    loan JOIN borrower

    Left Outer Join

    loan OUTERJOIN_L borrower


    Outer Join Example

    Full Outer Join

    loan OUTERJOIN borrower

    Right Outer Join

    loan OUTERJOIN_R borrower


    Problem

    Consider the relational schemas:

    Email (IDNO, email_id)
    Student (IDNO, Name, Hostel, Room)

    The Email relation is maintained by ARC, whereas the Student relation is maintained by SWD. It is required that the information in the two relations be combined into a new relation with schema:

    Student_email (IDNO, email_id, Name, Hostel, Room)

    It is observed that there are some IDNOs in the Student relation that are not in the Email relation. Write a Relational Algebra expression to create the new relation such that no IDNO from the Student relation is left out.

    ρ(Student_email, (Student left outer join Email))
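
    The answer can be tried out on a couple of rows. Everything below is hypothetical: the IDNOs, names, and addresses are invented, and the sketch assumes each IDNO has at most one email_id so that a dictionary lookup is enough.

```python
# Hypothetical rows, invented for illustration.
email = [("2019A7001", "a7001@example.org")]  # Email(IDNO, email_id)
student = [                                   # Student(IDNO, Name, Hostel, Room)
    ("2019A7001", "Asha", "H1", 101),
    ("2019A7002", "Ravi", "H2", 202),
]

# Assumes at most one email_id per IDNO, so a dict lookup suffices.
email_by_id = dict(email)

# Left outerjoin: every Student tuple is kept; a missing email becomes None (NULL).
student_email = [
    (idno, email_by_id.get(idno), name, hostel, room)
    for (idno, name, hostel, room) in student
]

for row in student_email:
    print(row)
# ('2019A7001', 'a7001@example.org', 'Asha', 'H1', 101)
# ('2019A7002', None, 'Ravi', 'H2', 202)
```

    A plain natural join would have dropped the second student, which is exactly what the problem statement rules out.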