Graduate Institute of Computer Science and Information Engineering, In-Service Master's Program, Year 1, P96922001, 莊浚銘: Temporal Database Paper Reading Report

Post on 17-Jan-2016



Graduate Institute of Computer Science and Information Engineering, In-Service Master's Program, Year 1

P96922001

莊浚銘

Temporal Database

Paper Reading Report

1. Paper Subject & Author

2. Study Motivation

3. Paper Contents

4. Conclusion

Reporting Contents

• Subject : Temporal Aggregates and Temporal Universal Quantification in Standard SQL

• Published Date : 2006/06

• Author : Esteban Zimányi, Dept. of Computer & Network Engineering, Université Libre de Bruxelles

• Web Site : http://www.sigmod.org/sigmod/record/issues/0606/p16-article-zimanyi.pdf

Paper Subject & Author

To understand :

• How a SQL database is extended with temporal functionality support

• How temporal operations are implemented in standard SQL

• What problems can be solved by temporal operations

Study Motivation

•No consensus on a temporal SQL standard : Several proposed temporal extensions of SQL, such as “TSQL2” and “SQL/Temporal”, have NOT been accepted by the standardization committees.

Paper Contents

•Standard SQL dominant : Nowadays, database practitioners MUST STILL use standard SQL for manipulating time-varying information. Querying and updating time-varying data using standard SQL is STILL a challenging task.

•Little temporal support : Current database management systems provide little support for dealing with time-varying data. Standard SQL only offers data types for encoding dates or timestamps.

•What this paper aims for : Earlier studies have shown how to define temporal join, projection and difference in standard SQL, but NOT how to deal with temporal aggregates or temporal universal quantifiers. This paper is devoted to the latter issue.

•Temporal DB Schema : Temporal database design in standard SQL

[Figure: database schema; one table is labeled non-temporal and four tables are labeled temporal.]

All circled columns have the same data type, “date”! That means it is hard to tell which tables are temporal from the data types alone.

•Temporal DB Schema(Cont.) :

A closed-open representation is used! That is, [d1, d2)

Note that a special date ‘3000-01-01’ is used to denote currently valid rows! Also, data redundancy is inevitable until the rows are coalesced.
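The closed-open convention makes the overlap test between two validity periods a pair of strict comparisons. A minimal sketch in Python follows; the helper names `overlaps` and `is_current` are made up for illustration, and only the sentinel date comes from the slide.

```python
from datetime import date

# Sentinel "end" date for currently valid rows, as used in the slides.
FOREVER = date(3000, 1, 1)

def overlaps(from1, to1, from2, to2):
    """True when closed-open periods [from1, to1) and [from2, to2) share a day."""
    return from1 < to2 and from2 < to1

def is_current(to_date):
    """True when a row is still valid, i.e. its period ends at the sentinel."""
    return to_date == FOREVER
```

With closed-open periods, two rows that meet exactly (one ends on the day the other starts) do not overlap, which is why strict `<` is used on both sides.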

•Temporal Join : A temporal join is needed when the tables to be combined are temporal.

•Question 1: How do the salary and the affiliation of employees affect each other?

We need a temporal join of the two tables because both are temporal!

•Solution for Question1: To express a temporal join in SQL, we need FOUR select statements and complex inequality predicates verifying that the validity periods of the rows to be combined intersect.


Should be FIVE?

via Temporal Join

[Figure: relative positions of a Salary period (S) and an Affiliation period (A), labeled DURING (twice) and OVERLAP (twice).]

To get all intersected intervals (that is, the intersection operation)

Questions on the SQL: (1) Is “=” missing? (2) Is the EQUIVALENT case missing? (3) Should UNION be used instead?


Incorrect ? for …


•Another solution for Question 1: Note that a temporal join can be written in a single statement, using either a CASE expression or functions.

Ensure the two validity periods overlap

Get the intersection of the validity periods by calling predefined SQL functions!
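The single-statement form can be sketched as follows. This is a hedged illustration, not the paper's exact SQL: it runs on SQLite through Python's `sqlite3` module, uses SQLite's two-argument scalar `max`/`min` in place of the predefined functions the slide mentions, stores dates as ISO strings so that text comparison orders them correctly, and assumes the column names `SSN`, `Amount`, `FromDate` and `ToDate` (only `DNumber` appears in the slides). The function name `temporal_join` is also hypothetical.

```python
import sqlite3

def temporal_join(salary_rows, affiliation_rows):
    """Join Salary and Affiliation on SSN, keeping only intersecting periods."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Salary(SSN TEXT, Amount INTEGER, FromDate TEXT, ToDate TEXT);
        CREATE TABLE Affiliation(SSN TEXT, DNumber TEXT, FromDate TEXT, ToDate TEXT);
    """)
    con.executemany("INSERT INTO Salary VALUES (?, ?, ?, ?)", salary_rows)
    con.executemany("INSERT INTO Affiliation VALUES (?, ?, ?, ?)", affiliation_rows)
    rows = con.execute("""
        SELECT S.SSN, S.Amount, A.DNumber,
               max(S.FromDate, A.FromDate),   -- start of the intersection
               min(S.ToDate, A.ToDate)        -- end of the intersection
        FROM Salary S JOIN Affiliation A ON S.SSN = A.SSN
        -- closed-open periods overlap when each starts before the other ends
        WHERE S.FromDate < A.ToDate AND A.FromDate < S.ToDate
        ORDER BY 1, 4
    """).fetchall()
    con.close()
    return rows
```

For an employee with two salary periods and two affiliation periods, the result holds one row per pair of periods that actually intersect, each cut down to the shared interval.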


•Temporal Coalescing : A complex and costly operation in SQL, similar to the temporal Folding, Nesting and Packing operations proposed by other researchers.

DNumber is projected out to get employees’ total service life span!

Both can be coalesced!

•Temporal Coalescing Realization : Can be implemented in SQL as follows:

The gaps are all 0 in this case, so the 3rd tuple is the one we want. But how do we erase the unwanted tuples?

[Figure: timeline with date labels 020101 030601 030801 000101, annotated “1st Filtering” and “2nd Filtering”.]

Gaps = the date differences between tuples before the join is performed

•Temporal Coalescing Realization(Cont.) : Try the SQL


1st “Not Exists” Filtering (to erase tuples containing the Gap period!)

2nd “Not Exists” Filtering (to erase tuples following or preceding the original ones!)

Final Result !

[Figure: timeline with date labels 020101 030601 030801 790101020601, showing a gap between periods (Gap > 0).]
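The two “Not Exists” filters can be sketched with the widely used double-negation coalescing query. This is a hedged reconstruction under assumptions, not necessarily the paper's exact statement: the table name `Periods`, the column names and the function name `coalesce` are hypothetical, dates are ISO strings, and SQLite is driven from Python.

```python
import sqlite3

def coalesce(rows):
    """Merge adjacent or overlapping closed-open periods per SSN."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Periods(SSN TEXT, FromDate TEXT, ToDate TEXT)")
    con.executemany("INSERT INTO Periods VALUES (?, ?, ?)", rows)
    result = con.execute("""
        SELECT DISTINCT F.SSN, F.FromDate, L.ToDate
        FROM Periods F, Periods L
        WHERE F.SSN = L.SSN AND F.FromDate < L.ToDate
          -- 1st filter: drop candidates that contain a gap, i.e. some row
          -- starts inside the candidate and no other row reaches its start
          AND NOT EXISTS (
              SELECT 1 FROM Periods M
              WHERE M.SSN = F.SSN
                AND F.FromDate < M.FromDate AND M.FromDate <= L.ToDate
                AND NOT EXISTS (
                    SELECT 1 FROM Periods T1
                    WHERE T1.SSN = F.SSN
                      AND T1.FromDate < M.FromDate AND M.FromDate <= T1.ToDate))
          -- 2nd filter: drop candidates that another row extends on the
          -- left or on the right, so only maximal periods remain
          AND NOT EXISTS (
              SELECT 1 FROM Periods T2
              WHERE T2.SSN = F.SSN
                AND ((T2.FromDate < F.FromDate AND F.FromDate <= T2.ToDate)
                  OR (T2.FromDate <= L.ToDate AND L.ToDate < T2.ToDate)))
        ORDER BY 1, 2
    """).fetchall()
    con.close()
    return result
```

The self-join pairs every start with every later end, so the query examines a quadratic number of candidates before filtering; that is one reason coalescing is described as costly.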

•Temporal Aggregation : We need a three-step process to implement temporal aggregation. That process is :


(i) Identifying the periods of time in which all values are constant.

(ii) Computing the aggregation over these periods.

(iii) Coalescing the result.

•Temporal Aggregation(Cont.) : Suppose we want to find the maximum salary among employees. How to compute it depends on whether the target table is temporal or not.

The non-temporal way

[Figure: table with columns Employee and Salary.]

The temporal way

[Figure: salaries of Employee1, Employee2 and Employee3 drawn along a time axis.]

•Temporal Aggregation(Cont.) : Implementation of the temporal way to solve the “Maximum Salary Question”.

1. Create a view named “SalChanges”, so that we can gather the days on which a salary change may occur!

2. Create a view named “SalPeriods”, so that we can construct all “atomic” periods for later processing! Note that there is no gap between any two consecutive periods.

3. Create a view named “TempMax” (using the periods from step (i)), so that we get the maximum salary for every atomic period! Note that there can be a gap between two of these periods (nobody is employed).

4. Create a view named “TempCount”, so that we get the employee count for every atomic period! A count of zero is given when no salary is found.
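Steps (i) and (ii) for the maximum-salary question can be sketched as three views. This is a hedged illustration: the view names follow the slides, but the view bodies, the `Salary` column names and the function name `temporal_max` are assumptions, and the code runs on SQLite from Python with ISO date strings.

```python
import sqlite3

def temporal_max(salary_rows):
    """Return (FromDate, ToDate, MaxSalary) for every atomic period."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Salary(SSN TEXT, Amount INTEGER, FromDate TEXT, ToDate TEXT);

        -- every day on which some salary starts or ends
        CREATE VIEW SalChanges(Day) AS
            SELECT FromDate FROM Salary UNION SELECT ToDate FROM Salary;

        -- atomic periods: pairs of consecutive change days
        CREATE VIEW SalPeriods(FromDate, ToDate) AS
            SELECT P1.Day, P2.Day
            FROM SalChanges P1, SalChanges P2
            WHERE P1.Day < P2.Day
              AND NOT EXISTS (SELECT 1 FROM SalChanges P3
                              WHERE P1.Day < P3.Day AND P3.Day < P2.Day);

        -- maximum over the salaries that cover each atomic period
        CREATE VIEW TempMax(FromDate, ToDate, MaxSalary) AS
            SELECT P.FromDate, P.ToDate, max(S.Amount)
            FROM Salary S, SalPeriods P
            WHERE S.FromDate <= P.FromDate AND P.ToDate <= S.ToDate
            GROUP BY P.FromDate, P.ToDate;
    """)
    con.executemany("INSERT INTO Salary VALUES (?, ?, ?, ?)", salary_rows)
    rows = con.execute("SELECT * FROM TempMax ORDER BY 1").fetchall()
    con.close()
    return rows
```

Adjacent atomic periods can carry the same maximum, which is why the result still needs step (iii), coalescing. Periods in which nobody has a salary produce no row in this sketch.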


To get maximum salary for each period, we can coalesce the view “TempMax”.

Use the coalescing SQL from the previous section! Final result!

[Figure: the maximum salary as it evolves over time.]

•Temporal Aggregation(Cont.) : Similar way to solve the “Maximum Salary by Department Question”.


Divided into 2 layers: Affiliation & Salary

1. Create a view named “Aff_Sal”: a temporal join of affiliation and salary (intersected periods only), which yields the days on which the maximum salary of a department may change!

2. Create a view named “SalChangesDep” (from step 1), so that we can gather, by department, the days on which a salary change may occur!

3. Create a view named “SalPeriodsDep” (from step 2), so that we can construct all “atomic” periods by department for later processing! Note that there is no gap between any two consecutive periods.

4. Create a view named “TempMaxDep” (from step 1 and step 3).

5. Query the view. Final result!

•Temporal Universal Quantifier : Needed in many usual queries, such as “List the employees that work in all projects controlled by the department to which they are affiliated”.


The non-temporal version of above query

Based on the DB schema at the top-right corner, we can construct an E-R model as follows:

[Figure: E-R diagram relating Employee, Affiliation, Salary, WorksOn, Department, Project and Controls through “has” relationships with 1 and n cardinalities; three tables are marked “Key table”, and a legend marks inferred entities.]

That is, finding supermen!

Recall that we have seen the “Not Exists, Not Exists” pattern before. It is mainly used to erase tuples containing any “GAP” period.
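The non-temporal version of the query can be sketched with the same double “Not Exists” pattern: an employee qualifies when no project controlled by their department exists that they do not work on. This is a hedged illustration; the non-temporal schemas, the column name `PNumber` and the function name `supermen` are assumptions, and the SQL runs on SQLite from Python.

```python
import sqlite3

def supermen(affiliation, controls, works_on):
    """Employees working on ALL projects controlled by their department."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Affiliation(SSN TEXT, DNumber TEXT);
        CREATE TABLE Controls(DNumber TEXT, PNumber TEXT);
        CREATE TABLE WorksOn(SSN TEXT, PNumber TEXT);
    """)
    con.executemany("INSERT INTO Affiliation VALUES (?, ?)", affiliation)
    con.executemany("INSERT INTO Controls VALUES (?, ?)", controls)
    con.executemany("INSERT INTO WorksOn VALUES (?, ?)", works_on)
    rows = con.execute("""
        SELECT DISTINCT A.SSN
        FROM Affiliation A
        WHERE NOT EXISTS (            -- there is no controlled project ...
            SELECT 1 FROM Controls C
            WHERE C.DNumber = A.DNumber
              AND NOT EXISTS (        -- ... that the employee does not work on
                  SELECT 1 FROM WorksOn W
                  WHERE W.SSN = A.SSN AND W.PNumber = C.PNumber))
        ORDER BY A.SSN
    """).fetchall()
    con.close()
    return [ssn for (ssn,) in rows]
```

Standard SQL has no “for all” operator, so the quantifier is rewritten as “there is no counterexample”. Note that an employee whose department controls no project qualifies vacuously under this formulation.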

•Temporal Universal Quantifier(Cont.) : Depending on whether the tables WorksOn, Affiliation, and Controls are temporal or not, four cases arise for the previous query.

[Figure: the same E-R diagram, with three tables marked “Temporal ?” and the Employee-Affiliation cardinality questioned as “1 : N ?”.]

•Temporal Universal Quantifier(Cont.) :


Supermen are found by using the view below:

Pattern found again!

Construct all “atomic” periods for later processing!


Pattern found again!

[Figure: the same E-R diagram, with the Employee-Affiliation “has” relationship now marked 1 to n.]

Project independent !

Temporal join


Project independent !


1st Temporal Join

2nd Temporal Join


1. Implementing temporal queries in standard SQL is costly, in both coding effort and execution time, because it requires constructing many complex views.

2. To reduce execution time, the paper suggests using T-SQL procedures with cursors instead, but leaves the details to future work.

3. The best solution, however, is to have the DBMS provide such time-varying facilities in a native way, since that would increase both database performance and application development productivity.

Conclusion

The End