Time in Databases CSCI 6442 With thanks to Richard Snodgrass, 1985 ACM 0-89791-160-1/85/005/0236.


Agenda
• Taxonomy of time in databases
• Point-in-time databases


The Problem
• Suppose you are away from your company for 6 months, assigned to work at an outpost in Australia, in the opal mines in Coober Pedy, underground and out of communication.
• You become eligible for a raise in the midst of this. When you return, they notice this.
• The raise is then approved and entered into the database within a few days, but it takes effect as of 3 months ago, so you are owed the extra money from that date.
• Which times get recorded in the database?


Types of Time
• Transaction time: the time when the information was stored in the database
• Valid time: the time in reality that the stored information models
• User-defined time: a time that is important to and is defined by the application


Static Databases
• A conventional database models the current state of a dynamically changing world
• When an update is made, the past values are forgotten completely
• This sort of database is called a static database
• It can’t tell you what Jones’s salary was two years ago or how many raises Jones has had in the past four years


Static Rollback Databases
• We can provide the capability to roll back a database to a previous state, such as by using a log
• The effect of recent transactions is removed until the desired state is reached
• Then the intended query can be posed
• Past values cannot be changed; only operations on current values are permitted, because this preserves the series of past operations
• Transaction time must be recorded for every change, to support rollback
• Current operations must be suspended while rollback operations are under way
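
The log-based rollback described on this slide can be sketched roughly as follows. This is a minimal illustration in Python, not the mechanism of any particular database system: the `LogEntry` record, the integer transaction clock, and the function names `apply_change` and `rolled_back` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    tx_time: int           # transaction time (hypothetical integer clock)
    key: str               # e.g. an employee name
    old: Optional[int]     # value before the change; None means the key did not exist
    new: Optional[int]     # value after the change; None means the key was deleted


def apply_change(state, log, tx_time, key, new):
    """Change the current state and append an undo record to the log."""
    log.append(LogEntry(tx_time, key, state.get(key), new))
    if new is None:
        state.pop(key, None)
    else:
        state[key] = new


def rolled_back(state, log, as_of):
    """Return a copy of state with every change made after as_of undone, newest first."""
    snapshot = dict(state)
    for entry in reversed(log):
        if entry.tx_time <= as_of:
            break  # the log is in transaction-time order, so older entries stay applied
        if entry.old is None:
            snapshot.pop(entry.key, None)
        else:
            snapshot[entry.key] = entry.old
    return snapshot


state, log = {}, []
apply_change(state, log, 1, "Jones", 1000)
apply_change(state, log, 5, "Jones", 1200)
apply_change(state, log, 9, "Chen", 900)

past = rolled_back(state, log, as_of=4)  # the state as it was stored at time 4
```

The sketch undoes changes on a copy so that the current state is left alone; a system that rolls back in place would instead have to suspend current operations, as the slide notes.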

Historical Databases
• Historical databases record the history of the values of data as it is best known
• As errors are discovered, they are corrected by changing the database
• Previous states are not retained; at any time, the database reflects the best-known history of the data
• Time represented in a historical database is valid time, the time that the stored information models


Temporal Databases
• A temporal database allows one to view tuples valid at one moment as seen from another moment
• Such a database completely captures the history of current and retroactive changes
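
One way to picture "valid at one moment as seen from another moment" is a table that carries both a valid-time interval and a transaction-time interval on every row. The sketch below uses Python's sqlite3 module; the table `emp_bt`, its column names, the dates, and the helper `salary` are hypothetical and chosen only for illustration. Dates are stored as ISO-8601 text so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp_bt (ename TEXT, sal INTEGER, "
    "valid_from TEXT, valid_to TEXT, tx_from TEXT, tx_to TEXT)"
)

# A raise effective 2008-04-01 is recorded on 2008-07-01.
# The row believed until then is closed in transaction time, not overwritten.
rows = [
    ("Chen", 1000, "2008-01-01", None, "2008-01-01", "2008-07-01"),
    ("Chen", 1000, "2008-01-01", "2008-04-01", "2008-07-01", None),
    ("Chen", 1200, "2008-04-01", None, "2008-07-01", None),
]
conn.executemany("INSERT INTO emp_bt VALUES (?, ?, ?, ?, ?, ?)", rows)


def salary(ename, valid_at, seen_from):
    """Salary valid at valid_at, according to what the database held at seen_from."""
    row = conn.execute(
        "SELECT sal FROM emp_bt WHERE ename = :e "
        "AND valid_from <= :v AND (valid_to IS NULL OR :v < valid_to) "
        "AND tx_from <= :t AND (tx_to IS NULL OR :t < tx_to)",
        {"e": ename, "v": valid_at, "t": seen_from},
    ).fetchone()
    return row[0] if row else None


before = salary("Chen", valid_at="2008-05-01", seen_from="2008-06-01")
after = salary("Chen", valid_at="2008-05-01", seen_from="2008-08-01")
```

Asked in June, the database reports the old salary for May; asked in August, it reports the raised salary for the same day, and both answers remain available.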


Snodgrass
• Wrote “A Taxonomy of Time in Databases” (1985)
• Regarded as the authoritative work
• Defines three types of time


Types of Time
• There are three times:
  • Transaction time: when the change is made to the database
  • Valid time: when the change takes effect in reality (when the raise is effective)
  • User-defined time: an application-specific time, such as when the change was approved


Static Database
• Database that reflects changing values with a snapshot at any particular time
• Does not necessarily reflect the state of the real world
• We discard past states of the database
• Can’t answer questions about what past values were
• Can’t show trends in change over time


Static Rollback Database
• Adds some time sequence
• Ability to back out transactions is provided
• Can back out one at a time to previous states of the database
• Such a database shows the history of transactions rather than the history of the actual data


Snodgrass’s Taxonomy
• Is a useful way to talk about time in databases
• His levels of databases are theoretical, a basis for research
• What if we want to model current and past states of a single table in a SQL database system?


Point in Time Databases
• We can model current and former values of data in a single relational table
• We need to add time to the data model so that changes are captured
• It is possible to use a single table to capture current and former values of data in a single relation
• As performance limitations of databases disappear because of increasing machine power, this becomes practical
• In fact, this approach is used in some advanced applications
• However, it is not widely known


Point In Time Databases
• A single database models the present and all past states of the data as it matters to the organization
• Enough information about past states is stored so that the past state can be obtained using a SQL query
• Rows are marked as no longer valid instead of being deleted
• When values are updated, the old values are kept as well
• This was impractical in 1970 because of cost and efficiency issues
• Is it practical today?


What We Store
• With each change to a row, we insert a new row instead of changing the previous row
• Consider EMP(EMPNO, ENAME, SAL)
• We add the attributes START and STOP
• EMP(EMPNO, ENAME, SAL, START, STOP)
• The new attributes are the start and stop times for validity
• When we change a row, the time of the change is the STOP time for validity of the previous row and the START time of validity for the new row

• These times correspond to what Snodgrass calls “valid time”: the time at which the stored values hold in reality, which is the time that matters to the organization
• The date on which a change was approved is, in his terms, a “user-defined” time
• Transaction time is when the change is entered into the database
• Let’s call the START and STOP times “effective time” because they bound the time when the row’s values are effective


Point In Time Operations
• Insert:
  • New row is added
  • Start time is set
  • Stop time is left null
• Delete:
  • Stop time is set
• Update:
  • New row is added with new values and start time
  • Old row is kept and its stop time is set
• Read:
  • Current values: STOP IS NULL
  • As-of values: ASOF >= START AND (ASOF < STOP OR STOP IS NULL)
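
The four operations above can be sketched against a small table using Python's sqlite3 module. This is an illustrative sketch under stated assumptions, not a reference implementation: the helper names (`insert_emp`, `delete_emp`, `update_sal`, `current`, `as_of`), the sample employees, and the choice to store times as ISO-8601 text are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER, ename TEXT, sal INTEGER, "
    "start TEXT NOT NULL, stop TEXT)"
)


def insert_emp(empno, ename, sal, at):
    # Insert: a new row is added, start time is set, stop time is left null.
    conn.execute(
        "INSERT INTO emp (empno, ename, sal, start, stop) VALUES (?, ?, ?, ?, NULL)",
        (empno, ename, sal, at),
    )


def delete_emp(empno, at):
    # Delete: the current row is kept; only its stop time is set.
    conn.execute(
        "UPDATE emp SET stop = ? WHERE empno = ? AND stop IS NULL", (at, empno)
    )


def update_sal(empno, new_sal, at):
    # Update: close the current row, then add a new row starting at the same time.
    (ename,) = conn.execute(
        "SELECT ename FROM emp WHERE empno = ? AND stop IS NULL", (empno,)
    ).fetchone()
    delete_emp(empno, at)
    insert_emp(empno, ename, new_sal, at)


def current():
    # Read current values: rows whose stop time is null.
    return conn.execute(
        "SELECT empno, ename, sal FROM emp WHERE stop IS NULL ORDER BY empno"
    ).fetchall()


def as_of(at):
    # Read as-of values: rows whose validity interval contains the given time.
    return conn.execute(
        "SELECT empno, ename, sal FROM emp "
        "WHERE ? >= start AND (? < stop OR stop IS NULL) ORDER BY empno",
        (at, at),
    ).fetchall()


insert_emp(7369, "SMITH", 800, "2007-01-01")
insert_emp(7499, "ALLEN", 1600, "2007-06-01")
update_sal(7369, 950, "2008-03-01")
delete_emp(7499, "2008-09-01")

now_rows = current()
rows_2008 = as_of("2008-01-01")
```

Because the old row's stop time equals the new row's start time, the as-of test treats the stop time as exclusive; otherwise both rows would match at the instant of the change.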

Example: Retroactive Pay Increase
• Chen is given a salary increase
• The salary increase is approved at date t1
• The increase is effective at date t2
• However, the increase does not get entered into the database until some later time, t3 > t2 (which is what makes it a retroactive salary increase)
• In this case, t1 is an attribute of the approval process, t2 is the start of effective time, and t3 is the transaction time
• Most important to the organization is the time from which Chen receives the higher pay
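
A rough sketch of how t1, t2 and t3 might each be recorded follows, again using Python's sqlite3 module. The extra columns `approved_on` and `entered_on`, the salary figures, and the dates are assumptions for illustration; the slides do not prescribe them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (ename TEXT, sal INTEGER, start TEXT, stop TEXT, "
    "approved_on TEXT, entered_on TEXT)"
)
# Chen's original salary, entered on the day it became effective.
conn.execute(
    "INSERT INTO emp VALUES ('Chen', 1000, '2008-01-01', NULL, NULL, '2008-01-01')"
)

# Hypothetical dates: approved at t1, effective at t2, entered at t3 > t2.
t1, t2, t3 = "2008-03-15", "2008-04-01", "2008-07-01"

# The old row stops being effective at t2, even though we only learn this at t3.
conn.execute("UPDATE emp SET stop = ? WHERE ename = 'Chen' AND stop IS NULL", (t2,))
conn.execute("INSERT INTO emp VALUES ('Chen', 1200, ?, NULL, ?, ?)", (t2, t1, t3))

# Salary effective on 2008-05-01, a date between t2 and t3.
sal_in_may = conn.execute(
    "SELECT sal FROM emp WHERE ename = 'Chen' "
    "AND start <= ? AND (? < stop OR stop IS NULL)",
    ("2008-05-01", "2008-05-01"),
).fetchone()[0]

# Rows entered after they became effective are the retroactive changes.
retroactive = conn.execute(
    "SELECT ename, sal FROM emp WHERE entered_on > start"
).fetchall()
```

Keeping the entry date next to the effective dates lets an application find retroactive changes with a simple comparison, without consulting a separate log.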

An Example: EMP

Schema Definition: EMP(EMPNO,ENAME,JOB,SAL,DEPTNO)

Now we add START and STOP for validity dates: EMP(EMPNO, ENAME, JOB, SAL, DEPTNO, START, STOP)

• START gives the first time of validity for the values in the row
• STOP gives the time at which the values in the row stop being valid
• A row with no value for STOP is the current value


Example

To find the status of EMP on 1/1/2008:

SELECT *
FROM EMP
WHERE START <= TO_DATE('01-JAN-2008', 'DD-MON-YYYY')
  AND (STOP > TO_DATE('01-JAN-2008', 'DD-MON-YYYY') OR STOP IS NULL);

DELETE statements will now be UPDATEs instead


Complexities
• Not every column value needs to be tracked
  • Can simplify by not tracking some column values
  • This is a semantic issue
• Change in primary key
  • If the primary key changes, it is hard to relate the changed row to the original row
  • Potential case for use of a generated key
• Cascading changes
  • Other tables may change because of one change
  • Use a trigger instead of automatic foreign key update
• Cascading deletes
  • Other tables may have rows deleted because of one change
  • Use a trigger instead of automatic foreign key delete
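
For the cascading-delete case, a trigger might look roughly like the sketch below, written for SQLite and driven from Python's sqlite3 module. The trigger name `dept_logical_delete`, the point-in-time DEPT table, and the sample rows are assumptions for illustration; trigger syntax differs between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dept (deptno INTEGER, dname TEXT, start TEXT, stop TEXT);
    CREATE TABLE emp  (empno INTEGER, ename TEXT, deptno INTEGER, start TEXT, stop TEXT);

    -- When a current DEPT row is logically deleted (its stop time is set),
    -- close the current EMP rows of that department at the same time.
    CREATE TRIGGER dept_logical_delete
    AFTER UPDATE OF stop ON dept
    WHEN OLD.stop IS NULL AND NEW.stop IS NOT NULL
    BEGIN
        UPDATE emp SET stop = NEW.stop
        WHERE deptno = NEW.deptno AND stop IS NULL;
    END;

    INSERT INTO dept VALUES (10, 'ACCOUNTING', '2007-01-01', NULL);
    INSERT INTO dept VALUES (20, 'RESEARCH',   '2007-01-01', NULL);
    INSERT INTO emp  VALUES (7782, 'CLARK', 10, '2007-01-01', NULL);
    INSERT INTO emp  VALUES (7369, 'SMITH', 20, '2007-01-01', NULL);
    """
)

# Logical delete of department 10: an UPDATE, not a DELETE.
conn.execute("UPDATE dept SET stop = '2008-06-01' WHERE deptno = 10 AND stop IS NULL")

stops = dict(conn.execute("SELECT ename, stop FROM emp"))
```

An ordinary ON DELETE CASCADE foreign key would not fire here, because no row is ever physically deleted; the trigger reacts to the stop time being set instead.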


Capabilities
• These techniques allow you to implement a database that can show its state as of any time in the past
• This can be done while the database is actively being used and even updated by multiple processes
• One process can be looking at the database as of two years ago while another is doing current processing
• You can record retroactive changes
• You can implement applications to process retroactive changes


Bottom Line
• This is a practical approach that potentially simplifies data models and processing
• One table replaces duplication of data and completely different functions
• One somewhat more complex function replaces two (or more) separate functions
• Users can be presented with a unified interface that simplifies use of the application
• It’s a tradeoff; sometimes a separate history table is the better choice


Homework
• You have the opportunity to explore this
• You get to build a point-in-time EMP table
• Then you use it to give a retroactive raise
• And then compute how much we owe to the person with the retroactive raise
