INF5100 Advanced Database Systems


M. Naci Akkøk, Fall 2004
Department of Informatics, University of Oslo, Norway

Week 1 – Introduction

(Previously INF3180, also based upon earlier INF312, IN-MDS and UNIKI 330)

Reference: All foils in INF5100 are based upon earlier foils by Vera Goebel (partially also by Earl & Denise Ecklund and Knut Hegna)

INTRODUCTION – Organization of the course

• Your instructors are:
  • M. Naci Akkøk (NAK) for most of the lectures.
  • Norun Sanderson (NS) for the mandatory exercise. She will also hold two lectures: on ObjectStore (example) and on “knowledge on ad-hoc networks”.
  • Toto Horvli (TH), from NCR, Teradata, shall talk about data-warehousing and a bit about data mining.
  • Knut Omang (KO), from FAST, shall talk about information storage/retrieval and a bit about the Semantic Web.

• The course is 14 weeks, distributed as…
  • 12 x 3 hours (36 hours) of lectures
  • 2 x 3 hours of “space”, 1 for starting up mandatory exercises (term projects), 1 for mid-terms (for all of IfI)

INTRODUCTION – Exam & other course info

• Exam: To be announced. May be oral or written, depending upon the number of students taking the exam.

• Information about the course: All information, as well as course materials and relevant messages, is on: http://www.uio.no/studier/emner/matnat/ifi/INF5100/h04/

See also the “official” home-page of the course: http://www.uio.no/studier/emner/matnat/ifi/INF5100/index-eng.html

INTRODUCTION – Mandatory exercise (or term project or “oblig”)

• Goal: Learn and use concepts of advanced database systems. Typically, given a mini-world description and/or a “problem”:

  • Design a solution (like an OO or OR schema, a transactional solution, distribution solution or the like)

  • Implement some of the design

  • Learn to use various technologies, tool-specific facilities etc.

• Organization: Work in groups of 3 students!

• Delivery date: Wednesday 3.Nov.2004 at 16:00 hrs.

• Help: Norun Sanderson ([email protected])

INTRODUCTION – Goals of the course: What to expect

• We shall learn about concepts and design, not so much about concrete systems!

• We shall aim to understand application requirements in order to determine which DBS technologies to use for the specific requirements…

• We shall look at novel but relatively mature DBS technologies, not the latest research approaches.

INTRODUCTION – Contents & organization #2

• Introduction, applications & requirements: Naci Akkøk (NAK)
• Introduction to the mandatory exercise and ObjectStore: Norun Sanderson (NS)
• Beyond relational DBS: NAK
• OODBS, standards & active databases: NAK
• Distributed DBS: NAK
• Heterogeneous DBS: NAK
• Transaction models: NAK
SPACE (week 40, starting Monday 27th September 2004)
• Transaction management in HDBS: NAK
SPACE (week 42, starting Monday 11th October 2004)
• Change management, XML and the WWW: NAK
• Data-warehousing + data-mining intro: Toto Horvli (TH)
• Information storage/retrieval + Semantic Web: Knut Omang (KO)
• More on data mining + multimedia DBS: NAK
• Mobile systems and knowledge on ad-hoc networks: NS, NAK
• Summary and questions: NAK

INTRODUCTION – Literature: Mandatory syllabus (“pensum”)

• All slides & handouts !!!

• Elmasri/Navathe: Fundamentals of Database Systems, 3rd Edition, Addison-Wesley, 2000: Chapter 27

• Avi Silberschatz, Michael Stonebraker, Jeff Ullman (Eds. Special Issue), Database Systems: Achievements and Opportunities, Communications of the ACM, Vol. 34, No. 10, October 1991, pp. 110-120

See also Communications of the ACM on the ACM Portal (all volumes): http://portal.acm.org/browse_dl.cfm?linked=1&part=magazine&idx=J79&coll=portal&dl=ACM&CFID=24943463&CFTOKEN=15242811

INTRODUCTION – Literature: Recommended reading (“anbefalt”)

• Special Issue on OODBS, IEEE Computer, Vol. 23, No. 12, December 1990
• Berners-Lee, T., Cailliau, R., Luotonen, A., Nielsen, H., Secret, A., The World Wide Web, Communications of the ACM, Vol. 37, No. 8, August 1994, pp. 76-82
• Informix Web Integration Option, 1998, http://www.informix.com/
• Manola, F., Towards a Richer Web Object Model, ACM SIGMOD Record, Vol. 27, No. 1, March 1998
• Mendelzon, A., Mihaila, G., Milo, T., Querying the World Wide Web, Journal on Digital Libraries, Vol. 1, No. 1, April 1997
• Fraternali, P., Tools and Approaches for Data Intensive Web Applications: A Survey, ACM Computing Surveys, Vol. 31, No. 3, September 1999
• Dogac, A. (Editor), Special Section on Electronic Commerce, ACM SIGMOD Record, Vol. 27, No. 4, December 1998
• Grosky, W., Managing Multimedia Information in Database Systems, Communications of the ACM, Vol. 40, No. 12, December 1997, pp. 72-80
• Pazandak, P., Srivastava, J., Evaluating Object DBMSs for Multimedia, IEEE Multimedia, Vol. 4, No. 3, 1995, pp. 34-49
• Imielinski, T., Badrinath, B., Mobile Wireless Computing: Challenges in Data Management, Communications of the ACM, Vol. 37, No. 10, October 1994, pp. 18-28
• Dunham, M., Helal, A., Mobile Computing and Databases: Anything New?, ACM SIGMOD Record, Vol. 24, No. 4, December 1995
• Schatz, B., Information Retrieval in Digital Libraries: Bringing Search to the Net, Science, Vol. 275, 17 January 1997

NOTA BENE (NB)! See also the Web-site of the course!

APPLICATIONS & REQUIREMENTS – We begin…

We start by looking at “older” vs. “newer” applications and their respective (storage & retrieval) requirements:

• We summarize traditional database systems and the development of technology to see the achievements & limitations of traditional DBS

• We then look at newer application domains and the new requirements they impose upon DBS

• We also look at classical and newer technical environments to see which requirements they impose upon storage & retrieval

• Finally, we look at some examples, summarize current and developing DBS technologies and future implications

[Figure: boxes labelled Traditional applications, Traditional technologies and Traditional DBS, connected by “Lead to” arrows; below them “Newer” applications, “Newer” technologies and “Newer” DBS, again connected by “Lead to” arrows and marked “??”]

APPLICATIONS & REQUIREMENTS – Traditional DBS

• DBS based upon the relational data model:
  • Sybase, Oracle, DB2, Informix, ACCESS, etc.
  • Classic paper written by Ted Codd, IBM Research, 1970.

• DBS based upon the network data model:
  • IDS, VAX-DBMS, DMS-1100, SUPRA
  • CODASYL Data Base Task Group (1971) report (DBTG model)

• DBS based upon the hierarchical data model:
  • IMS (reference system, late 1960s), System-2000
  • No original documents describing the hierarchical model
  • But… do see ftp://ftp.software.ibm.com/software/data/ims/shelf/presentations/imsoverview.pdf (at IBM) for an overview and a surprise – if you think that relational DBS are the only widely used DBS!

APPLICATIONS & REQUIREMENTS – A list of data models

For the more curious (not compulsory)…

• For a more-or-less complete list of data models, see for example:

http://unixspace.com/context/databases.html

• For a good understanding of the various data models, you need to start with the hierarchical data model.

See for example lecture notes on the hierarchical data model by George Samaras (University of Cyprus, Dept. of Computer Science) on

http://www2.cs.ucy.ac.cy/~epl242/lectures.html

APPLICATIONS & REQUIREMENTS – Database technology timeline

FROM: Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999

NOTE: SLIGHTLY MODIFIED WITH RESPECT TO THE ORIGINAL

[Figure: timeline table with the row labels Era, DM & technology, Key app. & TP, Functionality]

ACHIEVEMENTS and LIMITATIONS of TRADITIONAL DB TECHNOLOGY

OUTLINE:

• From files to databases

• DBMS (database management system) concepts

• Traditional and advanced (modern) DBS applications

• Comparison of classical & new requirements for DBMS concepts

• Conclusions

TRADITIONAL DB TECHNOLOGY – Introduction

PROBLEM: Data management in applications

TRADITIONAL DB TECHNOLOGY – Properties of database systems

Requirements:
• Persistent data management
• Concurrency control
• Recovery
• Ad-hoc queries
• Data integration
• Logical and physical data independence
• Data consistency
• Data security and logging
• Distribution

Technical environment:
• Operating system

[Figure labels: APPLICATION PROGRAM, DATABASE MANAGEMENT SYSTEM, OPERATING SYSTEM, DATABASE]

TRADITIONAL DB TECHNOLOGY – Database Management System (DBMS)

Software for the …

• durable, life time of data > duration of creating process

• reliable, integrity, consistency, loss prevention

• independent, mutual modification immunity (AP ↔ DB)

… management and …

• comfortable, "higher" abstract interface

• flexible, ad hoc access possibility

… usage of …

• large, data size > main memory size

• integrated, of/for multiple applications, controlled redundancy

• multi-user, parallel access

… databases.

TRADITIONAL DB TECHNOLOGY, DBMS CONCEPTS – Structures & Operations

Aim:
• Representation of conceptual entity structures of applications
• Consistent manipulation of representations

Determined by:
• Entity structure
• Entity size and occurrence
• Possible operations

Data model:
• Data Definition Language (DDL)
• Data Manipulation Language (DML)
• Integrity constraints
• NB! Assume also a Data Query Language (DQL) for later use

[Figure labels: MINI WORLD, DATABASE, DATA MODEL, STRUCTURES, OPERATIONS, SCHEMA, DATA, ?]
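To make the three language roles of a data model concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table `part` and its columns are hypothetical and not taken from the foils; SQLite is used only because it needs no installation.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")

# DDL: define the structure, including an integrity constraint (CHECK).
conn.execute(
    """
    CREATE TABLE part (
        part_no  INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        in_stock INTEGER NOT NULL CHECK (in_stock >= 0)
    )
    """
)

# DML: insert and update data.
conn.executemany(
    "INSERT INTO part (part_no, name, in_stock) VALUES (?, ?, ?)",
    [(1, "bolt", 100), (2, "nut", 250)],
)
conn.execute("UPDATE part SET in_stock = in_stock - 40 WHERE part_no = 1")

# DQL: query the data.
rows = conn.execute(
    "SELECT name, in_stock FROM part ORDER BY part_no"
).fetchall()

# The integrity constraint rejects an update that would violate it.
try:
    conn.execute("UPDATE part SET in_stock = -1 WHERE part_no = 2")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
```

The schema lives in the DDL statement, the operations are the generic INSERT/UPDATE/SELECT, and the CHECK clause is an integrity constraint that the DBMS enforces rather than the application.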

TRADITIONAL DB TECHNOLOGY, DBMS CONCEPTS – Views

Aim:

• Creation of new facts based on existing facts

• Different perspectives on the same database for different users

Determined by:

• Definition facilities (mostly queries in DML)

• Modification possibilities (often very restricted)

• Degree of materialization

View model
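The points above can be illustrated with a small sketch in Python's sqlite3 module. The `employee` table and the view name `rd_staff` are hypothetical. SQLite happens to treat views as non-materialized and read-only, which is one possible point in the design space the slide describes, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_no INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ada", "R&D", 900), (2, "Bob", "Sales", 700), (3, "Cy", "R&D", 800)],
)

# Definition facility: the view is defined by a query.
# It gives a perspective on the data that hides the salary column.
conn.execute(
    "CREATE VIEW rd_staff AS "
    "SELECT emp_no, name FROM employee WHERE dept = 'R&D'"
)
view_rows = conn.execute("SELECT name FROM rd_staff ORDER BY emp_no").fetchall()

# Degree of materialization: this view is not materialized,
# so it reflects later changes to the base table.
conn.execute("UPDATE employee SET dept = 'R&D' WHERE emp_no = 2")
view_rows_after = conn.execute(
    "SELECT name FROM rd_staff ORDER BY emp_no"
).fetchall()

# Modification possibilities: SQLite rejects direct writes to a view.
try:
    conn.execute("INSERT INTO rd_staff VALUES (4, 'Dee')")
    view_is_updatable = True
except sqlite3.OperationalError:
    view_is_updatable = False
```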

TRADITIONAL DB TECHNOLOGY, DBMS CONCEPTS – Transactions

Aim:

• Consistency checking

• Synchronization of multi-user mode

• Data integrity (recovery in case of system errors)

Determined by:

• Transaction duration

• Size of processed entities

• Way of cooperation / concurrency

Transaction model

[Figure: a transaction takes a consistent database to a consistent database; in between, the database is possibly inconsistent]

BEGIN TRANSACTION
  op 1
  op 2
  :
  op n
END TRANSACTION

TRADITIONAL DB TECHNOLOGY, DBMS CONCEPTS – Integrity constraints

Aim:
• Actual database state cannot be in conflict with the constraints formulated in DBS

Possible kinds of constraints:
• Inherent: part of data model
• Implicit: can be formulated with concepts of data model
• Explicit: assertions, triggers, dynamic integrity constraints

Determined by:
• Size and structure of evolved entities
• Kind of constraints (state, state transitions, or state sequences)
• Kind of events that cause state transitions

Integrity model

TRADITIONAL DB TECHNOLOGY, DBMS CONCEPTS – Authorization

Aim:

• Restriction of access to database for individual users (groups)

• Restriction of access on representations of individual entities in the database for specific user groups

Determined by:

• Structure of evolved entities

• Available operations for entities

Authorization model

TRADITIONAL DB TECHNOLOGY, DBMS CONCEPTS – Summary

• Data structures and operations

• Views

• Transactions

• Integrity constraints

• Authorization

• And of course others (like storage structures etc)

TECHNICAL ENVIRONMENT – What is it that influences the DBS?

Processor level
• Architecture of integrated modules: arrangement and connection of elementary circuits

Machine level
• Computer architecture: arrangement and connection of modules
• Storage media

Operating system/network level
• Operating system architecture: architecture model (processes, storage management)
• Network architecture: arrangement and connection of computers (machines) (data/processing distribution: centralized, decentralized)

GOALS – Once more...

Comparison of...
… requirements of classical application domains with requirements of newer (nonstandard) application domains
… classical technical environments with newer technical environments

Show that...
… concepts of classical database systems are too restricted for new requirements (i.e., requirements imposed by newer applications)
… new technical environments are not fully exploited (i.e., newer technology implies possibilities that are constrained by the limitations in traditional DBS)

We now start comparing classical & newer application domains and their respective requirements upon the database systems!

CLASSICAL APPLICATION DOMAINS – Examples

Bookkeeping systems

• Inventory control

• Financial management

• Travel industry

Planning systems

• Projects

• Production

NEWER APPLICATION DOMAINS – Examples

• CAD/CAM and CIM
• Chemistry/pharmacy
• Mechanics
• Agronomy
• Computer science
• Electrical engineering
• Geography, geology, geo-physics, etc.

• Office automation
• Computer graphics
• Multimedia applications
• Knowledge representation and processing
• Scientific and medical applications
• The World Wide Web

CLASSICAL DOMAINS – Requirements for DBMS concepts

Structures and operations:
• Relatively simply structured entities
• Large amount of relatively small entities
• Only generic operations

Example:
• Management of accounts in relational model

CREATE TABLE account (
    account_no NUMBER,
    name       CHAR (50),
    address    CHAR (100),
    limit      NUMBER,
    PRIMARY KEY (account_no) );

-- debit and credit each reference an account
CREATE TABLE accounting (
    debit   NUMBER,
    credit  NUMBER,
    date    DATE,
    amount  NUMBER,
    FOREIGN KEY (debit) REFERENCES account,
    FOREIGN KEY (credit) REFERENCES account );

[ER diagram: entity types Account and Accounting, connected by the relationships CREDIT and DEBIT, each with the cardinalities (1,1) and (0,n)]
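As a hedged illustration, the account schema can be tried out with Python's sqlite3 module. Two columns are renamed here because `limit` is a keyword in SQLite and `date` is easy to confuse with a type name: `credit_limit` and `booked_on` are assumptions of this sketch, not names from the slide. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE account (
        account_no   INTEGER PRIMARY KEY,
        name         TEXT,
        address      TEXT,
        credit_limit NUMERIC            -- 'limit' on the slide
    );
    CREATE TABLE accounting (
        debit     INTEGER NOT NULL REFERENCES account (account_no),
        credit    INTEGER NOT NULL REFERENCES account (account_no),
        booked_on TEXT,                 -- 'date' on the slide
        amount    NUMERIC
    );
    """
)

conn.execute("INSERT INTO account VALUES (1, 'A', 'Oslo', 1000)")
conn.execute("INSERT INTO account VALUES (2, 'B', 'Bergen', 500)")

# A booking between two existing accounts is accepted.
conn.execute("INSERT INTO accounting VALUES (1, 2, '2004-08-23', 250)")

# A booking that credits a non-existent account is rejected by the DBMS.
try:
    conn.execute("INSERT INTO accounting VALUES (1, 99, '2004-08-23', 10)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

booking_count = conn.execute("SELECT COUNT(*) FROM accounting").fetchone()[0]
```

Both entities are flat rows of a few small attributes, and everything is done with generic INSERT and SELECT operations, which is the profile of a classical domain.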

NEWER DOMAINS #1 – Requirements for DBMS concepts

Structures and operations
• Very complexly structured entities
• Relatively small amount of large entities
• User-defined operations

Example: VLSI design

NEWER DOMAINS #2 – Requirements for DBMS concepts

Structures and operations
• Very complexly structured entities
• Relatively small amount of large entities
• User-defined operations

Example: Multimedia entities
• Image: Format 200 x 200 x 8 = 320 KBytes
• Video sequence: Ca. 700 KBytes/s (100x100x8x24), uncompressed and without sound
• Audio sequence: Ca. 190 KBytes/s, uncompressed in CD quality

Important note: History of entity development leads to versioning concepts

[Figure: a PERSON DESCRIPTION entity composed of IMAGE, VIDEO SEQUENCE, SOUND and TEXTUAL DATA (“Skrue McDuck, Pengebingen, Gullveien No. 10111 ANDEBY”)]
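The size estimates can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the slide does not say whether the factor 8 counts bits or bytes per pixel, nor which colour depth and frame rate the video figure uses, so the readings below are labelled as such. The CD parameters (44,100 samples per second, 16 bits, 2 channels) are the standard audio CD values.

```python
def raw_bytes(width, height, bits_per_pixel, frames=1):
    """Uncompressed size in bytes of `frames` images of the given format."""
    return width * height * bits_per_pixel * frames // 8


# Image, 200 x 200 pixels.
# Reading the factor 8 as bits per pixel:
image_8_bits_per_pixel = raw_bytes(200, 200, 8)
# Reading the factor 8 as bytes per pixel (matches the slide's 320 KBytes):
image_8_bytes_per_pixel = raw_bytes(200, 200, 64)

# Video, 100 x 100 pixels, one second of frames.
# Assumption: 24-bit colour at 24 frames per second.
video_bytes_per_second = raw_bytes(100, 100, 24, frames=24)

# Audio in CD quality: 44,100 samples/s, 16 bits per sample, 2 channels.
audio_bytes_per_second = 44_100 * 16 * 2 // 8
```

Under these readings the image comes to 40,000 or 320,000 bytes, the video to 720,000 bytes per second and the audio to 176,400 bytes per second, which is the same order of magnitude as the figures on the slide. Either way a single multimedia entity is far larger than a typical attribute value in a classical application.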

NEWER DOMAINS #3 – Requirements for DBMS concepts

Structures and operations
• Very complexly structured entities
• Relatively small amount of large entities
• User-defined operations

Example: CAD object

• Move a rectangle (consisting of 4 points) by a vector z with components dx and dy

Move rectangle:
a = (1, 1) → a’ = (5, 5)
b = (7, 1) → b’ = (11, 5)
c = (1, 3) → c’ = (5, 7)
d = (7, 3) → d’ = (11, 7)

[Figure: x/y grid (x from 1 to 11, y from 1 to 7) showing the rectangle a, b, c, d and the moved rectangle a', b', c', d']

1) for each point do
     for each coordinate do
       modify_value (delta)

2) move_quadrangle (delta_vector)
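The two alternatives can be sketched in Python as follows. The class and function names are hypothetical. Variant 1 mimics an application that only has generic per-value updates at its disposal, variant 2 a user-defined operation that belongs to the entity itself.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


def move_with_generic_updates(points, dx, dy):
    """Variant 1: the application modifies every coordinate one by one.

    Returns the number of single-value updates that were needed.
    """
    updates = 0
    for p in points:
        p.x += dx
        updates += 1
        p.y += dy
        updates += 1
    return updates


@dataclass
class Rectangle:
    a: Point
    b: Point
    c: Point
    d: Point

    def move(self, dx, dy):
        """Variant 2: one user-defined operation on the whole entity."""
        for p in (self.a, self.b, self.c, self.d):
            p.x += dx
            p.y += dy

    def corners(self):
        return [(p.x, p.y) for p in (self.a, self.b, self.c, self.d)]


# Variant 1 on the slide's rectangle, moved by the vector (4, 4).
points = [Point(1, 1), Point(7, 1), Point(1, 3), Point(7, 3)]
generic_update_count = move_with_generic_updates(points, 4, 4)

# Variant 2 on the same rectangle.
rect = Rectangle(Point(1, 1), Point(7, 1), Point(1, 3), Point(7, 3))
rect.move(4, 4)
moved_corners = rect.corners()
```

Variant 1 needs eight separate updates and leaves the knowledge of what a rectangle is in the application program. Variant 2 is a single call whose semantics could be stored and checked together with the entity.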

NEWER REQUIREMENTS FOR DBMS CONCEPTS – A first summary

• Current (relational) DBMS technology does not fulfill the new (nonstandard) requirements

• Very complexly structured entities
  • Fragmentation of entity representation

• Very large entities
  • Restricted attribute size, no access support

• User-defined operations
  • Major part of entity semantics in application program

CLASSICAL REQUIREMENTS FOR DBMS CONCEPTS – Transactions

• Few (small) objects are read or modified

• Short duration

• No cooperation, but concurrency

Example: Debit/credit transactions in relational systems

PROCEDURE transfer (debit_account_no, credit_account_no: INTEGER;
                    amount: INTEGER);
BEGIN
  BEGIN TRANSACTION;
  (* abort if the debit account cannot cover the amount *)
  IF NOT (liquid(debit_account_no)) THEN
    ABORT TRANSACTION;
  ELSE
    EXEC SQL INSERT INTO accounting (debit, credit, amount)
             VALUES (:debit_account_no, :credit_account_no, :amount);
  ENDIF;
  END TRANSACTION;
END transfer;
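A runnable approximation of the transfer procedure, written with Python's sqlite3 module. This is a sketch under assumptions: the `balance` column, the extra `amount` argument of `liquid`, and the two balance updates are additions that the slide's pseudocode does not show. The `with conn:` block commits when it ends normally and rolls back when an exception leaves it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (account_no INTEGER PRIMARY KEY,
                          balance    INTEGER NOT NULL);
    CREATE TABLE accounting (debit INTEGER, credit INTEGER, amount INTEGER);
    INSERT INTO account VALUES (1, 100);
    INSERT INTO account VALUES (2, 0);
    """
)


def liquid(conn, account_no, amount):
    """Hypothetical check: can the account cover the amount?"""
    (balance,) = conn.execute(
        "SELECT balance FROM account WHERE account_no = ?", (account_no,)
    ).fetchone()
    return balance >= amount


def transfer(conn, debit_account_no, credit_account_no, amount):
    """Return True if the transaction committed, False if it was aborted."""
    try:
        with conn:  # commit on success, rollback on exception
            if not liquid(conn, debit_account_no, amount):
                raise ValueError("insufficient funds")
            conn.execute(
                "INSERT INTO accounting VALUES (?, ?, ?)",
                (debit_account_no, credit_account_no, amount),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_no = ?",
                (amount, debit_account_no),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_no = ?",
                (amount, credit_account_no),
            )
        return True
    except ValueError:
        return False


first = transfer(conn, 1, 2, 60)   # account 1 holds 100, so this commits
second = transfer(conn, 1, 2, 60)  # only 40 left, so this aborts
balances = dict(conn.execute("SELECT account_no, balance FROM account"))
bookings = conn.execute("SELECT COUNT(*) FROM accounting").fetchone()[0]
```

The transaction touches three small rows and finishes at once, which matches the classical profile: few small objects, short duration, no cooperation between users.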

NEWER REQUIREMENTS FOR DBMS CONCEPTS – Comparing transactions

Transactions in for example CAD:
• Large (complexly structured) entities are processed in a complexly structured way
• Long duration (hours/days/weeks)
• Much cooperation between users

Examples:
• Machine design, architecture, cooperative/collaborative work

CLASSICAL REQUIREMENTS FOR DBMS CONCEPTS – Integrity constraints

Integrity constraints:
• Database states consist of simply structured, small entities
• State transitions by transactions or generic operations
• Constraints for arbitrary database states (assertions)
• Constraints for specific database states (triggers)

Examples (rarely supported today)

• Assertion

  CREATE ASSERTION no_overdraw ON balance B, account A
    (B.credit >= A.limit) AND (B.account_no = A.account_no)

• Trigger

  CREATE TRIGGER AFTER UPDATE OF accounting
    ASSERT no_overdraw
    REACTION printf ("amount not available at this moment!");
             ABORT TRANSACTION;

[Figure: assertion – ARBITRARY STATE TRANSITION, CONDITION on balance, account; trigger – update of accounting, CONDITION on accounting, abort]
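The trigger idea can be approximated in SQLite, which has no CREATE ASSERTION but does support triggers that abort the offending statement. The following is a sketch only: the columns `balance` and `credit_limit`, and the rule "amount may not exceed balance plus credit limit", are assumptions made for the example. The message text is the one from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (
        account_no   INTEGER PRIMARY KEY,
        balance      INTEGER NOT NULL,
        credit_limit INTEGER NOT NULL
    );
    CREATE TABLE accounting (debit INTEGER, credit INTEGER, amount INTEGER);

    -- Reject a booking whose amount exceeds what the debit account can cover.
    CREATE TRIGGER no_overdraw
    BEFORE INSERT ON accounting
    BEGIN
        SELECT RAISE(ABORT, 'amount not available at this moment!')
        WHERE NEW.amount > (SELECT balance + credit_limit
                            FROM account
                            WHERE account_no = NEW.debit);
    END;

    INSERT INTO account VALUES (1, 100, 50);
    INSERT INTO account VALUES (2, 0, 0);
    """
)

# 120 is within balance 100 plus credit limit 50: accepted.
conn.execute("INSERT INTO accounting VALUES (1, 2, 120)")

# 500 is not: the trigger aborts the statement.
try:
    conn.execute("INSERT INTO accounting VALUES (1, 2, 500)")
    rejected = False
    message = ""
except sqlite3.DatabaseError as exc:
    rejected = True
    message = str(exc)

accepted_bookings = conn.execute("SELECT COUNT(*) FROM accounting").fetchone()[0]
```

The condition is attached to one specific event (an insert into accounting), which is what distinguishes a trigger from an assertion that must hold in arbitrary database states.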

NEWER REQUIREMENTS FOR DBMS CONCEPTS – Comparing integrity constraints

Newer integrity constraints:
• Database states consist of complexly structured, large entities
• State transitions by transactions and generic operations and arbitrary events or sequences of events (active systems)
• Conditions for arbitrary sequences of database states

Example: Cadastre data

• No “dead” space between parcels
• No overlapping parcels

[Figure: cadastral map with the parcels P1 to P11 along a ROAD]
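To show why such constraints go beyond simple attribute checks, here is a simplified sketch in Python. It assumes that parcels are axis-aligned rectangles inside a rectangular region, which real cadastre data is not; all names are hypothetical. Under that assumption, "no dead space" reduces to comparing the summed parcel area with the region's area once overlaps are excluded.

```python
from itertools import combinations
from typing import NamedTuple


class Parcel(NamedTuple):
    name: str
    x1: int  # lower-left corner
    y1: int
    x2: int  # upper-right corner
    y2: int


def area(p):
    return (p.x2 - p.x1) * (p.y2 - p.y1)


def overlap(p, q):
    """True if the interiors of two rectangles intersect."""
    return p.x2 > q.x1 and q.x2 > p.x1 and p.y2 > q.y1 and q.y2 > p.y1


def overlapping_pairs(parcels):
    return [(p.name, q.name) for p, q in combinations(parcels, 2) if overlap(p, q)]


def has_dead_space(parcels, region):
    """Valid only if all parcels lie inside `region` and none overlap."""
    return area(region) > sum(area(p) for p in parcels)


region = Parcel("region", 0, 0, 4, 2)

good = [Parcel("P1", 0, 0, 2, 2), Parcel("P2", 2, 0, 4, 2)]
overlapping = [Parcel("P1", 0, 0, 2, 2), Parcel("P2", 1, 0, 3, 2)]
gappy = [Parcel("P1", 0, 0, 2, 2), Parcel("P2", 2, 0, 3, 2)]

good_ok = not overlapping_pairs(good) and not has_dead_space(good, region)
overlap_found = overlapping_pairs(overlapping)
gap_found = has_dead_space(gappy, region)
```

Both conditions relate every parcel to every other parcel, so they cannot be expressed as a constraint on a single row or attribute.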

TECHNICAL ENVIRONMENTS – The classical technical environment

Processor level
• CISC CPUs

Machine level
• "dumb" terminals
• scalable hosts (primary, secondary storage, and processors)
• relatively powerful personal computers

Operating system/network level
• heterogeneous operating systems for different machine configurations
• personal computers as host terminals (host-based DBMS can hardly use the capacity of personal computers)

TECHNICAL ENVIRONMENTS – The newer technical environment

Processor level
• RISC CPUs
• Dedicated modules for image and signal processing, communication, arithmetic, pattern recognition and storage management

Machine level
• Powerful workstations (often multiprocessor systems)
• Specialized servers (e.g., DBS machines, symbolic machines, file servers etc.)

Operating system / network level
• More homogeneous operating systems (UNIX) in spite of many different hardware producers
• Client/server configurations of dedicated machines (distribution aspect)
• Very fast networks (FDDI)

CONCLUSIONS – Re-stating…

• Existing DBMS technology can be improved
• New requirements of new application domains have to be fulfilled
• New technical environment

Which leads to:
DBMS technology with new or enhanced concepts

With support for new (and more complex) applications

Which provides better runtime, design, implementation, and maintenance efficiency

OUR FOCUS – The themes we will take up…

[Figure: overview of the themes; only the label “newer” is recoverable]

CURRENT STATE OF DBMS’ – #1

OLTP (on-line transaction processing) applications
• Large amounts of data
• Simple data, simple queries and updates
  • Update statement from debit/credit transaction:

    UPDATE accounts
    SET abalance = abalance + :delta
    WHERE aid = :aid;

• Typically update intensive
• Large number of concurrent users (transactions)

Data warehousing applications
• Large amounts of data
• Simple data but complex querying
• Typically read intensive
• Large number of users

Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999
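The debit/credit update statement uses the host variables `:delta` and `:aid`. As a minimal sketch, Python's sqlite3 module accepts the same named-placeholder syntax, with the values supplied as a dictionary. The sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (aid INTEGER PRIMARY KEY, abalance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500), (2, 300)])

# The statement from the slide, with named placeholders.
UPDATE_BALANCE = "UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid"

# One short transaction: debit account 1 and credit account 2.
with conn:
    conn.execute(UPDATE_BALANCE, {"delta": -120, "aid": 1})
    conn.execute(UPDATE_BALANCE, {"delta": 120, "aid": 2})

balances = dict(conn.execute("SELECT aid, abalance FROM accounts"))
```

Each execution touches a single row through its primary key, which is why OLTP workloads are described as simple but update intensive.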

CURRENT STATE OF DBMS’ – #2

These applications require
• Support for large number of users/transactions
• High performance
• High availability (7x24 operations)
• Scalability
• High levels of security
• Administrative support
• Good utilities

Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999

INTERNET APPLICATIONS – Challenges #1

Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999

INTERNET APPLICATIONS – Challenges #2

Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999

INTERNET APPLICATIONS – Challenges #3

Availability
• Need near 100% availability
• Must be easy to manage
• Replication, hot standby, foolproof system?

Scalability
• Number of users is orders of magnitude higher

Security
• Global users
• Managing millions of users
• Encryption
• Performance

Internet user expectations
• Speed vs. correctness
• Availability vs. correctness

Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999

INTERNET APPLICATIONS – Today’s architecture

Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999

EXAMPLE – CNN Custom News, characteristics #1

CNN CUSTOM NEWS characteristics

• On-line news service

• Allows users to customize news in a personalized manner

• Offers variety of news items (e.g. national, international, business etc.)

Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999

EXAMPLE – CNN Custom News application architecture

Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999

[Figure: three tiers. CLIENT TIER: BROWSERs, connected over HTTP. PHYSICAL MIDDLE TIER: HARDWARE LOAD BALANCING in front of three machines, each with a WEB SERVER and an APPLICATION SERVER. DATA SOURCES: ORACLE DBMS instances (OPS)]

M. Naci Akkøk, Fall 2004 Page 48Department of Informatics, University of Oslo, NorwayINF5100 – Advanced Database Systems

EXAMPLE EXAMPLE –– CNN Custom News, characteristics #2CNN Custom News, characteristics #2

Back-end
• SUN SOLARIS enterprise servers
• Oracle Parallel Server 7.3.4

Middle-Tier (9 Machines)
• Web Servers
• Oracle Application Servers
• PL/SQL Cartridges

Load Balancing
• Hardware based
• DNS router
• Round-robin
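
The round-robin policy named above can be illustrated with a minimal Python sketch. It is only an illustration of the policy, not the hardware or DNS mechanism CNN used; the class name and the host names `mid-tier-1` to `mid-tier-9` are assumptions (the slide only says there are 9 middle-tier machines).

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order (illustrative only)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def next_server(self):
        # Each call returns the next server; after the last one, wrap around.
        return next(self._rotation)


# Hypothetical host names for the 9 middle-tier machines.
balancer = RoundRobinBalancer([f"mid-tier-{i}" for i in range(1, 10)])

# Eleven incoming requests: the tenth wraps back to the first machine.
assignments = [balancer.next_server() for _ in range(11)]
```

Round-robin ignores the actual load on each machine, which is why it is cheap enough to implement in a router.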


EXAMPLE – CNN Custom News, characteristics #3

[Figure: Oracle Application Server. Labels: adapter, CORBA back-end, three cartridges.]


EXAMPLE – CNN Custom News, characteristics #4

• Data feeds into the database

• Keeps text in the database

• Images in files

• Images accessed in the middle-tier

• PL/SQL Cartridge


CASE STUDIES – Observations

• Database is being used mostly for storage

• Application in the middle-tier

• Middle-tier also provides:
  • scalability
  • load balancing
  • large number of users


CASE STUDIES – Analyzing internet applications #1

THEMES

• Web integration

• Web publishing

• Application integration

• e-commerce


CASE STUDIES – Analyzing internet applications #2

WEB INTEGRATION

• Heterogeneous data sources

• Heterogeneous data types

• 1000s of data sources

• Dynamic data

• Warehousing


CASE STUDIES – Analyzing internet applications #3

WEB PUBLISHING

• Problem: Internet placing new requirements on content management

• Heterogeneity: access different types of content from browsers e.g. Email, data warehouses, reports, HTML files

• Personalized: structured, dynamic, customized content

• Transactive: content blending with application

• Aggregation: portalization via major “gateways”


CASE STUDIES – Analyzing internet applications #4

APPLICATION INTEGRATION

• Integrating multiple applications (e.g. ERP/Front Office)
• Application workflow specification
• Asynchronous communication
• Queuing and propagation
• Message tracking
• Message warehouse (persistence)
• Message broker/server
• Data transformation
• Transforming messages to different application formats (e.g. SAP, CLARIFY, …)
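
To make the broker, queuing, warehouse and transformation bullets concrete, here is a minimal in-memory sketch in Python. Everything in it is an assumption for illustration: the class `MessageBroker`, the two target formats, and the field names are hypothetical and do not correspond to real SAP or CLARIFY message formats.

```python
from collections import deque


def to_erp(msg):
    # Hypothetical ERP-style record format.
    return {"ORDER_NO": msg["order_id"], "QTY": msg["quantity"]}


def to_front_office(msg):
    # Hypothetical front-office text format.
    return f"order={msg['order_id']};qty={msg['quantity']}"


class MessageBroker:
    """Queues one transformed copy of each message per registered application."""

    def __init__(self):
        self.transformers = {}
        self.queues = {}
        self.warehouse = []  # stand-in for a persistent message warehouse

    def register(self, app, transformer):
        self.transformers[app] = transformer
        self.queues[app] = deque()

    def publish(self, msg):
        self.warehouse.append(msg)  # message tracking: keep the original
        for app, transform in self.transformers.items():
            self.queues[app].append(transform(msg))  # asynchronous hand-off

    def receive(self, app):
        queue = self.queues[app]
        return queue.popleft() if queue else None


broker = MessageBroker()
broker.register("erp", to_erp)
broker.register("front_office", to_front_office)
broker.publish({"order_id": 7, "quantity": 3})

erp_msg = broker.receive("erp")
front_msg = broker.receive("front_office")
```

The sender never waits for a receiver: each application drains its own queue when it is ready, which is the point of asynchronous communication.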


CASE STUDIES – Analyzing internet applications #5

e-COMMERCE

• Automating business-to-business, business-to-consumer interactions

• Selling and buying
• Order management
• Product catalogs
• Product configuration
• Sales and marketing
• Education and training
• Service
• Communities


CASE STUDIES – Uses of DB technology #1

Business/workflow transactions
• Support across multiple database/ERP systems
• Transactional
• Tools to generate compensating actions
• Transformations

Queuing
• Support for heterogeneous messages
• Transactional
• Querying, e.g. on attribute, value pairs
• Indexing, e.g. on attribute, value pairs
• Publish/subscribe
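
The queuing requirements (heterogeneous messages, querying on attribute, value pairs, publish/subscribe) can be sketched in a few lines of Python. This is a toy under stated assumptions: the class `AttributeQueue` and the message fields are hypothetical, messages are plain dictionaries, and transactional behaviour and indexing are left out.

```python
class AttributeQueue:
    """Stores heterogeneous messages and matches them on attribute/value pairs."""

    def __init__(self):
        self._subscriptions = []  # list of (criteria, callback)
        self._messages = []

    @staticmethod
    def _matches(message, criteria):
        # A message matches when it carries every requested attribute/value pair.
        return all(k in message and message[k] == v for k, v in criteria.items())

    def subscribe(self, criteria, callback):
        self._subscriptions.append((dict(criteria), callback))

    def publish(self, message):
        self._messages.append(message)
        for criteria, callback in self._subscriptions:
            if self._matches(message, criteria):
                callback(message)

    def query(self, **criteria):
        return [m for m in self._messages if self._matches(m, criteria)]


queue = AttributeQueue()
received = []
queue.subscribe({"type": "order"}, received.append)

# Messages need not share a schema.
queue.publish({"type": "order", "region": "EU", "amount": 120})
queue.publish({"type": "invoice", "region": "EU"})

eu_messages = queue.query(region="EU")
```

A real queuing system would replace the linear scan in `query` with an index on the attribute, value pairs, which is where database technology comes in.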


CASE STUDIES – Uses of DB technology #2

Rule engines
• Complex business processing rules
• Customization/profiling rules
• Business domain rules
• Presentation rules

Repositories for Application Development
• Managing Java objects, interfaces, etc.
• Must for application integration
• Standardized object models and protocols
• Directories vs repositories


CASE STUDIES – Uses of DB technology #3

XML support
• XML schema/storage
• XML caching
• XML querying
• Coexistence with SQL – current efforts seem disjoint

Multiple caches
• Consistency of middle-tier and database caches

Data mining
• Algorithms need to become more pragmatic


CASE STUDIES – Uses of DB technology #4

Internet user expectations
• Speed vs. correctness (e.g. search engines vs. blade/cartridge/extender)
• Availability vs. correctness

Component Architecture
• Caching
• XML support
• Querying
• Transactions
• Rule engines
• Metadata management
• Queuing


CASE STUDIES – Uses of DB technology #5

Availability
• Need near 100% availability
• Must be easy to manage
• Replication, hot standby, foolproof system?

Scalability
• Number of users is orders of magnitude higher

Security
• Global users
• Managing millions of users
• Encryption
• Performance


INTERNET APPLICATIONS – A “modern” architecture

[Figure: a “modern” architecture. Tiers: client tier, logical middle tier, data sources. Components shown: browsers; web/app server; XML integration and query server; data-warehouse server; XML transformer and gateway; XML-enabled ORDBS; OLE/DB data source; XML database; XML-enabled authoring tools etc.; XML-enabled application messages; XML documents on the Web; other documents on the Web, like HTML and Word. The connections are labelled XML.]


CASE STUDIES – XML in the database arena #1

XML has the potential to impact four important markets

• Web integration

• Web publishing

• Application integration

• e-commerce

HINT: XML-enable the DBMS!


CASE STUDIES – XML in the database arena #2


XML-enabling the database system means:

• Storing XML data/documents in the database server

• Querying and searching of structured and unstructured XML

• Generating XML data from the database server

• Adding XML capabilities in supporting database facilities


CASE STUDIES – XML in the database arena #3

STORE XML DATA!

• Enhance XML storage facilities in the database with support in utilities
• Facilities to load XML data into the database
• Provide more efficient database storage (componentized storage, compression, indexing, …)
• XML export facilities from the server
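
One way to read "componentized storage" with indexing is to shred an XML document into one row per text-bearing element. The Python sketch below does this with the standard `sqlite3` and `xml.etree.ElementTree` modules. The table name `xml_node`, its columns, and the sample document are assumptions made for illustration; this is not how any particular DBMS stores XML.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_xml(conn, doc_name, xml_text):
    """Shred an XML document into (doc, path, value) rows; returns the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS xml_node (doc TEXT, path TEXT, value TEXT)"
    )
    # Index on the element path so path lookups avoid a full scan.
    conn.execute("CREATE INDEX IF NOT EXISTS xml_node_path ON xml_node (path)")

    rows = []

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        text = (elem.text or "").strip()
        if text:
            rows.append((doc_name, path, text))
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    conn.executemany("INSERT INTO xml_node VALUES (?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
sample = (
    "<news><item><title>Markets rise</title>"
    "<section>business</section></item></news>"
)
stored = load_xml(conn, "feed-1", sample)

title = conn.execute(
    "SELECT value FROM xml_node WHERE path = ?", ("/news/item/title",)
).fetchone()[0]
```

Shredding loses attribute values and document order in this simplified form; a fuller design would store both.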


CASE STUDIES – XML in the database arena #4

SEARCH AND QUERY XML DATA!

• Search XML data efficiently
• Special SQL queries over structured + unstructured XML
• Content-based indexing (e.g. text indexes) for searching XML data efficiently
• Support for XML query languages (e.g. XQL) on XML data
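
The difference between structured querying and content-based (text) indexing can be shown with a small Python sketch. XQL is not available in the Python standard library, so the structured query below uses the limited XPath subset of `xml.etree.ElementTree` as a stand-in. The sample document and the helper `build_text_index` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<news>"
    '<item section="business"><title>Markets rise</title>'
    "<body>Stocks gained on Monday.</body></item>"
    '<item section="sports"><title>Cup final</title>'
    "<body>The final ended in a draw.</body></item>"
    "</news>"
)

# Structured query: select items by attribute value, then read a child element.
business_titles = [
    item.findtext("title")
    for item in doc.findall("./item[@section='business']")
]


def build_text_index(root):
    """Tiny inverted index: word -> set of positions of the items containing it."""
    index = {}
    for position, item in enumerate(root.findall("item")):
        words = " ".join(item.itertext()).lower().split()
        for word in words:
            index.setdefault(word.strip(".,"), set()).add(position)
    return index


# Unstructured query: look a word up without knowing which element holds it.
text_index = build_text_index(doc)
items_mentioning_final = text_index.get("final", set())
```

The structured query needs knowledge of the schema (`item`, `section`, `title`), while the text index answers keyword lookups over whatever the elements contain.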


CASE STUDIES – XML in the database arena #5

GENERATE XML!

• Generate XML from the database server
• Map SQL92, SQL3 and PL/SQL datatypes to XML
• Provide mappings between Java, SQL and XML types
• Script XML content from the database
• Allow SQL queries to return XML results
• Provide embedded XML in stored procedures
• Java scripting: support embedded XML in Java
• Common APIs to access any XML content in databases
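
"Allow SQL queries to return XML results" can be approximated outside the server with a short Python sketch that wraps a query result in XML. The function `query_to_xml`, the table `news`, and the tag names are assumptions for illustration. The sketch also assumes that column names are valid XML element names, and it maps every SQL value to text rather than to typed XML.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(conn, sql, params=(), root_tag="rows", row_tag="row"):
    """Run a query and return its result set as an XML string."""
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for record in cursor:
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, record):
            cell = ET.SubElement(row, name)
            # Simplest possible type mapping: everything becomes text.
            cell.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE news (id INTEGER, title TEXT)")
conn.execute("INSERT INTO news VALUES (1, 'Markets rise')")

xml_out = query_to_xml(
    conn, "SELECT id, title FROM news ORDER BY id",
    root_tag="news", row_tag="item",
)
```

Doing the same inside the DBMS would avoid shipping rows to the middle tier first, which is the motivation for generating XML from the database server.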


CASE STUDIES – XML in the database arena #6

PROVIDE XML CACHING!

• Need to temporarily cache it, index it, update the cached copy, transact it

• Need to query XML caches

• Also requires a store for managing it in the middle-tier

• Provides XML logical views


MODERN DBMS – Architecture #1

DBMS architecture for Internet applications:

• Monolithic architecture
  • Enhance the DBMS with all the features necessary for supporting internet applications

• Component architecture
  • Provide components for supporting internet applications
  • Components can reside in the DBMS or in the middle-tier


MODERN DBMS – Architecture #2

The MONOLITHIC approach:

+ Database is the platform
+ Leverage DBMS infrastructure
+ Uniform management

- Not flexible
- Forces 2-tier architecture
- May not be suitable for high-end configurations
- Not suitable for heterogeneous application integration


MODERN DBMS – Architecture #3

The COMPONENT approach:

+ Flexible
+ Accommodates multi-tier architecture: components can be deployed in the middle or database tier
+ Facilitates heterogeneous integration of applications

- Need to manage multiple components


DBMS – Staying “modern” #1

LOOKING AHEAD:

• Database technology has a lot to offer for building internet applications!

• Componentized databases maybe?


DBMS – Staying “modern” #2

… AND LOOKING BEYOND DATA:

• Database technology has a lot to offer, not only to Internet applications but to any other kind of application where data needs to be represented for a purpose, stored, retrieved, distributed, exchanged, re-integrated etc.

• What about unstructured data? Take a look at the Web-site of (for example) FAST Search and Transfer (http://www.fastsearch.com/). What are they selling?

• What about information or knowledge (as contrasted with data)? How does one store and retrieve information? How does one represent information, i.e., what does an “information-base” look like as contrasted with a “data-base”?

• What does a “Semantic” Web require?