Introduction to Database Systems, CSE 444 (courses.cs.washington.edu/courses/cse444/06au/...)
Introduction to Database Systems
CSE 444
Lecture #1
September 27, 2006
About Me
Dan Suciu:
• Joined the department in 2000
• Before that: Bell Labs, AT&T Labs
Research:
• Past: XML and semi-structured data:
– Query language: XML-QL (later XQuery)
– Compressor: XMill
– Theory: XPath containment, XML typechecking
• Present: Probabilistic databases: MystiQ
Staff
• Instructor: Dan Suciu
– Allen, Room 662, [email protected]
– Office hours: Wednesdays 11:30 (appointment strongly recommended)
• TAs:
– Jue Wang, [email protected]
– Office hours: Fridays 1:00-2:00, Room TBA
Communications
• Web page: http://www.cs.washington.edu/444/
– Lectures will be available here
– Homeworks will be posted here (HW1 is posted)
– The project description will be here
• Mailing list:
– Announcements, group discussions
– Please subscribe
Textbook(s)
Main textbook, available at the bookstore:
• Database Systems: The Complete Book, by Hector Garcia-Molina, Jeffrey Ullman, Jennifer Widom
Most chapters are good. Some are not (functional dependencies).
COME TO CLASS ! ASK QUESTIONS ! READ SLIDES !
Other Texts
Available at the Engineering Library (not on reserve):
• Database Management Systems, Ramakrishnan
• XQuery from the Experts, Katz, Ed.
• Fundamentals of Database Systems, Elmasri, Navathe
• Foundations of Databases, Abiteboul, Hull, Vianu
• Data on the Web, Abiteboul, Buneman, Suciu
Outline of Today’s Lecture
1. Overview of DBMS
2. DBMS through an example
3. Course outline
4. Assignment 1, Homework 1
Database
What is a database ?
• A collection of files storing related data
Give examples of databases
• Accounts database; payroll database; UW’s students database; Amazon’s products database; airline reservation database
Database Management System
What is a DBMS ?
• A big C program written by someone else that allows us to manage a large database efficiently and allows it to persist over long periods of time
Give examples of DBMSs
• DB2 (IBM), SQL Server (MS), Oracle, Sybase
• MySQL, Postgres, …
Market Shares
From www.computerworld.com (2004):
• IBM: 35% of the market, with $2.5BN in sales
• Oracle: 33% of the market, with $2.3BN in sales
• Microsoft: 19% of the market, with $1.3BN in sales
An Example
The Internet Movie Database
http://www.imdb.com
• Entities: Actors (800k), Movies (400k), Directors, …
• Relationships: who played where, who directed what, …
Want to store and process locally; what functions do we need ?
What the Database System Does
1. Create/store large datasets
2. Search/query/update
3. Change the structure
4. Concurrent access by many users
5. Recover from crashes
6. Security (not here, but in other apps)
Possible Organizations
• Files
• Spreadsheets
• DBMS
1. Create/store Large Datasets
• Files: Yes, but…
• Spreadsheets: Not really…
• DBMS: Yes
2. Search/Query/Update
• Simple query:
– In what year was ‘Rain man’ produced ?
• Multi-table query:
– Find all movies by ‘Coppola’
• Complex query:
– For each actor, count her/his movies
• Updating
– Insert a new movie; add an actor to a movie; etc
2. Search/Query/Update
• Files: Simple queries
• Spreadsheets: Multi-table queries (maybe)
• DBMS: All
Updates: generally OK
3. Change the Structure
Add Address to each Actor
• Files: Very hard
• Spreadsheets: Yes
• DBMS: Yes
4. Concurrent Access
Multiple users access/update the data concurrently
• What can go wrong ?
– Lost update; resulting in inconsistent data
• How do we protect against that in OS ?
– Locks
• This is insufficient in databases; why ?
– A logical action consists of multiple updates
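The lost update can be illustrated with a small simulation. This is only a sketch: the counter name, the two clients "A" and "B", and the step format are assumptions made for the example, not part of the lecture.

```python
# Hypothetical shared data item; the starting value is illustrative only.
database = {"movie_count": 400_000}

def run(schedule):
    """Execute (client, action) steps in order.

    Each client reads the shared value into a private copy, modifies the
    copy, and writes it back, which is the read-modify-write pattern that
    makes lost updates possible.
    """
    db = dict(database)
    local = {}
    for client, action in schedule:
        if action == "read":
            local[client] = db["movie_count"]
        elif action == "add":
            local[client] += 1
        elif action == "write":
            db["movie_count"] = local[client]
    return db["movie_count"]

# Serial schedule: A finishes before B starts.
serial = [("A", "read"), ("A", "add"), ("A", "write"),
          ("B", "read"), ("B", "add"), ("B", "write")]

# Interleaved schedule: both clients read before either one writes.
interleaved = [("A", "read"), ("B", "read"), ("A", "add"),
               ("B", "add"), ("A", "write"), ("B", "write")]

serial_result = run(serial)            # both increments survive
interleaved_result = run(interleaved)  # B overwrites A's write
print(serial_result, interleaved_result)  # 400002 400001
```

In the interleaved schedule one of the two increments disappears, even though each individual read and write behaved correctly.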
5. Recover from crashes
• Transfer $100 from account #4662 to #7199:
X = Read(Account, #4662);
X.amount = X.amount - 100;
Write(Account, #4662, X);
Y = Read(Account, #7199);
Y.amount = Y.amount + 100;
Write(Account, #7199, Y);
CRASH !
What is the problem ?
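One way to see the problem is to simulate a crash that strikes after the first write but before the second. The sketch below assumes starting balances of 500 and 200, and models the crash as a Python exception; both are illustrative choices, not details from the slide.

```python
class Crash(Exception):
    """Stands in for the machine failing partway through the transfer."""

def transfer(accounts, src, dst, amount, crash_after_debit=False):
    # Debit the source account and write it back.
    x = accounts[src]
    x = x - amount
    accounts[src] = x
    if crash_after_debit:
        raise Crash("failure between the two writes")
    # Credit the destination account and write it back.
    y = accounts[dst]
    y = y + amount
    accounts[dst] = y

accounts = {4662: 500, 7199: 200}   # hypothetical balances
total_before = sum(accounts.values())

try:
    transfer(accounts, 4662, 7199, 100, crash_after_debit=True)
except Crash:
    pass

total_after = sum(accounts.values())
print(accounts, total_before, total_after)
# {4662: 400, 7199: 200} 700 600
```

The debit reached storage but the credit never happened, so $100 has vanished and the data is inconsistent.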
Enter a DBMS
Applications -> connection (ODBC, JDBC) -> Database server (someone else’s C program) -> Data files
“Two tier system” or “client-server”
DBMS = Collection of Tables
Still implemented as files, but behind the scenes can be quite complex
“data independence”

Directors:
id      fName          lName
15901   Francis Ford   Coppola
. . .

Movies:
mid      Title           Year
130128   The Godfather   1972
. . .

Movie_Directors:
id      mid
15901   130128
. . .
1. Create/store Large Datasets
Use SQL to create and populate tables:
CREATE TABLE Actors (
  Name CHAR(30),
  DateOfBirth CHAR(20)
) . . .
INSERT INTO Actors
VALUES('Tom Hanks', . . .)
Size and physical organization are handled by the DBMS
We focus on modeling the database
Will study data modeling in this course
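To try the statements without a server, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The course itself uses SQL Server; SQLite and the date string format are assumptions made only to keep the example self-contained.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Same two columns as on the slide.
conn.execute("CREATE TABLE Actors (Name CHAR(30), DateOfBirth CHAR(20))")

# Parameter placeholders (?) avoid quoting problems in string values.
# The date format is an illustrative choice.
conn.execute("INSERT INTO Actors VALUES (?, ?)", ("Tom Hanks", "1956-07-09"))

rows = conn.execute("SELECT Name, DateOfBirth FROM Actors").fetchall()
print(rows)  # [('Tom Hanks', '1956-07-09')]
```

Note that the code never says where or how the rows are stored; that is the part the DBMS handles.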
2. Searching/Querying/Updating
• Find all movies by ‘Coppola’
SELECT title
FROM Movies, Directors, Movie_Directors
WHERE Directors.lname = 'Coppola' and
      Movies.mid = Movie_Directors.mid and
      Movie_Directors.id = Directors.id
We will study SQL in gory detail in this course
• What happens behind the scenes ?
We will discuss the query optimizer in class.
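The query can be run against the three sample rows from the "Collection of Tables" slide. This sketch again assumes SQLite through Python's sqlite3 module rather than the course's SQL Server, and the column types are guesses; `EXPLAIN QUERY PLAN` is SQLite-specific and is used here only to peek at what the optimizer decided.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Directors (id INTEGER, fName TEXT, lName TEXT);
CREATE TABLE Movies (mid INTEGER, Title TEXT, Year INTEGER);
CREATE TABLE Movie_Directors (id INTEGER, mid INTEGER);
INSERT INTO Directors VALUES (15901, 'Francis Ford', 'Coppola');
INSERT INTO Movies VALUES (130128, 'The Godfather', 1972);
INSERT INTO Movie_Directors VALUES (15901, 130128);
""")

query = """
SELECT Title
FROM Movies, Directors, Movie_Directors
WHERE Directors.lName = 'Coppola' AND
      Movies.mid = Movie_Directors.mid AND
      Movie_Directors.id = Directors.id
"""

titles = [row[0] for row in conn.execute(query)]
print(titles)  # ['The Godfather']

# Ask SQLite how it plans to evaluate the join (output varies by version).
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)
```

The query states what is wanted, not how to compute it; the plan printed at the end is the system's own choice of join order and access method.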
3. Changing the Structure
Add Address to each Actor
ALTER TABLE Actors
ADD address CHAR(50)
DEFAULT 'unknown'
Lots of cleverness goes on behind the scenes
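A small sketch of the schema change, assuming SQLite via Python's sqlite3 module (SQLite spells the clause `ADD COLUMN`; the sample row is illustrative). It shows that rows inserted before the change pick up the default value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Actors (Name CHAR(30), DateOfBirth CHAR(20))")
conn.execute("INSERT INTO Actors VALUES ('Tom Hanks', '1956-07-09')")

# Change the structure after data already exists.
conn.execute(
    "ALTER TABLE Actors ADD COLUMN address CHAR(50) DEFAULT 'unknown'"
)

rows = conn.execute("SELECT Name, address FROM Actors").fetchall()
print(rows)  # [('Tom Hanks', 'unknown')]
```

No existing row had to be rewritten by hand, which is what makes this hard with plain files and easy with a DBMS.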
4 & 5. Concurrency & Recovery: Transactions
• A transaction = sequence of statements that either all succeed, or all fail
• E.g. Transfer $100
BEGIN TRANSACTION;
UPDATE Accounts
SET amount = amount - 100
WHERE number = 4662;
UPDATE Accounts
SET amount = amount + 100
WHERE number = 7199;
COMMIT;
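The all-or-nothing behavior can be tried out with the sketch below. It assumes SQLite through Python's sqlite3 module, hypothetical starting balances of 500 and 200, and a failure simulated with an exception; a real crash is handled by the recovery log rather than by application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (number INTEGER PRIMARY KEY, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?)", [(4662, 500), (7199, 200)]
)
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    # "with conn" commits if the block completes and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE Accounts SET amount = amount - ? WHERE number = ?",
            (amount, src),
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE Accounts SET amount = amount + ? WHERE number = ?",
            (amount, dst),
        )

def balances(conn):
    return dict(conn.execute("SELECT number, amount FROM Accounts"))

try:
    transfer(conn, 4662, 7199, 100, fail_midway=True)
except RuntimeError:
    pass
after_failure = balances(conn)   # the debit was rolled back

transfer(conn, 4662, 7199, 100)
after_commit = balances(conn)    # both updates applied together

print(after_failure)  # {4662: 500, 7199: 200}
print(after_commit)   # {4662: 400, 7199: 300}
```

Either both updates become visible or neither does, so the total across the two accounts never changes.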
Transactions
• Transactions have the ACID properties:
A = atomicity
C = consistency
I = isolation
D = durability
4. Concurrent Access
• Serializable execution of transactions
– The I (=isolation) in ACID
We study three techniques in this course
Locks
Timestamps
Validation
5. Recovery from crashes
• Every transaction either executes
completely, or doesn’t execute at all
– The A (=atomicity) in ACID
We study three types of log files in this course
Undo log file
Redo log file
Undo/Redo log file
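As a rough illustration of the undo-log idea, the sketch below records the old value of each item before it is overwritten, and on recovery restores the old values written by any transaction that has no commit record. The record format, the transaction names, and the balances are assumptions for the example; the scheme taught in the course has more rules than shown here.

```python
def recover(log, disk):
    """Undo, newest first, every write by a transaction that never committed."""
    committed = {record[1] for record in log if record[0] == "COMMIT"}
    for record in reversed(log):
        if record[0] == "WRITE" and record[1] not in committed:
            _, txn, key, old_value = record
            disk[key] = old_value   # restore the value from before the write
    return disk

# Hypothetical record shapes:
#   ("START", txn), ("WRITE", txn, key, old_value), ("COMMIT", txn)
log = [
    ("START", "T1"),
    ("WRITE", "T1", 4662, 500),   # old balance, logged before the debit
    # crash here: the credit to 7199 and the COMMIT record never happened
]

disk_after_crash = {4662: 400, 7199: 200}
recovered = recover(log, dict(disk_after_crash))
print(recovered)  # {4662: 500, 7199: 200}

# A committed transaction is left alone.
committed_log = [
    ("START", "T2"),
    ("WRITE", "T2", 4662, 500),
    ("WRITE", "T2", 7199, 200),
    ("COMMIT", "T2"),
]
kept = recover(committed_log, {4662: 400, 7199: 300})
print(kept)  # {4662: 400, 7199: 300}
```

After recovery the half-finished transfer looks as if it never started, which is the atomicity guarantee.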
Course Outline
Part I
• SQL, Relational model, database design
• XML, XPath, XQuery
• Midterm: Friday, October 27 (in class)
Part II
• Database security, Transactions
• Concurrency control and recovery
• Query execution and optimization
Final: Monday, December 11, 8:30-10:20 (this room)
Grading
• Homework: 30%
• Project: 25%
• Midterm: 15%
• Final: 25%
• Intangibles: 5%
The Project
• Models data management needs of a company
• Will have four phases
• We use SQL Server, C#, .NET
• First phase: handed out next week
Assignment 1, Homework 1
• Reading assignment for Friday:
– Introduction from SQL for Web Nerds, by Philip Greenspun, http://philip.greenspun.com/sql/
• Log in to SQL Server
– User name = your U email address
– Password = "studentID" + "!A"
• Homework 1: due Wednesday, October 11
http://www.cs.washington.edu/education/courses/cse444/CurrentQtr/hw/index.htm
Accessing SQL Server
SQL Server Management Studio
• Server Type = Database Engine
• Server Name = IISQLSRV
• Authentication = SQL Server Authentication
– Login = your email address
– Password = 11111111
Change your password !!
Then play with IMDB