01 intro
CPS 216: Advanced Database Systems
Shivnath Babu
Fall 2006
Outline for Today
• What this class is about: Data management
• What we will cover in this class
• Logistics
Data Management
[Figure: applications send queries to a DataBase Management System (DBMS), which manages the data]
Example: At a Company
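Employee

| ID | Name | DeptID | Salary | … |
|---|---|---|---|---|
| 10 | Nemo | 12 | 120K | … |
| 20 | Dory | 156 | 79K | … |
| 40 | Gill | 89 | 76K | … |
| 52 | Ray | 34 | 85K | … |
| … | … | … | … | … |

Department

| ID | Name | … |
|---|---|---|
| 12 | IT | … |
| 34 | Accounts | … |
| 89 | HR | … |
| 156 | Marketing | … |
| … | … | … |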
Query 1: Is there an employee named “Nemo”?
Query 2: What is “Nemo’s” salary?
Query 3: How many departments are there in the company?
Query 4: How many employees have Salary >= 80K?
Query 5: What is the name of “Nemo’s” department?
Query 6: How many employees are there in the “Accounts” department?
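As a rough sketch, the six queries above can be written in SQL and run against the two tables. This uses Python's built-in `sqlite3` module as a stand-in DBMS. Only the rows shown on the slide are loaded (the "…" rows are left out). The `SalaryK` column (salary in thousands) and the helper `one` are assumptions made for the example, not part of the slides.

```python
import sqlite3

# In-memory database holding only the rows visible on the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (
        ID INTEGER PRIMARY KEY,
        Name TEXT,
        DeptID INTEGER REFERENCES Department(ID),
        SalaryK INTEGER  -- assumption: salary stored in thousands (120K -> 120)
    );
    INSERT INTO Department VALUES
        (12, 'IT'), (34, 'Accounts'), (89, 'HR'), (156, 'Marketing');
    INSERT INTO Employee VALUES
        (10, 'Nemo', 12, 120), (20, 'Dory', 156, 79),
        (40, 'Gill', 89, 76), (52, 'Ray', 34, 85);
""")

def one(sql, *params):
    """Hypothetical helper: return the first column of the first result row."""
    return conn.execute(sql, params).fetchone()[0]

# Query 1: is there an employee named "Nemo"? (1 = yes, 0 = no)
q1 = one("SELECT EXISTS (SELECT 1 FROM Employee WHERE Name = ?)", "Nemo")
# Query 2: Nemo's salary, in thousands
q2 = one("SELECT SalaryK FROM Employee WHERE Name = ?", "Nemo")
# Query 3: number of departments
q3 = one("SELECT COUNT(*) FROM Department")
# Query 4: number of employees with Salary >= 80K
q4 = one("SELECT COUNT(*) FROM Employee WHERE SalaryK >= 80")
# Query 5: name of Nemo's department (needs a join)
q5 = one("""SELECT d.Name FROM Employee e
            JOIN Department d ON e.DeptID = d.ID
            WHERE e.Name = ?""", "Nemo")
# Query 6: number of employees in the "Accounts" department (also a join)
q6 = one("""SELECT COUNT(*) FROM Employee e
            JOIN Department d ON e.DeptID = d.ID
            WHERE d.Name = ?""", "Accounts")

print(q1, q2, q3, q4, q5, q6)
```

Queries 1 to 4 touch a single table, while Queries 5 and 6 need both tables. In every case the query states what is wanted and leaves how to compute it to the DBMS.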
DataBase Management System (DBMS)
[Figure: a high-level query Q enters the DBMS, which sits on top of the data and returns an answer]
• Translates Q into the best execution plan for current conditions, then runs the plan
Example: Store that Sells Cars
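Cars

| Make | Model | OwnerID |
|---|---|---|
| Honda | Accord | 12 |
| Toyota | Camry | 34 |
| Mini | Cooper | 89 |
| Honda | Accord | 156 |
| … | … | … |

Owners

| ID | Name | Age |
|---|---|---|
| 12 | Nemo | 22 |
| 34 | Ray | 42 |
| 89 | Gill | 36 |
| 156 | Dory | 21 |
| … | … | … |

Query: owners of Honda Accords who are <= 23 years old

Execution plan (bottom to top):
• Filter (Make = Honda and Model = Accord) on Cars
• Join (Cars.OwnerID = Owners.ID)
• Filter (Age <= 23)

Resulting rows:

| Make | Model | OwnerID | ID | Name | Age |
|---|---|---|---|---|---|
| Honda | Accord | 12 | 12 | Nemo | 22 |
| Honda | Accord | 156 | 156 | Dory | 21 |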
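The filter, join, filter plan above can be sketched as three small operators applied in order. This is only an illustration: the function names `filter_rows` and `hash_join` are made up for the example, and the slide does not say which join algorithm a real DBMS would choose. A hash join is used here simply because it is short.

```python
# Rows shown on the slide (the "…" rows are left out).
cars = [
    {"Make": "Honda", "Model": "Accord", "OwnerID": 12},
    {"Make": "Toyota", "Model": "Camry", "OwnerID": 34},
    {"Make": "Mini", "Model": "Cooper", "OwnerID": 89},
    {"Make": "Honda", "Model": "Accord", "OwnerID": 156},
]
owners = [
    {"ID": 12, "Name": "Nemo", "Age": 22},
    {"ID": 34, "Name": "Ray", "Age": 42},
    {"ID": 89, "Name": "Gill", "Age": 36},
    {"ID": 156, "Name": "Dory", "Age": 21},
]

def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]

def hash_join(left, right, left_key, right_key):
    """Equi-join: build a hash table on `right`, then probe it with `left`."""
    table = {}
    for row in right:
        table.setdefault(row[right_key], []).append(row)
    return [{**l, **r} for l in left for r in table.get(l[left_key], [])]

# Step 1: Filter (Make = Honda and Model = Accord)
accords = filter_rows(cars, lambda r: r["Make"] == "Honda" and r["Model"] == "Accord")
# Step 2: Join (Cars.OwnerID = Owners.ID)
joined = hash_join(accords, owners, "OwnerID", "ID")
# Step 3: Filter (Age <= 23)
young_owners = filter_rows(joined, lambda r: r["Age"] <= 23)

names = [row["Name"] for row in young_owners]
print(names)
```

Filtering Cars before the join keeps the join input small. Choosing among orderings like this is the kind of decision the DBMS makes when it picks an execution plan.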
DataBase Management System (DBMS)
[Figure: a high-level query Q enters the DBMS, which sits on top of the data and returns an answer]
• Translates Q into the best execution plan for current conditions, then runs the plan
• Keeps data safe and correct despite failures, concurrent updates, online processing, etc.
DBMS is multi-user
• Example
    Get account balance from database;
    If balance > amount of withdrawal then
        balance = balance - amount of withdrawal;
        dispense cash;
        store new balance into database;
• Homer at ATM1 withdraws $100
• Marge at ATM2 withdraws $50
• Initial balance = $400, final balance = ?
  – Should be $250 no matter who goes first
Final balance = $300
Homer withdraws $100:
    read balance; $400
    if balance > amount then balance = balance - amount; $300
    write balance; $300
Marge withdraws $50:
    read balance; $400
    if balance > amount then balance = balance - amount; $350
    write balance; $350
(Both read $400 before either writes. Homer's write lands last, so Marge's withdrawal is lost.)
Final balance = $350
Homer withdraws $100:
    read balance; $400
    if balance > amount then balance = balance - amount; $300
    write balance; $300
Marge withdraws $50:
    read balance; $400
    if balance > amount then balance = balance - amount; $350
    write balance; $350
(Both read $400 before either writes. Marge's write lands last, so Homer's withdrawal is lost.)
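The two bad outcomes above can be reproduced deterministically with a small simulation. This is a sketch under stated assumptions: a plain dictionary stands in for the database, and the function names `run_interleaved` and `withdraw` are made up for the example. The second half shows one simple remedy, a lock that makes each read-modify-write run as a unit. It is not meant to suggest that a real DBMS works exactly this way.

```python
import threading

def run_interleaved(initial, homer_amount, marge_amount, last_writer):
    """Replay the bad schedule: both ATMs read, both compute, then both write.

    `last_writer` ("Homer" or "Marge") decides whose write reaches the database last.
    """
    db = {"balance": initial}
    homer_local = db["balance"]              # Homer: read balance
    marge_local = db["balance"]              # Marge: read balance
    if homer_local > homer_amount:
        homer_local -= homer_amount          # Homer: compute new balance
    if marge_local > marge_amount:
        marge_local -= marge_amount          # Marge: compute new balance
    writes = [("Homer", homer_local), ("Marge", marge_local)]
    writes.sort(key=lambda w: w[0] == last_writer)  # chosen writer goes last
    for _, value in writes:
        db["balance"] = value                # write balance
    return db["balance"]

lost_marge = run_interleaved(400, 100, 50, last_writer="Homer")   # Marge's update lost
lost_homer = run_interleaved(400, 100, 50, last_writer="Marge")   # Homer's update lost

def withdraw(db, lock, amount):
    """Withdrawal whose read, check, and write all happen while holding the lock."""
    with lock:
        balance = db["balance"]
        if balance > amount:
            balance -= amount
        db["balance"] = balance

db = {"balance": 400}
lock = threading.Lock()
threads = [threading.Thread(target=withdraw, args=(db, lock, amt)) for amt in (100, 50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
serial_final = db["balance"]

print(lost_marge, lost_homer, serial_final)
```

With the lock, the two withdrawals behave as if they ran one after the other, so the result is the same regardless of which thread goes first.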
Concurrency control in DBMS
• Similar to concurrent programming problems
  – But data is not all in main-memory
• Appears similar to file system concurrent access?
  – Approach taken by MySQL initially; now MySQL offers better alternatives
• But want to control at much finer granularity
• Or else one withdrawal would lock up all accounts!
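To illustrate the granularity point above, here is a minimal sketch in which each account has its own lock, so a withdrawal in progress on one account does not block a withdrawal on another. The class `LockManager`, the account names "A" and "B", and their balances are all hypothetical, and real DBMS lock managers are considerably more elaborate.

```python
import threading
from collections import defaultdict

class LockManager:
    """Hypothetical lock table: one lock per account instead of one global lock."""
    def __init__(self):
        self._guard = threading.Lock()           # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, account_id):
        with self._guard:
            return self._locks[account_id]

accounts = {"A": 400, "B": 900}                  # assumed example balances
locks = LockManager()

def withdraw(account_id, amount):
    """Lock only the account being updated."""
    with locks.lock_for(account_id):
        if accounts[account_id] > amount:
            accounts[account_id] -= amount

# Simulate a withdrawal on account A that is still in progress.
lock_a = locks.lock_for("A")
lock_a.acquire()

# A withdrawal on account B can still proceed, because it needs a different lock.
t = threading.Thread(target=withdraw, args=("B", 100))
t.start()
t.join(timeout=5)
b_finished_while_a_locked = not t.is_alive()

lock_a.release()
withdraw("A", 100)
print(b_finished_while_a_locked, accounts)
```

With a single lock covering the whole table or file, the withdrawal on "B" would have had to wait until the lock on "A" was released.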
Recovery in DBMS
• Example: balance transfer
    decrement the balance of account X by $100;
    increment the balance of account Y by $100;
• Scenario 1: Power goes out after the first instruction
• Scenario 2: DBMS buffers and updates data in memory (for efficiency); before they are written back to disk, power goes out
• Log updates; undo/redo during recovery
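The idea of logging updates and then undoing or redoing them during recovery can be sketched in a few lines. This is a toy model under stated assumptions: a dictionary stands in for the disk, log records are tuples, the starting balances ($500 and $200) are invented, and the log itself is assumed to survive the crash. Production recovery protocols handle many more cases.

```python
def recover(disk, log):
    """Bring `disk` to a consistent state using an undo/redo log.

    Log records are ("UPDATE", txn, key, old_value, new_value) or ("COMMIT", txn).
    """
    committed = {txn for kind, txn, *_ in log if kind == "COMMIT"}
    # Redo: reapply updates of committed transactions, oldest first.
    for kind, txn, *rest in log:
        if kind == "UPDATE" and txn in committed:
            key, _old, new = rest
            disk[key] = new
    # Undo: roll back updates of uncommitted transactions, newest first.
    for kind, txn, *rest in reversed(log):
        if kind == "UPDATE" and txn not in committed:
            key, old, _new = rest
            disk[key] = old
    return disk

# Scenario 1: the decrement of X reached disk, then power went out.
# The transfer never committed, so recovery undoes the decrement.
disk1 = {"X": 400, "Y": 200}
log1 = [("UPDATE", "T1", "X", 500, 400)]
after1 = recover(disk1, log1)

# Scenario 2: the transfer committed, but its updates were only in memory.
# The disk still holds the old values, so recovery redoes both updates.
disk2 = {"X": 500, "Y": 200}
log2 = [
    ("UPDATE", "T1", "X", 500, 400),
    ("UPDATE", "T1", "Y", 200, 300),
    ("COMMIT", "T1"),
]
after2 = recover(disk2, log2)

print(after1, after2)
```

In both scenarios the total across X and Y is $700 after recovery: the transfer either happened completely or not at all.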
Summary of modern DBMS features
• Persistent storage of data
• Logical data model; declarative queries and updates → physical data independence
• Multi-user concurrent access
• Safety from system failures
• Performance, performance, performance
  – Massive amounts of data (terabytes ~ petabytes)
  – High throughput (thousands ~ millions transactions per minute)
  – High availability (≥ 99.999% uptime)
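Physical data independence can be seen in a small experiment: the same declarative query returns the same answer before and after the physical layout changes, while the plan the DBMS picks is free to differ. The sketch below uses Python's built-in `sqlite3` module as a stand-in DBMS. The table contents, the `SalaryK` column, the index name `idx_employee_name`, and the helper `answer_and_plan` are assumptions for the example, and the exact wording of the plan text varies across SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT, SalaryK INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(10, "Nemo", 120), (20, "Dory", 79), (40, "Gill", 76), (52, "Ray", 85)],
)

QUERY = "SELECT SalaryK FROM Employee WHERE Name = ?"

def answer_and_plan():
    """Run QUERY for 'Nemo' and also ask SQLite how it would execute it."""
    answer = conn.execute(QUERY, ("Nemo",)).fetchall()
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable description.
    plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("Nemo",))]
    return answer, plan

answer_before, plan_before = answer_and_plan()

# Change the physical design only; the query text is untouched.
conn.execute("CREATE INDEX idx_employee_name ON Employee (Name)")

answer_after, plan_after = answer_and_plan()

print(answer_before, plan_before)
print(answer_after, plan_after)
```

The application never mentions the index. It only states what it wants, and the DBMS decides whether to scan the table or use the index.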
Modern DBMS Architecture
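[Figure: layered DBMS architecture. Applications send SQL to the DBMS. Inside the DBMS, the Parser produces a logical query plan, the Query Optimizer turns it into a physical query plan, the Query Executor issues access method API calls, and the Storage Manager sits at the bottom. Below the DBMS are the OS and the Disk(s), reached through file system API calls and storage system API calls.]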
Course Outline
• 50% of the class is about core DBMS concepts
  – Query execution, query optimization, transactions, recovery, etc.
  – Textbook material
• 50% of the class is on “what is happening today in data management”
  – Data streams
  – Web search – Google, Yahoo!
  – XML and data integration
  – Data mining
  – Sensor data management
Using a Traditional DBMS
[Figure: a Loader loads data into tables (Table R, Table S, …); the User/Application sends queries (Query, …) and receives results (Result, …)]
New Approach for Data Streams
[Figure: the User/Application registers a Continuous Query (Standing Query) with a Stream Query Processor, which reads the input streams and returns results]
Example Continuous (Standing) Queries
• Web
  – Amazon’s best sellers over last hour
• Network Intrusion Detection
  – Track HTTP packets with destination address matching a prefix in given table and content matching “*\.ida”
• Finance
  – Monitor NASDAQ stocks between $20 and $200 that have moved down more than 2% in the last 20 minutes
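As an illustration of a standing query, the finance example above can be sketched as a generator that consumes a stream of price ticks and emits alerts as they occur. Several things here are assumptions: the tick format `(time_in_seconds, symbol, price)`, the function name `monitor_drops`, the made-up symbols "AAA" and "BBB", and the reading of "moved down more than 2%" as "current price is more than 2% below the highest price seen in the last 20 minutes". A stream system might define the window differently.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 20 * 60  # "the last 20 minutes"

def monitor_drops(ticks, low=20.0, high=200.0, drop=0.02, window=WINDOW_SECONDS):
    """Continuously scan `ticks` and yield (time, symbol, price) alerts.

    An alert fires when a stock priced within [low, high] is more than `drop`
    below its highest price within the last `window` seconds.
    """
    recent = defaultdict(deque)  # symbol -> (time, price) pairs inside the window
    for t, symbol, price in ticks:
        history = recent[symbol]
        history.append((t, price))
        # Evict ticks that have fallen out of the sliding window.
        while history and history[0][0] < t - window:
            history.popleft()
        peak = max(p for _, p in history)
        if low <= price <= high and price < peak * (1 - drop):
            yield (t, symbol, price)

# Hypothetical input stream, ordered by time.
stream = [
    (0, "AAA", 100.0),
    (300, "AAA", 99.0),     # down 1%: no alert
    (600, "AAA", 97.5),     # down 2.5% from the peak of 100: alert
    (700, "BBB", 10.0),
    (900, "BBB", 9.0),      # down 10%, but outside the $20 to $200 range
    (2000, "AAA", 97.0),    # earlier AAA ticks have left the window: no alert
]

alerts = list(monitor_drops(stream))
print(alerts)
```

Unlike a one-shot query over stored tables, the query is registered once and results are produced as new data arrives. Only a bounded window of recent ticks is kept per symbol.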
New Challenges in DBMSs
[Figure: a high-level query Q enters the DBMS, which returns an answer; the data now ranges from TeraBytes to PetaBytes and includes XML such as the following]
<CD>
  <TITLE>Empire B.</TITLE>
  <ARTIST>Bob Dylan</ARTIST>
  <COUNTRY>USA</COUNTRY>
  <COMPANY>Columbia</COMPANY>
  <PRICE>10.90</PRICE>
</CD>
Course Logistics
• Recommended reference: Database Systems: The Complete Book, by H. Garcia-Molina, J. D. Ullman, and J. Widom
• Web site: http://www.cs.duke.edu/education/courses/fall06/cps216
• Grading:
  – Homework Assignments 15%
  – Project 25%
  – Midterm 25%
  – Final 35%
Summary: Data Management is Important
• Core aspect of most sciences and engineering today
• Core need in industry
• Cool mix of theory and systems
• Chances are you will find something interesting even if your primary interest is elsewhere