Slide 1: Invalidation Clues for Database Scalability Services
@ Carnegie Mellon Databases
Amit Manjhi (Carnegie Mellon University; Buxfer, Inc.)
Phillip B. Gibbons (Intel Research Pittsburgh)
Anastassia Ailamaki (Carnegie Mellon University)
Charles Garrod (Carnegie Mellon University)
Bruce M. Maggs (Carnegie Mellon University; Akamai Technologies)
Todd C. Mowry (Carnegie Mellon University; Intel Research Pittsburgh)
Christopher Olston (Yahoo! Research; Carnegie Mellon University)
Anthony Tomasic (Carnegie Mellon University)
Haifeng Yu (National University of Singapore)
Slide 2: Typical Architecture of Dynamic Web Applications
[Diagram: users send a request over the Internet to the home server (Web server, app server, DB), which executes code, accesses the DB, and returns a response.]
Dynamic Web applications need to provision for variable and unpredictable load.
Slide 3: Content Delivery Networks
[Diagram: Users, Internet, CDN nodes]
• Scales central web server
• Works well for static content
Slide 4: CDN Application Services
[Diagram: Users, Internet, CDN nodes]
Database server is still a bottleneck.
Slide 5: Database Scalability Service (DBSS) Architecture
[Diagram: Users, Internet]
User queries answered from DB cache.
How to guarantee privacy of data?
Slide 6: Privacy concerns dictate that:
[Diagram: Users, Internet]
• Home server maintains master copy and handles updates directly
• DBSS is provided encrypted data
  • Cache base tables: does not work
  • Cache query results: invalidate on updates
Slide 7: A Simple Example
Table: comments (id, rating, story)
Q: SELECT id FROM comments WHERE story='Wintel' AND rating>0
U: UPDATE comments SET rating=2 WHERE id=15
Home server database before U: (11, 1, Wintel), (15, 1, Wintel)
Home server database after U: (11, 1, Wintel), (15, 2, Wintel)
DBSS node: cache starts empty; after Q it holds the result Q: id=11,15
• Nothing is encrypted: no invalidations
• Results are encrypted: invalidate
More encryption can lead to more invalidations.
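The two scenarios in this example can be sketched as a small decision function. This is only an illustration under stated assumptions, not the authors' implementation: the name `must_invalidate` and its arguments are hypothetical, and the logic is hard-wired to the query and update shown on the slide (an update that changes only the rating).

```python
from typing import Optional, Set

def must_invalidate(rating_threshold: int,
                    result_ids: Optional[Set[int]],
                    updated_id: int,
                    new_rating: int) -> bool:
    """Decide whether a cached result of
         SELECT id FROM comments WHERE story=? AND rating>?
       must be dropped after
         UPDATE comments SET rating=? WHERE id=?

    result_ids is None when the result is encrypted, i.e. the DBSS
    cannot read which ids it contains (hypothetical convention).
    """
    if result_ids is None:
        # Encrypted result: whether updated_id is in the result is unknown,
        # so a change cannot be ruled out and the entry is invalidated.
        return True
    in_result = updated_id in result_ids
    qualifies_now = new_rating > rating_threshold
    # Unchanged if the row was in the result and still passes the rating
    # test, or was absent and still fails it. Otherwise it may have changed.
    return in_result != qualifies_now

plain = must_invalidate(0, {11, 15}, updated_id=15, new_rating=2)
encrypted = must_invalidate(0, None, updated_id=15, new_rating=2)
print(plain, encrypted)  # False True
```

With nothing encrypted, the DBSS sees that id 15 is already in the result and still satisfies rating>0, so the cached entry survives; with an encrypted result it has to invalidate.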
Slide 8: Privacy-Scalability Space for Query Result Caching
[Chart: scalability (vertical axis) versus privacy (horizontal axis), with points labeled No, Prior, and Full]
• No: no encryption
• Prior: encrypt data not useful for invalidation (our prior work, SIGMOD 2006)
• Full: encrypt everything (maximum privacy, read-only scalability)
Want solutions in this space.
Slide 9: Our Approach: Invalidation Clues
[Diagram: the home server (with the database) answers each query from the DBSS with a result plus a query clue, and forwards each update with an update clue. The DBSS cache starts empty and fills with (query, result, query clue) entries. Invalidations are decided from (query clue, update clue).]
Invalidation clues offer a more general, flexible framework:
• Limit unnecessary invalidation
• Limit revealed information
• Limit home server overhead
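The flow on this slide can be sketched as a DBSS-side cache that stores opaque results next to their query clues and decides invalidations purely from the pair (query clue, update clue). The class `CluedCache`, its method names, and the choice of the story name as the clue are assumptions made for illustration; the slides do not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple

@dataclass
class CluedCache:
    """DBSS-side cache: opaque (e.g. encrypted) results plus query clues."""
    # Predicate over (query clue, update clue); True means "invalidate".
    should_invalidate: Callable[[Any, Any], bool]
    entries: Dict[str, Tuple[bytes, Any]] = field(default_factory=dict)

    def store(self, key: str, opaque_result: bytes, query_clue: Any) -> None:
        self.entries[key] = (opaque_result, query_clue)

    def on_update(self, update_clue: Any) -> int:
        """Drop every entry the predicate flags; return how many were dropped."""
        doomed = [k for k, (_, qc) in self.entries.items()
                  if self.should_invalidate(qc, update_clue)]
        for k in doomed:
            del self.entries[k]
        return len(doomed)

# Hypothetical clue choice: both clues are simply the story name.
cache = CluedCache(should_invalidate=lambda qc, uc: qc == uc)
cache.store("q1", b"<ciphertext>", query_clue="Wintel")
cache.store("q2", b"<ciphertext>", query_clue="Linux")
dropped = cache.on_update(update_clue="Wintel")
print(dropped, sorted(cache.entries))  # 1 ['q2']
```

Keeping the predicate pluggable mirrors the point of the slide: what the clues contain, and how much they reveal, can be varied without changing the cache.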
Slide 10: Example Bulletin-Board Application
Query template: SELECT id FROM comments WHERE story=? AND rating>?
Update template: UPDATE comments SET rating=? WHERE id=? (example parameters: rating=2, id=5)
Invalidation clues enable more precise invalidations than the "No" encryption scenario:
1. Extra invalidation in the no-encryption scenario: every cached result with rating_param<2 and no id=5 in the result is invalidated, because the DBSS does not know the story of comment 5.
2. Example clue: the story of the comment being updated (update clue).
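A rough sketch of why the story clue helps: the function below applies the same invalidation rule with and without an update clue and counts the affected cache entries. The type `CachedResult`, the function `invalidate`, the sample cache contents, and the assumption that comment 5 belongs to story "Linux" are all hypothetical.

```python
from typing import List, NamedTuple, Optional, Set

class CachedResult(NamedTuple):
    story: str          # query parameter for story=?
    rating_param: int   # query parameter for rating>?
    ids: Set[int]       # ids in the cached (unencrypted) result

def invalidate(entry: CachedResult, updated_id: int, new_rating: int,
               update_story: Optional[str] = None) -> bool:
    in_result = updated_id in entry.ids
    qualifies = new_rating > entry.rating_param
    if in_result:
        return not qualifies      # row leaves only if it stops qualifying
    if not qualifies:
        return False              # row was absent and stays absent
    if update_story is None:
        return True               # story unknown: it might enter this result
    return update_story == entry.story  # it enters only if the story matches

cache: List[CachedResult] = [
    CachedResult("Wintel", 0, {11, 15}),
    CachedResult("Linux", 1, {7}),
    CachedResult("Mac", 0, set()),
]
# UPDATE comments SET rating=2 WHERE id=5; comment 5 is assumed to be on "Linux".
without_clue = sum(invalidate(e, 5, 2) for e in cache)
with_clue = sum(invalidate(e, 5, 2, update_story="Linux") for e in cache)
print(without_clue, with_clue)  # 3 1
```

All three entries have rating_param<2 and lack id=5, so without the clue every one is dropped; with the clue only the entry for the matching story is.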
Slide 11: Privacy-Scalability Space for Query Result Caching
[Chart: scalability (vertical axis) versus privacy (horizontal axis), with points labeled No, Prior, Full, and Database]
• No: no encryption
• Prior: encrypt data not useful for invalidation (our prior work, SIGMOD 2006)
• Full: encrypt everything (maximum privacy, read-only scalability)
• Database: (code-analysis privacy, maximum scalability)
Want solutions in this space.
Clues offer fine-grained tradeoff.
Slide 12: Outline
• Introduction to invalidation clues framework
• Improving scalability in the clues framework
• Improving privacy in the clues framework
• Evaluation results
• Related work and summary
Slide 13: Improving Scalability in the Clues Framework
Fewer invalidations mean more scalability.
What is the "most precise" invalidation that can be done?
As a first cut, the Database Inspection Strategy: invalidate as if using the database.
Extra data (database clues) can be attached either to query results (query result clue) or to updates (update clue).
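One way to read "invalidate as if using the database" is: a cached result should be invalidated exactly when the update changes what the query returns. The sketch below checks that directly against an in-memory SQLite database. It is an idealized reference point under that reading, not the mechanism from the paper; the helper `changes_result` and the schema types are assumptions.

```python
import sqlite3

QUERY = "SELECT id FROM comments WHERE story=? AND rating>? ORDER BY id"
UPDATE = "UPDATE comments SET rating=? WHERE id=?"

def changes_result(conn: sqlite3.Connection, query_params, update_params) -> bool:
    """Apply the update and report whether the query result changed.

    Note: the update is really executed, so the table is modified.
    """
    before = conn.execute(QUERY, query_params).fetchall()
    conn.execute(UPDATE, update_params)
    after = conn.execute(QUERY, query_params).fetchall()
    return before != after

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, rating INTEGER, story TEXT)")
conn.executemany("INSERT INTO comments VALUES (?, ?, ?)",
                 [(11, 1, "Wintel"), (15, 1, "Wintel")])

unchanged = changes_result(conn, ("Wintel", 0), (2, 15))  # rating 1 -> 2, still > 0
changed = changes_result(conn, ("Wintel", 0), (0, 11))    # rating 1 -> 0, drops out
print(unchanged, changed)  # False True
```

Database clues aim to give the DBSS enough extra data to approach this precision without holding the database itself.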
Slide 14: Database Clues and Beyond
SELECT id FROM comments WHERE story=? AND rating>?
UPDATE comments SET rating=? WHERE id=?
• Query clue: story of ALL comments (auxiliary view: id, story)
• Update clue: story of the comment being updated (on-the-fly)
1. Consistency
2. Privacy
Still better: Opportunistic Strategy, which uses database clues only when the benefit exceeds the overhead.
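The opportunistic rule can be sketched as a simple cost comparison. The slide does not say how benefit and overhead are measured, so the cost model here (avoided cache misses versus clue computation time) and every name and number in it are assumptions for illustration only.

```python
def use_database_clue(avoided_invalidations: float,
                      miss_cost_ms: float,
                      clue_cost_ms: float) -> bool:
    """Attach a database clue only when its expected benefit exceeds its overhead.

    benefit  = expected needless invalidations avoided * cost of one cache miss
    overhead = home-server cost of computing and shipping the clue
    All three inputs are hypothetical estimates, not quantities from the slides.
    """
    benefit_ms = avoided_invalidations * miss_cost_ms
    return benefit_ms > clue_cost_ms

# A frequently cached template where the clue saves many misses ...
hot_template = use_database_clue(avoided_invalidations=12, miss_cost_ms=100, clue_cost_ms=40)
# ... versus one where it saves almost none.
cold_template = use_database_clue(avoided_invalidations=0.2, miss_cost_ms=100, clue_cost_ms=40)
print(hot_template, cold_template)  # True False
```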
Slide 15: Outline
• Introduction to invalidation clues framework
• Improving scalability in the clues framework
• Improving privacy in the clues framework
• Evaluation results
• Related work and summary
Slide 16: Attack Model of the DBSS
[Diagram: Users, Internet]
1. DBSS learns from query clues, update clues, and invalidations: ciphertext-only attack
2. DBSS can pose as a user: chosen-plaintext attack
Slide 17: Results on Improving Privacy
SELECT id FROM comments WHERE story=? AND rating>?
UPDATE comments SET rating=? WHERE id=?
Invalidation decision involves equality on id and story, and order comparison on rating.
Key idea: needless invalidations can improve privacy.
Extreme: if all query results are always invalidated, the DBSS can't distinguish between any two query results.
The paper has details on improving privacy for equality and order comparisons.
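To make the key idea concrete for an equality comparison on story, the sketch below replaces the story in each clue with a bucket number from a keyed hash. Stories that share a bucket cause needless invalidations, but the DBSS can no longer tell them apart. This bucketing scheme is an assumption chosen for illustration; the slides defer the actual techniques to the paper. The key, the story list, and the function names are hypothetical.

```python
import hashlib
import hmac

def story_clue(story: str, key: bytes, buckets: int) -> int:
    """Map a story to one of `buckets` values with a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, story.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % buckets

def should_invalidate(query_clue: int, update_clue: int) -> bool:
    # Equal buckets: the stories may be the same, so invalidate.
    return query_clue == update_clue

key = b"home-server-secret"  # hypothetical key, kept at the home server
stories = ["Wintel", "Linux", "Mac", "BSD"]

def invalidated(update_story: str, buckets: int) -> int:
    """Count cached queries (one per story) invalidated by a single update."""
    uc = story_clue(update_story, key, buckets)
    return sum(should_invalidate(story_clue(s, key, buckets), uc) for s in stories)

# One bucket is the extreme case: everything is invalidated and
# the clue distinguishes nothing.
print(invalidated("Wintel", buckets=1))  # 4
```

Raising the number of buckets moves toward precise invalidation (only the matching story, barring collisions) at the price of revealing more about which queries and updates concern the same story.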
Slide 18: Outline
• Introduction to invalidation clues framework
• Improving scalability in the clues framework
• Improving privacy in the clues framework
• Evaluation results
• Related work and summary
Slide 19: Benchmark Applications
Auction (RUBiS, from Rice)
Bulletin board (RUBBoS, from Rice)
Bookstore (TPC-W, from UW-Madison)
Slide 20: Evaluation Methodology
[Diagram: Home server, CDN and DBSS, Users; link latencies 5 ms and 100 ms]
Scalability: maximum number of concurrent users with acceptable response times.
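The scalability metric can be computed from load-test measurements roughly as follows. The function name, the sample measurements, and the 2-second threshold are hypothetical; the slides do not state what counts as an acceptable response time.

```python
from typing import Iterable, Tuple

def scalability(measurements: Iterable[Tuple[int, float]],
                max_response_s: float) -> int:
    """Largest tested user count with an acceptable response time,
    requiring every smaller tested load to be acceptable as well."""
    best = 0
    for users, response_s in sorted(measurements):
        if response_s > max_response_s:
            break  # first overloaded point: stop here
        best = users
    return best

# Hypothetical (concurrent users, response time in seconds) pairs.
runs = [(100, 0.4), (300, 0.9), (600, 1.8), (900, 6.5)]
supported = scalability(runs, max_response_s=2.0)
print(supported)  # 600
```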
Slide 21
[Bar chart: scalability (number of concurrent users supported, axis ticks 0 to 900) for the benchmark applications Auction, Bboard, and Bookstore, under four configurations: No clues; Clues (no DB clues); Clues (incl. DB clues); Opportunistic]
1. Clues help
2. Opportunistic has the best scalability
Slide 22: Related Work
Outsource database: [Hacigumus+ 2002], [Hacigumus+ 2002], [Agrawal+ 2004]
Outsource database scalability: DBCache [Luo+ 2002, Altinel+ 2003], DBProxy [Amiri+ 2003], NEC cache portal [Li+ 2003], MTCache [Larson+ 2004], [Manjhi+ 2006]
Slide 23: Related Work
View invalidation strategies: [Levy and Sagiv 1993], [Candan+ 2002], [Choi and Luo 2004]
Privacy: [Agrawal+ 2004], [Hore+ 2004], [Manjhi+ 2006]
Slide 24: Summary
• Invalidation clues: general framework for limiting
  • Unnecessary invalidation
  • Revealed information
  • Home server overhead
• Fine-grained tradeoff between privacy and scalability
• Database clues
  • Update clues better than query clues
  • Opportunistic use of database clues gives the best scalability
• Evaluation on three application benchmarks
Slide 25: Back-up slides