CryptDB: A Practical Encrypted Relational DBMS


CryptDB: A Practical Encrypted Relational DBMS

Raluca Ada Popa, Nickolai Zeldovich, and Hari Balakrishnan
MIT CSAIL

New England Database Summit 2011

Page 2: Problem: data leaks from DBs

• Hackers
• Curious DB administrators
• Physical attacks
• Both on public clouds and private data centers
• Regulatory laws

Page 3: Approach

Perform SQL query processing on encrypted data

[Diagram: user queries go to the client frontend, which talks to the database server]

Client frontend:
• Trusted
• Stores schema, master key
• No query execution

Database server:
• Stores the database and processes SQL queries
• Not trusted to keep data private

1. Support standard SQL queries on encrypted data
2. Process queries completely at the DB server
3. No change to existing DBMS

Page 4: Example

Application query:
SELECT * FROM emp WHERE salary = 100

Query issued by the frontend to the server:
SELECT * FROM table1 WHERE col1 = x5a8c34

[Figure: the plaintext table emp (columns rank, name, salary; salary values 60, 100, 800, 100) next to its encrypted form table1 at the server, whose cells hold ciphertexts such as x5a8c34, x638e54, and x922eb4. Equal plaintext values map to the same ciphertext, so the server can match x5a8c34 without seeing 100.]
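The rewriting step in this example can be sketched in Python. This is a minimal illustration, not CryptDB's code: `det_token`, `rewrite_and_run`, and the key are hypothetical names, and HMAC stands in for deterministic encryption (it is one-way, so it shows equality matching only, not decryption of results). SQLite plays the role of the untrusted server.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical frontend secret (assumption); the server never sees it.
KEY = b"demo-master-key"


def det_token(value: int) -> str:
    """Deterministic stand-in for DET: equal inputs always give equal tokens."""
    digest = hmac.new(KEY, str(value).encode(), hashlib.sha256).hexdigest()
    return "x" + digest[:16]


# The "server" stores only anonymized names (table1, col1) and tokens.
server = sqlite3.connect(":memory:")
server.execute("CREATE TABLE table1 (col1 TEXT)")
for salary in (60, 100, 800, 100):
    server.execute("INSERT INTO table1 VALUES (?)", (det_token(salary),))


def rewrite_and_run(salary: int) -> int:
    """Rewrite `SELECT * FROM emp WHERE salary = <salary>` and count the matches."""
    rows = server.execute(
        "SELECT * FROM table1 WHERE col1 = ?", (det_token(salary),)
    ).fetchall()
    return len(rows)


matches = rewrite_and_run(100)  # two of the four stored salaries equal 100
```

The server evaluates an ordinary equality predicate; only the frontend knows which plaintext the token stands for.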

Page 5: Two techniques

1. SQL-aware encryption strategy
   – Different encryption schemes provide different functionality
2. Adjustable query-based encryption
   – Adapt encryption of data based on user queries

Page 6: 1. SQL-aware encryption

Privacy: highest at the top of the table

Scheme   Operation   Details
RND      None        AES in UFE
HOM      +, *        e.g., Paillier
DET      equality    AES in CTR
SEARCH   ILIKE       Song et al. ’00
JOIN     join        new
OPE      order       Boldyreva et al. ’09

equality: e.g., =, !=, GROUP BY, IN, COUNT, DISTINCT
order: e.g., >, <, ORDER BY, SORT, MAX, MIN

first practical implementation
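To make the HOM row concrete, here is a toy Paillier sketch covering only the addition case: multiplying two ciphertexts yields an encryption of the sum of their plaintexts, so a server could total a salary column without decrypting it. The tiny primes and all function names are assumptions chosen for readability; this is not secure and is not CryptDB's implementation.

```python
import math
import random

# Toy key material (assumption): real deployments use primes of 1024+ bits.
P, Q = 1789, 1861
N = P * Q
N2 = N * N
G = N + 1                                           # standard simple generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)                                # inverse of LAM mod N (valid for G = N + 1)


def encrypt(m: int) -> int:
    """Randomized encryption: the same m gives different ciphertexts."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Recover m from c using the private values LAM and MU."""
    return ((pow(c, LAM, N2) - 1) // N) * MU % N


def add_encrypted(c1: int, c2: int) -> int:
    """Server-side addition: multiply ciphertexts to add plaintexts."""
    return (c1 * c2) % N2


salaries = [60, 100, 800, 100]
ciphertexts = [encrypt(s) for s in salaries]
encrypted_total = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_total = add_encrypted(encrypted_total, c)
total = decrypt(encrypted_total)  # only the key holder can run this step
```

The sum must stay below `N` for the result to be exact, which is why real keys are much larger than these toy values.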

Page 7: Onions of encryptions

Layers are listed from the outermost to the innermost:

Onion 1: RND, DET (SEARCH shown beside DET), JOIN, wrapping any value
Onion 2: RND, OPE, OPE-JOIN, wrapping any value
Onion 3: HOM, wrapping an int value

Each column has the same key in a given layer of an onion

Page 8: 2. Adjustable query-based encryption

• Start out the database with the most secure encryption scheme
• Adjust encryption dynamically
  – Strip off levels of the onions: the frontend gives the key to the server using a UDF
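The bookkeeping behind adjustable encryption can be sketched as a small state tracker in the frontend. This is an assumption-laden illustration: `ONIONS`, `NEEDED`, and `ColumnState` are hypothetical names, the layer order follows the onion slide, and SEARCH and HOM are left out for brevity.

```python
# Layers from outermost (most private) to innermost, following the onion slide.
ONIONS = {
    "onion1": ["RND", "DET", "JOIN"],
    "onion2": ["RND", "OPE", "OPE-JOIN"],
}

# Hypothetical mapping from the operation a query needs to (onion, layer).
NEEDED = {
    "equality": ("onion1", "DET"),
    "join": ("onion1", "JOIN"),
    "order": ("onion2", "OPE"),
}


class ColumnState:
    """Tracks, per onion, which layer is currently outermost for one column."""

    def __init__(self):
        # Every onion starts at index 0, its most secure layer.
        self.level = {onion: 0 for onion in ONIONS}

    def adjust_for(self, operation: str) -> list:
        """Return the layers to strip for this operation (empty if none)."""
        onion, layer = NEEDED[operation]
        target = ONIONS[onion].index(layer)
        stripped = ONIONS[onion][self.level[onion]:target]
        self.level[onion] = max(self.level[onion], target)
        return stripped


salary = ColumnState()
first = salary.adjust_for("equality")   # ['RND']: first equality query strips RND
second = salary.adjust_for("equality")  # []: nothing left to strip
third = salary.adjust_for("join")       # ['DET']: a join needs one more layer off
```

Each returned layer name would correspond to one key handed to the server; an empty list means the query runs with no decryption at the server.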

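Page 9: Example

Table emp (columns: rank, name, salary)

Application query:
SELECT * FROM emp WHERE salary = 100000

The frontend has the server strip the RND layer, leaving the column at DET:
UPDATE table1 SET col3onion1 = DecryptRND(key, col3onion1)

The frontend then issues the rewritten query:
SELECT * FROM table1 WHERE col3onion1 = x5a8c34

[Figure: Onion 1 (any value, JOIN, DET with SEARCH beside it, RND) with the RND layer removed]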
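The UPDATE in this example can be tried end to end with SQLite standing in for the server and a Python function registered as the `DecryptRND` UDF. Everything here is a sketch under stated assumptions: the hash-based XOR cipher is a toy with no real security, the keys are made up, and the real system would use the schemes listed on the SQL-aware encryption slide.

```python
import hashlib
import os
import sqlite3

DET_KEY = b"det-layer-key"   # hypothetical layer keys held by the frontend
RND_KEY = b"rnd-layer-key"


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 (illustration only, not secure)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def det_encrypt(key: bytes, data: bytes) -> bytes:
    """Deterministic layer: no nonce, so equal inputs give equal outputs."""
    return _xor(data, _keystream(key, b"", len(data)))


def rnd_encrypt(key: bytes, data: bytes) -> bytes:
    """Randomized layer: a fresh 16-byte IV is prepended to each value."""
    iv = os.urandom(16)
    return iv + _xor(data, _keystream(key, iv, len(data)))


def rnd_decrypt(key: bytes, blob: bytes) -> bytes:
    """Body of the DecryptRND UDF: removes the RND layer, leaving DET."""
    iv, body = blob[:16], blob[16:]
    return _xor(body, _keystream(key, iv, len(body)))


server = sqlite3.connect(":memory:")
server.create_function("DecryptRND", 2, rnd_decrypt)
server.execute("CREATE TABLE table1 (col3onion1 BLOB)")
for salary in (60000, 100000, 80000, 100000):
    det = det_encrypt(DET_KEY, str(salary).encode())
    server.execute("INSERT INTO table1 VALUES (?)", (rnd_encrypt(RND_KEY, det),))

probe = det_encrypt(DET_KEY, b"100000")
query = "SELECT COUNT(*) FROM table1 WHERE col3onion1 = ?"
before = server.execute(query, (probe,)).fetchone()[0]   # RND hides equality
server.execute("UPDATE table1 SET col3onion1 = DecryptRND(?, col3onion1)", (RND_KEY,))
after = server.execute(query, (probe,)).fetchone()[0]    # DET allows matching
```

Before the UPDATE the equality query finds nothing, because every stored value carries its own random IV; afterwards the two rows holding 100000 match the probe.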

Page 10: JOIN needs new crypto

Challenge: do not know which columns will be joined

[Figure: columns Col1 and Col2 at the server; the client frontend supplies the join key Col1-Col2]

Data items not revealed, cannot join without join key

Page 11: Further components

• Inserts, updates, deletes, nested queries
• Indexes
• Transactions, auto-increments
• Optimizations to speed up performance
• Not supported: A.a + A.b > B.c

Page 12: Steady State: no decryptions at server

Security converges…
… to maximum privacy for query mix
• Onion levels stripped only when new operations needed

Practical: typical SQL processing on enlarged tuples

Page 13: Privacy Guarantees

Formal privacy definition and proof

Implications:
• Never reveal plaintext
• Server cannot compute unrequested queries requiring new relationships

Example table emp (columns: rank, name, salary). If query has (and what that reveals about the column):
• equality predicate on name: repeats
• order predicate on name: order
• aggregation on salary: nothing
• no filter on a column: nothing

Page 14: Privacy (cont’d)

DB owner can specify minimum security level for some fields:

CREATE TABLE emp (SSN text ≥ DET, name text, …)
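A minimum security level can be read as a floor below which the frontend refuses to strip a column's onion. The check below is a hypothetical sketch limited to Onion 1; `ONION1` and `may_expose` are illustrative names, and how the real system stores or enforces the annotation is not described on the slide.

```python
# Onion 1 layers from most private to least private (per the onion slide).
ONION1 = ["RND", "DET", "JOIN"]


def may_expose(min_level: str, requested: str) -> bool:
    """True if exposing `requested` keeps the column at or above `min_level`."""
    return ONION1.index(requested) <= ONION1.index(min_level)


# For a column declared as `SSN text >= DET`:
ssn_equality_ok = may_expose("DET", "DET")   # equality lookups are allowed
ssn_join_ok = may_expose("DET", "JOIN")      # stripping down to JOIN is refused
```

Under this reading, a query that needs a weaker layer than the declared minimum would be rejected by the frontend rather than sent to the server.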

Page 15: Implementation

[Diagram: Query goes into the Frontend, which sends an Encrypted Query to the Server; the Server returns Encrypted Results, which the Frontend turns into Results. The Server is an unmodified DBMS with CryptDB PK tables and CryptDB UDFs (user-defined functions), accessed through the SQL interface.]

• No change to the DBMS
• Should work on most SQL DBMS

Page 16: Portability

• Ported CryptDB from Postgres to MySQL with 86 lines of code
• No change to MySQL
• The code that changed covers connecting to the server and the UDF declarations

Page 17: Low overhead on TPC-C

Throughput loss 27%

• Supports all queries in TPC-C without change

Page 18: Microbenchmarks from TPC-C

Page 19: Adjustable encryption

Steady state of columns for TPC-C:
• 71% of columns remain encrypted with RND

Importance of adjustable query-based encryption to privacy

In practice, we expect most sensitive fields to remain at RND or DET (e.g., credit cards)

Page 20: Related work

• Theoretical approaches [Gennaro et al., ’10]
  – Inefficient
• Search on encrypted data (e.g., [Chang, Mitzenmacher ’05], [Evdokimov, Guenther ’07])
  – Restricted set of queries, inefficient
• Systems proposals (e.g., [Hacigumus et al., ’02])
  – Lower degree of security, rewrite the DBMS, client-side processing

Page 21: Conclusions

CryptDB is the first practical DBMS for running most standard queries on encrypted data
– Runs queries completely at server
– Provides provable privacy guarantees
– Modest overhead
– Does not change the DBMS or client applications

Thanks!