Encrypted Search - csrc.nist.gov

44
Encrypted Search Seny Kamara

Transcript of Encrypted Search - csrc.nist.gov

Page 1: Encrypted Search - csrc.nist.gov

Encrypted SearchSeny Kamara

Page 2: Encrypted Search - csrc.nist.gov

2

4%

14,717,618,286*

* since 2013

Page 3: Encrypted Search - csrc.nist.gov

Why so Few?

3

“…because it would have hurt Yahoo’s ability to index and search message data…”

— J. Bonforte in NY Times

Page 4: Encrypted Search - csrc.nist.gov

Q: can we search on encrypted data?

4

Page 5: Encrypted Search - csrc.nist.gov

5

Can we? [SWP00]

O(#docs) [Goh03,CM05]

sec. defs [Goh03,CM05]

OPT time [CGKO06]

adaptive sec. defs [CGKO06]

dynamic in OPT time [KPR12,NPG14,CJJJKRS14]

forward private [SPS14,B16,…]

dual secure [AKM19]

I/O efficient [CJJJKRS14,CT14,…]

parallel [KPR13]

multi-user [CGKO06,JJKRS13,PPY18,…]

snapshot secure [AKM19]

graphs [CK10,MKNK15]

relational DBs [HILI02,KC05, PRZB11,KM19] beyond search

[CK10]

attacks [IKK12,CGPR15,ZKP16,BKM19] Boolean in sub-linear

[CJJJ+13,PKVK+14,KM17]

ranges [PBP16,…]

range attacks [NKW15,KKNO17,LMP18,…]

leakage suppression [KMO18,KM19]

Page 6: Encrypted Search - csrc.nist.gov

Interdisciplinary

6

Cryptography

Databases

Graph Algorithms

OptimizationStatistics

Information Retrieval

Data Structures

Distributed Systems

Page 7: Encrypted Search - csrc.nist.gov

Real-World Problem

• Major companies • Microsoft, SAP • Cisco, Google

Research • Hitachi, Fujitsu • more…

7

• Funding agencies • NSF • IARPA • DARPA

• Startups • too many to list

Page 8: Encrypted Search - csrc.nist.gov

Q: what about real-world customers?

8

Page 9: Encrypted Search - csrc.nist.gov

Is this Real?• Banks • Government agencies (US & Europe) • Fintech companies • Tech companies • Healthcare • Biotech • …

9

Page 10: Encrypted Search - csrc.nist.gov

Encrypted Search

10

Page 11: Encrypted Search - csrc.nist.gov

Encrypted Search• Sub-field focused on designing

• sub-linear algorithms over encrypted data • search engines & databases

• Searchable (symmetric) encryption (SSE) • keyword search over collection of encrypted files/documents • ElasticSearch, Lucene, …

• Encrypted databases (EDBs) • encrypted NoSQL & SQL (relational) databases • Postgres, SQL Server, MongoDB, CouchDB, …

11

Page 12: Encrypted Search - csrc.nist.gov

Encrypted Search (Building Blocks)

12

Property-Preserving Encryption (PPE)

Fully-Homomorphic Encryption (FHE)

Functional Encryption

Oblivious RAM (ORAM)

Structured Encryption (STE)Very leaky

Ω(n)

O(n)

Page 13: Encrypted Search - csrc.nist.gov

Efficiency

LeakageFunctionality

13

Page 14: Encrypted Search - csrc.nist.gov

Core Primitive: Structured Encryption• Schemes that • encrypt data structures (e.g., multi-maps, dictionaries, …) • support private queries on encrypted structures

• Applications • sub-linear searchable encryption (i.e., index-based SSE) • encrypted NoSQL & SQL databases • encrypted graph algorithms • secure multi-party computation

14

Page 15: Encrypted Search - csrc.nist.gov

Structured Encryption [Chase-K.10]

15

Setup(1k, DS) ⟶ (K, EDS)

Token(K, q) ⟶ tk

Query(EDS, tk) ⟶ ans

DS EDS

ans

tk

Page 16: Encrypted Search - csrc.nist.gov

Desiderata

16

Setup leakage

Query leakage

Size of EDS

Size of state

Size of tokenQuery time

ans

EDS

tk

Page 17: Encrypted Search - csrc.nist.gov

Structured Encryption [Chase-K.10]

• Many variants of STE • response-revealing • EDS query reveals answer in plaintext

• response-hiding • EDS query reveals encrypted answer

• non-interactive queries • clients sends single message called a token

• interactive queries • client and server execute multi-round protocol

17

Page 18: Encrypted Search - csrc.nist.gov

Background: Data Structures• Dictionaries map labels to values

• Put: DX[ℓ2] := v2 • Get: DX[ℓ2] returns v2

• Multi-Maps map labels to tuples

• Put: MM[ℓ3]:= (v2,v4) • Get: MM[ℓ3] returns (v2,v4)

18

DX

ℓ1 v1

ℓ2 v2

ℓ3 v3

MM

ℓ1 v1

ℓ2 v3

ℓ3 v2

v3 v4

v4

Page 19: Encrypted Search - csrc.nist.gov

Structured Encryption: Encrypted Dictionary [Chase-K.10]

19

Setup(1k, DS) ⟶ (K, EDX)

Token(K, q) ⟶ tk

Query(EDX, tk) ⟶ ans

DX EDX

ans

tk

Page 20: Encrypted Search - csrc.nist.gov

Structured Encryption: Encrypted Multi-Map [Chase-K.10]

20

Setup(1k, DS) ⟶ (K, EMM)

Token(K, q) ⟶ tk

Query(EMM, tk) ⟶ ans

MM EMM

ans

tk

Page 21: Encrypted Search - csrc.nist.gov

Adversarial Models

21

EDS0

ans ans

EDS0

EDS0

EDS1

EDS2

Persistent Snapshot

tk

u

tk

u

EDS0

tk

u

View View

Page 22: Encrypted Search - csrc.nist.gov

Persistent (Adaptive) Security [Curtmola-Garay-K.-Ostrovsky06,Chase-K.10]

• An STE scheme is (ℒS, ℒQ)-secure vs. a persistent adv. if

• it reveals no information about the structure beyond ℒS • it reveals no information about the structure and query beyond

ℒQ

22

Page 23: Encrypted Search - csrc.nist.gov

Snapshot (Adaptive) Security [Amjad-K.-Moataz19]

• We say that an STE scheme is ℒSnp-secure vs. a snapshot adv. if

• it reveals no information about the structure beyond ℒSnp

23

Page 24: Encrypted Search - csrc.nist.gov

Efficiency vs. Persistent Security

24

Query Time

Structured Property-preserving

Fully-homomorphic

Oblivious

Functional (sk) Functional (pk)

Leakage

Not Scientific!

Page 25: Encrypted Search - csrc.nist.gov

Efficiency vs. Snapshot Security

25

Query Time

Fully-homomorphic

Leakage

Not Scientific!

StructuredProperty-preserving

Oblivious

Functional (sk) Functional (pk)

Page 26: Encrypted Search - csrc.nist.gov

Leakage

26

Page 27: Encrypted Search - csrc.nist.gov

Leakage-Parameterized Definitions [Curtmola-Garay-K.-Ostrovsky, Chase-K.10]

• This area is about tradeoffs • but traditional cryptographic definitions don’t capture tradeoffs

• in 00’s, different approaches were proposed to capture leakage • #1: limit adversary’s power in the proof • #2: make assumptions on data (e.g., high entropy)

• Original motivations for leakage-parameterized definitions • Approaches #1 & #2 are misleading (sweep leakage under the rug) • Leakage should be made explicit and not be implicit

• gives clear target for cryptanalysis • makes it (somewhat) easier to compare schemes

27

Page 28: Encrypted Search - csrc.nist.gov

Modeling Leakage

• Each scheme has a leakage profile: 𝚲 = (ℒS, ℒQ, ℒU) • where ℒS = (patt1, …, pattn) is the Setup leakage • ℒQ = (patt1, …, pattn) is the Query leakage • ℒU = (patt1, …, pattn) is the Update leakage

• Each “operational” leakage is composed of leakage patterns • (patt1, …, pattn )

28

Page 29: Encrypted Search - csrc.nist.gov

Common Leakage Patterns

• qeq: query equality • a.k.a. search pattern

• rid: response identity • a.k.a. access pattern

• qlen: query length• trlen: total resp. length • rlen/vol: response length • a.k.a. volume pattern

29

• req: response equality • mqlen: max query length • mrlen: max resp. length • srlen: sequence resp. length • dsize: data size • usize: update size • did: data identity

Page 30: Encrypted Search - csrc.nist.gov

Example Leakage Profiles• The “Baseline” leakage profile for response-revealing EMMs • 𝚲 = (ℒS, ℒQ, ℒU) = (dsize, (qeq, rid), usize)

• The “Baseline” leakage profile for response-hiding EMMs • 𝚲 = (ℒS, ℒQ, ℒU) = (dsize, qeq, usize)

• Several new constructions have better leakage profiles • AZL and FZL [K.-Moataz-Ohrimenko18] • VLH and AVLH [K.-Moataz19]

30

Page 31: Encrypted Search - csrc.nist.gov

Structured Encryption vs. Other Primitives• Encrypted structures appear implicitly throughout crypto

• Oblivious RAM can be viewed as a • response-hiding encrypted array • with leakage profile 𝚲ORAM = (ℒS, ℒQ, ℒU) = (dsize, vol, vol)

• Garbled gates can be viewed as • response-revealing 2x2 arrays • 𝚲GG = (ℒS, ℒQ) = (dsize, qeq)

31

Page 32: Encrypted Search - csrc.nist.gov

How do we Deal with Leakage?• Our definitions allow us to prove that our schemes • achieve a certain leakage profile • but doesn’t tell us if a leakage profile is exploitable?

• We need more than proofs

32

Page 33: Encrypted Search - csrc.nist.gov

The Methodology

33

Leakage Analysis Proof of SecurityLeakage Attacks/

Cryptanalysis

• Leakage analysis: what is being leaked? • Proof: prove that scheme leaks no more • Cryptanalysis: can we exploit this leakage?

Page 34: Encrypted Search - csrc.nist.gov

Leakage Attacks

34

Page 35: Encrypted Search - csrc.nist.gov

Leakage Attacks• Target

• query recovery: recovers information about query • data recovery: recovers information about data

• Adversarial model • persistent: needs EDS and tokens • snapshot: needs EDS

• Auxiliary information • known sample: needs sample from same distribution • known data: needs actual data

• Passive vs. active • injection: needs to inject data

35

Page 36: Encrypted Search - csrc.nist.gov

Leakage Attacks• Leakage cryptanalysis is crucial but…

• …unfortunately much of the attack literature • lacks experimental rigor • is just plain wrong • overhyped

• there is a need for higher standards

36

Page 37: Encrypted Search - csrc.nist.gov

Leakage Attacks• IKK attack • highly cited but doesn’t work • too few keywords, auxiliary & test data correlated, …

• Count attack • based on strong assumptions • adversary needs to know ≥ 75% of client’s data!

• Some target very niche applications & rely on strong assumptions

37

Page 38: Encrypted Search - csrc.nist.gov

Leakage Attacks• Should we discount attacks? Of course not • More rigorous • Less hyperbolic • More upfront about attack limitations & assumptions

• [Blackstone-K.-Moataz’20]: Revisiting Leakage-Abuse Attacks

• [KKMSTY’21]: re-implementation & re-evaluation of most known attacks

38

Page 39: Encrypted Search - csrc.nist.gov

How Should we Handle Leakage?• Approach #1: ORAM simulation • Store and simulate data structure with ORAM • polylog overhead per read/write on top of simulation • still leaks information that is exploitable

• [Kellaris-Kollios-O’neill-Nissim’16, Blackstone-K.-Moataz’20]

• Approach #2: Custom oblivious structures

39

Page 40: Encrypted Search - csrc.nist.gov

How Should we Handle Leakage?• Approach #3: Rebuild [K.14] • Rebuild encrypted structure after t queries • Set t using cryptanalysis • Open question: can you rebuild encrypted structures? • Yes [K.-Moataz-Ohrimenko’18, George-K.-Moataz’21]

• Approach #4: Leakage suppression • Suppression compilers • Suppression transforms

40

Page 41: Encrypted Search - csrc.nist.gov

Leakage Suppression• Techniques to reduce/eliminate leakage

• Suppressing query equality (aka access pattern) • general compiler [K.-Moataz-Ohrimenko’18, Geoge-K.-Moataz’21]

• Suppressing co-occurrence (needed by IKK and Count attacks) • see appendix in [Blackstone-K.-Moataz19]

41

Page 42: Encrypted Search - csrc.nist.gov

Leakage Suppression• Suppressing volume (aka response size) • padding & clustering techniques [Bost-Fouque17] • computational techniques

[K.-Moataz19, Patel-Persiano-Yeo-Yung’20]

• “General-purpose” suppression • worst-case vs. average-case leakage [Agarwal-K.1’9] • distributing data [Agarwal-K.’19]

42

Page 43: Encrypted Search - csrc.nist.gov

Leakage Suppression• New tradeoffs to explore • leakage vs. correctness [K.-Moataz19] • leakage vs. latency [K.-Moataz-Ohrimenko18]

43

Page 44: Encrypted Search - csrc.nist.gov

Thanks!

44