Post on 22-Jun-2020
Tiny ORAM: A Low-Latency, Low-Area Hardware
Oblivious RAM Controller
Chris Fletcher1, Ling Ren1, Albert Kwon1, Marten V. Dijk2,
Emil Stefanov3, Dimitrios Serpanos4, Srinivas Devadas1
1 Massachusetts Institute of Technology 2 University of Connecticut
3 University of California, Berkeley 4 QCRI, Doha
Want: cloud does work, doesn’t learn Data
Enc(Data)
Enc(Result)
Computation outsourcing
F(Data) = Result
Want: cloud does work, doesn’t learn Data
Piece of silicon trusted
Rest of cloud untrusted
DRAM
Tamper-resistant silicon (FPGA)
Design, Program
Data
Enc(Data)
Enc(Result)
Even if data is encrypted, address
is plaintext and leaks privacy!
Computation outsourcing
Oblivious RAM = complete
address, data, op obfuscation
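A minimal sketch of the address leak described above, assuming a toy memory whose contents are encrypted but whose address bus is visible. The function name, the 16-entry table and the stand-in ciphertexts are hypothetical and only serve the illustration.

```python
from typing import List

def run_lookup(secret_index: int, bus_trace: List[int]) -> bytes:
    """Fetch one encrypted record; the address still travels in the clear."""
    # Stand-in ciphertexts: the eavesdropper cannot read these.
    encrypted_table = [bytes([0xA5 ^ i]) * 8 for i in range(16)]
    # What an eavesdropper on the DRAM bus observes: the plaintext address.
    bus_trace.append(secret_index)
    return encrypted_table[secret_index]

trace: List[int] = []
run_lookup(secret_index=11, bus_trace=trace)

# The observer never decrypts anything, yet recovers the secret index.
leaked = trace[0]
print("observer learned index:", leaked)
```

Encrypting the data alone does not help here, which is the gap ORAM is meant to close.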
• Oblivious RAM (ORAM) tutorial
• How you can use our HW ORAM prototype
• Technical issues: read paper
Outline
Oblivious RAM (ORAM)
ORAM
Algorithm
Untrusted
storage
Input access pattern (op, addr, data):
Write a0 d0
Read a0
Read a1
Read a2
…
Obfuscated addr & ciphertext:
Read a652 %&X
Read a113 #$*(
Write a431 @$^
Read a059 *&@
…
ORAM
Controller
External
memory
(DRAM)
ORAM
Controller
FPGA
design
FPGA chip
Eavesdropper learns nothing about
addresses, data, read/write
FPGA design wants to hide its main memory accesses
FPGAs are great platforms for single chip, secure clouds
• ORAM: drop-in security solution for existing designs
ORAM in the FPGA ecosystem
Platform | Reconfigurability / Patching | On-die performance | Memory bandwidth
CPU
ASIC
FPGA
DRAM
User design
FPGA
Memory controller
ORAM Controller
... = doesn’t change
ORAM in the FPGA ecosystem
Remaining issues
Area.
Performance.
How ORAM Works
• [SDSFRYD, CCS’13]
• Efficient and simple
• External DRAM structured as a binary tree
DRAM
ORAM controller
Position map
Stash
Path ORAM
root
path 0 1 2 3
𝑍 = 2
(B4, 1)
Path ORAM
• Position Map: Map each block to a random path
• Invariant: If block is mapped to a path,
it is on that path or in the stash
• Stash: Temporarily hold some blocks
root
path 0 1 2 3
Block Path
B0 0
B1 3
B2 3
B3 0
B4 1
Position Map
(B0, 0)
(B3, 0)
(B2, 3)
(B1, 3)
Stash
ORAM controller DRAM
dummy dummy dummy
𝑍 = 1
Unused spots = dummy blocks
Blocks in tree = encrypted
Dummies look like real blocks
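A minimal sketch of the invariant, using the position map from this slide. The heap-style bucket numbering and the exact placement of blocks in buckets are assumptions made for illustration; the slide only gives the block-to-path mapping.

```python
from typing import Dict, List, Set

LEVELS = 2  # leaves at depth 2, so 4 paths (0..3) as on the slide

def path_buckets(leaf: int, levels: int = LEVELS) -> List[int]:
    """Bucket indices from root to `leaf` (assumed layout: root 0, children 2i+1 and 2i+2)."""
    node = leaf + 2 ** levels - 1
    buckets = [node]
    while node > 0:
        node = (node - 1) // 2
        buckets.append(node)
    return buckets[::-1]

# Position map from the slide: block -> path.
position_map: Dict[str, int] = {"B0": 0, "B1": 3, "B2": 3, "B3": 0, "B4": 1}

# Hypothetical placement with Z = 1: bucket index -> blocks stored there.
tree: Dict[int, List[str]] = {0: ["B4"], 1: ["B0"], 3: ["B3"], 2: ["B2"], 6: ["B1"]}
stash: Set[str] = set()

def invariant_holds(position_map: Dict[str, int],
                    tree: Dict[int, List[str]],
                    stash: Set[str]) -> bool:
    """Every mapped block is in the stash or in a bucket on its assigned path."""
    for block, leaf in position_map.items():
        on_path = any(block in tree.get(b, []) for b in path_buckets(leaf))
        if not (on_path or block in stash):
            return False
    return True

print(path_buckets(3), invariant_holds(position_map, tree, stash))
```

Moving a block to a bucket off its assigned path breaks the check unless the block is also held in the stash.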
(B4, 1)
Path ORAM Operation
• Access Block 1
– Read all blocks on path 3
– Remap B1 to a new random path
– Write as many blocks as possible back to path 3 (keep the invariant)
root
path 0 1 2 3
Block Path
B0 0
B1 3
B2 3
B3 0
B4 1
Position Map
(B0, 0)
(B3, 0)
(B2, 3)
(B1, 1)
Stash
ORAM controller DRAM
dummy dummy dummy
dummy
(Position map update: B1 remapped from path 3 to path 1)
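The three steps listed above (read the path, remap, write back) can be sketched in Python roughly as follows. This is a toy model, not the Tiny ORAM hardware: it skips encryption and dummy padding, keeps an unbounded stash, and the class name, heap-style bucket layout and `trace` field are assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional, Tuple

class PathORAM:
    """Toy Path ORAM: plaintext blocks, position map and stash held in dicts."""

    def __init__(self, levels: int, z: int, seed: int = 0) -> None:
        self.z = z                                # block slots per bucket
        self.num_leaves = 2 ** levels             # leaves sit at depth `levels`
        # Assumed heap layout: bucket 0 is the root, children of i are 2i+1 and 2i+2.
        self.tree: List[List[Tuple[int, bytes]]] = [[] for _ in range(2 ** (levels + 1) - 1)]
        self.position: Dict[int, int] = {}        # block address -> leaf
        self.stash: Dict[int, bytes] = {}         # block address -> data
        self.rng = random.Random(seed)
        self.trace: List[int] = []                # paths touched, i.e. what DRAM observes

    def path(self, leaf: int) -> List[int]:
        """Bucket indices from the root down to `leaf`."""
        node = leaf + self.num_leaves - 1
        out = [node]
        while node > 0:
            node = (node - 1) // 2
            out.append(node)
        return out[::-1]

    def access(self, op: str, addr: int, data: Optional[bytes] = None) -> Optional[bytes]:
        # A block seen for the first time gets a random initial path.
        if addr not in self.position:
            self.position[addr] = self.rng.randrange(self.num_leaves)
        leaf = self.position[addr]
        # Remap the block to a new random path before touching memory.
        self.position[addr] = self.rng.randrange(self.num_leaves)
        self.trace.append(leaf)
        buckets = self.path(leaf)
        # Read all blocks on the old path into the stash.
        for b in buckets:
            for blk_addr, blk_data in self.tree[b]:
                self.stash[blk_addr] = blk_data
            self.tree[b] = []
        # Serve the request from the stash.
        result = self.stash.get(addr)
        if op == "write":
            self.stash[addr] = data
        # Write back deepest bucket first. A block may enter a bucket only if
        # that bucket also lies on the block's assigned path (keeps the invariant).
        for b in reversed(buckets):
            fits = [a for a in self.stash if b in self.path(self.position[a])][: self.z]
            self.tree[b] = [(a, self.stash.pop(a)) for a in fits]
        return result

oram = PathORAM(levels=2, z=2, seed=1)
for a in range(5):
    oram.access("write", a, bytes([a]))
print([oram.access("read", a) for a in range(5)])
```

Every access touches exactly one full path, and the path it touches was chosen at random when the block was last accessed, so the observed sequence of paths does not depend on which addresses are requested.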
(B4, 1)
Path ORAM Analysis
• Security: each access R/W random path
• R/W path = 𝑂(𝑍 log 𝑁) blocks, 𝑁 = record count
• |Stash| = 𝑂(log 𝑁) for 𝑍 ≥ 4: small!
• Position map = 𝑂(𝑁): huge!
root
path 0 1 2 3
Block Path
B0 0
B1 3
B2 3
B3 0
B4 1
Position Map
(B0, 0)
(B3, 0)
dummy
(B2, 3)
(B1, 1)
Stash
ORAM controller DRAM
dummy dummy dummy
(Position map update: B1 remapped from path 3 to path 1)
𝑍 = 1
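To make the asymptotics above concrete, here is a small back-of-the-envelope calculation. The sizing rule (one leaf per block, so leaf depth L = ceil(log2 N)) and the example values N = 2^20, Z = 4, 64-byte blocks are assumptions chosen for illustration, not figures from the talk.

```python
import math
from typing import Tuple

def path_oram_costs(n_blocks: int, z: int, block_bytes: int) -> Tuple[int, int, int]:
    """Return (blocks per path, bytes moved per access, position map bits)."""
    levels = math.ceil(math.log2(n_blocks))       # assumed leaf depth L
    blocks_per_path = z * (levels + 1)            # Z blocks in each of L+1 buckets
    bytes_per_access = 2 * blocks_per_path * block_bytes  # read path, then write path
    posmap_bits = n_blocks * levels               # one L-bit leaf label per block
    return blocks_per_path, bytes_per_access, posmap_bits

blocks, traffic, posmap = path_oram_costs(n_blocks=2**20, z=4, block_bytes=64)
print(blocks, traffic, posmap / 8 / 2**20)
```

Under these assumed values one access moves 84 blocks in each direction, while the position map needs about 2.5 MiB of on-chip state, which is why the slide flags it as the large structure.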
Tiny ORAM: Our Hardware Design
• Making the position map small, integrity verification
– Sister paper (ASPLOS’15)
• Heavy lifting: Given (leaf, address) read/write path
– This paper
Paper partitioning
Integrity verification
Encryption units
Stash
Position map
Xilinx MIG
Design highlights
• Design clock: 200, 300 MHz
• Access 64 Bytes: 250 cycles
• Pipeline stall free
• Area: 6% logic, 14% BRAM
• ORAM capacity: arbitrary*
• AXI interface
Open Source (Link in paper)
Xilinx VC707 (485T), 12.8 GB/s DRAM
Comparison to prior work
Metric | Phantom (CCS’13) | Tiny ORAM | No ORAM
Access 64 Byte block | ~12000 cycles | 250 cycles | ~30 cycles
Access 4 KiloByte block | ~12000 cycles | 17000 cycles | ~100 cycles
LUTs | 22K (7%) | 18K (6%) |
BRAMs (w/o position map) | 516 | 129 |
BRAMs (w/ position map) | … multiple FPGAs … | 146 |
All figures normalized to 200 MHz clock, XC7VX485T FPGA
This paper, ASPLOS paper
This paper
ASPLOS paper
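The cycle counts in the comparison can be turned into wall-clock latencies and ratios with a few lines of arithmetic. The numbers are taken from the table (approximate where the table says "~"); the dictionary layout and helper names are illustrative only.

```python
CLOCK_HZ = 200e6  # the table normalizes all figures to a 200 MHz clock

cycles = {
    "64 B": {"Phantom": 12000, "Tiny ORAM": 250, "No ORAM": 30},
    "4 KB": {"Phantom": 12000, "Tiny ORAM": 17000, "No ORAM": 100},
}

def microseconds(n_cycles: float) -> float:
    """Convert a cycle count at CLOCK_HZ into microseconds."""
    return n_cycles / CLOCK_HZ * 1e6

def ratio(block: str, slower: str, faster: str) -> float:
    """How many times more cycles `slower` needs than `faster` for one access."""
    return cycles[block][slower] / cycles[block][faster]

for block, row in cycles.items():
    print(block, {name: round(microseconds(c), 2) for name, c in row.items()})

print("64 B, Phantom vs Tiny ORAM:", ratio("64 B", "Phantom", "Tiny ORAM"))
print("64 B, Tiny ORAM vs No ORAM:", round(ratio("64 B", "Tiny ORAM", "No ORAM"), 1))
print("4 KB, Tiny ORAM vs Phantom:", round(ratio("4 KB", "Tiny ORAM", "Phantom"), 2))
```

For 64-byte blocks this gives roughly 1.25 microseconds per Tiny ORAM access, about 48 times fewer cycles than Phantom and about 8 times more than an unprotected access; for 4 KB blocks the table's own numbers favor Phantom.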
• Compact hardware Oblivious RAM
• IP: Drop-in security for existing designs
• Small, reasonably fast and scalable
Conclusion
Code at http://kwonalbert.github.io/oram