Page 1: Graviton: Trusted Execution Environments on GPUs

Stavros Volos,† Kapil Vaswani,† and Rodrigo Bruno‡

†Microsoft Research ‡University of Lisbon

Page 2: Trends in Cloud Computing

Accelerators play a pivotal role in the cloud

• CPUs running out of steam due to the end of Moore’s Law

• GPUs, FPGAs, custom silicon deliver 10-100x higher performance

Cloud privacy important but challenging

• Customers operate on sensitive data (e.g., patients, transactions)

• Increasing frequency and sophistication of data breaches

Need strong security mechanisms for preserving data privacy in the cloud

Page 3: Confidential Cloud Computing

[Figure: CPU software stack with Hypervisor, OS, and Apps; labels: Code, Data]

Trusted Execution Environments (TEE)

• Execution isolated from privileged attackers

• Remote attestation for establishing trust

• Examples: Intel SGX, ARM TrustZone

• Supported by major cloud providers (e.g., Azure Confidential Computing)

But CPU TEEs cannot be used in apps that utilize accelerators

Undesirable trade-off between performance and security

Page 4: Our Proposal: Graviton

Graviton: Trusted Execution Environments on GPUs

• Execution isolated from system software and other co-tenants

• Remote attestation for establishing trust

Contributions

• Graviton architecture with minimal hardware extensions

• Extensions to CUDA runtime for end-to-end security

• Graviton implementation for demonstrating low performance overheads

Page 5: Outline

• Introduction

• GPUs & Threat Model

• Graviton

• Evaluation

• Conclusion

Page 6: GPU 101: System Stack

[Figure: system stack. Host side: Application, Runtime, OS / Driver, Hypervisor, CPU. The host connects over the PCI bus to the GPU: PCIe controller, Command Processor (CP), DMA engines, compute engine (array of cores), and on-package memory holding code and data. Other labels: cudaMemcpy, cudaLaunch, GPU commands.]

GPU engines are controlled via groups of commands

• Generated by runtime and fetched by command processor
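
The command flow described above can be sketched as a small simulation. This is an illustrative Python model, not GPU driver code: the `Command` tuple, the `CommandProcessor` class, `runtime_submit`, and the engine names are hypothetical, chosen only to mirror the labels on the slide.

```python
from collections import deque
from typing import NamedTuple

class Command(NamedTuple):
    engine: str      # hypothetical engine name: "dma" or "compute"
    payload: str     # human-readable description of the work

class CommandProcessor:
    """Toy model of a command processor that drains a command queue."""

    def __init__(self):
        self.queue = deque()                   # command queue filled by the runtime
        self.log = {"dma": [], "compute": []}  # work seen by each engine

    def fetch_and_dispatch(self):
        # Fetch commands in submission order and route each to its engine.
        while self.queue:
            cmd = self.queue.popleft()
            self.log[cmd.engine].append(cmd.payload)

def runtime_submit(cp, commands):
    # The runtime translates API calls into a group of commands.
    cp.queue.extend(commands)

cp = CommandProcessor()
runtime_submit(cp, [
    Command("dma", "copy input to device"),   # e.g. produced for cudaMemcpy
    Command("compute", "launch kernel"),      # e.g. produced for cudaLaunch
])
cp.fetch_and_dispatch()
```

In this model the application never touches an engine directly; everything it asks for reaches the GPU as queued commands, which is why the command path matters for the attacks discussed on the following slides.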

Page 7: GPU 101: Execution Model

[Figure: host stack (Application, Runtime, OS / Driver, Hypervisor, CPU) connected over the PCI bus to a GPU; labels: Context, Code, Data]

Contexts supported by channels

• Implement virtual memory abstraction

• Expose command queues to runtime

Page 8: GPU 101: Cat Classifier Example

[Figure: the application issues three numbered calls, (1) cudaMalloc, (2) cudaMemcpy, and (3) cudaLaunch; the numbers mark where each call appears along the path through the Runtime, OS / Driver, and PCI bus to the GPU. Other labels: CPU, Hypervisor, MMIO, Context, Code, Data.]

Page 9: GPU 101: Tampering with Commands & Data

[Figure: a malicious OS targeting Context 1]

Page 10: GPU 101: Violating Context Isolation

[Figure: a malicious OS with Context 1 and Context 2]

Page 11: Threat Model

Trusted computing base

• GPU package including on-package memory

• CPU package including TEE implementation

• GPU runtime hosted in CPU TEE

Goal: Confidentiality and integrity of computation and data

Out of scope: side channels and package assembly attacks

Page 12: Outline

• Introduction

• GPUs & Threat Model

• Graviton

• Evaluation

• Conclusion

Page 13: Graviton: Overview

[Figure: host stack (Application, Runtime, OS / Driver, Hypervisor, CPU) connected over the PCI bus to the GPU; labels: Code, Data]

Hardware primitives in GPU

• Remote attestation for establishing trust

• Context isolation

• Secure command submission

Runtime abstractions

• Secure memory management

• Secure memory copy and task launch


Key concept: Redefined interface between hardware and software

Page 14: Graviton: Context Isolation

[Figure: labels: Runtime, Driver, commands, MMIO, Command Processor, Protected Memory, Untrusted Memory]

Protected memory

• Hosts VM structures, code, and data

• CPU’s MMIO accesses are blocked

Virtual memory management via CP

• Ensures use of protected memory

• Exclusive use of context’s memory resources
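
A minimal sketch of the ownership check implied by these bullets, assuming the command processor records which context owns each protected page. The class and method names (`ProtectedMemoryManager`, `map_page`, `unmap_all`) and the page-granular bookkeeping are assumptions for illustration; the slides do not specify the mechanism.

```python
class ProtectedMemoryManager:
    """Toy model: the command processor tracks an owner for each protected page."""

    def __init__(self, protected_pages):
        self.protected = set(protected_pages)  # physical pages in protected memory
        self.owner = {}                        # page -> owning context id

    def map_page(self, ctx, page):
        # Rule 1: secure contexts may only map pages from protected memory.
        if page not in self.protected:
            raise PermissionError(f"page {page} is outside protected memory")
        # Rule 2: a page belongs to at most one context at a time.
        current = self.owner.get(page)
        if current is not None and current != ctx:
            raise PermissionError(f"page {page} is owned by context {current}")
        self.owner[page] = ctx

    def unmap_all(self, ctx):
        # Release every page held by a context (e.g., on context destruction).
        for page in [p for p, c in self.owner.items() if c == ctx]:
            del self.owner[page]

mm = ProtectedMemoryManager(protected_pages=range(4))
mm.map_page(ctx=1, page=0)           # context 1 claims page 0
try:
    mm.map_page(ctx=2, page=0)       # context 2 cannot alias the same page
    aliased = True
except PermissionError:
    aliased = False
```

Because the checks live in the model of the command processor rather than in the driver, a compromised driver in this sketch can request mappings but cannot make two contexts share a page.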

Secure command submission

• Session key established during context creation

• Only owner runtime can execute tasks
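
A minimal sketch of how a session key can restrict command submission to the owning runtime. It uses HMAC-SHA256 from the Python standard library together with a monotonic counter so that forged and replayed commands are rejected. This is a stand-in chosen for illustration: the function names, the counter scheme, and the message layout are assumptions, not Graviton's actual command format.

```python
import hashlib
import hmac
import os

def sign_command(session_key, counter, command):
    # Bind the command to its position in the stream so it cannot be replayed.
    msg = counter.to_bytes(8, "big") + command
    return hmac.new(session_key, msg, hashlib.sha256).digest()

class CommandProcessorModel:
    """Toy verifier: accepts only in-order commands tagged with the session key."""

    def __init__(self, session_key):
        self.session_key = session_key
        self.expected_counter = 0

    def submit(self, counter, command, tag):
        expected = sign_command(self.session_key, counter, command)
        if counter != self.expected_counter or not hmac.compare_digest(tag, expected):
            return False                      # forged, reordered, or replayed
        self.expected_counter += 1
        return True

session_key = os.urandom(32)                  # assumed to be shared at context creation
cp_model = CommandProcessorModel(session_key)

cmd = b"launch kernel"
tag = sign_command(session_key, 0, cmd)
accepted = cp_model.submit(0, cmd, tag)       # owner runtime: accepted
replayed = cp_model.submit(0, cmd, tag)       # same command again: rejected
forged = cp_model.submit(1, cmd, sign_command(os.urandom(32), 1, cmd))  # wrong key
```

Only a party holding the session key can produce a tag the verifier accepts, so an OS or driver that merely relays commands cannot inject its own.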

Pages 15-17: Graviton: Secure Memory Copy

[Figure, built up over three slides; labels: Runtime, Driver, commands, Command Processor, Protected Memory, Untrusted Memory]

Key concept

• Data/code plaintext only inside TEEs

• Data/code ciphertext outside TEE (DMA buffer)

Protocol

• Secure submission of copy task

• Secure submission for authenticated decryption
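
The key concept and the two protocol steps can be sketched end to end. The deck names an AES-GCM engine elsewhere, but the Python standard library has no AES-GCM, so this sketch swaps in an encrypt-then-MAC construction (a toy SHA-256 counter-mode keystream plus HMAC-SHA256). The cipher, the function names `seal` and `open_sealed`, and the key handling are all assumptions for illustration and should not be used as real cryptography.

```python
import hashlib
import hmac
import os

def _keystream(key, nonce, length):
    # Toy SHA-256 counter-mode keystream; a stand-in for a real cipher.
    out = b""
    block = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + block.to_bytes(8, "big")).digest()
        block += 1
    return out[:length]

def seal(enc_key, mac_key, plaintext):
    # Runs inside the CPU TEE: encrypt, then authenticate nonce and ciphertext.
    nonce = os.urandom(12)
    ks = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, ks))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce, ciphertext, tag

def open_sealed(enc_key, mac_key, nonce, ciphertext, tag):
    # Runs on the GPU side: verify first, decrypt only if the tag matches.
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: DMA buffer was modified")
    ks = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, ks))

enc_key, mac_key = os.urandom(32), os.urandom(32)
data = b"model weights"

# Copy task: only ciphertext is placed in the untrusted DMA buffer.
nonce, dma_buffer, tag = seal(enc_key, mac_key, data)

# Authenticated decryption: plaintext appears only on the trusted side.
protected_copy = open_sealed(enc_key, mac_key, nonce, dma_buffer, tag)

# A buffer modified in untrusted memory is rejected before decryption.
tampered = bytes([dma_buffer[0] ^ 1]) + dma_buffer[1:]
try:
    open_sealed(enc_key, mac_key, nonce, tampered, tag)
    tamper_detected = False
except ValueError:
    tamper_detected = True
```

The untrusted side only ever sees `dma_buffer`, which matches the slide's rule that data and code are plaintext only inside the TEEs.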

Page 18: Graviton in a Nutshell

[Figure: GPU block diagram. Components: PCI Ctrl., Command Processor, DMA Engines, Compute Eng. (array of cores), On-package Memory, Sec. Mod. Annotations: Endorsement key, ECDSA, Remote attestation, Blocks MMIO to Protected Memory, VM mgmt. commands, AES-GCM engine.]

Low hardware complexity

• Changes limited to peripheral components

• No changes to CPU, GPU cores and memory

Transparent to developers

• GPU runtime abstractions

• Hidden behind GPU programming model

Page 19: Implementation

NVIDIA GTX Titan Black

• 2880 CUDA cores, 6 GB of memory, peak performance of 5.6 TFLOPS

Prototype

• GPU runtime: secure task submission and secure memory management

• Device driver: address-space mgmt. command submission

• Hardware primitives: emulation of new commands and crypto in device driver

Benchmarks: Cifar10-CNN and MNIST-autoencoder

Page 20: Implications on System Performance

[Figure: bar chart of end-to-end runtime, normalized to insecure, for Cifar10, MNIST, and BlackScholes; y-axis from 0% to 150%; series: Baseline, Isolation, Secure Copy]

Overhead correlates with ratio between computation and I/O
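
One way to read this claim is through a back-of-the-envelope model, assuming secure copy adds a cryptographic cost proportional to the bytes transferred while compute time is unchanged. The function name, its parameters, and every number below are hypothetical; they are not measurements from the evaluation.

```python
def normalized_runtime(compute_s, io_bytes, link_bytes_per_s, crypto_bytes_per_s):
    """Secure runtime divided by insecure runtime under a simple additive model."""
    transfer_s = io_bytes / link_bytes_per_s
    crypto_s = io_bytes / crypto_bytes_per_s   # extra cost of authenticated encryption
    insecure = compute_s + transfer_s
    secure = compute_s + transfer_s + crypto_s
    return secure / insecure

# Hypothetical workloads moving the same 1 GB over the same link.
GB = 1e9
compute_heavy = normalized_runtime(10.0, 1 * GB, 10 * GB, 2 * GB)
io_heavy = normalized_runtime(0.1, 1 * GB, 10 * GB, 2 * GB)
```

With these made-up inputs the compute-heavy workload stays within about 5% of its insecure runtime, while the I/O-heavy one becomes several times slower, which is the direction of the correlation the slide describes.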

Insecure

Isolation

• Secure context management

• Secure command submission

Secure copy

• Host-side authenticated encryption

• GPU-side authenticated encryption

Page 21: Concluding Remarks

Cloud trends in collision

• Confidentiality and hardware acceleration

• But, confidential computing restricted to CPUs

Graviton: Trusted Execution Environments on GPUs

• Low hardware complexity

• Low performance overheads

• Hardware complexity hidden by GPU programming model
