X-Containers: Breaking Down Barriers to Improve Performance and Isolation of Cloud-Native Containers
Zhiming Shen
Cornell University
Joint work with Zhen Sun, Gur-Eyal Sela, Eugene Bagdasaryan, Christina Delimitrou, Robbert Van Renesse, Hakim Weatherspoon
Software Containers
Img src: https://pivotal.io/cloud-native
Cloud-Native Container Platforms
Img src: https://pivotal.io/cloud-native
• Single Concern Principle: Every container should address a single concern and do it well.
• Making containers easier to
  o Replace, reuse, and upgrade transparently
  o Scale horizontally
  o Debug and troubleshoot
Cloud-Native Container Platforms
[Diagram: two containers, each running two processes, on a shared Linux kernel (namespaces, cgroups, SELinux) above the hardware]
The Problem
• Shared kernel attack surface and TCB
• Not allowed to install kernel modules
• Hard to tune or optimize for a specific container
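The shared-kernel point can be observed from inside a process. The following is a minimal sketch, assuming a Linux host with `/proc` mounted; the helper name `namespace_ids` is made up for this illustration. It lists the namespaces the current process belongs to and prints the kernel release, which ordinary Docker-style containers on one host would be expected to report identically because they share that kernel.

```python
import os
import platform


def namespace_ids(pid="self"):
    """Return {namespace_name: identifier} for a process, read from /proc.

    Hypothetical helper for illustration. Returns an empty dict when
    /proc/<pid>/ns does not exist (for example on non-Linux systems).
    """
    ns_dir = f"/proc/{pid}/ns"
    if not os.path.isdir(ns_dir):
        return {}
    ids = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Some entries may be unreadable without sufficient privileges.
            ids[name] = None
    return ids


if __name__ == "__main__":
    # The release string describes the kernel that serves this process.
    print("kernel release:", platform.release())
    for name, ident in namespace_ids().items():
        print(f"{name:20s} {ident}")
```

Running this inside two containers on the same host would typically show different namespace identifiers but the same kernel release, which is the shared attack surface the slide refers to.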
Existing Solutions
[Diagram: three existing designs. Container: processes in a container on Linux. Clear Container: processes on Linux inside a KVM VM on Linux. gVisor: processes in a container on gVisor on Linux.]
• Clear Container: requires nested hardware virtualization support in the cloud
• gVisor
  o Ptrace mode: high overhead
  o KVM mode: requires nested virtualization
X-Containers achieve
• VM-level Isolation
• Support of Kernel Customization
• Support of Kernel Optimization
• Good Portability (without the need of hardware-assisted virtualization)
• High Performance
AND
• Backward Compatibility
[Slide: Isolation, Customization, Optimization, Portability, Performance; OS Kernel]
X-Containers
[Diagram, step 1: two containers, each with two processes, in user mode on a shared OS kernel in kernel mode]
[Diagram, step 2: each container gets its own OS kernel, and both run on an exokernel]
[Diagram, step 3: each container together with its OS kernel is labeled an X-Container, on top of the exokernel]
X-Containers
[Diagram: two groups of processes, each on its own X-LibOS, on a shared X-Kernel; labels: User mode, Kernel mode]
• A new security paradigm for cloud-native containers
• X-Kernel: an exokernel with a small attack surface and TCB
• X-LibOS: a LibOS that decouples security isolation from the process model
Threat Model and Design Trade-offs
[Diagram: X-Containers (processes on an X-LibOS inside each X-Container, on the X-Kernel) shown next to Container, Clear Container, and gVisor]
• Threat model
• Trade-offs
  o Reduced intra-container isolation
  o Improved inter-container isolation and performance
  o Process isolation and kernel-supported security features are not effective
Implementation
[Diagram: three X-Containers, each with its own X-LibOS and processes, on a shared X-Kernel]
• X-LibOS from Linux kernel
  o Binary compatibility
  o Highly customizable
• X-Kernel from Xen
  o Para-virtualization interface
  o Concurrent multi-processing
• Limitations
  o Memory management
  o Spawning time
Optimizing System Calls
[Diagram: X-Kernel in kernel mode; X-Containers (X-LibOS and processes) in user mode; system calls become function calls]
• Existing solutions
  o Patch source code
  o Link to another library
• Our solution
  o Automatic Binary Optimization Module (ABOM)
  o Binary level equivalence
  o Position-independence
For many applications, more than 90% of syscalls are turned into function calls
Evaluation Setup
• Testbed
  o Amazon EC2
  o Google Compute Engine
• Compared container runtimes
  o Docker
  o gVisor (Ptrace in Amazon, and KVM in Google)
  o Clear-Container (only in Google)
  o Xen-Container
  o X-Container
• Configurations
  o Patched for Meltdown
System Call Performance
[Bar chart: Normalized Performance (y-axis, 0 to 30) on Amazon and Google for Docker, Clear-Container, gVisor, Xen-Container, and X-Container]
Up to 27X of Docker (patched) and 1.6X of Clear-Container
Real Application Performance
[Bar charts: Normalized Throughput on Amazon and Google, one panel per application]
• NGINX: 1.21x~1.27x
• Memcached: 2.64x~3.08x
• Redis: 1x~1.2x
• Apache: 0.64x~0.72x
Spawning Time and Memory Footprint
[Stacked bar chart: Memory Footprint (MB, y-axis 0 to 30) for Docker and X-Container; components: Free, X-LibOS, Extra, micropython; data labels: 1.00, 1.93, 3.56, 11.16, 8.80, 2.10]
[Stacked bar chart: Spawning Time (s, y-axis 0 to 5) for Docker, X-Container, and X-Container'; components: User Program, X-LibOS Booting, Xen Tool Stack; data labels: 3.66, 0.46, 0.28, 0.28, 0.56, 0.29, 0.29]
Reduced to 460ms. Can be further reduced to <10ms.
More Evaluations in the Paper
• More micro/macro benchmarks
• Patched and unpatched for Meltdown
• Comparing to Unikernel and Graphene
• Scalability (up to 400 containers on a single host)
Conclusion
• X-Containers: a new security paradigm for isolating single-concern cloud-native containers
  o X-Kernel: an exokernel with a small attack surface and TCB
  o X-LibOS: a LibOS that decouples security isolation from the process model
  o Trade-off: intra-container isolation vs. inter-container isolation
• Implemented with Xen and Linux
  o Binary compatibility
  o Concurrent multi-processing
• More at http://x-containers.org
Thank You. Questions?
Backup Slides
Pros and Cons of the X-Container Architecture
                            Container  gVisor   Clear-Container  LightVM   X-Container
Inter-container isolation   Poor       Good     Good             Good      Good
System call performance     Limited    Poor     Limited          Poor      Good
Portability                 Good       Good     Limited          Good      Good
Compatibility               Good       Limited  Good             Good      Good
Intra-container isolation   Good       Good     Good             Good      Reduced
Memory efficiency           Good       Good     Limited          Limited   Limited
Spawning time               Short      Short    Moderate         Moderate  Moderate
Software licensing          Clean      Clean    Clean            Clean     Need discussion
Comparing Isolation Boundaries
[Diagram: isolation boundaries of six designs]
• Container: processes in a container on a shared kernel
• X-Container: processes on an X-LibOS inside an X-Container, on the X-Kernel
• Virtual Machine: processes on a kernel inside a VM, on a hypervisor
• Unikernel, Dune, EbbRT, OSv: a process inside a VM, on a hypervisor
• Library OS (Exokernel): processes, each with its own LibOS, on an exokernel
• L4Linux (Microkernel): processes on L4Linux, on a microkernel
Automatic Binary Optimization Module (ABOM)
7-Byte Replacement (Case 1)
Before:
  00000000000eb6a0 <__read>:
     eb6a9: b8 00 00 00 00          mov    $0x0,%eax
     eb6ae: 0f 05                   syscall
After:
  00000000000eb6a0 <__read>:
     eb6a9: ff 14 25 08 00 60 ff    callq  *0xffffffffff600008

7-Byte Replacement (Case 2)
Before:
  000000000007f400 <syscall.Syscall>:
     7f41d: 48 8b 44 24 08          mov    0x8(%rsp),%rax
     7f422: 0f 05                   syscall
After:
  000000000007f400 <syscall.Syscall>:
     7f41d: ff 14 25 08 0c 60 ff    callq  *0xffffffffff600c08

9-Byte Replacement
Before:
  0000000000010330 <__restore_rt>:
     10330: 48 c7 c0 0f 00 00 00    mov    $0xf,%rax
     10337: 0f 05                   syscall
Phase-1:
  0000000000010330 <__restore_rt>:
     10330: ff 14 25 80 00 60 ff    callq  *0xffffffffff600080
     10337: 0f 05                   syscall
Phase-2:
  0000000000010330 <__restore_rt>:
     10330: ff 14 25 80 00 60 ff    callq  *0xffffffffff600080
     10337: eb f7                   jmp    0x10330
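The byte patterns on this slide can be reproduced with a short script. The following is a minimal sketch that rewrites byte strings only (not a running process) and is not the authors' implementation; the function names are made up for illustration. The 8-byte spacing between entry slots is an assumption inferred from the two addresses on the slide (`...600008` for syscall 0 and `...600080` for syscall 15). Case 2, where the syscall number is loaded from the stack and a fixed entry at `...600c08` is called, is not covered.

```python
import struct

SYSCALL = b"\x0f\x05"            # x86-64 `syscall`
CALL_ABS = b"\xff\x14\x25"       # `callq *disp32` (absolute indirect call)
ENTRY_BASE = 0xFFFFFFFFFF600008  # slot called for syscall number 0 on the slide
ENTRY_STRIDE = 8                 # assumption: one 8-byte slot per syscall number


def call_entry(nr):
    """Encode the 7-byte `callq *addr` for the slot of syscall number `nr`."""
    addr = ENTRY_BASE + ENTRY_STRIDE * nr
    # The CPU sign-extends disp32, so only the low 32 bits are stored.
    return CALL_ABS + struct.pack("<I", addr & 0xFFFFFFFF)


def patch_7byte(code):
    """Case 1: `mov $imm32,%eax; syscall` (5 + 2 bytes) becomes one 7-byte call."""
    assert len(code) == 7 and code[0] == 0xB8 and code[5:] == SYSCALL
    nr = struct.unpack("<I", code[1:5])[0]
    return call_entry(nr)


def patch_9byte(code):
    """`mov $imm32,%rax; syscall` (7 + 2 bytes), rewritten in two phases."""
    assert len(code) == 9 and code[:3] == b"\x48\xc7\xc0" and code[7:] == SYSCALL
    nr = struct.unpack("<I", code[3:7])[0]
    # Phase 1: replace only the 7-byte mov; the 2-byte syscall stays in place.
    phase1 = call_entry(nr) + SYSCALL
    # Phase 2: replace the leftover syscall with `jmp rel8` back to the call.
    # rel8 counts from the end of the jmp, which is 9 bytes past the call.
    phase2 = phase1[:7] + b"\xeb" + struct.pack("<b", -9)
    return phase1, phase2


if __name__ == "__main__":
    print(patch_7byte(bytes.fromhex("b8 00 00 00 00 0f 05")).hex(" "))
    p1, p2 = patch_9byte(bytes.fromhex("48 c7 c0 0f 00 00 00 0f 05"))
    print(p1.hex(" "))
    print(p2.hex(" "))
```

Each replacement has the same length as the bytes it overwrites, so no surrounding instruction moves and no offsets elsewhere in the binary need to change.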
The Exokernel Approach
• Separating protection and management
[Diagram: Monolithic OS Kernel (processes on an operating system kernel on the hardware) compared with Exokernel (each process on its own library OS, on an exokernel on the hardware)]
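The split between protection and management can be illustrated with a toy model. The following is a minimal sketch, not a description of the X-Kernel or of any real exokernel; the class names and the page-ownership scheme are invented for illustration. The kernel object only checks who owns a resource, while each library OS decides which resources it wants.

```python
class Exokernel:
    """Protection only: tracks which library OS owns which physical page."""

    def __init__(self, num_pages):
        self.owner = [None] * num_pages  # page index -> name of owning LibOS

    def grant(self, libos, page):
        # Protection check: a page is handed out only if nobody owns it yet.
        if self.owner[page] is not None:
            raise PermissionError(f"page {page} is owned by {self.owner[page]}")
        self.owner[page] = libos

    def write(self, libos, page):
        # Protection check: only the owner may touch the page.
        if self.owner[page] != libos:
            raise PermissionError(f"{libos} does not own page {page}")
        return True


class LibraryOS:
    """Management: decides which pages to request and how to use them."""

    def __init__(self, name, kernel, wanted_pages):
        self.name = name
        self.kernel = kernel
        self.pages = []
        for page in wanted_pages:  # the allocation policy lives here
            kernel.grant(name, page)
            self.pages.append(page)


if __name__ == "__main__":
    kernel = Exokernel(num_pages=4)
    a = LibraryOS("libos-a", kernel, [0, 1])
    b = LibraryOS("libos-b", kernel, [2])
    print(kernel.write("libos-a", 0))  # allowed: libos-a owns page 0
    try:
        kernel.write("libos-b", 0)     # rejected: page 0 belongs to libos-a
    except PermissionError as err:
        print("denied:", err)
```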