Emmett Witchel Krste Asanovic MIT Lab for Computer Science Mondriaan Memory Protection.
Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett...
Transcript of Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett...
![Page 1: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/1.jpg)
Coordinated and Efficient Huge Page Management with Ingens
Youngjin Kwon, Hangchen Yu, Simon Peter, Christopher J. Rossbach, and Emmett Witchel
1
![Page 2: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/2.jpg)
High address translation cost• Modern applications: large memory footprint, low memory access locality • TLB coverage using base pages is insufficient
2
Cpu
cyc
les
0%
10%
20%
30%
40%
50%
60%
70%
429.mcf Graph analytics SVM MongoDB
% of cpu cycles spent by page walkVirtual address
Physical address
Page table
![Page 3: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/3.jpg)
High address translation cost
3
Cpu
cyc
les
0%
10%
20%
30%
40%
50%
60%
70%
429.mcf Graph analytics SVM MongoDB
Guest page table walkHost page table walk
% of cpu cycles spent by page walkVirtual address
Guest physical address
Host physical address
Guest page table
Host page table
• Virtualization requires additional address translation
![Page 4: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/4.jpg)
Huge pages improve TLB coverage• Architecture supports larger page size (e.g., 2MB page)
• Intel: 0 to 1,536 entries in 2 years (2013 ~ 2015)
• Operating system has the burden of better huge page support
4
0%
1%
2%
3%
4%
5%
Sandy Bridge Ivy Bridge Haswell Skylake
4KB page 2MB page
TLB coverage proportional to 64 GB DRAM
0.11%0.05%
4.6%
0.01%0.01% 0.1%
3.2%
0.1%
2013 201520142011
![Page 5: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/5.jpg)
Operating system support for huge pages• OS transparently allocates/deallocates huge pages
• Huge pages in both guest and host
5
Linux
FreeBSD
LWN.net, 2011
![Page 6: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/6.jpg)
Huge pages improve performance• Application speed up over using base pages only
6
Spe
ed u
p
0%
10%
20%
30%
40%
50%
60%
429.m
cf
(Spe
c CPU)
Canne
al
(PARSEC)SVM
(Libli
near)
Graph a
nalyt
ics
(Pow
erGrap
h)
Machin
e lea
rning
(Spa
rk MLli
b)
Web se
rver
(Clou
dston
e) Redis
Mongo
DB
Bette
r
Average
![Page 7: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/7.jpg)
Are huge pages a free lunch?
7
![Page 8: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/8.jpg)
Are huge pages a free lunch?
8
![Page 9: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/9.jpg)
Are huge pages a free lunch?
8
![Page 10: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/10.jpg)
Are huge pages a free lunch?
8
![Page 11: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/11.jpg)
Are huge pages a free lunch?
8
![Page 12: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/12.jpg)
Huge page pathologies in Linux
• High page fault latency
• Memory bloating
• Unfair huge page allocation
• Uncoordinated memory management
9
![Page 13: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/13.jpg)
Huge page pathologies in Linux
• High page fault latency
• Memory bloating
• Unfair huge page allocation
• Uncoordinated memory management
10
![Page 14: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/14.jpg)
Ingens Efficient huge page management system
11
Linux Ingens
Synchronous allocation
Asynchronous allocation
Greedy allocation Spatial utilization based allocation
How to allocate huge pages?
Problems
High page fault latency
Memory bloating
![Page 15: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/15.jpg)
High page fault latency
12
![Page 16: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/16.jpg)
Huge page allocation increases page fault latency
• Page allocation path of both base and huge page
13
Allocate page(s) Get page(s) from free page list
Zero the page(s)Map the page(s) to page table
Page fault handler Physical memory manager
Application pause
Application resume
Page fault latency • 4KB page : 3.6 us • 2MB page : 378.0 us (mostly from page zeroing) • Increases tail latency
![Page 17: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/17.jpg)
Huge page allocation might require extra memory copying
• Page allocation path of huge page
14
Get page(s) from free page list
Zero the page(s)Map the page(s) to page table
Page fault handler Physical memory manager
Application pause
Application resume
Allocate page(s)
![Page 18: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/18.jpg)
Huge page allocation might require extra memory copying
• Page allocation path of huge page
14
Get page(s) from free page list
Zero the page(s)Map the page(s) to page table
Page fault handler Physical memory manager
Not enough contiguous memory
Application pause
Application resume
Allocate page(s)
![Page 19: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/19.jpg)
External fragmentation
15
Not enough contiguous memory
![Page 20: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/20.jpg)
External fragmentation
15
Virtual address
Physical address
Huge page boundary B
B
B
B
B
B Allocated Base page
Not enough contiguous memory
B• As system ages, physical memory is
fragmented
• 2 minutes to fragment 24 GB
• All memory sizes eventually fragment
• Linux compacts physical memory to create contiguous pages
![Page 21: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/21.jpg)
External fragmentation
15
Virtual address
Physical address
Huge page boundary
B
B
B
B Allocated Base page
Not enough contiguous memory
B
B• As system ages, physical memory is
fragmented
• 2 minutes to fragment 24 GB
• All memory sizes eventually fragment
• Linux compacts physical memory to create contiguous pages
![Page 22: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/22.jpg)
External fragmentation
• As system ages, physical memory is fragmented
• 2 minutes to fragment 24 GB
• All memory sizes eventually fragment
• Linux compacts physical memory to create contiguous pages
Virtual address
Physical address
Huge page boundary
B
B
B
B
B
B Allocated Base page
Not enough contiguous memory
H
![Page 23: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/23.jpg)
External fragmentationNot enough
contiguous memory
![Page 24: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/24.jpg)
Huge page allocation might require extra memory copying
• Page allocation path of huge page includes memory compaction
17
Get page(s) from free page list
Zero the page(s)Map the page(s) to page table
Page fault handler Physical memory manager
Not enough contiguous memory
Application pause
Application resume
Allocate page(s)
![Page 25: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/25.jpg)
Huge page allocation might require extra memory copying
• Page allocation path of huge page includes memory compaction
17
Get page(s) from free page list
Zero the page(s)Map the page(s) to page table
Page fault handler Physical memory manager
Not enough contiguous memory
Compact physical memory
Application pause
Application resume
Allocate page(s)
![Page 26: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/26.jpg)
Huge page allocation might require extra memory copying
• Page allocation path of huge page includes memory compaction
17
Get page(s) from free page list
Zero the page(s)Map the page(s) to page table
Page fault handler Physical memory manager
Not enough contiguous memory
Compact physical memory
Compaction may or may not succeed
Application pause
Application resume
Allocate page(s)
![Page 27: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/27.jpg)
Ingens: asynchronous allocation
18
Page fault handler
Asynchronous promotion
• Page fault handler only allocates base pages
• Huge page allocation in background
• Memory compaction in background
• No extra page fault latency • No huge page zeroing • No compaction
bit vector 1 bit per
base page
Read/update on each base page fault
Promotion Kernel thread
Fast page fault handling
![Page 28: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/28.jpg)
Page fault latency experiment• Machine specification
• Two Intel Xeon E5-2640 2.60GHz CPUs
• 64GB memory and two 250 MB SSDs
• Cloudstone workload (latency sensitive)
• Web service for social event planning
• nginx/PHP/MySQL running in virtual machines
• 85% read, 10% login, 5% write workloads
• 2 of 7 web pages modified to use modern web page sizes • The average web page is 2.1 MB
https://www.soasta.com/blog/page-bloat-average-web-page-2-mb/
19
![Page 29: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/29.jpg)
Cloudstone result
• Memory is highly fragmented
• Ingens reduces • average latency up to 29.2% • tail latency up to 41.4%
• Linux page fault handler performs 461,383 memory compactions
20
Linux Ingens
922.3 1091.9 (+18%)
Throughput (requests/s)
Latency (millisecond)
0
100
200
300
400
500
600
Avg. 90th Avg. 90th
LinuxIngens
View event Visit home page
Bette
r
![Page 30: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/30.jpg)
Cloudstone result
• Memory is highly fragmented
• Ingens reduces • average latency up to 29.2% • tail latency up to 41.4%
• Linux page fault handler performs 461,383 memory compactions
20
Linux Ingens
922.3 1091.9 (+18%)
Throughput (requests/s)
Latency (millisecond)
0
100
200
300
400
500
600
Avg. 90th Avg. 90th
LinuxIngens
View event Visit home page
Bette
r
![Page 31: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/31.jpg)
Memory bloating
21
Application occupies more memory than it uses
![Page 32: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/32.jpg)
Internal fragmentation
• Greedy allocation in Linux
• Allocate a huge page on first fault to huge page region
• The huge page region may not be fully used
• Greedy allocation causes severe internal fragmentation
• Memory use often sparse
22
Virtual address
Physical address
Huge page boundary
H
H
Used virtual address
Unused virtual address
Huge page region
H
![Page 33: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/33.jpg)
Memory bloating experiment
• Redis
• Delete 70% objects after populating 8KB objects
• MongoDB
• 15 million get requests for 1KB object with YCSB
23
Using huge page
Using only base page
Redis 20.7GB (+69%)
12.2GB
MongoDB 12.4GB (+23%)
10.1GB
Physical memory consumption
Bloating makes memory consumption unpredictable Memory-intensive applications can’t provision to avoid swap
![Page 34: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/34.jpg)
Ingens: Spatial utilization based allocation
• Ingens monitors spatial utilization of each huge page region
• Utilization-based allocation
• Page fault handler requests promotion when the utilization is beyond a threshold (e.g., 90%)
• Bounds the size of internal fragmentation
24
Virtual address
Physical address
H
B
B
B
B
100% utilization
75% utilization
25% utilization
![Page 35: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/35.jpg)
Redis memory bloating experiment
25
Physical memory consumption
GET throughput
12.2 GB
Linux (base only)
20.7 GB
Linux (huge)Ingens
Linux (base only) Linux (huge)
21.7K19.0K
Ingens
12.3 GB
20.9K
- 4%+ 10%
Better
Better
Huge : 2MB page Base : 4KB page
![Page 36: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/36.jpg)
Ingens overhead• Overhead for memory intensive application
• Overhead for non-memory intensive application
26
429.mcf Graph Spark Canneal SVM Redis MongoDB
0.9% 0.9% 0.6% 1.9% 1.3% 0.2% 0.6%
Kernel build Grep Parsec 3.0 Benchmark
0.2% 0.4% 0.8%
Ingens overhead is negligible
![Page 37: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/37.jpg)
27
Linux Ingens
Synchronous allocation
Asynchronous allocation
Greedy allocation Spatial utilization based allocation
Ingens Make huge pages widely used in practice
Source code is available at https://github.com/ut-osa/ingens
Advantages
No extra page fault latency
Bound memory bloating
![Page 38: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/38.jpg)
Backup slides
28
![Page 39: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/39.jpg)
Other operating systems• Window, MacOS
• Does not support transparent huge page
• FreeBSD
• Very conservative approach
• No memory compaction functionality
• Performance speedup in Linux and FreeBSD
29
SVM Canneal RedisFreeBSD 1.28 1.13 1.02Linux 1.30 1.21 1.15Ingens 1.29 1.19 1.15
![Page 40: Coordinated and Efficient Huge Page Management with Ingens · Christopher J. Rossbach, and Emmett Witchel 1. High address translation cost • Modern applications: large memory footprint,](https://reader033.fdocuments.net/reader033/viewer/2022060600/605397bf5c338c4e012399c9/html5/thumbnails/40.jpg)
• User-controlled huge page management • Admin reserves huge page in advance • New APIs for memory allocation/deallocation • It could fail to reserve huge pages when memory is
fragmented
• Transparent huge page management • Developers do not know about huge page • OS Transparently allocates/deallocates huge pages • OS manages memory fragmentation
30
Operating system support for huge pages