HawkEye: Efficient Fine-grained OS Support for Huge Pages
Ashish Panwar¹, Sorav Bansal², K. Gopinath¹
¹ Indian Institute of Science (IISc), Bangalore
² Indian Institute of Technology, Delhi
Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2019
[Figure: mappings from the virtual address space to the physical address space]
▪ Too much TLB pressure!
▪ Huge pages: fewer misses
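The effect of page size on TLB pressure can be made concrete with a little arithmetic. The sketch below computes TLB reach (number of entries times page size) for 4-KB and 2-MB pages. The entry count of 1536 is an assumed, illustrative value and does not come from the slides.

```python
# TLB reach = number of TLB entries * page size.
# ASSUMPTION: 1536 entries is an illustrative TLB size, not a figure from the talk.
KB = 1024
MB = 1024 * KB

def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of address space the TLB can map without taking a miss."""
    return entries * page_size

TLB_ENTRIES = 1536
base_reach = tlb_reach(TLB_ENTRIES, 4 * KB)  # 4-KB base pages
huge_reach = tlb_reach(TLB_ENTRIES, 2 * MB)  # 2-MB huge pages

print(base_reach // MB, "MB reach with 4-KB pages")
print(huge_reach // MB, "MB reach with 2-MB pages")
print(huge_reach // base_reach, "x larger reach with huge pages")
```

Whatever the entry count, the ratio between the two reaches is fixed by the page sizes (2 MB / 4 KB = 512).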
OS Challenges
❑ Complex trade-offs
• Memory bloat vs. performance
• Page fault latency vs. the number of page faults
❑ Challenges due to (external) fragmentation
• How to leverage limited memory contiguity
• Fairness in huge page allocation
Memory bloat vs. performance
Internal fragmentation
[Figure: huge page mapping from virtual memory to physical memory under aggressive and conservative allocation]
• Aggressive allocation: unused pages (bloat)
• Conservative allocation: lower TLB reach (impacts performance)
Bloat vs. performance
▪ Aggressive: higher performance, higher bloat
▪ Conservative: lower performance, lower bloat
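To put a number on the bloat side of this trade-off, the sketch below compares the memory backing a single 2-MB region under the two policies. It is a simplified model: the count of touched base pages (10) is a hypothetical input, and "aggressive" is taken to mean that the whole huge page is mapped on first touch.

```python
BASE_PAGE = 4 * 1024
HUGE_PAGE = 2 * 1024 * 1024
PAGES_PER_HUGE = HUGE_PAGE // BASE_PAGE  # 512 base pages per huge page

def resident_bytes(touched_pages: int, aggressive: bool) -> int:
    """Memory backing one huge-page-sized region after some base pages are touched."""
    if touched_pages == 0:
        return 0
    if aggressive:
        return HUGE_PAGE               # whole huge page mapped on first touch
    return touched_pages * BASE_PAGE   # only the touched base pages are mapped

def bloat_bytes(touched_pages: int, aggressive: bool) -> int:
    """Resident memory that the application never touched."""
    return resident_bytes(touched_pages, aggressive) - touched_pages * BASE_PAGE

TOUCHED = 10  # ASSUMPTION: hypothetical sparse access pattern
print(bloat_bytes(TOUCHED, aggressive=True) // 1024, "KB bloat (aggressive)")
print(bloat_bytes(TOUCHED, aggressive=False) // 1024, "KB bloat (conservative)")
```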
Latency vs. # page faults
▪ Page fault handling: find a page, zero-fill, map
• 4-KB page: zero-fill takes 25% of the fault latency
• 2-MB page: dominated by zero-filling (97%)
▪ Aggressive: high latency, fewer faults
▪ Conservative: low latency, more faults
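The two sides of this trade-off can be illustrated with simple counting. The sketch below computes how many page faults are needed to populate a region and how many bytes each fault has to zero-fill. The 1-GB region size is an assumption chosen for illustration.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def faults_to_populate(region_bytes: int, page_size: int) -> int:
    """Page faults needed to touch every page of a region (ceiling division)."""
    return -(-region_bytes // page_size)

REGION = 1 * GB  # ASSUMPTION: illustrative region size

faults_4k = faults_to_populate(REGION, 4 * KB)
faults_2m = faults_to_populate(REGION, 2 * MB)

print(faults_4k, "faults with 4-KB pages, each zero-filling", 4 * KB, "bytes")
print(faults_2m, "faults with 2-MB pages, each zero-filling", 2 * MB, "bytes")
```

The total number of bytes zeroed is the same in both cases; huge pages take far fewer faults but make each individual fault much longer.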
Conservative vs. aggressive

| Tradeoff | Metric | FreeBSD | Linux |
|---|---|---|---|
| Tradeoff-1 | Memory bloat | Low | High |
| Tradeoff-1 | Performance | Low | High |
| Tradeoff-2 | Allocation latency | Low | High |
| Tradeoff-2 | # page faults | High | Low |

Current systems favor opposite ends of the design spectrum
• FreeBSD is conservative (compromise on performance)
• Linux is throughput-oriented (compromise on latency and bloat)
Ingens (OSDI'16)
▪ Asynchronous allocation
• Huge pages allocated in the background
• Low latency, but too many page faults
▪ Utilization-threshold based allocation
• Tunable bloat vs. performance
• Adaptive based on memory pressure
• Manual tuning
▪ Fairness driven by per-process fairness metric
• Heuristic based on past behavior
• Weak correlation with page walk overhead
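Utilization-threshold based allocation can be sketched as follows. This is a simplified model and not the Ingens code: a huge-page-sized region is promoted once the fraction of its populated base pages reaches a threshold. The region names, the populated-page counts, and the threshold values are all hypothetical.

```python
PAGES_PER_HUGE = 512  # 2-MB huge page / 4-KB base page

def should_promote(populated_pages: int, threshold: float) -> bool:
    """Promote a region once enough of its base pages are populated."""
    return populated_pages / PAGES_PER_HUGE >= threshold

# ASSUMPTION: hypothetical regions with their populated base-page counts.
regions = {"A": 512, "B": 461, "C": 128}

def promoted(threshold: float) -> list:
    """Names of the regions that qualify for a huge page at this threshold."""
    return [name for name, pages in regions.items()
            if should_promote(pages, threshold)]

print(promoted(0.90))  # high threshold: less bloat, fewer huge pages
print(promoted(0.25))  # low threshold: more huge pages, more bloat
```

The single knob trades bloat against performance, which is why a good value depends on the application and its current phase.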
Current state-of-the-art

| Tradeoff | Metric | FreeBSD | Linux | Ingens |
|---|---|---|---|---|
| Tradeoff-1 | Memory bloat | Low | High | Tunable |
| Tradeoff-1 | Performance | Low | High | Tunable |
| Tradeoff-2 | Allocation latency | Low | High | Low |
| Tradeoff-2 | # page faults | High | Low | High |

▪ Hard to find the sweet-spot for utilization-threshold in Ingens
• Application dependent, phase dependent
HawkEye
Key Optimizations
➢ Asynchronous page pre-zeroing[1]
➢ Content deduplication based bloat mitigation
➢ Fine-grained intra-process allocation
➢ Fairness driven by hardware performance counters
[1] Optimizing the Idle Task and Other MMU Tricks, OSDI'99
Asynchronous page pre-zeroing
▪ Pages zero-filled in the background
▪ Potential issues:
• Cache pollution: leverage non-temporal writes
• DRAM bandwidth consumption: rate-limited
o Limit CPU utilization (e.g., 5%)
▪ Enables aggressive allocation with low latency
✓ 13.8x faster VM spin-up
✓ 1.26x higher throughput (Redis)
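The idea behind background pre-zeroing can be modeled in a few lines. The sketch below is a user-space toy, not the kernel implementation: `PreZeroingPool` and `pages_per_tick` are hypothetical names, with `pages_per_tick` standing in for the rate limit. Non-temporal writes cannot be expressed in Python and are not modeled.

```python
from collections import deque

PAGE_SIZE = 4096

class PreZeroingPool:
    """Toy free-page pool with a rate-limited background zeroing step."""

    def __init__(self, free_pages, pages_per_tick):
        self.dirty = deque(free_pages)        # free, but not yet zeroed
        self.zeroed = deque()                 # free and already zero-filled
        self.pages_per_tick = pages_per_tick  # rate limit for background work

    def background_tick(self) -> int:
        """Zero at most pages_per_tick pages; returns how many were zeroed."""
        done = 0
        while self.dirty and done < self.pages_per_tick:
            page = self.dirty.popleft()
            page[:] = bytes(len(page))        # zero-fill in place
            self.zeroed.append(page)
            done += 1
        return done

    def allocate(self):
        """Return (page, zeroed_in_fault_path)."""
        if self.zeroed:
            return self.zeroed.popleft(), False   # fast path: already zeroed
        if not self.dirty:
            raise MemoryError("no free pages")
        page = self.dirty.popleft()
        page[:] = bytes(len(page))                # slow path: zero at fault time
        return page, True

pool = PreZeroingPool([bytearray(b"\xff" * PAGE_SIZE) for _ in range(4)],
                      pages_per_tick=2)
pool.background_tick()
page, slow = pool.allocate()
print("zeroed in fault path:", slow)
```

When the background step keeps up with demand, allocations take the fast path and the zero-fill cost disappears from fault latency.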
Mitigating bloat
▪ Observation: Unused base pages remain zero-filled
▪ Identify bloat by scanning memory
▪ Dedup zero-filled base pages to remove bloat
[Figure: huge page mapping from virtual memory to physical memory; unused base pages are zero-filled]
▪ Ease of detecting non-zero pages
[Figure: bar chart; axis labels: distance (bytes), offset (bytes)]
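Scanning for zero-filled base pages can be sketched as below. This is an illustrative user-space model rather than the HawkEye kernel code: the function names are hypothetical, and the example uses an 8-page buffer in place of a real 2-MB huge page. The scan of a page stops at its first non-zero byte, so pages that are in use are rejected cheaply.

```python
BASE_PAGE = 4096

def first_nonzero_offset(page) -> int:
    """Offset of the first non-zero byte, or -1 if the page is all zeros.

    The loop exits at the first non-zero byte it meets.
    """
    for offset, byte in enumerate(page):
        if byte != 0:
            return offset
    return -1

def zero_filled_pages(region) -> list:
    """Indices of base pages in the region that are entirely zero-filled."""
    view = memoryview(region)  # slice without copying
    return [start // BASE_PAGE
            for start in range(0, len(view), BASE_PAGE)
            if first_nonzero_offset(view[start:start + BASE_PAGE]) == -1]

# ASSUMPTION: an 8-page region stands in for a 512-page huge page.
region = bytearray(8 * BASE_PAGE)
region[0] = 1                  # page 0 is used at offset 0
region[3 * BASE_PAGE + 9] = 7  # page 3 is used at offset 9

candidates = zero_filled_pages(region)
print("dedup candidates:", candidates)
```

The pages listed as candidates are the ones that could be mapped to a shared zero page to recover the bloat.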
✓ Automated "bloat vs. performance" management
[Figure: Redis RSS (GB) over time (seconds) for Linux, Ingens and HawkEye across three phases (P1: insert, P2: delete, P3: insert); annotations: out-of-memory, success]

| Tradeoff | Metric | FreeBSD | Linux | Ingens | HawkEye |
|---|---|---|---|---|---|
| Tradeoff-1 | Memory bloat | Low | High | Tunable | Automated |
| Tradeoff-1 | Performance | Low | High | Tunable | Automated |
| Tradeoff-2 | Allocation latency | Low | High | Low | Low |
| Tradeoff-2 | # page faults | High | Low | High | Low |

Fine-grained (intra-process) allocation
▪ Maximizing performance with limited contiguity
▪ access-coverage: # base pages accessed per second
❖ A good indicator of TLB-contention due to a region
[Figure: access-coverage across the regions of XSBench, highlighting hot regions]
▪ Track access-coverage (access_map)
▪ Allocate in the sorted order (top to bottom)
✓ Yields higher profit per allocation
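Allocating in sorted order of access-coverage can be sketched as follows. A plain sort stands in for whatever structure the real access_map uses. The region names and coverage values are hypothetical.

```python
def allocation_order(access_coverage: dict) -> list:
    """Regions sorted by access-coverage, hottest first."""
    return sorted(access_coverage, key=access_coverage.get, reverse=True)

def allocate_huge_pages(access_coverage: dict, available: int) -> list:
    """Give the limited supply of huge pages to the hottest regions."""
    return allocation_order(access_coverage)[:available]

# ASSUMPTION: hypothetical regions and their access-coverage
# (base pages accessed per second within each huge-page-sized region).
coverage = {"heap_0": 40, "heap_1": 480, "stack": 5, "heap_2": 300}

chosen = allocate_huge_pages(coverage, available=2)
print("regions promoted first:", chosen)
```

With only two huge pages available, they go to the two regions that contribute the most TLB contention, which is where each allocation pays off most.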
[Figure: MMU overhead (%) over time (seconds) for Linux, Ingens and HawkEye; workload: XSBench; additional axis labels: page walk overhead (%), access-coverage]
[Figure: execution time (ms) saved per huge page allocation for Graph500, XSBench and NPB_CG.D under Linux, Ingens and HawkEye]
Fair (inter-process) allocation
▪ Prioritize allocation to the process with the highest expected improvement
▪ How to estimate page walk overhead?
• Profile hardware performance counters
• Low cost, accurate!
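A minimal sketch of counter-driven prioritization is shown below. It assumes that each process has already been sampled for two quantities, cycles spent in page walks and total cycles. How those are read from hardware is not modeled, and the process names and sample values are hypothetical.

```python
def mmu_overhead(walk_cycles: int, total_cycles: int) -> float:
    """Page walk overhead as a percentage of total cycles."""
    if total_cycles == 0:
        return 0.0
    return 100.0 * walk_cycles / total_cycles

def next_process(samples: dict) -> str:
    """Pick the process with the highest measured page walk overhead."""
    return max(samples, key=lambda name: mmu_overhead(*samples[name]))

# ASSUMPTION: hypothetical (walk_cycles, total_cycles) samples per process.
samples = {
    "proc_a": (300, 1000),  # 30% overhead: TLB-sensitive
    "proc_b": (20, 1000),   # 2% overhead: TLB-insensitive
    "proc_c": (120, 1000),  # 12% overhead
}

print("allocate huge pages first to:", next_process(samples))
```

Measured overhead serves here as the proxy for expected improvement: a process that spends little time in page walks has little to gain from huge pages.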
[Figure: % speedup for cactusADM, tigr, Graph500, lbm_s, SVM, XSBench and CG.D under Linux, Ingens and HawkEye; workloads running alongside a TLB-insensitive process]
Summary
▪ OS support for huge pages involves complex tradeoffs
▪ Balancing fine-grained control with high performance
▪ Dealing with fragmentation for efficiency and fairness
HawkEye: Resolving fundamental conflicts for huge page optimizations
https://github.com/apanwariisc/HawkEye