ReViveI/O: Efficient Handling of I/O in Highly-Available Rollback-Recovery Servers
Jun Nakano✢, Pablo Montesinos, Kourosh Gharachorloo❆, Josep Torrellas
University of Illinois at Urbana-Champaign, ✢IBM, ❆Google
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Building Fault-Tolerant Systems
• Redundant, self-checking HW: high overhead, small/null MTTR (HP Nonstop, IBM S/390)
• Plain HW + SW-based checkpointing: high overhead, significant MTTR (KeyKOS, FT Mach)
• HW-based high-frequency checkpointing: low overhead, small MTTR (ReVive, SafetyNet)
A glimpse of ReVive (ISCA 2002)
• Goal: recover the memory state of a shared-memory machine in < 1s
• HW-assisted, very frequent checkpoints (20ms < T < 100ms)
• During checkpoints:
  • Write back dirty data from caches
  • Main memory is the checkpoint state
• Between checkpoints:
  • HW logs overwritten data in memory when it is modified for the first time
• Entire main memory protected by distributed parity
  • Like RAID-5, but in memory
  • Can tolerate the loss of a node
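The logging rule on this slide (save the old value only on the first modification after a checkpoint, then use the log to roll back) can be illustrated with a small software model. This is a minimal sketch, not ReVive's hardware design: the class `CheckpointedMemory` and its method names are hypothetical, and cache write-back and distributed parity are left out.

```python
class CheckpointedMemory:
    """Toy software model of undo logging between checkpoints (hypothetical names)."""

    def __init__(self, size):
        self.mem = [0] * size      # main memory: holds the checkpoint state
        self.undo_log = {}         # address -> value at the last checkpoint

    def write(self, addr, value):
        # Log the old value only on the first write since the last checkpoint.
        if addr not in self.undo_log:
            self.undo_log[addr] = self.mem[addr]
        self.mem[addr] = value

    def checkpoint(self):
        # Memory is now the new checkpoint; earlier log entries are obsolete.
        self.undo_log.clear()

    def rollback(self):
        # Restore every logged location to its checkpoint-time value.
        for addr, old in self.undo_log.items():
            self.mem[addr] = old
        self.undo_log.clear()


m = CheckpointedMemory(4)
m.write(0, 10)
m.checkpoint()
m.write(0, 20)
m.write(0, 30)   # second write to the same address: not logged again
m.write(1, 5)
m.rollback()
print(m.mem)     # [10, 0, 0, 0]
```

Because only the first overwrite per location is logged, the log size depends on how many distinct locations are touched in an interval, not on the number of writes.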
Building Fault-Tolerant Systems
• Redundant, self-checking HW: expensive
• Plain HW + SW-based checkpointing: significant MTTR
• HW-based high-frequency checkpointing: modest HW, no I/O support (until today)
Why is I/O undo/redo hard?
• Output Commit Problem:
  • An output to the external world cannot be rolled back
  • How do you undo the I/O?
[Figure: Server, Disk, and Network timelines between Checkpoint n and Checkpoint n+1, showing a disk write, a network send, and a fault (X)]
Contribution: ReViveI/O
• Support I/O undo/redo in rollback-recovery SMP servers
• Targeted to throughput-oriented workloads
• Disk and network only
• Transparent to OS and applications
• Low overhead during fault-free operation (<1% over ReVive)
• Fast recovery (< 1s even if processor lost)
• Small impact on hardware cost
Approach: Pseudo-Device Driver
• Masubuchi et al., FTCS-97
• Delays outputs until the next checkpoint
• Commits outputs after the next checkpoint
• No need to modify the OS or applications
• Network and Disk PDDs
[Figure: software stack with Kernel, Pseudo-Device Driver (PDD), Device Driver, and Device]
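The delay-then-commit behavior described on this slide can be sketched as a buffer that holds outputs until a checkpoint completes. This is an illustration under stated assumptions, not the actual PDD: the class `PseudoDeviceDriver`, its hooks `on_checkpoint` and `on_rollback`, and the `device_send` callback are hypothetical names, and the real driver interface is not shown in the slides.

```python
class PseudoDeviceDriver:
    """Toy output-commit buffer; `device_send` stands in for the real driver."""

    def __init__(self, device_send):
        self.device_send = device_send
        self.pending = []          # outputs issued since the last checkpoint

    def output(self, data):
        # Delay: hold the output instead of handing it to the device.
        self.pending.append(data)

    def on_checkpoint(self):
        # Commit: the state that produced these outputs is now checkpointed.
        committed, self.pending = self.pending, []
        for data in committed:
            self.device_send(data)

    def on_rollback(self):
        # Undo: uncommitted outputs never reached the device, so drop them.
        self.pending.clear()


sent = []
pdd = PseudoDeviceDriver(sent.append)
pdd.output("a")
pdd.on_checkpoint()      # "a" is released to the device
pdd.output("b")
pdd.on_rollback()        # "b" is discarded; re-execution regenerates it
pdd.output("b2")
pdd.on_checkpoint()
print(sent)              # ['a', 'b2']
```

Since nothing leaves the buffer before its checkpoint, a rollback never has to retract an output that the external world has already seen.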
Fault Model
• Fault! Is the memory recoverable by ReVive (MR)?
  • Yes:
    • ReVive restores memory from previous checkpoint
    • PDD recovers I/O transparently
  • No:
    • Fix HW and reboot
    • PDD makes the I/O state consistent
    • Conventional recovery by application (e.g. DB)
• MR = transients, or 1 permanent fault with at most 1 node loss
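The two recovery paths in this fault model can be written as a small decision function. This is a sketch of one reading of the slide: the function names and the fault counters (`permanent_faults`, `nodes_lost`) are hypothetical, and the MR test is an interpretation of "transients or 1 permanent with at most 1 node loss".

```python
def is_memory_recoverable(permanent_faults, nodes_lost):
    # MR: transients only, or one permanent fault with at most one node lost.
    return permanent_faults <= 1 and nodes_lost <= 1


def recovery_plan(memory_recoverable):
    """Return the recovery steps for a fault, following the slide's two branches."""
    if memory_recoverable:
        return [
            "ReVive restores memory from previous checkpoint",
            "PDD recovers I/O transparently",
        ]
    return [
        "Fix HW and reboot",
        "PDD makes the I/O state consistent",
        "Conventional recovery by application (e.g. DB)",
    ]


# A transient fault takes the transparent path.
print(recovery_plan(is_memory_recoverable(permanent_faults=0, nodes_lost=0)))
# Losing two nodes falls back to reboot plus application-level recovery.
print(recovery_plan(is_memory_recoverable(permanent_faults=1, nodes_lost=2)))
```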
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
![Page 58: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/58.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
PKTi+1
![Page 59: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/59.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
PKTi+1
![Page 60: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/60.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
PKTi+1
ACKi+1
![Page 61: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/61.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
PKTi+1
ACKi+1
ACKi+1
![Page 62: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/62.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
PKTi+1
ACKi+1
ACKi+1x
![Page 63: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/63.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
PKTi+1
ACKi+1
x
![Page 64: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/64.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
PKTi+1
ACKi+1
![Page 65: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/65.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
• TCP eliminates duplicate packages
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
PKTi+1
ACKi+1
![Page 66: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/66.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Network PDD and TCP
15
ACKi
• ReViveI/O leverages the properties of TCP to support network undo/redo
• TCP avoids saving packages for inputs: the sender timeouts and retransmits
• TCP eliminates duplicate packages
Network PDD
Kernel, App
Client
CKPTi CKPTi+1 CKPTi+2 CKPTi+3 CKPTi+4 CKPTi+5
Time
PKTi
ACKi
PKTi+1
ACKi+1
ACKi+1
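The two TCP properties named on this slide can be illustrated with a toy model. This is only a sketch and not the ReViveI/O implementation or real TCP: the `Sender` and `Receiver` classes, the `drop_ack_for` parameter, and the packet names are hypothetical, and the "timeout" is modeled simply as a missing acknowledgment.

```python
class Receiver:
    """Server side: delivers each sequence number to the application once."""

    def __init__(self):
        self.next_seq = 0        # next sequence number we expect
        self.delivered = []      # payloads handed to the application

    def on_packet(self, seq, payload):
        if seq == self.next_seq:             # new in-order data: deliver it
            self.delivered.append(payload)
            self.next_seq += 1
        # A duplicate (seq < next_seq) is dropped but still acknowledged.
        return self.next_seq                 # cumulative ACK


class Sender:
    """Client side: keeps unacknowledged data and retransmits it."""

    def __init__(self, payloads):
        self.payloads = payloads
        self.acked = 0                       # everything below this is ACKed

    def send_all(self, receiver, drop_ack_for=()):
        lost = set(drop_ack_for)             # hypothetical: ACKs lost once
        transmissions = 0
        while self.acked < len(self.payloads):
            seq = self.acked
            ack = receiver.on_packet(seq, self.payloads[seq])
            transmissions += 1
            if seq in lost:                  # no ACK arrives: time out, resend
                lost.discard(seq)
                continue
            self.acked = ack
        return transmissions


receiver = Receiver()
sender = Sender(["PKT0", "PKT1"])
count = sender.send_all(receiver, drop_ack_for=[1])
print(count, receiver.delivered)
```

Because the sender still holds unacknowledged data, the receiving side does not need its own saved copy of an input, and the retransmitted copy of `PKT1` is discarded rather than delivered twice.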
![Page 74: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/74.jpg)
Disk PDD Schemes

| Scheme | On I/O write issue | After next checkpoint |
| --- | --- | --- |
| Stall | Stall the write | Write data to disk |
| Logging | Copy old data elsewhere in disk; write data in place | Delete pointer to old data |
| Buffering | Write data to disk buffer and mem | Copy data from mem to disk |
| Renaming | Write data to renamed disk location; save old logical → physical mapping | Delete old mapping |
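As a rough illustration of the Logging and Renaming rows, the sketch below models a disk as a Python dictionary. The class names, method names, and block labels are hypothetical and chosen only for this example; the slides do not show how the Disk PDD is actually implemented.

```python
class LoggingDisk:
    """Logging: copy old data elsewhere, then write the new data in place."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.undo_log = []                    # (block, old data) pairs

    def write(self, block, data):
        self.undo_log.append((block, self.blocks.get(block)))
        self.blocks[block] = data             # write data in place

    def checkpoint(self):
        self.undo_log.clear()                 # delete pointers to old data

    def rollback(self):
        while self.undo_log:                  # undo the newest write first
            block, old = self.undo_log.pop()
            if old is None:
                self.blocks.pop(block, None)
            else:
                self.blocks[block] = old


class RenamingDisk:
    """Renaming: write to a fresh location and remember the old mapping."""

    def __init__(self, blocks):
        self.physical = {}                    # physical location -> data
        self.mapping = {}                     # logical block -> physical location
        self.old_mappings = []                # saved (logical, old location)
        self.next_free = 0
        for logical, data in blocks.items():
            self.mapping[logical] = self._alloc(data)

    def _alloc(self, data):
        loc = self.next_free
        self.next_free += 1
        self.physical[loc] = data
        return loc

    def read(self, logical):
        return self.physical[self.mapping[logical]]

    def write(self, logical, data):
        self.old_mappings.append((logical, self.mapping.get(logical)))
        self.mapping[logical] = self._alloc(data)

    def checkpoint(self):
        for _, old_loc in self.old_mappings:  # delete old mappings and data
            if old_loc is not None:
                self.physical.pop(old_loc, None)
        self.old_mappings.clear()

    def rollback(self):
        while self.old_mappings:              # restore mappings, newest first
            logical, old_loc = self.old_mappings.pop()
            del self.physical[self.mapping[logical]]
            if old_loc is None:
                del self.mapping[logical]
            else:
                self.mapping[logical] = old_loc


log_disk = LoggingDisk({"A": "old"})
log_disk.write("A", "new")
log_disk.rollback()
print(log_disk.blocks["A"])

ren_disk = RenamingDisk({"A": "old"})
ren_disk.write("A", "new")
ren_disk.checkpoint()
print(ren_disk.read("A"))
```

In this toy model, Logging pays an extra copy on every write, while Renaming leaves the old data untouched and only changes the mapping. Whether that trade-off holds for the real schemes depends on details the slide does not give.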
![Page 95: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/95.jpg)
Disk PDD
[Figure: timeline with rows for the Network PDD, Kernel/App, Client, and Disk PDD, plus the Memory Buffer, Disk Buffer, and Target Disk. Checkpoints CKPTi, CKPTi+2, CKPTi+3, CKPTi+4, and CKPTi+5 are marked along the time axis. The build steps follow an update request and its response, first through a fault labeled "MR Fault" and then through a fault labeled "Non MR Fault"; each fault is marked x. In the final "Non MR Fault" builds only the Disk Buffer and Target Disk remain.]
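The Memory Buffer, Disk Buffer, and Target Disk in this figure appear to correspond to the Buffering scheme, in which a write goes to a disk buffer and to memory and is copied to the target disk only after the next checkpoint. A minimal sketch under that reading follows. The class `BufferingPDD`, its method names, and the `memory_intact` flag are hypothetical, and the fallback to the disk buffer when the memory copy is unavailable is an assumption rather than something the slide states.

```python
class BufferingPDD:
    """Toy disk pseudo-device driver that buffers writes until a checkpoint."""

    def __init__(self):
        self.target_disk = {}     # data visible on the real disk
        self.memory_buffer = []   # (block, data) writes held in memory
        self.disk_buffer = []     # the same writes, also saved in a disk buffer

    def write(self, block, data):
        # On I/O write issue: write the data to the disk buffer and to memory.
        self.memory_buffer.append((block, data))
        self.disk_buffer.append((block, data))

    def checkpoint_committed(self, memory_intact=True):
        # After the next checkpoint: copy the buffered data to the target disk.
        # Assumption: use the disk buffer if the memory copy is unavailable.
        source = self.memory_buffer if memory_intact else self.disk_buffer
        for block, data in source:
            self.target_disk[block] = data
        self.memory_buffer.clear()
        self.disk_buffer.clear()

    def rollback(self):
        # Buffered writes never reached the target disk, so undoing them
        # only requires discarding both buffers.
        self.memory_buffer.clear()
        self.disk_buffer.clear()


pdd = BufferingPDD()
pdd.write("blk7", "v1")
pdd.checkpoint_committed()     # v1 becomes visible on the target disk
pdd.write("blk7", "v2")
pdd.rollback()                 # v2 is discarded before reaching the disk
print(pdd.target_disk)
```

The point of the sketch is that the target disk only ever holds data from committed checkpoints, so a rollback never has to repair it.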
![Page 96: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/96.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Recovery: ReVive + ReViveI/O
19
![Page 97: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/97.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Recovery: ReVive + ReViveI/O
19
Useful Work
CKP
Time
![Page 98: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/98.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Recovery: ReVive + ReViveI/O
19
Useful Work
CKP
Time 100ms
![Page 99: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/99.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Recovery: ReVive + ReViveI/O
20
Lost Work
Useful Work
CKP
Time 100ms
X
![Page 100: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/100.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Recovery: ReVive + ReViveI/O
20
Lost Work
Useful Work
CKP
Time 100ms
Detection
80ms
X
![Page 101: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/101.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Recovery: ReVive + ReViveI/O
20
Lost Work
Useful Work
CKP
Time 100ms
Detection
80ms
•Self-Check•Rerouting
50ms
X
![Page 102: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/102.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Recovery: ReVive + ReViveI/O
20
Lost Work
ReVive Recovery
Useful Work
CKP
Time 100ms
Detection
80ms
•Self-Check•Rerouting
50ms
ReconstructLost Log
~100ms
X
![Page 103: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/103.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Recovery: ReVive + ReViveI/O
20
Lost Work
ReVive Recovery
Useful Work
CKP
Time 100ms
Detection
80ms
•Self-Check•Rerouting
50ms
ReconstructLost Log
~100ms
Rollback
~490ms
X
![Page 104: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/104.jpg)
Motivation Concept Implementation Evaluation Conclusions
HPCA 12 - ReVive I/O. Pablo Montesinos. University of Illinois.
Recovery: ReVive + ReViveI/O
20
Lost Work
ReVive Recovery
Machine Unavailable
Degraded Execution
Useful Work
CKP
Time 100ms
Detection
80ms
•Self-Check•Rerouting
50ms
ReconstructLost Data
~seconds
ReconstructLost Log
~100ms
Rollback
~490ms
X
![Page 108: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/108.jpg)
Recovery: ReVive + ReViveI/O
[Figure: recovery timeline along a Time axis, with a CKP marker, a 100ms span, and the fault marked X. Phases shown: Useful Work, Lost Work, Machine Unavailable, Degraded Execution.]
ReVive Recovery
• Detection: 80ms
• Self-Check, Rerouting: 50ms
• Reconstruct Lost Data: ~seconds
• Reconstruct Lost Log: ~100ms
• Rollback: ~490ms
ReVive I/O Recovery
• Reset Device (~1ms)
• Reset Device Driver (~1ms)
• Re-issue Outputs in the Background (<100ms)
![Page 112: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/112.jpg)
Experiments
Clients
• Dual 1.5 GHz Athlon
• Gigabit Ethernet
• 1 GB RAM
Server
• Dual 1.5 GHz Athlon
• Gigabit Ethernet
• 2 GB RAM
+ ReVive PDD
![Page 117: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/117.jpg)
Throughput-oriented workload
• Stalling scheme (simply delaying writes) incurs too much overhead
• ReViveI/O’s overhead is only 0.1% for T<120 ms
[Figure: TPC-C on Oracle9i with 30 Clients. Normalized Throughput (%) (y-axis, ticks 0 to 1.1) vs. Checkpoint interval (ms) (x-axis: 20, 40, 80, 120, 160, 200, 240). Series: Stall, ReViveI/O.]
![Page 121: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/121.jpg)
Integration with ReVive (for TPC-C)
• ReVive + ReViveI/O: only 7% throughput reduction for 60 - 120 ms checkpoint intervals
[Figure: Reduction in Throughput (%) (y-axis, 0 to 12) vs. Checkpoint Interval (ms) (x-axis, 0 to 180). Series: ReVive + ReVive I/O, ReVive Only (estimated).]
![Page 125: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/125.jpg)
Availability
• ReVive I/O beats baseline thanks to its low recovery latency for MR faults
[Figure: Availability (y-axis: 99.9%, 99.99%, 99.999%, 99.9999%, 99.99999%) vs. MTBF_MR (x-axis from 1 hr to 10 years, with ticks including 4 hr, 1 day, 1 week, 1 month, 1 year). Series: Baseline (No Rollback); ReVive + ReViveI/O with MTBF_NMR = 1000 x MTBF_MR.]
![Page 130: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/130.jpg)
Conclusions
• Support for I/O undo/redo in memory-checkpointing systems:
  • Transparent to OS and applications
  • Fast recovery and negligible overhead during fault-free operation
  • Low space overhead (see paper)
• Throughput-oriented workloads:
  • < 1% overhead over ReVive
• Latency-bound workloads:
  • Tolerable checkpoint interval is application dependent (see paper)
• ReViveI/O can be used for transactional memory
![Page 131: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/131.jpg)
ReViveI/O: Efficient Handling of I/O in Highly-Available Rollback-Recovery Servers
Jun Nakano✢, Pablo Montesinos, Kourosh Gharachorloo❆, Josep Torrellas
University of Illinois at Urbana-Champaign, ✢IBM, ❆Google
![Page 135: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/135.jpg)
Distributed N+1 Parity in ReVive
• Allocation granularity: page
• Update granularity: cache line
• Distributed parity: can recover from loss of one node
[Figure: a Parity Group spanning Node 0 (Parity), Node 1 (Data), ..., Node N (Data).]
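The recovery property on this slide can be illustrated with a small sketch. It assumes bytewise XOR parity (the slide says only "parity"), and it uses tiny hypothetical sizes: `LINE`, the 8-byte "pages", and all function names are made up for illustration and are not taken from ReVive.

```python
from functools import reduce

LINE = 4  # hypothetical cache-line size in bytes, kept tiny for illustration


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


def compute_parity(data_pages: list[bytes]) -> bytes:
    """Parity page = XOR of all data pages in the parity group."""
    return reduce(xor_bytes, data_pages)


def update_parity_line(parity: bytes, offset: int,
                       old_line: bytes, new_line: bytes) -> bytes:
    """Patch one line of the parity page: parity ^= old_line ^ new_line."""
    delta = xor_bytes(old_line, new_line)
    patched = xor_bytes(parity[offset:offset + LINE], delta)
    return parity[:offset] + patched + parity[offset + LINE:]


def reconstruct_lost_page(surviving_pages: list[bytes], parity: bytes) -> bytes:
    """XOR of the parity page and all surviving data pages yields the lost page."""
    return reduce(xor_bytes, surviving_pages, parity)


# Three data pages (one per data node) and their parity page.
pages = [
    b"\x01\x02\x03\x04\x05\x06\x07\x08",
    b"\x10\x20\x30\x40\x50\x60\x70\x80",
    b"\xaa\xbb\xcc\xdd\xee\xff\x00\x11",
]
parity = compute_parity(pages)

# Update one line of page 1 and patch the parity at line granularity.
new_line = b"\xde\xad\xbe\xef"
old_line = pages[1][4:8]
parity = update_parity_line(parity, 4, old_line, new_line)
pages[1] = pages[1][:4] + new_line

# Lose the node holding page 1 and rebuild it from the others.
recovered = reconstruct_lost_page([pages[0], pages[2]], parity)
```

Patching parity per line, rather than recomputing the whole page, is one way to read the slide's split between page-level allocation and line-level updates.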
![Page 136: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/136.jpg)
Space Overhead
• Network PDD:
  • Need to increase OS’s TCP sliding window size
  • e.g., Gigabit Ethernet, T = 100 ms → 12 MB
• Disk PDD (buffering):
  • Disk PDD needs private memory area for buffering
  • e.g., 40 MB/s disk, T = 100 ms → 4 MB
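The two sizing examples appear to follow from bandwidth multiplied by the checkpoint interval. A minimal sketch of that arithmetic, assuming decimal units (1 MB = 10^6 bytes) and a hypothetical helper name `buffer_bytes`:

```python
def buffer_bytes(bandwidth_bytes_per_s: int, interval_ms: int) -> int:
    """Bytes that can be produced during one checkpoint interval."""
    return bandwidth_bytes_per_s * interval_ms // 1000


GIGABIT_ETHERNET = 1_000_000_000 // 8   # 1 Gb/s expressed in bytes per second
DISK = 40_000_000                       # 40 MB/s
T_MS = 100                              # checkpoint interval T

network_bytes = buffer_bytes(GIGABIT_ETHERNET, T_MS)  # 12,500,000 bytes
disk_bytes = buffer_bytes(DISK, T_MS)                 # 4,000,000 bytes
```

Under these assumptions the network figure comes out at 12.5 MB, which the slide presumably rounds to 12 MB; the disk figure matches the slide's 4 MB exactly.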
![Page 140: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/140.jpg)
Latency-oriented workload
• Tolerable checkpoint interval is application dependent
• Response times up to 100ms are acceptable for interactive applications
[Figure: WebStone + Apache. Response Time (ms) (y-axis, 0 to 800) vs. Number of Clients (x-axis, 100 to 500). Series: Baseline; Buffering - 20ms CKPT; Buffering - 40ms CKPT; Buffering - 80ms CKPT; Buffering - 160ms CKPT.]
![Page 141: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/141.jpg)
Disk PDD Schemes

| Scheme | On I/O write issue | After next CKPT |
|---|---|---|
| Stall | Stalls the write | Writes data to disk |
| Logging | Copies old data elsewhere in disk; writes data in place | Deletes pointer to old data |
| Buffering | Writes data to disk buffer and mem | Copies data from mem to disk |
| Renaming | Writes data to renamed disk location; saves old logical → physical mapping | Delete old mapping |
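As a rough illustration of the Buffering scheme, the toy model below holds writes in memory until the next checkpoint and discards them on rollback. It is a sketch under simplifying assumptions, not the authors' implementation: the class name, the dictionary-based "disk", and serving reads from the pending buffer are all hypothetical, and the slide's separate "disk buffer" is folded into a single in-memory buffer.

```python
class BufferingDiskPDD:
    """Toy model: writes stay in a private buffer until the next checkpoint."""

    def __init__(self) -> None:
        self.disk = {}      # block number -> data committed to disk
        self.pending = {}   # writes issued since the last checkpoint

    def write(self, block: int, data: bytes) -> None:
        # On I/O write issue: keep the data in memory, do not touch the disk.
        self.pending[block] = data

    def read(self, block: int):
        # Assumption: reads see the newest data, buffered or committed.
        if block in self.pending:
            return self.pending[block]
        return self.disk.get(block)

    def checkpoint(self) -> None:
        # After the next checkpoint: copy buffered data to disk.
        self.disk.update(self.pending)
        self.pending.clear()

    def rollback(self) -> None:
        # On rollback: uncommitted writes are simply dropped.
        self.pending.clear()


pdd = BufferingDiskPDD()
pdd.write(7, b"old")
pdd.checkpoint()          # b"old" is now committed
pdd.write(7, b"new")      # buffered only
seen_before_fault = pdd.read(7)
pdd.rollback()            # fault: discard everything since the checkpoint
seen_after_rollback = pdd.read(7)
```

In this model the disk never holds data from an uncommitted interval, so undoing an interval needs no disk I/O at all.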
![Page 149: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/149.jpg)
Network PDD and TCP
• Resend packets after rollback
• TCP eliminates duplicates
• Can avoid saving inputs for replay
• Sender times out and resends
[Figure: two timelines, each with lanes Kernel, APP / Network PDD / Client along a Time axis between two CKPTs. A packet PKTi appears twice and faults are marked X.]
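The first two bullets rely on the receiver discarding segments it has already seen. The sketch below is a heavily simplified stand-in for TCP, assuming one sequence number per segment rather than per byte; the `Receiver` class and the payload names are hypothetical.

```python
class Receiver:
    """Toy in-order receiver; sequence numbers count whole segments, not bytes."""

    def __init__(self) -> None:
        self.next_seq = 0      # next sequence number expected
        self.delivered = []    # payloads handed to the application

    def receive(self, seq: int, payload: str) -> int:
        # Segments already seen (seq < next_seq) are duplicates and are dropped.
        # This toy also drops out-of-order future segments instead of queueing them.
        if seq == self.next_seq:
            self.delivered.append(payload)
            self.next_seq += 1
        return self.next_seq   # cumulative ACK


client = Receiver()
outputs = ["p0", "p1", "p2", "p3"]

# The server sends p0..p2; suppose its last checkpoint was taken after p0.
for seq in range(3):
    client.receive(seq, outputs[seq])

# Fault and rollback to the checkpoint: the server re-executes,
# re-sends p1 and p2, then continues with p3.
for seq in range(1, 4):
    client.receive(seq, outputs[seq])
```

The client application sees each payload exactly once even though p1 and p2 crossed the network twice.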
![Page 150: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/150.jpg)
Integration with ReVive
• Simple model of ReVive overhead:
  • c and r are constants (regardless of checkpoint frequency)
  • For instance, c = 1 ms, r=5% and T = 100 ms give 6% overhead
[Figure: Time axis showing a checkpoint of duration c [ms] in each checkpoint interval T, with overhead r between checkpoints.]
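The slide's formula is not present in this transcript. One reading that reproduces the worked example, offered as an assumption rather than the authors' stated model, is a fixed checkpoint cost c paid once per interval T plus a proportional overhead r:

```latex
\text{overhead}(T) \approx \frac{c}{T} + r
\qquad\Rightarrow\qquad
\frac{1\ \text{ms}}{100\ \text{ms}} + 0.05 = 0.01 + 0.05 = 0.06 = 6\%
```

Under this reading, shortening T raises only the c/T term, which is why c and r being independent of checkpoint frequency matters.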
![Page 151: ReViveI/O: Efficient Handling of I/O in Highly …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca06.pdfMotivation Concept Implementation Evaluation Conclusions HPCA 12 - ReVive](https://reader034.fdocuments.net/reader034/viewer/2022050502/5f94254cdf71f542da02ffc6/html5/thumbnails/151.jpg)
Integration with ReVive (WebStone)
• Increase in response time for a latency-bound workload is 2 x T (T = checkpoint interval)
• Most of the overhead comes from ReVive I/O
[Figure: Increase in response time (ms) (y-axis, 0 to 400) vs. Checkpoint Interval (ms) (x-axis, 0 to 180). Series: ReVive + ReVive I/O, ReVive Only.]