Understanding and Improving Device Access Complexity
Fine-Grained Fault Tolerance Using Device Checkpoints
Understanding Modern Device Drivers
Tolerating Hardware Device Failures in Software
Live Migration of Direct-Access Devices
MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed Machine Learning)