Understanding and Improving Device Access Complexity
Fine-Grained Fault Tolerance Using Device Checkpoints
Tolerating Hardware Device Failures in Software
Understanding Modern Device Drivers
Live Migration of Direct-Access Devices
MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed Machine Learning)