Publication: Network-Application Co-design for Efficient Datacenters
Abstract
Modern datacenters contain hundreds of thousands of servers and high-speed networks to run diverse applications. However, these datacenters suffer from low resource utilization and poor software performance that cannot be improved simply by relying on faster hardware. Because of these utilization and performance challenges, datacenters incur high operational costs and increased energy usage, and they struggle to handle growing application demands.
In this thesis, I will focus on improving resource utilization and application performance through network-application co-design. I will first discuss how resource disaggregation, especially the far memory technique, is a promising way to improve memory utilization. However, prior research often lacks fault tolerance, a crucial requirement in datacenters. I will therefore describe a fault-tolerant far memory system with network-efficient memory swapping and erasure coding, which requires far fewer network I/O operations than conventional wisdom suggests, unlocking higher performance. I will then discuss how application-customized networking stacks can vastly improve the performance of network I/O-intensive distributed protocols such as consensus and transactions. The key insight is to safely offload protocol logic into kernel networking stacks to reduce kernel overhead. The resulting systems achieve the performance of kernel-bypass approaches while retaining the security of kernel stacks.