Network-Application Co-design for Efficient Datacenters

Date

2024-07-09

The Harvard community has made this article openly available.

Citation

Zhou, Yang. 2024. Network-Application Co-design for Efficient Datacenters. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Modern datacenters contain hundreds of thousands of servers and high-speed networks to run diverse applications. However, these datacenters suffer from low resource utilization and poor software performance that cannot be improved simply by relying on faster hardware. Because of these utilization and performance challenges, datacenters incur high operational costs, increased energy usage, and difficulty in handling growing application demands.

In this thesis, I focus on improving resource utilization and application performance through network-application co-design. I first discuss how resource disaggregation, especially the far memory technique, is a promising way to improve memory utilization. However, prior far memory research often lacks fault tolerance, a crucial requirement in datacenters. I then describe a fault-tolerant far memory system with network-efficient memory swapping and erasure coding, which requires far fewer network I/O operations than conventional wisdom suggests, unlocking higher performance. Finally, I discuss how application-customized networking stacks can vastly improve the performance of network I/O-intensive distributed protocols such as consensus and transactions. The key insight is to safely offload protocol logic into kernel networking stacks to reduce kernel overhead. The resulting systems achieve the performance of kernel-bypass approaches while retaining the security of kernel stacks.

Keywords

Distributed Systems, Networking, Systems, Computer science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
