Publication:

Architecting Efficient, Large-Scale AI: An Algorithm-System Co-Design Approach

Date

2024-05-14

Citation

Hsia, Samuel Cheng-Yuan. 2024. Architecting Efficient, Large-Scale AI: An Algorithm-System Co-Design Approach. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Driven by significant advancements in algorithmic techniques and the emergence of new multimodal generative applications, deep learning has entered the era of "large-scale AI". As leading models dramatically increase in size and complexity, the hardware and software requirements also become significantly more demanding. If efficient solutions are not developed in a timely manner, model exploration will grind to a halt and at-scale serving will be infeasible. End-to-end co-design solutions must address three key themes: the unique technical challenges posed by the large-scale nature of these models, the distinct requirements for training versus inference, and the critical need for efficiency.

This dissertation presents three case studies for navigating the complexities of large-scale AI. The first case study is a cross-stack characterization of large-scale models that identifies performance bottlenecks and potential avenues for optimization across different system layers. The second case study explores redesigning embedding-centric models through data- and hardware-aware observations, aiming for substantial improvements from novel embedding representations. The third case study develops tools that help researchers gain better insight into the mapping of increasingly complex models onto physical infrastructure, addressing the logistical and operational challenges of deploying large-scale AI systems in data centers.

Looking forward, the dissertation identifies areas for future research, including co-design strategies tailored for embedding-driven, multimodal AI models and the role of reliability versus resiliency in data center-scale training environments. Collectively, this work contributes to the foundational understanding and practical advancement of large-scale AI technology, setting a course for future innovations in the field.

Keywords

Computer Architecture, Generative AI, Hardware Accelerators, Hardware-Software Co-Design, Machine Learning, Recommender Systems, Computer science, Computer engineering, Electrical engineering

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
