Publication: Learning and Modeling the Rules of Endogenous Antigen Presentation on MHC Class I
Date
2019-09-10
Citation
Sarkizova, Siranush. 2019. Learning and Modeling the Rules of Endogenous Antigen Presentation on MHC Class I. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Abstract
The adaptive immune system is critical for clearance of, and long-term protection against, pathogens and tumors. It relies on the ability of T cells to identify and eliminate compromised cells that display foreign-born antigens bound to human leukocyte antigen (HLA) class I molecules. Just as infected cells present immunogenic fragments of viral proteins, the accumulation of somatic mutations in cancer gives rise to tumor-specific epitopes that potentiate anti-tumor immune responses. Thus, the ability to predict which peptides will be presented by polymorphic HLA class I gene variants is fundamental to understanding immunity and has applications in the design of both protective and therapeutic vaccines.
While the patterns of peptide presentation on HLA class I have gradually been revealed based on in vitro measurements of synthetic peptides binding to recombinant HLA proteins, there remains a need to ascertain the rules for endogenous antigen display that take into consideration not only peptide-MHC interaction potential but also the intracellular processes that govern the presentation pathway.
Here, we describe a method to sequence naturally presented HLA ligands by mass spectrometry one allele at a time, and utilize it to profile 95 HLA-A, B, C and G alleles, identifying >185,000 epitopes and covering the most frequent alleles in the population. The dataset of per-allele eluted peptides allowed us to discern HLA binding motifs de novo, evaluate the natural length preferences of each allele, and pinpoint length-specific binding characteristics. Beyond allele-specific preferences, we find patterns of inter-allele similarity revealed by ligand motifs that are mirrored by similarities in the physicochemical properties of the HLA binding clefts. By introducing a peptide similarity metric and clustering approach, we decompose aggregate motifs into submotifs, revealing unique and shared binding registers across alleles. We integrate these findings with peptide precursor abundance, as measured by RNA-seq, and a novel predictor of peptide cleavage processing to derive endogenous presentation rules encoded in neural network models. We propose positive predictive value as a more suitable and unsaturated evaluation metric, demonstrate two-fold performance gains relative to established methods in independent datasets, and examine the variability in accuracy across alleles. Finally, we implement publicly accessible computational tools for exploratory analysis and prediction.
We hope that these efforts lead to more potent and safe vaccine therapies against pathogens and tumors owing to better-informed epitope selection.
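The abstract proposes positive predictive value (PPV) as an evaluation metric but does not spell out how it is computed. The sketch below shows one common formulation, assumed here rather than taken from the dissertation: rank all candidate peptides by predicted score and report the fraction of true ligands among the top k, with k defaulting to the number of true ligands. The function name `ppv_at_top_k` and the toy scores are hypothetical.

```python
from typing import Optional, Sequence


def ppv_at_top_k(
    scores: Sequence[float],
    labels: Sequence[int],
    k: Optional[int] = None,
) -> float:
    """Fraction of true ligands (label 1) among the k highest-scoring peptides.

    Assumption: k defaults to the number of positives, so a perfect
    predictor scores 1.0. This is an illustrative definition, not
    necessarily the exact protocol used in the dissertation.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    n_positive = sum(1 for y in labels if y == 1)
    if k is None:
        k = n_positive
    if k <= 0 or k > len(scores):
        raise ValueError("k must be between 1 and the number of peptides")

    # Sort peptides from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Count how many of the top-k predictions are true ligands.
    hits_in_top_k = sum(1 for _, y in ranked[:k] if y == 1)
    return hits_in_top_k / k


# Toy data: two observed ligands (label 1) among four decoys (label 0).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 0, 1, 0, 0, 0]
print(ppv_at_top_k(toy_scores, toy_labels))  # top 2 contain 1 ligand -> 0.5
```

Because only the top-ranked predictions count, a metric of this kind stays informative when true ligands are vastly outnumbered by decoys, which is one plausible reading of the abstract's description of PPV as "unsaturated".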
Keywords
HLA, MHC, MHC class I, antigen presentation, antigen prediction, antigen processing, immunopeptidome, tumor antigens, mono-allelic, neoantigen, epitope, peptide, mass spectrometry
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service