Evaluating Measurement Error in Mass Spectrometry-Based Proteomics
Lim, Matt Y.
Citation: Lim, Matt Y. 2020. Evaluating Measurement Error in Mass Spectrometry-Based Proteomics. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Abstract: Over the last few decades, liquid chromatography with tandem mass spectrometry (LC-MS/MS) has become a pillar of the field of proteomics. As the popularity of and access to LC-MS/MS techniques grow, a concerted effort has been dedicated to increasing their quantitative capabilities. Generally, the quantitative mass spectrometry-based proteomics field is divided into two camps: 1) users of isotope labels and 2) those who prefer a label-free approach. While labelling techniques, such as Tandem Mass Tags (TMT), avoid the stochasticity of single-sample analyses by multiplexing, isolation of unwanted ions can lead to interference, thereby negatively affecting measurement accuracy. Conversely, while label-free methods are not susceptible to peptide interference, stochasticity results in a large increase in missing values, limiting the quantitative usefulness of these data sets.
The work of this dissertation focuses on understanding how measurement error arises across various LC-MS/MS experiments, from instrumentation to downstream data analysis. We first analyzed how the latest mass spectrometry technologies affect peptide interference, one of the major causes of measurement error in TMT LC-MS/MS experiments. Our results show that, to obtain large, highly quantitative data sets, a balance must be maintained between interference, signal-to-noise, and identification rates. Second, we developed a two-sample, two-proteome experiment to evaluate the commonly used Match-Between-Runs (MBR) algorithm, a software solution used by the label-free community in an attempt to circumvent the missing-values problem inherent to label-free LC-MS/MS. We find that while MBR incorrectly transfers identifications at a high rate, the quantitative algorithms bundled with the software do not quantify most incorrect transfers. While this audit enforces quantitative accuracy, it effectively negates any gains produced by MBR. Lastly, we developed a TMT-based approach to measure global phosphorylation occupancy but found that mass spectrometry measurements could not reliably capture the small changes associated with this type of experiment, leading to estimates of negative occupancy. We therefore developed a Bayesian statistical tool to better model the data. By incorporating the measurement error and generating credible intervals, we report an elegant strategy for presenting phosphorylation occupancy data that better conveys both the estimate and its associated uncertainty.
Citable link to this page: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37365524