Publication:

Integrative analysis of transcriptomics, epigenetics, and copy number to assess lineage and dynamics of tumor clones during cancer progression

Date

2026-01-06

Citation

Li, Ruitong. 2026. Integrative analysis of transcriptomics, epigenetics, and copy number to assess lineage and dynamics of tumor clones during cancer progression. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Reconstructing dynamic biological processes, such as cancer evolution, is challenged by sparse clinical sampling and incomplete multi-omic measurements. This dissertation develops and applies integrative computational strategies to recover interpretable cell-state dynamics under these constraints, focusing on multiple myeloma (MM), chronic lymphocytic leukemia (CLL) progression to Richter's syndrome (RS), and anti-BCMA CAR-T cell therapy.

First, to model cellular dynamics from static snapshots, I contributed to scDiffEq, a neural stochastic differential equation framework that infers cellular drift and diffusion from scRNA-seq. This approach improves trajectory inference and fate prediction. My contributions included designing benchmark criteria and developing novel simulation strategies using binned CytoTRACE pseudotime, which validated the method's robustness to sparse sampling density.
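To make the drift and diffusion terminology concrete, the following is a minimal sketch of forward simulation of a stochastic differential equation dX = f(X) dt + g(X) dW with the Euler-Maruyama scheme. It is not the scDiffEq implementation: `euler_maruyama`, `toy_drift`, and `toy_diffusion` are hypothetical names, and the two toy functions are simple stand-ins for what would be learned neural networks.

```python
import numpy as np


def euler_maruyama(x0, drift, diffusion, dt, n_steps, rng):
    """Simulate dX = drift(X) dt + diffusion(X) dW with the Euler-Maruyama scheme.

    Returns an array of shape (n_steps + 1, dim) holding the whole path.
    """
    x = np.asarray(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        # Brownian increment: Gaussian noise with variance dt per dimension.
        dw = rng.normal(size=x.shape) * np.sqrt(dt)
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x.copy())
    return np.stack(path)


# Hypothetical stand-ins for learned drift and diffusion networks.
def toy_drift(x):
    return -x  # deterministic pull toward the origin


def toy_diffusion(x):
    return 0.1 * np.ones_like(x)  # constant, state-independent noise scale


rng = np.random.default_rng(0)
path = euler_maruyama(np.array([1.0, -1.0]), toy_drift, toy_diffusion,
                      dt=0.01, n_steps=200, rng=rng)
```

In a neural SDE, the two callables would be parameterized networks fitted so that simulated cell states match the observed snapshots; the sketch only shows the simulation step.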

Second, to reconstruct evolutionary time using genetic lineage, I developed Numbat-Multiome. This method unifies copy number variation (CNV) inference from both scRNA-seq and scATAC-seq data. By integrating coverage and allelic imbalance signals within a shared genomic binning scheme, the method accurately detects diverse events (F1 > 0.9), including copy-neutral loss-of-heterozygosity. This enables the reconstruction of subclonal phylogenies to serve as a lineage anchor for multi-omic regulatory analysis.
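The F1 figure quoted above is the harmonic mean of precision and recall. As a minimal sketch of how such a score could be computed, assume that called and ground-truth CNV events are each represented as a set of (cell, segment, event type) tuples and that a call counts as correct only on an exact match. Both the representation and the matching rule are assumptions for illustration; the record does not state the evaluation criteria the dissertation uses.

```python
def f1_score(called, truth):
    """F1 between two collections of events, using exact-match true positives."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)  # events present in both sets
    if tp == 0:
        return 0.0
    precision = tp / len(called)  # fraction of calls that are correct
    recall = tp / len(truth)      # fraction of true events that were called
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: (cell, chromosome arm, event type).
truth = {
    ("cell1", "1p", "del"),
    ("cell1", "8q", "amp"),
    ("cell2", "17p", "cnloh"),
}
called = {
    ("cell1", "1p", "del"),
    ("cell1", "8q", "amp"),
    ("cell2", "13q", "del"),  # false positive; the 17p cnLOH is missed
}
score = f1_score(called, truth)
```

Here two of three calls are correct and two of three true events are recovered, so precision and recall are both 2/3 and so is the F1 score.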

Applying these frameworks, I dissected regulatory heterogeneity in multiple myeloma. By profiling chromatin accessibility (scATAC-seq) across 36 patient samples, I identified differentially accessible regions and transcription factor programs associated with disease progression. In a separate study on anti-BCMA CAR-T therapy, single-cell multiome profiling of post-infusion cells characterized the coupled transcriptional and epigenetic states underlying T-cell exhaustion and linked CAR promoter accessibility to CAR expression.

To trace clonal evolution during the transformation of CLL to RS, I integrated scRNA-seq, mitochondrial scATAC-seq, and scDNA-seq from pilot cases to map clone-specific regulatory rewiring. Furthermore, I analyzed data from six CLL samples generated with STAG-seq, which jointly profiles targeted genotypes and the transcriptome in the same cells. This analysis enabled the direct and unambiguous assignment of transcriptional programs and immune cell states to specific genetic subclones.

Together, these contributions provide a lineage-anchored, time-aware framework for studying tumor evolution and therapy response. By coupling mutation-defined ancestry with multi-omic regulatory readouts and neural dynamical models, this work delivers practical tools and conceptual clarity for inferring cancer cell-state transitions from limited and static clinical data.

Keywords

B-cell malignancies, copy‐number variation, gene regulatory networks, lineage tracing, neural SDEs, single‐cell multi‐omics, Bioinformatics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
