Person: Irizarry, Rafael
Last Name: Irizarry
First Name: Rafael
Name: Irizarry, Rafael
Search Results (17 results; now showing 1 - 10 of 17)
Publication: Robust decomposition of cell type mixtures in spatial transcriptomics (SpringerNature, 2020-05-08)
Cable, Dylan; Murray, Evan; Zou, Luli; Goeva, Aleksandrina; Macosko, Evan; Chen, Fei; Irizarry, Rafael
Spatial transcriptomic technologies measure gene expression at increasing spatial resolution, approaching individual cells. However, a limitation of current technologies is that spatial measurements may contain contributions from multiple cells, hindering the discovery of cell type-specific spatial patterns of localization and expression. Here, we develop Robust Cell Type Decomposition (RCTD, https://github.com/dmcable/RCTD), a computational method that leverages cell type profiles learned from single-cell RNA sequencing data to decompose mixtures, such as those observed in spatial transcriptomic technologies. Our approach accounts for platform effects introduced by systematic technical variability inherent to different sequencing modalities. We demonstrate RCTD provides substantial improvement in cell type assignment in Slide-seq data by accurately reproducing known cell type and subtype localization patterns in the cerebellum and hippocampus. We further show the advantages of RCTD by its ability to detect mixtures and identify cell types on an assessment dataset. Finally, we show how RCTD’s recovery of cell type localization uniquely enables the discovery of genes within a cell type whose expression depends on spatial environment. Spatial mapping of cell types with RCTD has the potential to enable the definition of spatial components of cellular identity, uncovering new principles of cellular organization in biological tissue.

Publication: Every Body Counts: Measuring Mortality From the COVID-19 Pandemic (American College of Physicians, 2020-09-11)
Kiang, Mathew; Irizarry, Rafael; Buckee, Caroline; Balsari, Satchit
As of mid-August 2020, more than 170 000 U.S.
residents have died of coronavirus disease 2019 (COVID-19); however, the true number of deaths resulting from COVID-19, both directly and indirectly, is likely to be much higher. The proper attribution of deaths to this pandemic has a range of societal, legal, mortuary, and public health consequences. This article discusses the current difficulties of disaster death attribution and describes the strengths and limitations of relying on death counts from death certificates, estimations of indirect deaths, and estimations of excess mortality. Improving the tabulation of direct and indirect deaths on death certificates will require concerted efforts and consensus across medical institutions and public health agencies. In addition, actionable estimates of excess mortality will require timely access to standardized and structured vital registry data, which should be shared directly at the state level to ensure rapid response for local governments. Correct attribution of direct and indirect deaths and estimation of excess mortality are complementary goals that are critical to our understanding of the pandemic and its effect on human life.

Publication: Flexible expressed region analysis for RNA-seq with derfinder (Oxford University Press, 2017)
Collado-Torres, Leonardo; Nellore, Abhinav; Frazee, Alyssa C.; Wilks, Christopher; Love, Michael I.; Langmead, Ben; Irizarry, Rafael; Leek, Jeffrey T.; Jaffe, Andrew E.
Differential expression analysis of RNA sequencing (RNA-seq) data typically relies on reconstructing transcripts or counting reads that overlap known gene structures. We previously introduced an intermediate statistical approach called differentially expressed region (DER) finder that seeks to identify contiguous regions of the genome showing differential expression signal at single base resolution without relying on existing annotation or potentially inaccurate transcript assembly.
We present the derfinder software that improves our annotation-agnostic approach to RNA-seq analysis by: (i) implementing a computationally efficient bump-hunting approach to identify DERs that permits genome-scale analyses in a large number of samples, (ii) introducing a flexible statistical modeling framework, including multi-group and time-course analyses and (iii) introducing a new set of data visualizations for expressed region analysis. We apply this approach to public RNA-seq data from the Genotype-Tissue Expression (GTEx) project and BrainSpan project to show that derfinder permits the analysis of hundreds of samples at base resolution in R, identifies expression outside of known gene boundaries and can be used to visualize expressed regions at base-resolution. In simulations, our base resolution approaches enable discovery in the presence of incomplete annotation and are nearly as powerful as feature-level methods when the annotation is complete. derfinder analysis using expressed region-level and single base-level approaches provides a compromise between full transcript reconstruction and feature-level analysis. The package is available from Bioconductor at www.bioconductor.org/packages/derfinder.

Publication: Mortality in Puerto Rico after Hurricane Maria (New England Journal of Medicine (NEJM/MMS), 2018)
Kishore, Nishant; Marqués, Domingo; Mahmud, Ayesha; Kiang, Mathew; Rodriguez, Irmary; Fuller, Arlan; Ebner, Peggy; Sorensen, Cecilia; Racy, Fabio De Castro Jorge; Lemery, Jay; Maas, Leslie; Leaning, Jennifer; Irizarry, Rafael; Balsari, Satchit; Buckee, Caroline
BACKGROUND Quantifying the effect of natural disasters on society is critical for recovery of public health services and infrastructure. The death toll can be difficult to assess in the aftermath of a major disaster. In September 2017, Hurricane Maria caused massive infrastructural damage to Puerto Rico, but its effect on mortality remains contentious. The official death count is 64.
METHODS Using a representative, stratified sample, we surveyed 3299 randomly chosen households across Puerto Rico to produce an independent estimate of all-cause mortality after the hurricane. Respondents were asked about displacement, infrastructure loss, and causes of death. We calculated excess deaths by comparing our estimated post-hurricane mortality rate with official rates for the same period in 2016. RESULTS From the survey data, we estimated a mortality rate of 14.3 deaths (95% confidence interval [CI], 9.8 to 18.9) per 1000 persons from September 20 through December 31, 2017. This rate yielded a total of 4645 excess deaths during this period (95% CI, 793 to 8498), equivalent to a 62% increase in the mortality rate as compared with the same period in 2016. However, this number is likely to be an underestimate because of survivor bias. The mortality rate remained high through the end of December 2017, and one third of the deaths were attributed to delayed or interrupted health care. Hurricane-related migration was substantial. CONCLUSIONS This household-based survey suggests that the number of excess deaths related to Hurricane Maria in Puerto Rico is more than 70 times the official estimate. (Funded by the Harvard T.H. Chan School of Public Health and others.)

Publication: Salmon: fast and bias-aware quantification of transcript expression using dual-phase inference (2017)
Patro, Rob; Duggal, Geet; Love, Michael I; Irizarry, Rafael; Kingsford, Carl
We introduce Salmon, a method for quantifying transcript abundance from RNA-seq reads that is accurate and fast. Salmon is the first transcriptome-wide quantifier to correct for fragment GC content bias, which we demonstrate substantially improves the accuracy of abundance estimates and the reliability of subsequent differential expression analysis.
Salmon combines a new dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure.

Publication: Bisulfite-independent analysis of CpG island methylation enables genome-scale stratification of single cells (Oxford University Press, 2017)
Han, Lin; Wu, Hua-Jun; Zhu, Haiying; Kim, Kun-Yong; Marjani, Sadie L.; Riester, Markus; Euskirchen, Ghia; Zi, Xiaoyuan; Yang, Jennifer; Han, Jasper; Snyder, Michael; Park, In-Hyun; Irizarry, Rafael; Weissman, Sherman M.; Michor, Franziska; Fan, Rong; Pan, Xinghua
Conventional DNA bisulfite sequencing has been extended to single cell level, but the coverage consistency is insufficient for parallel comparison. Here we report a novel method for genome-wide CpG island (CGI) methylation sequencing for single cells (scCGI-seq), combining methylation-sensitive restriction enzyme digestion and multiple displacement amplification for selective detection of methylated CGIs. We applied this method to analyzing single cells from two types of hematopoietic cells, K562 and GM12878 and small populations of fibroblasts and induced pluripotent stem cells. The method detected 21 798 CGIs (76% of all CGIs) per cell, and the number of CGIs consistently detected from all 16 profiled single cells was 20 864 (72.7%), with 12 961 promoters covered. This coverage represents a substantial improvement over results obtained using single cell reduced representation bisulfite sequencing, with a 66-fold increase in the fraction of consistently profiled CGIs across individual cells. Single cells of the same type were more similar to each other than to other types, but also displayed epigenetic heterogeneity. The method was further validated by comparing the CpG methylation pattern, methylation profile of CGIs/promoters and repeat regions and 41 classes of known regulatory markers to the ENCODE data.
Although not all minor methylation differences between cells are detectable, scCGI-seq provides a solid tool for unsupervised stratification of a heterogeneous cell population.

Publication: Accounting for cellular heterogeneity is critical in epigenome-wide association studies (BioMed Central, 2014)
Jaffe, Andrew E; Irizarry, Rafael
Background: Epigenome-wide association studies of human disease and other quantitative traits are becoming increasingly common. A series of papers reporting age-related changes in DNA methylation profiles in peripheral blood have already been published. However, blood is a heterogeneous collection of different cell types, each with a very different DNA methylation profile. Results: Using a statistical method that permits estimating the relative proportion of cell types from DNA methylation profiles, we examine data from five previously published studies, and find strong evidence of cell composition change across age in blood. We also demonstrate that, in these studies, cellular composition explains much of the observed variability in DNA methylation. Furthermore, we find high levels of confounding between age-related variability and cellular composition at the CpG level. Conclusions: Our findings underscore the importance of considering cell composition variability in epigenetic studies based on whole blood and other heterogeneous tissue sources. We also provide software for estimating and exploring this composition confounding for the Illumina 450k microarray.

Publication: quantro: a data-driven approach to guide the choice of an appropriate normalization method (BioMed Central, 2015)
Hicks, Stephanie C.; Irizarry, Rafael
Normalization is an essential step in the analysis of high-throughput data. Multi-sample global normalization methods, such as quantile normalization, have been successfully used to remove technical variation.
However, these methods rely on the assumption that observed global changes across samples are due to unwanted technical variability. Applying global normalization methods has the potential to remove biologically driven variation. Currently, it is up to the subject matter experts to determine if the stated assumptions are appropriate. Here, we propose a data-driven alternative. We demonstrate the utility of our method (quantro) through examples and simulations. A software implementation is available from http://www.bioconductor.org/packages/release/bioc/html/quantro.html. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-015-0679-0) contains supplementary material, which is available to authorized users.

Publication: MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens (BioMed Central, 2014)
Li, Wei; Xu, Han; Xiao, Tengfei; Cong, Le; Love, Michael I.; Zhang, Feng; Irizarry, Rafael; Liu, Jun; Brown, Myles; Liu, X Shirley
We propose the Model-based Analysis of Genome-wide CRISPR/Cas9 Knockout (MAGeCK) method for prioritizing single-guide RNAs, genes and pathways in genome-scale CRISPR/Cas9 knockout screens. MAGeCK demonstrates better performance compared with existing methods, identifies both positively and negatively selected genes simultaneously, and reports robust results across different experimental conditions. Using public datasets, MAGeCK identified novel essential genes and pathways, including EGFR in vemurafenib-treated A375 cells harboring a BRAF mutation. MAGeCK also detected cell type-specific essential genes, including BCR and ABL1, in KBM7 cells bearing a BCR-ABL fusion, and IGF1R in HL-60 cells, which depends on the insulin signaling pathway for proliferation.
Electronic supplementary material: The online version of this article (doi:10.1186/s13059-014-0554-4) contains supplementary material, which is available to authorized users.

Publication: Large hypomethylated blocks as a universal defining epigenetic alteration in human solid tumors (BioMed Central, 2014)
Timp, Winston; Bravo, Hector Corrada; McDonald, Oliver G; Goggins, Michael; Umbricht, Chris; Zeiger, Martha; Feinberg, Andrew P; Irizarry, Rafael
Background: One of the most provocative recent observations in cancer epigenetics is the discovery of large hypomethylated blocks, including single copy genes, in colorectal cancer, that correspond in location to heterochromatic LOCKs (large organized chromatin lysine-modifications) and LADs (lamin-associated domains). Methods: Here we performed a comprehensive genome-scale analysis of 10 breast, 28 colon, nine lung, 38 thyroid, 18 pancreas cancers, and five pancreas neuroendocrine tumors as well as matched normal tissue from most of these cases, as well as 51 premalignant lesions. We used a new statistical approach that allows the identification of large hypomethylated blocks on the Illumina HumanMethylation450 BeadChip platform. Results: We find that hypomethylated blocks are a universal feature of common solid human cancer, and that they occur at the earliest stage of premalignant tumors and progress through clinical stages of thyroid and colon cancer development. We also find that the disrupted CpG islands widely reported previously, including hypermethylated island bodies and hypomethylated shores, are enriched in hypomethylated blocks, with flattening of the methylation signal within and flanking the islands. Finally, we found that genes showing higher between-individual gene expression variability are enriched within these hypomethylated blocks. Conclusion: Thus hypomethylated blocks appear to be a universal defining epigenetic alteration in human cancer, at least for common solid tumors.
Electronic supplementary material: The online version of this article (doi:10.1186/s13073-014-0061-y) contains supplementary material, which is available to authorized users.