Person: Getz, Gad
Last Name
Getz
First Name
Gad
Name
Getz, Gad
Search Results
Now showing 1 - 10 of 37
Publication: RNA sequence analysis reveals macroscopic somatic clonal expansion across normal tissues (American Association for the Advancement of Science (AAAS), 2019-06-06)
Yizhak, Keren; Aguet, Francois; Kim, Jaegil; Hess, Julian; Kubler, Kirsten; Grimsby, Jonna; Frazer, Ruslana; Zhang, Hailei; Haradhvala, Nicholas; Rosebrock, Daniel; Livitz, Dimitri; Li, Xiao; Arich-Landkof, Eila; Shoresh, Noam; Stewart, Chip; Segre, Ayelet; Branton, Philip A; Polak, Paz; Ardlie, Kristin; Getz, Gad
How somatic mutations accumulate in normal cells is poorly understood. A comprehensive analysis of RNA-sequencing data from ~6,700 samples across 29 normal tissues reveals multiple somatic variants, demonstrating that macroscopic clones can be found in many normal tissues. We confirm that sun-exposed skin, esophagus, and lung have a higher mutation burden than other tested tissues, suggesting that environmental factors can promote somatic mosaicism. Mutation burden is associated with both age and tissue-specific cell proliferation rate, highlighting that mutations accumulate over time and number of cell divisions. Finally, we find that normal tissues harbor mutations in known cancer genes and hotspots.
This study provides a broad view of macroscopic clonal expansion in human tissues, thus serving as the basis to associate clonal expansion with environmental factors, aging and risk of disease.

Publication: Synchronized age-related gene expression changes across multiple tissues in human and the link to complex diseases (Nature Publishing Group, 2015)
Yang, Jialiang; Huang, Tao; Petralia, Francesca; Long, Quan; Zhang, Bin; Argmann, Carmen; Zhao, Yong; Mobbs, Charles V.; Schadt, Eric E.; Zhu, Jun; Tu, Zhidong; Ardlie, Kristin G.; Deluca, David S.; Segrè, Ayellet V.; Sullivan, Timothy J.; Young, Taylor R.; Gelfand, Ellen T.; Trowbridge, Casandra A.; Maller, Julian B.; Tukiainen, Taru; Lek, Monkol; Ward, Lucas D.; Kheradpour, Pouya; Iriarte, Benjamin; Meng, Yan; Palmer, Cameron D.; Winckler, Wendy; Hirschhorn, Joel; Kellis, Manolis; MacArthur, Daniel; Getz, Gad; Shablin, Andrey A.; Li, Gen; Zhou, Yi-Hui; Nobel, Andrew B.; Rusyn, Ivan; Wright, Fred A.; Lappalainen, Tuuli; Ferreira, Pedro G.; Ongen, Halit; Rivas, Manuel A.; Battle, Alexis; Mostafavi, Sara; Monlong, Jean; Sammeth, Michael; Mele, Marta; Reverter, Ferran; Goldman, Jakob; Koller, Daphne; Guigo, Roderic; McCarthy, Mark I.; Dermitzakis, Emmanouil T.; Gamazon, Eric R.; Konkashbaev, Anuar; Nicolae, Dan L.; Cox, Nancy J.; Flutre, Timothée; Wen, Xiaoquan; Stephens, Matthew; Pritchard, Jonathan K.; Lin, Luan; Liu, Jun; Brown, Amanda; Mestichelli, Bernadette; Tidwell, Denee; Lo, Edmund; Salvatore, Mike; Shad, Saboor; Thomas, Jeffrey A.; Lonsdale, John T.; Choi, Christopher; Karasik, Ellen; Ramsey, Kimberly; Moser, Michael T.; Foster, Barbara A.; Gillard, Bryan M.; Syron, John; Fleming, Johnelle; Magazine, Harold; Hasz, Rick; Walters, Gary D.; Bridge, Jason P.; Miklos, Mark; Sullivan, Susan; Barker, Laura K.; Traino, Heather; Mosavel, Magboeba; Siminoff, Laura A.; Valley, Dana R.; Rohrer, Daniel C.; Jewel, Scott; Branton, Philip; Sobin, Leslie H.; Qi, Liqun; Hariharan, Pushpa; Wu, Shenpei; Tabor, David; Shive, Charles;
Smith, Anna M.; Buia, Stephen A.; Undale, Anita H.; Robinson, Karna L.; Roche, Nancy; Valentino, Kimberly M.; Britton, Angela; Burges, Robin; Bradbury, Debra; Hambright, Kenneth W.; Seleski, John; Korzeniewski, Greg E.; Erickson, Kenyon; Marcus, Yvonne; Tejada, Jorge; Taherian, Mehran; Lu, Chunrong; Robles, Barnaby E.; Basile, Margaret; Mash, Deborah C.; Volpi, Simona; Struewing, Jeff; Temple, Gary F.; Boyer, Joy; Colantuoni, Deborah; Little, Roger; Koester, Susan; Carithers, NCI Latarsha J.; Moore, Helen M.; Guan, Ping; Compton, Carolyn; Sawyer, Sherilyn J.; Demchok, Joanne P.; Vaught, Jimmie B.; Rabiner, Chana A.; Lockhart, Nicole C.
Aging is one of the most important biological processes and is a known risk factor for many age-related diseases in human. Studying age-related transcriptomic changes in tissues across the whole body can provide valuable information for a holistic understanding of this fundamental process. In this work, we catalogue age-related gene expression changes in nine tissues from nearly two hundred individuals collected by the Genotype-Tissue Expression (GTEx) project. In general, we find the aging gene expression signatures are very tissue specific. However, enrichment for some well-known aging components such as mitochondria biology is observed in many tissues. Different levels of cross-tissue synchronization of age-related gene expression changes are observed, and some essential tissues (e.g., heart and lung) show much stronger “co-aging” than other tissues based on a principal component analysis. The aging gene signatures and complex disease genes show a complex overlapping pattern and only in some cases, we see that they are significantly overlapped in the tissues affected by the corresponding diseases.
In summary, our analyses provide novel insights to the co-regulation of age-related gene expression in multiple tissues; it also presents a tissue-specific view of the link between aging and age-related diseases.

Publication: Analysis of protein-coding genetic variation in 60,706 humans (2016)
Lek, Monkol; Karczewski, Konrad; Minikel, Eric; Samocha, Kaitlin E.; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David; Ardissino, Diego; Boehnke, Michael; Danesh, John; Donnelly, Stacey; Elosua, Roberto; Florez, Jose; Gabriel, Stacey B; Getz, Gad; Glatt, Stephen J; Hultman, Christina M; Kathiresan, Sekar; Laakso, Markku; McCarroll, Steven; McCarthy, Mark I; McGovern, Dermot; McPherson, Ruth; Neale, Benjamin; Palotie, Aarno; Purcell, Shaun M; Saleheen, Danish; Scharf, Jeremiah; Sklar, Pamela; Sullivan, Patrick F; Tuomilehto, Jaakko; Tsuang, Ming T; Watkins, Hugh C; Wilson, James G; Daly, Mark; MacArthur, Daniel
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. We describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC).
This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of truncating variants with 72% having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human “knockout” variants in protein-coding genes.

Publication: NetSig: network-based discovery from cancer genomes (2018)
Horn, Heiko; Lawrence, Michael; Chouinard, Candace R.; Shrestha, Yashaswi; Hu, Jessica Xin; Worstell, Elizabeth; Shea, Emily; Ilic, Nina; Kim, Eejung; Kamburov, Atanas; Kashani, Alireza; Hahn, William; Campbell, Joshua D.; Boehm, Jesse S.; Getz, Gad; Lage, Kasper
Approaches that integrate molecular network information and tumor genome data could complement gene-based statistical tests to identify likely new cancer genes, but are challenging to validate at scale and their predictive value remains unclear. We developed a robust statistic (NetSig) that integrates protein interaction networks and data from 4,742 tumor exomes and used it to accurately classify known driver genes in 60% of tested tumor types and to predict 62 new candidates. We designed a quantitative experimental framework to compare the in vivo tumorigenic potential of NetSig candidates, known oncogenes and random genes in mice showing that NetSig candidates induce tumors at rates comparable to known oncogenes and 10-fold higher than random genes. By reanalyzing nine tumor-inducing NetSig candidates in 242 patients with oncogene-negative lung adenocarcinomas, we find that two (AKT2 and TFDP2) are significantly amplified.
Overall, we illustrate a scalable integrated computational and experimental workflow to expand discovery from cancer genomes.

Publication: Mutations driving CLL and their evolution in progression and relapse (2015)
Landau, Dan A.; Tausch, Eugen; Taylor-Weiner, Amaro N; Stewart, Chip; Reiter, Johannes G.; Bahlo, Jasmin; Kluth, Sandra; Bozic, Ivana; Lawrence, Mike; Böttcher, Sebastian; Carter, Scott; Cibulskis, Kristian; Mertens, Daniel; Sougnez, Carrie; Rosenberg, Mara; Hess, Julian M.; Edelmann, Jennifer; Kless, Sabrina; Kneba, Michael; Ritgen, Matthias; Fink, Anna; Fischer, Kirsten; Gabriel, Stacey; Lander, Eric; Nowak, Martin; Döhner, Hartmut; Hallek, Michael; Neuberg, Donna; Getz, Gad; Stilgenbauer, Stephan; Wu, Catherine
Which genetic alterations drive tumorigenesis and how they evolve over the course of disease and therapy are central questions in cancer biology. We identify 44 recurrently mutated genes and 11 recurrent somatic copy number variations through whole-exome sequencing of 538 chronic lymphocytic leukemia (CLL) and matched germline DNA samples, 278 of which were collected in a prospective clinical trial. These include previously unrecognized cancer drivers (RPS15, IKZF3) and collectively identify RNA processing and export, MYC activity and MAPK signaling as central pathways involved in CLL. Clonality analysis of this large dataset further enabled reconstruction of temporal relationships between driver events. Direct comparison between matched pre-treatment and relapse samples from 59 patients demonstrated highly frequent clonal evolution.
Thus, large sequencing datasets of clinically informative samples enable the discovery of novel cancer genes and the network of relationships between the driver events and their impact on disease relapse and clinical outcome.

Publication: Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics (Nature Publishing Group UK, 2018)
Barbeira, Alvaro N.; Dickinson, Scott P.; Bonazzola, Rodrigo; Zheng, Jiamao; Wheeler, Heather E.; Torres, Jason M.; Torstenson, Eric S.; Shah, Kaanan P.; Garcia, Tzintzuni; Edwards, Todd L.; Stahl, Eli A.; Huckins, Laura M.; Aguet, François; Ardlie, Kristin G.; Cummings, Beryl; Gelfand, Ellen T.; Getz, Gad; Hadley, Kane; Handsaker, Robert; Huang, Katherine H.; Kashin, Seva; Karczewski, Konrad; Lek, Monkol; Li, Xiao; MacArthur, Daniel; Nedzel, Jared L.; Nguyen, Duyen T.; Noble, Michael S.; Segre, Ayellet; Trowbridge, Casandra A.; Tukiainen, Taru; Abell, Nathan S.; Balliu, Brunilda; Barshir, Ruth; Basha, Omer; Battle, Alexis; Bogu, Gireesh K.; Brown, Andrew; Brown, Christopher D.; Castel, Stephane E.; Chen, Lin S.; Chiang, Colby; Conrad, Donald F.; Damani, Farhan N.; Davis, Joe R.; Delaneau, Olivier; Dermitzakis, Emmanouil T.; Engelhardt, Barbara E.; Eskin, Eleazar; Ferreira, Pedro G.; Frésard, Laure; Gamazon, Eric R.; Garrido-Martín, Diego; Gewirtz, Ariel D.
H.; Gliner, Genna; Gloudemans, Michael J.; Guigo, Roderic; Hall, Ira M.; Han, Buhm; He, Yuan; Hormozdiari, Farhad; Howald, Cedric; Jo, Brian; Kang, Eun Yong; Kim, Yungil; Kim-Hellmuth, Sarah; Lappalainen, Tuuli; Li, Gen; Li, Xin; Liu, Boxiang; Mangul, Serghei; McCarthy, Mark I.; McDowell, Ian C.; Mohammadi, Pejman; Monlong, Jean; Montgomery, Stephen B.; Muñoz-Aguirre, Manuel; Ndungu, Anne W.; Nobel, Andrew B.; Oliva, Meritxell; Ongen, Halit; Palowitch, John J.; Panousis, Nikolaos; Papasaikas, Panagiotis; Park, YoSon; Parsana, Princy; Payne, Anthony J.; Peterson, Christine B.; Quan, Jie; Reverter, Ferran; Sabatti, Chiara; Saha, Ashis; Sammeth, Michael; Scott, Alexandra J.; Shabalin, Andrey A.; Sodaei, Reza; Stephens, Matthew; Stranger, Barbara E.; Strober, Benjamin J.; Sul, Jae Hoon; Tsang, Emily K.; Urbut, Sarah; van de Bunt, Martijn; Wang, Gao; Wen, Xiaoquan; Wright, Fred A.; Xi, Hualin S.; Yeger-Lotem, Esti; Zappala, Zachary; Zaugg, Judith B.; Zhou, Yi-Hui; Akey, Joshua M.; Bates, Daniel; Chan, Joanne; Claussnitzer, Melina; Demanelis, Kathryn; Diegel, Morgan; Doherty, Jennifer A.; Feinberg, Andrew P.; Fernando, Marian S.; Halow, Jessica; Hansen, Kasper D.; Haugen, Eric; Hickey, Peter F.; Hou, Lei; Jasmine, Farzana; Jian, Ruiqi; Jiang, Lihua; Johnson, Audra; Kaul, Rajinder; Kellis, Manolis; Kibriya, Muhammad G.; Lee, Kristen; Li, Jin Billy; Li, Qin; Lin, Jessica; Lin, Shin; Linder, Sandra; Linke, Caroline; Liu, Yaping; Maurano, Matthew T.; Molinie, Benoit; Nelson, Jemma; Neri, Fidencio J.; Park, Yongjin; Pierce, Brandon L.; Rinaldi, Nicola J.; Rizzardi, Lindsay F.; Sandstrom, Richard; Skol, Andrew; Smith, Kevin S.; Snyder, Michael P.; Stamatoyannopoulos, John; Tang, Hua; Wang, Li; Wang, Meng; Van Wittenberghe, Nicholas; Wu, Fan; Zhang, Rui; Nierras, Concepcion R.; Branton, Philip A.; Carithers, Latarsha J.; Guan, Ping; Moore, Helen M.; Rao, Abhi; Vaught, Jimmie B.; Gould, Sarah E.; Lockart, Nicole C.; Martin, Casey; Struewing, Jeffery P.; Volpi, Simona; Addington, 
Anjene M.; Koester, Susan E.; Little, A. Roger; Brigham, Lori E.; Hasz, Richard; Hunter, Marcus; Johns, Christopher; Johnson, Mark; Kopen, Gene; Leinweber, William F.; Lonsdale, John T.; McDonald, Alisa; Mestichelli, Bernadette; Myer, Kevin; Roe, Brian; Salvatore, Michael; Shad, Saboor; Thomas, Jeffrey A.; Walters, Gary; Washington, Michael; Wheeler, Joseph; Bridge, Jason; Foster, Barbara A.; Gillard, Bryan M.; Karasik, Ellen; Kumar, Rachna; Miklos, Mark; Moser, Michael T.; Jewell, Scott D.; Montroy, Robert G.; Rohrer, Daniel C.; Valley, Dana R.; Davis, David A.; Mash, Deborah C.; Undale, Anita H.; Smith, Anna M.; Tabor, David E.; Roche, Nancy V.; McLean, Jeffrey A.; Vatanian, Negin; Robinson, Karna L.; Sobin, Leslie; Barcus, Mary E.; Valentino, Kimberly M.; Qi, Liqun; Hunter, Steven; Hariharan, Pushpa; Singh, Shilpi; Um, Ki Sung; Matose, Takunda; Tomaszewski, Maria M.; Barker, Laura K.; Mosavel, Maghboeba; Siminoff, Laura A.; Traino, Heather M.; Flicek, Paul; Juettemann, Thomas; Ruffier, Magali; Sheppard, Dan; Taylor, Kieron; Trevanion, Stephen J.; Zerbino, Daniel R.; Craft, Brian; Goldman, Mary; Haeussler, Maximilian; Kent, W. James; Lee, Christopher M.; Paten, Benedict; Rosenbloom, Kate R.; Vivian, John; Zhu, Jingchun; Nicolae, Dan L.; Cox, Nancy J.; Im, Hae Kyung
Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations are tissue specific, suggesting context specificity of the trait etiology.
Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.

Publication: Resolving the phylogenetic origin of glioblastoma via multifocal genomic analysis of pre-treatment and treatment-resistant autopsy specimens (Nature Publishing Group UK, 2017)
Brastianos, Priscilla; Nayyar, Naema; Rosebrock, Daniel; Leshchiner, Ignaty; Gill, Corey M.; Livitz, Dimitri; Bertalan, Mia S.; D’Andrea, Megan; Hoang, Kaitlin; Aquilanti, Elisa; Chukwueke, Ugonma; Kaneb, Andrew; Chi, Andrew; Plotkin, Scott; Gerstner, Elizabeth; Frosch, Mathew P.; Suva, Mario; Cahill, Daniel; Getz, Gad; Batchelor, Tracy
Glioblastomas are malignant neoplasms composed of diverse cell populations. This intratumoral diversity has an underlying architecture, with a hierarchical relationship through clonal evolution from a common ancestor. Therapies are limited by emergence of resistant subclones from this phylogenetic reservoir. To characterize this clonal ancestral origin of recurrent tumors, we determined phylogenetic relationships using whole exome sequencing of pre-treatment IDH1/2 wild-type glioblastoma specimens, matched to post-treatment autopsy samples (n = 9) and metastatic extracranial post-treatment autopsy samples (n = 3). We identified “truncal” genetic events common to the evolutionary ancestry of the initial specimen and later recurrences, thereby inferring the identity of the precursor cell population. Mutations were identified in a subset of cases in known glioblastoma genes such as NF1 (n = 3), TP53 (n = 4) and EGFR (n = 5). However, by phylogenetic analysis, there were no protein-coding mutations as recurrent truncal events across the majority of cases.
In contrast, whole copy-loss of chromosome 10 (12 of 12 cases), copy-loss of chromosome 9p21 (11 of 12 cases) and copy-gain in chromosome 7 (10 of 12 cases) were identified as shared events in the majority of cases. Strikingly, mutations in the TERT promoter were also identified as shared events in all evaluated pairs (9 of 9). Thus, we define four truncal non-coding genomic alterations that represent early genomic events in gliomagenesis, that identify the persistent cellular reservoir from which glioblastoma recurrences emerge. Therapies to target these key early genomic events are needed. These findings offer an evolutionary explanation for why precision therapies that target protein-coding mutations lack efficacy in GBM.

Publication: Rare Germline Variants in ATM Are Associated with Chronic Lymphocytic Leukemia (2017)
Tiao, Grace; Improgo, Ma. Reina; Kasar, Siddha; Poh, Weijie; Kamburov, Atanas; Landau, Dan-Avi; Tausch, Eugen; Taylor-Weiner, Amaro; Cibulskis, Carrie; Bahl, Samira; Fernandes, Stacey M.; Hoang, Kevin; Rheinbay, Esther; Kim, Haesook T.; Bahlo, Jasmin; Robrecht, Sandra; Fischer, Kirsten; Hallek, Michael; Gabriel, Stacey; Lander, Eric; Stilgenbauer, Stephan; Wu, Catherine; Kiezun, Adam; Getz, Gad; Brown, Jennifer

Publication: Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples (2013)
Cibulskis, Kristian; Lawrence, Michael S.; Carter, Scott L.; Sivachenko, Andrey; Jaffe, David; Sougnez, Carrie; Gabriel, Stacey; Meyerson, Matthew; Lander, Eric; Getz, Gad
Detection of somatic point substitutions is a key step in characterizing the cancer genome. Mutations in cancer are rare (0.1–100/Mb) and often occur only in a subset of the sequenced cells, either due to contamination by normal cells or due to tumor heterogeneity. Consequently, mutation calling methods need to be both specific, avoiding false positives, and sensitive to detect clonal and sub-clonal mutations.
The decreased sensitivity of existing methods for low allelic fraction mutations highlights the pressing need for improved and systematically evaluated mutation detection methods. Here we present MuTect, a method based on a Bayesian classifier designed to detect somatic mutations with very low allele-fractions, requiring only a few supporting reads, followed by a set of carefully tuned filters that ensure high specificity. We also describe novel benchmarking approaches, which use real sequencing data to evaluate the sensitivity and specificity as a function of sequencing depth, base quality and allelic fraction. Compared with other methods, MuTect has higher sensitivity with similar specificity, especially for mutations with allelic fractions as low as 0.1 and below, making MuTect particularly useful for studying cancer subclones and their evolution in standard exome and genome sequencing data.

Publication: Mutational heterogeneity in cancer and the search for new cancer genes (2014)
Lawrence, Michael S.; Stojanov, Petar; Polak, Paz; Kryukov, Gregory V.; Cibulskis, Kristian; Sivachenko, Andrey; Carter, Scott L.; Stewart, Chip; Mermel, Craig; Roberts, Steven A.; Kiezun, Adam; Hammerman, Peter S.; McKenna, Aaron; Drier, Yotam; Zou, Lihua; Ramos, Alex H.; Pugh, Trevor J.; Stransky, Nicolas; Helman, Elena; Kim, Jaegil; Sougnez, Carrie; Ambrogio, Lauren; Nickerson, Elizabeth; Shefler, Erica; Cortés, Maria L.; Auclair, Daniel; Saksena, Gordon; Voet, Douglas; Noble, Michael; DiCara, Daniel; Lin, Pei; Lichtenstein, Lee; Heiman, David I.; Fennell, Timothy; Imielinski, Marcin; Hernandez, Bryan; Hodis, Eran; Baca, Sylvan; Dulak, Austin M.; Lohr, Jens; Landau, Dan-Avi; Wu, Catherine; Melendez-Zajgla, Jorge; Hidalgo-Miranda, Alfredo; Koren, Amnon; McCarroll, Steven; Mora, Jaume; Crompton, Brian; Onofrio, Robert; Parkin, Melissa; Winckler, Wendy; Ardlie, Kristin; Gabriel, Stacey B.; Roberts, Charles W.
M.; Biegel, Jaclyn A.; Stegmaier, Kimberly; Bass, Adam; Garraway, Levi; Meyerson, Matthew; Golub, Todd; Gordenin, Dmitry A.; Sunyaev, Shamil; Lander, Eric; Getz, Gad
Major international projects are now underway aimed at creating a comprehensive catalog of all genes responsible for the initiation and progression of cancer. These studies involve sequencing of matched tumor–normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here, we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false positive findings that overshadow true driver events. Here, we show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumor-normal pairs and discover extraordinary variation in (i) mutation frequency and spectrum within cancer types, which shed light on mutational processes and disease etiology, and (ii) mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and allow true cancer genes to rise to attention.