Person: Daly, Mark
Search Results
Publication Testing For An Unusual Distribution Of Rare Variants
(Public Library of Science, 2011) Neale, Benjamin; Rivas, Manuel A.; Voight, Benjamin F.; Altshuler, David; Devlin, Bernie; Orho-Melander, Marju; Kathiresan, Sekar; Purcell, Shaun M.; Roeder, Kathryn; Daly, Mark
Technological advances make it possible to use high-throughput sequencing as a primary discovery tool of medical genetics, specifically for assaying rare variation. Still, this approach faces the analytic challenge that the influence of very rare variants can only be evaluated effectively as a group. A further complication is that any given rare variant could have no effect, could increase risk, or could be protective. We propose here the C-alpha test statistic as a novel approach for testing for the presence of this mixture of effects across a set of rare variants. Unlike existing burden tests, C-alpha, by testing the variance rather than the mean, maintains consistent power when the target set contains both risk and protective variants. Through simulations and analysis of case/control data, we demonstrate good power relative to existing methods that assess the burden of rare variants in individuals.
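The idea of testing the variance rather than the mean can be illustrated with a small sketch. The code below is not taken from the paper: it assumes a simple binomial model in which each copy of a rare variant falls in a case with probability `p0` (the fraction of cases in the sample), and the function names and toy counts are hypothetical. It omits refinements such as special handling of singletons.

```python
import math

def overdispersion_stat(case_counts, total_counts, p0):
    """Sum over variants of (observed - expected)^2 minus the binomial variance."""
    t = 0.0
    for y, n in zip(case_counts, total_counts):
        t += (y - n * p0) ** 2 - n * p0 * (1 - p0)
    return t

def null_variance(total_counts, p0):
    """Variance of the statistic when every variant follows Binomial(n, p0)."""
    c = 0.0
    for n in total_counts:
        for u in range(n + 1):
            prob = math.comb(n, u) * p0 ** u * (1 - p0) ** (n - u)
            c += ((u - n * p0) ** 2 - n * p0 * (1 - p0)) ** 2 * prob
    return c

def z_score(case_counts, total_counts, p0):
    """Standardized statistic; large positive values suggest a mixture of effects."""
    t = overdispersion_stat(case_counts, total_counts, p0)
    return t / math.sqrt(null_variance(total_counts, p0))

# Toy data (hypothetical): two variants seen only in cases, two only in controls.
case_counts = [4, 0, 4, 0]
total_counts = [4, 4, 4, 4]
print(z_score(case_counts, total_counts, p0=0.5))
```

In this toy example the total number of variant copies in cases equals its expectation (8 of 16), so a test on the mean sees nothing, while the variance-based statistic is clearly elevated.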
Publication Genome-Wide Association Studies in an Isolated Founder Population from the Pacific Island of Kosrae
(Public Library of Science, 2009) Lowe, Jennifer K.; Maller, Julian B.; Pe'er, Itsik; Neale, Benjamin M.; Salit, Jacqueline; Kenny, Eimear E.; Shea, Jessica L.; Burkhardt, Ralph; Ji, Weizhen; Noel, Martha; Foo, Jia Nee; Blundell, Maude L.; Skilling, Vita; Garcia, Laura; Sullivan, Marcia L.; Lee, Heather E.; Labek, Anna; Ferdowsian, Hope; Auerbach, Steven B.; Lifton, Richard P.; Breslow, Jan L.; Stoffel, Markus; Smith, J. Gustav; Newton-Cheh, Christopher; Daly, Mark; Altshuler, David; Friedman, Jeffrey M.
It has been argued that the limited genetic diversity and reduced allelic heterogeneity observed in isolated founder populations facilitate discovery of loci contributing to both Mendelian and complex disease. A strong founder effect, severe isolation, and substantial inbreeding have dramatically reduced genetic diversity in natives from the island of Kosrae, Federated States of Micronesia, who exhibit a high prevalence of obesity and other metabolic disorders. We hypothesized that genetic drift and possibly natural selection on Kosrae might have increased the frequency of previously rare genetic variants with relatively large effects, making these alleles readily detectable in genome-wide association analysis. However, mapping in large, inbred cohorts introduces analytic challenges, as extensive relatedness between subjects violates the assumptions of independence upon which traditional association test statistics are based. We performed genome-wide association analysis for 15 quantitative traits in 2,906 members of the Kosrae population, using novel approaches to manage the extreme relatedness in the sample. As positive controls, we observe association to known loci for plasma cholesterol, triglycerides, and C-reactive protein and to compelling candidate loci for thyroid stimulating hormone and fasting plasma glucose. We show that our study is well powered to detect common alleles explaining ≥5% phenotypic variance. However, no such large effects were observed with genome-wide significance, arguing that even in such a severely inbred population, common alleles typically have modest effects. Finally, we show that a majority of common variants discovered in Caucasians have indistinguishable effect sizes on Kosrae, despite the major differences in population genetics and environment.
Publication Autosomal Monoallelic Expression in the Mouse
(BioMed Central, 2012) Zwemer, Lillian M; Zak, Alexander; Thompson, Benjamin R; Kirby, Andrew; Daly, Mark; Chess, Andrew; Gimelbrant, Alexander
Background: Random monoallelic expression defines an unusual class of genes displaying random choice for expression between the maternal and paternal alleles. Once established, the allele-specific expression pattern is stably maintained and mitotically inherited. Examples of random monoallelic genes include those found on the X-chromosome and a subset of autosomal genes, which have been most extensively studied in humans. Here, we report a genome-wide analysis of random monoallelic expression in the mouse. We used high density mouse genome polymorphism mapping arrays to assess allele-specific expression in clonal cell lines derived from heterozygous mouse strains. Results: Over 1,300 autosomal genes were assessed for allele-specific expression, and greater than 10% of them showed random monoallelic expression. When comparing mouse and human, the number of autosomal orthologs demonstrating random monoallelic expression in both organisms was greater than would be expected by chance. Random monoallelic expression on the mouse autosomes is broadly similar to that in human cells: it is widespread throughout the genome, lacks chromosome-wide coordination, and varies between cell types. However, for some mouse genes, there appears to be skewing, in some ways resembling skewed X-inactivation, wherein one allele is more frequently active. Conclusions: These data suggest that autosomal random monoallelic expression was present at least as far back as the last common ancestor of rodents and primates. Random monoallelic expression can lead to phenotypic variation beyond the phenotypic variation dictated by genotypic variation. Thus, it is important to take into account random monoallelic expression when examining genotype-phenotype correlation.
Publication Complex Reorganization and Predominant Non-Homologous Repair Following Chromosomal Breakage in Karyotypically Balanced Germline Rearrangements and Transgenic Integration
(Nature Publishing Group, 2012) Chiang, Colby; Jacobsen, Jessie C.; Ernst, Carl; Hanscom, Carrie; Heilbut, Adrian; Blumenthal, Ian; Mills, Ryan E.; Kirby, Andrew; Rudiger, Skye R.; McLaughlan, Clive J.; Bawden, C. Simon; Reid, Suzanne J.; Faull, Richard L. M.; Snell, Russell G.; Hall, Ira M.; Ohsumi, Toshiro K.; Shen, Yiping; Borowsky, Mark L; Daly, Mark; Lee, Charles; Morton, Cynthia; MacDonald, Marcy; Gusella, James; Talkowski, Michael; Lindgren, Amelia M.
We defined the genetic landscape of balanced chromosomal rearrangements at nucleotide resolution by sequencing 141 breakpoints from cytogenetically-interpreted translocations and inversions. We confirm that the recently described phenomenon of “chromothripsis” (massive chromosomal shattering and reorganization) is not unique to cancer cells but also occurs in the germline where it can resolve to a karyotypically balanced state with frequent inversions. We detected a high incidence of complex rearrangements (19.2%) and substantially less reliance on microhomology (31%) than previously observed in benign CNVs. We compared these results to experimentally-generated DNA breakage-repair by sequencing seven transgenic animals, and revealed extensive rearrangement of the transgene and host genome with similar complexity to human germline alterations. Inversion is the most common rearrangement, suggesting that a combined mechanism involving template switching and non-homologous repair mediates the formation of balanced complex rearrangements that are viable, stably replicated and transmitted unaltered to subsequent generations.
Publication Patterns and rates of exonic de novo mutations in autism spectrum disorders
(2013) Neale, Benjamin; Kou, Yan; Liu, Li; Ma'ayan, Avi; Samocha, Kaitlin E.; Sabo, Aniko; Lin, Chiao-Feng; Stevens, Christine; Wang, Li-San; Makarov, Vladimir; Polak, Paz; Yoon, Seungtai; Maguire, Jared; Crawford, Emily L.; Campbell, Nicholas G.; Geller, Evan T.; Valladares, Otto; Shafer, Chad; Liu, Han; Zhao, Tuo; Cai, Guiqing; Lihm, Jayon; Dannenfelser, Ruth; Jabado, Omar; Peralta, Zuleyma; Nagaswamy, Uma; Muzny, Donna; Reid, Jeffrey G.; Newsham, Irene; Wu, Yuanqing; Lewis, Lora; Han, Yi; Voight, Benjamin F.; Lim, Elaine; Rossin, Elizabeth; Kirby, Andrew; Flannick, Jason; Fromer, Menachem; Shakir, Khalid; Fennell, Tim; Garimella, Kiran; Banks, Eric; Poplin, Ryan; Gabriel, Stacey; DePristo, Mark; Wimbish, Jack R.; Boone, Braden E.; Levy, Shawn E.; Betancur, Catalina; Sunyaev, Shamil; Boerwinkle, Eric; Buxbaum, Joseph D.; Cook, Edwin H.; Devlin, Bernie; Gibbs, Richard A.; Roeder, Kathryn; Schellenberg, Gerard D.; Sutcliffe, James S.; Daly, Mark
Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified. To identify further genetic risk factors, we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant and the overall rate of mutation is only modestly higher than the expected rate. In contrast, there is significantly enriched connectivity among the proteins encoded by genes harboring de novo missense or nonsense mutations, and excess connectivity to prior ASD genes of major effect, suggesting a subset of observed events are relevant to ASD risk. The small increase in rate of de novo events, when taken together with the connections among the proteins themselves and to ASD, is consistent with an important but limited role for de novo point mutations, similar to that documented for de novo copy number variants. Genetic models incorporating these data suggest that the majority of observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (i.e., not necessarily causal). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increase risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case-control study provide strong evidence in favor of CHD8 and KATNAL2 as genuine autism risk factors.
Publication Analysis of Rare, Exonic Variation amongst Subjects with Autism Spectrum Disorders and Population Controls
(Public Library of Science, 2013) Liu, Li; Sabo, Aniko; Neale, Benjamin; Nagaswamy, Uma; Stevens, Christine; Lim, Elaine; Bodea, Corneliu A.; Muzny, Donna; Reid, Jeffrey G.; Banks, Eric; Coon, Hillary; DePristo, Mark; Dinh, Huyen; Fennel, Tim; Flannick, Jason; Gabriel, Stacey; Garimella, Kiran; Gross, Shannon; Hawes, Alicia; Lewis, Lora; Makarov, Vladimir; Maguire, Jared; Newsham, Irene; Poplin, Ryan; Ripke, Stephan; Shakir, Khalid; Samocha, Kaitlin E.; Wu, Yuanqing; Boerwinkle, Eric; Buxbaum, Joseph D.; Cook, Edwin H., Jr.; Devlin, Bernie; Schellenberg, Gerard D.; Sutcliffe, James S.; Daly, Mark; Gibbs, Richard A.; Roeder, Kathryn
We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analysis of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD.
Publication Lack of Association of Rare Functional Variants in TSC1/TSC2 Genes with Autism Spectrum Disorder
(BioMed Central, 2013) Bahl, Samira; Chiang, Colby; Beauchamp, Roberta L; Neale, Benjamin; Daly, Mark; Gusella, James; Talkowski, Michael; Ramesh, Vijaya
Background: Autism spectrum disorder (ASD) is reported in 30 to 60% of patients with tuberous sclerosis complex (TSC) but shared genetic mechanisms that exist between TSC-associated ASD and idiopathic ASD have yet to be determined. Through the small G-protein Rheb, the TSC proteins, hamartin and tuberin, negatively regulate mammalian target of rapamycin complex 1 (mTORC1) signaling. It is well established that mTORC1 plays a pivotal role in neuronal translation and connectivity, so dysregulation of mTORC1 signaling could be a common feature in many ASDs. Pam, an E3 ubiquitin ligase, binds to TSC proteins and regulates mTORC1 signaling in the CNS, and the FBXO45-Pam ubiquitin ligase complex plays an essential role in neurodevelopment by regulating synapse formation and growth. Since mounting evidence has established autism as a disorder of the synapses, we tested whether rare genetic variants in TSC1, TSC2, MYCBP2, RHEB and FBXO45, genes that regulate mTORC1 signaling and/or play a role in synapse development and function, contribute to the pathogenesis of idiopathic ASD. Methods: Exons and splice junctions of TSC1, TSC2, MYCBP2, RHEB and FBXO45 were resequenced for 300 ASD trios from the Simons Simplex Collection (SSC) using a pooled PCR amplification and next-generation sequencing strategy, targeted to the discovery of deleterious coding variation. These detected, potentially functional, variants were confirmed by Sanger sequencing of the individual samples comprising the pools in which they were identified. Results: We identified a total of 23 missense variants in MYCBP2, TSC1 and TSC2. These variants exhibited a near equal distribution between the proband and parental pools, with no statistical excess in ASD cases (P > 0.05). All proband variants were inherited. No putative deleterious variants were confirmed in RHEB and FBXO45. Three intronic variants, identified as potential splice defects in MYCBP2, did not show aberrant splicing upon RNA assay. Overall, we did not find an over-representation of ASD causal variants in the genes studied to support them as contributors to autism susceptibility. Conclusions: We did not observe an enrichment of rare functional variants in TSC1 and TSC2 genes in our sample set of 300 trios.
Publication Mapping Copy Number Variation by Population Scale Genome Sequencing
(Nature Publishing Group, 2011) Mills, Ryan Edward; Handsaker, Robert; Korn, Joshua; Nemesh, James; Shi, Xinghua; Lee, Charles; McCarroll, Steven; Altshuler, David; Gabriel, Stacey B.; Lander, Eric; Ambrogio, Lauren; Bloom, Toby; Cibulskis, Kristian; Fennell, Tim J.; Jaffe, David B.; Shefler, Erica; Sougnez, Carrie L.; Daly, Mark; DePristo, Mark A.; Ball, Aaron D.; Banks, Eric; Browning, Brian L.; Garimella, Kiran V.; Grossman, Sharon; Hanna, Matt; Hartl, Chris; Kernytsky, Andrew M.; Li, Heng; Maguire, Jared R.; McKenna, Aaron; Philippakis, Anthony Andrew; Poplin, Ryan E.; Price, Alkes; Rivas, Manuel A.; Sabeti, Pardis; Schaffner, Stephen; Shlyakhter, Ilya; Wilkinson, Jane
Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.
Publication A Map of Human Genome Variation from Population Scale Sequencing
(Nature Publishing Group, 2010) Altshuler, David; Lander, Eric; Ambrogio, Lauren; Bloom, Toby; Cibulskis, Kristian; Fennell, Tim J.; Gabriel, Stacey B.; Jaffe, David B.; Shefler, Erica; Sougnez, Carrie L.; Lee, Charles; Mills, Ryan Edward; Shi, Xinghua; Daly, Mark; DePristo, Mark A.; Ball, Aaron D.; Banks, Eric; Browning, Brian L.; Garimella, Kiran V.; Grossman, Sharon; Handsaker, Robert; Hanna, Matt; Hartl, Chris; Kernytsky, Andrew M.; Korn, Joshua M.; Li, Heng; Maguire, Jared R.; McCarroll, Steven; Nemesh, James C.; McKenna, Aaron; Philippakis, Anthony Andrew; Poplin, Ryan E.; Price, Alkes; Rivas, Manuel A.; Sabeti, Pardis; Schaffner, Stephen; Shlyakhter, Ilya
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother–father–child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
Publication Common Inherited Variation in Mitochondrial Genes Is Not Enriched for Associations with Type 2 Diabetes or Related Glycemic Traits
(Public Library of Science, 2010) Segrè, Ayellet V.; Groop, Leif; Mootha, Vamsi; Daly, Mark; Altshuler, David
Mitochondrial dysfunction has been observed in skeletal muscle of people with diabetes and insulin-resistant individuals. Furthermore, inherited mutations in mitochondrial DNA can cause a rare form of diabetes. However, it is unclear whether mitochondrial dysfunction is a primary cause of the common form of diabetes. To date, common genetic variants robustly associated with type 2 diabetes (T2D) are not known to affect mitochondrial function. One possibility is that multiple mitochondrial genes contain modest genetic effects that collectively influence T2D risk. To test this hypothesis we developed a method named Meta-Analysis Gene-set Enrichment of variaNT Associations (MAGENTA; http://www.broadinstitute.org/mpg/magenta). MAGENTA, in analogy to Gene Set Enrichment Analysis, tests whether sets of functionally related genes are enriched for associations with a polygenic disease or trait. MAGENTA was specifically designed to exploit the statistical power of large genome-wide association (GWA) study meta-analyses whose individual genotypes are not available. This is achieved by combining variant association p-values into gene scores and then correcting for confounders, such as gene size, variant number, and linkage disequilibrium properties. Using simulations, we determined the range of parameters for which MAGENTA can detect associations likely missed by single-marker analysis. We verified MAGENTA's performance on empirical data by identifying known relevant pathways in lipid and lipoprotein GWA meta-analyses. We then tested our mitochondrial hypothesis by applying MAGENTA to three gene sets: nuclear regulators of mitochondrial genes, oxidative phosphorylation genes, and ∼1,000 nuclear-encoded mitochondrial genes. The analysis was performed using the most recent T2D GWA meta-analysis of 47,117 people and meta-analyses of seven diabetes-related glycemic traits (up to 46,186 non-diabetic individuals). This well-powered analysis found no significant enrichment of associations to T2D or any of the glycemic traits in any of the gene sets tested. These results suggest that common variants affecting nuclear-encoded mitochondrial genes have at most a small genetic contribution to T2D susceptibility.
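The general shape of a gene-set enrichment test built from variant p-values can be sketched as follows. This is not the MAGENTA implementation: the choice of the smallest variant p-value as the gene score, the 5% cutoff, the permutation scheme, and all function and gene names are assumptions made for illustration, and the confounder correction described in the abstract is deliberately left out.

```python
import random

def gene_scores(variant_pvalues):
    """Collapse each gene's variant p-values into one score (assumed: the minimum)."""
    return {gene: min(ps) for gene, ps in variant_pvalues.items()}

def enrichment_pvalue(scores, gene_set, top_fraction=0.05, n_perm=2000, seed=0):
    """Count set members among the top-scoring genes and compare the count
    with randomly drawn gene sets of the same size."""
    genes = sorted(scores)
    ranked = sorted(genes, key=lambda g: scores[g])
    n_top = max(1, int(len(genes) * top_fraction))
    top = set(ranked[:n_top])
    members = [g for g in gene_set if g in scores]
    observed = sum(g in top for g in members)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, len(members))
        if sum(g in top for g in sample) >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): 100 genes, gene i has variants with p-values >= (i+1)/1000.
variants = {f"g{i:03d}": [(i + 1) / 1000, 0.9] for i in range(100)}
scores = gene_scores(variants)
print(enrichment_pvalue(scores, {"g000", "g001", "g002", "g003", "g050"}))
```

A real analysis would first adjust the gene scores for gene size, variant number, and linkage disequilibrium, since longer genes with more variants tend to have smaller minimum p-values by chance.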