Person: Lek, Monkol
Email Address
AA Acceptance Date
Birth Date
Research Projects
Organizational Units
Job Title
Last Name
First Name
Name
Search Results
Publication Distribution and Medical Impact of Loss-of-Function Variants in the Finnish Founder Population
(Public Library of Science, 2014) Lim, Elaine T.; Würtz, Peter; Havulinna, Aki S.; Palta, Priit; Tukiainen, Taru; Rehnström, Karola; Esko, Tõnu; Mägi, Reedik; Inouye, Michael; Lappalainen, Tuuli; Chan, Yingleong; Salem, Rany M.; Lek, Monkol; Flannick, Jason; Sim, Xueling; Manning, Alisa; Ladenvall, Claes; Bumpstead, Suzannah; Hämäläinen, Eija; Aalto, Kristiina; Maksimow, Mikael; Salmi, Marko; Blankenberg, Stefan; Ardissino, Diego; Shah, Svati; Horne, Benjamin; McPherson, Ruth; Hovingh, Gerald K.; Reilly, Muredach P.; Watkins, Hugh; Goel, Anuj; Farrall, Martin; Girelli, Domenico; Reiner, Alex P.; Stitziel, Nathan O.; Kathiresan, Sekar; Gabriel, Stacey; Barrett, Jeffrey C.; Lehtimäki, Terho; Laakso, Markku; Groop, Leif; Kaprio, Jaakko; Perola, Markus; McCarthy, Mark I.; Boehnke, Michael; Altshuler, David; Lindgren, Cecilia M.; Hirschhorn, Joel N.; Metspalu, Andres; Freimer, Nelson B.; Zeller, Tanja; Jalkanen, Sirpa; Koskinen, Seppo; Raitakari, Olli; Durbin, Richard; MacArthur, Daniel; Salomaa, Veikko; Ripatti, Samuli; Daly, Mark; Palotie, AarnoExome sequencing studies in complex diseases are challenged by the allelic heterogeneity, large number and modest effect sizes of associated variants on disease risk and the presence of large numbers of neutral variants, even in phenotypically relevant genes. Isolated populations with recent bottlenecks offer advantages for studying rare variants in complex diseases as they have deleterious variants that are present at higher frequencies as well as a substantial reduction in rare neutral variation. To explore the potential of the Finnish founder population for studying low-frequency (0.5–5%) variants in complex diseases, we compared exome sequence data on 3,000 Finns to the same number of non-Finnish Europeans and discovered that, despite having fewer variable sites overall, the average Finn has more low-frequency loss-of-function variants and complete gene knockouts. We then used several well-characterized Finnish population cohorts to study the phenotypic effects of 83 enriched loss-of-function variants across 60 phenotypes in 36,262 Finns. Using a deep set of quantitative traits collected on these cohorts, we show 5 associations (p<5×10−8) including splice variants in LPA that lowered plasma lipoprotein(a) levels (P = 1.5×10−117). Through accessing the national medical records of these participants, we evaluate the LPA finding via Mendelian randomization and confirm that these splice variants confer protection from cardiovascular disease (OR = 0.84, P = 3×10−4), demonstrating for the first time the correlation between very low levels of LPA in humans with potential therapeutic implications for cardiovascular diseases. More generally, this study articulates substantial advantages for studying the role of rare variation in complex phenotypes in founder populations like the Finns and by combining a unique population genetic history with data from large population cohorts and centralized research access to National Health Registers.
Publication A rare variant in APOC3 is associated with plasma triglyceride and VLDL levels in Europeans
(Nature Pub. Group, 2014) Timpson, Nicholas J.; Walter, Klaudia; Min, Josine L.; Tachmazidou, Ioanna; Malerba, Giovanni; Shin, So-Youn; Chen, Lu; Futema, Marta; Southam, Lorraine; Iotchkova, Valentina; Cocca, Massimiliano; Huang, Jie; Memari, Yasin; McCarthy, Shane; Danecek, Petr; Muddyman, Dawn; Mangino, Massimo; Menni, Cristina; Perry, John R. B.; Ring, Susan M.; Gaye, Amadou; Dedoussis, George; Farmaki, Aliki-Eleni; Burton, Paul; Talmud, Philippa J.; Gambaro, Giovanni; Spector, Tim D.; Smith, George Davey; Durbin, Richard; Richards, J Brent; Humphries, Steve E.; Zeggini, Eleftheria; Soranzo, Nicole; Al Turki, Saeed; Anderson, Carl; Anney, Richard; Antony, Dinu; Soler Artigas, Maria; Ayub, Muhammad; Balasubramaniam, Senduran; Barrett, Jeffrey C.; Barroso, Inês; Beales, Phil; Bentham, Jamie; Bhattacharya, Shoumo; Birney, Ewan; Blackwood, Douglas; Bobrow, Martin; Bochukova, Elena; Bolton, Patrick; Bounds, Rebecca; Boustred, Chris; Breen, Gerome; Calissano, Mattia; Carss, Keren; Chatterjee, Krishna; Ciampi, Antonio; Cirak, Sebhattin; Clapham, Peter; Clement, Gail; Coates, Guy; Collier, David; Cosgrove, Catherine; Cox, Tony; Craddock, Nick; Crooks, Lucy; Curran, Sarah; Curtis, David; Daly, Allan; Davey Smith, George; Day-Williams, Aaron; Day, Ian N. M.; Down, Thomas; Du, Yuanping; Dunham, Ian; Edkins, Sarah; Ellis, Peter; Evans, David; Faroogi, Sadaf; Fatemifar, Ghazaleh; Fitzpatrick, David R.; Flicek, Paul; Flyod, James; Foley, A Reghan; Franklin, Christopher S; Gallagher, Louise; Gaunt, Tom; Geihs, Matthias; Geschwind, Daniel; Greenwood, Celia; Griffin, Heather; Grozeva, Detelina; Guo, Xueqin; Guo, Xiaosen; Gurling, Hugh; Hart, Deborah; Hendricks, Audrey; Holmans, Peter; Howie, Bryan; Huang, Liren; Hubbard, Tim; Hurles, Matthew E.; Hysi, Pirro; Jackson, David K.; Jamshidi, Yalda; Jing, Tian; Joyce, Chris; Kaye, Jane; Keane, Thomas; Keogh, Julia; Kemp, John; Kennedy, Karen; Kolb-Kokocinski, Anja; Lachance, Genevieve; Langford, Cordelia; Lawson, Daniel; Lee, Irene; Lek, Monkol; Liang, Jieqin; Lin, Hong; Li, Rui; Li, Yingrui; Liu, Ryan; Lönnqvist, Jouko; Lopes, Margarida; Lotchkova, Valentina; MacArthur, Daniel; Marchini, Jonathan; Maslen, John; Massimo, Mangino; Mathieson, Iain; Marenne, Gaëlle; McGuffin, Peter; McIntosh, Andrew; McKechanie, Andrew G.; McQuillin, Andrew; Metrustry, Sarah; Min, Josine; Mitchison, Hannah; Moayyeri, Alireza; Morris, James; Muntoni, Francesco; Northstone, Kate; O'Donnovan, Michael; Onoufriadis, Alexandros; O'Rahilly, Stephen; Oualkacha, Karim; Owen, Michael J.; Palotie, Aarno; Panoutsopoulou, Kalliope; Parker, Victoria; Parr, Jeremy R.; Paternoster, Lavinia; Paunio, Tiina; Payne, Felicity; Perry, John; Pietilainen, Olli; Plagnol, Vincent; Quaye, Lydia; Quail, Michael A.; Raymond, Lucy; Rehnström, Karola; Richards, Brent; Ring, Susan; Ritchie, Graham R. S.; Roberts, Nicola; Savage, David B.; Scambler, Peter; Schiffels, Stephen; Schmidts, Miriam; Schoenmakers, Nadia; Semple, Robert K.; Serra, Eva; Sharp, Sally I.; Shihab, Hasheem; Skuse, David; Small, Kerrin; Spasic-Boskovic, Olivera; Spector, Tim; St Clair, David; Stalker, Jim; Stevens, Elizabeth; St Pourcian, Beate; Sun, Jianping; Surdulescu, Gabriela; Suvisaari, Jaana; Tachmazidou, Ionna; Timpson, Nicholas; Tobin, Martin D.; Valdes, Ana; Van Kogelenberg, Margriet; Vijayarangakannan, Parthiban; Visscher, Peter M.; Wain, Louise V.; Walters, James T. R.; Wang, Guangbiao; Wang, Jun; Wang, Yu; Ward, Kirsten; Wheeler, Elanor; Whyte, Tamieka; Williams, Hywel; Williamson, Kathleen A.; Wilson, Crispian; Wilson, Scott G.; Wong, Kim; Xu, ChangJiang; Yang, Jian; Zhang, Fend; Zhang, Pingbo; Zheng, Hou-FengThe analysis of rich catalogues of genetic variation from population-based sequencing provides an opportunity to screen for functional effects. Here we report a rare variant in APOC3 (rs138326449-A, minor allele frequency ~0.25% (UK)) associated with plasma triglyceride (TG) levels (−1.43 s.d. (s.e.=0.27 per minor allele (P-value=8.0 × 10−8)) discovered in 3,202 individuals with low read-depth, whole-genome sequence. We replicate this in 12,831 participants from five additional samples of Northern and Southern European origin (−1.0 s.d. (s.e.=0.173), P-value=7.32 × 10−9). This is consistent with an effect between 0.5 and 1.5 mmol l−1 dependent on population. We show that a single predicted splice donor variant is responsible for association signals and is independent of known common variants. Analyses suggest an independent relationship between rs138326449 and high-density lipoprotein (HDL) levels. This represents one of the first examples of a rare, large effect variant identified from whole-genome sequencing at a population scale.
Publication The sensitivity of exome sequencing in identifying pathogenic mutations for LGMD in the United States
(2016) Reddy, Hemakumar M.; Cho, Kyung-Ah; Lek, Monkol; Estrella, Elicia; Valkanas, Elise; Jones, Michael D.; Mitsuhashi, Satomi; Darras, Basil; Amato, Anthony; Lidov, Hart; Brownstein, Catherine; Margulies, David; Yu, Timothy W.; Salih, Mustafa A.; Kunkel, Louis; MacArthur, Daniel; Kang, Peter B.The current study characterizes a cohort of limb-girdle muscular dystrophy (LGMD) in the United States using whole exome sequencing. Fifty-five families affected by LGMD were recruited using an institutionally-approved protocol. Exome sequencing was performed on probands and selected parental samples. Pathogenic mutations and co-segregation patterns were confirmed by Sanger sequencing. Twenty-two families (40%) had novel and previously reported pathogenic mutations, primarily in LGMD genes, but also in genes for Duchenne muscular dystrophy, facioscapulohumeral muscular dystrophy, congenital myopathy, myofibrillar myopathy, inclusion body myopathy, and Pompe disease. One family was diagnosed via clinical testing. Dominant mutations were identified in COL6A1, COL6A3, FLNC, LMNA, RYR1, SMCHD1, and VCP, recessive mutations in ANO5, CAPN3, GAA, LAMA2, SGCA, and SGCG, and X-linked mutations in DMD. A previously reported variant in DMD was confirmed to be benign. Exome sequencing is a powerful diagnostic tool for LGMD. Despite careful phenotypic screening, pathogenic mutations were found in other muscle disease genes, largely accounting for the increased sensitivity of exome sequencing. Our experience suggests that broad sequencing panels are useful for these analyses due to the phenotypic overlap of many neuromuscular conditions. The confirmation of a benign DMD variant illustrates the potential of exome sequencing to help determine pathogenicity.
Publication High-throughput discovery of novel developmental phenotypes
(2016) Dickinson, Mary E.; Flenniken, Ann M.; Ji, Xiao; Teboul, Lydia; Wong, Michael D.; White, Jacqueline K.; Meehan, Terrence F.; Weninger, Wolfgang J.; Westerberg, Henrik; Adissu, Hibret; Baker, Candice N.; Bower, Lynette; Brown, James M.; Caddle, L. Brianna; Chiani, Francesco; Clary, Dave; Cleak, James; Daly, Mark; Denegre, James M.; Doe, Brendan; Dolan, Mary E.; Edie, Sarah M.; Fuchs, Helmut; Gailus-Durner, Valerie; Galli, Antonella; Gambadoro, Alessia; Gallegos, Juan; Guo, Shiying; Horner, Neil R.; Hsu, Chih-wei; Johnson, Sara J.; Kalaga, Sowmya; Keith, Lance C.; Lanoue, Louise; Lawson, Thomas N.; Lek, Monkol; Mark, Manuel; Marschall, Susan; Mason, Jeremy; McElwee, Melissa L.; Newbigging, Susan; Nutter, Lauryl M.J.; Peterson, Kevin A.; Ramirez-Solis, Ramiro; Rowland, Douglas J.; Ryder, Edward; Samocha, Kaitlin E.; Seavitt, John R.; Selloum, Mohammed; Szoke-Kovacs, Zsombor; Tamura, Masaru; Trainor, Amanda G; Tudose, Ilinca; Wakana, Shigeharu; Warren, Jonathan; Wendling, Olivia; West, David B.; Wong, Leeyean; Yoshiki, Atsushi; MacArthur, Daniel; Tocchini-Valentini, Glauco P.; Gao, Xiang; Flicek, Paul; Bradley, Allan; Skarnes, William C.; Justice, Monica J.; Parkinson, Helen E.; Moore, Mark; Wells, Sara; Braun, Robert E.; Svenson, Karen L.; de Angelis, Martin Hrabe; Herault, Yann; Mohun, Tim; Mallon, Ann-Marie; Henkelman, R. Mark; Brown, Steve D.M.; Adams, David J.; Lloyd, K.C. Kent; McKerlie, Colin; Beaudet, Arthur L.; Bucan, Maja; Murray, Stephen A.Approximately one third of all mammalian genes are essential for life. Phenotypes resulting from mouse knockouts of these genes have provided tremendous insight into gene function and congenital disorders. As part of the International Mouse Phenotyping Consortium effort to generate and phenotypically characterize 5000 knockout mouse lines, we have identified 410 lethal genes during the production of the first 1751 unique gene knockouts. Using a standardised phenotyping platform that incorporates high-resolution 3D imaging, we identified novel phenotypes at multiple time points for previously uncharacterized genes and additional phenotypes for genes with previously reported mutant phenotypes. Unexpectedly, our analysis reveals that incomplete penetrance and variable expressivity are common even on a defined genetic background. In addition, we show that human disease genes are enriched for essential genes identified in our screen, thus providing a novel dataset that facilitates prioritization and validation of mutations identified in clinical sequencing efforts.
Publication Allelic Expression of Deleterious Protein-Coding Variants across Human Tissues
(Public Library of Science, 2014) Kukurba, Kimberly R.; Zhang, Rui; Li, Xin; Smith, Kevin S.; Knowles, David A.; How Tan, Meng; Piskol, Robert; Lek, Monkol; Snyder, Michael; MacArthur, Daniel; Li, Jin Billy; Montgomery, Stephen B.Personal exome and genome sequencing provides access to loss-of-function and rare deleterious alleles whose interpretation is expected to provide insight into individual disease burden. However, for each allele, accurate interpretation of its effect will depend on both its penetrance and the trait's expressivity. In this regard, an important factor that can modify the effect of a pathogenic coding allele is its level of expression; a factor which itself characteristically changes across tissues. To better inform the degree to which pathogenic alleles can be modified by expression level across multiple tissues, we have conducted exome, RNA and deep, targeted allele-specific expression (ASE) sequencing in ten tissues obtained from a single individual. By combining such data, we report the impact of rare and common loss-of-function variants on allelic expression exposing stronger allelic bias for rare stop-gain variants and informing the extent to which rare deleterious coding alleles are consistently expressed across tissues. This study demonstrates the potential importance of transcriptome data to the interpretation of pathogenic protein-coding variants.
Publication An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge
(BioMed Central, 2014) Brownstein, Catherine; Beggs, Alan; Homer, Nils; Merriman, Barry; Yu, Timothy W; Flannery, Katherine; DeChene, Elizabeth T; Towne, Meghan C; Savage, Sarah K; Price, Emily N; Holm, Ingrid; Luquette, Joe; Lyon, Elaine; Majzoub, Joseph; Neupert, Peter; McCallie Jr, David; Szolovits, Peter; Willard, Huntington F; Mendelsohn, Nancy J; Temme, Renee; Finkel, Richard S; Yum, Sabrina W; Medne, Livija; Sunyaev, Shamil; Adzhubey, Ivan; Cassa, Christopher; de Bakker, Paul IW; Duzkale, Hatice; Dworzyński, Piotr; Fairbrother, William; Francioli, Laurent; Funke, Birgit; Giovanni, Monica A; Handsaker, Robert; Lage, Kasper; Lebo, Matthew; Lek, Monkol; Leshchiner, Ignaty; MacArthur, Daniel; McLaughlin, Heather M; Murray, Michael F; Pers, Tune H; Polak, Paz P; Raychaudhuri, Soumya; Rehm, Heidi; Soemedi, Rachel; Stitziel, Nathan O; Vestecka, Sara; Supper, Jochen; Gugenmus, Claudia; Klocke, Bernward; Hahn, Alexander; Schubach, Max; Menzel, Mortiz; Biskup, Saskia; Freisinger, Peter; Deng, Mario; Braun, Martin; Perner, Sven; Smith, Richard JH; Andorf, Janeen L; Huang, Jian; Ryckman, Kelli; Sheffield, Val C; Stone, Edwin M; Bair, Thomas; Black-Ziegelbein, E Ann; Braun, Terry A; Darbro, Benjamin; DeLuca, Adam P; Kolbe, Diana L; Scheetz, Todd E; Shearer, Aiden E; Sompallae, Rama; Wang, Kai; Bassuk, Alexander G; Edens, Erik; Mathews, Katherine; Moore, Steven A; Shchelochkov, Oleg A; Trapane, Pamela; Bossler, Aaron; Campbell, Colleen A; Heusel, Jonathan W; Kwitek, Anne; Maga, Tara; Panzer, Karin; Wassink, Thomas; Van Daele, Douglas; Azaiez, Hela; Booth, Kevin; Meyer, Nic; Segal, Michael M; Williams, Marc S; Tromp, Gerard; White, Peter; Corsmeier, Donald; Fitzgerald-Butt, Sara; Herman, Gail; Lamb-Thrush, Devon; McBride, Kim L; Newsom, David; Pierson, Christopher R; Rakowsky, Alexander T; Maver, Aleš; Lovrečić, Luca; Palandačić, Anja; Peterlin, Borut; Torkamani, Ali; Wedell, Anna; Huss, Mikael; Alexeyenko, Andrey; Lindvall, Jessica M; Magnusson, Måns; Nilsson, Daniel; Stranneheim, Henrik; Taylan, Fulya; Gilissen, Christian; Hoischen, Alexander; van Bon, Bregje; Yntema, Helger; Nelen, Marcel; Zhang, Weidong; Sager, Jason; Zhang, Lu; Blair, Kathryn; Kural, Deniz; Cariaso, Michael; Lennon, Greg G; Javed, Asif; Agrawal, Saloni; Ng, Pauline C; Sandhu, Komal S; Krishna, Shuba; Veeramachaneni, Vamsi; Isakov, Ofer; Halperin, Eran; Friedman, Eitan; Shomron, Noam; Glusman, Gustavo; Roach, Jared C; Caballero, Juan; Cox, Hannah C; Mauldin, Denise; Ament, Seth A; Rowen, Lee; Richards, Daniel R; Lucas, F Anthony San; Gonzalez-Garay, Manuel L; Caskey, C Thomas; Bai, Yu; Huang, Ying; Fang, Fang; Zhang, Yan; Wang, Zhengyuan; Barrera, Jorge; Garcia-Lobo, Juan M; González-Lamuño, Domingo; Llorca, Javier; Rodriguez, Maria C; Varela, Ignacio; Reese, Martin G; De La Vega, Francisco M; Kiruluta, Edward; Cargill, Michele; Hart, Reece K; Sorenson, Jon M; Lyon, Gholson J; Stevenson, David A; Bray, Bruce E; Moore, Barry M; Eilbeck, Karen; Yandell, Mark; Zhao, Hongyu; Hou, Lin; Chen, Xiaowei; Yan, Xiting; Chen, Mengjie; Li, Cong; Yang, Can; Gunel, Murat; Li, Peining; Kong, Yong; Alexander, Austin C; Albertyn, Zayed I; Boycott, Kym M; Bulman, Dennis E; Gordon, Paul MK; Innes, A Micheil; Knoppers, Bartha M; Majewski, Jacek; Marshall, Christian R; Parboosingh, Jillian S; Sawyer, Sarah L; Samuels, Mark E; Schwartzentruber, Jeremy; Kohane, Isaac; Margulies, DavidBackground: There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance. Results: A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity of the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization. Conclusions: The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups.
Publication Transcriptome and genome sequencing uncovers functional variation in humans
(2013) Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil TSummary Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.
Publication Quantifying unobserved protein-coding variants in human populations provides a roadmap for large-scale sequencing projects
(Nature Publishing Group, 2016) Zou, James; Valiant, Gregory; Valiant, Paul; Karczewski, Konrad; Chan, Siu On; Samocha, Kaitlin E.; Lek, Monkol; Sunyaev, Shamil; Daly, Mark; MacArthur, DanielAs new proposals aim to sequence ever larger collection of humans, it is critical to have a quantitative framework to evaluate the statistical power of these projects. We developed a new algorithm, UnseenEst, and applied it to the exomes of 60,706 individuals to estimate the frequency distribution of all protein-coding variants, including rare variants that have not been observed yet in the current cohorts. Our results quantified the number of new variants that we expect to identify as sequencing cohorts reach hundreds of thousands of individuals. With 500K individuals, we find that we expect to capture 7.5% of all possible loss-of-function variants and 12% of all possible missense variants. We also estimate that 2,900 genes have loss-of-function frequency of <0.00001 in healthy humans, consistent with very strong intolerance to gene inactivation.
Publication The ExAC browser: displaying reference data information from over 60 000 exomes
(Oxford University Press, 2017) Karczewski, Konrad; Weisburd, Ben; Thomas, Brett; Solomonson, Matthew; Ruderfer, Douglas M.; Kavanagh, David; Hamamsy, Tymor; Lek, Monkol; Samocha, Kaitlin E.; Cummings, Beryl; Birnbaum, Daniel; Daly, Mark; MacArthur, DanielWorldwide, hundreds of thousands of humans have had their genomes or exomes sequenced, and access to the resulting data sets can provide valuable information for variant interpretation and understanding gene function. Here, we present a lightweight, flexible browser framework to display large population datasets of genetic variation. We demonstrate its use for exome sequence data from 60 706 individuals in the Exome Aggregation Consortium (ExAC). The ExAC browser provides gene- and transcript-centric displays of variation, a critical view for clinical applications. Additionally, we provide a variant display, which includes population frequency and functional annotation data as well as short read support for the called variant. This browser is open-source, freely available at http://exac.broadinstitute.org, and has already been used extensively by clinical laboratories worldwide.
Publication Patterns of genic intolerance of rare copy number variation in 59,898 human exomes
(2016) Ruderfer, Douglas M.; Hamamsy, Tymor; Lek, Monkol; Karczewski, Konrad; Kavanagh, David; Samocha, Kaitlin E.; Daly, Mark; MacArthur, Daniel; Fromer, Menachem; Purcell, Shaun M.Copy number variation (CNV) impacting protein-coding genes contributes significantly to human diversity and disease. Here we characterized the rates and properties of rare genic CNV (<0.5% frequency) in exome-sequencing data from nearly 60,000 individuals in the Exome Aggregation Consortium (ExAC). On average, individuals possessed 0.81 deleted and 1.75 duplicated genes, and most (70%) carried at least one rare genic CNV. For every gene, we empirically estimated an index of relative intolerance to CNVs that demonstrated moderate correlation with measures of genic constraint based on single-nucleotide variation (SNV) and was independently correlated with measures of evolutionary conservation. For individuals with schizophrenia, genes impacted by CNVs were more intolerant than in controls. ExAC CNV data constitutes a critical component of an integrated database spanning the spectrum of human genetic variation, aiding the interpretation of personal genomes as well as population-based disease studies. These data are freely available for download and visualization online.
- «
- 1 (current)
- 2
- 3
- »