Publication:
A case-control collapsing analysis identifies epilepsy genes implicated in trio sequencing studies focused on de novo mutations

Date

2017

Publisher

Public Library of Science

Citation

Zhu, X., R. Padmanabhan, B. Copeland, J. Bridgers, Z. Ren, S. Kamalakaran, A. O'Driscoll-Collins, et al. 2017. “A case-control collapsing analysis identifies epilepsy genes implicated in trio sequencing studies focused on de novo mutations.” PLoS Genetics 13 (11): e1007104. doi:10.1371/journal.pgen.1007104. http://dx.doi.org/10.1371/journal.pgen.1007104.

Abstract

Trio exome sequencing has been successful in identifying genes with de novo mutations (DNMs) causing epileptic encephalopathy (EE) and other neurodevelopmental disorders. Here, we evaluate how well a case-control collapsing analysis recovers genes causing dominant forms of EE originally implicated by DNM analysis. We performed a genome-wide search for an enrichment of "qualifying variants" in protein-coding genes in 488 unrelated cases compared to 12,151 unrelated controls. These "qualifying variants" were selected to be extremely rare variants predicted to functionally impact the protein to enrich for likely pathogenic variants. Despite modest sample size, three known EE genes (KCNT1, SCN2A, and STXBP1) achieved genome-wide significance (p < 2.68×10⁻⁶). In addition, six of the 10 most significantly associated genes are known EE genes, and the majority of the known EE genes (17 out of 25) originally implicated in trio sequencing are nominally significant (p < 0.05), a proportion significantly higher than expected (Fisher’s exact p = 2.33×10⁻¹⁷). Our results indicate that a case-control collapsing analysis can identify several of the EE genes originally implicated in trio sequencing studies, and clearly show that additional genes would be implicated with larger sample sizes. The case-control analysis not only makes discovery easier and more economical in early onset disorders, particularly when large cohorts are available, but also supports the use of this approach to identify genes in diseases that present later in life when parents are not readily available.
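
The collapsing approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each sample has already been reduced to the set of genes in which it carries at least one qualifying variant, and it assumes a one-sided Fisher's exact test for enrichment in cases (the abstract does not state the sidedness of the per-gene test). The function names `collapse` and `fisher_exact_greater` and the carrier counts in the usage lines are hypothetical; only the cohort sizes and the significance threshold come from the abstract.

```python
from math import comb

N_CASES = 488               # unrelated cases (from the abstract)
N_CONTROLS = 12151          # unrelated controls (from the abstract)
GENOME_WIDE_ALPHA = 2.68e-6  # genome-wide threshold quoted in the abstract


def collapse(genes_by_sample):
    """Count, per gene, the samples carrying at least one qualifying variant.

    `genes_by_sample` is a hypothetical input: sample ID -> iterable of gene
    symbols in which that sample has a qualifying variant.
    """
    counts = {}
    for genes in genes_by_sample.values():
        for gene in set(genes):  # a sample counts once per gene
            counts[gene] = counts.get(gene, 0) + 1
    return counts


def fisher_exact_greater(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact p-value for excess carriers among cases.

    Sums hypergeometric probabilities of seeing `case_carriers` or more
    carriers among cases, given the total number of carriers.
    """
    carriers = case_carriers + control_carriers
    denom = comb(n_cases + n_controls, carriers)
    tail = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, min(carriers, n_cases) + 1)
    )
    return tail / denom


# Usage with made-up carrier counts for a single hypothetical gene.
p_value = fisher_exact_greater(6, N_CASES, 1, N_CONTROLS)
is_genome_wide_significant = p_value < GENOME_WIDE_ALPHA
```

Collapsing to one indicator per sample per gene is what lets extremely rare variants, each too rare to test individually, contribute jointly to a single gene-level test.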

Keywords

Biology and life sciences, Molecular biology, Molecular biology techniques, Sequencing techniques, DNA sequencing, Gene Sequencing, Medicine and Health Sciences, Neurology, Epilepsy, Computational Biology, Genome Analysis, Genetics, Genomics, Genome Sequencing, Physical Sciences, Mathematics, Discrete Mathematics, Combinatorics, Permutation, Genomic Medicine, Mathematical and Statistical Techniques, Statistical Methods, Multivariate Analysis, Principal Component Analysis, Statistics (Mathematics), Genetic Loci, Alleles

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
