Person: Cassa, Christopher
Last Name: Cassa
First Name: Christopher
Name: Cassa, Christopher
Search Results (showing 1 - 10 of 11)

Publication Evidence for secondary-variant genetic burden and non-random distribution across biological modules in a recessive ciliopathy (SpringerNature, 2018-07-05)
Kousi, Maria; Söylemez, Onuralp; Ozanturk, Aysegül; Akle, Sebastian; Jungreis, Irwin; Muller, Jean-Francois; Cassa, Christopher; Brand, Harrison; Mokry, Jill Anne; Wolf, Maxim; Sadeghpour, Azita; McFadden, Kelsey; Lewis, Richard A.; Talkowski, Michael; Dollfus, Hélène; Kellis, Manolis; Davis, Erica E.; Sunyaev, Shamil; Katsanis, Nicholas
The influence of genetic background on driver mutations is well established; however, the mechanisms by which the background interacts with Mendelian loci remain unclear. We performed a systematic secondary-variant burden analysis of two independent Bardet-Biedl syndrome (BBS) cohorts with known recessive biallelic pathogenic mutations in one of 17 BBS genes for each individual. We observed a significant enrichment of trans-acting rare nonsynonymous secondary variants compared to either population controls or to a cohort of individuals with a non-BBS diagnosis and recessive variants in the same gene set. Strikingly, we found a significant over-representation of secondary alleles in chaperonin-encoding genes, a finding corroborated by the observation of epistatic interactions involving this complex in vivo. These data indicate a complex genetic architecture for BBS that informs the biological properties of disease modules and presents a model paradigm for secondary-variant burden analysis in recessive disorders.

Publication Estimating the Selective Effects of Heterozygous Protein Truncating Variants from Human Exome Data (2017)
Cassa, Christopher; Weghorn, Donate; Balick, Daniel; Jordan, Daniel M.; Nusinow, David; Samocha, Kaitlin E.; O’Donnell-Luria, Anne; MacArthur, Daniel; Daly, Mark; Beier, David R.; Sunyaev, Shamil
The dispensability of individual genes for viability has interested generations of geneticists. 
For some genes it is essential to maintain two functional chromosomal copies, while others may tolerate the loss of one or both copies. Exome sequence data from 60,706 individuals provide sufficient observations of rare protein truncating variants (PTVs) to make genome-wide estimates of selection against heterozygous loss of gene function. The cumulative frequency of rare deleterious PTVs is primarily determined by the balance between incoming mutations and purifying selection rather than genetic drift. This enables the estimation of the genome-wide distribution of selection coefficients for heterozygous PTVs and corresponding Bayesian estimates for individual genes. The strength of selection can discriminate the severity, age of onset, and mode of inheritance in Mendelian exome sequencing cases. We find that genes under the strongest selection are enriched in embryonic lethal mouse knockouts, putatively cell-essential genes, Mendelian disease genes, and regulators of transcription. Screening by essentiality, we find a large set of genes under strong selection that likely have critical function but have not yet been extensively annotated in published literature.

Publication An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge (BioMed Central, 2014)
Brownstein, Catherine; Beggs, Alan; Homer, Nils; Merriman, Barry; Yu, Timothy W; Flannery, Katherine; DeChene, Elizabeth T; Towne, Meghan C; Savage, Sarah K; Price, Emily N; Holm, Ingrid; Luquette, Joe; Lyon, Elaine; Majzoub, Joseph; Neupert, Peter; McCallie Jr, David; Szolovits, Peter; Willard, Huntington F; Mendelsohn, Nancy J; Temme, Renee; Finkel, Richard S; Yum, Sabrina W; Medne, Livija; Sunyaev, Shamil; Adzhubey, Ivan; Cassa, Christopher; de Bakker, Paul IW; Duzkale, Hatice; Dworzyński, Piotr; Fairbrother, William; Francioli, Laurent; Funke, Birgit; Giovanni, Monica A; Handsaker, Robert; Lage, 
Kasper; Lebo, Matthew; Lek, Monkol; Leshchiner, Ignaty; MacArthur, Daniel; McLaughlin, Heather M; Murray, Michael F; Pers, Tune H; Polak, Paz P; Raychaudhuri, Soumya; Rehm, Heidi; Soemedi, Rachel; Stitziel, Nathan O; Vestecka, Sara; Supper, Jochen; Gugenmus, Claudia; Klocke, Bernward; Hahn, Alexander; Schubach, Max; Menzel, Mortiz; Biskup, Saskia; Freisinger, Peter; Deng, Mario; Braun, Martin; Perner, Sven; Smith, Richard JH; Andorf, Janeen L; Huang, Jian; Ryckman, Kelli; Sheffield, Val C; Stone, Edwin M; Bair, Thomas; Black-Ziegelbein, E Ann; Braun, Terry A; Darbro, Benjamin; DeLuca, Adam P; Kolbe, Diana L; Scheetz, Todd E; Shearer, Aiden E; Sompallae, Rama; Wang, Kai; Bassuk, Alexander G; Edens, Erik; Mathews, Katherine; Moore, Steven A; Shchelochkov, Oleg A; Trapane, Pamela; Bossler, Aaron; Campbell, Colleen A; Heusel, Jonathan W; Kwitek, Anne; Maga, Tara; Panzer, Karin; Wassink, Thomas; Van Daele, Douglas; Azaiez, Hela; Booth, Kevin; Meyer, Nic; Segal, Michael M; Williams, Marc S; Tromp, Gerard; White, Peter; Corsmeier, Donald; Fitzgerald-Butt, Sara; Herman, Gail; Lamb-Thrush, Devon; McBride, Kim L; Newsom, David; Pierson, Christopher R; Rakowsky, Alexander T; Maver, Aleš; Lovrečić, Luca; Palandačić, Anja; Peterlin, Borut; Torkamani, Ali; Wedell, Anna; Huss, Mikael; Alexeyenko, Andrey; Lindvall, Jessica M; Magnusson, Måns; Nilsson, Daniel; Stranneheim, Henrik; Taylan, Fulya; Gilissen, Christian; Hoischen, Alexander; van Bon, Bregje; Yntema, Helger; Nelen, Marcel; Zhang, Weidong; Sager, Jason; Zhang, Lu; Blair, Kathryn; Kural, Deniz; Cariaso, Michael; Lennon, Greg G; Javed, Asif; Agrawal, Saloni; Ng, Pauline C; Sandhu, Komal S; Krishna, Shuba; Veeramachaneni, Vamsi; Isakov, Ofer; Halperin, Eran; Friedman, Eitan; Shomron, Noam; Glusman, Gustavo; Roach, Jared C; Caballero, Juan; Cox, Hannah C; Mauldin, Denise; Ament, Seth A; Rowen, Lee; Richards, Daniel R; Lucas, F Anthony San; Gonzalez-Garay, Manuel L; Caskey, C Thomas; Bai, Yu; Huang, Ying; Fang, Fang; Zhang, 
Yan; Wang, Zhengyuan; Barrera, Jorge; Garcia-Lobo, Juan M; González-Lamuño, Domingo; Llorca, Javier; Rodriguez, Maria C; Varela, Ignacio; Reese, Martin G; De La Vega, Francisco M; Kiruluta, Edward; Cargill, Michele; Hart, Reece K; Sorenson, Jon M; Lyon, Gholson J; Stevenson, David A; Bray, Bruce E; Moore, Barry M; Eilbeck, Karen; Yandell, Mark; Zhao, Hongyu; Hou, Lin; Chen, Xiaowei; Yan, Xiting; Chen, Mengjie; Li, Cong; Yang, Can; Gunel, Murat; Li, Peining; Kong, Yong; Alexander, Austin C; Albertyn, Zayed I; Boycott, Kym M; Bulman, Dennis E; Gordon, Paul MK; Innes, A Micheil; Knoppers, Bartha M; Majewski, Jacek; Marshall, Christian R; Parboosingh, Jillian S; Sawyer, Sarah L; Samuels, Mark E; Schwartzentruber, Jeremy; Kohane, Isaac; Margulies, David
Background: There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance. Results: A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. 
However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity of the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization. Conclusions: The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups.

Publication When “N of 2” is not enough: integrating statistical and functional data in gene discovery (Cold Spring Harbor Laboratory Press, 2017)
Cassa, Christopher; Akle, Sebastian; Jordan, Daniel M.; Rosenfeld, Jill A.
The expanding use of genomic sequencing promises to improve clinical diagnostics and to drive the discovery of new disease genes. Candidate genes are increasingly being identified through recurrent cases (e.g., two or more independent cases [“N of 2”] in which variants are present in the same gene). These second case hits provide statistical evidence of an association, which may then be combined with functional validation or familial segregation studies to bolster the evidence that a gene is truly causal. 
Here, we discuss how to integrate different forms of functional evidence with human genetics case and segregation data to improve the significance of new disease–gene associations.

Publication Inherited CHST11/MIR3922 deletion is associated with a novel recessive syndrome presenting with skeletal malformation and malignant lymphoproliferative disease (John Wiley & Sons, Ltd, 2015)
Chopra, Sameer; Leshchiner, Ignaty; Duzkale, Hatice; McLaughlin, Heather; Giovanni, Monica; Zhang, Chengsheng; Stitziel, Nathan; Fingeroth, Joyce; Joyce, Robin; Lebo, Matthew; Rehm, Heidi; Vuzman, Dana; Maas, Richard; Sunyaev, Shamil; Murray, Michael; Cassa, Christopher
Glycosaminoglycans (GAGs) such as chondroitin are ubiquitous disaccharide carbohydrate chains that contribute to the formation and function of proteoglycans at the cell membrane and in the extracellular matrix. Although GAG-modifying enzymes are required for diverse cellular functions, the role of these proteins in human development and disease is less well understood. Here, we describe two sisters out of seven siblings affected by congenital limb malformation and malignant lymphoproliferative disease. Using Whole-Genome Sequencing (WGS), we identified in the proband deletion of a 55 kb region within chromosome 12q23 that encompasses part of CHST11 (encoding chondroitin-4-sulfotransferase 1) and an embedded microRNA (MIR3922). The deletion was homozygous in the proband but not in each of three unaffected siblings. Genotyping data from the 1000 Genomes Project suggest that deletions inclusive of both CHST11 and MIR3922 are rare events. Given that CHST11 deficiency causes severe chondrodysplasia in mice that is similar to human limb malformation, these results underscore the importance of chondroitin modification in normal skeletal development. 
Our findings also potentially reveal an unexpected role for CHST11 and/or MIR3922 as tumor suppressors whose disruption may contribute to malignant lymphoproliferative disease.

Publication Dominance of Deleterious Alleles Controls the Response to a Population Bottleneck (Public Library of Science, 2015)
Balick, Daniel; Do, Ron; Cassa, Christopher; Reich, David; Sunyaev, Shamil
Population bottlenecks followed by re-expansions have been common throughout the history of many populations. The response of alleles under selection to such demographic perturbations has been a subject of great interest in population genetics. On the basis of theoretical analysis and computer simulations, we suggest that this response qualitatively depends on dominance. The number of dominant or additive deleterious alleles per haploid genome is expected to be slightly increased following the bottleneck and re-expansion. In contrast, the number of completely or partially recessive alleles should be sharply reduced. Changes of population size expose differences between recessive and additive selection, potentially providing insight into the prevalence of dominance in natural populations. Specifically, we use a simple statistic, \(B_R \equiv \sum_i x_i^{\mathrm{pop1}} / \sum_j x_j^{\mathrm{pop2}}\), where \(x_i\) represents the derived allele frequency, to compare the number of mutations in different populations, and detail its functional dependence on the strength of selection and the intensity of the population bottleneck. We also provide empirical evidence showing that gene sets associated with autosomal recessive disease in humans may have a \(B_R\) indicative of recessive selection. 
Together, these theoretical predictions and empirical observations show that complex demographic history may facilitate rather than impede inference of parameters of natural selection.

Publication Re-Identification of Home Addresses from Spatial Locations Anonymized by Gaussian Skew (BioMed Central, 2008)
Cassa, Christopher; Wieland, Shannon C.; Mandl, Kenneth
Background: Knowledge of the geographical locations of individuals is fundamental to the practice of spatial epidemiology. One approach to preserving the privacy of individual-level addresses in a data set is to de-identify the data using a non-deterministic blurring algorithm that shifts the geocoded values. We investigate a vulnerability in this approach which enables an adversary to re-identify individuals using multiple anonymized versions of the original data set. If several such versions are available, each can be used to incrementally refine estimates of the original geocoded location. Results: We produce multiple anonymized data sets using a single set of addresses and then progressively average the anonymized results related to each address, characterizing the steep decline in distance from the re-identified point to the original location (and the reduction in privacy). With ten anonymized copies of an original data set, we find a substantial decrease in average distance from 0.7 km to 0.2 km between the estimated, re-identified address and the original address. With fifty anonymized copies of an original data set, we find a decrease in average distance from 0.7 km to 0.1 km. Conclusion: We demonstrate that multiple versions of the same data, each anonymized by non-deterministic Gaussian skew, can be used to ascertain original geographic locations. 
We explore solutions to this problem that include infrastructure to support the safe disclosure of anonymized medical data to prevent inference or re-identification of original address data, and the use of a Markov-process based algorithm to mitigate this risk.

Publication A software tool for creating simulated outbreaks to benchmark surveillance systems (BioMed Central, 2005)
Cassa, Christopher; Iancu, Karin; Olson, Karen; Mandl, Kenneth
Background: Evaluating surveillance systems for the early detection of bioterrorism is particularly challenging when systems are designed to detect events for which there are few or no historical examples. One approach to benchmarking outbreak detection performance is to create semi-synthetic datasets containing authentic baseline patient data (noise) and injected artificial patient clusters (signal). Methods: We describe a software tool, the AEGIS Cluster Creation Tool (AEGIS-CCT), that enables users to create simulated clusters with controlled feature sets, varying the desired cluster radius, density, distance, relative location from a reference point, and temporal epidemiological growth pattern. AEGIS-CCT does not require the use of an external geographical information system program for cluster creation. The cluster creation tool is an open source program, implemented in Java, and is freely available under the GNU Lesser General Public License at its SourceForge website. Cluster data are written to files or can be appended to existing files so that the resulting file will include both existing baseline and artificially added cases. Multiple cluster file creation is an automated process in which multiple cluster files are created by varying a single parameter within a user-specified range. To evaluate the output of this software tool, sets of test clusters were created and graphically rendered. 
Results: Based on user-specified parameters describing the location, properties, and temporal pattern of simulated clusters, AEGIS-CCT created clusters accurately and uniformly. Conclusion: AEGIS-CCT enables the ready creation of datasets for benchmarking outbreak detection systems. It may be useful for automating the testing and validation of spatial and temporal cluster detection algorithms.

Publication My Sister's Keeper?: Genomic Research and the Identifiability of Siblings (BioMed Central, 2008)
Cassa, Christopher; Schmidt, Brian; Kohane, Isaac; Mandl, Kenneth
Background: Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified. Methods: We provide a framework for measuring the risk to siblings of a patient's SNP genotype disclosure, and demonstrate that sibling SNP genotypes can be inferred with substantial accuracy. Results: Extending this inference technique, we determine that a very low number of matches at commonly varying SNPs is sufficient to confirm sib-ship, demonstrating that published sequence data can reliably be used to derive sibling identities. Using HapMap trio data, at SNPs where one child is homozygotic major, with a minor allele frequency \(\geq 0.20\) \((N = 452684,\ 65.1\%)\), we achieve \(91.9\%\) inference accuracy for sibling genotypes. Conclusion: These findings demonstrate that substantial discrimination and privacy risks arise from use of inferred familial genomic data.

Publication Automated Validation Of Genetic Variants From Large Databases: Ensuring That Variant References Refer To The Same Genomic Locations (Oxford University Press, 2011)
Tong, Mark Y.; Cassa, Christopher; Kohane, Isaac
Accurate annotations of genomic variants are necessary to achieve full-genome clinical interpretations that are scientifically sound and medically relevant. 
Many disease associations, especially those reported before the completion of the HGP, are limited in applicability because of potential inconsistencies with our current standards for genomic coordinates, nomenclature and gene structure. In an effort to validate and link variants from the medical genetics literature to an unambiguous reference for each variant, we developed a software pipeline and reviewed 68 641 single amino acid mutations from Online Mendelian Inheritance in Man (OMIM), Human Gene Mutation Database (HGMD) and dbSNP. The frequency of unresolved mutation annotations varied widely among the databases, ranging from 4 to 23%. A taxonomy of primary causes for unresolved mutations was produced.