Person: Jung, Jae-Yoon
Last Name: Jung
First Name: Jae-Yoon
Name: Jung, Jae-Yoon
Publication
Genetic Networks of Complex Disorders: from a Novel Search Engine for PubMed Article Database (American Medical Informatics Association, 2013)
Jung, Jae-Yoon; Wall, Dennis Paul
Finding genetic risk factors of complex disorders may involve reviewing hundreds of genes or thousands of research articles iteratively, but few tools have been available to facilitate this procedure. In this work, we built a novel publication search engine that can identify target-disorder specific, genetics-oriented research articles and extract the genes with significant results. Preliminary test results showed that the output of this engine has better coverage in terms of genes or publications than other existing applications. We consider it an essential tool for understanding genetic networks of complex disorders.

Publication
A literature search tool for intelligent extraction of disease-associated genes (BMJ Publishing Group, 2014)
Jung, Jae-Yoon; DeLuca, Todd F; Nelson, Tristan H; Wall, Dennis P
Objective: To extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. Methods: We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-mining algorithm with keyword matching to extract target disorders, genes with significant results, and the type of study described by the article. Results: We compared our resulting candidate disorder genes and supporting references with existing databases. We demonstrated that our candidate gene set covers nearly all genes in manually curated databases, and that the references supporting the disorder–gene link are more extensive and accurate than other general purpose gene-to-disorder association databases. Conclusions: We implemented a novel publication search tool to find target articles, specifically focused on links between disorders and genotypes.
Through comparison against gold-standard manually updated gene–disorder databases and comparison with automated databases of similar functionality, we show that our tool can search through the entirety of PubMed to extract the main gene findings for human diseases rapidly and accurately.

Publication
Cloud Computing for Comparative Genomics with Windows Azure Platform (Libertas Academica, 2012)
Kim, Insik; Jung, Jae-Yoon; DeLuca, Todd; Nelson, Tristan; Wall, Dennis Paul
Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services.

Publication
Multi-Locus Genome-Wide Association Analysis Supports the Role of Glutamatergic Synaptic Transmission in the Etiology of Major Depressive Disorder (Nature Publishing Group, 2012)
Lee, Phil; Perlis, Roy H.; Jung, Jae-Yoon; Byrne, Enda M.; Haddad, Stephen; Rueckert, Erroll; Siburian, Richie; Mayerfeld, Catherine E.; Heath, Andrew C.; Pergadia, Michele L.; Madden, Pamela A.F.; Boomsma, Dorret I.; Penninx, Brenda W.; Sklar, Pamela; Martin, Nicholas G.; Purcell, Shaun M.; Smoller, Jordan; Wray, Naomi R.
Major depressive disorder (MDD) is a common psychiatric illness characterized by low mood and loss of interest in pleasurable activities. Despite years of effort, recent genome-wide association studies (GWAS) have identified few susceptibility variants or genes that are robustly associated with MDD. Standard single-SNP (single nucleotide polymorphism)-based GWAS analysis typically has limited power to deal with the extensive heterogeneity and substantial polygenic contribution of individually weak genetic effects underlying the pathogenesis of MDD.
Here, we report an alternative, gene-set-based association analysis of MDD in an effort to identify groups of biologically related genetic variants that are involved in the same molecular function or cellular processes and exhibit a significant level of aggregated association with MDD. In particular, we used a text-mining-based data analysis to prioritize candidate gene sets implicated in MDD and conducted a multi-locus association analysis to look for enriched signals of nominally associated MDD susceptibility loci within each of the gene sets. Our primary analysis is based on the meta-analysis of three large MDD GWAS data sets (total N=4346 cases and 4430 controls). After correction for multiple testing, we found that genes involved in glutamatergic synaptic neurotransmission were significantly associated with MDD (set-based association \(P = 6.9 \times 10^{-4}\)). This result is consistent with previous studies that support a role of the glutamatergic system in synaptic plasticity and MDD and support the potential utility of targeting glutamatergic neurotransmission in the treatment of MDD.

Publication
Autworks: a Cross-Disease Network Biology Application for Autism and Related Disorders (BioMed Central, 2012)
Nelson, Tristan; Jung, Jae-Yoon; DeLuca, Todd; Hinebaugh, Byron Kent; St Gabriel, Kristian Che; Wall, Dennis Paul
Background: The genetic etiology of autism is heterogeneous. Multiple disorders share genotypic and phenotypic traits with autism. Network based cross-disorder analysis can aid in the understanding and characterization of the molecular pathology of autism, but there are few tools that enable us to conduct cross-disorder analysis and to visualize the results. Description: We have designed Autworks as a web portal to bring together gene interaction and gene-disease association data on autism to enable network construction, visualization, and network comparisons with numerous other related neurological conditions and disorders.
Users may examine the structure of gene interactions within a set of disorder-associated genes, compare networks of disorder/disease genes with those of other disorders/diseases, and upload their own sets for comparative analysis. Conclusions: Autworks is a web application that provides an easy-to-use resource for researchers of varied backgrounds to analyze the autism gene network structure within and between disorders.

Publication
Phylogenetically informed logic relationships improve detection of biological network organization (BioMed Central, 2011)
Cui, Jike; DeLuca, Todd; Jung, Jae-Yoon; Wall, Dennis Paul
Background: A "phylogenetic profile" refers to the presence or absence of a gene across a set of organisms, and it has been proven valuable for understanding gene functional relationships and network organization. Despite this success, few studies have attempted to search beyond just pairwise relationships among genes. Here we search for logic relationships involving three genes, and explore their potential application in gene network analyses. Results: Taking advantage of a phylogenetic matrix constructed from the large orthologs database Roundup, we invented a method to create balanced profiles for individual triplets of genes that guarantee equal weight on the different phylogenetic scenarios of coevolution between genes. When we applied this idea to LAPP, the method to search for logic triplets of genes, the balanced profiles resulted in significant performance improvement and the discovery of hundreds of thousands more putative triplets than unadjusted profiles. We found that logic triplets detected biological network organization and identified key proteins and their functions, ranging from neighbouring proteins in local pathways, to well separated proteins in the whole pathway, and to the interactions among different pathways at the system level.
Finally, our case study suggested that the directionality in a logic relationship and the profile of a triplet could disclose the connectivity between the triplet and surrounding networks. Conclusion: Balanced profiles are superior to the raw profiles employed by traditional methods of phylogenetic profiling in searching for high order gene sets. Gene triplets can provide valuable information in detection of biological network organization and identification of key genes at different levels of cellular interaction.

Publication
Use of Artificial Intelligence to Shorten the Behavioral Diagnosis of Autism (Public Library of Science, 2012)
Wall, Dennis Paul; Dally, Rebecca; Luyster, Rhiannon; Jung, Jae-Yoon; DeLuca, Todd
The Autism Diagnostic Interview-Revised (ADI-R) is one of the most commonly used instruments for assisting in the behavioral diagnosis of autism. The exam consists of 93 questions that must be answered by a care provider within a focused session that often spans 2.5 hours. We used machine learning techniques to study the complete sets of answers to the ADI-R available at the Autism Genetic Research Exchange (AGRE) for 891 individuals diagnosed with autism and 75 individuals who did not meet the criteria for an autism diagnosis. Our analysis showed that 7 of the 93 items contained in the ADI-R were sufficient to classify autism with 99.9% statistical accuracy. We further tested the accuracy of this 7-question classifier against complete sets of answers from two independent sources, a collection of 1654 individuals with autism from the Simons Foundation and a collection of 322 individuals with autism from the Boston Autism Consortium. In both cases, our classifier performed with nearly 100% statistical accuracy, properly categorizing all but one of the individuals from these two resources who previously had been diagnosed with autism through the standard ADI-R.
Our ability to measure specificity was limited by the small numbers of non-spectrum cases in the research data used; however, both real and simulated data demonstrated a range in specificity from 99% to 93.8%. With incidence rates rising, the capacity to diagnose autism quickly and effectively requires careful design of behavioral assessment methods. Ours is an initial attempt to retrospectively analyze large data repositories to derive an accurate, but significantly abbreviated approach that may be used for rapid detection and clinical prioritization of individuals likely to have an autism spectrum disorder. Such a tool could assist in streamlining the clinical diagnostic process overall, leading to faster screening and earlier treatment of individuals with autism.

Publication
Genotator: A Disease-Agnostic Tool for Genetic Annotation of Disease (BioMed Central, 2010)
Wall, Dennis Paul; Pivovarov, Rimma; Tong, Mark; Jung, Jae-Yoon; Fusaro, Vincent Alfred; DeLuca, Todd; Tonellato, Peter
Background: Disease-specific genetic information has been increasing at rapid rates as a consequence of recent improvements and massive cost reductions in sequencing technologies. Numerous systems designed to capture and organize this mounting sea of genetic data have emerged, but these resources differ dramatically in their disease coverage and genetic depth. With few exceptions, researchers must manually search a variety of sites to assemble a complete set of genetic evidence for a particular disease of interest, a process that is both time-consuming and error-prone. Methods: We designed a real-time aggregation tool that provides both comprehensive coverage and reliable gene-to-disease rankings for any disease. Our tool, called Genotator, automatically integrates data from 11 externally accessible clinical genetics resources and uses these data in a straightforward formula to rank genes in order of disease relevance.
We tested the accuracy of coverage of Genotator in three separate diseases for which there exist specialty curated databases: Autism Spectrum Disorder, Parkinson's Disease, and Alzheimer Disease. Genotator is freely available at http://genotator.hms.harvard.edu. Results: Genotator demonstrated that most of the 11 selected databases contain unique information about the genetic composition of disease, with 2514 genes found in only one of the 11 databases. These findings confirm that the integration of these databases provides a more complete picture than would be possible from any one database alone. Genotator successfully identified at least 75% of the top ranked genes for all three of our use cases, including a 90% concordance with the top 40 ranked candidates for Alzheimer Disease. Conclusions: As a meta-query engine, Genotator provides high coverage of both historical genetic research as well as recent advances in the genetic understanding of specific diseases. As such, Genotator provides a real-time aggregation of ranked data that remains current with the pace of research in the disease fields. Genotator's algorithm appropriately transforms query terms to match the input requirements of each targeted database and accurately resolves named synonyms to ensure full coverage of the genetic results with official nomenclature. Genotator generates an Excel-style output that is consistent across disease queries and readily importable to other applications.

Publication
Roundup 2.0: Enabling Comparative Genomics for over 1800 Genomes (Oxford University Press, 2012)
DeLuca, Todd; Cui, Jike; Jung, Jae-Yoon; St. Gabriel, Kristian Che; Wall, Dennis Paul
Summary: Roundup is an online database of gene orthologs for over 1800 genomes, including 226 Eukaryota, 1447 Bacteria, 113 Archaea, and 21 Viruses. Orthologs are inferred using the Reciprocal Smallest Distance algorithm. Users may query Roundup for single-linkage clusters of orthologous genes based on any group of genomes.
Annotated query results may be viewed in a variety of ways including as clusters of orthologs and as phylogenetic profiles. Genomic results may be downloaded in formats suitable for functional as well as phylogenetic analysis, including the recent OrthoXML standard. In addition, gene IDs can be retrieved using FASTA sequence search. All orthology results and source code are freely available.