Person:
Yung, Godwin

Last Name

Yung

First Name

Godwin

Search Results

  • Publication
    Novel structural co-expression analysis linking the NPM1-associated ribosomal biogenesis network to chronic myelogenous leukemia
    (Nature Publishing Group, 2015) Chan, Lawrence WC; Lin, Xihong; Yung, Godwin; Lui, Thomas; Chiu, Ya Ming; Wang, Fengfeng; Tsui, Nancy BY; Cho, William CS; Yip, SP; Siu, Parco M.; Wong, SC Cesar; Yung, Benjamin YM
    Co-expression analysis reveals patterns of dysregulated gene cooperativeness that are useful for understanding cancer biology and identifying new targets for treatment. We developed a structural strategy to identify co-expressed gene networks that are important for chronic myelogenous leukemia (CML). This strategy compared the distributions of expression correlations between the CML and normal states, and it identified a data-driven threshold to classify strongly co-expressed networks that had the best coherence with CML. Using this strategy, we found a transcriptome-wide reduction of co-expression connectivity in CML, reflecting potentially loosened molecular regulation. Conversely, when we focused on networks associated with nucleophosmin 1 (NPM1), NPM1 established more co-expression linkages with BCR-ABL pathways and ribosomal protein networks in CML than in the normal state. This finding suggests a new role for NPM1 in conveying tumorigenic signals from the BCR-ABL oncoprotein to ribosome biogenesis, affecting cellular growth. Transcription factors may be regulators of the differential co-expression patterns between CML and normal.
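The comparison of co-expression connectivity between two states can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names, the toy expression values, and the fixed threshold of 0.7 are all assumptions made for illustration, whereas the abstract describes a data-driven threshold. The sketch only shows how edges can be counted per state once a correlation threshold is chosen.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def coexpression_edges(expr, threshold):
    """Return gene pairs whose absolute correlation reaches the threshold.

    expr maps a gene name to its expression values across samples.
    """
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


def connectivity(edges, gene):
    """Number of co-expression edges that involve the given gene."""
    return sum(1 for edge in edges if gene in edge)


# Hypothetical toy data: three genes measured in four samples per state.
normal = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, 3, 2, 4]}
disease = {"A": [1, 2, 3, 4], "B": [4, 1, 3, 2], "C": [1, 3, 2, 4]}

THRESHOLD = 0.7  # assumed fixed cut-off, not the data-driven one
normal_edges = coexpression_edges(normal, THRESHOLD)
disease_edges = coexpression_edges(disease, THRESHOLD)

print(len(normal_edges), len(disease_edges))
print(connectivity(normal_edges, "A"), connectivity(disease_edges, "A"))
```

In this toy example the disease state has fewer edges overall, and gene A loses one of its links; a per-gene count like this is the kind of quantity one would inspect when focusing on a single gene's network.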
  • Publication
    Statistical methods for analyzing genetic sequencing association studies
    (2016-05-16) Yung, Godwin; Lin, Xihong; Kraft, Peter; Tchetgen Tchetgen, Eric
    Case-control genetic sequencing studies are increasingly being conducted to identify rare variants associated with complex diseases. These studies often collect a variety of secondary traits, that is, quantitative and qualitative traits other than the case-control disease status. Reusing these data to study the association between rare variants and secondary phenotypes provides an attractive and cost-effective approach that can lead to the discovery of new genetic associations. In Chapter 1, we carry out an extensive investigation of the validity of ad hoc methods, which are simple, computationally efficient methods frequently applied in practice to study the association between secondary phenotypes and single common genetic variants. Though other researchers have investigated the same problem, we make two key contributions to the existing literature. First, we show that in taking an ad hoc approach, it may be desirable to adjust for covariates that affect the primary disease in the secondary phenotype model, even though these covariates are not necessarily associated with the secondary phenotype in the population. Second, we show that when the disease is rare, ad hoc methods can lead to severely biased estimation and inference if the true disease model follows a non-logistic model such as the probit model. Spurious associations can be avoided by including interaction terms in the fitted regression model. Our results are justified theoretically and via simulations, and illustrated by a genome-wide association study of smoking using a lung cancer case-control study. In Chapter 2, we consider the problem of testing associations between secondary phenotypes and sets of rare genetic variants. We show that popular region-based methods such as the burden test and the sequence kernel association test (SKAT) can only be applied under the same conditions as those applicable to ad hoc methods (Chapter 1).
    For a more robust alternative, we propose an inverse-probability-weighted version of the optimal SKAT (SKAT-O) to account for unequal sampling of cases and controls. As an extension of SKAT-O, our approach is data-adaptive and includes the weighted burden test and weighted SKAT as special cases. In addition to weighting individuals to account for the biased sampling, we can also consider weighting the variants in SKAT-O. Decreasing the weight of non-causal variants and increasing the weight of causal variants can improve power. However, since researchers do not know which variants are actually causal, it is common practice to weight genetic variants as a function of their minor allele frequencies. This is motivated by the belief that rarer variants are more likely to have larger effects. In Chapter 3, we propose a new unsupervised statistical framework for predicting the functional status of genetic variants. Compared to existing methods, the proposed algorithm integrates a diverse set of annotations (which are partitioned beforehand into multiple groups by the user) and predicts the functional status for each group, taking into account within- and between-group correlations. We demonstrate the advantages of the algorithm through application to real annotation data and conclude with future directions.
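The two kinds of weighting mentioned in the abstract, weighting individuals for unequal case-control sampling and weighting variants by minor allele frequency (MAF), can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: the Beta(1, 25) density is a choice commonly used in the SKAT literature and is assumed here, the disease prevalence is assumed known, and all function names and toy data are hypothetical. The sketch stops at the weights and a weighted burden score and does not implement the SKAT-O test itself.

```python
from math import gamma


def beta_density(p, a, b):
    """Density of the Beta(a, b) distribution evaluated at p."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * p ** (a - 1) * (1 - p) ** (b - 1)


def variant_weights(mafs, a=1.0, b=25.0):
    """MAF-based variant weights; rarer variants receive larger weights.

    Beta(1, 25) is an assumed default, not taken from the abstract.
    """
    return [beta_density(m, a, b) for m in mafs]


def ip_weights(is_case, prevalence):
    """Inverse probability weights for a case-control sample.

    Each subject is weighted by the population share of their disease
    group divided by the sample share, so the weighted sample resembles
    the population. The prevalence is assumed to be known.
    """
    n = len(is_case)
    n_cases = sum(is_case)
    n_controls = n - n_cases
    w_case = prevalence / (n_cases / n)
    w_control = (1 - prevalence) / (n_controls / n)
    return [w_case if d else w_control for d in is_case]


def burden_scores(genotypes, weights):
    """Weighted count of minor alleles for each individual."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


# Hypothetical toy data: four subjects (two cases, two controls), two variants.
mafs = [0.001, 0.05]
genotypes = [[1, 0], [0, 1], [0, 0], [0, 2]]
is_case = [1, 1, 0, 0]

w_variants = variant_weights(mafs)
w_subjects = ip_weights(is_case, prevalence=0.1)
scores = burden_scores(genotypes, w_variants)

print(w_variants)
print(w_subjects)
print(scores)
```

With a balanced sample and a prevalence of 0.1, controls are up-weighted and cases down-weighted, and the weights sum to the sample size. The rarer variant receives the larger weight, which mirrors the belief stated in the abstract that rarer variants are more likely to have larger effects.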