Publication: Fine-mapping complex traits in large-scale biobanks across diverse populations
Date
2022-05-12
Citation
Kanai, Masahiro. 2022. Fine-mapping complex traits in large-scale biobanks across diverse populations. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
Abstract
Identifying causal variants for complex traits is a major goal of human genetics research. Despite the great success of genome-wide association studies (GWAS) in locus discovery, individual causal variants in associated loci remain largely unresolved, limiting the biological inference possible from follow-up experimentation. In this dissertation, I present our fine-mapping analyses of complex traits in large-scale biobanks across diverse populations to create an atlas of causal variants.
We first fine-mapped complex traits using 361,194 European individuals from UK Biobank (UKBB) and gene expression using 49 tissues from GTEx (Chapter 1). We then extended our fine-mapping of complex traits to multiple populations, using 178,726 Japanese individuals from BioBank Japan and 271,341 Finnish individuals from FinnGen (Chapter 2). In total, we identified 4,518 variant-trait pairs with high posterior probability (> 0.9) of causality across the three populations. Aggregating data across populations enabled replication of 285 high-confidence variant-trait pairs as well as identification of 1,492 unique fine-mapped coding variants and 176 genes in which multiple independent coding variants influence the same trait. These results demonstrate that fine-mapping in diverse populations enables novel insights into the biology of complex traits by pinpointing high-confidence causal variants for further characterization.
Next, we investigated fine-mapping accuracy in GWAS meta-analysis (Chapter 3). We demonstrated that meta-analysis fine-mapping is substantially miscalibrated in simulations and proposed a novel quality-control method, SLALOM, that identifies suspicious loci for meta-analysis fine-mapping. Having validated SLALOM performance in simulations, we found widespread suspicious patterns in existing GWAS-significant loci that call into question fine-mapping accuracy. We thus urge extreme caution when interpreting fine-mapping results from meta-analysis.
Finally, we introduce a new polygenic risk score (PRS) method, PolyPred, that improves cross-population polygenic prediction by combining a new fine-mapping-based predictor and a published BOLT-LMM predictor (Chapter 4). Leveraging estimated causal effects from fine-mapping enabled higher PRS transferability in non-European populations, achieving up to a 32% improvement in prediction accuracy vs. BOLT-LMM in UKBB individuals of African ancestry.
Altogether, this work demonstrates key advances in fine-mapping complex traits across diverse populations and provides insights into further variant characterization as well as improved polygenic prediction based on fine-mapping.
Keywords
biobank, fine-mapping, genome-wide association study, polygenic risk score, Bioinformatics, Genetics
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service