|dc.description.abstract||The human genome's strongest influences on two common diseases, systemic lupus erythematosus (SLE) and schizophrenia, arise from genetic variation in the Human Leukocyte Antigen (HLA) locus. However, the genes and functional alleles driving these genetic relationships have remained unknown. We hypothesized that a complex, multi-allelic form of structural variation in the Complement component 4 (C4) gene, within the HLA locus, underlies these relationships.
Loci that exist in many structural forms and vary widely in copy number have been difficult to analyze molecularly. As a result, we know little about their population genetic properties or their influence on phenotypes. In this work, we developed molecular and statistical methods to characterize such loci and to evaluate their contribution to phenotypes.
Applying these methods to the C4 locus, we found that C4 segregates in four common and at least eleven low-frequency structural forms in human populations. Although there was only partial correlation between C4 structural variation and individual single nucleotide polymorphisms (SNPs), we developed an imputation approach to enable statistical prediction of C4 structural states from flanking SNP haplotypes.
C4 structural variation was associated with gene expression in lymphoblastoid cell lines and human brain tissue. Applying our imputation strategy to SLE and schizophrenia case-control cohorts totaling more than 75,000 individuals, we found that structural variation in C4 contributes to risk of both phenotypes in a manner predicted by its effect on gene expression in relevant tissues, and with largely opposite directions of effect: alleles that were protective for schizophrenia increased risk for SLE, and vice versa. Leveraging a natural allelic series of C4 structural forms, we developed a novel form of association testing and showed that the association with C4 is unlikely to be caused by correlation with HLA SNPs. C4 was expressed in human neurons, whereas other upstream complement pathway genes were expressed primarily by microglia. Mice lacking C4 showed a deficit in synaptic pruning that was rescued by human C4.
The methods developed in this thesis enable analysis of complex structural variation, and our results identify a novel form of genome variation that contributes strongly to phenotypes.||en_US