Person:

Agniel, Denis

Last Name

Agniel

First Name

Denis

Name

Agniel, Denis

Search Results

  • Publication

    Statistical Methods for Multivariate and Complex Phenotypes

    (2014-10-21) Agniel, Denis; Cai, Tianxi; Lin, Xihong; Haneuse, Sebastien

    Many important scientific questions cannot be studied properly using a single measurement as a response. For example, many phenotypes of interest in recent clinical research may be difficult to characterize due to their inherent complexity. It may be difficult to determine the presence or absence of disease based on a single measurement, or even a few measurements, or the phenotype may only be defined based on a series of symptoms. Similarly, a set of related phenotypes or measurements may be studied together in order to detect a shared etiology. In this work, we propose methods for studying complex phenotypes of these types, where the phenotype may be characterized either longitudinally or by a diverse set of continuous, discrete, or not fully observed components. In chapter 1, we seek to identify predictors that are related to multiple components of diverse outcomes. We take up specifically the question of identifying a multiple regulator, where we seek a genetic marker that is associated with multiple biomarkers for autoimmune disease. To do this, we propose sparse multiple regulation testing (SMRT) both to estimate the relationship between a set of predictors and diverse outcomes and to provide a testing framework in which to identify which predictors are associated with multiple elements of the outcomes, while controlling error rates. In chapter 2, we seek to identify risk profiles or risk scores for diverse outcomes, where a risk profile is a linear combination of predictors. The risk profiles will be chosen to be highly correlated with latent traits underlying the outcomes. To do this, we propose semiparametric canonical correlation analysis (sCCA), an updated version of the classical canonical correlation analysis. In chapter 3, the scientific question of interest pertains directly to the progression of disease over time. We provide a testing framework in which to detect the association between a set of genetic markers and the progression of disease in the context of a GWAS. To test for this association while allowing for highly nonlinear longitudinal progression of disease, we propose functional principal variance component (FPVC) testing.

  • Publication

    Methods to Develop an Electronic Medical Record Phenotype Algorithm to Compare the Risk of Coronary Artery Disease across 3 Chronic Disease Cohorts

    (Public Library of Science, 2015) Liao, Katherine; Ananthakrishnan, Ashwin; Kumar, Vishesh; Xia, Zongqi; Cagan, Andrew; Gainer, Vivian S.; Goryachev, Sergey; Chen, Pei; Savova, Guergana; Agniel, Denis; Churchill, Susanne; Lee, Jaeyoung; Murphy, Shawn; Plenge, Robert M.; Szolovits, Peter; Kohane, Isaac; Shaw, Stanley; Karlson, Elizabeth; Cai, Tianxi

    Background: Typically, algorithms to classify phenotypes using electronic medical record (EMR) data were developed to perform well in a specific patient population. There is increasing interest in analyses which can allow study of a specific outcome across different diseases. Such a study in the EMR would require an algorithm that can be applied across different patient populations. Our objectives were: (1) to develop an algorithm that would enable the study of coronary artery disease (CAD) across diverse patient populations; (2) to study the impact of adding narrative data extracted using natural language processing (NLP) in the algorithm. Additionally, we demonstrate how to implement the CAD algorithm to compare risk across 3 chronic diseases in a preliminary study. Methods and Results: We studied 3 established EMR-based patient cohorts: diabetes mellitus (DM, n = 65,099), inflammatory bowel disease (IBD, n = 10,974), and rheumatoid arthritis (RA, n = 4,453) from two large academic centers. We developed a CAD algorithm using NLP in addition to structured data (e.g. ICD9 codes) in the RA cohort and validated it in the DM and IBD cohorts. The CAD algorithm using NLP in addition to structured data achieved specificity >95% with a positive predictive value (PPV) of 90% in the training (RA) and validation sets (IBD and DM). The addition of NLP data improved the sensitivity for all cohorts, classifying an additional 17% of CAD subjects in IBD and 10% in DM while maintaining a PPV of 90%. The algorithm classified 16,488 DM (26.1%), 457 IBD (4.2%), and 245 RA (5.0%) with CAD. In a cross-sectional analysis, CAD risk was 63% lower in RA and 68% lower in IBD compared to DM (p<0.0001) after adjusting for traditional cardiovascular risk factors. Conclusions: We developed and validated a CAD algorithm that performed well across diverse patient populations. The addition of NLP into the CAD algorithm improved the sensitivity of the algorithm, particularly in cohorts where the prevalence of CAD was low. Preliminary data suggest that CAD risk was significantly lower in RA and IBD compared to DM.