Person:
Miller, Timothy

Loading...
Profile Picture

Email Address

AA Acceptance Date

Birth Date

Research Projects

Organizational Units

Job Title

Last Name

Miller

First Name

Timothy

Name

Miller, Timothy

Search Results

Now showing 1 - 3 of 3
  • Thumbnail Image
    Publication
    Automatic Prediction of Rheumatoid Arthritis Disease Activity from the Electronic Medical Records
    (Public Library of Science, 2013) Lin, Chen; Karlson, Elizabeth; Canhao, Helena; Miller, Timothy; Dligach, Dmitriy; Chen, Pei Jun; Perez, Raul Natanael Guzman; Shen, Yuanyan; Weinblatt, Michael; Shadick, Nancy; Plenge, Robert M.; Savova, Guergana
    Objective: We aimed to mine the data in the Electronic Medical Record to automatically discover patients' Rheumatoid Arthritis disease activity at discrete rheumatology clinic visits. We cast the problem as a document classification task where the feature space includes concepts from the clinical narrative and lab values as stored in the Electronic Medical Record. Materials and Methods The Training Set consisted of 2792 clinical notes and associated lab values. Test Set 1 included 1749 clinical notes and associated lab values. Test Set 2 included 344 clinical notes for which there were no associated lab values. The Apache clinical Text Analysis and Knowledge Extraction System was used to analyze the text and transform it into informative features to be combined with relevant lab values. Results: Experiments over a range of machine learning algorithms and features were conducted. The best performing combination was linear kernel Support Vector Machines with Unified Medical Language System Concept Unique Identifier features with feature selection and lab values. The Area Under the Receiver Operating Characteristic Curve (AUC) is 0.831 (σ = 0.0317), statistically significant as compared to two baselines (AUC = 0.758, σ = 0.0291). Algorithms demonstrated superior performance on cases clinically defined as extreme categories of disease activity (Remission and High) compared to those defined as intermediate categories (Moderate and Low) and included laboratory data on inflammatory markers. Conclusion: Automatic Rheumatoid Arthritis disease activity discovery from Electronic Medical Record data is a learnable task approximating human performance. As a result, this approach might have several research applications, such as the identification of patients for genome-wide pharmacogenetic studies that require large sample sizes with precise definitions of disease activity and response to therapies.
  • Thumbnail Image
    Publication
    Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing
    (Public Library of Science, 2014) Wu, Stephen; Miller, Timothy; Masanz, James; Coarr, Matt; Halgrim, Scott; Carrell, David; Clark, Cheryl
    A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been “solved.” This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP.
  • Thumbnail Image
    Publication
    ClinicalTrials.gov as a Data Source for Semi-Automated Point-Of-Care Trial Eligibility Screening
    (Public Library of Science, 2014) Pfiffner, Pascal B; Oh, JiWon; Miller, Timothy; Mandl, Kenneth
    Background: Implementing semi-automated processes to efficiently match patients to clinical trials at the point of care requires both detailed patient data and authoritative information about open studies. Objective: To evaluate the utility of the ClinicalTrials.gov registry as a data source for semi-automated trial eligibility screening. Methods: Eligibility criteria and metadata for 437 trials open for recruitment in four different clinical domains were identified in ClinicalTrials.gov. Trials were evaluated for up to date recruitment status and eligibility criteria were evaluated for obstacles to automated interpretation. Finally, phone or email outreach to coordinators at a subset of the trials was made to assess the accuracy of contact details and recruitment status. Results: 24% (104 of 437) of trials declaring on open recruitment status list a study completion date in the past, indicating out of date records. Substantial barriers to automated eligibility interpretation in free form text are present in 81% to up to 94% of all trials. We were unable to contact coordinators at 31% (45 of 146) of the trials in the subset, either by phone or by email. Only 53% (74 of 146) would confirm that they were still recruiting patients. Conclusion: Because ClinicalTrials.gov has entries on most US and many international trials, the registry could be repurposed as a comprehensive trial matching data source. Semi-automated point of care recruitment would be facilitated by matching the registry's eligibility criteria against clinical data from electronic health records. But the current entries fall short. Ultimately, improved techniques in natural language processing will facilitate semi-automated complex matching. As immediate next steps, we recommend augmenting ClinicalTrials.gov data entry forms to capture key eligibility criteria in a simple, structured format.