Publication:
The Drug Data to Knowledge Pipeline: Large-Scale Claims Data Classification for Pharmacologic Insight

Date

2016

Publisher

American Medical Informatics Association

Citation

Homer, Mark L., Nathan P. Palmer, Olivier Bodenreider, Aurel Cami, Laura Chadwick, and Kenneth D. Mandl. 2016. “The Drug Data to Knowledge Pipeline: Large-Scale Claims Data Classification for Pharmacologic Insight.” AMIA Summits on Translational Science Proceedings 2016 (1): 105-111.

Abstract

In biomedical informatics, assigning drug codes to categories is a common step in the analysis pipeline. Unfortunately, incomplete mappings are the norm rather than the exception, and coverage values below 85% are not uncommon. Here, we perform this linking task on a nationwide insurance claims database with over 13 million members who were dispensed, according to National Drug Codes (NDCs), over 50,000 unique product forms of medication. The chosen approach employs Cerner Multum’s VantageRx and the U.S. National Library of Medicine’s RxMix. As a result, 94.0% of the NDCs were successfully mapped to categories used by common drug terminologies, e.g., Anatomical Therapeutic Chemical (ATC). Implemented as an SQL database and scripts, the approach is generic and can be set up for a new data set in a few hours. Thus, the method is a viable option for large-scale drug classification.
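
The coverage figure quoted in the abstract is the share of distinct dispensed NDCs that received at least one category. The sketch below illustrates how such a figure could be computed in an SQL database driven from a script. It is not the authors' implementation: the table names `dispensed` and `ndc_category`, the helper functions, and the toy rows are all hypothetical, and the code does not call VantageRx or RxMix.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with hypothetical tables and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- One row per dispensing event (hypothetical schema).
        CREATE TABLE dispensed (ndc TEXT NOT NULL);
        -- One row per NDC-to-category link (hypothetical schema).
        CREATE TABLE ndc_category (ndc TEXT NOT NULL, category TEXT NOT NULL);
        """
    )
    # Placeholder codes, not real NDCs; NDC-1 is dispensed twice.
    conn.executemany(
        "INSERT INTO dispensed VALUES (?)",
        [("NDC-1",), ("NDC-1",), ("NDC-2",), ("NDC-3",), ("NDC-4",)],
    )
    # NDC-1 maps to two categories; NDC-4 has no mapping at all.
    conn.executemany(
        "INSERT INTO ndc_category VALUES (?, ?)",
        [
            ("NDC-1", "CLASS_A"),
            ("NDC-1", "CLASS_B"),
            ("NDC-2", "CLASS_A"),
            ("NDC-3", "CLASS_C"),
        ],
    )
    return conn


def mapping_coverage(conn):
    """Return the fraction of distinct dispensed NDCs with at least one category."""
    total, mapped = conn.execute(
        """
        SELECT COUNT(DISTINCT d.ndc),  -- every dispensed code
               COUNT(DISTINCT m.ndc)   -- NULLs from unmatched rows are not counted
        FROM dispensed AS d
        LEFT JOIN ndc_category AS m ON m.ndc = d.ndc
        """
    ).fetchone()
    return mapped / total if total else 0.0


if __name__ == "__main__":
    print(f"coverage: {mapping_coverage(build_toy_db()):.1%}")
```

Counting distinct codes rather than rows keeps repeated dispensings and multiple categories per code from inflating either side of the ratio; in the toy data, three of the four distinct codes are mapped.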

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
