Publication:
Statistical Power for Postlicensure Medical Product Safety Data Mining

Date

2018

Publisher

Ubiquity Press

Citation

Maro, Judith C., Michael D. Nguyen, Inna Dashevsky, Meghan A. Baker, and Martin Kulldorff. 2018. “Statistical Power for Postlicensure Medical Product Safety Data Mining.” eGEMs 5 (1): 6. doi:10.5334/egems.225. http://dx.doi.org/10.5334/egems.225.

Abstract

Objective: To perform sample size calculations when using tree-based scan statistics in longitudinal observational databases.

Methods: Tree-based scan statistics enable data mining on epidemiologic datasets where thousands of disease outcomes are organized into hierarchical tree structures with automatic adjustment for multiple testing. We show how to evaluate the statistical power of the unconditional and conditional Poisson versions. The null hypothesis is that there is no increase in the risk for any of the outcomes. The alternative is that one or more outcomes have an excess risk. We varied the excess risk, total sample size, frequency of the underlying event rate, and the level of across-the-board health care utilization. We also quantified the reduction in statistical power resulting from specifying a risk window that was too long or too short.

Results: For 500,000 exposed people, we had at least 98 percent power to detect an excess risk of 1 event per 10,000 exposed for all outcomes. In the presence of potential temporal confounding due to across-the-board elevations of health care utilization in the risk window, the conditional tree-based scan statistic controlled type I error well, while the unconditional version did not.

Discussion: Data mining analyses using tree-based scan statistics expand the pharmacovigilance toolbox, ensuring adequate monitoring of thousands of outcomes of interest while controlling for multiple hypothesis testing. These power evaluations enable investigators to design and optimize implementation of retrospective data mining analyses.
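
To make the idea of a power evaluation concrete, the following is a minimal sketch of a Monte Carlo power estimate for an unconditional Poisson scan statistic on a toy tree. It is not the authors' code or the procedure used in the article. The tree shape, the baseline expected count of 20 events per leaf, and all function names (`poisson_draw`, `llr`, `max_llr`, `estimate_power`) are hypothetical choices made for illustration. Only the 500,000 exposed people and the excess risk of 1 event per 10,000 exposed come from the abstract. The conditional version, which conditions on the total number of events, is not shown.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def llr(observed, expected):
    """Unconditional Poisson log-likelihood ratio for one node (one-sided: excess only)."""
    if observed <= expected:
        return 0.0
    return observed * math.log(observed / expected) - (observed - expected)


def max_llr(leaf_counts, leaf_expected, nodes):
    """Scan statistic: the largest node-level LLR over every node in the tree."""
    best = 0.0
    for leaves in nodes.values():
        c = sum(leaf_counts[leaf] for leaf in leaves)
        n = sum(leaf_expected[leaf] for leaf in leaves)
        best = max(best, llr(c, n))
    return best


def estimate_power(leaf_expected, excess, nodes, n_sims=2000, alpha=0.05, seed=0):
    """Return (estimated power, Monte Carlo critical value) for a given excess per leaf."""
    rng = random.Random(seed)

    def simulate(means):
        counts = {leaf: poisson_draw(rng, m) for leaf, m in means.items()}
        return max_llr(counts, leaf_expected, nodes)

    # Null distribution of the maximum LLR gives the critical value at level alpha.
    null_stats = sorted(simulate(leaf_expected) for _ in range(n_sims))
    critical = null_stats[min(n_sims - 1, round((1 - alpha) * n_sims))]

    # Alternative: add the excess events to the affected leaves and count rejections.
    alt_means = {leaf: m + excess.get(leaf, 0.0) for leaf, m in leaf_expected.items()}
    hits = sum(simulate(alt_means) > critical for _ in range(n_sims))
    return hits / n_sims, critical


# Hypothetical toy tree: four leaf outcomes, two parent groups, one root.
LEAF_EXPECTED = {"A1": 20.0, "A2": 20.0, "B1": 20.0, "B2": 20.0}  # assumed baselines
NODES = {
    "A1": ["A1"], "A2": ["A2"], "B1": ["B1"], "B2": ["B2"],
    "A": ["A1", "A2"], "B": ["B1", "B2"],
    "root": ["A1", "A2", "B1", "B2"],
}

exposed = 500_000
excess_events = exposed * 1 / 10_000  # 1 excess event per 10,000 exposed
power, critical = estimate_power(LEAF_EXPECTED, {"A1": excess_events}, NODES)
print(f"critical value: {critical:.2f}, estimated power: {power:.3f}")
```

Because the critical value is taken from the null distribution of the maximum over all nodes, the adjustment for multiple testing is built into the test. Changing the assumed baseline counts, the excess, or the number of exposed people shows how estimated power responds to each, in the spirit of the factors varied in the abstract.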

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
