Person:
Kulldorff, Martin

Last Name

Kulldorff

First Name

Martin

Name

Kulldorff, Martin

Search Results

Now showing 1 - 10 of 25
  • Publication
    Assessment of Quadrivalent Human Papillomavirus Vaccine Safety Using the Self-Controlled Tree-Temporal Scan Statistic Signal-Detection Method in the Sentinel System
    (Oxford University Press, 2018) Yih, W Katherine; Maro, Judith; Nguyen, Michael; Baker, Meghan; Balsbaugh, Carolyn; Cole, David V; Dashevsky, Inna; Mba-Jonas, Adamma; Kulldorff, Martin
    Abstract: The self-controlled tree-temporal scan statistic, a new signal-detection method, can evaluate whether any of a wide variety of health outcomes are temporally associated with receipt of a specific vaccine, while adjusting for multiple testing. Neither health outcomes nor postvaccination potential periods of increased risk need be prespecified. Using US medical claims data in the Food and Drug Administration’s Sentinel system, we employed the method to evaluate adverse events occurring after receipt of quadrivalent human papillomavirus vaccine (4vHPV). Incident outcomes recorded in emergency department or inpatient settings within 56 days after first doses of 4vHPV received by 9- through 26.9-year-olds in 2006–2014 were identified using International Classification of Diseases, Ninth Revision, diagnosis codes and analyzed by pairing the new method with a standard hierarchical classification of diagnoses. On scanning diagnoses of 1.9 million 4vHPV recipients, 2 statistically significant categories of adverse events were found: cellulitis on days 2–3 after vaccination and “other complications of surgical and medical procedures” on days 1–3 after vaccination. Cellulitis is a known adverse event. Clinically informed investigation of electronic claims records of the patients with “other complications” did not suggest any previously unknown vaccine safety problem. Considering that thousands of potential short-term adverse events and hundreds of potential risk intervals were evaluated, these findings add significantly to the growing safety record of 4vHPV.
  • Publication
    Public domain small-area cancer incidence data for New York State, 2005-2009
    (2017) Boscoe, Francis P.; Talbot, Thomas O.; Kulldorff, Martin
    There has long been a demand for cancer incidence data at a fine geographic resolution for use in etiologic hypothesis generation and testing, methodological evaluation and teaching. In this paper we describe a public domain dataset containing data for 23 anatomic sites of cancer diagnosed in New York State, USA between 2005 and 2009 at the census block group level. The dataset includes 524,503 tumours distributed across 13,823 block groups with an average population of about 1400. In addition, the data have been linked with race/ethnicity and with socioeconomic indicators such as income, educational attainment and language proficiency. We demonstrate the application of the dataset by confirming two well-established relationships: that between breast cancer and median household income and that between stomach cancer and Asian race. We foresee that this dataset will serve as the basis for a wide range of spatial analyses and as a benchmark for evaluating spatial methods in the future.
  • Publication
    Border analysis for spatial clusters
    (BioMed Central, 2018) Oliveira, Fernando L. P.; Cançado, André L. F.; de Souza, Gustavo; Moreira, Gladston J. P.; Kulldorff, Martin
    Background: The spatial scan statistic is widely used by public health professionals to detect spatial clusters in inhomogeneous point processes. The most popular version of the spatial scan statistic uses a circular scanning window. Several other variants, using other parametric or non-parametric shapes, are also available. However, none of them offers information about the uncertainty on the borders of the detected clusters. Method: We propose a new method to evaluate uncertainty on the boundaries of spatial clusters identified through the spatial scan statistic for Poisson data. For each spatial data location i, a function F(i) is calculated. While not a probability, this function takes values in the [0, 1] interval, with a higher value indicating more evidence that the location belongs to the true cluster. Results: Through a set of simulation studies, we show that the F function provides a way to define, measure and visualize the certainty or uncertainty of each specific location belonging to the true cluster. The method can be applied whether there are one or multiple detected clusters on the map. We illustrate the new method on a data set concerning Chagas disease in Minas Gerais, Brazil. Conclusions: The higher the intensity given to an area, the higher the plausibility that the area belongs to the true cluster, if one exists. In this way, the F function provides information from which the public health practitioner can perform a border analysis of the detected spatial scan statistic clusters. We have implemented and illustrated the border analysis F function in the context of the circular spatial scan statistic for spatially aggregated Poisson data. The definition is clearly independent of both the shape of the scanning window and the probability model under which the data are generated. 
To make the new method widely available to users, it has been implemented in the freely available SaTScan™ software (www.satscan.org).
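As a rough illustration of the circular Poisson spatial scan statistic that this abstract builds on, the sketch below computes the standard Poisson log-likelihood ratio for each circular window and returns the window with the highest value. It is not the paper's F function and not SaTScan code. The function names, the 50% population cap, and the input layout (parallel lists of coordinates, case counts and populations) are assumptions made for the example, and the Monte Carlo significance step is omitted.

```python
import math

def poisson_llr(c, e, total):
    """Poisson scan log-likelihood ratio for a window with c observed and
    e expected cases out of `total` cases overall (high-rate clusters only)."""
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    rest = total - c
    outside = rest * math.log(rest / (total - e)) if rest > 0 else 0.0
    return inside + outside

def best_circular_cluster(coords, cases, pops, max_pop_frac=0.5):
    """Scan circles centred on each location, grown one nearest neighbour at a
    time, and return (best LLR, member indices). Hypothetical helper."""
    total_cases, total_pop = sum(cases), sum(pops)
    best = (0.0, [])
    for xi, yi in coords:
        # Order all locations by distance from the current centre.
        order = sorted(range(len(coords)),
                       key=lambda j: math.hypot(coords[j][0] - xi, coords[j][1] - yi))
        c = p = 0
        members = []
        for j in order:
            c += cases[j]
            p += pops[j]
            members.append(j)
            if p > max_pop_frac * total_pop:  # assumed cap on window size
                break
            e = total_cases * p / total_pop   # expected cases under the null
            llr = poisson_llr(c, e, total_cases)
            if llr > best[0]:
                best = (llr, list(members))
    return best

coords = [(0, 0), (1, 0), (0, 1), (5, 5)]
llr, members = best_circular_cluster(coords, [10, 1, 1, 1], [100, 100, 100, 100])
```

In practice the significance of the best window would be assessed by repeating the scan on data sets simulated under the null hypothesis.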
  • Publication
    Statistical Power for Postlicensure Medical Product Safety Data Mining
    (Ubiquity Press, 2018) Maro, Judith; Nguyen, Michael D.; Dashevsky, Inna; Baker, Meghan; Kulldorff, Martin
    Objective: To perform sample size calculations when using tree-based scan statistics in longitudinal observational databases. Methods: Tree-based scan statistics enable data mining on epidemiologic datasets where thousands of disease outcomes are organized into hierarchical tree structures with automatic adjustment for multiple testing. We show how to evaluate the statistical power of the unconditional and conditional Poisson versions. The null hypothesis is that there is no increase in the risk for any of the outcomes. The alternative is that one or more outcomes have an excess risk. We varied the excess risk, total sample size, frequency of the underlying event rate, and the level of across-the-board health care utilization. We also quantified the reduction in statistical power resulting from specifying a risk window that was too long or too short. Results: For 500,000 exposed people, we had at least 98 percent power to detect an excess risk of 1 event per 10,000 exposed for all outcomes. In the presence of potential temporal confounding due to across-the-board elevations of health care utilization in the risk window, the conditional tree-based scan statistic controlled type I error well, while the unconditional version did not. Discussion: Data mining analyses using tree-based scan statistics expand the pharmacovigilance toolbox, ensuring adequate monitoring of thousands of outcomes of interest while controlling for multiple hypothesis testing. These power evaluations enable investigators to design and optimize implementation of retrospective data mining analyses.
  • Publication
    Maximum linkage space-time permutation scan statistics for disease outbreak detection
    (BioMed Central, 2014) Costa, Marcelo A; Kulldorff, Martin
    Background: In disease surveillance, the prospective space-time permutation scan statistic is commonly used for the early detection of disease outbreaks. The scanning window that defines potential clusters of diseases is cylindrical in shape, which does not allow incorporating into the cluster shape potential factors that can contribute to the spread of the disease, such as information about roads, landscape, among others. Furthermore, the cylinder scanning window assumes that the spatial extent of the cluster does not change in time. Alternatively, a dynamic space-time cluster may indicate the potential spread of the disease through time. For instance, the cluster may decrease over time indicating that the spread of the disease is vanishing. Methods: This paper proposes two irregularly shaped space-time permutation scan statistics. The cluster geometry is dynamically created using a graph structure. The graph can be created to include nearest-neighbor structures, geographical adjacency information or any relevant prior information regarding the contagious behavior of the event under surveillance. Results: The new methods are illustrated using influenza cases in three New England states, and compared with the cylindrical version. A simulation study is provided to investigate some properties of the proposed arbitrary cluster detection techniques. Conclusion: We have successfully developed two new space-time permutation scan statistics methods with irregular shapes and improved computational performance. The results demonstrate the potential of these methods to quickly detect disease outbreaks with irregular geometries. Future work aims at performing intensive simulation studies to evaluate the proposed methods using different scenarios, number of cases, and graph structures.
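For readers unfamiliar with the space-time permutation model, the sketch below shows how expected counts are usually obtained from case data alone: the expected number in a zone on a given day is the zone total times the day total divided by the overall total. A window is represented as an arbitrary set of (zone, day) cells, so it need not be a cylinder. The function names and the nested-list input layout are assumptions for illustration. The graph-based construction of windows proposed in the paper is not reproduced here.

```python
def expected_counts(counts):
    """counts[z][d] is the observed number of cases in zone z on day d.
    Returns mu[z][d] = (zone total * day total) / overall total."""
    total = sum(sum(row) for row in counts)
    zone_totals = [sum(row) for row in counts]
    day_totals = [sum(col) for col in zip(*counts)]
    return [[zt * dt / total for dt in day_totals] for zt in zone_totals]

def window_observed_expected(counts, cells):
    """Observed and expected cases in a window given as a set of (zone, day)
    cells; the cell set may change shape from day to day."""
    mu = expected_counts(counts)
    observed = sum(counts[z][d] for z, d in cells)
    expected = sum(mu[z][d] for z, d in cells)
    return observed, expected

counts = [[1, 1, 4],
          [1, 1, 1],
          [1, 1, 1]]
obs, exp = window_observed_expected(counts, {(0, 2)})
```

A window whose observed count clearly exceeds its expected count is a candidate cluster; its significance is normally judged by permuting the case times across locations.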
  • Publication
    Drug Adverse Event Detection in Health Plan Data Using the Gamma Poisson Shrinker and Comparison to the Tree-based Scan Statistic
    (MDPI, 2013) Brown, Jeffrey; Petronis, Kenneth R.; Bate, Andrew; Zhang, Fang; Dashevsky, Inna; Kulldorff, Martin; Avery, Taliser; Davis, Robert L.; Chan, K. Arnold; Andrade, Susan E.; Boudreau, Denise; Gunter, Margaret J.; Herrinton, Lisa; Pawloski, Pamala A.; Raebel, Marsha A.; Roblin, Douglas; Smith, David; Reynolds, Robert
    Background: Drug adverse event (AE) signal detection using the Gamma Poisson Shrinker (GPS) is commonly applied in spontaneous reporting. AE signal detection using large observational health plan databases can expand medication safety surveillance. Methods: Using data from nine health plans, we conducted a pilot study to evaluate the implementation and findings of the GPS approach for two antifungal drugs, terbinafine and itraconazole, and two diabetes drugs, pioglitazone and rosiglitazone. We evaluated 1676 diagnosis codes grouped into 183 different clinical concepts and four levels of granularity. Several signaling thresholds were assessed. GPS results were compared to findings from a companion study using the identical analytic dataset but an alternative statistical method—the tree-based scan statistic (TreeScan). Results: We identified 71 statistical signals across two signaling thresholds and two methods, including closely-related signals of overlapping diagnosis definitions. Initial review found that most signals represented known adverse drug reactions or confounding. About 31% of signals met the highest signaling threshold. Conclusions: The GPS method was successfully applied to observational health plan data in a distributed data environment as a drug safety data mining method. There was substantial concordance between the GPS and TreeScan approaches. Key method implementation decisions relate to defining exposures and outcomes and informed choice of signaling thresholds.
  • Publication
    Guillain-Barré Syndrome, Influenza Vaccination, and Antecedent Respiratory and Gastrointestinal Infections: A Case-Centered Analysis in the Vaccine Safety Datalink, 2009–2011
    (Public Library of Science, 2013) Greene, Sharon K.; Rett, Melisa D.; Vellozzi, Claudia; Li, Lingling; Kulldorff, Martin; Marcy, S. Michael; Daley, Matthew F.; Belongia, Edward A.; Baxter, Roger; Fireman, Bruce H.; Jackson, Michael L.; Omer, Saad B.; Nordin, James D.; Jin, Robert; Weintraub, Eric S.; Vijayadeva, Vinutha; Lee, Grace
    Background: Guillain-Barré Syndrome (GBS) can be triggered by gastrointestinal or respiratory infections, including influenza. During the 2009 influenza A (H1N1) pandemic in the United States, monovalent inactivated influenza vaccine (MIV) availability coincided with high rates of wildtype influenza infections. Several prior studies suggested an elevated GBS risk following MIV, but adjustment for antecedent infection was limited. Methods: We identified patients enrolled in health plans participating in the Vaccine Safety Datalink and diagnosed with GBS from July 2009 through June 2011. Medical records of GBS cases with 2009–10 MIV, 2010–11 trivalent inactivated influenza vaccine (TIV), and/or a medically-attended respiratory or gastrointestinal infection in the 1 through 141 days prior to GBS diagnosis were reviewed and classified according to Brighton Collaboration criteria for diagnostic certainty. Using a case-centered design, logistic regression models adjusted for patient-level time-varying sources of confounding, including seasonal vaccinations and infections in GBS cases and population-level controls. Results: Eighteen confirmed GBS cases received vaccination in the 6 weeks preceding onset, among 1.27 million 2009–10 MIV recipients and 2.80 million 2010–11 TIV recipients. Forty-four confirmed GBS cases had infection in the 6 weeks preceding onset, among 3.77 million patients diagnosed with medically-attended infection. The observed-versus-expected odds that 2009–10 MIV/2010–11 TIV was received in the 6 weeks preceding GBS onset was odds ratio = 1.54, 95% confidence interval (CI), 0.59–3.99; risk difference = 0.93 per million doses, 95% CI, −0.71–5.16. The association between GBS and medically-attended infection was: odds ratio = 7.73, 95% CI, 3.60–16.61; risk difference = 11.62 per million infected patients, 95% CI, 4.49–26.94. 
These findings were consistent in sensitivity analyses using alternative infection definitions and risk intervals for prior vaccination shorter than 6 weeks. Conclusions: After adjusting for antecedent infections, we found no evidence for an elevated GBS risk following 2009–10 MIV/2010–11 TIV influenza vaccines. However, the association between GBS and antecedent infection was strongly elevated.
  • Publication
    Biochemical Phenotypes to Discriminate Microbial Subpopulations and Improve Outbreak Detection
    (Public Library of Science, 2013) Galar, Alicia; Kulldorff, Martin; Rudnick, Wallis; O'Brien, Thomas; Stelling, John
    Background: Clinical microbiology laboratories worldwide constitute an invaluable resource for monitoring emerging threats and the spread of antimicrobial resistance. We studied the growing number of biochemical tests routinely performed on clinical isolates to explore their value as epidemiological markers. Methodology/Principal Findings: Microbiology laboratory results from January 2009 through December 2011 from a 793-bed hospital stored in WHONET were examined. Variables included patient location, collection date, organism, and 47 biochemical and 17 antimicrobial susceptibility test results reported by Vitek 2. To identify biochemical tests that were particularly valuable (stable with repeat testing, but good variability across the species) or problematic (inconsistent results with repeat testing), three types of variance analyses were performed on isolates of K. pneumoniae: descriptive analysis of discordant biochemical results in same-day isolates, an average within-patient variance index, and generalized linear mixed model variance component analysis. Results: 4,200 isolates of K. pneumoniae were identified from 2,485 patients, 32% of whom had multiple isolates. The first two variance analyses highlighted SUCT, TyrA, GlyA, and GGT as “nuisance” biochemicals for which discordant within-patient test results impacted a high proportion of patient results, while dTAG had relatively good within-patient stability with good heterogeneity across the species. Variance component analyses confirmed the relative stability of dTAG, and identified additional biochemicals such as PHOS with a large between-patient to within-patient variance ratio. A reduced subset of biochemicals improved the robustness of strain definition for carbapenem-resistant K. pneumoniae. Surveillance analyses suggest that the reduced biochemical profile could improve the timeliness and specificity of outbreak detection algorithms. 
Conclusions: The statistical approaches explored can improve the robust recognition of microbial subpopulations with routinely available biochemical test results, of value in the timely detection of outbreak clones and evolutionarily important genetic events.
  • Publication
    Multiple Source Spatial Cluster Detection Through Multi-criteria Analysis
    (University of Illinois at Chicago Library, 2013) Duczmal, Luiz H.; Almeida, Alexandre C. L.; da Silva, Fabio R.; Kulldorff, Martin
    Objective: To incorporate information from multiple data streams of disease surveillance to achieve more coherent spatial cluster detection using statistical tools from multi-criteria analysis. Introduction: Multiple data sources are essential to provide reliable information regarding the emergence of potential health threats, compared to single source methods [1,2]. Spatial Scan Statistics have been adapted to analyze multivariate data sources [1]. In this context, only ad hoc procedures have been devised to address the problem of selecting the most likely cluster and computing its significance. A multi-objective scan was proposed to detect clusters for a single data source [3]. Methods: For simplicity, consider only two data streams. The j-th objective function evaluates the strength of candidate clusters using only information from the j-th data stream. The best cluster solutions are found by maximizing two objective functions simultaneously, based on the concept of dominance: a point is called dominated if it is worse than another point in at least one objective, while not being better than that point in any other objective [4]. The nondominated set consists of all solutions which are not dominated by any other solution. To evaluate the statistical significance of solutions, a statistical approach based on the concept of attainment function is used [4]. Results: The two datasets are standardized brain cancer mortality rates for male and female adults for each of the 3111 counties in the 48 contiguous states of the US, from 1986 to 1995 [5]. We run the circular scan and plot the (m(Zi),w(Zi)) points in the Cartesian plane, where m(Zi) and w(Zi) are the LLR for the zone Zi in the men’s and women’s brain cancer map, respectively, and i, i=1,...,N(r) is the set of all circular zones up to a radius r>0. 
The non-dominated set is inspected to observe possible correlations between the two maps regarding brain cancer clustering (Figure 1); e.g., the upper inset map has a high LLR value on the women’s map, but not on the men’s; the inverse happens for the lower inset map. Other nondominated clusters in the middle have lower LLR values on both datasets. The first two examples have comparatively lower p-values (they belong to the two “knees” in the nondominated set), as computed using the attainment surfaces (not shown in the figure). Conclusions: The multi-criteria multivariate approach has several advantages: (i) the representation of the evaluation function for each datastream is very clear, and does not suffer from an artificial, and possibly confusing, mixture with the other datastream evaluations; (ii) it is possible to attribute, in a rigorous way, the statistical significance of each candidate cluster; (iii) it is possible to analyze and pick out the best cluster solutions, as given naturally by the non-dominated set. Figure 1: Part of the solution set in the LLR(male) X LLR(female) space of the male/female brain cancer datasets for the US counties map. Clusters are indicated by blue points, with the non-dominated solutions represented by small red circles. The inset maps depict the geographic location of the clusters found in the US counties map (yellow circles) for two sample non-dominated solutions.
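The dominance rule described in the Methods can be sketched in a few lines. The example below treats each candidate cluster as a pair of LLR values (one per data stream, both to be maximized) and keeps the pairs that no other pair dominates. The function names and the sample values are illustrative assumptions. The attainment-function significance step is not shown.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def nondominated(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (LLR on men's map, LLR on women's map) pairs for five zones.
candidates = [(1, 5), (3, 3), (5, 1), (2, 2), (3, 1)]
front = nondominated(candidates)
```

Here (2, 2) is dominated by (3, 3) and (3, 1) is dominated by (5, 1), so only the first three pairs remain in the non-dominated set.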
  • Publication
    Laboratory-Based Prospective Surveillance for Community Outbreaks of Shigella spp. in Argentina
    (Public Library of Science, 2013) Viñas, María R.; Tuduri, Ezequiel; Galar, Alicia; Yih, Katherine; Pichel, Mariana; Stelling, John; Brengi, Silvina P.; Della Gaspera, Anabella; van der Ploeg, Claudia; Bruno, Susana; Rogé, Ariel; Caffer, María I.; Kulldorff, Martin; Galas, Marcelo
    Background: To implement effective control measures, timely outbreak detection is essential. Shigella is the most common cause of bacterial diarrhea in Argentina. Highly resistant clones of Shigella have emerged, and outbreaks have been recognized in closed settings and in whole communities. We hereby report our experience with an evolving, integrated, laboratory-based, near real-time surveillance system operating in six contiguous provinces of Argentina during April 2009 to March 2012. Methodology: To detect localized shigellosis outbreaks in a timely manner, we used the prospective space-time permutation scan statistic algorithm of SaTScan, embedded in WHONET software. Twenty-three laboratories sent updated Shigella data on a weekly basis to the National Reference Laboratory. Cluster detection analysis was performed at several taxonomic levels: for all Shigella spp., for serotypes within species and for antimicrobial resistance phenotypes within species. Shigella isolates associated with statistically significant signals (clusters in time/space with recurrence interval ≥365 days) were subtyped by pulsed field gel electrophoresis (PFGE) using PulseNet protocols. Principal Findings: In three years of active surveillance, our system detected 32 statistically significant events, 26 of them identified before hospital staff were aware of any unexpected increase in the number of Shigella isolates. Twenty-six signals were investigated by PFGE, which confirmed a close relationship among the isolates for 22 events (84.6%). Seven events were investigated epidemiologically, which revealed links among the patients. Seventeen events were found at the resistance profile level. The system detected events of public health importance: infrequent resistance profiles, long-lasting and/or re-emergent clusters and events important for their duration or size, which were reported to local public health authorities. 
Conclusions/Significance: The WHONET-SaTScan system may serve as a model for surveillance and can be applied to other pathogens, implemented by other networks, and scaled up to national and international levels for early detection and control of outbreaks.