Publication:
Improving risk prediction accuracy for new soldiers in the U.S. Army by adding self-report survey data to administrative data

Date

2018

Publisher

BioMed Central

Citation

Bernecker, S. L., A. J. Rosellini, M. K. Nock, W. T. Chiu, P. M. Gutierrez, I. Hwang, T. E. Joiner, et al. 2018. “Improving risk prediction accuracy for new soldiers in the U.S. Army by adding self-report survey data to administrative data.” BMC Psychiatry 18 (1): 87. doi:10.1186/s12888-018-1656-4. http://dx.doi.org/10.1186/s12888-018-1656-4.

Abstract

Background: High rates of mental disorders, suicidality, and interpersonal violence early in the military career have raised interest in implementing preventive interventions with high-risk new enlistees. The Army Study to Assess Risk and Resilience in Servicemembers (STARRS) developed risk-targeting systems for these outcomes based on machine learning methods using administrative data predictors. However, administrative data omit many risk factors, raising the question whether risk targeting could be improved by adding self-report survey data to prediction models. If so, the Army may gain from routinely administering surveys that assess additional risk factors.

Methods: The STARRS New Soldier Survey was administered to 21,790 Regular Army soldiers who agreed to have survey data linked to administrative records. As reported previously, machine learning models using administrative data as predictors found that small proportions of high-risk soldiers accounted for high proportions of negative outcomes. Other machine learning models using self-report survey data as predictors were developed previously for three of these outcomes: major physical violence and sexual violence perpetration among men and sexual violence victimization among women. Here we examined the extent to which this survey information increases prediction accuracy, over models based solely on administrative data, for those three outcomes. We used discrete-time survival analysis to estimate a series of models predicting first occurrence, assessing how model fit improved and concentration of risk increased when adding the predicted risk score based on survey data to the predicted risk score based on administrative data.

Results: The addition of survey data improved prediction significantly for all outcomes. In the most extreme case, the percentage of reported sexual violence victimization among the 5% of female soldiers with highest predicted risk increased from 17.5% using only administrative predictors to 29.4% adding survey predictors, a 67.9% proportional increase in prediction accuracy. Other proportional increases in concentration of risk ranged from 4.8% to 49.5% (median = 26.0%).

Conclusions: Data from an ongoing New Soldier Survey could substantially improve accuracy of risk models compared to models based exclusively on administrative predictors. Depending upon the characteristics of interventions used, the increase in targeting accuracy from survey data might offset survey administration costs.

Electronic supplementary material: The online version of this article (10.1186/s12888-018-1656-4) contains supplementary material, which is available to authorized users.
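The arithmetic behind the Results figures can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `concentration_of_risk` and `proportional_increase` are hypothetical, the toy data are invented, and "concentration of risk" is read here as the share of all observed events that fall among the top 5% of individuals ranked by predicted risk, which is an assumption about the abstract's usage.

```python
def concentration_of_risk(risk_scores, outcomes, top_fraction=0.05):
    """Share of all observed events found among the top_fraction of
    individuals ranked by predicted risk (hypothetical helper)."""
    if len(risk_scores) != len(outcomes):
        raise ValueError("risk_scores and outcomes must have equal length")
    total_events = sum(outcomes)
    if total_events == 0:
        raise ValueError("no events observed; concentration is undefined")
    # Size of the high-risk stratum, at least one person.
    n_top = max(1, round(len(risk_scores) * top_fraction))
    # Rank individuals from highest to lowest predicted risk.
    ranked = sorted(zip(risk_scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_events = sum(outcome for _, outcome in ranked[:n_top])
    return top_events / total_events


def proportional_increase(before, after):
    """Relative change from `before` to `after` (hypothetical helper)."""
    return (after - before) / before


# Invented toy data: 20 individuals with risk scores 0..19 and four events.
toy_scores = list(range(20))
toy_outcomes = [1 if i in (0, 3, 10, 19) else 0 for i in range(20)]
toy_concentration = concentration_of_risk(toy_scores, toy_outcomes)

# Rounded percentages quoted in the abstract (admin only vs. admin + survey).
increase = proportional_increase(17.5, 29.4)
print(f"toy concentration of risk: {toy_concentration:.1%}")
print(f"proportional increase: {increase:.1%}")
```

Working from the rounded percentages, the proportional increase comes out at about 68.0% rather than the reported 67.9%; the small gap is presumably because the published figure was computed from unrounded values.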

Keywords

Army, Military, Predictive modeling, Risk assessment, Violence, Sexual assault

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
