Background: Unhealthy drinking is prevalent in the United States and can lead to serious health and social consequences, yet it is under-diagnosed and under-treated. Identifying unhealthy drinkers can be time-consuming for primary care providers. An automated tool for identification would allow attention to be focused on patients most likely to need care and therefore increase efficiency and effectiveness.
Objectives: To build a clinical prediction tool for unhealthy drinking based solely on routinely collected demographic and laboratory data.
Methods: We obtained demographic and laboratory data on 89,325 adults seen at the University of Vermont Medical Center from 2011 to 2017. Logistic regression, support vector machines (SVM), k-nearest neighbors, and random forests were each used to build clinical prediction models. The model with the largest area under the receiver operating characteristic curve (AUC) was selected.
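As a minimal sketch of the selection rule described in Methods, the following computes an AUC for each candidate model from its predicted scores and keeps the model with the largest value. The labels, scores, and model names are hypothetical placeholders rather than study data, and the rank-based AUC shown is one standard formulation, not necessarily the authors' implementation.

```python
def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_best_model(labels, scores_by_model):
    """Return the name of the model with the largest AUC, plus all AUCs."""
    aucs = {name: auc(labels, s) for name, s in scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical held-out labels (1 = unhealthy drinking) and model scores.
labels = [1, 0, 1, 0, 0, 1]
scores_by_model = {
    "logistic_regression": [0.8, 0.4, 0.3, 0.5, 0.2, 0.7],
    "svm_poly3": [0.9, 0.3, 0.6, 0.4, 0.1, 0.8],
}

best, aucs = select_best_model(labels, scores_by_model)
print(best, aucs)
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a sort-based implementation would be preferable on a cohort of this size.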
Results: An SVM with a degree-3 polynomial kernel produced the largest AUC. The most influential predictors were alkaline phosphatase, gender, glucose, and serum bicarbonate. The optimum operating point had sensitivity 31.1%, specificity 91.2%, positive predictive value 50.4%, and negative predictive value 82.1%. Application of the tool increased the prevalence of unhealthy drinking from 18.3% to 32.4%, while reducing the target population by 22%.
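For readers less familiar with the four operating-point statistics reported in Results, the sketch below shows how each is derived from a confusion matrix. The counts are arbitrary illustrative values, not the study's data, and the function name is hypothetical.

```python
def operating_point_metrics(tp, fp, fn, tn):
    """Compute sensitivity, specificity, PPV, and NPV from confusion-matrix
    counts at a single classification threshold."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all actual positives
        "specificity": tn / (tn + fp),  # true negatives among all actual negatives
        "ppv": tp / (tp + fp),          # true positives among all flagged patients
        "npv": tn / (tn + fn),          # true negatives among all unflagged patients
    }


# Arbitrary illustrative counts (NOT taken from the study).
metrics = operating_point_metrics(tp=30, fp=20, fn=70, tn=380)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity are properties of the classifier at a given threshold, whereas PPV and NPV also depend on how common unhealthy drinking is in the population being screened.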
Limitations: Universal screening was not in use during the data collection period. The prevalence of unhealthy drinking among those screened was 60%, suggesting that the AUDIT-C was administered to confirm suspected unhealthy drinking rather than to screen for it.
Conclusion: An automated tool, using commonly available data, can identify a subset of patients who appear to warrant clinical attention for unhealthy drinking.
Bonnell, Levi N.; Littenberg, Benjamin; Wshah, Safwan R.; and Rose, Gail L., "Automated Identification of Unhealthy Drinking Using Routinely Collected Data: A Machine Learning Approach" (2019). Larner College of Medicine Faculty Publications. 9.