Background: Unhealthy drinking is prevalent in the United States and can lead to serious health and social consequences, yet it is under-diagnosed and under-treated. Identifying unhealthy drinkers can be time-consuming for primary care providers. An automated tool for identification would allow attention to be focused on patients most likely to need care and therefore increase efficiency and effectiveness.

Objectives: To build a clinical prediction tool for unhealthy drinking based solely on routinely collected demographic and laboratory data.

Methods: We obtained demographic and laboratory data on 89,325 adults seen at the University of Vermont Medical Center from 2011 to 2017. Logistic regression, support vector machines (SVM), k-nearest neighbors, and random forests were each used to build clinical prediction models. The model with the largest area under the receiver operating characteristic curve (AUC) was selected.
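
The selection rule described in the Methods can be sketched as follows. This is a minimal illustration, not the study's code: it assumes each candidate model has already produced a risk score per patient on a held-out set, and it computes AUC through the rank-based identity (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half). The function names, model names, labels, and scores are all hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


def select_by_auc(labels, scores_by_model):
    """Return the name of the model with the largest AUC, plus all AUCs."""
    aucs = {name: auc(labels, s) for name, s in scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Made-up toy data: 1 = unhealthy drinking, 0 = not.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores_by_model = {
    "model_a": [0.9, 0.2, 0.8, 0.3, 0.1, 0.7, 0.4, 0.2],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.2, 0.8, 0.3, 0.1],
}
best, aucs = select_by_auc(labels, scores_by_model)
print(best, aucs)
```

The pairwise loop is quadratic in the number of patients, so a data set of the size reported here would call for a sort-based implementation or a library routine; the loop is kept only because it makes the definition of AUC explicit.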

Results: An SVM with a polynomial kernel of degree 3 produced the largest AUC. The most influential predictors were alkaline phosphatase, gender, glucose, and serum bicarbonate. The optimum operating point had sensitivity 31.1%, specificity 91.2%, positive predictive value 50.4%, and negative predictive value 82.1%. Applying the tool increased the prevalence of unhealthy drinking in the target population from 18.3% to 32.4%, while reducing the size of that population by 22%.
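
For readers less familiar with the four operating-point measures reported in the Results, the sketch below shows how each is derived from a confusion matrix at a fixed decision threshold. The function name is hypothetical and the counts are invented for illustration; they are not the study's data and are not meant to reproduce the percentages above.

```python
def operating_point_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # flagged share of true unhealthy drinkers
        "specificity": tn / (tn + fp),  # unflagged share of everyone else
        "ppv": tp / (tp + fp),          # share of flagged patients who are true cases
        "npv": tn / (tn + fn),          # share of unflagged patients who are not cases
    }


# Invented counts: 100 true cases and 400 non-cases.
metrics = operating_point_metrics(tp=30, fp=20, tn=380, fn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the tool itself, whereas the predictive values also depend on how common unhealthy drinking is in the population being evaluated.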

Limitations: Universal screening was not in use during the period when the data were collected. The prevalence of unhealthy drinking among those screened was 60%, suggesting that the AUDIT-C was administered to confirm rather than to screen for unhealthy drinking.

Conclusion: An automated tool, using commonly available data, can identify a subset of patients who appear to warrant clinical attention for unhealthy drinking.