Date of Completion
2023
Document Type
Honors College Thesis
Department
Statistics
Thesis Type
Honors College
First Advisor
Jean Gabriel Young
Second Advisor
Bernard Cole
Keywords
Bayesian, logistic regression, normal prior, double exponential prior, regularized horseshoe prior, FIDDLE, C. diff.
Abstract
Healthcare-associated Clostridioides difficile (C. diff.) infections are among the most common healthcare-associated infections in the U.S., leading to thousands of deaths per year. Machine learning algorithms have shown some ability to predict which patients are most vulnerable to C. diff. infection using electronic health records obtained soon after admission, but their predictive capability remains insufficient. We extracted data from the electronic medical records in the MIMIC-III Clinical Database, which contains data from the Beth Israel Deaconess Medical Center collected between 2001 and 2012, resulting in very large predictor matrices. We aimed to predict which patients would receive a positive test for C. diff. using a Bayesian logistic regression model. We examined three priors (normal, double exponential, and regularized horseshoe) to understand how prior choice influenced predictive capability and the size of the coefficients. We used cross-validation to test the predictive capability of each prior and compared the resulting models using ROC and PR curves. Our results show that, of the three priors, the regularized horseshoe prior achieves the highest prediction accuracy.
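As a rough illustration of how the three priors named in the abstract treat a single regression coefficient, the sketch below writes out the normal and double exponential log densities and the effective coefficient scale implied by the regularized horseshoe in its standard formulation. It is not taken from the thesis: the function names and all hyperparameter values (`scale`, `tau`, `c`, `lam`) are hypothetical choices made for illustration only.

```python
import numpy as np

def log_normal_prior(beta, scale=1.0):
    """Log density of a zero-mean normal prior with standard deviation `scale`."""
    return -0.5 * np.log(2 * np.pi * scale**2) - beta**2 / (2 * scale**2)

def log_double_exponential_prior(beta, scale=1.0):
    """Log density of a zero-centered double exponential (Laplace) prior."""
    return -np.log(2 * scale) - np.abs(beta) / scale

def regularized_horseshoe_scale(lam, tau, c):
    """Effective prior standard deviation of one coefficient under the
    regularized horseshoe, given a local scale `lam`, a global scale `tau`,
    and a slab scale `c` (all values here are illustrative assumptions)."""
    lam_tilde_sq = (c**2 * lam**2) / (c**2 + tau**2 * lam**2)
    return tau * np.sqrt(lam_tilde_sq)

# A large coefficient is penalized far less by the double exponential prior
# than by the normal prior with the same scale parameter.
big_beta = 5.0
normal_lp = log_normal_prior(big_beta)
laplace_lp = log_double_exponential_prior(big_beta)

# Under the regularized horseshoe, a small local scale keeps the coefficient
# near tau * lam (strong shrinkage), while a very large local scale is capped
# near the slab scale c.
shrunk = regularized_horseshoe_scale(lam=1.0, tau=0.1, c=2.0)
capped = regularized_horseshoe_scale(lam=1e6, tau=0.1, c=2.0)
```

In a full model the local scales would themselves carry a heavy-tailed prior (half-Cauchy in the usual formulation), which lets most coefficients shrink toward zero while a few escape up to the slab scale.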
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.
Recommended Citation
Blanchard, Trevor D., "Risk Analysis of Clostridiodides Difficile Infections in a Hospital Setting and the Impact of Prior Choice on Predictive Capability" (2023). UVM Patrick Leahy Honors College Senior Theses. 530.
https://scholarworks.uvm.edu/hcoltheses/530