Date of Completion


Document Type

Honors College Thesis



Thesis Type

Honors College

First Advisor

Indra Neil Sarkar


Keywords

computing methodologies, vitamin deficiency diseases, vegetarian diet


Vitamins are nutrients that are essential to human health, and vitamin deficiencies have been shown to cause severe diseases. In this study, a computational approach was used to identify vitamin deficiency diseases and the vitamin content of plant-based foods. Data from the United States Department of Agriculture Standard Reference (SR27), the National Library of Medicine's Medical Subject Headings and MEDLINE, and Wikipedia were combined for this purpose. A total of 41,584 vitamin-disease associations were identified from MEDLINE-indexed articles and from Wikipedia entries. Analysis of SR27 identified 1,912 foods that contained at least one vitamin, with an average of 1,276 foods per vitamin. Vitamins B12 and D were present in the fewest foods (n=135 and n=70, respectively). The results of this study establish the foundation for developing a process to link vitamin deficiency diseases to vitamin-rich foods.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.

full_association_scores.txt (3446 kB)
vitA.txt (117 kB)
vitA_with_vitamers.txt (445 kB)
vitB1.txt (183 kB)
vitB2.txt (191 kB)
vitB3.txt (183 kB)
vitB5.txt (181 kB)
vitB6.txt (189 kB)
vitB9.txt (173 kB)
vitB12.txt (14 kB)
vitB12_fortification_desc.txt (16 kB)
vitC.txt (178 kB)
vitD.txt (7 kB)
vitE.txt (133 kB)
vitE_noncomparable_vitamers.txt (226 kB)
vitK.txt (111 kB)
vitK_noncomparable_vitamers.txt (116 kB)