Large Scale Data Integration of Biomedical Literature
Conference Year
January 2021
Abstract
The ever-growing mountain of scientific publications on SARS-CoV-2 and COVID-19 must be continually translated into clinical and public health practice to better understand and stop this pandemic. Accomplishing this translation at the scale of hundreds of thousands of articles requires information integration systems such as the COVID-19 publications database built on the RefBin platform. The RefBin COVID-19 database integrates findings from a diverse set of incoming articles using free-form narrative descriptions crafted by and for the users of the database. In this study, we investigate whether RefBin’s method of integrating data enables users to answer COVID-19-related questions more efficiently and accurately than traditional research methods. Study participants are given a test of standardized COVID-19-related questions and are randomly assigned to answer them using either the COVID-19 publications database or a traditional research method of their choosing. The time required to answer each question and the accuracy of each response are recorded. We expect that participants using RefBin will answer questions correctly more often and will need less time to find correct answers. Such findings would indicate greater research efficiency with RefBin and would provide evidence in favor of using free-form narrative descriptions to better integrate expansive and diverse data sources.
Primary Faculty Mentor Name
David Krag
Faculty/Staff Collaborators
Shania Lunna, Sarah Robtoy, Rachel Bombardier
Status
Undergraduate
Student College
College of Arts and Sciences
Program/Major
Biological Science
Primary Research Category
Health Sciences