Large Scale Data Integration of Biomedical Literature

Conference Year

January 2021

Abstract

The ever-growing mountain of scientific publications on SARS-CoV-2 and COVID-19 must be continually translated into clinical and public health practice to better understand and stop this pandemic. Accomplishing this translation at the scale of hundreds of thousands of articles requires information integration systems such as the COVID-19 publications database built on the RefBin platform. The RefBin COVID-19 database integrates findings from a diverse set of incoming articles using freeform narrative descriptions written by and for the users of the database. In this study, we investigate whether RefBin’s method of integrating data enables users to answer COVID-19-related questions more efficiently and accurately than traditional research methods. Study participants are given a test of standardized COVID-19-related questions and are randomly assigned to answer them using either the COVID-19 publications database or a traditional research method of their choosing. The time required to answer each question and the accuracy of each response are recorded. We expect that participants using RefBin will answer questions correctly more often and will need less time to find correct answers. Such findings would indicate greater research efficiency with RefBin and provide evidence in favor of using freeform narrative descriptions to better integrate expansive and diverse data sources.

Primary Faculty Mentor Name

David Krag

Faculty/Staff Collaborators

Shania Lunna, Sarah Robtoy, Rachel Bombardier

Status

Undergraduate

Student College

College of Arts and Sciences

Program/Major

Biological Science

Primary Research Category

Health Sciences

Abstract only.
