Date of Award

2014

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Xindong Wu

Second Advisor

Joshua Bongard

Third Advisor

Alan Rubin

Abstract

The emergence of social media has changed the way people think, communicate, behave, learn, and conduct research. In recent years, a large number of studies have analyzed and modeled this social phenomenon. Driven by commercial and social interests, social media has become an attractive subject for researchers. Accordingly, new models, algorithms, and applications that address specific domains and solve distinct problems have emerged. In this thesis, we propose a novel network model and a path mining algorithm called HashnetMiner to discover implicit knowledge that is not easily exposed using other network models. When applied to a drug reaction context, our experiments with HashnetMiner produced anecdotal evidence of drug-drug interactions.

The proposed research comprises three parts built upon the common theme of utilizing hashtags in tweets.

1 Digital Recruitment on Twitter. We build an expert system shell for two different studies: (1) a nicotine patch study where the system reads streams of tweets in real time and decides whether to recruit the senders to participate in the study, and (2) an environmental health study where the system identifies individuals who can participate in a survey using Twitter.

2 Does Social Media Big Data Make the World Smaller? This work provides an exploratory analysis of large-scale keyword-hashtag (K-H) networks generated from Twitter. We use two measures: (1) the number of vertices that connect any two keywords, and (2) the eccentricity of keyword vertices, a well-known centrality measure based on shortest paths. Our analysis shows that K-H networks conform to the shrinking-world phenomenon and expose hidden paths among concepts.

3 We pose the following biomedical web science question: Can patterns identified in Twitter hashtags provide clinicians with a powerful tool to extrapolate new medical therapies and/or drugs? We present a systematic network mining method, HashnetMiner, that operates on networks of medical concepts and hashtags. To the best of our knowledge, this is the first effort to present Biomedical Web Science models and algorithms that address such a question by means of data mining and knowledge discovery using hashtag-based networks.
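The two K-H network measures named in part 2 can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the bipartite graph construction, and the toy tweets are all assumptions made for illustration, and eccentricity is computed only within the connected component of the source vertex.

```python
from collections import deque

def build_kh_graph(tweets):
    """Build an undirected keyword-hashtag graph (assumed construction).

    tweets: iterable of (keywords, hashtags) pairs. A keyword and a hashtag
    are linked when they occur in the same tweet.
    """
    graph = {}
    for keywords, hashtags in tweets:
        for k in keywords:
            for h in hashtags:
                graph.setdefault(("K", k), set()).add(("H", h))
                graph.setdefault(("H", h), set()).add(("K", k))
    return graph

def bfs_distances(graph, source):
    """Shortest-path length (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def eccentricity(graph, source):
    """Greatest shortest-path distance from source within its component."""
    return max(bfs_distances(graph, source).values())

def intermediate_vertices(graph, a, b):
    """Number of vertices strictly between a and b on a shortest path.

    Returns None when b is unreachable from a.
    """
    d = bfs_distances(graph, a).get(b)
    return None if d is None else max(d - 1, 0)

# Toy, made-up tweets: (keywords, hashtags). Not data from the study.
tweets = [
    ({"coffee"}, {"#morning"}),
    ({"tea"}, {"#morning", "#relax"}),
    ({"sleep"}, {"#relax"}),
]
graph = build_kh_graph(tweets)

print(intermediate_vertices(graph, ("K", "coffee"), ("K", "sleep")))
print(eccentricity(graph, ("K", "tea")))
```

In this toy graph the keywords "coffee" and "sleep" never share a tweet, yet they are joined by a path through two hashtags and one other keyword, which is the kind of hidden path among concepts the analysis refers to.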

Language

English

Number of Pages

95