Word Embeddings to quantify Depressive Language in Twitter
Sandhya,
Abstract
How do people discuss mental health on social media? Can we train an algorithm to recognize differences between discussions of depression and other topics? Can an algorithm predict that someone is depressed from their tweets alone? In this project, we collect tweets referencing “depression” over a seven-year period and train word embeddings to characterize linguistic structures within the corpus. We find that neural word embeddings capture the contextual differences between “depressed” and “healthy” language. The best-performing model for the prediction task is a Long Short-Term Memory (LSTM) network, which reaches 70% test accuracy. Finally, we train a similar model on a much smaller collection of tweets authored by individuals formally diagnosed with depression. The results suggest social media could serve as a valuable screening tool for mental health.
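The idea of comparing how a word is used in two collections of tweets can be illustrated with a minimal sketch. It is not the project's method: it swaps the neural word embeddings for simple count-based context vectors so that it runs with the Python standard library alone. The function names, the window size, and the four toy sentences are all assumptions made up for illustration and are not data from the study.

```python
from collections import Counter
from math import sqrt


def context_vector(tweets, target, window=2):
    """Count the words that appear within `window` tokens of `target`."""
    counts = Counter()
    for tweet in tweets:
        tokens = tweet.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Add every neighbour in the window except the target itself.
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical toy sentences, invented for this sketch only.
corpus_a = ["i feel so tired and alone tonight", "i feel empty and tired again"]
corpus_b = ["i feel great after the run today", "i feel happy and rested today"]

vec_a = context_vector(corpus_a, "feel")
vec_b = context_vector(corpus_b, "feel")

similarity = cosine(vec_a, vec_b)       # same word, different corpora
self_similarity = cosine(vec_a, vec_a)  # baseline: a vector against itself
```

A similarity noticeably below the self-similarity baseline indicates that the same word keeps different company in the two collections, which is the kind of contextual difference the abstract attributes to the trained embeddings.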
Date
2019-01-01
Student Status
Graduate
Type of presentation
Poster Presentation
Advisor(s)
Department
Program/Major
Complex Systems
Computer Science
College/School
College of Engineering and Mathematical Sciences
Research Category
Health Sciences
Social Sciences
