
Threat Detection on Twitter Using Corpus Linguistics

Beach, Addie
Abstract
As social media increases in popularity, so does its potential to support culturally meaningful tools. One of the most promising is categorization software, which analyzes the linguistic data in social media posts to make predictions. It does so with the help of corpus linguistics, a form of analysis designed to pick out the most frequent and/or significant words in a dataset. This study focuses on software intended to detect threats. While this technology has the potential to flag abusive language used by groups or individuals, the text search strategies it currently uses often produce a high number of false positives, making it too unreliable for effective use. The software is most effective at marking whether a specific word is present in a tweet, not at determining whether that word is actually being used in a threatening way (e.g., "I'm planning on killing him" vs. "this silence is killing me"). Discourse analysis, which looks at the role context plays in language, could minimize these errors by helping researchers refine the software so that it more closely matches how people actually use language. The goal of this project, then, is to investigate ways of combining corpus linguistics and discourse analysis with a Twitter database to improve predictive analysis.
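The false-positive problem described above can be illustrated with a minimal sketch of keyword-presence flagging. This is not the software examined in the study; the term list `THREAT_TERMS` and the helpers `tokenize` and `flag_by_keyword` are hypothetical names introduced only for illustration, and the two tweets are the examples quoted in the abstract.

```python
import re

# Hypothetical watch list (an assumption; the study's actual term list is not given).
THREAT_TERMS = {"kill", "killing"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def flag_by_keyword(tweet, terms=THREAT_TERMS):
    """Flag a tweet if any watch-list term is present, ignoring context."""
    return any(token in terms for token in tokenize(tweet))

tweets = [
    "I'm planning on killing him",   # threatening use
    "this silence is killing me",    # figurative use
]

# Presence-only matching cannot tell the two uses apart.
flags = [flag_by_keyword(t) for t in tweets]
for tweet, flagged in zip(tweets, flags):
    print(f"{flagged!s:5}  {tweet}")
```

Both tweets are flagged because the check looks only at whether a listed word occurs, which is the limitation that contextual (discourse-level) information is meant to address.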
Date
2019-01-01
Student Status
Undergraduate
Type of presentation
Poster Presentation
Program/Major
Linguistics
College/School
College of Arts and Sciences
Research Category
Social Sciences