Characterizing the dynamics of cultural phenomena with Tweets

Conference Year

January 2019

Abstract

Quantifying public discourse is a topic of perennial interest in the social sciences. With the advent of social media, researchers can now collect real-time, high-volume conversational data from a diverse group of individuals. The global micro-blogging service Twitter has become a prominent conduit for both information dissemination and interpersonal conversation. In this work, we use a corpus of 1-grams (single words) parsed from approximately 10% of all tweets authored between 2009 and 2019 to characterize common motifs in collective attention at varying time scales, and to catalog similarities and divergences in the spectral properties of time series derived from the observed frequencies of words and phrases (N-grams). The corpus enables automated event detection and extraction of "shapelets" (characteristic shapes common to many N-gram time series). We demonstrate the dynamics of social media conversations surrounding major hurricanes and US presidents.

Primary Faculty Mentor Name

Peter Dodds

Secondary Mentor Name

Chris Danforth

Status

Graduate

Student College

College of Engineering and Mathematical Sciences

Program/Major

Complex Systems

Primary Research Category

Social Sciences

Secondary Research Category

Engineering & Physical Sciences

Abstract only.
