Using Twitter for Demographic Research

Datalab Faculty

Emma Spiro

Hedy Lee

Tyler McCormick

Nina Cesare

Project Description

The digital traces we leave on Twitter are a fruitful source of data for social science research. However, users do not directly report key demographic characteristics, such as age, race and gender, that are critical to social scientists. To address this challenge, this project develops systematic and scalable methods for extracting demographic information from Twitter users’ profiles and uses that information to answer sociologically driven questions.

One current application of these methods considers whether associative networks on Twitter are as segregated as acquaintanceship networks offline. Building on past work on the roles that social structure and agency play in shaping the racial composition of individuals’ networks, we argue that Twitter blurs the distinction between these forces, because users actively create and are influenced by their own “structure.” This may result in networks that are more or less diverse than those observed offline.

Another current application examines the role of race in the microdynamics between citizens and the police. This project uses unsolicited, user-generated Twitter content to characterize citizens’ attitudes toward law enforcement and examines how these opinions vary with the geographic, social (i.e., the influence of social contacts) and demographic characteristics of the individuals involved.