Computational Social Science

Social interaction is increasingly mediated by Internet-enabled information and communication technologies. These systems archive digital traces of activity, producing rich data that promise insight into individual and group behavior. DataLab researchers collect and analyze these data to transform our understanding of individuals, organizations, and societies.

Current Projects

Detecting Misinformation Flows in Social Media Spaces During Crisis Events

This research seeks both to understand the patterns and mechanisms of the diffusion of misinformation on social media and to develop algorithms to automatically detect misinformation as events unfold. During natural disasters and other hazard events, individuals increasingly utilize social media to disseminate, search for and curate event-related information. There is great potential for this information to be used by affected communities and emergency responders to enhance situational awareness and improve decision-making, facilitating response activities and potentially saving lives. Yet several challenges remain; one is the generation and propagation of misinformation. Taking a novel and transformative approach, this project aims to utilize the collective intelligence of the crowd – the crowdwork of some social media users who challenge and correct questionable information – to distinguish misinformation and aid in its detection.

Role of Gender in Scholarly Authorship

Gender disparities are decreasing overall in academia. However, this data-driven study finds that the story is more complicated. Examining more than 8 million academic papers, we find that, in certain fields, women are underrepresented as authors of single-authored papers and that men are overrepresented in the first and last author positions. You can explore the data for hundreds of fields of science using an interactive visualization.

Communication Dynamics During Disasters

Informal exchange of information occurs continually throughout daily life. These pre-existing communication patterns are vital during non-routine circumstances such as emergencies and disasters. In recent years, informal communication channels have been transformed by the widespread adoption of social media technologies and mobile devices. Although the potential to exploit this capacity for disaster response is increasingly recognized by practitioners, relatively little is known about the dynamics of informal online communication in response to exogenous hazard events. To address this gap, this project employs a longitudinal and comparative approach to examine the content, structure, and dynamics of online communication and information exchange during emergency and disaster events.

Serial Transmission of Information During Emergencies

Serial transmission – the passing on of information from one source to another – is a phenomenon of central interest in the study of informal communication in emergency settings. Microblogging services such as Twitter make it possible to study serial transmission on a large scale, and to examine the factors that make retransmission of messages more or less likely. Here, we consider factors predicting serial transmission at the interface of formal and informal communication during disasters; specifically, we examine the retransmission by individuals of messages (tweets) issued by formal organizations on Twitter. Our central question is the following: How do message content, message style, and public attention to tweets relate to the behavioral activity of retransmitting (i.e., retweeting) a message during a disaster?

A Society of "Silent Separation": Migration and Ethnic Segregation in Estonia

We exploit a novel source of data to model the impact of migration and urbanization on segregation in Estonia. Analyzing the complete mobile phone records of hundreds of thousands of Estonians, we observe the ethnicity of each individual on the network (Russian or Estonian), the complete history of locations visited by each individual, and every phone-based interaction taking place over the network. We find that the ethnic composition of an individual's geographic neighborhood heavily influences the structure of the individual's phone-based network. We further find that patterns of segregation are significantly different for migrants than for the at-large population: migrants are more likely to interact with coethnics than non-migrants, but are less sensitive to the ethnic composition of their immediate neighborhood than non-migrants.
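
As a rough illustration of what "interacting with coethnics" can mean in data terms, the sketch below computes the share of an individual's phone contacts who have the same ethnicity. It is a simple descriptive measure, not the model used in the project; the function name `coethnic_share`, the call-record layout, and the toy data are all assumptions made for this example.

```python
def coethnic_share(calls, ethnicity, person):
    """Fraction of `person`'s distinct phone contacts who share their ethnicity.

    calls: iterable of (caller, callee) pairs (hypothetical record layout).
    ethnicity: dict mapping each individual to an ethnicity label.
    Returns None when the person has no contacts.
    """
    contacts = set()
    for caller, callee in calls:
        if caller == person:
            contacts.add(callee)
        elif callee == person:
            contacts.add(caller)
    contacts.discard(person)  # ignore self-calls
    if not contacts:
        return None
    same = sum(1 for c in contacts if ethnicity[c] == ethnicity[person])
    return same / len(contacts)


# Toy data, invented for illustration only.
ethnicity = {"p": "Estonian", "q": "Estonian", "r": "Russian", "s": "Estonian"}
calls = [("p", "q"), ("r", "p"), ("p", "s"), ("q", "r")]
share = coethnic_share(calls, ethnicity, "p")
print(share)
```

Comparing this share across migrants and non-migrants, or against the ethnic composition of each person's neighborhood, would be one simple way to start exploring the patterns described above.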

Mass Convergence of Attention During Crisis Events: Degree Dynamics and Emergency Response in Online Setting

When crises occur, such as natural disasters, mass casualty events, and political and social protests, we observe potentially drastic changes in social behavior. Local citizens, emergency responders and aid organizations flock to the physical location of the event. Global onlookers turn to communication and information exchange platforms to seek and disseminate event-related content. This social convergence behavior, long known to occur in offline settings in the wake of crisis events, is now mirrored – perhaps enhanced – in online settings. This project looks specifically at the mass convergence of public attention during crisis events. Viewed through the framework of social network analysis, mass convergence of attention onto individual actors can be conceptualized in terms of degree dynamics. This project employs a longitudinal study of social network structures in a prominent online social media platform to characterize instances of social convergence behavior and subsequent decay of social ties over time, across different actor types and different event types.
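
To make "degree dynamics" concrete, the following minimal sketch tracks one actor's in-degree (number of incoming ties, such as followers) over discrete time steps, so that both convergence and later decay of ties show up in the series. The event format, the function name `in_degree_series`, and the toy data are assumptions for illustration and do not describe the project's actual data or methods.

```python
from collections import Counter
from itertools import accumulate


def in_degree_series(tie_events, actor, n_steps):
    """Cumulative in-degree of `actor` at each time step 0..n_steps-1.

    tie_events: iterable of (time_step, source, target, delta) tuples
    (hypothetical layout), where delta is +1 for a new tie and -1 for
    a dropped tie.
    """
    net_change = Counter()
    for t, _source, target, delta in tie_events:
        if target == actor and 0 <= t < n_steps:
            net_change[t] += delta
    # Running total of net tie changes gives the in-degree over time.
    return list(accumulate(net_change[t] for t in range(n_steps)))


# Toy data, invented for illustration only.
events = [
    (0, "a", "agency", +1),
    (1, "b", "agency", +1),
    (1, "c", "agency", +1),
    (1, "a", "other", +1),
    (3, "b", "agency", -1),
]
series = in_degree_series(events, "agency", 5)
print(series)  # [1, 3, 3, 2, 2]
```

In this toy series, the jump at step 1 corresponds to a convergence of attention, and the drop at step 3 to the decay of a tie.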

Using Twitter for Demographic Research

The digital traces we leave on Twitter are fruitful sources of data for social science research. However, users do not directly report key demographic characteristics – such as age, race and gender – that are critical to social scientists. Given this challenge, this project focuses on using systematic and scalable methods to extract demographic information from Twitter users’ profiles and leverage this information to answer sociologically driven questions. One current application of these methods considers whether associative networks within Twitter are as segregated as acquaintanceship networks offline. Acknowledging past work on the role that social structure and agency play in influencing the racial composition of individuals’ networks, we argue that Twitter blurs the roles of these forces as users actively create and are influenced by their own “structure.” This may result in networks that are more or less diverse than what is seen offline. Another current application of these methods addresses the role of race in microdynamics between citizens and the police. This project uses unsolicited, user-generated Twitter content to characterize citizens’ attitudes toward law enforcement and examines how these opinions vary with the geographic, social (i.e., influence of social contacts) and demographic characteristics of the individuals involved.