Social interaction is increasingly mediated by Internet-enabled information and communication technologies. These systems archive digital traces of activity leading to rich data that promises to provide insight into individual and group behavior. DataLab researchers leverage this capacity to collect and analyze rich data to transform our understanding of individuals, organizations, and societies.
A central question in the study of migration concerns the role that migrants play in bringing an economy towards a more efficient use of its resources. This project aims to substantially improve our state of knowledge about migration through the use of novel sources of large-scale, behavioral data.
Understanding the causes and effects of internal migration is critical to the effective design and implementation of policies that promote human development. Here, we describe how large sources of geotagged data generated by mobile phones can provide a novel source of data on internal migration.
This research seeks both to understand the patterns and mechanisms of the diffusion of misinformation on social media and to develop algorithms to automatically detect misinformation as events unfold. During natural disasters and other hazard events, individuals increasingly utilize social media to disseminate, search for and curate event-related information. There is great potential for this information to be used by affected communities and emergency responders to enhance situational awareness and improve decision-making, facilitating response activities and potentially saving lives. Yet several challenges remain; one is the generation and propagation of misinformation. Taking a novel and transformative approach, this project aims to utilize the collective intelligence of the crowd – the crowdwork of some social media users who challenge and correct questionable information – to distinguish misinformation and aid in its detection.
We provide empirical evidence that an early form of "mobile money" is used to share risk. Our analysis uses a unique dataset containing the entire universe of one country's mobile phone communications over a four-year period, and exploits spatio-temporal variation in communication caused by earthquakes and floods. We show that individuals are significantly more likely to send money to people affected by economic shocks, and that gifts are driven more by a desire for reciprocity than purely altruistic motives.
Gender disparities are decreasing overall in academia. However, we find in this data-driven study that the story is a bit more complicated. Examining more than 8 million academic papers, we find that, in certain fields, women are underrepresented as authors of single-authored papers and that men are overrepresented in the first and last author positions. You can explore the data for hundreds of fields of science using the following interactive visualization (http://www.eigenfactor.org/gender/).
Informal exchange of information occurs continually throughout daily life. These pre-existing communication patterns are vital during non-routine circumstances such as emergencies and disasters. In recent years, informal communication channels have been transformed by the widespread adoption of social media technologies and mobile devices. Although the potential to exploit this capacity for disaster response is increasingly recognized by practitioners, relatively little is known about the dynamics of informal online communication in response to exogenous hazard events. To address this gap, this project employs a longitudinal and comparative approach to examine the content, structure, and dynamics of online communication and information exchange during emergency and disaster events.
Serial transmission - the passing on of information from one source to another - is a phenomenon of central interest in the study of informal communication in emergency settings. Microblogging services such as Twitter make it possible to study serial transmission on a large scale, and to examine the factors that make retransmission of messages more or less likely. Here, we consider factors predicting serial transmission at the interface of formal and informal communication during disaster; specifically, we examine the retransmission by individuals of messages (tweets) issued by formal organizations on Twitter. Our central question is the following: How do message content, message style, and public attention to tweets relate to the behavioral activity of retransmitting (i.e., retweeting) a message in disaster?
We exploit a novel source of data to model the impact of migration and urbanization on segregation in Estonia. Analyzing the complete mobile phone records of hundreds of thousands of Estonians, we observe the ethnicity of each individual on the network (Russian or Estonian), the complete history of locations visited by each individual, and every phone-based interaction taking place over the network. We find that the ethnic composition of an individual's geographic neighborhood heavily influences the structure of the individual's phone-based network. We further find that patterns of segregation are significantly different for migrants than for the at-large population: migrants are more likely to interact with coethnics than non-migrants, but are less sensitive to the ethnic composition of their immediate neighborhood than non-migrants.
New sources of large-scale geospatial data can inform policy decisions ranging from disease monitoring and city planning to disaster management and humanitarian relief. However, existing methods for mining these data are not well suited to most developing country contexts where technology use is less intense and the digital traces are generally quite sparse. Here, we present a method for predicting the approximate location of a mobile phone subscriber that is more appropriate to contexts where the signal generated by each individual may be intermittent, but the collective population generates a large amount of data.
When crises occur, including natural disasters, mass casualty events, political and social protests, etc., we observe potentially drastic changes in social behavior. Local citizens, emergency responders and aid organizations flock to the physical location of the event. Global onlookers turn to communication and information exchange platforms to seek and disseminate event-related content. This social convergence behavior, long known to occur in offline settings in the wake of crisis events, is now mirrored – perhaps enhanced – in online settings. This project looks specifically at the mass convergence of public attention during crisis events. Viewed through the framework of social network analysis, mass convergence of attention onto individual actors can be conceptualized in terms of degree dynamics. This project employs a longitudinal study of social network structures in a prominent online social media platform to characterize instances of social convergence behavior and subsequent decay of social ties over time, across different actor types and different event types.
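As a rough illustration of what "degree dynamics" can mean here, the sketch below tracks how many accounts follow a given actor at a series of checkpoints, so that a surge (convergence) and later decline (decay of ties) show up as a rise and fall in in-degree. It is a minimal sketch under stated assumptions, not the project's method: the event format, the function name `in_degree_series`, and the sample data are all hypothetical.

```python
def in_degree_series(events, actor, checkpoints):
    """Return [(t, in_degree)] for `actor` at each checkpoint time.

    `events` is an assumed format: tuples (time, follower, followee, delta),
    where delta is +1 when a tie is created and -1 when it is dropped.
    """
    # Keep only ties pointing at the actor of interest, in time order.
    ordered = sorted((e for e in events if e[2] == actor), key=lambda e: e[0])
    series = []
    degree = 0
    i = 0
    for t in sorted(checkpoints):
        # Apply every tie change that happened up to and including time t.
        while i < len(ordered) and ordered[i][0] <= t:
            degree += ordered[i][3]
            i += 1
        series.append((t, degree))
    return series


# Hypothetical toy data: three accounts follow "responder", one later unfollows.
events = [
    (1, "a", "responder", +1),
    (2, "b", "responder", +1),
    (2, "a", "other", +1),
    (3, "c", "responder", +1),
    (5, "a", "responder", -1),
]
series = in_degree_series(events, "responder", [0, 2, 4, 6])
print(series)  # [(0, 0), (2, 2), (4, 3), (6, 2)]
```

Comparing such series across actor types or event types is one simple way to describe how quickly attention converges and how quickly the resulting ties dissolve.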
The digital traces we leave on Twitter are fruitful sources of data for social science research. However, users do not directly report key demographic characteristics – such as age, race and gender – that are critical to social scientists. Given this challenge, this project focuses on using systematic and scalable methods to extract demographic information from Twitter users’ profiles and leverage this information to answer sociologically driven questions. One current application of these methods considers whether associative networks within Twitter are as segregated as acquaintanceship networks offline. Acknowledging past work on the role that social structure and agency play in influencing the racial composition of individuals’ networks, we argue that Twitter blurs the roles of these forces as users actively create and are influenced by their own “structure.” This may result in networks that are more or less diverse than what is seen offline. Another current application of these methods addresses the role of race in considering microdynamics between citizens and the police. This project uses unsolicited, user-generated Twitter content to characterize citizens’ attitudes toward law enforcement and examines how these opinions vary along geographic, social (i.e., influence of social contacts) and demographic characteristics of the individuals involved.
Many critical policy decisions depend upon reliable and up-to-date information on market prices. We evaluate Premise Data, a new technology for measuring price information using crowd-sourced data contributed by local citizens. Our evaluation focuses on Liberia, a country with a history of economic and political instability. Our results indicate that the crowd-sourced price data is strongly correlated with traditional price indices, but that statistically and economically significant deviations exist that require deeper investigation.
Automatic payroll deduction plans, such as the popular 401(k) account in the U.S., represent one of the most effective means of increasing savings in developed countries. We designed and evaluated a mobile phone-based automatic payroll deduction system in Afghanistan, a country with limited formal financial infrastructure. Working with Afghanistan's largest telecommunications operator, we developed and launched a new mobile savings account, using a randomized controlled trial to concurrently evaluate 24 variants of a single basic account. Our results indicate that access to this account significantly increases the average employee's total savings.