About our Work
Research at the DataLab explores novel uses for large-scale heterogeneous data to understand the behavior of individuals, firms, societies and economies. Our projects are focused in six primary areas.
Research Focus Areas
The proliferation of mobile phones and other technologies in all regions of the world provides a unique opportunity to use large-scale digital data for social good. By developing a set of methods for understanding social and economic behavior in developing countries, work at the DataLab is helping to promote effective public policy for alleviating poverty and producing positive social change.
Today's digital environments facilitate the spread of information. Whether on Facebook, Twitter or other social media platforms, ideas can move from a living room to a worldwide audience in a very short time. The challenge we face today, though, is sorting legitimate information from rumors, misinformation, and disinformation. In our lab, we study the spreading dynamics of these rumors, the environments that facilitate their spread, the impact they are having on social institutions, and the ways to combat misinformation.
Data visualizations convey patterns by encoding data in the visual attributes (e.g., color, size) of graphical marks (e.g., bars, lines). Interaction allows analysts to manipulate and compare large datasets. The DataLab develops better tools to facilitate visualization-based analysis and communication.
Social interaction is increasingly mediated by Internet-enabled information and communication technologies. These systems archive digital traces of activity, producing rich data that promise insight into individual and group behavior. DataLab researchers leverage this capacity to collect and analyze rich data to transform our understanding of individuals, organizations, and societies.
Data curation is concerned with advancing access to trustworthy and reusable data resources. DataLab researchers are actively investigating how to build rich, functional collections of digital data for research communities in the sciences and social sciences and how to improve access to open data for the public. Their work contributes to sustaining the long-term value of open data resources and global progress toward shared cyberinfrastructure.
The Science of Science turns the microscope on science itself. The research subjects are the scientists themselves, their inputs (ideas, funding, training), and their outputs (papers, students, patents). Researchers in this field try to uncover the origins of innovation, ways to improve scholarly communication, issues of equity (gender, race, socioeconomic status), and policies that facilitate discovery.
Featured Projects
VizioMetrics is an image search engine and classifier. To improve it, we would like to automatically identify a “central figure” in scientific articles that contain multiple figures. We define a “central figure” as a single visualization that encapsulates key aspects of a paper: a graphical summary that captures the content of the article for readers at a single glance. We surveyed 488,590 researchers in the biomedical field and found that, for an overwhelming majority of papers, the authors were able to identify a single “central figure.”
We present the results of a field experiment in Afghanistan that was designed to increase adoption of mobile money and to determine whether such adoption led to measurable changes in the lives of the adopters. The intervention we evaluate is a mobile salary payment program, in which a random subset of employees of a large firm were transitioned to receiving their regular salaries in mobile money rather than in cash. While mobile money salaries led to immediate and significant cost savings for the employer, we find little consistent evidence that mobile money had an impact on several key indicators of individual wealth or well-being. Taken together, these results suggest that while mobile salary payments may greatly increase the efficiency and transparency of traditional economies, in the short run the benefits may be realized by those making the payments rather than by those receiving them.
This research seeks both to understand the patterns and mechanisms of the diffusion of misinformation on social media and to develop algorithms to automatically detect misinformation as events unfold. During natural disasters and other hazard events, individuals increasingly utilize social media to disseminate, search for and curate event-related information. There is great potential for this information to be used by affected communities and emergency responders to enhance situational awareness and improve decision-making, facilitating response activities and potentially saving lives. Yet several challenges remain; one is the generation and propagation of misinformation. Taking a novel and transformative approach, this project aims to utilize the collective intelligence of the crowd – the crowdwork of some social media users who challenge and correct questionable information – to distinguish misinformation and aid in its detection.
For hundreds of years, scientists have been laying down trails of citations. These trails form a vast network, where papers are nodes and citations are links. This network can tell us a lot about the formation of new ideas, fields, and technology. We can identify salient papers and authors. We can construct maps that help us navigate this ever-growing network. And we can better understand how information flows in social systems. These are some of the goals of the Eigenfactor Project (http://www.eigenfactor.org).
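To make the idea of papers as nodes and citations as links concrete, here is a minimal sketch of how salient papers might be ranked in a small citation network. It uses a generic PageRank-style power iteration as a stand-in; it is not the Eigenfactor Project's own algorithm, and the function name `rank_papers`, the damping value, and the toy paper labels are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_papers(citations, damping=0.85, iterations=50):
    """Score papers in a citation network (illustrative, PageRank-style).

    citations: list of (citing_paper, cited_paper) pairs (hypothetical input format).
    Returns a dict mapping each paper to a score; scores sum to 1.
    """
    papers = sorted({paper for pair in citations for paper in pair})
    out_links = defaultdict(list)
    for citing, cited in citations:
        out_links[citing].append(cited)

    n = len(papers)
    score = {paper: 1.0 / n for paper in papers}

    for _ in range(iterations):
        # Papers that cite nothing spread their score evenly over all papers.
        dangling = sum(score[p] for p in papers if not out_links[p])
        new_score = {p: (1 - damping) / n + damping * dangling / n for p in papers}
        # Each paper passes its score along its citation links in equal shares.
        for p in papers:
            for cited in out_links[p]:
                new_score[cited] += damping * score[p] / len(out_links[p])
        score = new_score

    return score


# Toy network (made up): B and C cite A, C also cites B, D cites C.
example = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "C")]
scores = rank_papers(example)
for paper in sorted(scores, key=scores.get, reverse=True):
    print(paper, round(scores[paper], 3))
```

In this toy network, paper A collects citations from both B and C and so ends up with the highest score, while D, which nobody cites, ends up with the lowest.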