Introducing the Expanding Dataverse




Castro, Eleni
Quigley, Elizabeth




The Dataverse Project started in 2006 at Harvard’s Institute for Quantitative Social Science as an open-source software application to share, cite and archive data. From its beginnings, Dataverse (then referred to as the ‘Dataverse Network’) has provided a robust infrastructure for data stewards to host and archive data, while offering researchers an easy way to share and get credit for their data. There are now ten Dataverse repositories, hosted in institutions around the world, that share metadata with each other and together serve more than 55,000 datasets with 750,000 data files. These Dataverse repositories use the Dataverse software in a variety of ways, from supporting existing large data archives to building institutional or public repositories. One of them is the Harvard Dataverse, which alone hosts more than 800 dataverses (containers of datasets) owned and managed by researchers, research groups, organizations, departments or journals. The Harvard Dataverse has so far served more than a million downloads of its datasets, allowing researchers around the world to reuse the data, discover new findings, and extend or verify previous work. While the Dataverse Project started in the social sciences for the social sciences, it has now expanded to benefit a wide range of disciplines and scientific domains (astronomy, life sciences, etc.), leveraging our progress in the social science domain to define and enhance data publishing across all research communities. In particular, as part of the new Dataverse release (v4.0), we have evaluated the features needed in data publishing so that data can be properly shared, found, accessed and reused. This presentation will provide background on Dataverse's history and showcase the new features we have developed in version 4.0 for researchers.


Presentation slides for the 2015 Texas Conference on Digital Libraries (TCDL).