Assessing the Texas Data Repository: Determining What to Measure and How
Background: The Texas Data Repository (TDR) launched in Spring 2017. The TDR is built on the Dataverse software platform and hosted by the Texas Digital Library (TDL), a consortium of higher education institutions in Texas. Currently, 11 institutions participate in the TDR, and liaisons from these institutions serve on a TDR Steering Committee to provide feedback and guide the direction of the repository service. The Assessment Working Group (AWG) is a sub-group of the Steering Committee tasked with evaluating the progress of the TDR.

Purpose: In Fall 2017, the AWG began an assessment to identify the TDR's reporting needs by addressing the following research questions:
- Which usage and descriptive information about the TDR will be most valuable?
- What process for gathering and distributing these metrics/information will be most useful?

Approach: The first step in determining the most valuable usage metrics was distributing a survey to all TDR institutional liaisons. The survey was vital in identifying the widely varying resources and needs of the participating institutions, as well as the information the liaisons were interested in seeing both institutionally and consortially. Because the most valuable usage metrics should allow for comparative assessment across repositories globally, the AWG also compiled descriptions of metrics recommended by three sources of best practices for tracking the impact of research data, including the Make Data Count project's "Code of practice for research data usage metrics." The results of this compilation were combined with the survey results to produce a prioritized list of metrics.

Findings: In the survey, the following five metrics were most frequently reported as desirable: dataset download counts; unique page visitors; size of datasets (MB); size of collections (MB); and number of files within datasets. Participants were also interested in other information, including institutional researchers using the TDR; the hierarchy of collections and datasets; a list of links to all collections and datasets; and a list of dataset DOIs. The 24 compiled metrics were sorted into four categories: metrics about datasets and dataverses/collections, metrics about users, metrics about views and access, and metrics about sessions. After these were combined with the top nine metrics requested in the survey, each metric was assigned a high (11 metrics), medium (5), or low (8) priority.
The high-priority metrics are: dataset downloads; size of datasets; list of dataset DOIs; size of collections/dataverses; list of links to all dataverses/collections and datasets; list of institutional researchers using the TDR; unique page visitors; unique dataset investigations; total dataset investigations; number of sessions; and average session duration.

Conclusions: The AWG will submit a report on this work to the TDR Steering Committee in February 2019. Among the recommendations will be incorporating the high-priority metrics into bi-weekly reports and submitting monthly metrics to Make Data Count. These high-priority metrics should be available from current TDR logs. If the recommendations are accepted by the TDR Steering Committee, generating the new reports will be incorporated into the TDR roadmap for the 2019-2020 fiscal year.
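For illustration, the session-based metrics above (total dataset investigations, unique dataset investigations, number of sessions, average session duration) can be sketched from access-log records roughly as follows. This is a minimal sketch only: the record fields (`session`, `dataset`, `time`) are assumptions for illustration, not the actual TDR/Dataverse log schema, and "unique" follows the Make Data Count convention of counting a dataset at most once per session.

```python
from datetime import datetime

# Hypothetical log records; field names are assumptions, not the
# actual TDR/Dataverse log schema.
log = [
    {"session": "s1", "dataset": "doi:10.x/A", "time": "2019-01-02T10:00:00"},
    {"session": "s1", "dataset": "doi:10.x/A", "time": "2019-01-02T10:05:00"},
    {"session": "s1", "dataset": "doi:10.x/B", "time": "2019-01-02T10:06:00"},
    {"session": "s2", "dataset": "doi:10.x/A", "time": "2019-01-03T09:00:00"},
]

# Total investigations: every access event counts.
total_investigations = len(log)

# Unique investigations: each dataset counted at most once per session.
unique_investigations = len({(r["session"], r["dataset"]) for r in log})

# Sessions and average session duration (last event minus first event).
sessions = {}
for r in log:
    t = datetime.fromisoformat(r["time"])
    first, last = sessions.get(r["session"], (t, t))
    sessions[r["session"]] = (min(first, t), max(last, t))

n_sessions = len(sessions)
avg_duration = sum((last - first).total_seconds()
                   for first, last in sessions.values()) / n_sessions

print(total_investigations, unique_investigations, n_sessions, avg_duration)
# → 4 3 2 180.0
```

In practice these counts would be produced by log-processing tooling rather than ad hoc scripts, but the sketch shows why "unique" and "total" investigations can differ for the same logs.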