Digital Preservation and Compliance-based Access for Privacy-sensitive Records




Karadkar, Unmil P.
Verma, Nitin
Dong, Lorraine
Galloway, Patricia
Obaseki, Victor
Davis, King


We now have the ability to digitize large collections of privacy-sensitive records, such as those of mental health institutions, hospitals, and prisons, and to make them available for scholarship. Public availability of such records has the potential for negative impact not only on the individuals named within them but also on their families and descendants. Furthermore, the dissemination of such records is governed by a variety of federal, state-level, and local statutes, and the desire of scholars and families to access this information must be balanced against ethical as well as legal concerns.

With funding from the Mellon Foundation, an interdisciplinary team of scholars in mental health, information studies, and law is exploring the issues involved in digitally preserving and providing access to privacy-sensitive records. The records of the Central State Hospital (CSH) in Petersburg, VA, the first mental health institution to serve black people in the USA, serve as a model collection for exploring these issues. The hospital has been operational since 1870, and the project focuses on its records from its inception until its integration in the 1970s. In addition to records of patient admission, treatment, discharge, readmission, and death, this impeccably maintained collection includes administrative records of the hospital, such as board meeting minutes, reports to the governor, photographs, financial records, newsletters, and contemporary medical literature. The project has digitized over 600,000 pages, and the records occupy approximately 10 TB of disk space. The records are of interest to families of former patients and to scholars in disciplines such as mental health, social work, history, law, and policy.

In this presentation, we will discuss the multi-dimensional complexity of balancing privacy against providing access to this collection for several demographics. We will showcase developments along three directions: a) complying with relevant federal and state statutes; b) developing policies to model mental health collections; and c) developing a standards-based dark archive to host the CSH records. The presentation will include workflows; a metadata schema that includes descriptive, administrative, preservation, compliance, and technical metadata; and the principles and policies for making documents available to various demographics with appropriate protections in place. We will describe the options that the project team considered before making decisions about hardware and software infrastructure, digital archive infrastructure, and the design of the conceptual models. The project is committed to using free, standards-based software such as Archivematica, Fedora, and Hydra, and to developing FLOSS tools for processing documents and generating metadata. Finally, we will describe our initiative to unlock the data in structured and free-form handwritten documents, which comprise about a third of our collection.


Presentation slides for the 2017 Texas Conference on Digital Libraries (TCDL).