Archiving on the Go: Facilitating Auto-Archiving of Evolving Digital Collections




Scott, Bethany



For more than 10 years, archivists have stressed the importance of intervening early in the records creation process to ensure the records' long-term preservation. In the academic context, the necessary infrastructure and services are still being defined: while librarians are ready to provide instructional materials and guidance for implementing metadata management plans, ongoing support for researchers creating their collections is not in place. Moreover, typical institutional repositories (IRs) do not provide storage services for working or ongoing collections, nor do they widely support what such collections require, including bulk uploads, large amounts of storage space, metadata creation, and privacy protection.

This project presents a case study of guiding an evolving digital collection that has expanded beyond the creator’s (and the IR’s) capability to easily manage and preserve it.

In this presentation, we will first describe a unique collection of digital fine art photography, the working process and information management actions of the creator, and his needs for a digital archive system to easily store, search, and retrieve files for further editing. Through a detailed interview, we gained information about the artist’s working process, from taking photographs to digitally processing them and storing them on external hard drives. The current collection is very large, both in the number of individual files and in the typical file size, which is often over 3 GB per image. Because the collection currently spans over 100 individual hard drives, it is both unwieldy to search and manage and more vulnerable to data loss through hardware failure. A secure and easy-to-use remote storage solution will allow him to organize and view his entire collection at once, and this improvement will save time in processing activities, so that the artist can devote more time to creating new images. TACC provides the computational resources and research and development expertise to implement this system. We will discuss the benefits of this case study (such as the ability to work with developers to improve bulk uploads and metadata mapping) and its limitations (such as the slow transfer speeds encountered on some networks, and the problems of human error in applying file naming conventions for automated metadata extraction).
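As a rough illustration of why human error in file naming matters for automated metadata extraction, the sketch below parses metadata out of file names and sets aside any name that does not follow the convention. The naming scheme, the field names, and the functions `extract_metadata` and `sort_for_review` are all hypothetical, chosen only for this example; they are not the artist's actual convention or the system built for this project.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical convention (an assumption for illustration only):
#   <series>_<YYYYMMDD>_<sequence>_<stage>.<ext>
# for example "desert_20110415_0042_master.tif".
FILENAME_PATTERN = re.compile(
    r"^(?P<series>[a-z0-9-]+)_"
    r"(?P<date>\d{8})_"
    r"(?P<sequence>\d{4})_"
    r"(?P<stage>raw|edit|master)"
    r"\.(?P<ext>tif|psd|dng)$"
)


def extract_metadata(filename: str) -> Optional[Dict[str, str]]:
    """Return the metadata fields encoded in a file name, or None if the
    name does not follow the (hypothetical) convention."""
    match = FILENAME_PATTERN.match(filename)
    return match.groupdict() if match else None


def sort_for_review(
    filenames: List[str],
) -> Tuple[Dict[str, Dict[str, str]], List[str]]:
    """Split file names into those that parsed cleanly and those that
    need a person to correct them before upload."""
    parsed: Dict[str, Dict[str, str]] = {}
    rejected: List[str] = []
    for name in filenames:
        metadata = extract_metadata(name)
        if metadata is None:
            rejected.append(name)
        else:
            parsed[name] = metadata
    return parsed, rejected


if __name__ == "__main__":
    ok, needs_review = sort_for_review(
        ["desert_20110415_0042_master.tif", "Desert 4-15-11 final.tif"]
    )
    print(ok)
    print(needs_review)
```

A strict pattern like this trades convenience for reliability: a name typed with spaces or a different date format is not guessed at but flagged, so the error is corrected by a person rather than silently producing wrong metadata.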

By becoming involved in the artist’s information management processes at this early point in the file life cycle, we not only allow him to manage his own time more efficiently, but also ensure that the files are accessible and well organized for the archivists and researchers who may deal with the collection in the future. As digital collections continue to evolve, it will be crucial to provide long-term, secure storage and preservation. The high-performance storage resources now increasingly available facilitate this goal. Approaching the creators of research collections more proactively to provide data management services complements the available storage, allowing researchers to continue to create, curate, and preserve their own collections.


Presentation slides for the 2012 Texas Conference on Digital Libraries (TCDL).