2014 Texas Conference on Digital Libraries
Permanent URI for this collection: https://hdl.handle.net/2249.1/67018
Browsing 2014 Texas Conference on Digital Libraries by Title
Now showing 1 - 20 of 42
Item: Applying Visual Arts Pedagogy in the Training of New Digital Imaging Technicians (2014-03-14)
Rankins, Derek; Moore, Jeremy; University of North Texas

We propose a 24x7 presentation that provides insight into the training methods of UNT's Digital Projects Lab. Derek Rankins and Jeremy Moore apply pedagogy from their visual arts backgrounds while training new digital imaging technicians. We assert that when students learn to scan items first, they internalize a series of steps that, when completed, signify the job is done regardless of how the final image appears. Instead, student assistants now begin with quality control tasks so that they learn the difference between "good" and "bad" images before they are asked to create anything. This is further extended by having the students participate in peer reviews and buddy training sessions. The presentation will include sample images of common mistakes across a variety of imaging platforms.

Item: The Archive of the Indigenous Languages of Latin America (2014-03-25)
Kung, Susan; University of Texas at Austin

The Archive of the Indigenous Languages of Latin America (AILLA) is a completely digital repository at the University of Texas at Austin, LLILAS Benson Latin American Studies and Collections. AILLA has no physical presentation space; its collections are accessible only through its website (www.ailla.utexas.org) via parallel interfaces in both English and Spanish. AILLA's primary mission is the preservation of irreplaceable linguistic and cultural resources in and about the indigenous languages of Latin America, most of which are endangered. Most of the materials in the archive are primary field data that were collected and deposited (donated) by linguists and anthropologists for whom audio and video recordings are a central part of their research methodology. Many indigenous organizations have also donated the results of their investigations to AILLA.
The majority of AILLA's collection consists of audio and video recordings of discourse in a wide range of genres, including conversations, many types of narratives, songs, political oration, traditional myths, curing ceremonies, etc. Many recordings are accompanied by transcriptions and translations of the speech event. Other textual resources include dictionaries, grammars, ethnographic sketches, fieldnotes, articles, handouts, and PowerPoint presentations. The collection also contains hundreds of photographs. AILLA's secondary mission is to make these valuable and useful resources maximally accessible via the Internet while simultaneously protecting personally, culturally, and politically sensitive materials from inappropriate use and supporting the intellectual property rights of the creators. AILLA's system of access levels allows creators and depositors fine-grained control over their materials, letting them restrict their entire collections or only certain files within the collections. For example, recordings might be public while transcriptions might be restricted, or vice versa. Sensitive materials are protected; however, AILLA's directors, manager, and depositors believe strongly that accessibility is equally important. Historically, very little of the fruit of linguistic and anthropological research has been genuinely available to the indigenous communities in which the research was done; AILLA aims to rectify that imbalance. Restrictions tend to keep speakers out, while researchers can generally gain access to archival materials through the academic network. Resources that are publicly accessible can be heard and read by all speakers. Our policy is that if a resource can be made public, it should be made public; but if it is sensitive, it should be protected. Our goal is to ensure that the unique and wonderful resources preserved at AILLA can be used to maintain, revitalize, and enrich the communities from which they arise.
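The per-file access model described above can be illustrated with a minimal sketch; the data model, field names, and example files here are hypothetical illustrations, not AILLA's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical model of depositor-controlled access: a public recording
# can sit beside a restricted transcription in the same collection.
@dataclass
class Resource:
    name: str
    kind: str      # e.g. "recording", "transcription"
    public: bool   # True = open to everyone

def accessible(resource: Resource, is_authorized: bool) -> bool:
    """A resource is viewable if it is public, or the user holds
    whatever authorization the depositor requires."""
    return resource.public or is_authorized

collection = [
    Resource("interview-01.wav", "recording", public=True),
    Resource("interview-01.txt", "transcription", public=False),
]

# A user without special authorization sees only the public recording.
visible = [r.name for r in collection if accessible(r, is_authorized=False)]
```

The point of the sketch is that the restriction attaches to each file, not to the collection as a whole.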
AILLA was intended from the outset to function as a partner with its depositors, providing them with a means of both preserving and sharing, under appropriate terms, the fruits of their work with the indigenous peoples of Latin America. The archive accepts any legitimate resources that can be housed in a digital format.

Item: Being an 'a11y': Increasing Accessibility in Born Digital Preservation (2014-03-25)
Snider, Lisa; Harry Ransom Center, University of Texas at Austin

In the past few years, archivists and librarians have grappled with issues associated with the long-term preservation of born-digital materials. Are we considering the needs of people with disabilities when preserving these materials? This presentation will explore how we can increase accessibility when preserving born-digital materials. Taken from an archival point of view, the presentation will focus on one solution that may make our born-digital material more accessible to people with disabilities.

Item: Beyond Web-based Scholarly Works Repositories: The effect of institutional mandates on the faculty attitudes towards Institutional Repositories (2014-03-25)
Tmava, Ahmet Meti; Alemneh, Daniel; University of North Texas

In the last decade there has been a push from academic institutions to encourage faculty to deposit their work in web-based scholarly work repositories, commonly known as institutional repositories (IRs). IRs are responsible for collecting and preserving the intellectual works of faculty and students and making them widely available. In light of the ever-evolving landscape of higher education, IRs seek to move beyond the custodial role and actively contribute to the advancement of scholarly communication.
Understanding and addressing the issues faced by IRs requires a multidimensional approach that involves all stakeholders, including individual scholars and researchers, academic institutions and librarians, scholarly and scientific society publishers, commercial publishers, and government institutions. However, most researchers (Kim, 2010) agree that the main players are faculty members, who can make or break an IR. Although IRs are an innovation in scholarly communication, they have been met with resistance from faculty members. Academics have been slow to embrace the concept of IRs; according to a recent study by Primary Research Group (2014), only 5% of journal articles published by the faculty members of the surveyed organizations have been archived in the IR. While a range of factors seems to influence researchers' use of repositories, there is still no agreement on how to resolve the challenge of getting authors to deposit content. The most recent survey by Nicholas et al. (2014) suggested that while the size and use of repositories have been relatively modest, almost half of all institutions either have, or are planning, a repository mandate requiring deposit. However, Crow (2002) warned that faculty submission will have to be voluntary or risk encountering resistance from faculty members who might otherwise prove supportive. The current situation of IRs is rather bleak and calls into question the effectiveness of the current ways of recruiting content, including institutional mandates. Nicholas et al. argue that mandates vary based on the research community and/or institution. Their findings reveal that none of the participating institutions reported any attempt to force researchers to comply with the mandate, and they describe the current mandates as more educational than binding. The same study concludes that 22 percent of the researchers were directly influenced by a mandate to deposit their work, and that this varied with age.
Thus, the hope remains that with the mandates in place, the new generation of researchers will get used to the idea of depositing their work. This poster will revisit the content recruitment issues in general. Although there is an extensive body of relevant knowledge, discussions about IR transformations are often based on opinion and the isolated experience of commentators, leaving out the main issue (i.e., institutional policies) and the main players (i.e., faculty). This paper will attempt to assess the effect of institutional mandates on faculty attitudes towards IRs. We believe that analyzing and spotlighting the possible correlations between and among various factors is pertinent for understanding and shaping the ongoing transformation of IRs.

Item: Centralized to Scattered: Designing Project Workflows for a Dynamic Staff (2014-04-04)
Wills, Faedra; Schenk, Krystal; University of Texas at Arlington

How can staff collaborate on digital projects when they are dispersed throughout the library? This is the challenge the new Digital Creations department faced after a library-wide reorganization in the summer of 2013. In 2011, the UT Arlington Libraries began mining faculty CVs for articles that we could add to our local institutional repository. After the reorganization, the staff previously working on this project were scattered across three departments. By leveraging the project management features of the newly adopted tool SharePoint, we are able to distribute the work of this project across staff and departments. In this presentation we will demonstrate how we are using SharePoint's workflows, custom lists, task lists, and shared calendars to help keep staff informed, generate reports, and manage projects.
In particular, we will show how we use these features to help keep staff on task and faculty informed of our progress.

Item: Collection Size Descriptions as Archival Data: The Spectrum of physdesc (2014-03-25)
Buchanan, Sarah; Li, Haoyang; University of Texas at Austin

This poster presents insight into the functional vocabulary with which repositories describe the physical extent of their collections. The structured standard Encoded Archival Description (EAD) has provided repositories with an XML basis for representing archival finding aids since its creation and adoption during the 1990s. As one measure of its widespread adoption by collecting repositories, consider that the nationwide corpus of ArchiveGrid currently comprises over 120,000 EAD documents. The public database Texas Archival Resources Online similarly facilitates discovery of historical collections by displaying the contributions of EAD-structured finding aids from Texas repositories. The current version of EAD consists of 146 elements – each an EAD tag with a formal element name – which provide the basis for these structured descriptions of collections. In this research we focus on one component of collection description, the <physdesc> tag, and report on the range of format types that appear in Texas collections. Beyond the colloquial names of box, photograph, and painting exist many outlier terms which present unique challenges and opportunities. The variation within the <physdesc> tag may be painless to the human reader during display, yet becomes problematic for natural language processing, which requires normalization of collection sizes in order to perform statistical analysis. Through the one element of Physical Description, repositories are charged with summarizing both the materiality and the quantity of the items contained in an entire collection.
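The kind of normalization such analysis requires can be sketched in a few lines; the pattern and unit handling below are illustrative assumptions, not the project's actual method.

```python
import re

# Hypothetical normalizer for free-text physical-extent values: the same
# holding can appear as "3 boxes", "3.5 linear feet", "12 photographs",
# etc., and statistical analysis first needs (quantity, unit) pairs
# extracted from the plain text.
EXTENT = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Za-z .]+)")

def normalize_extent(physdesc_text: str):
    """Return a list of (quantity, unit) pairs found in an extent string."""
    pairs = []
    for qty, unit in EXTENT.findall(physdesc_text):
        pairs.append((float(qty), unit.strip().rstrip(".").lower()))
    return pairs

# A value like "3.5 linear feet (7 boxes)" yields both the linear-feet
# measure and the box count.
pairs = normalize_extent("3.5 linear feet (7 boxes)")
```

Real finding-aid data would need a much richer unit vocabulary, but even this sketch shows why plain-text extents resist aggregation across repositories.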
These descriptions speak to the physical form and enumerative values of all information artifacts in the collection through the use of four optional subelements: dimension, extent, genre characteristic, and physical facet. We demonstrate the effect of the relative leeway in data structure requirements built into the formal definition of this element. Because "the information may be presented as plain text," the end result of this definition is a dataset with wide internal variation that could impede the goal of assessing such collections through actionable data and its reuse in a broader context, such as by repository or region. With the third EAD revision currently in gamma release (and set to replace EAD 2002 this spring), we consider our study in parallel with two developments: the continuation of the <physdesc> element as an unstructured option, and the creation of a new element which will formally adopt, rename, and add a fifth subelement to the four optional subelements listed above. In addition to version compatibility, EAD developers and adopters should facilitate integration of the legacy data corpus alongside data requirements to meet the dual goals of analysis and discovery. The Visualizing Archival Data / Augmented Processing Table project, of which this study is a part, aims to understand how such finding aid data can reveal the quality and granularity of collection arrangements, and through this, the layers of historical evidence that are made available to researchers seeking resources on specific topics, people, and organizations.

Item: Conceptualizing and implementing a webinar series: lessons learned from the Mountain West Digital Library Webinar Series (2014-03-14)
Cummings, Rebekah; University of Utah

Webinars are a low-cost and efficient training model that allows librarians to disseminate valuable information, connect with colleagues, and build and expand their communities beyond geographic and institutional boundaries.
Yet, while many information specialists attend webinars on a regular basis, the task of hosting a webinar series may seem like a daunting and opaque challenge, even for enthusiastic webinar participants. In this poster session, Rebekah Cummings, Outreach Librarian at the Mountain West Digital Library, will demystify the process of implementing a successful webinar series, including content creation, recruiting guest speakers, software selection, promotion, hosting the webinar, and follow-up. This session will include practical advice on how to host a webinar or webinar series, the costs and benefits associated with hosting webinars, and lessons learned from the Mountain West Digital Library's Webinar Series.

Item: A Consortial Response to Data Sharing: The TDL Data Management Pilot Project (2014-03-25)
Hanken-Kurtz, Debra; Texas Digital Library

In February 2013 the federal Office of Science and Technology Policy (OSTP) issued a memorandum requiring federal funding agencies that spend at least $100 million per year on research and development to mandate public access to the metadata, published research, and data outputs that result from this funding. In response to the OSTP mandate and to the stated needs of its member libraries, the Texas Digital Library began to plan for a consortially developed and run data management service that would meet the requirements of the mandate and position libraries to play a crucial role in ongoing conversations about data management at their universities. A working group of representative TDL member schools and the Texas Advanced Computing Center began meeting in Fall 2013 to create a cross-institutional pilot project to ingest and make data accessible on the web. The goals of this pilot are: To create services that meet emerging federal requirements for data and research publication for federally funded research projects. To design and integrate a system for curating and managing data that supports novel interdisciplinary research.
To design services that will support the dissemination of research to the public in ways that are useful and effective in meeting the goals of the member institutions. The group is working with environmental science research groups identified at Texas A&M University to ingest data in a variety of formats, develop and apply metadata to maximize discovery, measure access and usage, and track costs. The project will build on existing TDL technologies and resources, including hosted DSpace institutional repositories, DuraCloud, and large-scale storage at the Texas Advanced Computing Center. It will deploy these resources strategically to develop a working service and identify areas of need for future development. The pilot project will be completed in the fall of 2014. This presentation will provide an overview of the project and the group's assumptions in taking it on, our progress to date, and information about the challenges faced thus far.

Item: Content Management Systems and 3D Models: Creation, Interaction and Display (2014-03-25)
Wackerman, Dillon; Thompson, Ashley; Stephen F. Austin State University

This presentation will explore methods of creating and displaying 3D images in relation to Content Management Systems and online collections. Examining the creation of 3D models through various platforms, we will discuss the interaction and feasibility of implementing several common 3D formats. The online display of 3D models has been in use for several years, most notably in archaeological reconstruction projects and more recently in digital imaging within the field of medical science. For this presentation, a 3D model is a visual representation that can be manipulated with various tools, which enable it to be turned, rotated, and magnified, among other functions.
Apart from such large-scale examples, 3D models have yet to be fully utilized for the online display of cultural heritage objects, in particular within Content Management Systems such as CONTENTdm and Digital Commons. The 3D file formats that this presentation addresses necessarily carry over into the display of 3D models. Discussion will then also consider the relationship between these file formats and external or internal methods of display. This presentation will also address recent developments in the area of 3D model representation and how subsequent applications may change.

Item: Developing a Library Open Access Portal That Bypasses the Need for Authentication (2014-03-25)
Herbert, Bruce; Potvin, Sarah; Ponsford, Bennett; Highsmith, Anne; Texas A&M University

Texas A&M University was established as Texas' only land grant university through the First Morrill Act (1862), which sought to provide a broad segment of the population with a practical education that had direct relevance to their daily lives. Our impact on society was later expanded through the creation of the agricultural experiment stations and the Cooperative Extension Service, which disseminate the results of experiment station research to improve the state's agricultural industry. The Sterling C. Evans Library at Texas A&M is building upon this history to help bring all of Texas A&M's scholarly work to bear on many of society's greatest challenges by promoting open access. We are working to identify and advance appropriate information systems, practices, and policies that improve societal access to the scholarly and creative work at Texas A&M. The Texas A&M University Libraries has begun work to design a portal that bypasses the need for authentication and allows a user to search through a collection of open access materials.
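A portal of this kind has to decide, record by record, whether a harvested item is genuinely open access. The sketch below shows one simple way to filter harvested Dublin Core-style records by their rights statement; the sample records and the accepted rights phrases are illustrative assumptions, not the portal's actual rules.

```python
import xml.etree.ElementTree as ET

# Sample harvested records (hypothetical); a real harvest would come
# from an aggregation service or an OAI-PMH feed.
SAMPLE = """<records>
  <record><title>Article A</title><rights>Open Access</rights></record>
  <record><title>Article B</title><rights>Restricted to campus</rights></record>
</records>"""

# Rights phrases treated as marking gratis open access (assumed list).
OPEN_MARKERS = {"open access", "public domain", "cc-by"}

def open_access_titles(xml_text: str):
    """Return titles of records whose rights statement marks them open."""
    root = ET.fromstring(xml_text)
    titles = []
    for rec in root.findall("record"):
        rights = (rec.findtext("rights") or "").strip().lower()
        if rights in OPEN_MARKERS:
            titles.append(rec.findtext("title"))
    return titles

oa = open_access_titles(SAMPLE)
```

In practice rights metadata is far messier than an exact-match set, which is why the abstract emphasizes identifying materials that are "legitimately open access."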
Working with Ex Libris, the vendor from which we license our Primo discovery layer, we have installed a separate instance of Primo aimed at aggregating open access materials and making them accessible to the public. This dedicated portal will draw materials identified as open access from the Primo Central Index, a "meta-aggregation of hundreds of millions of scholarly e-resources of global and regional importance," including "journal articles, e-books, reviews, legal documents and more." We are currently working to have OAK Trust open access items harvested into Primo Central and made available alongside harvests from other institutional repositories. In establishing this Portal to Open Access Resources, we will also work to identify materials that are legitimately open access (gratis) and that meet basic quality standards. This poster presentation will discuss the technical aspects and policy decisions made during the design and implementation phase of the project, and show how the portal supports a Texas A&M University–K-12 school district partnership reforming science, technology, engineering, and mathematics (STEM) education.

Item: Did We Scan That Book Twice?: Weeding the Texas Tech Dark Digital Archive (2014-03-25)
Winkler, Heidi; Texas Tech University

The Texas Tech University Libraries' digital collections began in 2004 with the intent to digitize as many books as possible in the name of open access. By the fall of 2013, that mission had been revised to focus on the preservation of materials unique to Texas Tech. We decided it was not in the institution's best interest to devote resources to files in our digital dark archive that did not meet this mission. Using the HathiTrust catalog as our guide, we set out on an online trek to discover just how many digitized books being preserved on our servers were, in fact, distinct items not held elsewhere.
Along the way, we tackled questions of what to make accessible in our DSpace instance versus what to archive on our servers, and of just how unique "unique" really is. Weeding a library of digital resources requires a different process of consideration than the weeding of a physical library. Further, we used this project to refine our digital archiving and preservation practices, the most important refinement being the establishment of an archive change log.

Item: Digital Collections in a Small Archives: Using Google Services to Help Present and Promote An Oral History Project (2014-03-14)
Wolfe, Erin; University of Kansas

Providing online access to media collections, such as oral histories, can be challenging to do well, particularly for smaller institutions with limited resources. This presentation will focus on a recently completed project in which the Dole Archives leveraged freely available tools to provide access to a high-profile oral history collection in a variety of formats, including streaming audio/video, full-text searching capabilities, and a finding aid with direct links to digital content. By integrating Google services into our own website, the project benefits both from (a) local branding and exhibit/content hosting and (b) increased visibility of the materials to a wider audience through Google-based searches. Designed with end-user access in mind, it is our hope that this project will help to expand our audiences beyond the academic and be useful (and usable) for a variety of purposes, from K-12 student research to serving as a case study for future fundraising opportunities.
This presentation should be of interest to institutions looking for a low-cost approach to providing online access to media collections, or to those interested in seeing a new approach to using web-based tools to provide access to archival materials.

Item: Digital Repository for Beach Management Data (2014-03-25)
McElfresh, Laura; Baca, David; Texas A&M University at Galveston

The Galveston Island Park Board of Trustees, a governmental entity created in 1962 by the Texas Legislature, is responsible for preserving and promoting the Island's natural resources, including its beaches. The Park Board produces data and documents -- studies, reports, policy advisories, and other information -- which may not necessarily fall under the purview of government document depository mandates, but should still be openly accessible to citizens. Texas A&M University at Galveston, as an institute dedicated to higher education and scholarship in the marine sciences, marine engineering, and maritime professions, is a natural home for this kind of scientific and economic information. In January 2014, the Jack K. Williams Library at Texas A&M - Galveston and the Galveston Island Park Board of Trustees formed a partnership to create a repository for the preservation and open sharing of these documents. This brief presentation will outline our progress to date.

Item: Digitizing San Antonio's LGBTQ publications: A Portal to the City's Queer Past (2014-03-25)
Gohlke, Melissa; University of Texas at San Antonio

Too often in the past, records of gay, lesbian, and transgender persons have been discarded or destroyed, sending important filaments of history into the trash bin of time.
Fortunately, the queer publications that survive provide vital glimpses into the evolution of the communities that produced them and are an important source for ascertaining how gay, lesbian, and transgender organizations and individuals perceived and reacted to the world around them, built communities, and captured the pulse of their evolving culture. As interest in queer history and culture grows, efforts to collect, preserve, and digitize LGBTQ materials have intensified. The long-term benefits of preserving queer records such as LGBTQ serials through the digitization process cannot be overstated. As more materials are digitally preserved and made available, opportunities for access and conservancy are greatly expanded. This presentation will cover one such opportunity at the UTSA Libraries Special Collections. In 2012, UTSA Libraries Special Collections began a collaborative project with the Happy Foundation, a San Antonio non-profit GLBT archives. The project entailed digitization of several decades of queer periodicals housed at the foundation. This effort coincided with the purchase of a Zeutschel overhead scanner by the UTSA Libraries. The process included pickup, transport, digitization, and return of loaned periodicals and, finally, ingest of digital objects and metadata into CONTENTdm. Two challenges came to light during the project: (1) tracking down publication creators to secure permission to digitize items and make them available on the internet, and (2) handling content that might be perceived as extremely provocative, pornographic, or possibly offensive. At present, the UTSA Libraries Special Collections staff has digitized the bulk of local queer serials held at the Happy Foundation. These represent the basis of UTSA Special Collections' Digital GLBTQ Publications collection, which includes the Calendar, the Marquise, River City Empty Closet, Out in San Antonio, and San Antonio Community News.
WomanSpace and the Rainbow Garden Club newsletter, also included in the digital collection, are physical records held at UTSA Special Collections. While the Digital GLBTQ Publications collection features primarily San Antonio periodicals, issues of queer serials from elsewhere are also represented. Several issues of One magazine, the nation's first homosexual publication, are housed at the Happy Foundation and are available digitally through UTSA. Records donated by local and regional LGBTQ organizations and individuals, such as Lollie Johnson, the San Antonio Lesbian Gay Assembly, the Texas Lesbian Conference, and San Antonio activist Michael McGowan, augment UTSA Libraries Special Collections' digital holdings of queer publications and provide research opportunities for scholars, students, and members of the community.

Item: Digitizing the Fred Fehl Dance Collection (2014-03-25)
Weathers, Chelsea; Mitchell, Jordan; Roehl, Emily; Harry Ransom Center; University of Texas at Austin

The Harry Ransom Center's performing arts department holds two vast collections of photographs by Fred Fehl, a prolific mid-twentieth-century photographer of theater and dance based mainly in New York City. The Fred Fehl Theater Collection and the Fred Fehl Dance Collection each contain tens of thousands of 5 x 7 prints of various productions by multiple companies. For the past six months, a team of employees, interns, and volunteers has been working to digitize and catalog 5,000 of the 30,000 photographs in the Fred Fehl Dance Collection. Once digitized, the images and their metadata are uploaded to the Ransom Center's new digital collections website, which uses the platform CONTENTdm. Providing access to Fehl's photos of dance productions, which run the gamut from the classical offerings of the American Ballet Theatre to Martha Graham's groundbreaking modern dance, is a significant contribution to the fields of dance history, art history, cultural studies, and costume design.
No other online library or archive currently provides images of Fehl's photos in such breadth or depth, and the Ransom Center is in a unique position to do so because it holds the copyright to all of its Fehl photographs. To execute the complex task of preparing the photographs for digitization, the performing arts curator Helen Baer, her associate Chelsea Weathers, and graduate interns Jordan Mitchell and Emily Roehl developed a workflow that entails two main streams: one focused on the creation of consistent metadata, and the other on the digitization of the photographs. After the institution of the workflow, undergraduate work-study students and volunteers also began to contribute to the project. To date, nearly 1,500 photographs from three different dance companies have been uploaded via CONTENTdm to the Ransom Center's digital collections website. Access to this enormous collection of visual materials will be an invaluable resource for dance scholars, enthusiasts, historians, and the general public.

Item: Finding Roots, Gems, and Inspiration: Understanding Ultimate Use of Digital Materials (2014-03-14)
Thompson, Santi; Reilly, Michele; University of Houston; Central Washington University

The University of Houston Digital Library (UHDL) is the point of virtual access for digitized cultural, historical, and research materials for the university's libraries. UHDL developed a "digital cart" system (DCS) that allows users to download high-resolution images from its collections. The DCS records important information supplied by the user regarding the ultimate use of the downloaded images. Until now, no formal analysis of the transaction log for the DCS has been completed. This research is significant because little is known about the ultimate use of digital library materials. Current literature suggests that this problem is not uncommon among digital libraries around the world.
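Transaction-log analysis of this kind boils down to tallying the stated uses recorded at download time. The sketch below assumes a hypothetical log format (timestamp, item id, stated use); the field layout and use categories are illustrations, not the DCS's actual schema.

```python
from collections import Counter

# Hypothetical digital-cart transaction log entries.
LOG = [
    "2012-04-01|uhdl:123|genealogy",
    "2012-04-02|uhdl:456|classroom teaching",
    "2012-04-02|uhdl:123|genealogy",
    "2012-04-03|uhdl:789|publication",
]

def tally_uses(log_lines):
    """Count how often each stated ultimate use appears in the log."""
    return Counter(line.split("|")[2] for line in log_lines)

uses = tally_uses(LOG)
top_use = uses.most_common(1)  # the most frequently stated use
```

The resulting counts are what let researchers characterize who uses the collections and for what purposes.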
Our analysis begins to fill a critical gap in the professional conversation on digital libraries by directly contributing to the small body of literature asking who uses digital libraries and for what purposes. This presentation will outline how the researchers analyzed data from portions of the DCS transaction logs from 2010 to 2013. From this analysis, they will highlight some of the interesting and innovative ultimate uses by patrons. The researchers will discuss the study and offer audience members approaches for analyzing data to determine ultimate use and its ramifications inside and outside of the classroom.

Item: Flowcharting a Course Through Open-Source Waters, an eMOP guide to OCR (2014-03-14)
Christy, Matthew; Texas A&M University

The Early Modern OCR Project (eMOP), an Andrew W. Mellon Foundation funded grant project running out of the Initiative for Digital Humanities, Media, and Culture (IDHMC) at Texas A&M University, intends to use font and book history techniques to train modern Optical Character Recognition (OCR) engines. eMOP's immediate goal is to make machine readable, or improve the machine readability of, 45 million pages of text from two major proprietary databases: Eighteenth Century Collections Online (ECCO) and Early English Books Online (EEBO). More generally, eMOP aims to improve the visibility of early modern texts by making their contents fully searchable. The current paradigm of searching special collections for early modern materials by either metadata alone or "dirty" OCR is inefficient for scholarly research (Mandell, 2013). Now in year two, eMOP is turning towards one of its main goals: to produce a workflow, published in Taverna, for use by individuals and institutions with similar projects.
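One recurring step in a workflow like this is routing each OCR'd page by a quality score: good pages go on to correction, bad pages are set aside for diagnosis and re-processing. The sketch below illustrates such score-based triage; the threshold and route names are hypothetical, not eMOP's actual values.

```python
# Pages whose OCR quality score clears the threshold proceed to
# post-correction; the rest go to a triage stage for diagnosis,
# pre-processing, and eventual re-OCR'ing. (Assumed threshold.)
GOOD_ENOUGH = 0.75

def route_page(ocr_score: float) -> str:
    """Return the next processing stage for an OCR'd page."""
    if ocr_score >= GOOD_ENOUGH:
        return "post-correction"  # clean up remaining errors
    return "triage"               # diagnose, pre-process, re-OCR

routes = {page_id: route_page(score)
          for page_id, score in [("p1", 0.92), ("p2", 0.40)]}
```

The interesting engineering is of course in producing the score itself, which the presentation's flowcharts treat as a separate scoring method.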
Matthew Christy and Liz Grumbach, eMOP Co-Project Managers for Year Two, will present a series of interconnected workflows that represent the work being done by eMOP and give an idea of how eMOP's work will benefit the library, and larger academic, communities. Our presentation will include flowcharts covering:
- Wrangling the eMOP data and metadata. Our data set consists of the 45 million pages that make up the Eighteenth Century Collections Online (ECCO) and Early English Books Online (EEBO) commercial databases, as well as over 46,000 hand-transcribed texts from the Text Creation Partnership (TCP). We have created our own database and query/download tools to manage and access that data.
- The eMOP Font History database being created. This database is based on parsing the natural-language imprint lines of every document in EEBO.
- Training Tesseract. We have developed our own tools and methods to optimize training of Google's open-source OCR engine Tesseract for work on pre-modern printed texts.
- The eMOP controller. The controller is a software process that manages work from OCR'ing through the scoring of results.
- The eMOP post-processing process. This process will score OCR results per page and then decide which of two post-processes to route each page through. Pages that score well will be routed for further correction. Pages that score badly will be routed to a triage system, which will determine what is causing the page to fail OCR'ing and tag it for appropriate pre-processing to rectify problems and later re-OCR'ing.
- The eMOP post-processing scoring method.
- The process for training the machine-learning applications in eMOP's triage system.

We will conclude with information on where to find out more about eMOP, as well as our open-source code and workflows.

Item: Harvesting Quality: Evaluating Metadata for Digital Collections (2014-03-25)
Biswas, Paromita; Western Carolina University

Metadata creation practices for digital library projects vary widely among libraries.
Digital library projects often have to deal with multiple metadata creators, new formats and resources, and dynamic metadata standards for different communities (Park & Tosaka, 2010). As a result, while accuracy and consistency in metadata are prioritized by field practitioners, metadata records created for specific digital projects may lack the quality needed to support successful end-user resource discovery and access. Park and Tosaka's survey of metadata quality control in digital repositories and collections reveals that digital repositories often rely on periodic sampling or peer review of original metadata records as mechanisms for quality assurance (Park & Tosaka, 2010). This poster proposal presents another means of running quality checks on metadata created for digital projects, based on Hunter Library's experience with the WorldCat Digital Collection Gateway, a tool used for harvesting metadata for digital collections into WorldCat. Hunter Library's digital collections are described using Dublin Core in CONTENTdm, and the Library has recently started harvesting its collections into WorldCat using the Gateway. During harvesting, the Gateway by default places the names of "creators" and "contributors," recorded in separate fields in the local metadata environment, into one broad "Author" field for WorldCat users. A cursory review of this "Author" field in WorldCat for several harvested items from one of the library's collections revealed an unexpected presence of corporate body names alongside personal names. This led to an evaluation of how the "creator" and "contributor" fields had been used in that collection. The "Frequency Analysis" feature in the Gateway proved particularly useful in this evaluation, since it provided a breakdown of each field in a particular collection by the values used in that field and the number of times each had been used.
For example, high-frequency usage of a particular name indicated that the usage had not been a random mistake but had been consistent. A subsequent analysis of the library's digital collections' metadata using "Frequency Analysis" revealed that for some collections, the "contributor" field had been used to record entities whose roles in relation to the item described ranged from publisher and printer to editor or recipient of a letter. However, the library's then-current metadata schema had limited the definition of the "contributor" field to entities with a direct but secondary role in the creation of an item, such as editors or illustrators. This discrepancy between the library's metadata schema and the actual usage of the "contributor" field led to a redefinition of the "contributor" role. The schema now incorporates the plethora of roles that "contributors" can have in relation to an item and recommends that the role of each "contributor" be explained in the "description" field to account for this diversity. Updating the schema has thus promoted consistency in recording the "contributor" field across the library's digital collections, while also potentially benefitting users searching for an item by the various names associated with it.

Item Inside the Digital Public Library of America(2014-04-15) Cohen, Dan; Digital Public Library of America

DPLA Executive Director Dan Cohen goes behind the scenes to discuss how the DPLA was created, how it functions as a portal and platform, what the staff is currently working on, and what's to come for the young project and organization.

Item Introducing Piper, a Repository-Agnostic Batch Deposit Tool(2014-03-13) Cooper, Micah; Creel, James; Hahn, Doug; Herbert, Bruce; Huff, Jeremy; Li, Yu "Lilly"; Maslov, Alexey; Potvin, Sarah; Texas A&M University

Abstract: Applications developers and librarians from the Texas A&M University Libraries will introduce Piper, a repository-agnostic content deposit tool.
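The kind of frequency analysis described above can be sketched briefly. This is a hypothetical illustration of the general technique (tallying distinct values per field so that consistent but unexpected usage stands out), not the Gateway's actual implementation; the field names and sample records are invented.

```python
# Illustrative sketch of a per-field frequency analysis for metadata quality
# review. Field names and records are made up for the example; they are not
# Hunter Library's data or the Gateway's output format.
from collections import Counter

records = [
    {"creator": "Smith, Jane", "contributor": "Acme Printing Co."},
    {"creator": "Smith, Jane", "contributor": "Acme Printing Co."},
    {"creator": "Doe, John", "contributor": "Brown, Ann (editor)"},
]

def field_frequencies(records, field):
    """Count each distinct value used in the given metadata field."""
    return Counter(r[field] for r in records if field in r)

# A corporate body appearing repeatedly in "contributor" suggests consistent
# (not accidental) usage that may conflict with the schema's definition.
for value, count in field_frequencies(records, "contributor").most_common():
    print(f"{count:3d}  {value}")
```

The same tally, run against each field of a collection, is essentially what made the schema/usage discrepancy visible.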
In addition to providing background on the impetus behind its creation and the intended and anticipated user base, we will demonstrate the tool and explain the process of its development. Impetus. Prior to Piper's deployment, batch loads to our DSpace institutional repository were handled primarily by one developer in the Digital Initiatives unit. DSpace affords various workflows for single-item submission, but batches of items must be loaded via the command line on the DSpace server. This server can be an extremely sensitive environment in large organizations whose business cases require backups, firewalls, and high uptime. As part of the workflow for batch loads, which came from diverse sources both inside and outside the Libraries, the developer had engineered procedures for metadata quality control prior to deposit. The developer frequently confronted batch loads with missing files or with incomplete or ill-formed metadata. Design and goals. The initial goal of Piper is to allow greater flexibility in our metadata workflow and to enable a small group of non-technical staff to perform batch loads. The tool empowers staff with the privileges to assemble, check, and deposit batch loads through a graphical user interface. A central feature of Piper is its ability to validate metadata and files prior to deposit. The tool relies on a suite of automated and customizable verifiers to confirm that metadata are properly encoded and that files are correctly specified. In its first phase, Piper is designed to mimic the work of the developer who previously performed this work, with procedures for validating metadata and files and the flexibility to upload multiple content files and specialized licenses as part of item records.
Once Piper has been honed for use by this specialized group, we plan to expand its functionality and to facilitate and promote its use by the larger Texas A&M community, as part of ongoing efforts to populate our repository with open access publications. We have developed Piper in an iterative process whereby the customer chooses which features and fixes will be handled in each cycle (typically two weeks) and accepts or rejects the implementations after live testing and a demonstration at the end of the cycle. These practices are informed by the Agile school of project management popular in software development and other technical industries. In this way we seek to minimize wasted development on unneeded features and to enable continuous delivery of value to stakeholders.
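The verifier-suite idea described above can be sketched as follows. This is a hedged illustration of the general pattern (run every verifier over every item, collect problems instead of failing mid-deposit), not Piper's actual code; the field names, rules, and data shapes are assumptions.

```python
# Illustrative sketch of a pre-deposit validation suite in the spirit of
# Piper's verifiers. REQUIRED_FIELDS, the item structure, and the rules are
# assumptions for the example, not Piper's real configuration.
import os

REQUIRED_FIELDS = ("dc.title", "dc.date.issued")  # assumed required metadata

def verify_required_fields(item):
    """Flag required metadata fields that are absent or empty."""
    return [f"missing field: {f}"
            for f in REQUIRED_FIELDS if not item["metadata"].get(f)]

def verify_files_exist(item):
    """Flag content files listed in the item that are not on disk."""
    return [f"missing file: {p}"
            for p in item["files"] if not os.path.isfile(p)]

VERIFIERS = [verify_required_fields, verify_files_exist]

def validate_batch(items):
    """Map each failing item's id to its list of problems."""
    report = {}
    for item in items:
        problems = [msg for verifier in VERIFIERS for msg in verifier(item)]
        if problems:
            report[item["id"]] = problems
    return report

batch = [
    {"id": "item-1",
     "metadata": {"dc.title": "A Thesis", "dc.date.issued": "2014"},
     "files": []},
    {"id": "item-2",
     "metadata": {"dc.title": ""},
     "files": ["/no/such/file.pdf"]},
]
print(validate_batch(batch))
```

Because verifiers are plain functions in a list, staff-customizable rules can be added without touching the deposit logic, which matches the "automated and customizable" design goal described in the abstract.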