Texas Conference on Digital Libraries Proceedings
Permanent URI for this community: https://hdl.handle.net/2249.1/4513
Browsing Texas Conference on Digital Libraries Proceedings by Issue Date
Showing items 1-20 of 646
Item: The Texas Digital Library Preservation Network (2007-05-30)
Maslov, Alexey; Texas A&M University

The Texas Digital Library is a collaborative project between public and private institutions across Texas that aims to provide curation, preservation, and access to digital scholarly information for the State. The preservation component of this mission means that TDL is committed to the long-term maintenance of its digital assets. Accomplishing this goal necessitates the creation of a TDL-wide preservation network. An effective preservation solution would encompass the following characteristics:
• No single point of failure: by sharing copies of the same data among multiple geographically distributed locations, we ensure that the failure of any one location does not result in permanent data loss.
• Local allocation of resources: any member institution that joins the network retains full control over the utilization of the resources it commits to the network.
• Shared responsibility: responsibility for preserving digital assets is shared across all members of the network, eliminating reliance on any one institution's resources.
• Architectural flexibility: new locations can be added to the network efficiently, allowing for unforeseen growth.
The TDL Preservation Network is a current project that seeks to address these issues. To accomplish these goals, we have designed a system with the following layered architecture:
• User layer: represents the pool of users that have access to the preservation network system. This pool will be determined by the established policies and submission agreements at the institution level.
• Application layer: contains the set of applications that generate data for the network, such as institutional repositories, e-journals, courseware management systems, and faculty archives.
• Service layer: consists of a federation of data locations that implement preservation policies. This is the layer where the actual replication of data is performed and where agreements between locations are brokered and recorded.
• Storage layer: responsible for maintaining the individual copies of the preserved artifacts; it can be implemented with any number of standard technologies.
This presentation will describe the current progress toward the implementation of the TDL Preservation Network, and the long-term goals for data preservation in the Texas Digital Library.
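[Editor's illustration] To make the layering concrete, below is a minimal Python sketch of how a service layer might broker replication across storage locations. The class names and logic are invented for illustration; this is not the TDL Preservation Network's actual implementation.

    # Hypothetical sketch: a service layer replicating items across a
    # federation of storage locations (all names invented).
    from dataclasses import dataclass, field

    @dataclass
    class StorageLocation:                 # storage layer: holds one copy
        name: str
        holdings: dict = field(default_factory=dict)

        def store(self, item_id: str, data: bytes) -> None:
            self.holdings[item_id] = data

    class PreservationService:             # service layer: brokers replication
        def __init__(self, locations):
            self.locations = list(locations)

        def replicate(self, item_id: str, data: bytes) -> None:
            # Copy to every federated location: no single point of failure.
            for loc in self.locations:
                loc.store(item_id, data)

        def verify(self, item_id: str) -> bool:
            # Every location must hold an identical copy of the item.
            copies = [loc.holdings.get(item_id) for loc in self.locations]
            return all(c is not None and c == copies[0] for c in copies)

    # An application-layer system (e.g., an institutional repository) submits content:
    network = PreservationService(
        [StorageLocation("TAMU"), StorageLocation("UT"), StorageLocation("TTU")])
    network.replicate("hdl:2249.1/example", b"archival package bytes")
    assert network.verify("hdl:2249.1/example")
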
Item: Manakin Architecture: Understanding Modularity in Manakin (2007-05-30)
Phillips, Scott; Green, Cody; Maslov, Alexey; Mikeal, Adam; Leggett, John; Texas A&M University

Manakin is the second release of the DSpace XML UI project. Manakin introduces a modular interface layer, enabling an institution to easily customize DSpace according to the specific needs of a particular repository, community, or collection. Manakin's modular architecture enables developers to add new features to the system without affecting existing functionality. This presentation will introduce Manakin's modular architecture from a technical perspective, with an emphasis on extending Manakin's feature set to meet local needs. First, the project's goals will be introduced, followed by a discussion of Manakin's relationship with DSpace. Next, an architectural overview of the primary components will be given:
• DRI: The Digital Repository Interface (DRI) is an XML schema defining a language that allows aspects and themes to communicate. Manakin uses DRI as the abstraction layer between the repository's business logic and presentation. The schema is adapted for digital repositories through the use of embedded METS-based metadata packages.
• Aspects: Manakin aspects are components that provide features for the digital repository. These modular components can be added, removed, or replaced through simple configuration changes, enabling Manakin's features to be extended to meet the needs of specific repositories. Aspects are linked together to form an "aspect chain"; this chain defines the feature set of a particular repository.
• Themes: Manakin themes stylize the look-and-feel of the repository, community, or collection. The modular characteristics of themes enable them to encapsulate all the resources necessary to create a unique look-and-feel in one package. Themes may be configured to apply to a range of objects, from an entire repository down to a single page.
Finally, the presentation will close with a walkthrough of the Manakin architecture detailing how these components work together to form the Manakin framework.
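[Editor's illustration] As a concrete example of this modularity, aspect chains and theme rules are declared in DSpace's xmlui.xconf configuration file, roughly as below. The aspect paths and handle here are placeholders; consult the documentation of your DSpace release for the exact syntax.

    <xmlui>
      <aspects>
        <!-- Aspects execute in order, forming the "aspect chain". -->
        <aspect name="Artifact Browser" path="resource://aspects/ArtifactBrowser/" />
        <aspect name="Administration" path="resource://aspects/Administrative/" />
      </aspects>
      <themes>
        <!-- A theme can target a single community or collection by handle... -->
        <theme name="Geology Theme" handle="1969.1/2490" path="Geology/" />
        <!-- ...while a catch-all rule styles everything else. -->
        <theme name="Default" regex=".*" path="Reference/" />
      </themes>
    </xmlui>
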
Item: From books to bytes: Accelerating digitization at TTU Libraries with Kirtas BookScan APT 2400 (2007-05-30)
Lu, Jessica; Callender, Donell; Texas Tech University

In 2006, the Texas Tech University Libraries purchased from Kirtas Technologies Inc. a high-speed book scanner capable of scanning over 2,000 pages per hour. Funded in part with a $130,000 grant from the Lubbock-based Helen Jones Foundation, the purchase of the BookScan APT 2400 is the first of its kind by a university in the United States. The Kirtas scanner turns book pages with a vacuum head, delivering puffs of air that lift and separate pages. Books are secured on a cradle that uses laser technology to maintain focus for dual 16-megapixel cameras that capture high-resolution page images in color. Because it uses photographic capture rather than scanning technology, it operates faster than its scanning counterparts. Because the book cradle and automatic page-turning device form a mechanical setup that requires a clamp to hold down the pages, only books within a certain size range can take full advantage of the high-speed machine. However, TTU Libraries have successfully digitized tiny books with manual page turning, which still saves significant time and facilitates post-processing because of the dual-camera setup. The companion software (APT Manager) enables automated processing in which the user can set up templates to crop pages, remove clamps, de-skew, center, and sharpen images for the entire book. The default output of the book scanner is JPEG; the template specifies the file formats to which the software should convert. Once templates are set for each book, the processing instructions are saved to an XML file, and all the files can be run through "super batch" without human intervention, significantly increasing production. Technical metadata is automatically generated during the operation and saved to an XML file. Descriptive metadata can be retrieved through a catalog search or manual data entry and output to the designated content management system. The software package also includes an OCR (Optical Character Recognition) Manager that outputs to a variety of file types: PDF, Word, plain text, and XML. To preserve the look of the original, the "image over text" option allows users to see the PDF document as the original photographed image while enabling full-text searching with the underlying OCRed text. Super batch can be applied to the OCR process to speed up production. All the scanned files and associated metadata files are saved directly to SAN storage via a fiber connection. This acquisition has significantly accelerated TTU Libraries' digitization efforts. The Libraries already have a variety of book-scanning projects in the queue, ranging from rare books to theses and dissertations. Currently the digital lab is testing the new workflow and setup with a pilot project featuring donor materials. We look forward to sharing lessons learned with anyone interested.

Item: Developing a Common Submission System for ETDs in the Texas Digital Library (2007-05-30)
Mikeal, Adam; Brace, Tim; Texas A&M University; University of Texas at Austin

The Texas Digital Library (TDL) is a consortium of universities organized to provide a single digital infrastructure for the scholarly activities of Texas universities. The four current Association of Research Libraries (ARL) universities and their systems comprise more than 40 campuses, 375,000 students, 30,000 faculty, and 100,000 staff, while non-ARL institutions represent another sizable addition in both students and faculty. TDL's principal collection is currently its federated collection of ETDs from three of the major institutions: The University of Texas, Texas A&M University, and Texas Tech University. Since the ARL institutions in Texas alone produce over 4,000 ETDs per year, the growth potential for a single state-wide repository is significant. To facilitate the creation of this federated collection, the schools agreed upon a common metadata standard represented by a MODS XML schema. Although this creates a baseline for metadata consistency, ambiguity within the interpretation of the schema creates usability and interoperability challenges. Name resolution issues are not addressed by the schema, and certain descriptive metadata elements need consistency in format and level of significance so that common repository functionality will operate intuitively across the collection. It was determined that a common ingestion point for ETDs was needed to collect metadata in a consistent, authoritative manner. A working group was formed consisting of representatives from five universities, and a state-wide survey of ETD practice was conducted, with varied levels of engagement reported. Many issues were identified, including policy questions such as open access publishing, copyright considerations, and the collection of release authorizations; the role of infrastructure development such as a Shibboleth federation for authentication; and interoperability with third-party publishers such as UMI. ETD workflows at six schools were analyzed, and a meta-workflow was identified with three stages: ingest, verification, and publication. It was decided that Shibboleth would be used for authentication and identity management within the application. This paper reports on the results of the survey and describes the system and submission workflow that was developed as a consequence. A functional prototype of the ingest stage has been built, and a full prototype with Shibboleth integration is slated for completion in June 2007. Demonstration deployments of the application are expected in fall 2007 at three schools.
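[Editor's illustration] For readers unfamiliar with MODS, a skeletal ETD record under a common schema might look like the fragment below. The values are invented placeholders; the actual TDL profile defines the required elements and their interpretation.

    <mods xmlns="http://www.loc.gov/mods/v3" version="3.2">
      <titleInfo>
        <title>An Example Thesis Title</title>
      </titleInfo>
      <name type="personal">
        <namePart>Student, Example A.</namePart>
        <role><roleTerm type="text">author</roleTerm></role>
      </name>
      <genre>thesis</genre>
      <originInfo>
        <dateIssued>2007-05</dateIssued>
      </originInfo>
      <abstract>Placeholder abstract text.</abstract>
      <subject><topic>digital libraries</topic></subject>
    </mods>
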
Item: When they show up on your driveway with burning torches, make sure you have some marshmallows on hand (2007-05-30)
Dyal, Donald H.; Texas Tech University

Item: Shibboleth in the Texas Digital Library (2007-05-30)
Paz, Jay; Texas Digital Library

The Texas Digital Library (TDL) is taking advantage of the Shibboleth architecture to authenticate and authorize users of its services. Early this year we worked with UT Austin and Texas A&M University to create a TDL Federation and to set up Identity and Service Providers at both locations. It is our goal to have each participating institution join the TDL Federation and install an Identity Provider. This will allow each of the TDL services to be accessible to the entire set of member institutions. The presentation will describe the federation, the identity and service providers, and their current implementation within the TDL Repositories Project.
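[Editor's illustration] As a rough sketch of the service-provider side, protecting a repository path with the Shibboleth SP's Apache module looks approximately like this (1.3-era directive names; the path is a placeholder):

    <Location /xmlui>
      # Require a Shibboleth session established via the user's home IdP
      AuthType shibboleth
      ShibRequireSession On
      require valid-user
    </Location>

Attributes released by the Identity Provider (for example, eduPersonPrincipalName) then become available to the application for authorization decisions.
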
Item: Digital Archive Services (DASE) at UT Austin (2007-05-30)
Keane, Peter; University of Texas at Austin

Digital Archive Services (DASE) is a web-based application for managing digital images, sound files, video, documents, and web resources. In addition to the search interface and built-in presentation tools, DASE includes a set of simple, dynamic web services on top of which new applications can be built. Developed by Liberal Arts Instructional Technology Services at UT Austin, the initial goal for DASE was to provide a way for faculty members to get web-based access to image collections (both physical and digital) scattered around departments within the College of Liberal Arts. As we worked through specific issues regarding metadata schemes and desired search, browse, and save functionality, it became clear that other, more general issues related to the management of digital content could be addressed as well. Every attempt was made to keep the DASE application architecture and data model as simple as possible. Toward that end, and given the diversity of collections involved, we allowed each collection to define its own set of metadata attributes. Attributes are simply flagged as 'searchable' (or not), thus allowing efficient cross-collection searching. For more fine-tuned searching, users can go to an individual collection and quickly and easily browse all of the available attributes. For those instances when a standard metadata schema is necessary for proper interoperability with other systems, as in the case of RSS feeds, we simply "map" the attributes in the collection to the appropriate attributes in the RSS specification. Thus, collections that include audio or video files (as many now do) have a built-in means to provide "podcasting" functionality. In addition, by defining a simple set of web services (both RSS feeds and DASE-specific XML-over-HTTP), we have found new uses for DASE collections. DASE can easily serve the function of a database-in-a-box for web site developers who would like to add simple dynamic capabilities to media-rich web sites. In working on the DASE project, we have seen the same questions arise time and again: How do we get our digital content on the web so as to share it with students and colleagues? How do we manage the huge amount of new content being produced and discovered every day? How do we maximize the opportunities for "repurposing" our content? How do we organize and preserve our digital assets? All of these are questions that we have attempted to address with DASE. While DASE does not pretend to provide a single comprehensive solution, it does provide solutions to a myriad of immediate problems, and it minimizes risk in two ways. One, DASE is simple and offers a low barrier to entry. The technologies are all free and open source, and therefore can be implemented quickly and inexpensively. Even aside from the actual DASE application, the principles and architecture underlying DASE can be applied wholesale or in part to address the challenges of content management. Two, DASE is based on well-defined and open standards and exposes a clear and transparent architecture. Moving from DASE to some other system in the future should be a very simple and straightforward process.
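[Editor's illustration] A minimal sketch of the per-collection "searchable attribute" idea described above, using an in-memory model; the data structures and names are invented and are not DASE's actual schema.

    # Each collection defines its own attributes and flags which are searchable.
    collections = {
        "art-slides": {
            "attributes": {"Artist": True, "Period": True, "Slide No.": False},
            "items": [
                {"Artist": "Goya", "Period": "Romanticism", "Slide No.": "0412"},
            ],
        },
    }

    def cross_collection_search(term: str):
        """Match a term against searchable attributes in every collection."""
        term = term.lower()
        for name, coll in collections.items():
            searchable = {a for a, flag in coll["attributes"].items() if flag}
            for item in coll["items"]:
                if any(term in str(v).lower()
                       for a, v in item.items() if a in searchable):
                    yield name, item

    print(list(cross_collection_search("goya")))

Mapping a collection's attributes onto a fixed schema such as RSS is then a small dictionary translation, for example a collection's "Title" attribute feeding the RSS title element.
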
Item: A Manakin Case Study: Visualizing geospatial metadata and complex items (2007-05-30)
Maslov, Alexey; Green, Cody; Mikeal, Adam; Phillips, Scott; Weimer, Kathy; Leggett, John; Texas A&M University

Increasingly, repositories are responsible for preserving complex items and items with specific or unique metadata, such as geospatial metadata. These collections present unique challenges for the repository interface, and traditional approaches often fail to provide adequate visualization mechanisms. This presentation is a case study of a particular collection that exhibits a Manakin solution to both of these challenges. The Geologic Atlas of the United States is a series of 227 folios published by the USGS between 1894 and 1945. Each folio consists of 10 to 40 pages of mixed content, including maps, text, and photographs, with an emphasis on the natural features and economic geology of the coverage area.
Complex items: The current visualization model in DSpace offers a cumbersome browsing experience for complex items, as the default item view in DSpace is not optimized for items that contain more than a few bitstreams. The logical organization of the folio collection was as a single DSpace collection with 227 items, where each item contained multiple bitstreams representing each page of the folio. The result was an uninformative list of filenames, each linking to a very large (approximately 100 MB) image file. Manakin allowed us to create a new detail view for the folio items using an image gallery-style viewing interface. This new view has thumbnails for each page and lower-resolution surrogates for screen viewing. It also allows a viewer to download either the full archival-quality TIFF or a reduced-quality JPEG. The combination of thumbnail surrogates and the ability to see all pages of a folio at once increases the ease with which the collection is navigated and understood.
Unique metadata: The current DSpace interface is unable to leverage the potential of atypical metadata, such as the geospatial metadata attached to the folio collection. Although geographic elements were added to the DSpace metadata registry following Dublin Core Metadata Initiative (DCMI) recommendations, the only visualization mechanism DSpace could offer was a flat listing of the metadata values. Manakin allowed us to exploit the unique geospatial properties of the folio collection. It was determined that a map-based interface for browsing and searching would help a user to quickly determine the coverage area of a particular folio visually, as well as place the title in its geographic context.
Both of the challenges presented by this case study could have been addressed using the existing JSP interface. However, such an implementation would be awkward and impractical to create and maintain; furthermore, no mechanism exists to restrict such changes to an individual collection. Manakin's modular architecture made the creation of this interface achievable by a small team in a matter of days. Currently, the interface is available online at http://handle.tamu.edu/1969.1/2490, and has been featured as an Editor's Pick on Yahoo.com for its use of the Yahoo! Maps API.
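[Editor's illustration] For concreteness, the geographic elements mentioned above can be recorded with the DCMI Box and Point encoding schemes, along these lines in a DSpace metadata record. The coordinate values here are invented placeholders, not values from the collection.

    <dcvalue element="coverage" qualifier="spatial">Austin quadrangle, Texas</dcvalue>
    <dcvalue element="coverage" qualifier="box">
      northlimit=30.50; southlimit=30.25; westlimit=-97.75; eastlimit=-97.50;
      units=signed decimal degrees
    </dcvalue>
    <dcvalue element="coverage" qualifier="point">east=-97.625; north=30.375</dcvalue>

A map-based interface can parse these values to draw each folio's footprint and center point.
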
Item: Digital Initiatives at the University of North Texas Libraries (2007-05-30)
Hartman, Cathy Nelson; University of North Texas

Item: To Stream the Impossible Stream: Liberating the Texas Tech University Libraries' Sound Recording Collection (2007-05-30)
Thomale, Jason; Starcher, Christopher; Texas Tech University

The Texas Tech University Libraries' sound recording collection consists of more than 4,000 compact discs featuring art music, jazz, and folk music from around the world. The collection sees substantial use from students and faculty alike, but the medium on which the recordings exist is not optimally accessible: it requires patrons to come to the library building and allows only one patron to listen to a recording at a time. For this reason the collection was a prime candidate for incorporation into the Texas Tech digital library; as a digital library collection, it would be accessible anytime, anywhere via the web. Thus, the concept for the Streaming Sound Collection (SSC) was born. In implementing the SSC, the project team faced a wide variety of challenges common to many digital-collections-building projects; the ways in which the team overcame them are instructive for others embarking on similar journeys. The initial complication was the most obvious: copyright. In an age when corporations feel compelled to prosecute children and the elderly for relatively minor offenses, it was hardly an issue that a large state university could ignore. It was imperative that the content be protected. There were two areas of concern on which we could not equivocate: who shall have access, and what type of access they shall have. These two issues drove many of the decisions that were made, including such crucial decisions as the format and delivery mechanism of the content. The objects that make up the SSC are not simple. Providing access to them so that they would be both findable and usable was a key consideration in building the collection. The initial step toward this end was to decide on the system where the objects would reside. The project team first considered putting them in the catalog, and later toyed with contracting a programmer to invent a completely customized web application, but both of these solutions proved untenable because neither comprehensively served the complete set of library needs, digital library needs, and collection needs. In the end, the project team developed a solution that successfully balanced all of these sets of needs. Efficiently creating quality metadata for the collection was the third major challenge. Jane Greenberg, E. D. Liddy, and others have deftly described this as the "metadata bottleneck." Indeed, if one views metadata creation like library cataloging, in which a trained expert must carefully examine an object and use an arcane syntax to record minute details about it, then the process quickly gums up what might otherwise be an efficient project. The SSC project, however, by the way it leverages existing catalog records and workflows, serves as an example of how creative automatic metadata processing can help widen the bottleneck. It also demonstrates how an early understanding of the collection's metadata needs, and foresight about how one might process existing data, helped the resulting metadata become more than the sum of its MARC.
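[Editor's illustration] As an example of the kind of automated processing that can widen the bottleneck, the sketch below derives simple descriptive metadata from existing MARC catalog records using the pymarc library. The field choices and mapping are illustrative assumptions, not the SSC's actual crosswalk, and the input filename is a placeholder.

    from pymarc import MARCReader

    def marc_to_dc(record):
        """Map a few common MARC fields to Dublin-Core-style keys."""
        def get(tag, code):
            # record[tag] is None when the field is absent.
            return record[tag][code] if record[tag] else None
        return {
            "title": get("245", "a"),       # title statement
            "creator": get("100", "a"),     # main entry, personal name
            "subjects": [f.format_field() for f in record.get_fields("650")],
            "publisher": get("260", "b"),   # imprint
        }

    # 'sound_recordings.mrc' stands in for a file of exported catalog records.
    with open("sound_recordings.mrc", "rb") as fh:
        for record in MARCReader(fh):
            print(marc_to_dc(record))
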
Item: Map and GIS Resources in an Institutional Repository: Issues and Recommendations (2007-05-30)
Weimer, Kathy; Texas A&M University

Map librarians are increasingly digitizing maps and making the scanned images available over the internet. These digitized map collections are growing quickly in size and number, but approaches to access and long-term preservation for them are still in the early stages. Libraries suffer from a communication gap between the groups actively scanning maps and their IR staff; this is evident in the number of map scanning registries that are part of neither an IR nor a larger digital library initiative. The registries are increasing in number and both overlap and compete with each other. The benefit of an IR over both a basic web presentation and a digitized map registry is clear, owing to Google Scholar search capability and, for repositories configured as OAI-PMH data providers, freely harvestable metadata. CNI conducted a survey to assess the deployment of IRs in the United States; among its findings were that nine repositories had map materials in their IR, and twelve more planned to include maps by 2008. One example of a successful collaboration between a map librarian and IR staff is the Geologic Atlas project at Texas A&M University Libraries. In 2004, the Texas A&M University Libraries deployed DSpace. The Libraries digitized and uploaded the complete 227-folio set of the Geologic Atlas of the United States to DSpace. Published by the USGS between 1894 and 1945, the atlas contains text, photographs, maps, and illustrations. This collection serves as a pilot project to study scientific map and GIS resources in an IR generally, and specifically the use of geographic coordinates in metadata for building a map-based search interface and the addition of GIS files in an IR environment. For this set, geographic coordinates were added to the metadata, including "coverage.spatial," "coverage.box," and "coverage.point." Fortunately, the maps in this set are very regular rectangles and coordinates were readily available. The map coordinates supported the creation of a Yahoo! Maps interface: each folio is located on a map of the US and can be readily found visually. The digitized maps are being converted into GIS files, which will be used to assess the feasibility of GIS resources in the IR. Several advanced geospatial data libraries can serve as models: NGDA (National Geospatial Digital Archive, UCSB and Stanford libraries), NCGDAP (North Carolina Geospatial Data Archiving Project), CUGIR (Cornell University Geospatial Information Repository), and the GRADE (Geospatial Repository for Academic Deposit and Extraction) project. These groups and others are tackling the issue of long-term preservation of GIS data in digital libraries. There are increasing numbers of map resources in digital libraries and IRs, and these maps serve an important role in communicating scholarly information. Map librarians should collaborate on scanning standards and metadata creation. Map librarians and digital library staff should increase their communication and collaborate in order to improve access to these collections.

Item: University of Texas at Austin's Texas Digital Library Bridge Group (2007-05-30)
Thompson-Young, Lexie; University of Texas at Austin

Item: Interoperability Options: Lessons Learned from an IMLS National Leadership Grant (2007-05-30)
Plumer, Danielle Cunniff; Texas State Library and Archives Commission

In 2005, the Texas State Library and Archives Commission received an IMLS National Leadership grant to develop, on behalf of the Texas Heritage Digitization Initiative, a multi-component federated search tool that can search across digital collections of cultural heritage materials in Texas libraries, archives, and museums. Successful digitization projects in other states have focused on creating one or more centralized repositories of electronic resources and associated metadata. In contrast, the THDI project provides a single interface to decentralized repositories across the state. This interface, which will be available June 1, 2007 at http://www.texasheritageonline.org, has three components: a federated or broadcast search application, which uses Z39.50 to interact with library systems in real time; an OAI harvester operated by the University of North Texas Libraries to harvest metadata from institutions that do not have a Z39.50 front end; and increasing support for other APIs such as those used by A9 and Yahoo! In the process of developing this application and connecting collections, the THDI development team has learned some useful lessons. In particular, this presentation will focus on OAI-PMH implementation issues and the need for sharable metadata. Many projects, including the National Science Digital Library, have reported on the difficulty of combining metadata from multiple institutions, even when common standards and controlled vocabulary sources are required. The solutions we have developed, including automated segmentation of OAI harvests and development of custom XSL transformations to map harvested metadata into common formats, are relevant to institutional repositories as well as to participants from the cultural heritage sector. The THDI development team has also gained experience working with lightweight search protocols, including SRU and RESTful APIs such as those from Yahoo! and A9, which are remarkably simple to implement when contrasted with the Z39.50 protocol still widely used in library catalogs. REST, or Representational State Transfer, is a stateless, cacheable client/server architecture that allows collections to share data over HTTP. Because THDI has worked with a wide variety of institutions, we are confident that this approach is both scalable and transferable to other types of digital library architectures. In a state the size of Texas, digital libraries cannot be "one size fits all." Instead, they must be flexible, adaptable, and offer institutions local control. The lessons learned from the THDI IMLS National Leadership Grant can help institutions develop new models of collaboration and distributed interaction in digital library development.
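[Editor's illustration] For readers building a similar harvester, the core OAI-PMH ListRecords loop with resumption-token paging looks roughly like this in Python; the endpoint URL is a placeholder.

    import urllib.parse, urllib.request
    import xml.etree.ElementTree as ET

    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    BASE = "https://repository.example.edu/oai/request"  # placeholder endpoint

    def harvest(base_url, metadata_prefix="oai_dc"):
        """Yield <record> elements, following resumption tokens."""
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        while True:
            url = base_url + "?" + urllib.parse.urlencode(params)
            with urllib.request.urlopen(url) as resp:
                tree = ET.parse(resp)
            for rec in tree.iter(OAI + "record"):
                yield rec
            token = tree.find(".//" + OAI + "resumptionToken")
            if token is None or not (token.text or "").strip():
                break
            # Per the protocol, follow-up requests carry only the token.
            params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

    for record in harvest(BASE):
        identifier = record.find(OAI + "header/" + OAI + "identifier")
        print(identifier.text if identifier is not None else "(no identifier)")
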
Item: Energy Systems Laboratory: Building a Repository Collection and Planning for the Future (2008-06-09)
Koenig, Jay; Haberl, Jeff S.; Gilman, Don; Hughes, Sherrie; Texas A&M University; Energy Systems Laboratory

The Energy Systems Laboratory (ESL) is a division of the Texas Engineering Experiment Station and part of the Texas A&M University System. First established in 1939, the ESL maintains a testing laboratory on the Riverside Campus in Bryan, Texas, and offices on the main campus of Texas A&M. The group consists of five faculty members from the Department of Mechanical Engineering, as well as three faculty members from the Departments of Architecture and Construction Science. The lab currently employs approximately 120 staff members, including mechanical engineers, computer science graduates, lab technicians, support staff, and graduate and undergraduate students. The Lab focuses on energy-related research, energy efficiency, and emissions reduction, and has a total annual income from external research and testing exceeding $4.5 million. With energy research and policy at the forefront of public discussion, both academic and political, the urgency of making this research publicly available is very high. The Energy Systems Laboratory collection in the Texas A&M Digital Repository is unique in a number of ways. After first contacting the library in March 2005, the ESL became one of Texas A&M's earliest adopters of the repository. The collection is very diverse, containing conference proceedings, published articles, technical reports, and electronic theses and dissertations produced by students affiliated with the ESL. The ESL is also the first repository client to take the initiative of assigning staff members to learn the batch loading process for themselves, both relieving library staff of the burden and allowing the collection to expand even more rapidly. The collection has also successfully made the transition, despite some challenges, from the original DSpace interface to the Manakin-themed repository now in place. After three years, the collection remains one of the largest in the system, continues to grow as more of the group's research and publications are added, and is held forth as a model collection to prospective repository clients in the Texas A&M community. This is a testament to the Energy Systems Laboratory's dedication to building its repository collection and its clear understanding of the advantages of open access. This presentation will discuss the excellent working relationship built between the Energy Systems Laboratory and the library, and how such relationships can be fostered with other collections as the repository expands. It will also recount the events leading up to the ESL's original adoption of the repository, and will chronicle the evolution of the repository collection, the addition of new content, the transition and adaptation to new technology, the copyright and other challenges faced, and the group's future needs for additional tools and services.
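[Editor's illustration] For context on what "learning the batch loading process" entails: DSpace's batch importer consumes a Simple Archive Format, one directory per item, holding a dublin_core.xml metadata file, a contents manifest listing the bitstreams, and the files themselves. A minimal, hypothetical item might look like this (the exact importer invocation varies by DSpace version):

    esl_batch/
      item_0001/
        dublin_core.xml      # descriptive metadata for the item
        contents             # one bitstream filename per line
        report_esl_0001.pdf

where dublin_core.xml contains entries such as:

    <dublin_core>
      <dcvalue element="title" qualifier="none">Example ESL Technical Report</dcvalue>
      <dcvalue element="contributor" qualifier="author">Researcher, Example</dcvalue>
      <dcvalue element="date" qualifier="issued">2008-01</dcvalue>
    </dublin_core>
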
Item: The Digital Assembly Line: Running an effective and efficient digital lab (2008-06-09)
Perrin, Joy; Henry, Cynthia; Texas Tech University

Over the past year and a half, the Texas Tech University Libraries Digital Lab has experienced a huge learning curve. When the digital lab was first developing, the focus was on learning the equipment and the digitization process, developing workflows, and striving for quality. The lab typically ran with only 3-7 students, matching individual students to individual projects. This allowed detailed knowledge of the digitization process to be documented and workflows to be established for a variety of formats that could be produced from the lab. Quality was highly coveted, regardless of the lab's lack of speed. At this point, the library administration brought in project management training. This helped streamline how projects were planned and allowed employees to explore how a project would be implemented before actually beginning it. As the digitization efforts of the library increased, the number of projects increased, and therefore the demand on the lab increased as well. Consequently, the digital lab needed to explore new ways to become more efficient: by using human resources more effectively and by matching equipment to the digitization needs of the library. The lab effected change by increasing the number of students to 24-36 and by increasing the number of hours the lab was staffed. The lab was then forced to look at how students were assigned to projects. Instead of matching an individual student to an individual project, the lab moved to a system that uses the priority matrix for digital projects set by the DLI Team to assign a portion of the lab's hours to each project. At the same time, the lab looked into increasing productivity: workflows had been established for several formats, and the emphasis shifted from quality alone toward quantity; concurrently, the lab manager evaluated the equipment in the lab and identified ways to streamline efficiency by increasing the number and speed of the computers in the lab. In conclusion, the Texas Tech University Libraries Digital Lab has grown significantly over the last year. We have moved from exploring how to develop digital projects to running a fully functioning, effective digital assembly line.
Item: Supporting a Research Agenda: Using Library Funds for Access to Datasets in Management and the Social Sciences (2008-06-09)
Safley, Ellen; Venetis, Mary Jo; University of Texas at Dallas

Over the past eight years, the University of Texas at Dallas Libraries made a concerted effort to support the quantitative research of the faculty through the licensing of social and business datasets. For research efforts to be competitive, the faculty in the School of Management requested access to standard financial datasets so they could explore questions posed by larger, more established business programs. If the faculty were unable to use the same data resources at the same level of observation, their research would not be credible. Rather than purchase more monographs, the faculty provided a strong argument for using some funds to improve their research quality. While their use of journal collections was strong, access to datasets was deemed essential for quantitative programs. Secondly, the School of Economics, Political and Policy Sciences wanted access to social and country macroeconomic data to support programs in geographic information systems, political science, criminology, and economics. In addition to supporting faculty and graduate student research, the students developed skills that could be used in their future work on real problems affecting cities and social organizations. Finally, a Data Librarian was added to the library staff to help customers gain access to the files and to market the products. The Librarian worked individually with faculty and students to find the appropriate files, gain access to various platforms, and show them how to extract and organize the datasets; without someone in this role, the resources would be underutilized. The Library developed the expertise to negotiate licenses for dataset resources. Since most datasets are licensed by commercial firms, the contracts differ considerably from library norms and extended negotiations can occur. In addition, determining how the datasets will be used and controlled requires cooperation between the Library and the deans of the programs. The School of Management developed an internal committee of faculty members to reduce duplication of datasets, to set priorities among products, and to work with the Library to manage the acquisition process. Rather than responding to individual requests from faculty members, the Library deals directly with one representative from the committee. The success of the Data Services program elevates the quality of the work of the faculty and the University, provides a means for the Library to partner with them and share the collective organizational expertise of the librarians, and recognizes what information is needed to research a problem in the 21st century. In 2008, the Library received special recognition from the Southern Association of Colleges and Schools for allocating funds to acquire and license datasets and for recognizing the need to provide information through a variety of means. In the future, the Library strives to incorporate the acquisition and storage of research datasets into its mission and to archive the research within the Texas Digital Library.
Item: Embedding a Digital Repository within the Texas A&M University Library Web Services (2008-06-09)
Leggett, John; Tarpley, Jeremy; Ponsford, Bennett; Phillips, Scott; Mikeal, Adam; Maslov, Alexey; Messinger, Tina; Armstrong, Tommy; Creel, James; Texas A&M University; Texas Digital Library

The development and deployment of the Manakin theme for the digital repository at Texas A&M University provides an informative case study in embedding DSpace repositories within an institutional web presence. Last year, the Texas A&M University Libraries began a redesign of the existing web interface in accordance with a new institution-wide branding initiative. A collaborative effort between administrators, designers, and developers has yielded a look and feel for the institutional repository that integrates seamlessly with the library's and university's other web services while providing the unique functionality required by various and diverse collections. The use of Manakin themes ensured that the development process was modular and employed well-established web development techniques and technologies. The design of the digital repository theme began with consultations between library designers and TAMU branding authorities. The designers used Photoshop to produce mock-up pages for primary use cases, with colors, fonts, and graphics that adhered to the institutional branding mandates while satisfying usability heuristics. These designs underwent iterative refinement with comments from administrators and developers. When all parties were satisfied, the design team translated the images into HTML and CSS mock-ups for web browser rendering. Designers handed off the HTML code to the Manakin theme developers, who coded XSL to produce such HTML from the XML DRI data generated by the repository. Developers wrote additional JavaScript to implement the UI vision of the designers. Developers produced two Manakin themes of different specificity: a theme for the repository in general, and one that applies specifically to the Geologic Atlas of the United States map collection. That theme, known as "Geofolios," employs the Yahoo! Maps API and Google Earth overlays to allow patrons to browse the collection in the context of manipulable maps indicating the geographic context of the folios. In summary, embedding the digital repository in the institutional web presence required no more effort than other XML-based content would have. The pre-development design process and the use of XSL transforms are standard practices in institutional web development. Manakin's ability to apply themes to specific content enabled a clean separation of development between the Geofolios theme and the general theme, and augmenting additional collections with customized interfaces in the future would be a similarly modular activity. Importantly, the use of Manakin themes provides a seamless integration between the repository and the library's existing web presence, reducing patrons' cognitive overhead in navigating between the repository and other services.
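[Editor's illustration] To give a flavor of the designer-to-developer hand-off, a Manakin theme's XSL overrides templates that match elements of the DRI schema (namespace http://di.tamu.edu/DRI/1.0/) and emits the branded HTML. The fragment below is a simplified sketch: the class names are placeholders, and real themes import the reference stylesheets rather than standing alone.

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:dri="http://di.tamu.edu/DRI/1.0/">
      <!-- Render each DRI division as a branded HTML section. -->
      <xsl:template match="dri:div">
        <div class="tamu-section">
          <xsl:if test="dri:head">
            <h2><xsl:value-of select="dri:head"/></h2>
          </xsl:if>
          <xsl:apply-templates select="*[not(self::dri:head)]"/>
        </div>
      </xsl:template>
    </xsl:stylesheet>
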
Item: Preservation of the Texas Agricultural Experiment Station Bulletin in the Digital Repository (2008-06-09)
McGeachin, Rob; Texas A&M University

The 'Bulletin of the Texas Agricultural Experiment Station' is being digitized for preservation, archived in the Texas A&M University Digital Repository, and made accessible to a worldwide audience. This bulletin series, which began publication in 1888, covers a wide variety of agricultural research reports. Subjects include grain, forage, fruit, and nut crop varieties; animal production and feeding; veterinary medicine; agricultural engineering and innovation; agricultural economics; and other information of scientific and historical value. Many bulletins contain photographs and figures with historical significance, for example, some of the first crop dusting airplanes used in 1925 and early cotton picking machinery from the 1920s. Each page of the original print bulletin is scanned with an OpticBook 3600 book edge scanner and saved as an archival grayscale tagged image file format (TIFF) file at 400 dpi for text pages or 600 dpi for figure or illustration pages. The page images are combined into a PDF document, and optical character recognition generates searchable full text. Dublin Core metadata records are created for each bulletin, and National Agricultural Library Thesaurus subject terms are added to the records. The TIFF files, PDF file, and metadata record for each bulletin are uploaded to the Digital Repository operated by the Texas A&M University Libraries. In addition to the search functionality of the repository, the metadata records are harvestable by web crawlers and incorporated into many other web search indexes, making them more discoverable worldwide. This digital content is also being created in support of a distributed effort among academic libraries to provide content for a National Digital Library for Agriculture.

Item: Using Rich-Media in Digitization Education: GLIFOS-media toolset projects at the School of Information (2008-06-09)
Stewart, Quinn; Arias, Rodrigo; University of Texas at Austin; GLIFOS

This presentation will examine the use of the GLIFOS-media toolset in three classes at the School of Information at The University of Texas: "Creating and Using Digital Media Collections," "Advanced Digitization: Creating Sustainable Collections," and "Understanding and Serving Users." Each of these courses involves students working with digitized materials to create rich-media access copies using XML-based tools created by GLIFOS. An ongoing problem for both tenured and tenure-track faculty at the School of Information is not only staying current with digital library technologies, but also teaching and implementing them in the classroom. As part of an IMLS-funded digitization curriculum, Dr. Grete Pasch and Quinn Stewart co-developed the "Creating and Using Digital Media Collections" course in the spring semester of 2006. This course used the GLIFOS-media toolset to create an indexed, synchronized, searchable collection of 14 historic kinescope films of "The Mike Wallace Interview" (1957-58) from the Harry Ransom Center. Students of Dr. Gary Geisler did further work on the collection in spring 2007 and 2008, with the assistance of Quinn Stewart. Positive feedback from students and successful interaction with the software developers led to the inclusion of the "Texas Legacy Project" in spring 2007. This project is a collection of over 200 video oral history interviews with Texas conservationists, maintained by the Conservation History Association of Texas. Students indexed the content and synchronized the transcript with the video for 12 of those interviews, using the gmCreator rich-media creation tool. They then created full-size access versions of the interviews, each with a table of contents, synchronized transcript, annotation page, and search page. Based upon student and end-user feedback, and experience gained using this "co-instructor" model with 10-15 students, Stewart approached Dr. Phil Doty and Dr. Luis Francisco-Revilla about incorporating the Texas Legacy Project into the core course "Understanding and Serving Users." Using a tutorial-based teaching method, Stewart guided 54 students through preparing approximately 75 hours of video content for public use with the GLIFOS-media toolset. The Information Technology Lab in the School was set up to support the students in the two classes, and a workflow was created to handle the student-generated files. Output from the two classes was cataloged into the GLIFOS-media library, the rich-media digital library component of the GLIFOS-media toolset. Students could then search both within each interview and across all of the interviews. The gmCreator tool was also used to add geographic information to each of these interviews, allowing the viewer to simultaneously view the rich-media presentation and the geographic information using Google Earth. The same tutorial-based teaching method was used in the "Advanced Digitization: Creating Sustainable Collections" course taught by Ellen Cunningham-Kruppa. The video digitization portion of this course involved digitizing U-matic videotapes from the UT Ex-Students Association Distinguished Alumni Awards ceremonies for inclusion in the GLIFOS-media library at the School of Information.
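[Editor's illustration] The synchronization at the heart of these projects pairs time codes with transcript segments and index points. The XML below is a generic illustration of that idea only; the element names are invented, and this is not the actual GLIFOS-media file format.

    <!-- Illustrative time-coded transcript for one interview (invented format). -->
    <transcript media="interview_042.mp4">
      <segment begin="00:00:12" end="00:00:47" topic="Early career">
        (transcript text for this segment)
      </segment>
      <segment begin="00:00:47" end="00:01:30" topic="Coastal conservation work">
        (transcript text for this segment)
      </segment>
    </transcript>

A player reads the time codes to highlight the current passage as the video plays, and a search hit on the text can seek the video to the matching segment.
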
Item: The Texas A&M University Libraries Bridge Group: First Year Report (2008-06-09)
Goodwin, Susan; Weimer, Kathy; Koenig, Jay; Furubotton, Lisa; Jaros, Joe; McGeachin, Robert; Tucker, Sandy; Texas A&M University

The Texas A&M University Libraries Bridge Group was formed in 2007 and charged to "support the developing infrastructure of the Texas A&M University's and TDL's Repositories" by increasing awareness among library staff of the Texas A&M institutional repository and other Texas Digital Library services, and by promoting, facilitating, and supporting their use by the academic community at Texas A&M. The group consists of faculty librarians from a diverse set of backgrounds and functional areas within the library. As background, the Libraries' communication approach is decentralized, relying on each subject librarian, in their role as liaison, to communicate services to their assigned departmental areas. This approach extends to communicating the concepts of open access, scholarly communication, repositories, and TDL services, so it is imperative that all librarians have a certain level of understanding of the issues. This report will chronicle the building of the group's two-part strategy: first, to educate themselves on the issues of repositories and scholarly communication, and second, to create a plan for and conduct library-wide educational initiatives. Further, the report will include an overview of the ARL/ACRL Institute on Scholarly Communication, which two of the group's members attended in December 2007, and the impact of that institute on the direction and focus of the group.