Browsing by Subject "web archiving"
Item Break, Drift, Rot: How Academic Librarians Can Weatherproof References in Electronic Theses and Dissertations (Texas Digital Library, 2023-05-17) Anders, Kathy
Electronic Theses and Dissertations (ETDs) suffer from reference rot in a manner similar to other scholarly publications (Massicotte and Botter), but involve a greater breadth of librarian involvement in their management, dissemination, and preservation. Indeed, reference rot in ETDs in disciplines where students are citing “web-at-large” (Klein et al.) material is a particular problem, in that web-at-large sources generally are not preserved and archived to the same degree as scholarly journal articles. Because of this, cited material in ETDs is prone to rot from a number of factors, ranging from links that do not resolve to substantial content drift. In an effort to mitigate reference rot in ETDs, a team of researchers from Texas A&M University and Los Alamos National Laboratory came together to consider how to address the issue through socio-technical interventions, melding technical solutions (permalinks, web archiving, and, ideally, Vireo integration) with human awareness (instruction to authors). This presentation will discuss the researchers’ in-progress work on how both types of interventions can be deployed at academic libraries to help create ETDs that are more resistant to reference rot. While the particular focus of this presentation is on ETDs, it will intersect with topics in digital preservation and web archiving.
Mia Massicotte and Kathleen Botter, “Reference Rot in the Repository: A Case Study of Electronic Theses and Dissertations (ETDs) in an Academic Library,” Information Technology and Libraries 36, no. 1 (2017): 11–28, https://doi.org/10.6017/ital.v36i1.9598.
Martin Klein, Herbert Van de Sompel, Robert Sanderson, Harihar Shankar, Lyudmila Balakireva, Ke Zhou, and Richard Tobin, “Scholarly Context Not Found: One in Five Articles Suffers from Reference Rot,” PLoS ONE 9, no. 12 (2014), https://doi.org/10.1371/journal.pone.0115253.
Item Intro to Web Archiving Texas: Web Archiving in the Library: Policies, Procedures, and Program Integration (Texas Digital Library, 2019-10-22) Barcelona, Leanna; Bondurant, John; Lamphear, Anna; Phillips, Mark
Item Intro to Web Archiving Texas: Web Archiving Sampler: Examples from Texas Institutions (Texas Digital Library, 2019-09-05) Barcelona, Leanna; Bondurant, John
Item Intro to Web Archiving Texas: Web Archiving Technology, Tools & Resources (Texas Digital Library, 2019-08-29) Mumma, Courtney; Ko, Lauren; Phillips, Mark
Item Session 3B | Quantifying and Qualifying the Loss of Web-Based Evidence in Graduate Performance Studies Theses (Texas Digital Library, 2021-05-26) Budzise-Weaver, Tina; Christie Anders, Kathy
How is web-based evidence referenced and documented in Performance Studies master's theses? How can Performance Studies, with its framework for analyzing performances in context and its heightened awareness of ephemerality, inform our understanding of evidence lost to link rot? Relying on a corpus of born-digital Texas A&M University theses, we analyze documentation practices and discuss the implications of lost evidence for these graduate publications.
We will ground our meditation on these questions with observations about citation practices and the potential for intervention, with an eye towards a generalizable and potentially automated approach to guarding against rot in electronic theses.
Item Session 3G | Collaborating to Build Web Archives in Texas (Texas Digital Library, 2021-05-26) Ko, Lauren; Manning, Mary; Rojas, Katie
A collaboration-focused subcommittee of TDL’s Web Archiving Texas Interest Group (WATXIG) will present their work gathering data about the web archiving interests, practices, and tools used in Texas. The working group has also explored tools for site and seed nomination as a means to facilitate collaborative collecting. Understanding what others are collecting identifies gaps and highlights underrepresented topics, allowing institutions to build more diverse and geographically representative collections. We will also share information about the other work that WATXIG is doing, including an Archive-It discount available to TDL members, and how you can help us shape web archiving in Texas.
Item Systems Interoperability and Collaborative Development for Web Archiving - Filling Gaps in the IMLS National Digital Platform (2016-05-25) Mumma, Courtney; Phillips, Mark; Internet Archive; University of North Texas
The Institute of Museum and Library Services (IMLS) awarded a National Leadership Grant, in the National Digital Platform category, to a proposal by Internet Archive’s Archive-It, Stanford University Libraries (DLSS and LOCKSS), University of North Texas, and Rutgers University. The $353,221 grant will support the project “Systems Interoperability and Collaborative Development for Web Archiving,” a two-year research project to test economic and community models for collaborative technology development, prototype system integration through development of Export APIs, and build community participation in web archiving development and new research and access tools.
The project supports the National Digital Platform funding priority of IMLS by increasing access to shared services and infrastructure while building capacity for broader community input in technology development. Project outcomes will promote system integration, facilitate increased distributed preservation of archived data, and help support new global and local access models possible through export APIs, with an eye towards modeling post-grant interoperable systems architectures. Archive-It’s status as widely used, shared web archiving infrastructure ensures broad community impact and makes possible the involvement of institutions of all sizes in project work. The involvement of Stanford University Libraries builds on their work in the Hydra community and with digital preservation services. UNT contributes experience in digital library and web archiving technology development, and Rutgers’ work on research uses of web archives ensures the involvement of downstream user communities. Overall, the project will lay the groundwork for future collaboration around interoperability that will enhance the integration of disparate systems, increase local preservation, and improve the discoverability and use of web archives. Mark Phillips of UNT and Courtney Mumma of IA will describe the grant and provide an update about the work completed in the first six months of activity.
Attendees will be invited to participate in an active and growing community, a key component in the grant’s success and the work’s sustainability.
Item Web Archives and Large-Scale Data: Preliminary Techniques for Facilitating Research (2012-05-25) Woodward, Nicholas; Norsworthy, Kent; Texas Advanced Computing Center; University of Texas at Austin
The Latin American Government Documents Archive (LAGDA) is a collaborative project of the University of Texas Libraries, The Nettie Lee Benson Latin American Collection, and the Latin American Network Information Center (LANIC) at The University of Texas at Austin that seeks to preserve and facilitate access to a wide range of ministerial and presidential documents from 18 Latin American and Caribbean countries. Web crawling is conducted quarterly using the Internet Archive’s Archive-It application. The resulting Archive contains copies of the Websites of approximately 300 government ministries and presidencies between 2005 and the present. Currently, LAGDA comprises approximately 66.6 million documents archived from the Internet, totaling 5.6 terabytes of data. The collection increases in size by an additional 250 gigabytes with each quarterly crawl. Content in the Archive includes not only the full-text versions of official documents, but also original video and audio recordings of key regional leaders, all archived in the ARC file format produced by the Heritrix web crawler. Archive contents include thousands of annual and "state of the nation" reports, plans and programs, and speeches by presidents and government ministers. The data include HTML-formatted pages, Microsoft Word documents, Adobe PDF files and RTF documents, as well as various audio and video formats. The collection includes only sparsely populated metadata.
Promoting research on the collection is a central component of the LAGDA project, and toward that end staff have collaborated with researchers at the Texas Advanced Computing Center (TACC) using the LAGDA data to develop text-mining methods for document representation and classification. This includes implementing several strategies to mechanically classify and categorize information contained in the Archive in order to facilitate search and browse capabilities. Additionally, LANIC and TACC have worked together to create methods for research on sub-collections in the Archive, e.g., presidential speeches or ministerial documents. Preliminary results of these efforts have been encouraging, and they are the initial steps on the path towards solutions that will make large-scale data more accessible to researchers. The challenges presented in LAGDA are similar to those faced by academic libraries across the country as they increasingly confront “big data” collections that necessitate new strategies for data analysis tools. Nascent projects such as LAGDA provide some initial insights into how academic libraries can work collaboratively to facilitate research on the types of large-scale collections that are increasingly prevalent in today’s digital world. The presentation will focus on the following components: challenges presented by Web archived data; “big data” and data-driven research; the role of libraries in data analysis; and the future of “big data” and libraries.