Machine Learning Tools for Audio-Visual Transcriptions, Captions, and Text Analysis in Digital Libraries
Abstract
Rapid advances in inexpensive or free-to-use artificial intelligence and text-processing applications now make it possible for digital libraries to produce affordable, relatively high-quality text derivatives (captions, transcripts, subtitles, translations, etc.) of many audio-visual (AV) materials held in repositories, and to expose these materials to a wider audience than would otherwise be possible. While not perfect, recently released systems produce outputs that often meet or exceed the accuracy of text-based OCR, and natural language processing on these outputs holds promise for generating metadata or performing other research-oriented tasks. Members of the UNT digital libraries team will discuss their recent work in this area, comparing output quality, costs relative to other creation methods, and resource commitments, and sharing other lessons learned along the way.