User-Configurable Discovery Across Collections

Date

2019-05-22

Authors

Bolton, Michael
Creel, James
Day, Kevin
Hahn, Doug
Huff, Jeremy
Laddusaw, Ryan
Savell, Jason
Welling, William

Publisher

Texas Digital Library

Abstract

A persistent challenge facing the digital library community is how to provide a single discovery interface for a large set of heterogeneous digital collections. To address this challenge, the development team at Texas A&M University Libraries has been developing a new open-source application called SAGE (Search Aggregation Engine), available at https://github.com/TAMULib/sage. SAGE combines any number of Solr indices, crosswalks their fields, and generates one or more aggregated indices. SAGE is a Java web service built on a few abstractions: its Java model performs aggregation by way of “Jobs”, which in turn consist of “Readers” and “Writers”. Each of these Java entities is configurable through a browser-based user interface (UI). Each Reader specifies a source Solr core, a customizable Solr query that filters the core for the desired documents, and a configurable mapping from the core's schema to SAGE's internal metadata representation. In the complementary role, Writers map from SAGE's internal metadata representation to a destination Solr schema. A Job can consist of any number of Readers and Writers. Jobs can be triggered through the UI, an API call, or periodic scheduling. When a Job runs, it reads from all of its associated Readers, combines the results by mapping them to the internal metadata representation, and then passes the combined result set to each Writer, which converts the internal representation to the schema of its associated Solr core. The result is one or more Solr cores, each containing the filtered, crosswalked contents of the originating Solr cores.
In addition to these aggregation features, preliminary work has yielded a promising prototype for the dynamic creation of “Discovery View” landing pages. The “Discovery View” feature set can be used via the UI or the API, and enables dynamic creation of a custom UI for any given Solr core. The administrator may select which fields are exposed to users as searchable, facetable, and displayable in result metadata. We are not aware of any other existing solution that provides UI-based creation of discovery interfaces to arbitrary Solr indices. The plug-and-play arrangement of these features, combined with SAGE's interface-driven architecture, provides flexibility and opens the door to future enhancements. One possibility is drop-in processors that transform the aggregated results before they are written. The application also invites the possibility of reading from and writing to non-Solr sources, such as MARC. SAGE's ability to combine indices and expose them through the UI as Discovery Views represents a significant advance over existing discovery solutions.
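
The Reader, Job, and Writer flow described in the abstract can be illustrated with a minimal in-memory sketch. This is not SAGE's actual code: every class, method, and field name below is hypothetical, documents are modeled as flat field-to-value maps, and plain lists stand in for the source and destination Solr cores. It only shows the crosswalk idea, in which each Reader maps a source schema to an internal representation and each Writer maps that representation to a destination schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from the SAGE codebase.
final class SageJobSketch {

    // Copies only the mapped fields of a document, renaming each one.
    static Map<String, String> rename(Map<String, String> doc, Map<String, String> mapping) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : mapping.entrySet()) {
            String value = doc.get(e.getKey());
            if (value != null) {
                out.put(e.getValue(), value);
            }
        }
        return out;
    }

    // A Reader holds source documents (assumed already filtered by its query)
    // and a mapping from source field names to internal field names.
    static final class Reader {
        final List<Map<String, String>> sourceDocs;
        final Map<String, String> toInternal;

        Reader(List<Map<String, String>> sourceDocs, Map<String, String> toInternal) {
            this.sourceDocs = sourceDocs;
            this.toInternal = toInternal;
        }

        List<Map<String, String>> read() {
            List<Map<String, String>> out = new ArrayList<>();
            for (Map<String, String> doc : sourceDocs) {
                out.add(rename(doc, toInternal));
            }
            return out;
        }
    }

    // A Writer maps internal field names to a destination schema.
    // The destination list stands in for a destination Solr core.
    static final class Writer {
        final Map<String, String> fromInternal;
        final List<Map<String, String>> destination = new ArrayList<>();

        Writer(Map<String, String> fromInternal) {
            this.fromInternal = fromInternal;
        }

        void write(List<Map<String, String>> internalDocs) {
            for (Map<String, String> doc : internalDocs) {
                destination.add(rename(doc, fromInternal));
            }
        }
    }

    // A Job reads from every Reader, combines the results, and hands the
    // combined set to every Writer.
    static final class Job {
        final List<Reader> readers;
        final List<Writer> writers;

        Job(List<Reader> readers, List<Writer> writers) {
            this.readers = readers;
            this.writers = writers;
        }

        void run() {
            List<Map<String, String>> combined = new ArrayList<>();
            for (Reader reader : readers) {
                combined.addAll(reader.read());
            }
            for (Writer writer : writers) {
                writer.write(combined);
            }
        }
    }

    // Builds two Readers with different (made-up) schemas and one Writer,
    // runs the Job, and returns the aggregated destination documents.
    static List<Map<String, String>> demo() {
        Reader maps = new Reader(
                List.of(Map.of("dc_title", "Map of Texas", "dc_creator", "Smith")),
                Map.of("dc_title", "title", "dc_creator", "creator"));
        Reader photos = new Reader(
                List.of(Map.of("name_s", "Campus Photo")),
                Map.of("name_s", "title"));
        Writer aggregate = new Writer(Map.of("title", "title_t", "creator", "author_t"));
        new Job(List.of(maps, photos), List.of(aggregate)).run();
        return aggregate.destination;
    }

    public static void main(String[] args) {
        for (Map<String, String> doc : demo()) {
            System.out.println(doc);
        }
    }
}
```

In this sketch, a Job with several Writers simply produces several destination lists from the same combined set, which mirrors the "one or more aggregated indices" behavior the abstract describes. A real implementation would query and update Solr cores rather than in-memory lists.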

Description

Presented by Texas A&M University, 2C | Technology & Tools, at TCDL 2019.
