Efficient Exploration Techniques On Large Databases

Date

2011-07-14

Publisher

Computer Science & Engineering

Abstract

Search, retrieval, and exploration of information have lately become some of the most intensely studied research challenges in many enterprise and e-commerce applications. This dissertation analyzes different aspects of online data exploration and proposes techniques to accomplish them efficiently. In particular, its results widen the scope of existing faceted search and recommendation systems - two emerging fields in data exploration that are still in their infancy. Faceted search, the de facto standard for e-commerce applications, is an interface framework with the primary design goal of allowing users to explore large information spaces in a flexible manner. We study this alternative search and exploration paradigm in the context of structured and unstructured databases. More specifically, motivated by the rapidly growing need for knowledge discovery and management in large enterprise organizations, we propose DynaCet, a minimum-effort-driven dynamic faceted search system on structured databases. In addition, we study the problem of dynamic faceted retrieval in the context of unstructured data using Wikipedia, the largest and most popular encyclopedia. We propose Facetedpedia, a faceted retrieval system that is capable of dynamically generating query-dependent facets for a set of Wikipedia articles. The ever-expanding volume and increasing complexity of information on the web have made recommender systems essential tools for users in a variety of information-seeking and e-commerce activities: they expose users to the most interesting items and offer novelty, diversity, and relevance. Current research suggests that online social activity is growing and leaves behind trails of information created by users. Recommendation tasks stand to benefit immensely from tapping into these latent information sources and following those trails.
A significant part of this dissertation investigates how to improve online recommendation tasks with novel functionalities by considering additional contexts derived from social data. To this end, it investigates problems such as how to compute recommendations for a group of users and how to recommend composite items to a user. The underlying models leverage social data (co-purchase or browsing histories, social bookmarking of photos) to derive additional contexts for those recommendation tasks. In particular, the dissertation focuses on techniques that enable a recommendation system to interact with the user in suggesting composite items, such as bundled products in online shopping or itineraries for vacation travel. We investigate the technical and algorithmic challenges involved in enabling efficient recommendation computation from both the user's point of view (the interaction should be easy and should converge quickly) and the system's point of view (the computation should be efficient). This dissertation also discusses the results of extensive performance and user studies, which were conducted using the crowd-sourcing platform Amazon Mechanical Turk. We conclude by briefly describing other promising problems and future opportunities in this field.
