Comparison Of Search-based And Kernel-based Methods For Graph-based Relational Learning

Date

2007-08-23T01:56:34Z

Authors

Journal Title

Journal ISSN

Volume Title

Publisher

Computer Science & Engineering

Abstract

Graph-based relational learning has been the focus of relational learning for quite some time. As most of the real-world data is structured, and hence cannot be represented in a single table, various logic-based and graph-based techniques have been proposed for dealing with structured data. Our goal is to perform an in-depth analysis of two such graph-based learning systems. We have selected Subdue to represent the search-based approach and support vector machine (SVM) with graph kernels to represent the kernel-based approach. We perform a comparison between search-based and kernel-based approaches and evaluate their performance in various domains. A search-based approach to learning typically involves a search through a larger hypotheses space. The main concern of a search-based learning system is to search through the hypothesis space efficiently. Kernel-based approaches on the other hand do not involve generation and search of a hypotheses space. Instead, a kernel-based system maps the given input space to a higher-dimensional space to perform linear classification. Experiments are performed on the mutagenesis dataset which is one of the benchmark datasets for structured, relational learning and artificial domains. Our aim is to evaluate these two systems based on their classification performance on the mutagenesis dataset with and without background knowledge. Besides mutagenesis data, various synthetic data including concepts such as ring, tree, clique problem and simple geometric shapes were used to study the ability of the systems to learn the required concepts. Subdue uses an inexact graph match to compute the cost involved in transforming one graph to another. We have implemented this inexact graph match as a potential graph kernel with Support Vector Machine and study its performance on various data domains.

Description

Keywords

Citation