Genetic algorithms with functional mutation and mating operators in time series data mining

Date

2004-08

Publisher

Texas Tech University

Abstract

Recently, genetic algorithms (GAs) and artificial neural networks (ANNs) have been widely used in time series data mining (TSDM). Both GAs and ANNs are inspired by natural processes. A GA can be used to find optimized parameters for a given model, while an ANN can approximate unknown functions to any desired degree of accuracy without knowing the model. There are some limitations to using GAs or ANNs individually in TSDM. For example, ANNs generally use backpropagation learning algorithms, which are based on the steepest descent algorithm. Therefore, a solution from the ANN is usually only a locally optimal solution.
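The local-optimum limitation of descent-based learning can be illustrated with a small sketch. The one-dimensional function `f` below is a hypothetical stand-in for a network's error surface and is not taken from the thesis; it has two minima, and plain gradient (steepest) descent settles into whichever one lies downhill from the starting point.

```python
def f(x):
    # Hypothetical non-convex "error surface" with two minima.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of f.
    return 4 * x**3 - 6 * x + 1


def steepest_descent(x0, lr=0.01, steps=2000):
    # Repeatedly step against the gradient from the start point x0.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


x_right = steepest_descent(2.0)   # starts on the right-hand slope
x_left = steepest_descent(-2.0)   # starts on the left-hand slope

print("from  2.0:", x_right, f(x_right))
print("from -2.0:", x_left, f(x_left))
```

Both runs stop where the gradient vanishes, but the run started at 2.0 ends in the shallower minimum, so the result depends on initialization. A population-based search such as a GA is one common way to reduce that dependence.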

The purpose of this thesis is to develop innovative algorithms that overcome the limitations of using GAs or ANNs alone in TSDM. The first part of this research involves designing a new genetic algorithm (called mGA), which can analyze not only polynomial but also non-polynomial time series. The mGA automatically searches for a polynomial function of minimal degree for a non-polynomial time series.
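The general idea of using a GA to fit polynomial coefficients to a non-polynomial series can be sketched as follows. This is not the mGA itself, whose operators are not described in this abstract: the uniform crossover, Gaussian mutation, elitist selection, the names `mse`, `evolve` and `smallest_degree`, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def mse(coeffs, xs, ys):
    # Mean squared error of the polynomial sum(c_k * x**k) on the data.
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(c * x**k for k, c in enumerate(coeffs))
        total += (pred - y) ** 2
    return total / len(xs)


def evolve(xs, ys, degree, pop_size=40, generations=200, seed=0):
    # Generic elitist GA over coefficient vectors (illustrative only).
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(degree + 1)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: mse(c, xs, ys))
        history.append(mse(pop[0], xs, ys))
        elite = pop[: pop_size // 4]          # survivors, kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [c + rng.gauss(0, 0.1) for c in child]    # Gaussian mutation
            children.append(child)
        pop = elite + children
    pop.sort(key=lambda c: mse(c, xs, ys))
    history.append(mse(pop[0], xs, ys))
    return pop[0], history


def smallest_degree(xs, ys, max_degree=3, tol=1e-3):
    # Try degrees in increasing order; return the first one whose best
    # error falls below tol, or the last degree tried if none does.
    for degree in range(max_degree + 1):
        best, history = evolve(xs, ys, degree)
        if history[-1] < tol:
            break
    return degree, best, history[-1]


# A non-polynomial series: samples of sin(x) on [0, 1.9].
xs = [i / 10 for i in range(20)]
ys = [math.sin(x) for x in xs]
best, history = evolve(xs, ys, degree=3)
print("degree-3 error:", history[-1])
```

Because the elite individuals survive each generation unchanged, the best error never increases from one generation to the next. Whether a given degree reaches the tolerance depends on the random seed and parameter choices, so `smallest_degree` should be read as a heuristic rather than a guarantee of the true minimal degree.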

The rest of this research focuses on developing a neural network based genetic algorithm (called nGANN). The nGANN represents a chromosome as a neural network and uses genetic operators to select a global solution for a time series. The nGANN introduces a new mating scheme (called NN_mate), which uses a backpropagation learning network to produce offspring. Therefore, NN_mate can mate two parents with different models. The solution found by the nGANN has two attractive features: a network with a small number of hidden neurons and a small mean squared error. From the solution network, it is possible to discover some relationships among different variables.

Three different types of time series data are used to evaluate the performance of the above algorithms. The two algorithms work well for one-variable polynomial and one-variable non-polynomial time series data. For two or more variables, the algorithms do not produce very good results. In the last part of this thesis, future work is discussed.
