(2010-05) Koenig, Lars; Youn, Eunseog; Rushton, J. Nelson; Cooke, Daniel E.

Clustering data is an integral part of data processing and pattern extraction. Most existing clustering techniques can produce clusters from time series data, but several issues arise when they are applied to such datasets. Unless specifically tailored, the distance metrics used lack interpretability for these data: distances have no units, or the units are hard to comprehend. Many methods treat time series as n-dimensional points, which yields a distance that can be understood abstractly but does not allow patterns in the data to be discerned by comparing distances to one another. While some previous methods are concerned with matching levels, series that change in a similar pattern but at different measured levels are also of interest. Such series are not clustered together under a Euclidean metric and are indiscernible under a correlation metric, so we propose a more appropriate metric that clusters data with similar behaviors and allows distances between clusters to be compared directly. This dissertation describes a new method that extends prior trajectory clustering methods, allowing their application to longer time series and to short time series with poor trajectory distributions.
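
The contrast between Euclidean and correlation metrics described above can be illustrated with a small sketch. This is not the metric proposed in the dissertation, which the abstract does not specify. The function `change_distance` (a Euclidean distance on first differences) and the three toy series are hypothetical stand-ins, chosen only to show one way a distance can ignore the measured level while keeping the units of the data.

```python
import math

def euclidean(x, y):
    """Euclidean distance, treating each series as an n-dimensional point."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def correlation_distance(x, y):
    """One minus the Pearson correlation; unitless and blind to offset and scale."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def diffs(x):
    """Step-to-step changes of a series."""
    return [b - a for a, b in zip(x, x[1:])]

def change_distance(x, y):
    """Hypothetical illustration only: Euclidean distance between first differences.

    It is zero for series that differ by a constant level and is expressed
    in the same units as the measurements themselves.
    """
    return euclidean(diffs(x), diffs(y))

# Toy data (assumed for illustration, not taken from the dissertation).
base = [1.0, 2.0, 3.0, 2.0, 1.0]
shifted = [v + 10.0 for v in base]  # same pattern, different measured level
scaled = [v * 3.0 for v in base]    # same shape, three times larger changes

for name, other in (("shifted", shifted), ("scaled", scaled)):
    print(name,
          round(euclidean(base, other), 3),
          round(correlation_distance(base, other), 3),
          round(change_distance(base, other), 3))
```

In this toy case the Euclidean distance between `base` and `shifted` is large even though both follow the same pattern, while the correlation distance is essentially zero for both `shifted` and `scaled`, so it cannot tell a level offset from a threefold change in amplitude. The difference-based stand-in gives 0 for `shifted` and 4 for `scaled`, and those two values can be compared directly because they share the units of the data.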