Inference and Visualization of Periodic Sequences

Sun, Ying

Date

2011-10-21

Authors

Sun, Ying

Abstract

This dissertation is composed of four articles describing inference and visualization of periodic sequences.

In the first article, a nonparametric method is proposed for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator.
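The leave-out-one-cycle idea can be illustrated with a minimal sketch. The details here are assumptions, not the dissertation's exact estimator: for a candidate period p, each observation is predicted by the mean of the other observations at the same phase (which is what removing its whole cycle amounts to), and the candidate with the smallest mean squared prediction error is chosen. The function names `cv_score`, `estimate_period`, and `estimate_values` are hypothetical.

```python
def cv_score(y, p):
    """Leave-out-one-cycle CV score for candidate period p (illustrative sketch).

    Each value is predicted by the average of the other values that share
    its phase, i.e. the values at that phase in all remaining cycles.
    """
    n = len(y)
    total = 0.0
    for r in range(p):
        phase = y[r::p]              # observations at phase r of each cycle
        k = len(phase)
        if k < 2:
            raise ValueError("need at least two cycles for period %d" % p)
        s = sum(phase)
        for v in phase:
            # mean of the same-phase values with the current cycle removed
            total += (v - (s - v) / (k - 1)) ** 2
    return total / n


def estimate_period(y, max_period):
    """Return the candidate period in 1..max_period with the smallest CV score.

    Ties are resolved in favour of the smallest candidate, because min()
    returns the first minimiser in iteration order.
    """
    if max_period > len(y) // 2:
        raise ValueError("max_period must not exceed half the series length")
    return min(range(1, max_period + 1), key=lambda p: cv_score(y, p))


def estimate_values(y, p):
    """Estimate the p sequence values by averaging observations at each phase."""
    return [sum(y[r::p]) / len(y[r::p]) for r in range(p)]
```

With noisy data, a multiple of the true period leaves fewer observations per phase, so its leave-out means are more variable and its score tends to be larger. This is one way to read the implicit penalty on multiples described above.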

The second article is a multivariate extension, in which we present a CV method for estimating the periods of multiple periodic sequences when the data are observed at evenly spaced time points. The basic idea is to borrow information from other correlated sequences to improve estimation of the period of interest. We show that the asymptotic behavior of the bivariate CV is the same as that of the CV for one sequence. For finite samples, however, the better the periods of the other correlated sequences are estimated, the more substantial the improvement that can be obtained.

The third article proposes an informative exploratory tool, the functional boxplot, for visualizing functional data, as well as its generalization, the enhanced functional boxplot. Based on the center-outward ordering induced by band depth for functional data, the descriptive statistics of a functional boxplot are the envelope of the 50 percent central region, the median curve, and the maximum non-outlying envelope. In addition, outliers can be detected by an empirical rule: the fences are obtained by inflating the envelope of the 50 percent central region by 1.5 times its range, and any curve that falls outside the fences is flagged.
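The following is a minimal sketch of these descriptive statistics, under stated assumptions: curves are lists of values on a common grid, band depth uses bands formed by pairs of curves, the central region is taken as the deepest half of the curves (rounded up), and a curve is flagged if it leaves the fences at any grid point. The function names and the returned dictionary layout are hypothetical.

```python
from itertools import combinations


def band_depth(curves):
    """Band depth of each curve: the share of curve pairs whose band contains it."""
    n = len(curves)
    pairs = list(combinations(range(n), 2))
    depths = []
    for x in curves:
        inside = sum(
            all(min(a, b) <= v <= max(a, b)
                for a, b, v in zip(curves[i], curves[j], x))
            for i, j in pairs
        )
        depths.append(inside / len(pairs))
    return depths


def functional_boxplot(curves, factor=1.5):
    """Compute functional boxplot summaries (illustrative sketch)."""
    n = len(curves)
    depths = band_depth(curves)
    # center-outward ordering: deepest curve first
    order = sorted(range(n), key=lambda i: depths[i], reverse=True)
    central = [curves[i] for i in order[: (n + 1) // 2]]
    lower = [min(col) for col in zip(*central)]
    upper = [max(col) for col in zip(*central)]
    # fences: central envelope inflated by factor times its pointwise range
    lo_fence = [l - factor * (u - l) for l, u in zip(lower, upper)]
    hi_fence = [u + factor * (u - l) for l, u in zip(lower, upper)]
    outliers = [
        i for i, x in enumerate(curves)
        if any(v < lf or v > hf for v, lf, hf in zip(x, lo_fence, hi_fence))
    ]
    kept = [curves[i] for i in range(n) if i not in outliers]
    return {
        "median": order[0],                      # index of the deepest curve
        "central": (lower, upper),               # 50 percent central region
        "outliers": outliers,                    # indices of flagged curves
        "envelope": ([min(c) for c in zip(*kept)],
                     [max(c) for c in zip(*kept)]),  # max non-outlying envelope
    }
```

The pairwise depth computation is cubic in the number of curves, which is acceptable for a sketch but not for large samples.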

The last article proposes a simulation-based method to adjust functional boxplots for correlations when visualizing functional and spatio-temporal data and detecting outliers. We start by investigating the relationship between spatio-temporal dependence and the empirical outlier detection rule based on 1.5 times the 50 percent central region. Then, we propose to simulate observations without outliers based on a robust estimator of the covariance function of the data. We select the constant factor in the functional boxplot to control the probability of correctly detecting no outliers. Finally, we apply the selected factor to the functional boxplot of the original data.
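The factor selection step might be sketched as follows. This is an illustration under assumptions rather than the dissertation's procedure: `simulate` is a hypothetical caller-supplied function that returns one outlier-free data set (in the dissertation, such draws are based on a robust covariance estimate), the candidate factors come from a user-chosen grid, and `target` is the desired probability of flagging nothing in outlier-free data.

```python
from itertools import combinations


def count_outliers(curves, factor):
    """Number of curves flagged by a band-depth functional boxplot rule."""
    n = len(curves)
    pairs = list(combinations(range(n), 2))

    def depth(x):
        # number of curve pairs whose band contains x at every grid point
        return sum(
            all(min(a, b) <= v <= max(a, b)
                for a, b, v in zip(curves[i], curves[j], x))
            for i, j in pairs
        )

    depths = [depth(x) for x in curves]
    order = sorted(range(n), key=lambda i: depths[i], reverse=True)
    central = [curves[i] for i in order[: (n + 1) // 2]]
    lower = [min(col) for col in zip(*central)]
    upper = [max(col) for col in zip(*central)]
    return sum(
        any(v < l - factor * (u - l) or v > u + factor * (u - l)
            for v, l, u in zip(x, lower, upper))
        for x in curves
    )


def select_factor(simulate, factors, target, n_sim=100):
    """Smallest factor whose 'no outliers detected' rate reaches the target.

    Returns None if no candidate factor reaches the target.
    """
    datasets = [simulate() for _ in range(n_sim)]  # outlier-free by assumption
    for factor in sorted(factors):
        clean = sum(count_outliers(d, factor) == 0 for d in datasets)
        if clean / n_sim >= target:
            return factor
    return None
```

The selected factor would then replace the default 1.5 when the functional boxplot is drawn for the original data. For brevity the sketch recomputes the depths for every candidate factor; caching them per simulated data set would avoid that cost.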