Estimation of circadian parameters and investigation in cyanobacteria via semiparametric varying coefficient periodic models

Date

2009-05-15

Abstract

This dissertation includes three components. Component 1 provides an estimation procedure for circadian parameters in cyanobacteria. Component 2 explores the relationship between baseline and amplitude by model selection under the smoothing spline framework. Component 3 investigates properties of hypothesis testing. The following three paragraphs summarize these components in turn.

Varying coefficient models are frequently used in statistical modeling. We propose a semiparametric varying coefficient periodic model which is suitable for studying periodic patterns. This model has ample applications in the study of the cyanobacteria circadian clock. To achieve the desired flexibility, the model we consider may not be globally identifiable. We propose to perform local approximations by kernel-based methods and focus on estimating one solution that is biologically meaningful. Asymptotic properties are developed. Simulations show that the gain of our procedure over the commonly used method is substantial. The methodology is illustrated by an application to a cyanobacteria dataset.

Smoothing splines can be implemented, but a direct application with the penalty selected by generalized cross-validation often fails to converge. We propose an adjusted cross-validation instead, which resolves the difficulties. Biologists believe that the amplitude function of the periodic component is proportional to the baseline function. To verify this belief, we propose a full model without any assumptions regarding such a relationship, and two reduced models in which the ratio of baseline to amplitude is a constant and a quadratic function of time, respectively. We use two model selection techniques, the Akaike information criterion (AIC) and the Schwarz Bayesian information criterion (BIC), to determine the optimal model. Simulations show that AIC and BIC select the correct model with high probability. Application to cyanobacteria data shows that the full model is the best model.

To investigate the same problem as in Component 2 by a formal hypothesis testing procedure, we develop kernel-based methods. In order to construct the test statistic, we derive the global degrees of freedom for the residual sum of squares. Simulations show that the proposed tests perform well. We apply the proposed procedures to the data and conclude that the baseline and amplitude functions share no linear or quadratic relationship.
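
The AIC/BIC comparison between a full model and a constant-ratio reduced model can be illustrated with a small parametric toy example. This is only a sketch under strong assumptions that are not taken from the dissertation: the period is treated as known (24 hours), the baseline and amplitude are linear in time rather than smooth functions, the errors are Gaussian, and all function names, the simulated data, and the grid of candidate ratios are hypothetical. It does not reproduce the smoothing spline or kernel-based estimators described above.

```python
import numpy as np

def info_criteria(rss, n, k):
    """Gaussian-error AIC and BIC for a least-squares fit, up to an additive
    constant shared by all models fitted to the same data."""
    base = n * np.log(rss / n)
    return base + 2 * k, base + k * np.log(n)

def fit_full(t, y, omega):
    """Toy full model: baseline b0 + b1*t and amplitude a0 + a1*t, unrelated."""
    c = np.cos(omega * t)
    X = np.column_stack([np.ones_like(t), t, c, t * c])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return rss, X.shape[1]

def fit_proportional(t, y, omega, ratios):
    """Toy reduced model: amplitude = r * baseline for a constant r.
    For fixed r the model is linear, so r is profiled over a grid."""
    c = np.cos(omega * t)
    best = np.inf
    for r in ratios:
        w = 1.0 + r * c
        X = np.column_stack([w, t * w])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        best = min(best, float(np.sum((y - X @ beta) ** 2)))
    return best, 3  # two baseline coefficients plus the ratio r

# Hypothetical data simulated from the proportional model (assumed values).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 96.0, 200)
omega = 2 * np.pi / 24  # assumed known 24-hour period
y = (5.0 + 0.05 * t) * (1.0 + 0.4 * np.cos(omega * t)) + rng.normal(0.0, 0.3, t.size)

rss_full, k_full = fit_full(t, y, omega)
rss_prop, k_prop = fit_proportional(t, y, omega, np.linspace(-1.0, 1.0, 401))
aic_full, bic_full = info_criteria(rss_full, t.size, k_full)
aic_prop, bic_prop = info_criteria(rss_prop, t.size, k_prop)

print(f"full:         AIC={aic_full:.2f}  BIC={bic_full:.2f}")
print(f"proportional: AIC={aic_prop:.2f}  BIC={bic_prop:.2f}")
```

The model with the smaller criterion value is preferred. Because the reduced model is nested in the full one, its residual sum of squares can never be smaller, so the comparison turns entirely on whether the improvement in fit justifies the extra parameter.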
