Topics in functional data analysis with biological applications

Li, Yehua

dc.contributor	Carroll, Raymond J.
dc.contributor	Hsing, Tailen
dc.creator	Li, Yehua
dc.date.accessioned	2010-01-15T00:16:00Z
dc.date.accessioned	2010-01-16T02:17:37Z
dc.date.accessioned	2017-04-07T19:56:50Z
dc.date.available	2010-01-15T00:16:00Z
dc.date.available	2010-01-16T02:17:37Z
dc.date.available	2017-04-07T19:56:50Z
dc.date.created	2006-08
dc.date.issued	2009-06-02
dc.description.abstract	Functional data analysis (FDA) is an active field of statistics in which the primary objects of study are curves. My dissertation consists of two innovative applications of functional data analysis in biology. The data that motivated the research broadened the scope of FDA and demanded new methodology. I develop new nonparametric estimation methods and focus on developing large-sample theory for the proposed estimators. The first project is motivated by a colon carcinogenesis study, the goal of which is to study the function of a protein (p27) in colon cancer development. In this study, a number of colonic crypts (units) were sampled from each rat (subject) at random locations along the colon, and then repeated measurements of the protein expression level were made on each cell (subunit) within the selected crypts. In this problem, the measurements within each crypt can be viewed as a function, since they can be indexed by cell location. The functions from the same subject are spatially correlated along the colon, and my goal is to estimate this correlation function using nonparametric methods. We use this data set as a motivation and propose a kernel estimator of the correlation function in a more general framework. We establish pointwise asymptotic normality of the proposed estimator when the number of subjects is fixed and the number of units within each subject goes to infinity. Based on the asymptotic theory, we propose a weighted block bootstrap method for making inferences about the correlation function, where the weights account for the inhomogeneity of the distribution of the unit locations. Simulation studies are also provided to illustrate the numerical performance of the proposed method. My second project concerns lipoprotein profile data, where the goal is to use lipoprotein profile curves to predict the cholesterol level in human blood.
Again motivated by the data, we consider a more general problem: the functional linear model (Ramsay and Silverman, 1997) with a functional predictor and a scalar response. There is a literature developing different methods for this model; however, there is little theory to support the methods. Therefore, we focus on the theoretical properties of this model. There is other contemporary theoretical work on methods based on principal component regression. Our work differs in that we base our method on a roughness penalty approach and consider the more realistic scenario in which the functional predictor is observed only at discrete points. To reduce the difficulty of the theoretical derivations, we restrict attention to functions satisfying a periodic boundary condition and develop an asymptotic convergence rate for this problem in Chapter III. A more general result based on splines is a topic for future research, which I discuss in Chapter IV.
dc.identifier.uri	http://hdl.handle.net/1969.1/ETD-TAMU-1867
dc.language.iso	en_US
dc.subject	Functional Data Analysis
dc.subject	Nonparametric statistics
dc.title	Topics in functional data analysis with biological applications
dc.type	Book
dc.type	Thesis

Collections

Texas A&M University at College Station
