Browsing by Subject "Principal component analysis"

A Systems Biology Approach to Develop Models of Signal Transduction Pathways
(2011-10-21) Huang, Zuyi
Mathematical models of signal transduction pathways are characterized by a large number of proteins and uncertain parameters, yet only a limited amount of quantitative data is available. The dissertation addresses this problem using two different approaches: the first approach deals with a model simplification procedure for signaling pathways that reduces the model size but retains the physical interpretation of the remaining states, while the second approach deals with creating rich data sets by computing transcription factor profiles from fluorescent images of green-fluorescent-protein (GFP) reporter cells. For the first approach a model simplification procedure for signaling pathway models is presented. The technique makes use of sensitivity and observability analysis to select the retained proteins for the simplified model. The presented technique is applied to an IL-6 signaling pathway model. It is found that the model size can be significantly reduced and the simplified model is able to adequately predict the dynamics of key proteins of the signaling pathway. An approach for quantitatively determining transcription factor profiles from GFP reporter data is developed as the second major contribution of this work. The procedure analyzes fluorescent images to determine fluorescence intensity profiles using principal component analysis and K-means clustering, and then computes the transcription factor concentration from the fluorescence intensity profiles by solving an inverse problem involving a model describing transcription, translation, and activation of green fluorescent proteins. Activation profiles of the transcription factors NF-κB, nuclear STAT3, and C/EBPβ are obtained using the presented approach. The data for NF-κB is used to develop a model for TNF-α signal transduction while the data for nuclear STAT3 and C/EBPβ is used to verify the simplified IL-6 model.
Finally, an approach is developed to compute the distribution of transcription factor profiles among a population of cells. This approach consists of an algorithm for identifying individual fluorescent cells from fluorescent images, and an algorithm to compute the distribution of transcription factor profiles from the fluorescence intensity distribution by solving an inverse problem. The technique is applied to experimental data to derive the distribution of NF-κB concentrations from fluorescent images of an NF-κB GFP reporter system.
Characterization of the Cana-Woodford Shale using fractal-based, stochastic inversion, Canadian County, Oklahoma
(2016-05) Borgman, Barry Michael; Spikes, Kyle; Sen, Mrinal K; Wilson, Clark R
The past decade has seen a surge in unconventional hydrocarbon exploration and production, driven by advances in horizontal drilling and hydraulic fracturing. Even with such advances, reliable models of the subsurface are crucial in all phases of exploitation. This study focuses on the methods used for estimation of the elastic properties (density, velocity, and impedance), which play a key role in targeting reservoir zones ideal for hydraulic fracturing. Well-log data provides high-resolution vertical measurements of elastic properties, but a relatively shallow depth of investigation imposes spatial limitations. Seismic data provides broader horizontal coverage at lower cost, but sacrifices vertical resolution. Thin beds present in many unconventional reservoirs fall below seismic resolution. In addition, the band-limited nature of seismic data results in the absence of low-frequency content of the Earth model, as well as the high-frequency content present in well logs. Seismic inversion is a process that provides estimates of elastic properties given input seismic and well data. Stochastic inversion is a method that uses well-log data as a priori information, with an added aspect of randomness. The method generates many realizations using the same input model and takes an average of those realizations. We implement two separate stochastic inversion algorithms to estimate P-impedance in the Cana-Woodford Shale in west-central Oklahoma. First, we use a fractal-based, very fast simulated annealing algorithm that exploits the fractal characteristics found in well-log data to build a prior model. The method of very fast simulated annealing optimizes our elastic model by searching for the minimum misfit between observed and synthetic seismic traces. Next, we use a principal component analysis (PCA) based stochastic inversion algorithm to invert for impedance at all traces simultaneously. 
Comparison of the results with traditional deterministic inversion results shows improved vertical resolution while honoring the low-frequency content of the Earth model. The PCA-based inversion results also show improved lateral continuity of the elastic profile along our 2D line. The impedance profile from the PCA-based approach provides a better representation of the vertical and horizontal variability of the reservoir, allowing for improved targeting of frackable zones.
Differential sensing of hydrophobic analytes with serum albumins
(2012-05) Ivy, Michelle Adams; Anslyn, Eric V., 1960-
In the last decade, there has been a growing interest in the use of differential sensing for molecular recognition. Inspired by the mammalian olfactory system, differential sensing employs an array of non-selective receptors which, through cross-reactive interactions, create a distinct pattern for each analyte tested. The unique fingerprints obtained for each analyte with differential sensing are studied with statistical analysis techniques, such as principal component analysis and linear discriminant analysis. It was postulated that serum albumin proteins would be applicable to differential sensing schemes due to significant differences in sequence identity between different serum albumin species, and due to the wide range of hydrophobic molecules which are known to bind to these proteins. Consequently, cross-reactive serum albumin arrays were developed, utilizing hydrophobic fluorescent indicators to detect hydrophobic molecules. These serum albumin cross-reactive arrays were employed to discriminate subtly different hydrophobic analytes, and mixtures of these analytes, in the form of terpenes and perfumes, plasticizers and plastic explosive mixtures, and glycerides and adipocyte extracts. In this doctoral work, a detailed review of the field of differential sensing, and a thorough study of principal component analysis and linear discriminant analysis in various differential sensing scenarios, are given. These introductory chapters aid in better understanding the methods and techniques applied in later experimental chapters. In chapter 3, serum albumins, a PRODAN indicator, and an additive are shown to discriminate five terpene analytes and terpene-doped perfumes. Chapter 4 describes an array with serum albumins, two dansyl fluorophores, and an additive that successfully differentiates the plasticizers found within the plastic explosives C4 and Semtex and simulated C4 and Semtex mixtures.
Discrimination of these simulated mixtures was also achieved with this array in the presence of soil contaminants, demonstrating the potential real-world applicability of this sensing ensemble. Finally, chapter 5 details an array consisting of serum albumins, several fluorescent indicators, and a Grubbs olefin metathesis reaction, to differentiate saturated and unsaturated triglycerides, diglycerides, and monoglycerides. Mixtures of glycerides in adipocyte extracts taken from rats with different health states were then successfully discriminated, showing promise for clinical applications in differentiating adipocytes from pre-diabetic, type 2 diabetic, and non-diabetic individuals.
Functional data analysis: classification and regression
(Texas A&M University, 2005-11-01) Lee, Ho-Jin
Functional data refer to data which consist of observed functions or curves evaluated at a finite subset of some interval. In this dissertation, we discuss statistical analysis, especially classification and regression, when data are available in functional form. Due to the nature of functional data, one considers function spaces in representing this type of data, and each functional observation is viewed as a realization generated by a random mechanism in those spaces. The classification procedure in this dissertation is based on dimension reduction techniques for these spaces. One commonly used method is Functional Principal Component Analysis (Functional PCA), in which the eigendecomposition of the covariance function is employed to find the directions of highest variability of the data in the function space. The reduced space of functions spanned by a few eigenfunctions is thought of as a space where most of the features of the functional data are contained. We also propose a functional regression model for scalar responses. Infinite dimensionality of the predictor space causes many problems, one of which is that there are infinitely many solutions. The space of the parameter function is restricted to Sobolev-Hilbert spaces, and the so-called ε-insensitive loss function is utilized. As a robust technique of function estimation, we present a way to find a function that has at most ε deviation from the observed values and at the same time is as smooth as possible.
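The ε-insensitive loss named in the abstract above can be illustrated with a minimal Python sketch. It uses the standard textbook definition (zero loss inside a tube of half-width ε, linear loss outside); the function and argument names are hypothetical and are not taken from the dissertation.

```python
def epsilon_insensitive_loss(observed, predicted, epsilon):
    """Standard epsilon-insensitive loss: deviations of at most
    `epsilon` cost nothing; larger deviations cost their excess."""
    return max(0.0, abs(observed - predicted) - epsilon)


def total_loss(observed_values, predicted_values, epsilon):
    """Sum of the pointwise losses over paired observations
    (illustrative helper, not part of the source)."""
    return sum(
        epsilon_insensitive_loss(y, f, epsilon)
        for y, f in zip(observed_values, predicted_values)
    )
```

A fitted function that stays within ε of every observed value therefore has a total loss of zero, which is why the smoothness requirement is needed to choose among the many functions that satisfy this condition.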
Optimization of an array of peptidic indicator displacement assays for the discrimination of cabernet sauvignon wines
(2010-08) Chong, Sally; Bielawski, Christopher W.; Anslyn, Eric V., 1960-; Umali, Alona P.
The research project, Optimization of an array of Peptidic Indicator Displacement Assays for the Discrimination of Cabernet Sauvignon Wines, describes the multistep laboratory trials conducted to optimize an array of ensembles composed of synthesized peptides and PCV:Cu²⁺ complexes for the differentiation of seven Cabernet Sauvignon wines with different tannin levels. This report also includes the methods and analysis used. The results were interpreted using principal component analysis.
Prediction of reservoir properties of the N-sand, vermilion block 50, Gulf of Mexico, from multivariate seismic attributes
(Texas A&M University, 2005-08-29) Jaradat, Rasheed Abdelkareem
The quantitative estimation of reservoir properties directly from seismic data is a major goal of reservoir characterization. Integrated reservoir characterization makes use of different varieties of well and seismic data to construct detailed spatial estimates of petrophysical and fluid reservoir properties. The advantage of data integration is the generation of consistent and accurate reservoir models that can be used for reservoir optimization, management and development. This is particularly valuable in mature field settings where hydrocarbons are known to exist but their exact location, pay, lateral variations and other properties are poorly defined. Recent approaches to reservoir characterization make use of individual seismic attributes to estimate inter-well reservoir properties. However, these attributes share a considerable amount of information among them and can lead to spurious correlations. An alternative approach is to evaluate reservoir properties using multiple seismic attributes. This study reports the results of an investigation of the use of multivariate seismic attributes to predict lateral reservoir properties of gross thickness, net thickness, gross effective porosity, net-to-gross ratio and net reservoir porosity thickness product. This approach uses principal component analysis and principal factor analysis to transform eighteen relatively correlated original seismic attributes into a set of mutually orthogonal or independent PCs and PFs, which are designated as multivariate seismic attributes. Data from the N-sand interval of Vermilion Block 50 field, Gulf of Mexico, were used in this study. Multivariate analyses produced eighteen PC and three PF grid maps. A collocated cokriging geostatistical technique was used to estimate the spatial distribution of reservoir properties of eighteen wells penetrating the N-sand interval.
Reservoir property maps generated by using multivariate seismic attributes yield highly accurate predictions of reservoir properties when compared to predictions produced with original individual seismic attributes. In contrast to the original seismic attribute results, predicted reservoir properties of the multivariate seismic attributes honor the lateral geological heterogeneities embedded within seismic data and strongly maintain the proposed geological model of the N-sand interval. Results suggest that the multivariate seismic attribute technique can be used to predict various reservoir properties and can be applied to a wide variety of geological and geophysical settings.
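The transformation of correlated attributes into mutually orthogonal principal components, as described in the abstract above, can be sketched in plain Python. This is only an illustration under stated assumptions: the three-column table holds made-up toy values (not data from the study), all function names are hypothetical, and power iteration with deflation is one simple way to obtain the components, not necessarily the method the author used.

```python
import math

def center_columns(rows):
    """Subtract each column mean so every attribute has zero mean."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in rows]

def covariance(centered):
    """Sample covariance matrix (p x p) of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1)
             for j in range(p)] for i in range(p)]

def mat_vec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def leading_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector."""
    p = len(matrix)
    vec = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero start
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-15:  # nothing left to extract
            break
        vec = [x / norm for x in nxt]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    value = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return value, vec

def principal_components(rows, count):
    """Return component variances, directions, and per-row scores."""
    centered = center_columns(rows)
    remaining = covariance(centered)
    variances, directions = [], []
    for _ in range(count):
        value, vec = leading_eigenpair(remaining)
        variances.append(value)
        directions.append(vec)
        # Deflation: remove the component just found from the matrix.
        p = len(vec)
        remaining = [[remaining[i][j] - value * vec[i] * vec[j]
                      for j in range(p)] for i in range(p)]
    scores = [[sum(c * d for c, d in zip(row, direction))
               for direction in directions] for row in centered]
    return variances, directions, scores

# Toy table: 5 locations x 3 correlated attributes (illustrative values only).
attributes = [
    [2.0, 4.1, 1.0],
    [3.0, 6.2, 0.5],
    [4.0, 7.9, 1.5],
    [5.0, 10.1, 0.8],
    [6.0, 11.8, 1.2],
]
variances, directions, scores = principal_components(attributes, 3)
```

Each column of `scores` plays the role of one multivariate attribute: the columns are mutually uncorrelated, and their variances sum to the total variance of the original attributes.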
Prediction of Tortilla Quality Using Multivariate Modeling of Kernel, Flour and Dough Properties
(2014-01-10) Jondiko, Tom O
Advances in high-throughput wheat breeding techniques have resulted in the need for rapid, accurate and cost-effective means to predict tortilla-making performance for larger numbers of early generation wheat lines. Currently, the most reliable approach is to process tortillas. This approach is laborious, time consuming, expensive and requires a large sample size. This study used a multivariate discriminant analysis to predict tortilla quality using kernel, flour and dough properties. A discriminant rule (suitability = diameter > 165 mm + day 16 flexibility score > 3.0) was used to classify wheat lines for suitability in making good quality tortillas. One hundred eighty-seven hard winter wheat (HWW) varieties from Texas were evaluated for kernel (hardness, diameter, and weight), flour (protein content, fractions and composition), dough (compression force, extensibility and stress relaxation from TA-XT2i) and tortilla properties (diameter, rheology and flexibility). The first three principal components explained 58% of variance. Multivariate normal distribution of the data was determined (Shapiro-Wilk p > 0.05). PCA identified significant correlation between stress relaxation force and rollability. Canonical correlation analysis revealed significant correlation between kernel and tortilla properties (ρ = 0.75); kernel diameter and weight contributed most to this correlation. Flour and tortilla properties were highly correlated (ρ = 0.74). Glutenin to gliadin ratio (GGratio), IPP and peak time contributed most to this correlation and can explain > 60% of variability in tortilla texture (force, distance and work to rupture). The second canonical variate of flour properties is a measure of flour protein content and can explain 26% of the variability in tortilla rollability. Dough and tortilla properties were significantly correlated (ρ = 0.82, 0.68, 0.54, 0.38 and 0.29).
Dough stress relaxation force after 25 seconds is negatively correlated with tortilla diameter (r = -0.73). Kernel hardness, diameter and weight are the best predictors of tortilla texture after 16 days. Glutenin to gliadin ratio and IPP contributed significantly to tortilla texture. This is the first study to identify the contribution of protein content to tortilla rollability score. Dough extensibility can explain 37% of tortilla rollability. Stress relaxation is the best predictor of tortilla diameter. Tortilla quality variation is attributed to kernel, flour, and dough properties. Logistic regression and stepwise variable selection identified an optimum model comprised of kernel hardness, GGratio, dough extensibility and compression force as the most important variables. Cross-validation indicated 83% prediction efficiency for the model. This emphasizes the feasibility and practicality of the model using variables that are easily and quickly measured. This is the first model that can be used to simultaneously predict both tortilla diameter and rollability. It will be a useful tool for the flat bread wheat breeding programs, wheat millers, tortilla processors and wheat marketers in the United States of America.
The uses of supramolecular chemistry in synthetic methodology development
(2009-05) Shabbir, Shagufta Hasnain; Anslyn, Eric V., 1960-
Enantioselective indicator displacement assays (eIDAs) were transitioned to high-throughput screening protocols for the rapid determination of concentration and enantiomeric excess (ee) of chiral diols and α-hydroxycarboxylic acids. To improve the design of our previously established receptors based on o-(N,N-dialkylaminomethyl)arylboronate scaffolds for eIDAs, the rigidity of the receptor, which arises from the formation of an intramolecular N-B dative bond, was investigated. o-(Pyrrolidinylmethyl)phenylboronic acid and its complexes with bifunctional substrates such as catechol, α-hydroxyisobutyric acid, and hydrobenzoin were studied in detail by X-ray crystallography and ¹¹B NMR. Our structural study predicts that the formation of an N-B dative bond, and/or solvolysis to afford a tetrahedral boronate anion, depends on the solvent and the complexing substrate present. To simplify the operation of eIDAs, we introduced an analytical method that utilizes a dual-chamber quartz cuvette, reducing the number of spectroscopic measurements from two to one, and introduced artificial neural networks (ANNs), which simplify data analysis. In a second example, a high-throughput screening protocol for hydrobenzoin was developed. The method involves the sequential utilization of what we define herein as screening, training, and analysis plates. Several enantioselective boronic acid-based receptors were screened using 96-well plates, both for their ability to discriminate the enantiomers of hydrobenzoin and to find their optimal pairing with indicators resulting in the largest optical responses. The best receptor/indicator combination was then used to train an ANN to determine concentration and ee. To prove the practicality of the developed protocol, analysis plates were created containing true unknown samples of hydrobenzoin generated by established Sharpless asymmetric dihydroxylation reactions, and the best ligand was correctly identified.
The system was extended to pattern recognition for the rapid determination of identity, concentration, and ee of chiral vicinal diols. A diverse enantioselective sensor array was generated with three chiral boronic acid receptors and pH indicators. The optical response produced by the sensor array was analyzed by two pattern recognition algorithms: principal component analysis (PCA) and ANNs. The PCA plot demonstrated good chemoselective and enantioselective separation of the analytes, and ANNs were used to accurately determine the concentration and ee of five unknown samples.
