Browsing by Subject "Spatial statistics"

Now showing 1 - 7 of 7

Analyses of relationships of human West Nile virus, confined livestock operations, and playa lakes in the Texas Panhandle and South Plains region
(2010-12) Stephens, Christena; Presley, Steven; Dixon, Ken; Gao, Weimin; Salice, Christopher J.
A total of 432 human West Nile virus (WNV) cases, with 28 fatalities, occurred in 41 counties of the Panhandle and South Plains region from 2002 to 2008. Of significant interest was determining whether these WNV cases were spatially clustered near two major ecological and economic features of the region: playa lakes and confined livestock operations (CLOs). Another research interest was to identify chemicals used in mosquito control in regional cities to determine whether mosquito control increased during the years of the initial WNV outbreak in the region. An important role of spatial statistics is to account for spatial dependence and search for spatial patterns in geographical data. Cluster investigations have long been an important tool in epidemiology and spatial statistics. To quantify clustering of WNV prevalence around CLOs and playa lakes in the region, SaTScan™ and ArcGIS™ were used in conjunction. Spatial clustering results indicate that spatial correlation and dependence exist in the geographical data among human WNV cases, beef cattle operations, and playa lakes. Malathion was identified as the most common pesticide used in the region from 2002 to 2009.
Applications of spatial autocorrelation
(2011-08) Prematilake, Chalani C.; Hadjicostas, Petros; Ruymgaart, Frits; Toda, Magdalena D.
Just as we use time series analysis to study data with respect to their time of occurrence, we can use spatial statistics to study data with respect to their locations of occurrence in space. Even though the history of spatial probabilistic analysis goes back to the 18th century (Buffon's needle problem), the first serious attempt to study spatial statistics was made at the beginning of the 20th century (Student, 1907). This study examines measures of overall spatial autocorrelation or association. The word "autocorrelation" means the correlation of a variable with itself (over time or over space, or both). According to Griffith (1987), the quality and quantity of information contained in spatial data are reflected in spatial autocorrelation. For example, in the case of a numerical variable of interest, if most pairs of neighbouring localities have values of the variable both above the average or both below the average, then spatial autocorrelation tends to be "large" in some way (above a certain number), while if, on the other hand, for most pairs of neighbouring localities, one locality has a value of the variable above the average and the other has a value below the average, then the autocorrelation measure tends to be "small" (below a certain number). The study of spatial statistics takes different forms according to the kind of data used. When the data are nominal categorical, we can use join counts as measures of spatial association. For example, we can find the number of neighbouring localities that are of the same "type" (category of the nominal variable) and the number of neighbouring localities of different "types." Moran was one of the first authors to study join count statistics. He also calculated the moments of join counts in 1948. Similar studies were carried out by P. V. Krishna Iyer in 1949 and 1950 and by Florence Nightingale David in 1971.
They both came out with similar results but in different experimental environments. In the second chapter of the thesis, we review some of the work of Moran on join counts and their moments. Spatial autocorrelation of numerical data is usually measured using Moran's I coefficient and Geary's ratio c (introduced in 1950 and 1954, respectively). In the third chapter of this thesis, we review some of the probabilistic properties of these spatial autocorrelation coefficients, which show how a variable is correlated with itself over space. We use the statistical packages R and SAS to calculate and apply the above statistics to some examples with spatial data. In addition, we show the connection of the join count statistics with Moran's I coefficient and Geary's ratio c, which is probably one of the new contributions in this thesis. Throughout most of the thesis, we show (using modern notation) the randomization properties of some of the above spatial statistics; that is, we review Moran's and Geary's calculations on the probabilistic behaviour of these statistics by conditioning on the observed data (values of the variable of interest in different localities), but not on the order in which they appear. In other words, we show how statisticians in spatial statistics derive the distribution of some autocorrelation statistics under the non-free sampling scenario.
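As an illustration of the two coefficients named in this abstract, the following is a minimal Python sketch (the thesis itself uses R and SAS) of the standard textbook forms of Moran's I and Geary's c. The function names, the binary adjacency weights, and the four-locality chain layout are assumptions made for this example; none of them come from the thesis.

```python
def morans_i(values, weights):
    """Moran's I: (n / S0) * sum_ij w_ij * z_i * z_j / sum_i z_i^2, with z_i = x_i - mean."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight over all ordered pairs
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def gearys_c(values, weights):
    """Geary's c: ((n - 1) / (2 * S0)) * sum_ij w_ij * (x_i - x_j)^2 / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    s0 = sum(sum(row) for row in weights)
    sq_diff = sum(weights[i][j] * (values[i] - values[j]) ** 2
                  for i in range(n) for j in range(n))
    return ((n - 1) / (2 * s0)) * sq_diff / sum((v - mean) ** 2 for v in values)


def chain_weights(n):
    """Binary symmetric adjacency for n localities in a row (hypothetical layout)."""
    w = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1
    return w


if __name__ == "__main__":
    w = chain_weights(4)
    smooth = [1, 2, 3, 4]       # neighbours have similar values
    alternating = [1, 4, 1, 4]  # neighbours fall on opposite sides of the mean
    print(morans_i(smooth, w), gearys_c(smooth, w))
    print(morans_i(alternating, w), gearys_c(alternating, w))
```

On the smooth sequence Moran's I is positive and Geary's c is below 1, while on the alternating sequence Moran's I is negative and Geary's c is above 1, which matches the "large" versus "small" behaviour the abstract describes.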
Occurrence of Multiple Fluid Phases Across a Basin, in the Same Shale Gas Formation – Eagle Ford Shale Example
(2014-04-29) Tian, Yao
Shale gas and oil are playing a significant role in US energy independence by reversing declining production trends. Successful exploration and development of the Eagle Ford Shale Play requires reservoir characterization, recognition of fluid regions, and the application of optimal operational practices in all regions. Using stratigraphic and petrophysical analyses, we evaluated key parameters of the Lower Eagle Ford Shale: reservoir depth and thickness, fluid composition, reservoir pressure, total organic carbon (TOC), and number of limestone and organic-rich marl interbeds. Spatial statistics were used to identify key reservoir parameters affecting Eagle Ford production. We built reservoir models of various fluid regions and history matched production data. Well deliverability was modeled to optimize oil production rate by designing appropriate operational parameters. From NW to SE, Eagle Ford fluids evolve from oil, to gas condensate and, finally, to dry gas, reflecting greater depth and thermal maturity. From outcrop, the Eagle Ford Shale dips southeastward; depth exceeds 13,000 ft at the Sligo Shelf Margin. We divided the Eagle Ford Shale into three layers. The Lower Eagle Ford is present throughout the study area; it is more than 275 ft thick in the Maverick Basin depocenter and thins to less than 50 ft to the northeast. In the Lower Eagle Ford Shale, a strike-elongate trend of high TOC, high average gamma ray values, and low bulk density extends from Maverick Co. northeastward through Guadalupe Co. Both limestone and organic-rich marl beds increase in number from fewer than 2 near outcrop to more than 20 at the shelf margins. Average thicknesses of Lower Eagle Ford limestone and organic-rich marl beds are low (< 5 ft) in the La Salle – DeWitt trend, coincident with the most productive gas and oil wells.
The Eagle Ford Shale was divided into 5 production regions in South Texas that coincide with the regional, strike-elongate trends of geologic parameters, which suggests that these parameters significantly impact Eagle Ford Shale production. Eagle Ford Shale production (barrels of oil equivalent, BOE) increases consistently with depth, increases with Lower Eagle Ford thickness (up to 180-ft thickness), and increases with TOC (up to 7%). P-value analyses suggest high certainty in the relationship between production and the five reservoir parameters tested in regression models. Multiple good history matches of a gas condensate well suggest significant uncertainties in reservoir parameters. Oil production rate is not sensitive to oil relative permeability for the gas condensate well model. We were unable to match the production history for the volatile oil wells, possibly because of gas lift. Reservoir modeling suggests that low bottomhole flowing pressure was the key to optimizing cumulative oil production. Concepts and models developed in this study may assist operators in making critical Eagle Ford Shale development decisions; they may be transferable to other shale plays.
Spatial interpolation with Gaussian processes and spatially varying regression coefficients
(2015-08) Mitchell, Daniel Lewis; Keitt, Timothy H.; Scott, James G
Linear regression is undoubtedly one of the most widely used statistical techniques. However, because it assumes independent observations, it can miss important features of a dataset when observations are spatially dependent. This report presents the spatially varying coefficients model, which augments a linear regression with a multivariate Gaussian spatial process to allow regression coefficients to vary over the spatial domain of interest. We develop the mathematics of Gaussian processes and illustrate their use, and demonstrate the spatially varying coefficients model on simulated data. We show that it achieves lower prediction error and a better fit to data than a standard linear regression.
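To make the Gaussian-process interpolation idea in this abstract concrete, here is a minimal one-dimensional Python sketch of the standard GP posterior mean and variance. It is not the report's spatially varying coefficients model: the squared-exponential kernel, the unit hyperparameters, the small noise term, and all function names are assumptions chosen for illustration.

```python
import math


def sq_exp_kernel(a, b, variance=1.0, length_scale=1.0):
    """Squared-exponential covariance between two 1-D locations (assumed kernel)."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_predict(train_x, train_y, query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single query location."""
    k_train = [[sq_exp_kernel(a, b) + (noise if i == j else 0.0)
                for j, b in enumerate(train_x)]
               for i, a in enumerate(train_x)]
    k_query = [sq_exp_kernel(query, a) for a in train_x]
    alpha = solve(k_train, train_y)   # K^{-1} y
    mean = sum(k * a for k, a in zip(k_query, alpha))
    v = solve(k_train, k_query)       # K^{-1} k_*
    variance = sq_exp_kernel(query, query) - sum(k * w for k, w in zip(k_query, v))
    return mean, variance


if __name__ == "__main__":
    xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.5]
    for q in (1.0, 1.5, 10.0):
        print(q, gp_predict(xs, ys, q))
```

At a training location the posterior mean reproduces the observed value and the variance is close to zero. Far from the data the mean falls back to the prior mean of zero and the variance approaches the prior variance.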
Testing for spatial correlation and semiparametric spatial modeling of binary outcomes with application to aberrant crypt foci in colon carcinogenesis experiments
(Texas A&M University, 2005-11-01) Apanasovich, Tatiyana Vladimirovna
In an experiment to understand colon carcinogenesis, all animals were exposed to a carcinogen while half the animals were also exposed to radiation. Spatially, we measured the existence of aberrant crypt foci (ACF), namely morphologically changed colonic crypts that are known to be precursors of colon cancer development. The biological question of interest is whether the locations of these ACFs are spatially correlated: if so, this indicates that damage to the colon due to carcinogens and radiation is localized. Statistically, the data take the form of binary outcomes (corresponding to the existence of an ACF) on a regular grid. We develop score-type methods based upon the Matern and conditionally autoregressive (CAR) correlation models to test for the spatial correlation in such data, while allowing for nonstationarity. Because of a technical peculiarity of the score-type test, we also develop robust versions of the method. The methods are compared to a generalization of Moran's test for continuous outcomes, and are shown via simulation to have the potential for increased power. When applied to our data, the methods indicate the existence of spatial correlation, and hence indicate localization of damage. Assuming that there are correlations in the locations of the ACF, the questions are how great these correlations are, and whether the correlation structures differ when an animal is exposed to radiation. To understand the extent of the correlation, we cast the problem as a spatial binary regression, where binary responses arise from an underlying Gaussian latent process. We model these marginal probabilities of ACF semiparametrically, using fixed-knot penalized regression splines and single-index models. We fit the models using pairwise pseudolikelihood methods. Assuming that the underlying latent process is strongly mixing, known to be the case for many Gaussian processes, we prove asymptotic normality of the methods.
The penalized regression splines have penalty parameters that must converge to zero asymptotically: we derive rates for these parameters that do and do not lead to an asymptotic bias, and we derive the optimal rate of convergence for them. Finally, we apply the methods to the data from our experiment.
The application of mathematical ecology and spatial clustering analysis techniques to hailpad measurements
(Texas Tech University, 2008-05) Jones, Keith L.; Matis, Timothy I.; Collins, Terry R.; Smith, Milton L.; Li, Guigen; Zhang, Hong-Chao
The purpose of this study is to evaluate the accuracy of the assumption of randomness with regard to the application of the Poisson distribution to impacts on a hailpad. This research applies a number of methods for the exploration and modelling of spatial point patterns. These methods go beyond the conventional nearest-neighbour and quadrat analyses typical of mathematical ecology, which fail to allow for spatial variation in population density. First- and second-order properties will be investigated using both distance-based and area-based techniques. The R language, an environment for statistical computing and graphics, will be used to perform the analysis.
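For readers unfamiliar with the conventional quadrat analysis this abstract refers to, the following is a minimal Python sketch (the thesis itself uses R) of the textbook index-of-dispersion check against a homogeneous Poisson pattern. The function names, the unit-square pad, and the 2 x 2 grid are assumptions made for this example. As the abstract notes, this baseline does not allow for spatial variation in density.

```python
def quadrat_counts(points, width, height, nx, ny):
    """Count points falling in each cell of an nx-by-ny grid over the pad."""
    counts = [0] * (nx * ny)
    for x, y in points:
        col = min(int(x / width * nx), nx - 1)   # clamp points on the far edge
        row = min(int(y / height * ny), ny - 1)
        counts[row * nx + col] += 1
    return counts


def index_of_dispersion(counts):
    """Sample variance-to-mean ratio of quadrat counts.

    Under a homogeneous Poisson pattern the ratio is near 1, and
    (m - 1) * ratio is approximately chi-square with m - 1 degrees of
    freedom, where m is the number of quadrats.
    """
    m = len(counts)
    mean = sum(counts) / m
    var = sum((c - mean) ** 2 for c in counts) / (m - 1)
    return var / mean


if __name__ == "__main__":
    # Hypothetical impact locations on a unit-square pad.
    regular = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
    clustered = [(0.1, 0.1), (0.2, 0.1), (0.1, 0.2), (0.2, 0.2)]
    for name, pts in (("regular", regular), ("clustered", clustered)):
        counts = quadrat_counts(pts, 1.0, 1.0, 2, 2)
        print(name, counts, index_of_dispersion(counts))
```

A ratio well below 1 suggests a more regular pattern than Poisson, and a ratio well above 1 suggests clustering. With only four points these toy patterns are for illustration only and carry no statistical weight.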
Transfer learning for classification of spatially varying data
(2010-08) Jun, Goo; Ghosh, Joydeep; Aggarwal, J. K.; Crawford, Melba M.; Caramanis, Constantine; Sanghavi, Sujay; Grauman, Kristen
Many real-world datasets have spatial components that provide valuable information about characteristics of the data. In this dissertation, a novel framework for adaptive models that exploit spatial information in data is proposed. The proposed framework is mainly based on development and applications of Gaussian processes. First, a supervised learning method is proposed for the classification of hyperspectral data with spatially adaptive model parameters. The proposed algorithm models spatially varying means of each spectral band of a given class using a Gaussian process regression model. For a given location, the predictive distribution of a given class is modeled by a multivariate Gaussian distribution with spatially adjusted parameters obtained from the proposed algorithm. The Gaussian process model is generally regarded as a good tool for interpolation, but not for extrapolation. Moreover, the uncertainty of the predictive distribution increases as the distance from the training instances increases. To overcome this problem, a semi-supervised learning algorithm is presented for the classification of hyperspectral data with spatially adaptive model parameters. This algorithm fits the test data with a spatially adaptive mixture-of-Gaussians model, where the spatially varying parameters of each component are obtained by Gaussian process regressions with soft memberships using the mixture-of-Gaussian-processes model. The proposed semi-supervised algorithm assumes a transductive setting, where the unlabeled data is considered to be similar to the training data. This is not true in general, however, since one may not know how many classes may exist in the unexplored regions. A spatially adaptive nonparametric Bayesian framework is therefore proposed by applying spatially adaptive mechanisms to the mixture model with infinitely many components.
In this method, each component in the mixture has spatially adapted parameters estimated by Gaussian process regressions, and spatial correlations between indicator variables are also considered. In addition to land cover and land use classification applications based on hyperspectral imagery, the Gaussian process-based spatio-temporal model is also applied to predict ground-based aerosol optical depth measurements from satellite multispectral images, and to select the most informative ground-based sites by active learning. In this application, heterogeneous features with spatial and temporal information are incorporated together by employing a set of covariance functions, and it is shown that the spatio-temporal information exploited in this manner substantially improves the regression model. The conventional meaning of spatial information usually refers to actual spatio-temporal locations in the physical world. In the final chapter of this dissertation, the meaning of spatial information is generalized to the parametrized low-dimensional representation of data in feature space, and a corresponding spatial modeling technique is exploited to develop a nearest-manifold classification algorithm.
