Browsing by Subject "Linear regression"

Accounting for multiple membership data in adolescent social networks : an analysis of simulated data
(2016-05) Peek, Jaclyn Kara; Beretvas, Susan Natasha; Powers, Daniel A.
Multilevel modeling allows for the modeling of nested structures such as students nested within middle schools and middle schools nested within high schools. These kinds of hierarchies are common in social science research. Pure hierarchies may exist, where one variable is completely nested within another. Multiple membership (MM) structures occur when some lower-level units are members of more than one higher-level clustering unit (e.g., a student attends more than one high school). An extension to the conventional multilevel model, the multiple membership random effects model (MMREM), can be used to handle MM data. We compare a random effects model with and without multiple membership effects to demonstrate the possible benefit of accounting for the MM structure. We replicate an existing study on student academic outcomes (Tranmer et al., 2013) that assumes a multiple membership data structure, and add a comparison to a non-MM (i.e., single-membership) model in order to assess the improvement in model fit. The original study investigated the effect of school, area, and social network membership in friendship dyads and triads on academic achievement in adolescents, with age, gender, and ethnicity as covariates. Our models retain the MM structure found in the original social network data. The original data are confidential and unavailable for use; therefore, a major component of this report is the simulation of this dataset in R. Results indicate that multiple membership does not necessarily lead to better goodness-of-fit as measured by DIC. Accounting for the MM data structure initially produced a worse-fitting model. Artificially inflating the fixed and random effects that generated the simulated academic performance outcome led to the opposite effect. We conclude that the scale of random effects is important in determining the DIC measure of fit, and propose a full simulation study to more conclusively test our original hypothesis.
Automatic regularization technique for the estimation of neural receptive fields
(2010-05) Park, Mijung; Vikalo, Haris; Pillow, Jonathan W.
A fundamental question in visual neuroscience is how visual stimuli are functionally related to neural responses. This relationship is often explained by the notion of a receptive field, an approximately linear or quasi-linear filter that encodes high-dimensional visual stimuli into neural spikes. Traditional methods for estimating the filter do not efficiently exploit prior information about the structure of neural receptive fields. Here, we propose several approaches to designing the prior distribution over the filter, based on the neurophysiological fact that receptive fields tend to be localized in both space-time and the spatio-temporal frequency domain. To automatically regularize the estimation of neural receptive fields, we use the evidence optimization technique, a form of maximum a posteriori (MAP) estimation under a prior distribution whose parameters are set by maximizing the marginal likelihood. Simulation results show that the proposed methods can estimate the receptive field using datasets that are tens to hundreds of times smaller than those required by traditional methods.
Development of linear capacitance-resistance models for characterizing waterflooded reservoirs
(2011-12) Kim, Jong Suk; Edgar, Thomas F.; Lake, Larry W.
The capacitance-resistance model (CRM) has been continuously improved and tested on both synthetic and real fields. For a large waterflood, with hundreds of injectors and producers present in a reservoir, tens of thousands of model parameters (gains, time constants, and productivity indices) in a field must be determined to completely define the CRM. In this case, obtaining a unique solution when history-matching large reservoirs by nonlinear regression is difficult. Moreover, this approach is more likely to produce parameters that are statistically insignificant. The nonlinear nature of the CRM also makes it difficult to quantify the uncertainty in model parameters. The analytical solutions of two linear reservoir models, the linearly transformed CRM whose control volume is the drainage volume around each producer (ltCRMP) and the integrated capacitance-resistance model (ICRM), are developed in this work. Both models are derived from the governing differential equation of the producer-based representation of the CRM (CRMP), which represents an in-situ material balance over the effective pore volume of a producer. The proposed methods use a constrained linear multivariate regression (LMR) to provide information about preferential permeability trends and fractures in a reservoir. The two models’ capabilities are validated with simulated data in several synthetic case studies. The ltCRMP and ICRM have the following advantages over the nonlinear waterflood model (CRMP): (1) convex objective functions, (2) no need for a solver when constraints are ignored, and (3) faster computation time in optimization. In both methods, a unique solution can always be obtained regardless of the number of parameters as long as the number of data points is greater than the number of unknowns (parameters). Methods for establishing the confidence limits on CRMP gains and ICRM parameters are demonstrated in this work.
This research also presents a method that uses the ICRM to estimate the gains between newly introduced injectors and existing producers for a homogeneous reservoir without additional simulations or regression on newly simulated data. This procedure can help geoscientists decide where to drill new injectors to increase future oil recovery, and it provides rapid solutions without running reservoir simulations for each scenario.
Fitting planar points with two lines
(2013-08) Jiang, Junhai; Martin, Clyde F.; Mansouri, Hossein
Linear regression has been studied extensively and has become a systematic and important branch of statistics. However, almost all of this work focuses on regression with a single line; only a few researchers have considered fitting scattered points with more than one line, for example, two lines. The key to this problem is finding a method that generates two lines that at least locally minimize the sum of squared distances from each point to its nearest line. The proposed method fits regression lines to two sets of points that are updated iteratively.
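The abstract does not spell out the update rule, so the following is only a minimal sketch of one plausible reading: alternate between fitting an ordinary least-squares line to each of two point sets and reassigning every point to its nearer line. The function names (`ols`, `fit_two_lines`, `total_cost`), the random initial split, and the use of vertical rather than perpendicular distances are all assumptions made for illustration, not details taken from the thesis.

```python
import random

def ols(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, mean_y - slope * mean_x

def sq_resid(line, p):
    """Squared vertical distance from point p to the line (assumed metric)."""
    slope, intercept = line
    return (p[1] - (slope * p[0] + intercept)) ** 2

def total_cost(lines, points):
    """Sum over points of the squared distance to the nearest line."""
    return sum(min(sq_resid(line, p) for line in lines) for p in points)

def fit_two_lines(points, n_iter=100, seed=0):
    """Alternate between refitting two lines and reassigning points."""
    rng = random.Random(seed)
    # Hypothetical initialisation: a random split into two sets.
    labels = [rng.randrange(2) for _ in points]
    # Both lines start as the single-line fit, so a set that is too
    # small to fit simply keeps its previous line.
    lines = [ols(points), ols(points)]
    for _ in range(n_iter):
        for k in (0, 1):
            group = [p for p, lab in zip(points, labels) if lab == k]
            if len(group) >= 2:
                lines[k] = ols(group)
        new_labels = [min((0, 1), key=lambda k: sq_resid(lines[k], p))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable: a local minimum
        labels = new_labels
    return lines, labels

# Example: points drawn exactly from two crossing lines.
pts = [(x, 2 * x + 1) for x in range(10)] + [(x, -x + 20) for x in range(10)]
lines, labels = fit_two_lines(pts)
```

Each refit and each reassignment can only lower the cost or leave it unchanged, so the procedure stops at a local minimum; which one it reaches depends on the initial split, consistent with the "at least locally" wording in the abstract.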
Linear estimation for data with error ellipses
(2012-05) Amen, Sally Kathleen; Powers, Daniel A.; Robinson, Edward L.
When scientists collect data to be analyzed, regardless of what quantities are being measured, there are inevitably errors in the measurements. In cases where two independent variables are measured with errors, many existing techniques can produce an estimated least-squares linear fit to the data, taking into consideration the size of the errors in both variables. Yet some experiments yield data that contain not only errors in both variables but also a non-zero covariance between the errors. In such situations, the experiment results in measurements with error ellipses whose tilts are specified by the covariance terms. Following an approach suggested by Dr. Edward Robinson, Professor of Astronomy at the University of Texas at Austin, this report describes a methodology that finds the estimates of linear regression parameters, as well as an estimated covariance matrix, for a dataset with tilted error ellipses. An appendix contains the R code for a program that produces these estimates according to the methodology. The report also describes the results of running the program on a dataset of measurements of the surface brightness and Sérsic index of galaxies in the Virgo cluster.
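The report's own methodology and R code are not reproduced in this listing. As a rough illustration of the general problem only, the sketch below uses a standard effective-variance chi-square, in which the residual y - a - b*x of each point is weighted by the inverse of var_y + b^2 var_x - 2 b cov_xy. This is a swapped-in textbook technique, not necessarily the approach described in the report, and the names `chi2_for_slope` and `fit_line`, the data layout, and the search settings are hypothetical. It returns point estimates only and does not compute the covariance matrix of the parameters.

```python
import math

def chi2_for_slope(b, data):
    """Chi-square at slope b with the best intercept for that slope.

    Each row of data is (x, y, var_x, var_y, cov_xy).
    """
    # Variance of the residual y - a - b*x for correlated x and y errors.
    weights = [1.0 / (row[3] + b * b * row[2] - 2.0 * b * row[4])
               for row in data]
    wsum = sum(weights)
    # For a fixed slope, the best intercept is a weighted mean.
    a = sum(w * (row[1] - b * row[0]) for w, row in zip(weights, data)) / wsum
    chi2 = sum(w * (row[1] - a - b * row[0]) ** 2
               for w, row in zip(weights, data))
    return chi2, a

def fit_line(data, n_grid=2000, n_refine=60):
    """Return (intercept, slope) minimising the effective-variance chi-square."""
    # Coarse scan over the slope angle so steep lines are covered too.
    step = math.pi / (n_grid + 1)
    angles = [-math.pi / 2 + step * (i + 1) for i in range(n_grid)]
    best = min(angles, key=lambda t: chi2_for_slope(math.tan(t), data)[0])
    # Golden-section refinement around the best grid angle.
    lo, hi = best - step, best + step
    ratio = (math.sqrt(5) - 1) / 2
    for _ in range(n_refine):
        m1 = hi - ratio * (hi - lo)
        m2 = lo + ratio * (hi - lo)
        f1 = chi2_for_slope(math.tan(m1), data)[0]
        f2 = chi2_for_slope(math.tan(m2), data)[0]
        if f1 < f2:
            hi = m2
        else:
            lo = m1
    b = math.tan((lo + hi) / 2)
    return chi2_for_slope(b, data)[1], b

# Example: noise-free points on y = 1 + 2x, each with the same tilted ellipse.
data = [(float(x), 1.0 + 2.0 * x, 0.04, 0.09, 0.03) for x in range(6)]
a, b = fit_line(data)
```

The slope is searched numerically because the weights themselves depend on it, so there is no closed-form solution as in ordinary least squares; the intercept, by contrast, has a closed form once the slope is fixed.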
