Browsing by Subject "Stochastic Approximation Monte Carlo"
Now showing 1 - 3 of 3
Item: Combining Strategies for Parallel Stochastic Approximation Monte Carlo Algorithm of Big Data (2014-10-15)
Lin, Fang-Yu

Modeling and mining with massive volumes of data have become popular in recent decades. However, such data are often too large to analyze on a single commodity computer, so parallel computing is widely used. A natural methodology for parallel computing is the divide-and-combine (D&C) method, which typically runs an MCMC algorithm on each divided data set. However, MCMC algorithms are computationally expensive because they require a large number of iterations and are prone to becoming trapped in local optima. In contrast, the Stochastic Approximation Monte Carlo (SAMC) algorithm, which is well developed in both theory and applications, can avoid entrapment in local optima and produce more accurate estimates than conventional MCMC algorithms. Motivated by the success of SAMC, we propose a parallel SAMC algorithm that can be applied to massive data and is workable in parallel computing; it can also be used for model selection and optimization problems. The main challenge for the parallel SAMC algorithm is how to combine the results from the parallel subsets, and in this work three strategies for doing so are proposed. Simulation results show that these strategies yield significant time savings and accurate estimation. Synthetic Aperture Radar Interferometry (InSAR) is a technique for analyzing deformation caused by geophysical processes, but it is limited by signal losses arising from topographic residuals. In order to analyze surface deformation, we must distinguish these signal losses, yet many methods assume the noise has a second-order stationary structure without testing that assumption.
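The divide-and-combine scheme described above can be sketched in a few lines. This is a hedged illustration only: the per-subset estimator here is a plain sample mean standing in for a SAMC-based analysis, and the size-weighted average is just one simple combining rule, not one of the thesis's three strategies.

```python
import numpy as np

def subset_estimate(chunk):
    # Stand-in for the per-subset sampler: in the thesis each subset
    # would be explored with SAMC; here we just use the subset mean.
    return float(np.mean(chunk))

def divide_and_combine(data, n_subsets=4):
    """Divide-and-combine: split the data, estimate on each subset
    (each call would run on its own worker in a parallel setting),
    then combine the subset estimates with a size-weighted average."""
    chunks = np.array_split(np.asarray(data, dtype=float), n_subsets)
    estimates = [subset_estimate(c) for c in chunks]
    sizes = [len(c) for c in chunks]
    return float(np.average(estimates, weights=sizes))

data = np.random.default_rng(0).normal(loc=2.0, scale=1.0, size=10_000)
combined = divide_and_combine(data)  # matches the full-data mean here
```

For this toy combine rule the size-weighted average of subset means recovers the full-data mean exactly; the interesting combining questions arise for estimators that do not decompose so cleanly.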
The objective of this study is to examine the second-order stationarity assumption for InSAR noise and to develop a parametric nonstationary model in order to demonstrate the effect of an incorrect assumption about the random field. The results indicate that a wrong stationarity assumption leads to biased estimation and large variance.

Item: Protein folding and phylogenetic tree reconstruction using stochastic approximation Monte Carlo (Texas A&M University, 2007-09-17)
Cheon, Sooyoung

Recently, the stochastic approximation Monte Carlo algorithm was proposed by Liang et al. (2005) as a general-purpose stochastic optimization and simulation algorithm. An annealing version of this algorithm was developed for real small-protein folding problems. The numerical results indicate that it outperforms simulated annealing and conventional Monte Carlo algorithms as a stochastic optimization algorithm. We also propose a method for using secondary structures in protein folding; the predicted protein structures are quite close to the true structures. Phylogenetic trees have long been used in biology to graphically represent evolutionary relationships among species and genes, and an understanding of these relationships is critical to the appropriate interpretation of bioinformatics results. We developed a method that exploits the sequential structure of phylogenetic trees in conjunction with stochastic approximation Monte Carlo for phylogenetic tree reconstruction.
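The SAMC recursion that underlies these applications can be illustrated on a toy discrete state space. This is a minimal sketch, not Liang et al.'s full algorithm: each state is treated as its own subregion, the desired visiting distribution is taken to be uniform, and the gain sequence t0/max(t0, t) is a common but assumed choice.

```python
import numpy as np

def samc_discrete(energies, n_iter=50_000, t0=1000, seed=1):
    """Toy SAMC on a discrete state space. The self-adjusting log-weights
    theta penalize over-visited states, which is what lets the sampler
    climb out of deep local energy minima and visit every region."""
    m = len(energies)
    pi = np.full(m, 1.0 / m)      # desired visiting frequencies
    theta = np.zeros(m)           # adaptive log-weights
    counts = np.zeros(m)
    rng = np.random.default_rng(seed)
    x = 0
    for t in range(1, n_iter + 1):
        y = rng.integers(m)       # symmetric uniform proposal
        # MH ratio for the weighted target exp(-E(x) - theta_x):
        log_r = (energies[x] + theta[x]) - (energies[y] + theta[y])
        if np.log(rng.random()) < log_r:
            x = y
        gamma = t0 / max(t0, t)   # decreasing gain sequence
        e = np.zeros(m)
        e[x] = 1.0
        theta += gamma * (e - pi)  # SAMC weight update
        counts[x] += 1
    return counts / n_iter

# Two deep minima (energy 0.0 and 0.5) separated by high barriers:
freqs = samc_discrete(np.array([0.0, 8.0, 1.0, 8.0, 0.5]))
```

Despite the large energy gaps, the realized visiting frequencies come out close to uniform, which a fixed-temperature Metropolis chain on the same landscape would not achieve.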
The numerical results indicate that it is capable of escaping local traps and converges much faster to the global likelihood maximum than other phylogenetic tree reconstruction methods, such as BAMBE and MrBayes.

Item: Statistical Inference for Models with Intractable Normalizing Constants (2011-06-27)
Jin, Ick Hoon

In this dissertation, we propose two new algorithms for statistical inference for models with intractable normalizing constants: the Monte Carlo Metropolis-Hastings (MCMH) algorithm and the Bayesian Stochastic Approximation Monte Carlo (BSAMC) algorithm. The MCMH algorithm is a Monte Carlo version of the Metropolis-Hastings algorithm: at each iteration, it replaces the unknown normalizing-constant ratio by a Monte Carlo estimate. Although the algorithm violates the detailed balance condition, it still converges, as shown in the dissertation, to the desired target distribution under mild conditions. The BSAMC algorithm works by simulating from a sequence of approximated distributions using the SAMC algorithm. A strong law of large numbers has been established for BSAMC estimators under mild conditions. One significant advantage of our algorithms over auxiliary-variable MCMC methods is that they avoid the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is unavailable or very expensive. In addition, although a normalizing-constant approximation is also involved in BSAMC, the algorithm is robust to initial parameter guesses owing to SAMC's powerful sample-space exploration. BSAMC also provides a general framework for approximate Bayesian inference for models whose likelihood function is intractable: sampling from a sequence of approximated distributions whose average converges to the target distribution.
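The MCMH idea of replacing the unknown normalizing-constant ratio with a Monte Carlo estimate inside the Metropolis-Hastings ratio can be sketched on a toy model f(x | theta) proportional to exp(-theta x) on x > 0, where Z(theta) = 1/theta is pretended unknown. All concrete choices below (flat prior on theta > 0, random-walk proposal, auxiliary sample size) are illustrative assumptions, not the dissertation's settings.

```python
import numpy as np

def mcmh_exponential(x_obs, n_iter=5000, n_aux=500, seed=2):
    """Toy MCMH sampler for theta given iid data from f(x|theta).
    The identity E_theta[exp(-(theta'-theta) X)] = Z(theta')/Z(theta)
    lets us estimate the constant ratio from auxiliary draws."""
    rng = np.random.default_rng(seed)
    n = len(x_obs)
    s = float(np.sum(x_obs))
    theta = 1.0
    chain = []
    for _ in range(n_iter):
        prop = abs(theta + 0.3 * rng.normal())   # positive random walk
        # Auxiliary draws from the current model (easy here; in real
        # applications these would themselves come from an MCMC run):
        y = rng.exponential(1.0 / theta, size=n_aux)
        z_ratio = np.mean(np.exp(-(prop - theta) * y))  # est. Z(prop)/Z(theta)
        # Log acceptance ratio with the estimated constant ratio plugged in:
        log_alpha = -(prop - theta) * s - n * np.log(z_ratio)
        if np.log(rng.random()) < log_alpha:
            theta = prop
        chain.append(theta)
    return np.array(chain)

x_obs = np.array([0.5, 0.8, 0.4, 0.6, 0.7])
chain = mcmh_exponential(x_obs)
```

With a flat prior the exact posterior here is Gamma(n + 1, sum(x)), so the chain should concentrate around (n + 1)/sum(x); the plug-in estimate makes each acceptance decision noisy, which is exactly why detailed balance no longer holds.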
With these two algorithms illustrated, we have demonstrated how the SAMCMC method can be applied to estimate the parameters of exponential random graph models (ERGMs), a typical example of statistical models with intractable normalizing constants. We showed that the resulting estimator is consistent, asymptotically normal, and asymptotically efficient. Compared with the MCMLE and SSA methods, a significant advantage of SAMCMC is that it overcomes the model degeneracy problem. The strength of SAMCMC comes from its varying-truncation mechanism, which enables it to escape model degeneracy through re-initialization. MCMLE and SSA lack this re-initialization mechanism and tend to converge to a solution near the starting point, so they often fail for models that suffer from model degeneracy.
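The varying-truncation mechanism credited above for avoiding model degeneracy can be caricatured in a few lines; the re-initialization point (zero) and the widening factor are hypothetical choices for illustration only.

```python
def varying_truncation_update(theta, update, bounds, level, widen=2.0):
    """Sketch of a varying-truncation step: apply a stochastic
    approximation update, and if the iterate escapes the current
    truncation set, re-initialize it and enlarge the set rather than
    letting the recursion diverge near a degenerate solution."""
    candidate = theta + update
    lo, hi = bounds
    if lo <= candidate <= hi:
        return candidate, bounds, level          # ordinary update
    # Escape: re-initialize and widen the truncation bounds.
    return 0.0, (lo * widen, hi * widen), level + 1
```

The re-initialization is what distinguishes this scheme from MCMLE- or SSA-style recursions, which keep iterating from wherever the last update left them.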