Convex Versions Of Multivariate Adaptive Regression Splines And Implementations For Complex Optimization Problems
Shih, Dachuan Thomas
MetadataShow full item record
Multivariate Adaptive Regression Splines (MARS) provide a flexible statistical modeling method that employs forward and backward search algorithms to identify the combination of basis functions that best fits the data. In optimization, MARS has been used successfully to estimate the value function in stochastic dynamic programming, and MARS could be potentially useful in many real world optimization problems where objective (or other) functions need to be estimated from data, such as in simulation optimization. Many optimization methods depend on convexity, but a nonconvex MARS approximation is inherently possible because interaction terms are products of univariate terms. In this dissertation, convex versions of MARS are proposed. In order to ensure MARS convexity, two major modifications are made: (1) coefficients are constrained such that pairs of basis functions are guaranteed to jointly form convex functions; (2) The form of interaction terms is appropriately changed. Finally, MARS convexity can be achieved by the fact that the sum of convex functions is convex. The implementation of MARS for approximating complex optimization functions can involve hundreds to thousands of state or decision variables. In particular, this research studies application to an inventory forecasting stochastic dynamic programming problem and an airline fleet assignment problem. Although one can simply attempt a MARS approximation over all the variables, prior research on the fleet assignment application indicates that many variables have little effect on the objective. Thus, a data mining step to conduct variable selection is needed. This step separates potentially critical variables from clearly redundant ones. In this dissertation, variants of two data mining tools are explored separately and in combination for variable selection: regression trees and multiple procedures based on false discovery rate.