Dirty statistical models
dc.contributor.advisor | Sanghavi, Sujay Rajendra, 1979- | en |
dc.contributor.committeeMember | Caramanis, Constantine | en |
dc.contributor.committeeMember | Ghosh, Joydeep | en |
dc.contributor.committeeMember | Dhillon, Inderjit | en |
dc.contributor.committeeMember | Ravikumar, Pradeep | en |
dc.creator | Jalali, Ali, 1982- | en |
dc.date.accessioned | 2012-07-11T17:38:51Z | en |
dc.date.accessioned | 2017-05-11T22:25:46Z | |
dc.date.available | 2012-07-11T17:38:51Z | en |
dc.date.available | 2017-05-11T22:25:46Z | |
dc.date.issued | 2012-05 | en |
dc.date.submitted | May 2012 | en |
dc.date.updated | 2012-07-11T17:39:22Z | en |
dc.description | text | en |
dc.description.abstract | In fields across science and engineering, we are increasingly faced with problems where the number of variables or features we need to estimate is much larger than the number of observations. Under such high-dimensional scaling, for any hope of statistically consistent estimation, it becomes vital to leverage any potential structure in the problem, such as sparsity, low-rank structure, or block sparsity. However, data may deviate significantly from any one such statistical model. The motivating question of this thesis is: can we simultaneously leverage more than one such structural model, to obtain consistency in a larger number of problems, and with fewer samples, than can be obtained with single models? Our approach combines multiple structural models via simple linear superposition, a technique we term dirty models. The idea is very simple: while no single structure might capture the data, a superposition of structural classes might. A dirty model thus searches for a parameter that can be decomposed into a number of simpler structures, such as (a) sparse plus block-sparse, (b) sparse plus low-rank, and (c) low-rank plus block-sparse. In this thesis, we propose dirty-model-based algorithms for different problems such as multi-task learning, graph clustering, and time-series analysis with latent factors. We analyze these algorithms in terms of the number of observations needed to estimate the variables consistently. These algorithms are based on convex optimization and can be relatively slow. We provide a class of low-complexity greedy algorithms that not only solve these optimization problems faster, but also come with guarantees on the solution. Beyond the theoretical results, in each case we provide experimental results to illustrate the power of dirty models. | en |
dc.description.department | Electrical and Computer Engineering | en |
dc.format.mimetype | application/pdf | en |
dc.identifier.slug | 2152/ETD-UT-2012-05-5088 | en |
dc.identifier.uri | http://hdl.handle.net/2152/ETD-UT-2012-05-5088 | en |
dc.language.iso | eng | en |
dc.subject | Structure learning | en |
dc.subject | Statistical inference | en |
dc.subject | Dirty models | en |
dc.subject | High-dimensional statistics | en |
dc.subject | Machine learning | en |
dc.subject | Sparse and low-rank decomposition | en |
dc.subject | Graph clustering | en |
dc.subject | Time series analysis | en |
dc.subject | Greedy dirty algorithms | en |
dc.title | Dirty statistical models | en |
dc.type.genre | thesis | en |
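
The sparse-plus-low-rank decomposition named in the abstract can be illustrated with a minimal sketch. This is not the thesis's own algorithm: it is a generic alternating proximal scheme for the convex objective 0.5·||M − S − L||²_F + λ·||S||₁ + μ·||L||₊ (nuclear norm on L), with illustrative regularization weights `lam` and `mu` chosen arbitrarily.

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Singular-value soft-thresholding: the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(soft_threshold(s, tau)) @ Vt

def sparse_plus_lowrank(M, lam=0.1, mu=1.0, iters=100):
    """Alternately minimize 0.5*||M - S - L||_F^2 + lam*||S||_1 + mu*||L||_*
    over a sparse component S and a low-rank component L.
    (Illustrative sketch only; lam/mu are hypothetical tuning parameters.)"""
    S = np.zeros_like(M)
    L = np.zeros_like(M)
    for _ in range(iters):
        S = soft_threshold(M - L, lam)   # prox step for the sparse part
        L = svd_threshold(M - S, mu)     # prox step for the low-rank part
    return S, L
```

On a matrix built as a rank-1 component plus a few large spikes, the alternation splits most of the energy between a low-rank `L` and a sparse `S`; the thesis's contribution is analyzing when such decompositions recover the true components and how many observations that requires.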