Browsing by Subject "Structure learning"
Item Dirty statistical models (2012-05)
Jalali, Ali, 1982-; Sanghavi, Sujay Rajendra, 1979-; Caramanis, Constantine; Ghosh, Joydeep; Dhillon, Inderjit; Ravikumar, Pradeep
In fields across science and engineering, we are increasingly faced with problems where the number of variables or features we need to estimate is much larger than the number of observations. Under such high-dimensional scaling, for any hope of statistically consistent estimation, it becomes vital to leverage any potential structure in the problem, such as sparsity, low-rank structure or block sparsity. However, data may deviate significantly from any one such statistical model. The motivation of this thesis is: can we simultaneously leverage more than one such statistical structural model, to obtain consistency in a larger number of problems, and with fewer samples, than can be obtained by single models? Our approach combines such structures via simple linear superposition, a technique we term dirty models. The idea is very simple: while any one structure might not capture the data, a superposition of structural classes might. Dirty models thus search for a parameter that can be decomposed into a number of simpler structures such as (a) sparse plus block-sparse, (b) sparse plus low-rank and (c) low-rank plus block-sparse. In this thesis, we propose dirty-model-based algorithms for different problems such as multi-task learning, graph clustering and time-series analysis with latent factors. We analyze these algorithms in terms of the number of observations we need to estimate the variables. These algorithms are based on convex optimization and can be relatively slow. We provide a class of low-complexity greedy algorithms that not only solve these optimization problems faster, but also guarantee the solution.
In addition to theoretical results, in each case we provide experimental results to illustrate the power of dirty models.

Item Greedy structure learning of Markov Random Fields (2011-08)
Johnson, Christopher Carroll; Ravikumar, Pradeep; Dhillon, Inderjit
Probabilistic graphical models are used in a variety of domains to capture and represent general dependencies in joint probability distributions. In this document we examine the problem of learning the structure of an undirected graphical model, also called a Markov Random Field (MRF), given a set of independent and identically distributed (i.i.d.) samples. Specifically, we introduce an adaptive forward-backward greedy algorithm for learning the structure of a discrete, pairwise MRF given a high-dimensional set of i.i.d. samples. The algorithm works by greedily estimating the neighborhood of each node independently through a series of forward and backward steps. By imposing a restricted strong convexity condition on the structure of the learned graph, we show that the structure can be fully learned with high probability given $n=\Omega(d\log(p))$ samples, where $d$ is the dimension of the graph and $p$ is the number of nodes. This is a significant improvement over existing convex-optimization-based algorithms, which require a sample complexity of $n=\Omega(d^2\log(p))$ and a stronger irrepresentability condition. We further support these claims with an empirical comparison of the greedy algorithm to node-wise $\ell_1$-regularized logistic regression, and provide a real-data analysis of the greedy algorithm using the Audioscrobbler music listener dataset. The results of this document provide an additional representation of work submitted by A. Jalali, C. Johnson, and P. Ravikumar to NIPS 2011.

Item Learning with Markov logic networks: transfer learning, structure learning, and an application to Web query disambiguation (2009-08)
Mihalkova, Lilyana Simeonova; Mooney, Raymond J. (Raymond Joseph)
Traditionally, machine learning algorithms assume that training data is provided as a set of independent instances, each of which can be described as a feature vector. In contrast, many domains of interest are inherently multi-relational, consisting of entities connected by a rich set of relations. For example, the participants in a social network are linked by friendships, collaborations, and shared interests. Likewise, the users of a search engine are related by searches for similar items and clicks to shared sites. The ability to model and reason about such relations is essential not only because better predictive accuracy is achieved by exploiting this additional information, but also because frequently the goal is to predict whether a set of entities are related in a particular way. This thesis falls within the area of Statistical Relational Learning (SRL), which combines ideas from two traditions within artificial intelligence, first-order logic and probabilistic graphical models, to address the challenge of learning from multi-relational data. We build on one particular SRL model, Markov logic networks (MLNs), which consist of a set of weighted first-order-logic formulae and provide a principled way of defining a probability distribution over possible worlds. We develop algorithms for learning MLN structure both from scratch and by transferring a previously learned model, as well as an application of MLNs to the problem of Web query disambiguation. The ideas we present are unified by two main themes: the need to deal with limited training data and the use of bottom-up learning techniques. Structure learning, the task of automatically acquiring a set of dependencies among the relations in the domain, is a central problem in SRL. We introduce BUSL, an algorithm for learning MLN structure from scratch that proceeds in a more bottom-up fashion, breaking away from the tradition of top-down learning typical in SRL.
Our approach first constructs a novel data structure called a Markov network template that is used to restrict the search space for clauses. Our experiments in three relational domains demonstrate that BUSL dramatically reduces the search space for clauses and attains a significantly higher accuracy than a structure learner that follows a top-down approach. Accurate and efficient structure learning can also be achieved by transferring a model obtained in a source domain related to the current target domain of interest. We view transfer as a revision task and present an algorithm that diagnoses a source MLN to determine which of its parts transfer directly to the target domain and which need to be updated. This analysis focuses the search for revisions on the incorrect portions of the source structure, thus speeding up learning. Transfer learning is particularly important when target-domain data is limited, such as when data on only a few individuals is available from domains with hundreds of entities connected by a variety of relations. We also address this challenging case and develop a general transfer learning approach that makes effective use of such limited target data in several social network domains. Finally, we develop an application of MLNs to the problem of Web query disambiguation in a more privacy-aware setting where the only information available about a user is that captured in a short search session of 5-6 previous queries on average. This setting contrasts with previous work that typically assumes the availability of long user-specific search histories. To compensate for the scarcity of user-specific information, our approach exploits the relations between users, search terms, and URLs. We demonstrate the effectiveness of our approach in the presence of noise and show that it outperforms several natural baselines on a large data set collected from the MSN search engine.