Browsing by Subject "Unsupervised learning"
Now showing 1 - 6 of 6
Item Application of Information Theoretic Unsupervised Learning to Medical Image Analysis (2013-05)
Hill, Jason E; Mitra, Sunanda; Nutter, Brian
Automated segmentation of medical images is a challenging problem. The number of segments in a medical image may be unknown a priori, due to the presence or absence of pathological anomalies. Some unsupervised learning techniques that take advantage of information theory concepts may provide a solid approach to the solution of this problem. To this end, there has been the recent development of the Improved “Jump” Method (IJM), a technique that efficiently finds a suitable number of clusters representing different tissue characteristics in a medical image. The IJM works by optimizing an objective function, the margin, that quantifies the quality of particular cluster configurations. Recent developments involving interesting relationships between Spectral Clustering (SC) and kernel Principal Component Analysis (kPCA) are used by the implementation of the IJM to cover the non-linear domain. In this novel SC approach the data is mapped to a new space where the points belonging to the same cluster are collinear if the parameters of a Radial Basis Function (RBF) kernel are adequately selected. After projecting these points onto the unit sphere, IJM measures the quality of different cluster configurations, yielding an algorithm that simultaneously selects the number of clusters and the RBF kernel parameter. Validation of this method is sought via segmentation of MR brain images in a combination of all major modalities. Such labeled MRI datasets serve as benchmarks for any segmentation algorithm. The effectiveness of the nonlinear IJM is demonstrated in the segmentation of uterine cervix color images for early identification of cervical neoplasia, as an aid to cervical cancer diagnosis.
Limitations of the current implementation of IJM are encountered when attempting to segment MR brain images with multiple sclerosis (MS) lesions. These limitations and a strategy to overcome them are discussed. Finally, an outlook on applying this method to the segmentation of cells in Pap smear test micrographs is laid out.

Item Energy storage-aware prediction/control for mobile systems with unstructured loads (2013-08)
LeSage, Jonathan Robert, 1985-; Longoria, Raul G.
Mobile systems, such as ground robots and electric vehicles, inherently operate in stochastic environments where load demands are largely unknown. Onboard energy storage, most commonly an electrochemical battery system, can significantly constrain operation. As such, mission planning and control of mobile systems can benefit from a priori knowledge about battery dynamics and constraints, especially the rate-capacity and recovery effects. To help overcome overly conservative predictions common with most existing battery remaining run-time algorithms, a prediction scheme was proposed. For characterization of a priori unknown power loads, an unsupervised Gaussian mixture routine identifies/clusters the measured power loads, and a jump-Markov chain characterizes the load transients. With the jump-Markov load forecasts, a model-based particle filter scheme predicts battery remaining run-time. Monte Carlo simulation studies demonstrate the marked improvement of the proposed technique. It was found that the increase in computational complexity from using a particle filter was justified for power load transient jumps greater than 13.4% of total system power. A multivariable reliability method was developed to assess the feasibility of a planned mission. The probability of mission completion is computed as the reliability integral of mission time exceeding the battery run-time.
Because these random variables are inherently dependent, a bivariate characterization was necessary, and a method is presented for online estimation of the process correlation via Bayesian updating. Finally, to abate transient shutdown of mobile systems, a model predictive control scheme is proposed that enforces battery terminal voltage constraints under stochastic loading conditions. A Monte Carlo simulation study of a small ground vehicle indicated significant improvement in both time and distance traveled as a result. For evaluation of the proposed methodologies, a laboratory terrain environment was designed and constructed for repeated mobile system discharge studies. The test environment consists of three distinct terrains. For each discharge study, a small unmanned ground vehicle traversed the stochastic terrain environment until battery exhaustion. Results from field tests with a Packbot ground vehicle in generic desert terrain were also used. Evaluation of the proposed prediction algorithms using the experimental studies, via relative accuracy and α-λ prognostic metrics, indicated significant gains over existing methods.

Item The Fern algorithm for intelligent discretization (2012-08)
Hall, John Wendell; Djurdjanovic, Dragan; Fernandez, Benito R.
This thesis proposes and tests a recursive, adaptive, and computationally inexpensive method for partitioning real-number spaces. When tested for proof-of-concept on both one- and two-dimensional classification and control problems, the Fern algorithm was found to work well in one dimension, moderately well for two-dimensional classification, and not at all for two-dimensional control.
Testing ferns as pure discretizers - which would involve a secondary discrete learner - has been left to future work.

Item Semantic interpretation with distributional analysis (2012-05)
Glass, Michael Robert; Barker, Ken, 1959-; Porter, Bruce, 1956-; Mooney, Ray; Erk, Katrin; Dhillon, Inderjit
Unstructured text contains a wealth of knowledge; however, it is in a form unsuitable for reasoning. Semantic interpretation is the task of processing natural language text to create or extend a coherent, formal knowledgebase able to reason and support question answering. This task involves entity, event and relation extraction, co-reference resolution, and inference. Many domains, from intelligence data to bioinformatics, would benefit from semantic interpretation. But traditional approaches to the subtasks typically require a large annotated corpus specific to a single domain and ontology. This dissertation describes an approach to rapidly train a semantic interpreter using a set of seed annotations and a large, unlabeled corpus. Our approach adapts methods from paraphrase acquisition and automatic thesaurus construction to extend seed syntactic-to-semantic mappings using an automatically gathered, domain-specific, parallel corpus. During interpretation, the system uses joint probabilistic inference to select the most probable interpretation consistent with the background knowledge. We evaluate both the quality of the extended mappings and the performance of the semantic interpreter.

Item Unsupervised learning methods: An efficient clustering framework with integrated model selection (2012-08)
Corona, Enrique; Nutter, Brian; Mitra, Sunanda; Pal, Ranadip; López-Benitez, Noé
Classification is one of the most important practices in data analysis. In the context of machine learning, this practice can be viewed as the problem of identifying representative data patterns in such a manner that coherent groups are formed. If the data structure is readily available (e.g.
supervised learning), it is usually used to establish classification rules for discrimination. However, when the data is unlabeled, its underlying structure must be unveiled first. Consequently, unsupervised classification poses more challenges. Among them, the fundamental question of an appropriate number of groups or clusters in the data must be addressed. In this context, the "jump" method, an efficient but limited linear approach that finds plausible answers to the number of clusters in a dataset, is improved via the optimization of an appropriate objective function that quantifies the quality of particular cluster configurations. Recent developments showing interesting associations between spectral clustering (SC) and kernel principal component analysis (KPCA) are used to extend the improved method to the non-linear domain. This is achieved by mapping the input data to a new space where the original clusters appear as linear structures. The characteristics of this mapping depend to a large extent on the parameters of the kernel function selected. By projecting these linear structures to the unit sphere, the proposed method is able to measure the quality of the resulting cluster configurations. These quality scores aid in the simultaneous decision of the kernel parameters (i.e. model selection) and the number of clusters present in the dataset. Results of the enhanced jump method are compared to other relative validation criteria such as minimum description length (MDL), Akaike's information criterion (AIC) and consistent Akaike's information criterion (CAIC). The extension of the method is tested with other cluster validity indices, in similar settings, such as the adjusted Rand index (ARI) and the balanced line fit (BLF). 
Finally, image segmentation examples are shown as a real-world application of the technique.

Item Visual object category discovery in images and videos (2012-05)
Lee, Yong Jae, 1984-; Grauman, Kristen Lorraine, 1979-; Ghosh, Joydeep; Efros, Alexei; Bovik, Al; Geisler, Wilson; Aggarwal, J. K.
The current trend in visual recognition research is to place a strict division between the supervised and unsupervised learning paradigms, which is problematic for two main reasons. On the one hand, supervised methods require training data for each and every category that the system learns; training data may not always be available and is expensive to obtain. On the other hand, unsupervised methods must determine the optimal visual cues and distance metrics that distinguish one category from another to group images into semantically meaningful categories; however, for unlabeled data, these are unknown a priori. I propose a visual category discovery framework that transcends the two paradigms and learns accurate models with few labeled exemplars. The main insight is to automatically focus on the prevalent objects in images and videos, and learn models from them for category grouping, segmentation, and summarization. To implement this idea, I first present a context-aware category discovery framework that discovers novel categories by leveraging context from previously learned categories. I devise a novel object-graph descriptor to model the interaction between a set of known categories and the unknown to-be-discovered categories, and group regions that have similar appearance and similar object-graphs. I then present a collective segmentation framework that simultaneously discovers the segmentations and groupings of objects by leveraging the shared patterns in the unlabeled image collection. It discovers an ensemble of representative instances for each unknown category, and builds top-down models from them to refine the segmentation of the remaining instances.
Finally, building on these techniques, I show how to produce compact visual summaries for first-person egocentric videos that focus on the important people and objects. The system leverages novel egocentric and high-level saliency features to predict important regions in the video, and produces a concise visual summary that is driven by those regions. I compare against existing state-of-the-art methods for category discovery and segmentation on several challenging benchmark datasets. I demonstrate that we can discover visual concepts more accurately by focusing on the prevalent objects in images and videos, and show clear advantages of departing from the status quo division between the supervised and unsupervised learning paradigms. The main impact of my thesis is that it lays the groundwork for building large-scale visual discovery systems that can automatically discover visual concepts with minimal human supervision.