Browsing by Subject "Feature selection"
Now showing 1 - 4 of 4
Item: Active Control Strategies for Chemical Sensors and Sensor Arrays (2013-07-17). Gosangi, Rakesh.
Chemical sensors are generally used as one-dimensional devices, where one measures the sensor's response at a fixed setting, e.g., infrared absorption at a specific wavelength, or the conductivity of a solid-state sensor at a specific operating temperature. In many cases, additional information can be extracted by modulating some internal property (e.g., temperature, voltage) of the sensor. However, this additional information comes at a cost (e.g., sensing time, power consumption), so offline optimization techniques (such as feature-subset selection) are commonly used to identify a subset of the most informative sensor tunings. An alternative to offline techniques is active sensing, where the sensor tunings are adapted in real time based on the information obtained from previous measurements. Prior work in domains such as vision, robotics, and target tracking has shown that active sensing can schedule agile sensors to manage their sensing resources more efficiently than passive sensing, and to balance sensing costs against performance. Inspired by this history of active sensing, in this dissertation we develop active sensing algorithms that address three different computational problems in chemical sensing. First, we consider the problem of classification with a single tunable chemical sensor. We formulate the classification problem as a partially observable Markov decision process and solve it with a myopic algorithm. At each step, the algorithm estimates the utility of each sensing configuration as the difference between the expected reduction in Bayesian risk and the sensing cost, and selects the configuration with maximum utility. We evaluated this approach on simulated Fabry-Perot interferometers (FPIs) and experimentally validated it on metal-oxide (MOX) sensors.
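The myopic selection rule described in the abstract can be sketched as follows. This is an illustrative Python sketch under a discrete observation model, not the dissertation's implementation; the `likelihood` tables and all function names are assumptions.

```python
import numpy as np

def bayes_risk(belief, loss):
    """Risk of the best class decision under the current belief.

    loss[i, j] is the cost of deciding class i when the true class is j.
    """
    return min(float(loss[i] @ belief) for i in range(loss.shape[0]))

def myopic_utility(belief, likelihood, loss, cost):
    """Expected reduction in Bayes risk from one sensing action, minus its cost.

    likelihood[z, c] = P(observation z | class c) at this sensor configuration.
    """
    expected_posterior_risk = 0.0
    for z in range(likelihood.shape[0]):
        joint = likelihood[z] * belief      # P(z, c)
        p_z = joint.sum()                   # P(z)
        if p_z > 0:
            expected_posterior_risk += p_z * bayes_risk(joint / p_z, loss)
    return bayes_risk(belief, loss) - expected_posterior_risk - cost

def select_configuration(belief, likelihoods, loss, costs):
    """Choose the sensor configuration with maximum myopic utility."""
    utilities = [myopic_utility(belief, L, loss, c)
                 for L, c in zip(likelihoods, costs)]
    return int(np.argmax(utilities))
```

Under a 0/1 loss and a uniform prior, a perfectly informative configuration with a small cost beats a free but uninformative one, which is the tradeoff the utility is designed to capture.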
Our results show that the active sensing method obtains better classification performance than passive sensing methods, and is also more robust to additive Gaussian noise in sensor measurements. Second, we consider the problem of estimating the concentrations of the constituents in a gas mixture using a tunable sensor. We formulate this multicomponent-analysis problem as one of probabilistic state estimation, where each state represents a different concentration profile. We maintain a belief distribution that assigns a probability to each profile, and update the distribution by incorporating the latest sensor measurements. To select the sensor's next operating configuration, we use a myopic algorithm that chooses the configuration expected to best reduce the uncertainty in the future belief distribution. We validated this approach on both simulated and real MOX sensors. The results again demonstrate improved estimation performance and robustness to noise. Lastly, we present an algorithm that extends active sensing to sensor arrays. This algorithm borrows concepts from feature-subset selection to enable an array of tunable sensors to operate collaboratively for the classification of gas samples. At each sensing step, the algorithm constructs an optimized action vector, which contains a separate operating configuration for each sensor in the array. When dealing with sensor arrays, one must account for the correlation among sensors. To this end, we developed two objective functions, weighted Fisher scores and dynamic mutual information, which quantify the discriminatory information and redundancy of a given action vector with respect to the measurements already acquired. Once again, we validated the approach on simulated FPI arrays and experimentally tested it on an array of MOX sensors.
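The belief-tracking step for the multicomponent-analysis problem can be sketched as follows, assuming a scalar sensor response with additive Gaussian noise and a discretized set of concentration profiles. The names and the Monte Carlo uncertainty estimate are illustrative assumptions, not the dissertation's code.

```python
import numpy as np

def update_belief(belief, measurement, predicted, noise_std):
    """One Bayesian belief update over discretized concentration profiles.

    belief[k]    -- prior probability of concentration profile k
    predicted[k] -- response the sensor model predicts for profile k at the
                    current operating configuration
    measurement  -- the actual sensor reading
    Assumes additive Gaussian measurement noise.
    """
    likelihood = np.exp(-0.5 * ((measurement - predicted) / noise_std) ** 2)
    posterior = belief * likelihood
    return posterior / posterior.sum()

def expected_entropy(belief, predicted, noise_std, n_samples=200, seed=0):
    """Monte Carlo estimate of the posterior entropy after measuring at one
    configuration; a myopic rule would pick the configuration minimizing it."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        k = rng.choice(len(belief), p=belief)          # sample a true profile
        z = predicted[k] + rng.normal(0.0, noise_std)  # simulate a reading
        post = update_belief(belief, z, predicted, noise_std)
        total += -np.sum(post * np.log(post + 1e-12))
    return total / n_samples
```

A configuration whose predicted responses separate the candidate profiles yields a much lower expected posterior entropy than one whose predictions coincide, which is exactly the uncertainty-reduction criterion described above.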
The results show improved classification performance and robustness to additive noise.

Item: An EEG feature selection toolbox for EEGLAB in the MATLAB environment (2011-08). Kerr, Andy S.; Baker, Mary C.; Pal, Ranadip.
A complete system is proposed to generate features from raw EEG data and quasi-optimally reduce the feature set based on classification rates. Several default features are included for generating feature sets, and the feature set is quasi-optimally reduced using stepwise regression algorithms based on the classification of known classes. A plug-in known as the Feature Selection Toolbox was developed for the open-source EEGLAB toolbox within the MATLAB environment to accomplish the goals of this thesis. Synchrony measures of the EEG are becoming more common as a means to establish network links and to make general comparisons of different areas of the brain. The four default features included in the Feature Selection Toolbox are average power, correlation coefficient, magnitude-squared coherence, and phase synchrony index. An exhaustive search is impractical for finding an optimal subset of features, as the computational time increases exponentially with the number of desired features in the optimal subset. Three stepwise regression feature selection algorithms are implemented to select a near-optimal feature subset with a nearly linear increase in computational time as the maximum number of selected features increases. An example study comparing Alzheimer's disease to mild cognitive impairment and controls demonstrates the usefulness of the tools developed as part of this thesis.
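The toolbox itself runs in MATLAB; as a language-neutral illustration of why stepwise selection scales where exhaustive search does not, here is a generic greedy forward-selection sketch in Python. The `score_fn` interface and all names are assumptions, not the toolbox's API.

```python
import numpy as np

def forward_stepwise(X, y, score_fn, max_features):
    """Greedy forward selection: at each step, add the single feature that
    most improves the classification score of the current subset.

    Evaluates O(n_features * max_features) candidate subsets, versus
    2**n_features subsets for an exhaustive search.
    """
    selected, remaining = [], list(range(X.shape[1]))
    best_score = -np.inf
    while remaining and len(selected) < max_features:
        scores = [(score_fn(X[:, selected + [j]], y), j) for j in remaining]
        step_score, best_j = max(scores)
        if step_score <= best_score:   # no candidate improves the subset
            break
        best_score = step_score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

Any subset-scoring function can be plugged in; with a classification-rate score this mirrors the "reduce based on classification of known classes" strategy described in the abstract.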
Also, the tradeoffs of different options in the Feature Selection Toolbox are assessed from the results of the algorithm in classifying the responses of individuals to two different cognitive tasks, one involving visual stimulus and counting, the other involving visual stimulus and spatial reasoning.

Item: Feature selection in credit scoring: a quadratic programming approach solving with bisection method based on Tabu search (Texas A&M International University, 2015-06). Huang, Jun; Wang, Haibo.
Credit risk is one of the most important topics in risk management; it is the major risk encountered by banks and financial institutions, as recognized by the Basel capital accord. As a form of credit risk measurement, credit scoring is the credit evaluation process used to reduce the current and expected risk of a customer having bad credit. Credit scoring models typically use a set of features to predict the credit status of applicants: good credit (unlikely to default) or bad credit (more likely to default). However, with the fast growth of the credit industry and the ease of collecting and storing information enabled by new technologies, a huge amount of customer information is available. Feature selection, or subset selection, is therefore essential to handle irrelevant, redundant, or misleading features in order to improve predictive (classification) accuracy and to reduce the high complexity, intensive computation, and instability of most credit scoring models. In this study, a hybrid model is developed for credit scoring problems to predict classification accuracy based on selected subsets, by first establishing a correlation-coefficient-based binary quadratic programming model for feature selection.
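A correlation-based binary quadratic program of this general shape, together with a minimal Tabu search over bit flips, can be sketched as follows. The objective weights, the flip neighborhood, and all names are illustrative assumptions, not the authors' BMTS formulation.

```python
import numpy as np

def bqp_objective(x, relevance, redundancy, lam=0.5):
    """Quadratic feature-selection objective over a binary indicator vector x:
    reward the class-relevance of selected features, penalize pairwise
    correlation (redundancy) between them."""
    x = np.asarray(x, dtype=float)
    return float(relevance @ x - lam * x @ redundancy @ x)

def tabu_search(relevance, redundancy, n_iters=100, tabu_len=5, lam=0.5, seed=0):
    """Simple Tabu search over single-bit flips: at each iteration, flip the
    non-tabu bit giving the best objective, then mark it tabu for a while.
    The best solution ever visited is retained."""
    rng = np.random.default_rng(seed)
    n = len(relevance)
    x = rng.integers(0, 2, size=n)
    best_x, best_val = x.copy(), bqp_objective(x, relevance, redundancy, lam)
    tabu = {}
    for it in range(n_iters):
        candidates = []
        for j in range(n):
            if tabu.get(j, -1) >= it:      # bit j is currently tabu
                continue
            x[j] ^= 1
            candidates.append((bqp_objective(x, relevance, redundancy, lam), j))
            x[j] ^= 1
        if not candidates:
            continue
        val, j = max(candidates)
        x[j] ^= 1                          # accept the best non-tabu move
        tabu[j] = it + tabu_len
        if val > best_val:
            best_val, best_x = val, x.copy()
    return best_x, best_val
```

With two highly correlated relevant features and one weakly relevant uncorrelated feature, the penalty term steers the search toward keeping only one of the correlated pair plus the independent feature.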
The model is then solved with a bisection method based on a Tabu search algorithm (BMTS), which provides candidate subsets of features of different sizes; from these, satisfactory subsets for credit scoring models are selected based on both subset size and overall classification accuracy rate (OCAR). The results of this proposed BMTS+SVM method, tested on two benchmark credit datasets, shed light on improving existing credit scoring systems with flexibility and robustness. This validated method is then used in an international business context, testing data on U.S. and Chinese companies to find the subsets of features that act as key factors in distinguishing good-credit companies from bad-credit companies in these two countries. Finally, the performance of classification models using different classifiers, in terms of OCAR and misclassification cost, is evaluated on the U.S. and Chinese datasets. The cutoff values that give the highest OCAR and the minimum misclassification cost are also discussed.

Item: A survey of feature selection methods: algorithms and software (2015-05). Arguello, Bryan; Dimitrov, Nedialko B.; Maloney, Andy.
The feature selection problem is a major component of disease surveillance, since data sources are so costly. This report describes several existing methods for performing feature selection, along with software that implements them. To make experimenting with different algorithms easy, we have created a feature selection wrapper package in Python. This wrapper allows the user to easily try different algorithms on the same data set and visualize the results. Experiments are performed to validate that the methods perform as expected.
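The report's wrapper package itself is not shown here; a minimal sketch of the same idea, comparing feature-selection methods on one data set with a common downstream classifier, might look like the following. All function names, the two example scoring methods, and the nearest-centroid evaluator are assumptions.

```python
import numpy as np

def corr_scores(X, y):
    """Score each feature by |correlation| with the class label."""
    return np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

def variance_scores(X, y):
    """Score each feature by its variance (a label-free baseline); y is
    ignored but kept for a uniform selector interface."""
    return X.var(axis=0)

def centroid_accuracy(X, y):
    """Training accuracy of a nearest-class-centroid classifier."""
    centroids = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
    preds = [min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))
             for x in X]
    return float(np.mean(np.array(preds) == y))

def compare_selectors(X, y, selectors, k=3):
    """Apply each selection method to the same data set, keep the top-k
    scoring features, and report downstream classifier accuracy per method."""
    results = {}
    for name, score_fn in selectors.items():
        top = np.argsort(score_fn(X, y))[::-1][:k]
        results[name] = centroid_accuracy(X[:, top], y)
    return results
```

A wrapper of this shape makes trying a new method a one-line change: add another `name: score_fn` entry to the dictionary and rerun.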