Automated object recognition through reinforcement learning



Texas Tech University


Object recognition, a branch of pattern recognition, is the task of identifying and localizing one or more objects in a given scene: we must determine what is present in the input image and where it is. Although great progress has been made in recent decades, existing object recognition techniques still have shortcomings, including unreliability, inefficiency, poor adaptability, manual template marking that is heavily influenced by human factors, and the inability to recognize an object without a model. Any recognition problem can be formulated as a search process that must be guided in a controlled manner. Because all search problems involve optimization, object recognition requires optimization and control techniques. Reinforcement learning, that is, learning how to act in a given situation, with a given set of possible actions, so as to maximize the total expected reward in the long run, is itself an optimization problem and fits this formulation. Most pattern recognition techniques do not incorporate reinforcement learning for feature understanding. In this dissertation, reinforcement learning is applied both to automatic template generation from a model image and to template matching within the input image. The newly designed affine parameter estimation algorithm provides reliable results based on the information contained at all feature point locations. The feature points are extracted in scale space at extrema of isophote curvature, which are invariant to affine transformations. The affine parameter estimation algorithm handles arbitrary translations, rotations, and scalings, as well as moderate occlusions and deformations of the object to be recognized. Experimental results showed that the proposed set of algorithms is fast, efficient, and potentially robust. The automatic template generation algorithm, an efficient contour tracing algorithm for gray-level images, can also be used for object recognition without a model.
This is a new research field, and much work remains before an intelligent recognition system as efficient as the human visual system can be developed.
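
To make the idea of estimating affine parameters from feature points concrete, the following is a minimal sketch of a generic least-squares affine fit. It is not the dissertation's algorithm, whose details are not given in this abstract. It assumes that matched feature point pairs between the template and the input image are already available, and the function names `solve3` and `estimate_affine` are hypothetical names chosen for illustration.

```python
def solve3(m, v):
    """Solve the 3x3 system m * p = v by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]  # augmented matrix
    n = 3
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("points are collinear; affine map is not determined")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    # Back substitution.
    p = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * p[j] for j in range(i + 1, n))
        p[i] = (a[i][n] - s) / a[i][i]
    return p


def estimate_affine(src, dst):
    """Least-squares fit of x' = a*x + b*y + tx and y' = c*x + d*y + ty.

    src, dst: equal-length lists of (x, y) pairs, matched by index
    (assumption: correspondences are known). Returns ((a, b, tx), (c, d, ty)).
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three matched point pairs")
    rows = [(x, y, 1.0) for x, y in src]  # design matrix M
    # Normal equations: (M^T M) p = M^T u, solved once per output coordinate.
    mtm = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        mtu = [sum(r[i] * q[k] for r, q in zip(rows, dst)) for i in range(3)]
        params.append(tuple(solve3(mtm, mtu)))
    return tuple(params)


# Usage: points mapped by a 90-degree rotation, scale 2, translation (1, -1).
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
dst = [(-2.0 * y + 1.0, 2.0 * x - 1.0) for x, y in src]
(a, b, tx), (c, d, ty) = estimate_affine(src, dst)
```

Because every matched point contributes to the normal equations, the estimate draws on information from all feature locations rather than a minimal subset. A plain least-squares fit like this one is sensitive to mismatched pairs, so handling occlusion or deformation would need something more robust than what is sketched here.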