Exploration of an unknown space by collective robotics using fuzzy logic and reinforcement learning

Pandya, Ashish K.

Date

2000-05

Authors

Pandya, Ashish K.

Publisher

Texas Tech University

Abstract

This thesis addresses the following problem: search an area using mobile robots without the aid of human (or central) tele-operation. The robots must correctly identify the goal source, which is characterized by a maximum intensity (or favorability). Subsequently, they must reach the position of the goal source while incurring a low total cost (energy consumed). These principles, together with scalability and usability, are used as evaluation criteria for the methods used to explore the unknown search area. Two different approaches are considered to solve this problem. The first uses fuzzy rules [1], so that a robot, in collaboration with other robots, may use the knowledge of its present state vectors to find the desired signal source. The second approach uses reinforcement learning techniques to train the robots, and comprises three different methodologies. The first is the simplest form of reinforcement learning, called Q-learning, in which a lookup table is used to train each individual robot. The second method, which is similar to a simple ACD (viz. Heuristic Dynamic Programming (HDP)), is the Temporal Difference (TD(λ)) method. The Temporal Difference method is an elegant way of doing reinforcement learning. A simple ACD uses two neural networks, namely a critic network and an action (control) network. The critic network learns to predict the total future cost from a given environment to the terminal state, while the action network learns a policy function to optimize the critic's cost output at each state. A graphical user interface and display, plus a software-implemented simulator, are used for experimental purposes for both approaches.
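
As an illustration of the lookup-table idea mentioned in the abstract, the following is a minimal sketch of tabular Q-learning for a single robot. It is not the thesis's simulator or method. The 5x5 grid, the goal cell, the unit energy cost per move, the hyperparameters, and the names `step`, `train`, and `greedy_path` are all assumptions made for this example.

```python
import random

SIZE = 5                                       # assumed 5x5 search grid
GOAL = (4, 4)                                  # assumed cell of maximum intensity
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # four unit moves


def step(state, action):
    """Move one cell, clamped to the grid; every move costs one unit of energy."""
    x = min(max(state[0] + action[0], 0), SIZE - 1)
    y = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (x, y)
    return nxt, -1.0, nxt == GOAL


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Fill a lookup table Q[(state, action_index)] with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {((x, y), a): 0.0
         for x in range(SIZE) for y in range(SIZE) for a in range(n)}
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(n)                          # explore
            else:
                a = max(range(n), key=lambda i: q[(state, i)])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # The goal is terminal, so no future value is added there.
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(n))
            # Standard Q-learning update toward the one-step target.
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
    return q


def greedy_path(q, start=(0, 0), limit=50):
    """Follow the highest-valued action from each state, up to `limit` moves."""
    path, state = [start], start
    while state != GOAL and len(path) <= limit:
        a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path


if __name__ == "__main__":
    table = train()
    print(greedy_path(table))
```

Here the reward is the negative of the energy spent, so maximizing return corresponds to the low total cost the abstract asks for. A multi-robot, fuzzy-rule, or critic/action-network version would need considerably more structure than this single-robot table.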