Abstract
While most reinforcement learning research has assumed a discrete action space, many real-world control problems require continuous control outputs. This can be achieved by using continuous mapping functions for the value and action functions of the reinforcement learning architecture. Two questions arise, however: what function representation to use, and how to determine the amount of noise for search in action space. Here, the ubiquitous back-propagation neural network is used to learn the value and action functions. A reinforcement predictor, intended to predict the next reinforcement, is then introduced; it also determines the amount of noise added to the controller output. Computer simulation of a ball-and-beam system as an example plant shows that the proposed reinforcement learning architecture achieves sound on-line learning control performance.
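A minimal sketch of the architecture the abstract describes is given below, assuming an actor-critic-style arrangement of three small back-propagation networks: one for the action function, one for the value function, and one for the reinforcement predictor. The exact network sizes, update rules, and noise schedule are not stated in the abstract, so every name and parameter here (e.g. `control`, `sigma`, `r_max`, the state dimension) is a hypothetical illustration, not the authors' implementation. The key idea shown is that the exploration noise scale is tied to the predicted reinforcement, so the search in action space narrows as predicted performance improves.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(sizes):
    """Random weights and biases for a small tanh MLP (assumed form)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Forward pass: tanh hidden layers, linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

# Three networks sharing the plant state as input (sizes are assumptions;
# a ball-and-beam state could be ball position/velocity, beam angle/rate).
state_dim, action_dim = 4, 1
actor     = mlp_init([state_dim, 16, action_dim])   # state -> control u
critic    = mlp_init([state_dim, 16, 1])            # state -> value V(s)
predictor = mlp_init([state_dim + action_dim, 16, 1])  # (state, u) -> r_hat

def control(state, r_max=1.0):
    """Actor output plus exploration noise scaled by predicted reinforcement.

    Assumed noise schedule: the worse the predicted next reinforcement
    r_hat (relative to the best attainable r_max), the wider the Gaussian
    search around the actor's output.
    """
    u = mlp_forward(actor, state)
    r_hat = mlp_forward(predictor, np.concatenate([state, u]))
    sigma = float(np.clip(r_max - r_hat[0], 0.0, r_max))
    return u + rng.normal(0.0, sigma, size=u.shape)

# Example call with an arbitrary state vector.
print(control(np.array([0.1, 0.0, -0.05, 0.0])))
```

Under this reading, the reinforcement predictor plays a dual role: it supplies a learning signal for the critic/actor updates and, as sketched above, acts as an adaptive annealing schedule for exploration.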
| Original language | English |
|---|---|
| Pages | 2028-2033 |
| Number of pages | 6 |
| State | Published - 1998 |
| Event | Proceedings of the 1998 IEEE International Joint Conference on Neural Networks. Part 1 (of 3), Anchorage, AK, USA, 4 May 1998 → 9 May 1998 |
Conference
| Conference | Proceedings of the 1998 IEEE International Joint Conference on Neural Networks. Part 1 (of 3) |
|---|---|
| City | Anchorage, AK, USA |
| Period | 4/05/98 → 9/05/98 |