Investigating exploration techniques for ACS in discretized real-valued environments
2020
One way to handle a real-valued input signal is to discretize it, which may affect how the ACS2 agent learns its model of the environment. A more sophisticated action-selection method can speed up knowledge acquisition by identifying the most valuable regions of the input space. This paper compares four biased exploration techniques for ACS2 across four real-valued environments. It also introduces a new class of benchmark problem for ACS2 (the inverted pendulum) and an agent modification, Optimistic Initial Quality (OIQ), both with promising outcomes.
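To make the discretization step concrete, the following is a minimal sketch of equal-width bucketing of a real-valued observation into symbols. It is not the paper's implementation: the function names `discretize` and `discretize_observation`, the equal-width scheme, and the choice of bounds and bucket count are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def discretize(value: float, low: float, high: float, buckets: int) -> int:
    """Map a real value to a bucket index in 0..buckets-1 (hypothetical helper).

    Values outside [low, high] are clamped to the nearest bound.
    """
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    if high <= low:
        raise ValueError("high must be greater than low")
    clamped = min(max(value, low), high)
    ratio = (clamped - low) / (high - low)
    # The upper bound itself would map to index `buckets`, so cap it.
    return min(int(ratio * buckets), buckets - 1)


def discretize_observation(
    obs: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
    buckets: int,
) -> Tuple[str, ...]:
    """Turn a real-valued observation into a tuple of symbols, one per attribute."""
    if len(obs) != len(bounds):
        raise ValueError("obs and bounds must have the same length")
    return tuple(
        str(discretize(v, lo, hi, buckets)) for v, (lo, hi) in zip(obs, bounds)
    )
```

The bucket count controls the trade-off mentioned above: fewer buckets give the agent a smaller perception space to model, but may merge states that require different actions.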