Investigating exploration techniques for ACS in discretized real-valued environments

2020 
One way of dealing with a real-valued input signal is to discretize it, which may influence how an ACS2 agent learns its model of the environment. A more sophisticated action-selection method can speed up knowledge acquisition by identifying the most valuable regions of the input space. This paper compares four biased exploration techniques for ACS2 across four real-valued environments. It also introduces a new class of benchmark problem for ACS2 (the inverted pendulum) and an agent modification, Optimistic Initial Quality (OIQ), both with promising outcomes.
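To make the discretization step concrete, the following is a minimal sketch of uniform binning. The abstract does not specify which scheme is used, so equal-width buckets are an assumption here, and the function name `discretize` and its parameters (`lows`, `highs`, `bins`) are hypothetical.

```python
from typing import Sequence, Tuple


def discretize(
    observation: Sequence[float],
    lows: Sequence[float],
    highs: Sequence[float],
    bins: int,
) -> Tuple[int, ...]:
    """Map each real-valued feature to an integer bucket in [0, bins - 1].

    Assumption: equal-width buckets over a known [low, high] range per
    feature. This is an illustrative scheme, not necessarily the one
    used in the paper.
    """
    symbols = []
    for value, low, high in zip(observation, lows, highs):
        # Clip values that fall outside the declared range.
        clipped = min(max(value, low), high)
        # Normalize to [0, 1], then scale to the number of buckets.
        ratio = (clipped - low) / (high - low)
        # The upper bound itself maps to the last bucket.
        symbols.append(min(int(ratio * bins), bins - 1))
    return tuple(symbols)


if __name__ == "__main__":
    # Hypothetical two-feature observation with both ranges set to [-1, 1].
    print(discretize([0.0, 1.0], lows=[-1.0, -1.0], highs=[1.0, 1.0], bins=10))
```

The number of buckets sets the trade-off the abstract alludes to: fewer buckets give the agent a smaller input space to model, but a coarser view of the environment.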