Single Observation Adaptive Search for Continuous Simulation Optimization

2018 
Finding optimal policies when the objective function can only be observed with a noisy error has traditionally required repeated observations to average out the noise. In this paper, we prove for a very large class of continuous stochastic optimization problems and search algorithms that a single observation for each candidate solution suffices to completely eliminate the error. This surprising result can best be understood through the metaphor of facing a continuum of slot machines and observing the reward for a single pull of the arm for each machine as we search for a machine delivering the best return. By the impossibility of systems, the average winnings must approach the expected winnings regardless of how we select the sequence of machines to test. This same martingale property is inherited for simulation optimization so that a single observation also suffices for convergence to the global optimum. The resulting savings in simulation time can be quite significant.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    29
    References
    2
    Citations
    NaN
    KQI
    []