On monotone optimal decision rules and the stay-on-a-winner rule for the two-armed bandit

1985 
Consider the following optimization problem: find a decision rule δ such that w(x, δ(x)) = max_a w(x, a) for all x, subject to the constraint δ(x) ∈ D(x). We give conditions for the existence of monotone optimal decision rules δ. The term 'monotone' is used in a general sense. The well-known stay-on-a-winner rules for the two-armed bandit can be characterized as monotone decision rules by including the stage number in x and using a special ordering on x. This enables us to give simple conditions for the existence of optimal rules that are stay-on-a-winner rules. We extend results of Berry and Kalin/Theodorescu to the case of dependent arms.
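The optimization problem above can be illustrated on a small finite example. The following is a minimal sketch, not the paper's construction: the function names `optimal_rule` and `is_monotone`, the toy payoff `w(x, a) = x*a - a**2`, the constraint sets D(x) = {0, 1, 2}, and the usual integer ordering on states and actions are all assumptions made for illustration. The abstract does not state the paper's conditions or its special ordering, so none of that is reproduced here.

```python
from typing import Callable, Dict, Iterable, List


def optimal_rule(
    states: Iterable[int],
    feasible: Callable[[int], Iterable[int]],
    w: Callable[[int, int], float],
) -> Dict[int, int]:
    """Return delta with w(x, delta(x)) = max of w(x, a) over a in D(x).

    Ties are broken toward the smallest feasible action (an assumption).
    """
    rule: Dict[int, int] = {}
    for x in states:
        # max() keeps the first maximizer it meets, so sorting the
        # feasible actions ascending selects the smallest maximizer.
        rule[x] = max(sorted(feasible(x)), key=lambda a: w(x, a))
    return rule


def is_monotone(rule: Dict[int, int]) -> bool:
    """Check that delta is non-decreasing in x under the usual order."""
    xs: List[int] = sorted(rule)
    return all(rule[x] <= rule[y] for x, y in zip(xs, xs[1:]))


# Hypothetical toy instance: the payoff has increasing differences in (x, a).
states = range(5)
delta = optimal_rule(states, lambda x: {0, 1, 2}, lambda x, a: x * a - a**2)
print(delta, is_monotone(delta))
```

In this toy instance the maximizing action never decreases as x grows, which is the kind of monotonicity the abstract refers to, here only with respect to the ordinary order on the integers.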