On monotone optimal decision rules and the stay-on-a-winner rule for the two-armed bandit
1985
Consider the following optimization problem: find a decision rule δ such that w(x, δ(x)) = max_a w(x, a) for all x, subject to the constraint δ(x) ∈ D(x). We give conditions for the existence of monotone optimal decision rules δ. The term 'monotone' is used in a general sense. The well-known stay-on-a-winner rules for the two-armed bandit can be characterized as monotone decision rules by including the stage number in x and using a special ordering on x. This enables us to give simple conditions for the existence of optimal rules that are stay-on-a-winner rules. We extend results of Berry and Kalin/Theodorescu to the case of dependent arms.
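As an illustration only, the optimization problem can be sketched on a small finite example. Everything specific below is an assumption not taken from the paper: the integer states and actions, the feasible set D(x) = {0, 1, 2, 3}, the payoff w(x, a) = x·a − a², the tie-breaking toward the smallest action, and the reading of 'monotone' as nondecreasing in the usual order on integers (the paper uses the term in a more general sense). The helper names `optimal_rule` and `is_monotone` are hypothetical.

```python
from typing import Callable, Dict, Iterable


def optimal_rule(
    states: Iterable[int],
    feasible: Callable[[int], Iterable[int]],
    w: Callable[[int, int], float],
) -> Dict[int, int]:
    """Return delta with w(x, delta(x)) = max over a in D(x) of w(x, a)."""
    rule = {}
    for x in states:
        # Sorting first means max() breaks ties toward the smallest action,
        # because max() returns the first maximal element it encounters.
        actions = sorted(feasible(x))
        rule[x] = max(actions, key=lambda a: w(x, a))
    return rule


def is_monotone(rule: Dict[int, int]) -> bool:
    """Check that the rule is nondecreasing in x (usual integer order)."""
    xs = sorted(rule)
    return all(rule[x1] <= rule[x2] for x1, x2 in zip(xs, xs[1:]))


# Toy instance (assumed, not from the paper).
delta = optimal_rule(
    states=range(6),
    feasible=lambda x: range(4),      # D(x) = {0, 1, 2, 3} for every x
    w=lambda x, a: x * a - a * a,     # payoff w(x, a) = x*a - a^2
)
print(delta)
print(is_monotone(delta))
```

In this toy instance the pointwise maximizer happens to be nondecreasing in x. The paper's contribution is to give conditions under which such a monotone optimal rule exists, which this sketch does not attempt to reproduce.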