Assessing reinforcement delta hedging

2021 
Standard option pricing rests on replication arguments that assume complete markets, in which simple hedging strategies such as delta hedging work exactly; real markets, however, are incomplete. This has motivated research into reinforcement learning as a way to price and hedge in incomplete markets, which are otherwise highly complex to handle. The existing literature, however, studies approximations that give the agent hints, e.g. approximated payoffs before the actual final payoff, or uses a separate neural network per timestep, increasing complexity. Here we investigate whether a single agent can learn to hedge without such hints, i.e. knowing only the final payoff, using reinforcement learning. We consider the simplest possible option setup: a vanilla European equity option in a pure Black-Scholes world. Using an actor-critic method, Deep Deterministic Policy Gradient (DDPG), we show that even in this simplest case DDPG alone is unable to hedge properly. However, by employing ensemble methods, i.e. taking the average or median over several trained agents, we can significantly improve accuracy and stability. This emphasises that applying deep learning to hedging requires customisation rather than being driven purely by data and hyper-parameters.
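As background for the setup the abstract describes, the classical Black-Scholes delta hedge of a vanilla European call is the natural benchmark that a learned hedging agent is measured against. The sketch below is not taken from the paper; it is a minimal, self-contained illustration (parameters, function names, and the single-path simulation are all hypothetical choices) of how a discrete-time delta hedge approximately replicates the final payoff under geometric Brownian motion.

```python
import math
import random

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_delta(S, K, r, sigma, tau):
    # Black-Scholes delta of a European call: N(d1).
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1)

def bs_call_price(S, K, r, sigma, tau):
    # Black-Scholes price of a European call.
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def delta_hedge_pnl(S0=100.0, K=100.0, r=0.0, sigma=0.2, T=1.0, steps=252, seed=0):
    # Writer sells the call, collects the premium, and rebalances
    # the stock position to the Black-Scholes delta at each step.
    rng = random.Random(seed)
    dt = T / steps
    S = S0
    cash = bs_call_price(S0, K, r, sigma, T)  # premium received up front
    shares = 0.0
    for i in range(steps):
        tau = T - i * dt
        target = bs_call_delta(S, K, r, sigma, tau)
        cash -= (target - shares) * S          # trade stock to the target delta
        shares = target
        cash *= math.exp(r * dt)               # interest on the cash account
        # One step of geometric Brownian motion.
        S *= math.exp((r - 0.5 * sigma ** 2) * dt
                      + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    payoff = max(S - K, 0.0)
    # Hedging error: portfolio value minus the payoff owed at expiry.
    return shares * S + cash - payoff
```

With daily rebalancing (252 steps) the hedging error is small relative to the option premium; it shrinks further as the number of steps grows, which is exactly the complete-market replication property the abstract contrasts with real, incomplete markets.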