Operating a Treatment Planning System using a Deep‐Reinforcement‐Learning based Virtual Treatment Planner for Prostate Cancer Intensity‐Modulated Radiation Therapy Treatment Planning

Chenyang Shen, Dan Nguyen, Liyuan Chen, Yesenia Gonzalez, Rafe McBeth, Nan Qin, Steve B Jiang, Xun Jia

2020

PURPOSE: In the treatment planning process of intensity-modulated radiation therapy (IMRT), a human planner operates the treatment planning system (TPS) to adjust treatment planning parameters, e.g., the locations and weights of dose-volume histogram (DVH) constraints, to achieve a satisfactory plan for each patient. This process is usually time-consuming, and the plan quality depends on the planner's experience and the available planning time. In this study, we proposed to model the behaviors of human planners in treatment planning with a deep reinforcement learning (DRL)-based virtual treatment planner network (VTPN), such that it can operate the TPS in a human-like manner. METHODS AND MATERIALS: Using prostate cancer IMRT as an example, we established the VTPN as a deep neural network. We considered an in-house optimization engine with a weighted quadratic objective function. The VTPN was designed to observe the DVHs of an intermediate plan and decide on an action to improve the plan by changing the weights and threshold doses in the objective function. We trained the VTPN in an end-to-end DRL process on 10 patient cases. A plan score was used to measure plan quality. We demonstrated the feasibility and effectiveness of the trained VTPN on another 64 patient cases. RESULTS: The VTPN learned spontaneously how to adjust treatment planning parameters to generate high-quality treatment plans. In the 64 testing cases, the quality score with initialized parameters was 4.97 (±2.02), with 9.0 being the highest possible score. Using the VTPN to perform treatment planning improved the quality score to 8.44 (±0.48). CONCLUSIONS: To our knowledge, this was the first time that the intelligent treatment planning behaviors of a human planner in external beam IMRT were autonomously encoded in an artificial intelligence system. The trained VTPN is capable of behaving in a human-like way to produce high-quality plans.
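
To make the quantities named in the abstract concrete, the following is a minimal Python sketch of a cumulative DVH, a weighted quadratic objective with a threshold dose, and a parameter-adjusting action of the kind a virtual planner might select. It is illustrative only: the abstract does not give the exact form of the objective, the action set, or the step sizes, so the function names, the single organ-at-risk (OAR) term, and the multiplicative `factor` are all assumptions rather than the authors' implementation.

```python
from typing import Dict, List, Sequence


def dvh(dose: Sequence[float], levels: Sequence[float]) -> List[float]:
    """Cumulative DVH: fraction of voxels receiving at least each dose level."""
    n = len(dose)
    return [sum(1 for d in dose if d >= level) / n for level in levels]


def quadratic_objective(
    ptv_dose: Sequence[float],
    oar_dose: Sequence[float],
    prescription: float,
    oar_weight: float,
    oar_threshold: float,
) -> float:
    """Assumed objective: mean squared target deviation from the prescription
    plus a weighted mean squared OAR overdose above a threshold dose."""
    ptv_term = sum((d - prescription) ** 2 for d in ptv_dose) / len(ptv_dose)
    oar_term = sum(max(d - oar_threshold, 0.0) ** 2 for d in oar_dose) / len(oar_dose)
    return ptv_term + oar_weight * oar_term


def apply_action(params: Dict[str, float], action: str, factor: float = 1.5) -> Dict[str, float]:
    """Return a new parameter set after one hypothetical planner action.

    The action names and the multiplicative update are assumptions made
    for illustration; the input dictionary is left unmodified.
    """
    updated = dict(params)
    if action == "increase_weight":
        updated["oar_weight"] *= factor
    elif action == "decrease_weight":
        updated["oar_weight"] /= factor
    elif action == "increase_threshold":
        updated["oar_threshold"] *= factor
    elif action == "decrease_threshold":
        updated["oar_threshold"] /= factor
    else:
        raise ValueError(f"unknown action: {action}")
    return updated
```

In a planning loop built on this sketch, the optimizer would be re-run after each `apply_action` call, the resulting DVHs would be passed back to the policy network, and the change in plan score would serve as the reward signal.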
