Contextual multi-armed bandit algorithms for personalized learning action selection

2017 
Optimizing the selection of learning resources and practice questions to address each individual student's needs has the potential to improve students' learning efficiency. In this paper, we study the problem of selecting a personalized learning action for each student (e.g., watching a lecture video or working on a practice question), based on their prior performance, in order to maximize their learning outcome. We formulate this problem in the contextual multi-armed bandit framework, where students' prior concept knowledge states (estimated from their responses to questions in previous assessments) correspond to contexts, the personalized learning actions correspond to arms, and their performance on future assessments corresponds to rewards. We propose three new Bayesian policies to select personalized learning actions for students, each of which exhibits advantages over prior work, and experimentally validate them using real-world datasets.
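To make the formulation concrete, the sketch below shows one generic Bayesian contextual bandit (linear Thompson sampling), not the three policies proposed in the paper, which are not detailed in this abstract. The class name, dimensions, and usage values are illustrative assumptions: contexts are estimated concept knowledge vectors, arms are candidate learning actions, and rewards are scores on subsequent assessments.

```python
import numpy as np

class LinearThompsonSamplingBandit:
    """Generic Bayesian contextual bandit sketch (linear Thompson sampling).
    Illustrates the abstract's framing only; it is NOT the paper's proposed policies."""

    def __init__(self, n_arms, context_dim, prior_var=1.0, noise_var=1.0):
        self.n_arms = n_arms
        self.noise_var = noise_var
        # Per-arm Bayesian linear regression with Gaussian prior N(0, prior_var * I).
        self.A = [np.eye(context_dim) / prior_var for _ in range(n_arms)]  # posterior precision
        self.b = [np.zeros(context_dim) for _ in range(n_arms)]

    def select_action(self, context):
        """Sample a reward model per arm from its posterior and pick the arm
        with the highest sampled expected reward for this student's context."""
        sampled_rewards = []
        for a in range(self.n_arms):
            cov = np.linalg.inv(self.A[a])          # posterior covariance
            mean = cov @ self.b[a]                  # posterior mean
            w = np.random.multivariate_normal(mean, cov)
            sampled_rewards.append(context @ w)
        return int(np.argmax(sampled_rewards))

    def update(self, arm, context, reward):
        """Posterior update after observing the student's assessment performance."""
        self.A[arm] += np.outer(context, context) / self.noise_var
        self.b[arm] += context * reward / self.noise_var


# Hypothetical usage: 5 candidate learning actions, 8-dimensional knowledge state.
bandit = LinearThompsonSamplingBandit(n_arms=5, context_dim=8)
knowledge_state = np.random.rand(8)            # estimated from prior assessments
action = bandit.select_action(knowledge_state) # recommended learning action
observed_score = 0.7                           # performance on the next assessment
bandit.update(action, knowledge_state, observed_score)
```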