Modeling Complex Clickstream Data by Stochastic Models: Theory and Methods

2016 
As the website is a primary customer touch-point, millions are spent to gather web data about customer visits. Sadly, the trove of data and corresponding analytics have not lived up to the promise. Current marketing practice relies on ambiguous summary statistics or small-sample usability studies. Idiosyncratic browsing and low conversion (browser-to-buyer) make modeling hard. In this paper, we model browsing patterns (sequence of clicks) via Markov chain theory to predict users' propensity to buy within a session. We focus on model complexity, imputing missing values, data augmentation, and other attendant issues that impact performance. The paper addresses the following aspects; (1) Determine appropriate order of the Markov chain (assess the influence of prior history in prediction), (2) Impute missing transitions by exploiting the inherent link structure in the page sequences, (3) predict the likelihood of a purchase based on variable-length page sequences, and (4) Augment the training set of buyers (which is typically very small: 2% by viewing the page transitions as a graph and exploiting its link structure to improve performance. The cocktail of solutions address important issues in practical digital marketing. Extensive analysis of data applied to a large commercial web-site shows that Markov chain based classifiers are useful predictors of user intent.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    9
    References
    13
    Citations
    NaN
    KQI
    []