Learning thresholds for PV change detection from operators' labels

2015 
Page Views (PVs) are crucial for search engines because they are closely tied to revenue. When PVs change significantly, operators must be informed so that they can diagnose and fix the problem quickly and prevent further loss. In practice, PVs can be counted in many ways (e.g., PVs originating from different ISPs), and different PVs are of different interest to operators (e.g., the PVs of a larger ISP are more important). As a result, different PVs often require different detection standards, or thresholds. However, attempts to tune a large number of thresholds have been hampered by the cost of the manual effort involved. To address this problem, we propose a practical framework called PTL (practical threshold learning). Operators only need to provide a few simple labels about the detection results, and PTL then automatically tunes the thresholds for the different PVs. Using four months of PV data from a top global search engine, our evaluation demonstrates that PTL can improve detection accuracy dramatically. More importantly, it introduces very little labeling overhead for operators. For example, when detecting changes in the PVs of 103 ISPs, PTL can reduce the overall false negative rate from 96% to 9% using only 29 labels per week on average.