Coordinated Analysis of Heterogeneous Monitor Data in Enterprise Clouds for Incident Response

2019 
During incident analysis and response, enterprise cloud administrators want to use as much of their generated monitor data as possible. However, the reality is that decisions are often dictated by the tools actually available to automatically process the monitor data, rather than by an understanding of the relevance of the data for incident response. The significant manual effort and domain expertise required to process diverse cloud monitors means that much monitor data remain unexamined. We propose a framework for simplifying the complexity of data analysis for incident response. Our framework enables coordinated analysis of both metric (numerical) data and log (semi-structured, textual) data and exposes salient features within those data. As a foundation for the framework, we define a taxonomy for fields within monitor data based on insights gained from analyzing logs and metrics collected from all levels of an experimental platform-as-a-service (PaaS) cloud (EPC). Using the taxonomy, we lay out a method for semi-automated feature extraction and discovery across heterogeneous monitors. We then describe a method for feature clustering to promote effective analysis of the data, and to remove redundant and uninformative features. We discuss the application of our framework for incident response within the EPC, including root cause analysis.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    8
    References
    0
    Citations
    NaN
    KQI
    []