Unveiling ALMA software behavior using a decoupled log analysis framework

2014 
ABSTRACT ALMA Software is a complex distributed system installed on more than one hundred computers, which interacts with more than one thousand hardware device components. A normal observation follows a flow that interacts with almost that entire infrastructure in a coordinated way. The Software Operation Support team (SOFTOPS) comprises specialized engineers who analyze the generated software log messages on a daily basis to detect bugs and failures and to predict eventual failures. These log messages can reach up to 30 GB per day. We describe a decoupled and non-intrusive log analysis framework and the tools implemented to identify well-known problems, measure the time taken by specific tasks, and detect abnormal behavior in the system in order to alert the engineers to take corrective action. The main advantage of this approach, among others, is that the analysis itself does not interfere with the performance of the production system, which allows multiple analyzers to run in parallel. In this paper we describe the selected framework and show the results of some of the implemented tools.
Keywords: bug, detection, log, analyzer
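
As an illustration of the kind of analyzer the abstract refers to (measuring the time taken by specific tasks from log messages, away from the production system), the following is a minimal sketch only. The log line format, the `TASK_START`/`TASK_END` markers, and the function names `task_durations` and `slow_tasks` are all hypothetical and are not taken from the ALMA software or from the framework described in the paper.

```python
import re
from datetime import datetime

# Hypothetical log line format (an assumption, not ALMA's real format):
#   2014-03-01T12:00:00.500 TASK_START name=calibration
LINE_RE = re.compile(
    r"^(?P<ts>\S+) TASK_(?P<kind>START|END) name=(?P<name>\S+)$"
)


def task_durations(lines):
    """Pair START/END events per task name; return (name, seconds) tuples."""
    started = {}
    durations = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines this analyzer does not understand
        ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S.%f")
        if m["kind"] == "START":
            started[m["name"]] = ts
        elif m["name"] in started:
            delta = ts - started.pop(m["name"])
            durations.append((m["name"], delta.total_seconds()))
    return durations


def slow_tasks(durations, limit_s):
    """Return the tasks whose duration exceeds limit_s seconds."""
    return [(name, secs) for name, secs in durations if secs > limit_s]


# Invented sample lines, standing in for a copy of the archived logs.
sample = [
    "2014-03-01T12:00:00.000 TASK_START name=calibration",
    "2014-03-01T12:00:01.000 unrelated message",
    "2014-03-01T12:00:02.500 TASK_END name=calibration",
    "2014-03-01T12:00:03.000 TASK_START name=pointing",
    "2014-03-01T12:00:13.000 TASK_END name=pointing",
]

if __name__ == "__main__":
    for name, secs in slow_tasks(task_durations(sample), limit_s=5.0):
        print(f"ALERT: task {name} took {secs:.1f} s")
```

Because such an analyzer only reads log text that has already been written, several analyzers of this shape could in principle run side by side over the same data without touching the observing system, which is the decoupling the abstract emphasizes.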