Web log analysis using Spark with solution recommendation

2021 
Abstract Today, nearly every service is available on the internet as a large website or web application, so the robustness of the web servers behind large service providers must be taken into account. Businesses spend a great deal of money maintaining these servers so that they always perform at their best. The volume of data exchanged over the internet every microsecond, and the complexity involved, is hard to imagine. Servers must run 24x7 to serve demanding clients. Nevertheless, server malfunctions have cost organizations millions of dollars, which underlines the value of a stable server. Each server records details about the requests it has handled in web server logs, which are extremely valuable because they point to potential reasons for a server failure. A server may be overloaded for a certain period by distributed denial-of-service (DDoS) attacks, among many other causes. The logs include the response code generated for each request, and if the error codes are examined carefully, the reasons for a server failure can likely be identified. We aim to review the documentation of the error codes and to evaluate web server logs using a solution-recommendation framework that informs the server administrator of the potential reasons for previous server breakdowns and how they can be prevented. For the analysis we use Apache Spark because of its in-memory data processing, which makes it considerably faster than Hadoop MapReduce, which relies on disk-based batch processing.
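
The kind of analysis described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the logs are in Common Log Format; the names LOG_PATTERN, RECOMMENDATIONS, summarize_errors, recommend and summarize_errors_spark, as well as the recommendation texts, are hypothetical. The pure-Python functions show the logic on a handful of lines, and the last function shows one way the same count could be expressed with the PySpark DataFrame API.

```python
import re
from collections import Counter

# Assumed input: Common Log Format, e.g.
# host ident user [time] "request" status size
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')

# Hypothetical recommendation table; a real framework would derive these
# from the documentation of each status code.
RECOMMENDATIONS = {
    404: "Check for broken links or moved resources.",
    500: "Inspect the application error log for unhandled exceptions.",
    503: "Server overloaded or down: review capacity and rate limiting.",
}


def parse_status(line):
    """Return the HTTP status code of a log line, or None if it does not parse."""
    match = LOG_PATTERN.match(line)
    return int(match.group(4)) if match else None


def summarize_errors(lines):
    """Count responses with a 4xx or 5xx status code."""
    counts = Counter()
    for line in lines:
        status = parse_status(line)
        if status is not None and status >= 400:
            counts[status] += 1
    return counts


def recommend(counts):
    """Map each observed error code to a recommendation string."""
    default = "No recommendation recorded for this code."
    return {code: RECOMMENDATIONS.get(code, default) for code in counts}


def summarize_errors_spark(path):
    """Same count with PySpark; requires pyspark to be installed."""
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("weblog-errors").getOrCreate()
    logs = spark.read.text(path)  # one row per line, in a column named "value"
    # Lines without a status field yield an empty string, which casts to null
    # and is dropped by the filter below.
    status = F.regexp_extract("value", r'" (\d{3}) ', 1).cast("int")
    rows = (
        logs.select(status.alias("status"))
        .where(F.col("status") >= 400)
        .groupBy("status")
        .count()
        .collect()
    )
    return {row["status"]: row["count"] for row in rows}
```

The Spark variant keeps the filtering and grouping inside the DataFrame engine, so only the small per-code summary is collected to the driver.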