Estimation of a logistic regression model by a genetic algorithm to predict pipe failures in sewer networks

2021 
Sewer networks consist mainly of pipelines that transport sewage and rainwater to wastewater treatment plants. A failure in a sewer pipe can have many negative consequences, such as accidents, flooding, pollution or extra costs. Machine learning is a powerful tool for predicting these incidents when enough data are available. In this study, a real-coded genetic algorithm is implemented to estimate the optimal weights of a logistic regression model that forecasts pipe failures in wastewater networks. The goal is to create an autonomous, independent predictive system that supports companies' decisions on pipe replacement plans. All stages of implementing the machine-learning system, from data processing to model validation, are explored and carefully explained. The methodology is then applied to the real sewer network of a Spanish city to check its performance. Results show that annually replacing the 4% of pipe segments whose estimated failure probability is higher than 0.75 prevents almost 30% of unexpected pipe failures. Furthermore, analysis of the estimated weights of the logistic regression model reveals some weaknesses of the network as well as the influence of the features on pipe failures. For instance, vitrified clay pipes and pipes with smaller diameters show a predisposition to fail.
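As an illustration of the approach the abstract describes, the following is a minimal sketch of a real-coded genetic algorithm that searches for logistic regression weights. It is not the authors' implementation: the fitness function (mean log-loss), tournament selection, blend crossover, Gaussian mutation, elitism, all parameter values, and the function names are assumptions chosen for the example, and the data are synthetic.

```python
import numpy as np

def sigmoid(z):
    # Clip the argument to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def log_loss(w, Xb, y):
    # Mean negative log-likelihood of a logistic model (assumed fitness).
    p = np.clip(sigmoid(Xb @ w), 1e-12, 1.0 - 1e-12)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def fit_logistic_ga(X, y, pop_size=60, generations=200,
                    sigma=0.3, mut_rate=0.2, n_elite=2, seed=0):
    """Hypothetical helper: returns (weights incl. intercept, best loss per generation)."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend intercept column
    n = Xb.shape[1]
    pop = rng.normal(0.0, 1.0, size=(pop_size, n))  # real-coded chromosomes
    history = []
    for _ in range(generations):
        fitness = np.array([log_loss(w, Xb, y) for w in pop])
        pop = pop[np.argsort(fitness)]  # sort so that index 0 is the fittest
        history.append(float(fitness.min()))
        elite = pop[:n_elite].copy()  # elitism: best individuals survive unchanged
        children = []
        while len(children) < pop_size - n_elite:
            # Binary tournament: in a sorted population the lower index wins.
            p1 = pop[rng.integers(0, pop_size, size=2).min()]
            p2 = pop[rng.integers(0, pop_size, size=2).min()]
            alpha = rng.random(n)
            child = alpha * p1 + (1.0 - alpha) * p2  # blend crossover
            mask = rng.random(n) < mut_rate
            child = child + mask * rng.normal(0.0, sigma, size=n)  # Gaussian mutation
            children.append(child)
        pop = np.vstack([elite, np.array(children)])
    fitness = np.array([log_loss(w, Xb, y) for w in pop])
    history.append(float(fitness.min()))
    return pop[np.argmin(fitness)], history

# Usage on synthetic data (not the sewer network data from the study).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = (rng.random(300) < sigmoid(-1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1])).astype(float)
w, history = fit_logistic_ga(X, y)
probs = sigmoid(np.hstack([np.ones((300, 1)), X]) @ w)
flagged = probs > 0.75  # replacement threshold quoted in the abstract
```

Because the elite individuals are copied unchanged, the best loss recorded in `history` never increases from one generation to the next. The sign and size of each entry in `w` can then be read as the direction and strength of that feature's influence on the estimated failure probability.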