Forecasting Incident Patterns in Production Systems with ML to Prevent Recurring Failures
Author(s): Hariprasad Sivaraman
Publication #: 2411105
Date of Publication: 12.04.2024
Country: USA
Pages: 1-7
Published In: Volume 10 Issue 2 April-2024
DOI: https://doi.org/10.5281/zenodo.14250637
Abstract
Across industries, production systems that support continuous operations face recurring failures. Traditional incident response is reactive, which makes it difficult to prevent these failures before they occur. This paper presents an ML-based method that forecasts incident frequency from historical data, which could help avoid system downtime through predictive maintenance. It covers model selection, data preparation, training, and validation, and uses an example from a financial production environment to demonstrate how ML can improve the resilience of production systems.
Keywords: Incident Forecasting, Machine Learning, Production Systems, Reliability, Anomaly Detection, Predictive Maintenance, Time-Series Forecasting