Forecasting Incident Patterns in Production Systems with ML to Prevent Recurring Failures

Author(s): Hariprasad Sivaraman

Publication #: 2411105

Date of Publication: 12.04.2024

Country: USA

Pages: 1-7

Published In: Volume 10, Issue 2, April 2024

DOI: https://doi.org/10.5281/zenodo.14250637

Abstract

Across industries, production systems that support continuous operations face recurring failures. Traditional incident response is reactive, which makes it difficult to prevent these failures proactively. This paper presents an ML-based method that uses historical data to forecast incident frequency, which could help avoid system downtime through predictive maintenance. It covers model selection, data preparation, training, and validation, and uses an example from a financial production environment to demonstrate how ML can improve the resilience of production systems.

Keywords: Incident Forecasting, Machine Learning, Production Systems, Reliability, Anomaly Detection, Predictive Maintenance, Time-Series Forecasting
