Project
Anomaly detection for punching machine
Machine Learning for detecting anomalous machine behavior
Request/problem:
The customer wanted to extend their machines with an automatic system that detects anomalous behavior during operation. The machines are rarely supervised by operators, so the anomaly detection should signal when machine behavior deviates from normal operation. The system had to work without prior knowledge of the malfunctions that can occur, and inference had to run on low-cost hardware. For the proof of concept (PoC), the main drive was chosen as the component to monitor. The motor provided a total of six variables for classifying its behavior.
Solution:
As a first step, the drive data had to be made accessible. OPC UA was chosen as the communication protocol: the PLC controlling the drive provides the OPC UA server, which was configured to expose the six control variables. For persistence and for training the ML system, these values were written to a time-series database. InfluxDB was chosen for data storage because it is open source, purpose-built for time series, and comparatively quick to deploy. A Python client connects to the OPC UA server and writes the values to the database once per second.
Normal operation was recorded for two hours and used to train several unsupervised ML models. For later evaluation of the models, anomalous behavior was simulated by mechanically hindering the movement of the machine. Of the initially selected models, the one based on the Local Outlier Factor proved the most suitable. The trained model was serialized and loaded by a second Python program that calculates the Local Outlier Factor of each new data point and writes this metric to the database as well. Performance during live operation was tested by running normal and abnormal operation cycles while the anomaly score and the respective motor variables were visualized in a dashboard.
The software components were implemented as individual Docker containers managed via Portainer. The application was deployed on a Raspberry Pi 4B with 4 GB of RAM, which met the hardware requirements of the PoC.
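To illustrate the scoring step, the following is a minimal sketch of the Local Outlier Factor fitted on normal data and then applied to unseen points. It is not the project's code: it uses only the Python standard library, the function names (`fit_lof`, `lof_score`), the neighbor count `k=10`, and the synthetic six-variable samples are all assumptions made for illustration. In practice a library implementation, for example scikit-learn's `LocalOutlierFactor` with `novelty=True`, would be the more usual choice; the source does not state which one was used.

```python
import math
import pickle
import random
from statistics import fmean


def _nearest(train, point, k, skip=None):
    """Return the k nearest training points as sorted (distance, index) pairs."""
    dists = [(math.dist(point, q), j) for j, q in enumerate(train) if j != skip]
    return sorted(dists)[:k]


def _density(nbrs, kdist):
    """Local reachability density: inverse of the mean reachability distance."""
    # reach-dist(p, o) = max(k-distance(o), d(p, o))
    reach = fmean(max(kdist[j], d) for d, j in nbrs)
    return 1.0 / max(reach, 1e-12)  # guard against exact duplicates


def fit_lof(train, k=10):
    """Precompute k-distances and densities of the normal-operation samples."""
    train = [tuple(map(float, p)) for p in train]
    if len(train) <= k:
        raise ValueError("need more training samples than neighbors k")
    nbrs = [_nearest(train, p, k, skip=i) for i, p in enumerate(train)]
    kdist = [n[-1][0] for n in nbrs]  # distance to the k-th neighbor
    lrd = [_density(n, kdist) for n in nbrs]
    return {"train": train, "k": k, "kdist": kdist, "lrd": lrd}


def lof_score(model, point):
    """LOF of an unseen point: about 1 in dense regions, larger for outliers."""
    nbrs = _nearest(model["train"], tuple(point), model["k"])
    own = _density(nbrs, model["kdist"])
    return fmean(model["lrd"][j] for _, j in nbrs) / own


# Synthetic stand-in for recorded normal operation (six variables per sample).
random.seed(0)
normal = [[random.gauss(0.0, 1.0) for _ in range(6)] for _ in range(200)]

blob = pickle.dumps(fit_lof(normal, k=10))  # persisted by the training step
model = pickle.loads(blob)                  # loaded by the scoring program

typical = lof_score(model, [0.0] * 6)  # sample inside the normal cluster
blocked = lof_score(model, [8.0] * 6)  # sample far away from normal data
print(f"typical={typical:.2f} blocked={blocked:.2f}")
```

The fitted model here is a plain dictionary, so it can be serialized once after training and reloaded by a separate scoring process. Raw LOF values are around 1 for points in dense regions, so a score that sits near zero for normal operation presumably involves an offset or rescaling that the source does not describe.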
Architecture:
Results:
With the raw data from the drive alone, the initial performance was not sufficient to reliably detect all anomalous operating modes. Extensive feature engineering was therefore needed to find features that enable more accurate detection of anomalies. Statistical metrics calculated over sliding windows proved effective in this case. After these features were added to the model input, all simulated anomalies were detected with high anomaly scores, whereas normal operating behavior was associated with anomaly scores close to zero.
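The sliding-window features can be sketched as follows. This is a minimal illustration under assumptions, not the project's feature set: the chosen statistics (mean, standard deviation, minimum, maximum), the window size of five samples, the `RollingStats` class name, and the example signal are hypothetical.

```python
from collections import deque
from statistics import fmean, pstdev


class RollingStats:
    """Summary statistics over the most recent `size` samples of one variable."""

    def __init__(self, size):
        if size < 1:
            raise ValueError("window size must be at least 1")
        self.window = deque(maxlen=size)  # oldest sample drops out automatically

    def update(self, value):
        """Add one sample; return the features once the window is full."""
        self.window.append(float(value))
        if len(self.window) < self.window.maxlen:
            return None  # not enough history yet
        w = self.window
        return {"mean": fmean(w), "std": pstdev(w), "min": min(w), "max": max(w)}


# Hypothetical signal sampled once per second; the last two values mimic
# a sudden load increase such as a mechanically hindered movement.
signal = [2.0, 2.1, 1.9, 2.0, 2.0, 2.1, 6.5, 6.8]
stats = RollingStats(size=5)
features = [stats.update(v) for v in signal]
```

With one such window per drive variable, the resulting statistics could be concatenated with, or used instead of, the raw values as the input vector of the anomaly model. A jump in the signal raises the window's standard deviation and maximum even when a single raw sample still looks plausible on its own.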