Elise Thorud ABB Energy Industries Oslo, Norway elise.thorud@no.abb.com
Marcel Dix ABB Corporate Research Mannheim, Germany marcel.dix@de.abb.com
Jean-Christophe Blanchon Corys Grenoble, France
Benjamin Kloepper Former ABB employee
Machine learning models for anomaly detection are usually trained on historical plant data, which rarely contains enough failure cases. With a view to overcoming this drawback, Corys and ABB have combined two simulation technologies to create an environment that generates data closely resembling that produced by specific processes in real industrial plants. This level of simulation accuracy opens the door to tailor-made, targeted, and accelerated anomaly detection capabilities.
Industrial facilities need to run as smoothly as possible. To do so, indications of potential problems, such as anomalous vibrations, temperatures, pressures, and sounds, need to be detected, identified, analyzed, and managed in their earliest stages. Anomaly detection, a key application of machine learning, can play a major role here by effectively supporting plant operators as they monitor the health of industrial systems.
Machine learning models, however, are typically trained using historical plant data. But as industrial systems are very robust, there are often not enough examples of real failure cases in the data to train reliable models. Moreover, even if some failure cases did occur, they are often hard to find in the data because they were not labeled as such by the operator, or because they were not noticed when they occurred. Furthermore, this state of affairs can lead to the mistaken identification of anomalous situations as being normal.
Creating an infrastructure for machine learning research
With a view to overcoming these drawbacks, data scientists are using high-fidelity process simulators, such as the Indiss Plus Simulator from Corys [1], to generate data for specific normal and abnormal plant situations, such as valve failures. Because these events are triggered deliberately in the simulation, they can be labeled correctly when training machine learning models.
For instance, Corys and ABB have created an infrastructure for machine learning research designed to explore the potential, as well as the data requirements, of different algorithms in a realistic setup →01. At the heart of the infrastructure are the simulation tools of the two companies: Corys’ process simulation Indiss Plus and ABB’s control system simulator 800xA Simulator [3]. Individually, both tools have proven highly accurate in several operator training projects. Now, in a combined configuration, the tools can simulate the behavior of a process together with its associated automation system, for instance a real plant’s control logic, including alarms and safety logic.

A key advantage of Indiss Plus in this setup is that it also opens the door to simulating various plant equipment failures, eg, a valve leakage as shown in →02. The resulting failure data can overcome the issue of not having a sufficient number of failure cases to support machine learning.

To create simulation data sets suitable for the training and validation of a machine learning model, the execution of simulation experiments must be automated. In the present case, an experiment controller was developed as shown in →01. The experiment controller takes in an experiment plan describing when to perform various operator actions, like setpoint changes, and when to trigger failures within the Indiss Plus process simulation. The experiment controller performs batches of experiments, starting and stopping the process simulation from different initial process states and automatically performing operator actions. It also starts the data collection, which receives data from the 800xA Simulator. This makes it possible to use ABB’s 800xA as a simulated control system with the same operator layout, views, and control logic as in the plant. The data and a protocol of the actions performed by the experiment controller are stored in a time-series database and made available to a data scientist for the training of machine learning models.
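The role of the experiment controller can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `PlannedAction`, `StubSimulator`, `run_experiment`, and the tag names are hypothetical, and the stub stands in for the coupled simulators rather than reproducing any Indiss Plus or 800xA Simulator interface. It only shows the idea of replaying a timed plan of operator actions and failure triggers while recording both the signals and a protocol of what was done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedAction:
    """One entry of a hypothetical experiment plan."""
    time_s: float   # simulation time at which the action is applied
    kind: str       # "setpoint" (operator action) or "failure" (injected fault)
    target: str     # hypothetical tag name
    value: float


@dataclass
class StubSimulator:
    """Stand-in for the coupled process/control simulation (not a real API)."""
    state: dict
    time_s: float = 0.0

    def apply(self, action):
        self.state[action.target] = action.value

    def step(self, dt_s):
        self.time_s += dt_s

    def read_signals(self):
        return dict(self.state)


def run_experiment(plan, initial_state, duration_s, dt_s=1.0):
    """Replay a plan from one initial state; return (samples, protocol)."""
    sim = StubSimulator(state=dict(initial_state))
    pending = sorted(plan, key=lambda a: a.time_s)
    samples, protocol = [], []
    while sim.time_s < duration_s:
        # Apply every action that is due at the current simulation time.
        while pending and pending[0].time_s <= sim.time_s:
            action = pending.pop(0)
            sim.apply(action)
            protocol.append((sim.time_s, action.kind, action.target, action.value))
        samples.append((sim.time_s, sim.read_signals()))
        sim.step(dt_s)
    return samples, protocol


def run_batch(plan, initial_states, duration_s):
    """Repeat the same plan from several initial process states."""
    return [run_experiment(plan, state, duration_s) for state in initial_states]


plan = [
    PlannedAction(2.0, "setpoint", "oil_level_sp", 0.6),
    PlannedAction(5.0, "failure", "oil_valve_leak", 1.0),
]
initial = {"oil_level_sp": 0.5, "oil_valve_leak": 0.0}
samples, protocol = run_experiment(plan, initial, duration_s=8.0)
```

Because the protocol records when each failure was triggered, every sample can later be labeled as normal or faulty without manual inspection.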
Case study: Developing a machine learning model for anomaly detection
In the study described in this article, simulated datasets were used to train a model for anomaly detection that would be able to detect simulated device failures.
A feasible approach in machine learning for detecting anomalies in signal time series is to utilize so-called autoencoders [4]. An autoencoder is composed of two artificial neural networks, the first one learning to compress the data (encoder) and the second one learning how to reconstruct the original data from its compressed form (decoder). For the purpose of anomaly detection, the reconstruction error, ie, the difference between the original data and its reconstruction, is used to measure how abnormal the data is.
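The encoder/decoder principle can be sketched with a deliberately tiny example. This is not the model used in the study, whose architecture the article does not describe. It is a linear autoencoder with a one-dimensional bottleneck, written in plain Python and trained by gradient descent on two hypothetical, perfectly correlated signals. All names and values are assumptions chosen for illustration.

```python
def encode(x, w_enc):
    """Encoder: compress a two-signal sample into a single number."""
    return sum(w * xi for w, xi in zip(w_enc, x))


def decode(z, w_dec):
    """Decoder: reconstruct both signals from the compressed value."""
    return [w * z for w in w_dec]


def reconstruction_error(x, w_enc, w_dec):
    """Mean squared difference between a sample and its reconstruction."""
    x_hat = decode(encode(x, w_enc), w_dec)
    return sum((xh - xi) ** 2 for xh, xi in zip(x_hat, x)) / len(x)


def train(data, epochs=500, lr=0.05):
    """Full-batch gradient descent on the summed squared reconstruction error."""
    w_enc, w_dec = [0.5, 0.5], [0.5, 0.5]
    n = len(data)
    for _ in range(epochs):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z = encode(x, w_enc)
            err = [xh - xi for xh, xi in zip(decode(z, w_dec), x)]
            dz = sum(e * w for e, w in zip(err, w_dec))  # gradient w.r.t. z (up to factor 2)
            for i in range(2):
                g_dec[i] += 2 * err[i] * z / n
                g_enc[i] += 2 * dz * x[i] / n
        w_enc = [w - lr * g for w, g in zip(w_enc, g_enc)]
        w_dec = [w - lr * g for w, g in zip(w_dec, g_dec)]
    return w_enc, w_dec


# Hypothetical "normal operation": the second signal is always twice the first.
normal_data = [(t / 5, 2 * t / 5) for t in range(-5, 6)]
w_enc, w_dec = train(normal_data)

normal_error = reconstruction_error((0.3, 0.6), w_enc, w_dec)     # follows the learned relation
anomaly_error = reconstruction_error((0.5, -1.0), w_enc, w_dec)   # breaks the learned relation
```

The sample that follows the learned relation between the two signals is reconstructed almost perfectly, while the sample that violates it cannot be represented by the one-dimensional bottleneck and therefore yields a much larger error.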
For the purposes of the current study, the high-fidelity process simulator Indiss Plus from Corys was used. Here, Corys had implemented a high-fidelity simulation model of a three-phase separator process that is typically used in oil production. The core component of this process is a separator vessel that separates fluids from a well into three outputs: oil, gas, and wastewater. In order for the separator to function properly, it is important to keep the oil, water, and gas levels in balance. This is done automatically by the control system, which adjusts several valves →03. If the setpoint of one of the levels is changed, the system will adjust the other valves automatically to keep the whole separator in balance.

The above-described simulator was used to train an autoencoder to detect a physical valve failure such as a valve blockage or leakage. Such failures are often difficult for operators to detect, particularly if they are not represented directly in an HMI, which can be the case if, eg, sensors for detecting these failures are missing. The idea was to train an autoencoder that learns the signal trends from the three-phase separator process during normal operation, ie, when there are no failures. The trained autoencoder was then applied to try to reconstruct the trends for different simulated device failures.
In the current evaluation the autoencoder was able to detect device failures as anomalies because the signal trends that represent these failures had not previously been seen by the autoencoder during model training. This led to a relatively high reconstruction error. When the error was higher than a predefined threshold, the autoencoder classified this situation as anomalous and reported the anomaly to the user. As shown in →04, this anomaly threshold was exceeded exactly at the time of the device failure; but when the failure was removed in Indiss Plus, the reconstruction error from the autoencoder went back to normal. When an anomaly is detected, a subsequent step is to locate its potential root cause. In →04 the root cause was found to be in the oil valve.
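The thresholding step described above can be sketched as follows. The article only states that a predefined threshold was used and that the root cause was located in a subsequent step. The rule for deriving the threshold from normal data (mean plus a multiple of the standard deviation) and the idea of ranking signals by their share of the reconstruction error are assumptions made for illustration, and all numbers are made-up toy values, not results from the study.

```python
from statistics import mean, pstdev


def threshold_from_normal(normal_errors, margin=3.0):
    """Assumed rule: mean plus `margin` standard deviations of normal-operation errors."""
    return mean(normal_errors) + margin * pstdev(normal_errors)


def detect_anomalies(errors, threshold):
    """Return (first_index, last_index) intervals where the error exceeds the threshold."""
    intervals, start = [], None
    for i, error in enumerate(errors):
        if error > threshold and start is None:
            start = i
        elif error <= threshold and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, len(errors) - 1))
    return intervals


def largest_contributor(per_signal_errors):
    """Name the signal with the largest reconstruction error as a root-cause candidate."""
    return max(per_signal_errors, key=per_signal_errors.get)


# Toy reconstruction errors: a failure is active at samples 3 to 5, then removed.
normal_errors = [0.01, 0.02, 0.01, 0.02]
errors = [0.01, 0.02, 0.01, 0.30, 0.35, 0.32, 0.02, 0.01]

threshold = threshold_from_normal(normal_errors)
intervals = detect_anomalies(errors, threshold)

# Hypothetical per-signal errors during the anomalous interval.
candidate = largest_contributor(
    {"oil_valve": 0.25, "water_valve": 0.03, "gas_valve": 0.02}
)
```

A per-signal ranking of this kind only points to where the reconstruction fails most. Whether that signal is the actual root cause still has to be confirmed by the operator or by further analysis.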

Toward hybrid digital twins
The research described in this article outlines how ABB and Corys have worked together to create an infrastructure for reproducible machine learning research. The Corys Indiss Plus and ABB 800xA Simulator tools create an environment that produces data closely resembling that produced by real industrial plants. The key difference is that machine learning scientists have full control of the data generation and can test and evaluate their approach in a sound and comprehensive way. The combination of high-fidelity simulation based on first-principles models and machine learning enables the creation of plant digital twins composed of different types of models. These models can be leveraged according to the functionality the digital twins should deliver to the various plant stakeholders, ranging from operators to plant managers. Such hybrid digital twins hold the promise of becoming a key enabler of future autonomous industrial plants.
The logical next step in this research will be to test the machine learning infrastructure described in this article in a simulation of an actual customer plant. This will make it possible to investigate the potential benefits of machine learning models that have been pre-trained using simulation models of actual applications.
References
[1] T. Gamer and A. Isaksson, “Autonomous systems,” ABB Review, vol. 2018, no. 4, 2018.
[2] T. Gamer, M. Hoernicke, B. Klöpper, R. Bauer and A. Isaksson, “The autonomous industrial plant – future of process engineering, operations and maintenance,” Journal of Process Control, vol. 88, pp. 101 – 110, 2020.
[3] Corys, “Indiss Plus – Dynamic simulation platform,” Corys, 03 6 2020. [Online]. Available: https://www.corys.com/en/indiss-plusr. [Accessed April 20, 2021].
[4] M. Sakurada and T. Yairi, “Anomaly detection using autoencoders with nonlinear dimensionality reduction,” in Proceedings of the MLSDA 2014 2nd Workshop on Machine Learning for Sensory Data Analysis, 2014.
[5] Z. Ge, Z. Song, S. X. Ding and B. Huang, “Data Mining and Analytics in the Process Industry: The Role of Machine Learning,” IEEE Access, vol. 5, pp. 20,590 – 20,616, 2017.
[6] S. J. Qin and L. H. Chiang, “Advances and opportunities in machine learning for process data analytics,” Computers and Chemical Engineering, vol. 126, pp. 465 – 473, 2019.
[7] I. Amihai, R. Gitzel, A. M. Kotriwala, D. Pareschi, S. Subbiah and G. Sosale, “An Industrial case study using vibration data and machine learning to predict asset health,” in 2018 20th IEEE International Conference on Business Informatics, Wien, 2018.
[8] T. Wuest, D. Weimer, C. Irgens and K. D. Thoben, “Machine learning in manufacturing: advantages, challenges, and applications,” Production & Manufacturing Research, vol. 4, no. 1, pp. 23 – 45, 2016.
Originally published in ABB Review 02/2021, pp. 14 – 17