AI learns to mimic process dynamics


Industrial automatic control systems employ digital twin models to relate process inputs to outputs. Until recently, keeping such digital twins always up to date would have required an extreme engineering effort, but recent advances in artificial intelligence technologies are about to change that.

Mehmet Mercangoez, Andrea Cortinovis, ABB Corporate Research, Baden-Dättwil, Switzerland; Luis Dominguez, Former ABB employee

The role of automatic control systems in industrial plants is to maintain a safe and stable operation, ensure product quality by shifting variability from key outputs to process actuators, and provide operational flexibility and efficiency. Digital twins are powerful tools that help achieve these aims.

Practitioners in the domain of process control are no strangers to the concept of digital twins as even the simplest analysis of stability requires a mathematical model, conventionally in the form of a transfer function, to represent the relationship between a system’s input and output.

Other automatic control concepts such as controllability and observability also rely on the possibility of model-based analysis. Therefore, the availability of always-up-to-date digital representations of such input-output relationships will be an exhilarating thought for control engineers, especially those dealing with systems whose performance or even configuration is subject to changes and uncertainties. Until recently, this idea would have been a pure fantasy or at best something only possible with extreme engineering effort – but recent advances in artificial intelligence (AI) are about to change that →1.

01 Recent advances in AI are opening up whole new approaches to automatic process control in industry. Shown is a cluster mill used in metal processing.

Digital twin vs. mathematical model
Mathematical modeling approaches in process control can be divided into two main categories: Those that utilize physical insight and domain knowledge, and data-driven methods. However, this division is not necessarily absolute. Most models based on physical relationships contain parametrization options that are chosen based on process data (gray-box models) and for data-driven approaches (black-box models) it is not uncommon to select model orders or forms (such as first-order plus time delay) based on prior experience or domain knowledge.
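To make the "first-order plus time delay" form mentioned above concrete, the following minimal sketch evaluates the unit step response of a transfer function G(s) = K·exp(−θs)/(τs + 1). The gain, time constant and delay values are illustrative assumptions and are not taken from any plant discussed in this article.

```python
import math

def fopdt_step_response(gain, tau, delay, t_end, dt):
    """Unit step response of gain*exp(-delay*s)/(tau*s + 1), sampled every dt."""
    times, outputs = [], []
    n_steps = int(round(t_end / dt))
    for k in range(n_steps + 1):
        t = k * dt
        if t < delay:
            y = 0.0  # dead time: the output has not reacted yet
        else:
            y = gain * (1.0 - math.exp(-(t - delay) / tau))
        times.append(t)
        outputs.append(y)
    return times, outputs

# Hypothetical parameters chosen only for illustration.
times, outputs = fopdt_step_response(gain=2.0, tau=5.0, delay=3.0, t_end=60.0, dt=0.5)
```

In a black-box setting, only the three parameters (gain, time constant, delay) would be fitted to data, while the model form itself reflects prior experience.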

Regardless of the modeling approach, the conventional practice is to have experienced engineers carry out plant tests or analyze historical data to configure these models for specific applications. These applications can range from the creation of a simple soft sensor (eg, for Kappa number estimation in pulp production), to model predictive control (MPC) – eg, for controlling an integrated gasification and combined cycle plant. Unfortunately, all industrial plants are in a constant state of change: Catalysts in catalytic reactors get poisoned, compressors or turbines experience fouling and heat exchangers clog. Such phenomena lead to deviations from model predictions.

In contrast, a digital twin is a mathematical process model that can stay up-to-date and preferably do so without an army of expert engineers looking at the data to manually tweak and tune the model constituents.

Conventional digital twins: a scalability challenge
In conventional models, future system changes –  for example coke deposition on a turbocharger – must be known and modeled in advance. But as system complexity grows, it becomes more and more difficult to write down the relevant mathematical formulas and identify the associated states and parameters to estimate. Qualified and experienced engineers are required to achieve this task and eventually the cost of the solution crosses the commercial feasibility barrier.

The art and science of black-box system identification
“System identification” refers to the model estimation problem for dynamic systems, based on observed input-output behavior [1]. The black-box model identification techniques emerging from this domain of study have found extensive use in industrial applications. Black-box modeling stands out as a scalable alternative to gray-box modeling approaches, especially since it shifts the requirement from domain expertise and know-how to the complexity of the mathematical constructs and algorithms, and to increased amounts of measurement data.

However, some challenges remain. Current practice is to use a series of open-loop and single-variable step tests to generate the data required for model identification using established black-box identification techniques. This procedure is not only time-consuming, but it might also not be feasible if the involved control loops cannot be safely taken into manual operation.

Further, step or impulse-response models can capture some nonlinear dynamic behavior of the outputs – such as time delays – but they need to be superposed in a linear way to represent multiple-input, multiple-output (MIMO) behavior and can only capture linear input and output relationships in the steady state.
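The linear superposition described above can be sketched as follows: each input is convolved with its own impulse response and the contributions are simply added. The two impulse responses and the input steps are hypothetical and serve only to illustrate the structure.

```python
import numpy as np

def fir_response(inputs, impulse_responses):
    """Output as the sum of each input channel convolved with its impulse response."""
    n = inputs.shape[0]
    output = np.zeros(n)
    for channel, g in enumerate(impulse_responses):
        output += np.convolve(inputs[:, channel], g)[:n]
    return output

# Hypothetical impulse responses: one path with a 3-sample delay,
# one faster path with negative gain.
g1 = np.concatenate([np.zeros(3), 0.2 * 0.8 ** np.arange(30)])
g2 = -0.1 * 0.5 ** np.arange(30)

n = 100
u = np.zeros((n, 2))
u[10:, 0] = 1.0  # unit step on input 1 at sample 10
u[40:, 1] = 2.0  # step of size 2 on input 2 at sample 40
y = fir_response(u, [g1, g2])
```

Because the model is a sum of convolutions, doubling every input exactly doubles the output, which is the steady-state linearity limitation noted in the text.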

Moreover, these rudimentary models cannot represent open-loop unstable behavior. More advanced methods such as subspace identification can be used to obtain truly MIMO models, but these are inherently linear and cannot efficiently represent delays or saturation conditions. It is possible to utilize multiple models to represent the dynamic system behavior and capture nonlinearities, but this is a nontrivial task requiring further engineering effort.

The AI package: manifold learning meets recurrent neural networks
ABB decided to explore a new approach to constructing digital twins in the process industries, with several main objectives:
• Minimize or eliminate expert intervention or manual engineering.
• Have the ability to build the models without any open-loop plant testing and preferably from historical operating data.
• Retain or exceed previous prediction accuracy levels.

Among these objectives, the second is the one that will run up against fundamental scientific limits: no algorithm or amount of computation can extract information that is not contained in the data. If the system has never visited a certain operating condition, the information needed will be absent and the predictive ability of the derived models will be constrained.

To realize the objectives listed above, recent developments in AI, and more specifically machine learning, were exploited.

The use of neural networks for nonlinear system modeling is not a new idea in process control. Investigations in the 1990s examined the use of multilayer feedforward networks with hidden layers to approximate plant responses and their derivatives [2]. The conclusions of these early studies mostly highlighted the computational difficulties involved. Also, in those days, feedforward networks were, by design, not optimal for modeling dynamic systems – and key developments in recurrent neural networks (RNNs) or long short-term memory (LSTM) that better represent dynamics had not yet happened. Today, many research articles consider the use of RNNs for modeling dynamic systems in control problems.

Another development – manifold learning – delivered another piece of the solution. Principal component analysis (PCA) and dimensionality reduction techniques in general have been used frequently in process systems engineering, especially for condition monitoring and fault-detection applications.

The basic idea here is that process plants are many-dimensional by nature, but numerous measurements correlate due to their underlying physics. Again, the prevalent methods here were linear and had limited success in capturing nonlinear behavior. Although extensions of the linear methods were proposed and employed with some success, recent developments in artificial neural networks and their use for constructing variational autoencoders (VAEs) have brought a much more significant improvement in the performance of these approaches. A VAE is an unsupervised learning construct consisting of an encoding and a decoding structure that maps data to a reduced-order space and back to the original dimension with the goal of losing as little information as possible in the process.
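The encode/decode idea can be illustrated with its linear counterpart. The sketch below applies PCA (via a singular value decomposition) to synthetic data in which ten "sensors" are driven by two hidden factors; the data, sizes and noise level are assumptions made purely for illustration. A VAE replaces the two linear maps with neural networks so that nonlinear correlations can be captured as well.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "plant": 2 hidden physical factors drive 10 correlated sensors.
n_samples, n_latent, n_sensors = 500, 2, 10
latent = rng.normal(size=(n_samples, n_latent))
mixing = rng.normal(size=(n_latent, n_sensors))
measurements = latent @ mixing + 0.01 * rng.normal(size=(n_samples, n_sensors))

# Linear PCA via singular value decomposition of the centered data.
mean = measurements.mean(axis=0)
centered = measurements - mean
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)
explained = singular_values**2 / np.sum(singular_values**2)

# Encode to 2 dimensions, then decode back to the 10 sensor channels.
encoded = centered @ components[:n_latent].T
decoded = encoded @ components[:n_latent] + mean
reconstruction_error = np.mean((measurements - decoded) ** 2)
```

Here `explained[:2]` accounts for almost all of the variance, which is the sense in which correlated plant measurements occupy a much smaller space than their raw dimension suggests.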

02 The basic scheme of the AI algorithm used to predict the future behavior of a process plant given future values of plant inputs.

The main idea in the ABB approach for creating digital twins is to learn plant dynamics in a low-dimensional space encoded by the VAEs. In practice, the VAEs are realized as feedforward neural networks. In contrast to similar arrangements encountered in the literature, ABB suggests an additional layer that initializes the states of the so-called GRUs (gated recurrent units), which are the main element of the RNNs capturing the system dynamics. Note that the input neural network layers depicted in →2 are only shown for completeness – in typical control problems the inputs are already reduced to a minimum set and are physically independent and uncorrelated. Therefore, a mapping of inputs to a lower dimensional latent space is not needed.
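The data flow described above can be sketched in plain NumPy. This is not ABB's implementation: all layer sizes are hypothetical, the weights are random and untrained, the encoder is a single deterministic layer (the sampling step of a real VAE is omitted), and the GRU follows one common gate convention. The sketch only shows how an encoded history can initialize the recurrent state before the network is rolled out over future inputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, p):
    """One GRU update for input x and hidden state h."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])  # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])  # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])
    return (1.0 - z) * h + z * h_cand

def predict(past_outputs, future_inputs, p):
    """Encode past outputs, initialize the GRU state, roll out over future inputs."""
    latent = np.tanh(p["W_enc"] @ past_outputs.ravel() + p["b_enc"])  # encoder
    h = np.tanh(p["W_init"] @ latent + p["b_init"])  # state-initialization layer
    predictions = []
    for u in future_inputs:
        h = gru_step(u, h, p)
        predictions.append(p["W_dec"] @ h + p["b_dec"])  # decoder to output space
    return np.array(predictions)

# Hypothetical sizes; random, untrained weights for shape checking only.
n_in, n_out, n_past, n_latent, n_hidden, horizon = 3, 3, 20, 4, 8, 15
rng = np.random.default_rng(0)
shapes = {
    "Wz": (n_hidden, n_in), "Uz": (n_hidden, n_hidden), "bz": (n_hidden,),
    "Wr": (n_hidden, n_in), "Ur": (n_hidden, n_hidden), "br": (n_hidden,),
    "Wh": (n_hidden, n_in), "Uh": (n_hidden, n_hidden), "bh": (n_hidden,),
    "W_enc": (n_latent, n_past * n_out), "b_enc": (n_latent,),
    "W_init": (n_hidden, n_latent), "b_init": (n_hidden,),
    "W_dec": (n_out, n_hidden), "b_dec": (n_out,),
}
params = {name: 0.1 * rng.normal(size=shape) for name, shape in shapes.items()}
past = rng.normal(size=(n_past, n_out))
future = rng.normal(size=(horizon, n_in))
y_hat = predict(past, future, params)
```

In a trained model, the parameters would be fitted jointly so that `y_hat` matches recorded plant outputs over the prediction horizon.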

Putting it to the test
To test the performance and robustness of the suggested approach, a complex system concerning the machine direction (MD) control of a paper machine was chosen [3,4]. Paper machines have strong interactions between multiple scanning frames (sensors) and multiple sets of actuators. These interactions affect the properties of the paper produced. Moreover, due to different response times and transport delays between actuators and measurements along the scanned sensor locus, the sheets may experience different shrinkage amounts that lead to different widths. In general, the MD control system should apply control actions that eliminate rapid sheet weight and moisture variations during transitions from manufacturing one type of paper to another (“grade-change”), considering and predicting the long transport delays.

The model was validated against plant data that consisted of three inputs: Stock Flow (ST01), Steam Pressure 1 (PR02) and Steam Pressure 2 (PR03); three outputs: Dry Weight (DW), Moisture Content 1 (MT1) and Moisture Content 2 (MT2); and three disturbances: Retention Air flow or Ash Content (RA01), Bright Clay Flow (CY01) and Machine Speed (MS). The objective in the present case study is reference tracking on the three outputs defined above, whereby the disturbances are assumed to be measured. The main challenges of the case study are the system complexity arising from a large number of states, the handling of various delays and the measurement noise added to the output signals.

Training data for this exercise was generated by running a high-fidelity paper machine simulator with various output reference changes and recording the input/output signals. The output reference changes are carried out back-to-back, meaning that no preprocessing is necessary to filter out steady-state operation. Autocorrelated noise is added to each output channel to make the setup more realistic. Independent datasets never seen in training are generated for validation.
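The article does not specify the noise model used. As one common way to produce additive autocorrelated noise, the sketch below assumes a first-order autoregressive, AR(1), process; the coefficient, noise level and the stand-in "clean" signal are all hypothetical.

```python
import numpy as np

def ar1_noise(n_samples, phi, sigma, rng):
    """Autocorrelated noise e[k] = phi * e[k-1] + w[k], with w ~ N(0, sigma^2)."""
    white = rng.normal(scale=sigma, size=n_samples)
    noise = np.zeros(n_samples)
    for k in range(1, n_samples):
        noise[k] = phi * noise[k - 1] + white[k]
    return noise

rng = np.random.default_rng(42)
clean_output = np.sin(np.linspace(0.0, 10.0, 5000))  # stand-in for a simulated signal
noise = ar1_noise(len(clean_output), phi=0.9, sigma=0.05, rng=rng)
noisy_output = clean_output + noise

# Lag-1 sample autocorrelation of the noise; it should be close to phi.
lag1 = np.corrcoef(noise[:-1], noise[1:])[0, 1]
```

Unlike white noise, such a disturbance drifts slowly, so a model cannot simply average it out over a few samples.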

The RNN model is constructed in Keras with a TensorFlow backend. GRU training is based on back-propagation through time (BPTT), which has a relatively high computational cost.

03 Validation results for N-step-ahead predictions in the paper machine case study. Dots indicate measurement updates; uncorrected predictions, where measurements and their histories are updated again, lie between the dots. Y-axes are normalized.

Both →3 and the quantitative metrics show that the RNN performs well on the unseen validation set and that it is, therefore, able to learn the dynamics of the system, including its delays. The obtained predictions can then be used for decision support, for closed-loop control using model-based techniques and, in the future, for autonomous operations →4.

04 Autonomous decision making.

Where ABB is now
The present work is just the first step toward the creation of sophisticated digital twins for the process industries. Some ingredients are already established, such as the capability to build accurate nonlinear multivariable dynamic models from closed-loop operating data [5]. The triggers for automatic retraining and the autonomous parametrization of the underlying structures in the associated neural networks are works in progress. The selection of architectures – such as the number of layers in the VAEs or the number of GRUs – is, at the moment, manual and requires knowledge of the deep-learning environments rather than knowledge of the engineering domains. Hyperparameter optimization can address this dependence.

05 ABB Ability APCA Suite implementing MD control of a paper machine using the VAE and RNN based modeling approach. The APCA Suite also supports graphical first-principles modeling, linear and nonlinear regression, PCA, artificial neural networks (ANN) and support vector machines (SVM).

A valuable asset is the ABB Ability™ APCA Suite – a set of tools that simplifies the deployment of advanced controllers and analytic models [6]. With the APCA Suite, analytic models can be deduced from either first principles or process data and deployed in the APCA run-time system. The modeling system described above, together with the VAEs and RNNs, is now part of the APCA Suite and is ready to benefit ABB’s customers →5. ABB is aiming to further improve these capabilities and offer customers tangible improvements in the safety, availability, quality, efficiency and flexibility of their processes.

[1] L. Ljung, “Perspectives on system identification,” Annual Reviews in Control, 34(1), pp. 1–12, 2010.
[2] K. Hornik et al., “Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks,” Neural Networks, 3(5), pp. 551–560, 1990.
[3] S.-C. Chen et al., “Use a Machine’s full capability,” Pulp & Paper International (PPI), Process Control, pp. 39–42, March 2009.
[4] S.-C. Chen et al., “Multivariable CD control applications,” IPW, Process and Quality Control, pp. 16–20, October 2008.
[5] N. Lanzetti et al., “Recurrent Neural Network based MPC for Process Industries,” European Control Conference, Naples, Italy, 2019.
[6] L. Dominguez and E. Gallestey, “Leveraging advanced process control and analytics in industrial automation,” ABB Review, 02/2018, pp. 38–45.


Traditional autonomous decision-making system
Traditional autonomous decision-making systems are handcrafted and are based on engineered mathematical models representing the physical reality. As a response to observations or commands, decisions are made by scanning over possible actions and using the mathematical models to determine the outcomes of those actions over a prediction horizon. The algorithm implements the action that produces the best outcome matching the commands.
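The scan-and-predict scheme described above can be sketched in a few lines. The first-order model, the grid of candidate actions and the quadratic tracking cost are all assumptions chosen for illustration; an industrial controller would use a far richer model and a proper optimizer rather than a brute-force scan.

```python
def predict_trajectory(state, action, horizon, a=0.9, b=0.5):
    """Roll a hypothetical first-order model x[k+1] = a*x[k] + b*u forward."""
    trajectory = []
    for _ in range(horizon):
        state = a * state + b * action
        trajectory.append(state)
    return trajectory

def choose_action(state, setpoint, candidates, horizon):
    """Scan candidate actions and return the one with the lowest tracking cost."""
    best_action, best_cost = None, float("inf")
    for action in candidates:
        trajectory = predict_trajectory(state, action, horizon)
        cost = sum((y - setpoint) ** 2 for y in trajectory)
        if cost < best_cost:
            best_action, best_cost = action, cost
    return best_action

candidates = [i / 10 for i in range(-20, 21)]  # actions from -2.0 to 2.0
state, setpoint = 0.0, 1.0
for _ in range(30):
    # Re-plan at every step and apply only the first move (receding horizon).
    action = choose_action(state, setpoint, candidates, horizon=10)
    state = predict_trajectory(state, action, horizon=1)[0]
```

Swapping `predict_trajectory` for a learned model leaves `choose_action` unchanged, which is the compatibility the AI-generated models are meant to preserve.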

AI-generated model
The modeling method proposed in this article replaces the engineered system models by system models generated by an AI application. These AI-generated models are still compatible with the traditional way of using decision-making algorithms and can be used to generate predictions based on possible actions. They lend themselves to optimization of the actions to obtain the desired behavior and outcomes.

A promising future approach
The two previous schemes rely on the computation of the actions in real time in response to observations and changing commands. The creation of the engineered models or the training of the AI-based models has to happen offline or in a separate step. A promising approach for the future is to also employ AI to learn the best actions corresponding to observations and commands and let the AI decide on the actions directly. The advantage of this approach is the possibility to come up with previously unthought-of strategies that are not restricted by the engineered constructs of the past. The proposed learning approach in this article can be used to generate digital twins as playgrounds for such AI algorithms to construct their policies on simulated realities.

