Workflow mining of operator audit trails

Despite extensive automation of industrial processes, operators often have to intervene manually. These interventions are recorded in various storage locations – such as the plant historian. How can this data, which is currently rarely reused, be exploited to create operational knowledge for future reuse?

Benedikt Schmidt, Marco Gärtler, Arzam Kotriwala, Sylvia Maczey, Reuben Borrison ABB Corporate Research Ladenburg, Germany, benedikt.schmidt@de.abb.com, marco.gaertler@de.abb.com, arzam.kotriwala@de.abb.com, sylvia.maczey@de.abb.com, reuben.borrison@de.abb.com

Even though the typical modern process plant is highly automated, manual intervention is still common – ie, human operators monitor the plant state continuously and counteract abnormal situations that endanger safety, environmental footprint, quality and operational efficiency by switching to manual mode and taking appropriate action. Repair and maintenance procedures and regular startups or shutdowns also require manual intervention. Interventions can take minutes or hours, and operators often repeat the same or a very similar intervention over weeks, months or years.

For compliance reasons, most process plants have a centralized historian that stores control system operational data. The historian covers event data and signals data generated by controllers, actuators and sensors. Manual interventions are also usually stored in the historian as an audit trail – ie, an event log that records every interaction with the control system, such as setpoint changes, the opening and closing of valves and the startup and shutdown of equipment.

01 Workflow mining of otherwise underused plant historian data can help improve operations. Photo: ©Kalyakan/stock.adobe.com

While every intervention is stored in the historian, this data is typically not processed further, partly because of its size and its differing formats. Even for small plants, the historian can store several hundred thousand events and signals from thousands of sensors every day, often leading to data quantities in the terabyte range.

Systems such as the historian represent a rich, untapped source of potentially valuable data. The question then arises: Can this data be used to preserve operational knowledge for future reuse? The answer is “yes.” Workflow mining is the key.

Workflow mining
Workflow mining of the manual interventions stored in the historian can allow a better understanding of a plant’s behavior, deliver insights into solution strategies and enable the assessment of the quality of these strategies. Workflow mining can also generate standardized best practices. However, extracting manual intervention cases from the process historian is a challenge in itself: the information related to a manual intervention is scattered, and it is not necessarily clear which case-cause data can be grouped or relates to the case in hand.

In this article, workflow mining on the plant historian is discussed with a specific focus on:
• Identification of cases of manual interventions.
• Identification of the plant state that triggered the manual intervention case.
• Extraction of case classes, which are put into a workflow mining pipeline, ultimately leading to operator guidance.

Manual intervention analysis
The first step is to create a tool to identify and display instances of manual interventions and their frequency and duration. This tool queries the audit trail and event database of the plant to provide a list of intervention data. From this data, a “case” must be extracted – ie, a subset of events from the list →02-03. The subset contains a seed event, around which the case is built, together with the events that occur within a given amount of time before the first event and after the last event of the case. In other words, case extraction is built upon the notion of temporal isolation.
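The temporal isolation idea can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the article: the function name `extract_case`, the event labels and the 30-second margin are hypothetical, and the case is assumed to grow repeatedly until a quiet period longer than the margin is found on both sides.

```python
from datetime import datetime, timedelta


def extract_case(events, seed_index, margin):
    """Grow a case around a seed event (illustrative assumption, not the ABB tool).

    events: list of (timestamp, label) tuples sorted by timestamp.
    margin: timedelta; a neighbouring event joins the case if it lies within
            `margin` of the current first or last event of the case.
    """
    start = end = seed_index
    grew = True
    while grew:
        grew = False
        # Extend backwards if the preceding event is close enough.
        if start > 0 and events[start][0] - events[start - 1][0] <= margin:
            start -= 1
            grew = True
        # Extend forwards if the following event is close enough.
        if end < len(events) - 1 and events[end + 1][0] - events[end][0] <= margin:
            end += 1
            grew = True
    return events[start:end + 1]


# Hypothetical audit trail: three events close together, then a long pause.
t0 = datetime(2020, 1, 1, 8, 0, 0)
audit_trail = [
    (t0, "setpoint change"),
    (t0 + timedelta(seconds=10), "valve opened"),
    (t0 + timedelta(seconds=20), "valve closed"),
    (t0 + timedelta(seconds=300), "pump started"),
    (t0 + timedelta(seconds=310), "setpoint change"),
]

case = extract_case(audit_trail, seed_index=1, margin=timedelta(seconds=30))
print([label for _, label in case])
```

With these sample values, the pause of almost five minutes isolates the first three events from the last two, so a seed in either group yields only that group.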

02 Number and duration of interventions in various regions of the plant.

Case-cause extraction
It is assumed that every case is triggered by a plant state, represented by sensor values, other process-related information and active alarms or events. Therefore, an analysis (a “fingerprint”) of the overall system state just before a case is contrasted with a “normal” one to extract the cause of the case. This fingerprinting activity depends highly on the system under investigation. For the process plant associated with the work described here, it was decided to focus on the state of the signals that are part of the case. For these signals, key performance indicators (KPIs) are generated based on moving-average calculations. In other words, a fingerprint of the sensor values of the plant before the manual intervention is compared to the average “normal” sensor values. Those signals with a difference above a certain threshold compared to the long-term KPIs are candidates for a case cause. Case-cause information is added to the case information.
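The comparison of a fingerprint against long-term KPIs might look like the following Python sketch. The details are assumptions made for illustration: the function name `case_cause_candidates`, the signal names, the use of a relative deviation and the threshold value are all hypothetical, and the article does not specify how the moving averages are configured.

```python
from statistics import mean


def case_cause_candidates(history, window, threshold):
    """Flag signals whose state just before a case deviates from normal.

    history:   dict mapping signal name to a list of samples ordered in time,
               ending just before the manual intervention (assumed layout).
    window:    number of most recent samples that form the fingerprint.
    threshold: relative deviation above which a signal becomes a candidate.
    """
    candidates = {}
    for name, values in history.items():
        baseline = mean(values[:-window])     # long-term "normal" KPI
        fingerprint = mean(values[-window:])  # average just before the case
        scale = abs(baseline) or 1.0          # avoid division by zero
        deviation = abs(fingerprint - baseline) / scale
        if deviation > threshold:
            candidates[name] = deviation
    return candidates


# Hypothetical signals: the temperature drifts upwards, the pressure is stable.
history = {
    "temperature": [100] * 20 + [130] * 5,
    "pressure": [5.0] * 25,
}
print(case_cause_candidates(history, window=5, threshold=0.1))
```

In this example only the temperature signal exceeds the threshold, so it would be attached to the case as a candidate cause.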

03 One region of the plant can be selected to see where most manual interventions happen.

Case clustering
Every extracted case potentially presents a different manual intervention that solves a specific situational issue by following a particular strategy. To prepare for workflow mining, cases that represent similar strategies are clustered.
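The article does not name the clustering method. As one possible illustration, the Python sketch below groups cases by the Jaccard similarity of the action types they contain, using a simple greedy pass. The function names, the action labels and the similarity threshold are hypothetical.

```python
def jaccard(a, b):
    """Share of action types that two cases have in common."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_cases(cases, min_similarity):
    """Greedy clustering (an assumed stand-in for the unspecified method).

    Each case joins the first cluster whose first member is similar enough,
    otherwise it starts a new cluster. Returns lists of case indices.
    """
    clusters = []
    for index, actions in enumerate(cases):
        for cluster in clusters:
            representative = cases[cluster[0]]
            if jaccard(actions, representative) >= min_similarity:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters


# Hypothetical cases, each reduced to the action types it contains.
cases = [
    ["open_valve_A", "setpoint_T1", "close_valve_A"],
    ["open_valve_A", "setpoint_T1"],
    ["start_pump_P2", "setpoint_F3"],
    ["start_pump_P2", "setpoint_F3", "stop_pump_P2"],
]
print(cluster_cases(cases, min_similarity=0.5))
```

A set-based similarity ignores the order of actions. A sequence-aware measure such as edit distance would be an alternative if the order of steps distinguishes strategies.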

04 Example of a mined episode.

Finding solutions
Once a general understanding of manual interventions has been gained from the audit trail and event database, as described above, solution procedures, so-called episodes, can be examined. →04 shows the four screen elements of a typical episode as displayed in the tool ABB has developed for this task. Along the top is a slider bar that defines how close two interventions need to be to belong to the same episode. Extending this window captures more events, resulting in a much longer solution procedure. The optimization of the length of this event window is a work in progress. The top-left element in →04 shows event types over time; top right is a density plot of events over time; and the bottom half of the screen shows plant events relating to the episode.
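The effect of the slider can be sketched in Python as a gap-based split of the intervention timeline. This is an assumption about how such a window could work, not a description of the ABB tool: the function name `split_into_episodes` and the sample timestamps are hypothetical.

```python
def split_into_episodes(timestamps, max_gap):
    """Group sorted timestamps (in seconds) into episodes.

    Two consecutive interventions belong to the same episode if they are
    at most `max_gap` seconds apart (the value set by the slider).
    """
    episodes = []
    for t in timestamps:
        if episodes and t - episodes[-1][-1] <= max_gap:
            episodes[-1].append(t)
        else:
            episodes.append([t])
    return episodes


# Hypothetical intervention times in seconds.
interventions = [0, 20, 50, 200, 230, 600]

print(split_into_episodes(interventions, max_gap=60))
print(split_into_episodes(interventions, max_gap=180))
```

With a 60-second window the sample data splits into three episodes. Widening the window to 180 seconds merges the first two into one longer episode, which mirrors the trade-off described above.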

05 Example with relation of plant signals and workflow.

After the tool has defined suitable episodes, the next step is the generation of workflows – step-by-step instructions for the operator to rectify the abnormal situation in the future. Similar episodes that represent solutions to the same issue are imported into an external tool to generate a workflow →05. This workflow shows all the different actions taken to address the same problem – here, handling of the burner in a waste incinerator. →05 includes steps that were rarely executed and these can be filtered out to provide a step-by-step guide that consists of the most frequently executed steps →06. A timing guide can also be generated →07. Before it goes live, the workflow is checked by an expert.
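The external tool used for workflow generation is not named in the article. The Python sketch below uses a common process-mining building block instead: it counts how often one action directly follows another across similar episodes, filters out rarely executed transitions and averages the time between steps. The function name `mine_workflow`, the burner-related action names and the 50 percent cut-off are hypothetical.

```python
from collections import Counter, defaultdict


def mine_workflow(episodes, min_share):
    """Build a filtered directly-follows summary from similar episodes.

    episodes:  list of episodes, each a time-sorted list of (seconds, action).
    min_share: keep a transition only if it occurs in at least this share
               of the episodes.
    Returns {(action, next_action): (episode_count, mean_seconds_between)}.
    """
    counts = Counter()
    durations = defaultdict(list)
    for episode in episodes:
        seen = set()
        for (t1, a), (t2, b) in zip(episode, episode[1:]):
            durations[(a, b)].append(t2 - t1)
            if (a, b) not in seen:  # count each transition once per episode
                counts[(a, b)] += 1
                seen.add((a, b))
    workflow = {}
    for step, count in counts.items():
        if count / len(episodes) >= min_share:
            workflow[step] = (count, sum(durations[step]) / len(durations[step]))
    return workflow


# Hypothetical episodes for the same problem; the third has an extra step.
episodes = [
    [(0, "reduce_feed"), (30, "open_air_damper"), (90, "restart_burner")],
    [(0, "reduce_feed"), (40, "open_air_damper"), (100, "restart_burner")],
    [(0, "reduce_feed"), (20, "ack_alarm"), (50, "open_air_damper"),
     (110, "restart_burner")],
]

for step, (count, seconds) in mine_workflow(episodes, min_share=0.5).items():
    print(step, count, seconds)
```

The rarely used detour via `ack_alarm` is dropped by the filter, and the mean durations give a simple timing guide. As in the article, a result like this would still need to be checked by an expert before use.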

06 Less-frequently used steps can be filtered out.
07 The time taken for the steps can also be displayed.

Use by the operator
In the field, when an abnormal situation occurs for which a workflow is available, this workflow will be recommended to the operator. On acceptance, the step-by-step workflow will be shown on a sidebar.

08 Applied process of case and workflow mining.

The full process described above, from intervention type selection through case extraction, case-cause extraction and case clustering to workflow mining, is summarized in →08.

Insights from a mid-sized plant
During the development process, the team worked with a copy of the historian from a medium-sized power plant. The plant stores 8,000 signals in the historian and generates approximately 80 million events per year. The events include the operator’s audit trail. The six-month dataset from the historian was used to test different approaches. Discussions with experts further aided the development process.

09 Event type count. The numbers provide a rough idea of the dimensions to be expected by the analysis and mining activity.

→09 shows the large number of alarms triggered in the plant (high alarm numbers are not atypical). Operators have a good understanding of alarms and their relation to the plant state.

Enhanced process plant operation
Much valuable data is to be found in isolated plant historians. Workflow mining techniques exploit such data to bring benefits to a process plant’s operation. Online systems can be created to assist operators faced with abnormal conditions, and future work foresees enhancements through machine-learning approaches and full automation of the workflow mining process.

Some related topics require more research – for example, how to realize event localization if the plant tagging scheme does not provide it, or how to assess the conformity and efficiency of the mined workflows, given that operators might perform actions that do not conform with the general guidance (eg, ignoring recommended sequences for starts or stops of equipment).

The successful resolution of these, and other, topics will allow plant operators to make more use of the data they already have to enhance the performance of their assets further and improve their financial results. 
