Summarizing the standard's 10 stages into four general tasks can simplify implementation and speed up the schedule.
Marcus Tennant
Figure 1: The stages in the ISA-18.2 lifecycle model of alarm management are depicted in this diagram. All graphics courtesy: Yokogawa
In process industries, alarm systems notify operators and other plant personnel of abnormal process conditions or equipment malfunctions. They help operators run the process safely under both normal and abnormal conditions, so the alarm system must be designed correctly to provide the best opportunity for safe and efficient operation.
Before the wide adoption of distributed control systems (DCSs) and other PC-based human machine interfaces (HMIs), visual and audible indications of process plant operations were normally provided by a panel board, with the number of alarms restricted because of space limitations. Alarm points had to be selected with care because they were hardwired and expensive to change.
But with a modern automation system, the number of alarms is virtually unlimited, as additions and changes are made simply by reconfiguring software. This ease of use provides the opportunity to improve alarm systems, but can also make alarm management more challenging.
In particular, there is a temptation to alarm every possible deviation, even when the deviation doesn't require immediate attention. In the event of a serious incident, this practice can generate a huge number of alarms simultaneously, commonly referred to as alarm flooding. When this occurs, operators may not be able to identify and act on the important alarm(s), allowing the incident to escalate in severity.
In the worst case, alarm flooding can cause serious environmental damage, production loss, injury, or even death to plant personnel. Proper management of alarm systems is essential to deal with alarm flooding and other related issues.
Poor alarm management can lead to serious consequences in process plants, as noted in the book "Alarm Management for Process Control" by Douglas H. Rothenberg, and by others in various documents and publications.
For example, poor alarm management caused one incident that resulted in $80 million in damage and injured 26 people. Another process plant incident resulted in 15 deaths, 170 injuries, and significant economic losses. To avoid these types of incidents, proper alarm management is essential.
To improve alarm management, the International Society of Automation (ISA) issued standard ANSI/ISA-18.2-2009, "Management of Alarm Systems for the Process Industries." When issuing this standard, ISA considered other existing documents including the Engineering Equipment and Materials Users' Association (EEMUA) Publication 191, "Alarm Systems: A Guide to Design, Management and Procurement." The International Electrotechnical Commission (IEC) is using ISA-18.2 as the basis for international alarm management standard IEC 62682.
Role of alarms
ISA-18.2 defines an alarm as "An audible and/or visible means of indicating to the operator an equipment malfunction, process deviation, or abnormal condition requiring a response." This means an alarm is more than a message or an event, as it indicates a condition demanding quick operator action.
Ideally, each alarm will provide the operator with related information such as priority, possible root cause, and a recommended response procedure. The operator can then respond to the alarm quickly and effectively. Limiting alarms, prioritizing alarms, and providing alarms with necessary related information can reduce the chance that an operator will delay response, or even ignore the alarm.
What is alarm management?
Alarm management is the proper implementation of documentation, design, usage, and maintenance procedures to construct an effective alarm system. ISA-18.2 defines the processes and procedures required to create an effective alarm management system. Figure 1 shows the ISA-18.2 lifecycle model of alarm management. This model can be applied to a new or an existing alarm system.
Stages and activities
Stage | Title | Activities |
---|---|---|
A | Philosophy | Define process for alarm management and alarm system requirements specification |
B | Identification | Determine potential alarms |
C | Rationalization | Rationalization, classification, prioritization, and documentation |
D | Detailed design | Basic alarm design, HMI design, and advanced alarming design |
E | Implementation | Install alarms, initial testing, and initial training |
F | Operation | Operator responds to alarms, refresher training |
G | Maintenance | Maintenance, repair and replacement, and periodic testing |
H | Monitoring and assessment | Monitoring alarm data and report performance |
I | Management of change | Process to authorize additions, modifications, and deletions of alarms |
J | Audit | Periodic audit of alarm management processes |
Figure 2: Activities associated with the stages in the ISA-18.2 lifecycle model of alarm management are shown.
As shown in Figure 2, stage activities logically follow one another, and correct completion of all activities will result in a properly designed and effectively operating alarm management system. The lifecycle model also includes stages for ongoing maintenance, essential for sustaining effective operation.
The 10 stages in the lifecycle model can be roughly categorized into four general tasks. To perform these tasks, it's essential that a process plant create a cross-functional team that includes all relevant plant functional areas including, but not limited to, management, engineering, safety, operations, and maintenance.
Task 1: Optimizing system design
Figure 3: This alarm summary window is an effective tool for displaying alarms to operators in a manner allowing quick and comprehensive response.
This task encompasses lifecycle model stages A through E: philosophy, identification, rationalization, detailed design, and implementation. When properly executed, this task supports the design of an alarm system that prevents alarm flooding and other undesirable alarm system occurrences. It also provides operators with the information they need to take proper action when alarms occur.
An important activity within this task is to identify the causes of current nuisance alarms and to eliminate these alarms, or at least greatly reduce their frequency. This is an essential step toward reducing alarm flooding. In many cases, alarms can be reclassified as events to be recorded by the automation system for later review, instead of as items requiring immediate operator attention.
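One way to start looking for nuisance alarms is to rank alarm sources by how often they appear in the alarm log. The following is a minimal sketch, not a method prescribed by ISA-18.2; the log format, the tag names, and the function `top_alarm_sources` are assumptions made for illustration.

```python
from collections import Counter


def top_alarm_sources(alarm_log, top_n=3):
    """Return (tag, count, share of all alarms) for the most frequent tags."""
    counts = Counter(tag for tag, _timestamp in alarm_log)
    total = sum(counts.values())
    return [(tag, n, n / total) for tag, n in counts.most_common(top_n)]


# Hypothetical log entries: (alarm tag, minutes since start of shift)
alarm_log = [
    ("FIC-101.PVHI", 1), ("FIC-101.PVHI", 2), ("LIC-200.PVLO", 3),
    ("FIC-101.PVHI", 4), ("TIC-310.PVHI", 9), ("FIC-101.PVHI", 10),
    ("LIC-200.PVLO", 15), ("FIC-101.PVHI", 16),
]

for tag, count, share in top_alarm_sources(alarm_log):
    print(f"{tag}: {count} alarms, {share:.1%} of total")
```

A tag that dominates the list, as the first one does in this made-up log, is a candidate for review: its setpoint or deadband may need adjustment, or it may be better recorded as an event.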
Once the total number of alarms has been reduced as much as possible, the next step is to prioritize the remaining alarms. Prioritization can be quite complex, as it requires plant personnel to identify possible abnormal operating conditions, list the alarms that might occur for each condition, and then prioritize these alarms. After alarms are reduced and prioritized, then recommended operator actions for each alarm can be created.
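Prioritization rules are often easier to review when written down as a lookup table. The sketch below assumes a simple grid of consequence severity against time available to respond; the grid, its category names, and the resulting priorities are hypothetical and would come from a plant's own alarm philosophy in practice.

```python
# Hypothetical grid: (consequence severity, time available to respond) -> priority
PRIORITY_MATRIX = {
    ("severe", "short"): "high",
    ("severe", "long"): "medium",
    ("minor", "short"): "medium",
    ("minor", "long"): "low",
}


def assign_priority(severity, response_time):
    """Look up a priority; conditions with no consequence become events."""
    if severity == "none":
        # No operator action is needed, so record an event instead of an alarm.
        return "event"
    return PRIORITY_MATRIX[(severity, response_time)]


print(assign_priority("severe", "short"))
print(assign_priority("minor", "long"))
print(assign_priority("none", "long"))
```

Keeping the rules in one table means the cross-functional team can review and change them in a single place, rather than alarm by alarm.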
Performing these and other steps as listed in the lifecycle model stages A through E will result in the creation of an effective alarm system.
Task 2: Advanced operator support
This task encompasses lifecycle model stages F and G: operation and maintenance. One tool for implementing this task is the alarm summary window in the automation system (see Figure 3). Modern automation systems include the functionality to create these windows for the process and its alarms.
Figure 4: This advanced alarm summary window uses a flow chart to display what sequence of actions should be performed by the operator in response to a particular alarm.
An alarm summary window typically displays the list of currently active alarms. With most automation systems, the alarm summary window will provide sort, filter, shelving, and other functions to help improve display of information to the operators. These functions can be used to prevent higher priority alarms from being overlooked by operators.
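The sort, filter, and shelving functions described above can be illustrated with a small model. This is only a sketch: the `Alarm` class, its fields, and the `summary_view` function are hypothetical and do not represent any vendor's alarm summary window.

```python
from dataclasses import dataclass

# Lower rank sorts first, so high-priority alarms appear at the top.
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Alarm:
    tag: str
    priority: str
    raised_at: int  # minutes since start of shift
    shelved: bool = False


def summary_view(alarms, tag_prefix=None):
    """Hide shelved alarms, optionally filter by tag prefix, then sort."""
    visible = [
        a for a in alarms
        if not a.shelved and (tag_prefix is None or a.tag.startswith(tag_prefix))
    ]
    # Highest priority first; within a priority, newest alarm first.
    return sorted(visible, key=lambda a: (PRIORITY_RANK[a.priority], -a.raised_at))


alarms = [
    Alarm("FIC-101.PVHI", "low", 5),
    Alarm("TIC-310.PVHI", "high", 7),
    Alarm("LIC-200.PVLO", "medium", 3, shelved=True),
    Alarm("PIC-120.PVHI", "high", 9),
]

for alarm in summary_view(alarms):
    print(alarm.priority, alarm.tag)
```

Sorting on priority before time is what keeps a high-priority alarm from being pushed down the list by a burst of newer low-priority alarms.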
Each alarm will require some type of response from the operator. Advanced alarm summary windows display what sequence of actions should be performed by the operator in response to particular alarms (see Figure 4). An effective method for displaying these actions is a flow chart, which can be used to guide the operator through the response sequence.
A flow chart can be very effective because it can contain if/then instructions, guiding operators to take different actions depending on how the process responds to operator actions and other conditions.
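A response flow chart with if/then branches can be represented as a small decision graph. The sketch below is an illustration under assumed content: the questions, actions, and node names in `RESPONSE_FLOW` are invented, and `walk` is a hypothetical helper that follows the chart for a given set of yes/no answers.

```python
# Hypothetical response flow for a low-flow alarm.
# Decision nodes have "ask", "yes", and "no"; end nodes have "action".
RESPONSE_FLOW = {
    "start": {"ask": "Is the feed pump running?",
              "yes": "check_valve", "no": "start_pump"},
    "check_valve": {"ask": "Is the control valve responding?",
                    "yes": "reduce_feed", "no": "call_maintenance"},
    "start_pump": {"action": "Start the standby feed pump."},
    "reduce_feed": {"action": "Reduce the feed rate and monitor the flow."},
    "call_maintenance": {"action": "Switch to manual and notify maintenance."},
}


def walk(flow, answers, node="start"):
    """Follow yes/no answers (keyed by node name) until an action is reached."""
    path = [node]
    while "action" not in flow[node]:
        branch = "yes" if answers[node] else "no"
        node = flow[node][branch]
        path.append(node)
    return path, flow[node]["action"]


path, action = walk(RESPONSE_FLOW, {"start": True, "check_valve": False})
print(" -> ".join(path))
print(action)
```

Because each answer selects the next node, the same chart leads operators to different actions depending on how the process responds.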
Task 3: Performance evaluation
This task covers lifecycle model stage H, monitoring and assessment, which evaluates the performance of the existing alarm system. ISA-18.2 suggests key performance indicators (KPIs) as a useful tool for the activities in this stage. An example KPI is the number of alarms that occur within a fixed period of operation. As shown in Figure 5, ISA-18.2 lists the number of alarms that is very likely to be acceptable and the maximum manageable number for various time periods.
Practical limits to human capabilities
Very likely to be acceptable | Maximum manageable |
---|---|
~150 alarms per day | ~300 alarms per day |
~6 alarms per hour (average) | ~12 alarms per hour (average) |
~1 alarm per 10 minutes (average) | ~2 alarms per 10 minutes (average) |
Figure 5: This chart lists recommended alarm frequencies for an alarm system in accordance with ISA-18.2.
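The figures in the chart can be turned into a simple KPI check. The sketch below converts an alarm count over an observation period into average rates and compares the daily rate with the approximate 150 and 300 alarms-per-day values from Figure 5. Treating those approximate values as hard cutoffs is a simplification, and the function names are assumptions.

```python
ACCEPTABLE_PER_DAY = 150   # "very likely to be acceptable" (approximate)
MANAGEABLE_PER_DAY = 300   # "maximum manageable" (approximate)


def average_rates(alarm_count, period_hours):
    """Convert a count over a period into average daily, hourly, and 10-minute rates."""
    per_hour = alarm_count / period_hours
    return {
        "per_day": per_hour * 24,
        "per_hour": per_hour,
        "per_10_min": per_hour / 6,
    }


def classify(alarms_per_day):
    """Place an average daily alarm rate in one of three bands."""
    if alarms_per_day <= ACCEPTABLE_PER_DAY:
        return "very likely to be acceptable"
    if alarms_per_day <= MANAGEABLE_PER_DAY:
        return "within maximum manageable"
    return "exceeds maximum manageable"


# Hypothetical example: 96 alarms recorded during an 8-hour shift
rates = average_rates(96, 8)
print(rates)
print(classify(rates["per_day"]))
```

In this made-up case the shift averages 12 alarms per hour, which sits at the upper end of the manageable range and would justify a closer look at the alarm log.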
The alarm system will provide a host of data to help evaluate performance including, but not limited to, alarm frequency, operator response time, and specific operation actions. This data can be used to improve the alarm system, and to provide more effective operator training.
It is often useful to evaluate alarm system data from several viewpoints. For instance, the number of alarms in each area and the number of alarms per hour are both important data points, and can be evaluated separately or together. By using these kinds of data, the conditions in which operator errors frequently occur can be identified, and these results can be used to improve operator response.
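Counting the same alarm records by area, by hour, and by both together is a small grouping exercise. The following sketch assumes each record is an `(area, hour)` pair; the record layout, the area names, and the function `count_by_area_and_hour` are hypothetical.

```python
from collections import Counter


def count_by_area_and_hour(records):
    """Count alarms per area, per hour of day, and per (area, hour) pair."""
    by_area = Counter(area for area, _hour in records)
    by_hour = Counter(hour for _area, hour in records)
    by_both = Counter(records)
    return by_area, by_hour, by_both


# Hypothetical records: (plant area, hour of day the alarm was raised)
records = [
    ("Unit 1", 8), ("Unit 1", 8), ("Unit 2", 8),
    ("Unit 1", 9), ("Unit 2", 14),
]

by_area, by_hour, by_both = count_by_area_and_hour(records)
print(by_area.most_common())
print(by_hour.most_common())
print(by_both.most_common(1))
```

The combined count shows which area and hour coincide with the heaviest alarm load, which neither single view reveals on its own.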
Figure 6: These types of reports can be very effective tools for evaluating alarm system performance.
Modern automation systems can be configured to collect a host of data concerning alarm system performance, and they provide tools for creating reports from it. These reports can be particularly useful for evaluating alarm system performance (see Figure 6).
This data can be presented to plant personnel in a variety of formats from simple KPIs to charts and graphs. Using this information, alarm system performance can be evaluated and improved.
Task 4: Continuous improvement
This task encompasses lifecycle model stages I and J: management of change and audit. Continuous alarm system improvement is supported by uniformly managing the enormous amount of alarm-related data typically contained in an alarm master database.
For example, alarm system design parameters can be compared with actual alarm system performance figures. When significant discrepancies exist, then corrective action can be recommended. Recommended corrective actions can then be reviewed and implemented as required using a comprehensive change management procedure.
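Comparing design parameters with measured performance can be sketched as a check of each metric against its target. The metric names, the 10 percent tolerance, and the function `find_discrepancies` below are assumptions made for illustration; only the 150 alarms-per-day figure comes from Figure 5.

```python
def find_discrepancies(design_targets, measured, tolerance=0.10):
    """List metrics with no data or a measured value above target plus tolerance."""
    findings = []
    for metric, target in design_targets.items():
        actual = measured.get(metric)
        if actual is None:
            findings.append((metric, target, None, "no data"))
        elif actual > target * (1 + tolerance):
            findings.append((metric, target, actual, "review"))
    return findings


# Hypothetical design targets and measured values
design_targets = {
    "alarms_per_day": 150,
    "high_priority_share": 0.05,
    "standing_alarms": 5,
}
measured = {
    "alarms_per_day": 210,
    "high_priority_share": 0.05,
}

for finding in find_discrepancies(design_targets, measured):
    print(finding)
```

Each finding would then go through the plant's change management procedure rather than being corrected directly in the control system.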
Alarm management entry points
For most alarm systems, there are three typical entry or starting points for creating an alarm management system. Referring to the ISA-18.2 lifecycle model, these points are stage A (philosophy), stage H (monitoring and assessment), and stage J (audit).
For new process plants, philosophy is the preferred point of entry. For existing plants, either monitoring and assessment or audit is preferred using recent plant operating data. Actions are then taken based on the evaluation. This course retains effective existing practices while pinpointing areas that require improvement.
For new plants, the lifecycle model should be followed in its entirety, starting with task 1, ensuring that all necessary steps are taken to implement an effective alarm management system.
Proper alarm management is indispensable for achieving safe and secure process plant operation. This article introduced and explained the approach to alarm management standardized by ISA-18.2, and then summarized it into four general tasks. Following this approach will result in an optimal alarm management system that prevents minor alarms and upsets from escalating into serious incidents.
Marcus Tennant is principal systems architect for Yokogawa.
Posted with permission from the September 2013 issue of Control Engineering® www.csemag.com.
Copyright 2013, CFE Media. All rights reserved.
For more information on the use of this content, contact Wright's Media at 877-652-5295.