Control System Studio - CSS - Alarm Handling

38
Managed by UT-Battelle for the Department of Energy Kay Kasemir, Ph.D. ORNL/SNS [email protected] June 2011 at KEK Control System Studio - CSS - Alarm Handling

description

Control System Studio - CSS - Alarm Handling. Kay Kasemir , Ph.D. ORNL/SNS [email protected] June 2011 at KEK. Previous Attempts at SNS. ALH ; manual “summary” displays; generated soft-IOCs + displays Issues GUI Static Layouts N clicks to see active alarms Configuration - PowerPoint PPT Presentation

Transcript of Control System Studio - CSS - Alarm Handling

Page 1: Control System Studio - CSS - Alarm Handling

Managed by UT-Battellefor the Department of Energy

Kay Kasemir, Ph.D.ORNL/[email protected]

June 2011 at KEK

Control System Studio- CSS -

Alarm Handling

Page 2: Control System Studio - CSS - Alarm Handling

2 Managed by UT-Battellefor the Department of Energy

Previous Attempts at SNSALH; manual “summary” displays; generated soft-IOCs + displaysIssues

– GUI· Static Layouts· N clicks to see active alarms

– Configuration· .. was bad Always too many alarms· Changes required contacting one of the 2 experts, wait

~days, restart CA gateway, hope that nothing else broke– Information

· Operator guidance?· Related displays?· Most frequent alarm?· Timeline of alarm?

Page 3: Control System Studio - CSS - Alarm Handling

3 Managed by UT-Battellefor the Department of Energy

Now: Best Ever Alarm System Tool

Yes, alarms are always a little scary…

Page 4: Control System Studio - CSS - Alarm Handling

4 Managed by UT-Battellefor the Department of Energy

Alarm System Components

Control SystemAlarm Server

Cool UIConfiguration

B. Hollifield, E. Habibi, "Alarm Management: Seven Effective Methods for Optimum Performance", ISA, 2007

1. What you see2. Technical details3. How to use it

Page 5: Control System Studio - CSS - Alarm Handling

5 Managed by UT-Battellefor the Department of Energy

1. What you see

Alarm GUI used by Operators

Page 6: Control System Studio - CSS - Alarm Handling

6 Managed by UT-Battellefor the Department of Energy

What you see: Alarm Table· All current

alarms– new, ack’ed

· Sort by PV,Descr., Time, Severity, …

· Optional: Annunciate

· Acknowledge one or multiple alarms– Select by PV or description– BNL/RHIC type un-ack’

Page 7: Control System Studio - CSS - Alarm Handling

7 Managed by UT-Battellefor the Department of Energy

Another View: Alarm Tree

· All alarms– Disabled, inactive, new, ack’ed

· Hierarchical– Optionally only show

active alarms– Ack’/Un-ack’ PVs or sub-tree

Page 8: Control System Studio - CSS - Alarm Handling

8 Managed by UT-Battellefor the Department of Energy

Guidance, Related Displays, Commands

Basic Text Open EDM/OPI screen Open web page Run ext. commandHierarchical:

Including info of parent entries

Merges Guidance etc. from all selected alarms

Page 9: Control System Studio - CSS - Alarm Handling

9 Managed by UT-Battellefor the Department of Energy

Integrated with other CSS Tools

AlarmsHistory of PVEPICS Config.

Page 10: Control System Studio - CSS - Alarm Handling

10 Managed by UT-Battellefor the Department of Energy

CSS Context Menus Connect the Tools

Send alarmPV to anyother CSSPV tool

Page 11: Control System Studio - CSS - Alarm Handling

11 Managed by UT-Battellefor the Department of Energy

E-Log Entries

· “Logbook”from context menucreates text w/basic info aboutselected alarms.Edit, submit.

· Pluggable implementation, not limited to Oracle-based SNS ELog

Page 12: Control System Studio - CSS - Alarm Handling

12 Managed by UT-Battellefor the Department of Energy

.. may require Authentication/AuthorizationLog in/out while CSS is running

Online Configuration Changes

Page 13: Control System Studio - CSS - Alarm Handling

13 Managed by UT-Battellefor the Department of Energy

Add PV or Subsystem1. Right-click on ‘parent’2. “Add …”3. Enter nameOnline. No search for config files, no restarts.

Page 14: Control System Studio - CSS - Alarm Handling

14 Managed by UT-Battellefor the Department of Energy

Configure PV· Again online· Especially useful

for operators toupdate guidanceand relatedscreens.

Page 15: Control System Studio - CSS - Alarm Handling

15 Managed by UT-Battellefor the Department of Energy

2. Technical details

Behind the GUI;Tools to monitor performance

Page 16: Control System Studio - CSS - Alarm Handling

16 Managed by UT-Battellefor the Department of Energy

Technical View

Alarm Cfg & StateRDB

IOCs

Alarm ServerCurrent Alarms: Acknowledged? Transient? Annunciated?

LOG

MessageRDB

JMS2

Speech

JMS2

RDB

Tomcat- Report

s

CSS Applications

Alarm Client GUI

JMS

Alarm Updates Ack’; Config UpdatesAnnunciationsLog Messages

TALK ALARM_CLIENTALARM_SERVER

PV Updates (Channel Access, …)

Page 17: Control System Studio - CSS - Alarm Handling

17 Managed by UT-Battellefor the Department of Energy

General Alarm Server Behavior· Latch highest severity, or non-latching

– like ALH “ack. transient”

· Annunciate· Chatter filter ala ALH

· Alarm only if severity persists some minimum time· .. or alarm happens >=N times within period

· Optional formula-based alarm enablement:– Enable if “(pv_x > 5 && pv_y < 7) || pv_z==1”– … but we prefer to move that logic into IOC

· When acknowledging MAJOR alarm, subsequent MINOR alarms not annunciated– ALH would again blink/require ack’

Page 18: Control System Studio - CSS - Alarm Handling

18 Managed by UT-Battellefor the Department of Energy

Logging

· ..into generic CSS log also used for error/warn/info/debug messages

· Alarm Server: State transitions, Annunciations· Alarm GUI: Ack/Un-Ack requests, Config changes· Generic Message History Viewer

– Example w/ Filter on TEXT=CONFIG

Page 19: Control System Studio - CSS - Alarm Handling

19 Managed by UT-Battellefor the Department of Energy

Logging: Get timeline· Example: Filter on TYPE, PV

1. PV triggers,clears, triggers again

2. Alarm Server latches alarm

4. Problem fixed

3. Alarm Server annunciates

5. Ack’ed by operator

6. All OK

Page 20: Control System Studio - CSS - Alarm Handling

20 Managed by UT-Battellefor the Department of Energy

All Sorts of Web Reports

Page 21: Control System Studio - CSS - Alarm Handling

21 Managed by UT-Battellefor the Department of Energy

3. How to use it

This may be more importantthan the tools!

Page 22: Control System Studio - CSS - Alarm Handling

22 Managed by UT-Battellefor the Department of Energy

Best Ever Alarm System Tools, Indeed

.. but Tools are only half the issue

Good configuration requires plan & follow-up.

B. Hollifield, E. Habibi,"Alarm Management: Seven (??) Effective Methods for Optimum Performance", ISA, 2007

Page 23: Control System Studio - CSS - Alarm Handling

23 Managed by UT-Battellefor the Department of Energy

Alarm Philosophy

Goal:

Help operators take correct actions

– Alarms with guidance, related displays– Manageable alarm rate (<150/day)– Operators will respond to every alarm

(corollary to manageable rate)

Page 24: Control System Studio - CSS - Alarm Handling

24 Managed by UT-Battellefor the Department of Energy

· DOES IT REQUIRE IMMEDIATE OPERATOR ACTION?– What action? Alarm guidance!

· Not “make elog entry”, “tell next shift”, …· Consider consequence of no action

· Is it the best alarm?– Would other subsystems, with better PVs, alarm at the

same time?

What’s a valid alarm?

Page 25: Control System Studio - CSS - Alarm Handling

25 Managed by UT-Battellefor the Department of Energy

How are alarms added?· Alarm triggers: PVs on IOCs

– But more than just setting HIGH, HIHI, HSV, HHSV– HYST is good idea– Dynamic limits, enable based on machine state,...

Requires thought, communication, documentation· Added to alarm server with

– Guidance: How to respond– Related screen: Reason for alarm (limits, …), link

to screens mentioned in guidance– Link to rationalization info (wiki)

Page 26: Control System Studio - CSS - Alarm Handling

26 Managed by UT-Battellefor the Department of Energy

Impact/Consequence GridCategory So What Minor Consequence Major Consequence

Personnel Safety PPS independent from EPICS?

Environment, Public

Can EPICS cause contained spill of mercury?

Uncontained spill??

Cost:Beam Production, Downtime,Beam Quality

No effect

Beam off < 1 sec?

Beam off <10 min

<$10000

Beam off >10min

>$10000

· Mostly: How long will beam be off?

Page 27: Control System Studio - CSS - Alarm Handling

27 Managed by UT-Battellefor the Department of Energy

.. combined with Response TimeTime to Respond Minor Consequence Major Consequence

>30 Minutes NO_ALARM MINOR

10..30 minutes MINOR MAJOR

<10 minutes MAJOR MAJOR + Annunciate

– This part is still evolving…

Page 28: Control System Studio - CSS - Alarm Handling

28 Managed by UT-Battellefor the Department of Energy

Example: Elevated Temp/Press/Res.Err./…· Immediate action required?

– Do something to prevent interlock trip

· Impact, Consequence?– Beam off: Reset & OK, 5 minutes? – Cryo cold box trip: Off for a day?

· Time to respond?– 10 minutes to prevent interlock?

· MINOR? MAJOR?· Guidance: “Open Valve 47 a bit, …”· Related Displays: Screen that shows Temp, Valve, …

Page 29: Control System Studio - CSS - Alarm Handling

29 Managed by UT-Battellefor the Department of Energy

“Safety System” Alarms· Protection Systems not per se high priority

– Action is required, but we’re safe for now, it won’t get worse if we wait

· Pick One“Mommy, I need to gooo!”“Mommy, I went”

(Does it require operator action? How much time is there?)

Page 30: Control System Studio - CSS - Alarm Handling

30 Managed by UT-Battellefor the Department of Energy

Avoid Multiple Alarm Levels· Analog PVs for Temp/Press/Res.Err./…:

– Easy to set LOLO, LOW, HIGH, HIHI· Consider:

– Do they require significantly different operator actions?

– Will there be a lot of time after the HIGH to react before a follow-up HIHI alarm?

· In most cases, HIGH & HIHI only double the alarm traffic– Set only HSV to generate single, early alarm– Adding HHSV alarm assuming that the first one is

ignored only worsens the problem

Page 31: Control System Studio - CSS - Alarm Handling

31 Managed by UT-Battellefor the Department of Energy

Bad Example: Old SNS ‘MEBT’ Alarms

· Each amplifier trip:≥ 3 ~identicalalarms, no guidance

· Rethought w/ subsystemengineer, IOC programmerand operators: 1 better alarm

Page 32: Control System Studio - CSS - Alarm Handling

32 Managed by UT-Battellefor the Department of Energy

Alarms for Redundant Pumps

Page 33: Control System Studio - CSS - Alarm Handling

33 Managed by UT-Battellefor the Department of Energy

Alarm Generation: Redundant Pumps the wrong way· Control System

– Pump1 on/off status– Pump2 on/off status

· Simple Config setting: Pump Off => Alarm:– It’s normal for the ‘backup’ to be off– Both running is usually bad as well

· Except during tests or switchover– During maintenance, both can be off

Page 34: Control System Studio - CSS - Alarm Handling

34 Managed by UT-Battellefor the Department of Energy

Redundant Pumps

· Control System– Pump1 on/off status– Pump2 on/off status– Number of running pumps– Configurable number of desired pumps

· Alarm System: Running == Desired?– … with delay to handle tests, switchover

· Same applies to devices that are only needed on-demand

1Required Pumps:

Page 35: Control System Studio - CSS - Alarm Handling

35 Managed by UT-Battellefor the Department of Energy

Weekly Review: How Many? Top 10?

Page 36: Control System Studio - CSS - Alarm Handling

36 Managed by UT-Battellefor the Department of Energy

A lot of information available· How often did PV trigger?· For how long?· When?

· Temporary issue?Or need HYST,alarm delay,fix to hardware?

Page 37: Control System Studio - CSS - Alarm Handling

37 Managed by UT-Battellefor the Department of Energy

Weekly Check: Stale, Forgotten?

Page 38: Control System Studio - CSS - Alarm Handling

38 Managed by UT-Battellefor the Department of Energy

Summary

· BEAST operational since Feb’09– Needs a logo– For now without BEAUtY– DESY AMS is similar and has been

operational for longer

· Pick either, but good configuration requires work in any case– Started with previous “annunciated” alarms

· ~300, no guidance, no related displays· Now ~330, all with guidance, rel. displays

– “Philosophy” helps decide what gets added and how· Immediate Operator Action? Consequence?

Response Time? – Weekly review spots troubles and tries to improve

configuration