HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Page 1: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

HEPiX Fall 2014

Tony Wong (BNL)

UPS Monitoring with Sensaphone: A cost-effective solution

Page 2: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Background

Facility built piecemeal over the years

Old data center dates back to the ~1970s

New floor space (~60% of total) built or refurbished since 2008

UPS power provided for most of the RACF

1 MW (battery only runs for ~30 min) for old data center

1.3 MW (flywheel + generator runs for ~days) for new floor space

Direct monitoring of battery UPS with proprietary software

No direct monitoring for flywheel + generator system

Operational oversight at BNL

Expensive, proprietary proposed solutions were rejected

Page 3: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Old Data Center

Page 4: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

New Data Center

Page 5: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

UPS monitoring in new data center

Requirements include:

Must be fully configurable and controlled by RACF

Alarm notification mechanism over multiple channels

Cell phone, SMS/text messaging and email

Direct interface with monitoring computer

Commercially available and supported

Cheap (i.e., no yearly maintenance contracts)

Stand-alone battery back-up (in case of power loss)

Ability to integrate with existing battery UPS monitoring system

Page 6: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Sensaphone

Found with a simple Google search

Purchased model IMS-1000 (single room system)

Cheap (~US$900 each unit)

Requires some electrical work to install

Connect to phone line and internet

Installed one unit in Sigma-7 in 2010 and another in CDCE in 2011

Initially configured to notify over phone and email only (no integration with existing auto-shutdown mechanism)

Call-down list feature with auto-escalation enabled (i.e., if person A doesn’t acknowledge the alarm, the system calls person B, etc.)

Supervisor on call-down list – effective way to motivate staff

After extensive testing, no further development for several years (other priorities took over)

Page 7: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Anatomy of BNL Configuration

Diagram: the IMS-1000 unit with its inputs and outputs. Labels: utility power as the default power source for the IMS-1000 unit; battery back-up for the IMS-1000 unit; UPS 1 and UPS 2; alarm signal via telephone line; alarm signal via Internet.

Page 8: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Alarm & Notification Mechanism

UPS Alarm

Inform 1st contact person

Begin countdown for automatic shutdown

Alert data center supervisor

Page 9: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Alarm & Notification Mechanism

UPS Alarm

Inform 1st contact person

If no answer, escalate to 2nd, 3rd and 4th contacts

Begin countdown for automatic shutdown

Alert data center supervisor

Page 10: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Alarm & Notification Mechanism

UPS Alarm

Inform 1st contact person

If no answer, escalate to 2nd, 3rd and 4th contacts

If no answer from any responder, call the boss

Begin countdown for automatic shutdown

Alert data center supervisor

Page 11: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Alarm & Notification Mechanism

UPS Alarm

Inform 1st contact person

If no answer, escalate to 2nd, 3rd and 4th contacts

If no answer from any responder, call the boss

Begin countdown for automatic shutdown

Alert data center supervisor

Shut down worker nodes and non-critical servers
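The escalation order in the alarm flow above (1st contact, then 2nd through 4th, then the boss) could be sketched roughly as follows. This is only an illustration in Python, not the IMS-1000 configuration itself: `Contact`, `escalate` and the `acknowledges` callback are hypothetical names, and the callback merely stands in for "call this person and wait for an acknowledgement".

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Contact:
    """One entry on the call-down list (hypothetical structure)."""
    name: str
    phone: str


def escalate(
    call_down: Sequence[Contact],
    supervisor: Contact,
    acknowledges: Callable[[Contact], bool],
) -> Optional[Contact]:
    """Notify contacts in order and return the first one who acknowledges.

    `acknowledges` is an assumed stand-in for placing the call and
    waiting for an acknowledgement; it is not a Sensaphone function.
    The supervisor is tried only after every regular contact fails.
    """
    for contact in [*call_down, supervisor]:
        if acknowledges(contact):
            return contact
    return None  # nobody on the list acknowledged the alarm


if __name__ == "__main__":
    # Placeholder names and extensions, for illustration only.
    staff = [Contact(f"contact-{i}", f"x100{i}") for i in range(1, 5)]
    boss = Contact("supervisor", "x1000")
    # Pretend that only the third contact picks up.
    print(escalate(staff, boss, lambda c: c.name == "contact-3"))
```

Keeping the supervisor as a separate argument makes the "call the boss last" rule explicit instead of relying on list order.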

Page 12: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Work Timeline

UPS Alarm

Inform 1st contact person

If no answer, escalate to 2nd, 3rd and 4th contacts

If no answer from any responder, call the boss

Begin countdown for automatic shutdown

Alert data center supervisor

Shut down worker nodes and non-critical servers

Before Summer 2014

After Summer 2014

Page 13: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

IMS-1000 Unit

Wall-mounted box

Installed IMS-1000 unit

Page 14: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

IMS-1000 Unit (continued)

Close-up view of unit

Input sensor (UPS)

Page 15: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

IMS-1000 Web Interface

Page 16: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

UPS and Cooling

Most of the IT equipment is connected to UPS-backed power, but the CRAC (Computer Room Air Conditioning) units are not.

Dangerous overheating can occur in a matter of minutes

Page 17: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

March 18, 2014

6:20 am 6:40 am

Page 18: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Recent developments

The cooling incident on March 18 gave us tangible evidence that investing a little more time in configuring Sensaphone is a good idea

UPS monitoring via Sensaphone was integrated with the existing auto-shutdown of (most) IT equipment due to cooling or utility power interruptions – completed summer 2014

Beyond email/phone alarm acknowledgement

Brookhaven’s utility division on-call staff notified

Shutdown if temperature passes threshold or time limits

Selected CRAC units now on UPS back-up power and domestic water back-up (for utility chilled water) – completed September 2014

Plan to add more CRAC units to UPS and domestic water back-up in next 2-3 years
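The rule "shutdown if temperature passes threshold or time limits" can be read as a simple either-or check. The Python sketch below shows one possible form of that check; it is not the RACF implementation. `ShutdownPolicy` and `should_shut_down` are hypothetical names, and the limits used in the demo (35 °C, 600 s) are placeholders, since the slides do not give the actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShutdownPolicy:
    """Placeholder limits; the real values are not given in the slides."""
    max_temp_c: float         # room temperature threshold, in Celsius
    max_alarm_seconds: float  # longest time an alarm may stay active


def should_shut_down(
    policy: ShutdownPolicy, temp_c: float, alarm_seconds: float
) -> bool:
    """Return True if either limit is hit.

    The temperature must exceed its threshold; the alarm duration
    triggers as soon as it reaches the time limit.
    """
    too_hot = temp_c > policy.max_temp_c
    too_long = alarm_seconds >= policy.max_alarm_seconds
    return too_hot or too_long


if __name__ == "__main__":
    # Assumed example limits, chosen only to exercise both branches.
    policy = ShutdownPolicy(max_temp_c=35.0, max_alarm_seconds=600.0)
    for temp_c, elapsed in [(24.0, 60.0), (36.5, 60.0), (24.0, 600.0)]:
        print(temp_c, elapsed, should_shut_down(policy, temp_c, elapsed))
```

Treating the two conditions independently means a fast temperature rise triggers the shutdown without waiting for the time limit to expire.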

Page 19: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

September 2014

Temperature fluctuations resulting from engineering work to integrate CRAC units to domestic back-up water (in case utility chilled water plant is down)

Page 20: HEPiX Fall 2014 Tony Wong (BNL) UPS Monitoring with Sensaphone: A cost-effective solution.

Conclusions

Sensaphone is a low-cost solution for UPS monitoring of a data center

Easy to configure (ours was done by a technician and a summer student)

Portable and flexible

Can monitor multiple power sources if needed

Can monitor other parameters such as humidity, temperature, etc.

Durable – has worked quietly and reliably for ~4 years

Free technical support (via email and phone) available