Ithings2012 20nov

Improving data center management operations using wireless sensor networks. The IEEE International Conference on Internet of Things, November 2012, Besançon, France. Panagiotis Garefalakis and Kostas Magoutis, Institute of Computer Science (ICS), Foundation for Research and Technology – Hellas (FORTH), Heraklion, Greece.


Page 1: Ithings2012 20nov

Improving data center management operations using wireless sensor networks

The IEEE International Conference on Internet of Things, November 2012, Besançon, France

Panagiotis Garefalakis and Kostas Magoutis

Institute of Computer Science (ICS)

Foundation for Research and Technology – Hellas (FORTH)

Heraklion, Greece

Page 2: Ithings2012 20nov

Motivation

Page 3: Ithings2012 20nov

Motivation

• Challenges:

High complexity configuration

Hardware maintenance

Software changes

• Several systems proposed to reduce management complexity

AutoPilot (Microsoft), SmartFrog (HP), OpenView (HP), etc.

• Several problems remain unsolved, keeping the complexity and cost of running a data center (DC) high.

Page 4: Ithings2012 20nov

Goal

• Address three important problems:

Automatically determine the physical location of servers.

Notify administrators of any location changes.

Determine status of servers even if network is down.

• Our solution to these problems relies on:

An auto-configuring wireless sensor network.

A distributed monitoring and management system.

Page 5: Ithings2012 20nov

Wireless technology used

• Zigbee IEEE 802.15.4:

Up to 65,536 personal area networks with 16 channels each.

A specific role for each device (coordinator, slave).

• Two types of messages:

Transparent mode (broadcast only, simple).

API communication mode (unicast, reliable, RSSI).

IEEE 802.15.4: low power, 250 Kbit/s, range ~100 m

Page 6: Ithings2012 20nov

Prototype Wireless Sensor

Page 7: Ithings2012 20nov

Nagios: an open-source distributed discovery, monitoring, and control system

Page 8: Ithings2012 20nov

Nagios: remote plugin execution

Page 9: Ithings2012 20nov

Nagios state machine

Host/service state, state type

Page 10: Ithings2012 20nov

Challenge: Determining host status

Page 11: Ithings2012 20nov

A typical Data Center

Page 12: Ithings2012 20nov

System Architecture

Page 13: Ithings2012 20nov

Auto-configuration

Page 14: Ithings2012 20nov

Data Collection

Page 15: Ithings2012 20nov

Server integration

• Access to a variety of sensors:

– Temperature

– Airflow

– Power consumption

– Rack information

• Current technology: baseboard management controller (BMC)

• Intelligent Platform Management Interface (IPMI)

Page 16: Ithings2012 20nov

Server Localization: Trilateration

• RSSI values: -40 dBm (strong) … -90 dBm (weak).
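
The trilateration idea on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a log-distance path-loss model with hypothetical calibration values (`rssi_at_1m`, `path_loss_exp`) and three reference nodes at known 2D coordinates, none of which are given in the slides.

```python
def rssi_to_distance(rssi, rssi_at_1m=-40.0, path_loss_exp=2.5):
    """Estimate distance in meters from an RSSI reading.

    Assumed model: d = 10 ** ((P0 - RSSI) / (10 * n)), where P0 is the
    RSSI at 1 m and n is the path-loss exponent (both hypothetical here).
    """
    return 10 ** ((rssi_at_1m - rssi) / (10.0 * path_loss_exp))


def trilaterate(anchors, distances):
    """Estimate (x, y) from three anchors and their measured distances.

    Subtracting the first circle equation from the other two yields a
    2x2 linear system, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is ambiguous")
    x = (b1 * a22 - b2 * a12) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

In practice RSSI is noisy, so the three circles rarely meet at one point; with more than three reference nodes a least-squares fit would likely be a better choice than this exact solve.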

Page 17: Ithings2012 20nov

Event correlation

Page 18: Ithings2012 20nov

Evaluation

• Office environment

• Data Center environment

• Use of management interface

Page 19: Ithings2012 20nov

Office environment

• Server S is moved over a distance of 2 meters.

• We compare the means of the RSSI time series before and after the movement using the unpaired Student's t-test.

• The mean of the time series for the moved server shows a statistically significant shift.
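
The before/after comparison can be sketched as below. This is an illustrative version only: it computes the pooled-variance (equal-variance) unpaired t statistic using the standard library, and the decision threshold `t_critical=2.1` is an assumed value (roughly the two-sided 5% critical value for about 18 degrees of freedom), not a figure from the slides.

```python
import math
from statistics import mean, variance


def unpaired_t_statistic(before, after):
    """Student's unpaired t statistic with pooled variance."""
    n1, n2 = len(before), len(after)
    pooled_var = ((n1 - 1) * variance(before) + (n2 - 1) * variance(after)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    diff = mean(before) - mean(after)
    if std_err == 0.0:
        # Both series are constant: any difference in means is a clear shift.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / std_err


def has_moved(before, after, t_critical=2.1):
    """Flag a movement when |t| exceeds the (assumed) critical value."""
    return abs(unpaired_t_statistic(before, after)) > t_critical
```

A monitoring loop could keep a sliding window of recent RSSI samples per sensor and call `has_moved(previous_window, current_window)` each time the window fills.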

Page 20: Ithings2012 20nov

Data Center

• Metallic enclosures and electromagnetic interference introduce noise.

• Management server continuously evaluates the RSSI of messages received from all coordinators.

Page 21: Ithings2012 20nov

Server movement accuracy

• A coordinator is moved over a distance of 1.5 meters.

• We compare the means of the RSSI time series before and after the movement using the unpaired Student's t-test.

• The mean of the time series for the moved coordinator shows a statistically significant shift.

• Known techniques can increase accuracy using LQI (signal filtering).

Data Center Topology. Groups of servers sharing a coordinator are shown in dashed boxes. Slave Zigbees are omitted.

Page 22: Ithings2012 20nov

Use of management interface

Server is UNREACHABLE over the network, but the wireless sensor reports that it is UP (network partition)

Wireless sensor reports location change
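
The two cases on this slide suggest a simple correlation rule between the network-side view and the sensor-side view. The sketch below is a hypothetical rule set, not the authors' correlation logic: the function name, the sensor state encoding, and the finding strings are all assumptions made for illustration.

```python
def correlate(network_state, sensor_state, location_changed=False):
    """Combine the network-side host state with the wireless sensor's view.

    network_state: "UP", "DOWN" or "UNREACHABLE" (Nagios host states).
    sensor_state:  "UP", "DOWN", or None when no sensor report arrived
                   (hypothetical encoding).
    Returns a list of human-readable findings.
    """
    findings = []
    if network_state == "UP":
        findings.append("server healthy")
    elif sensor_state == "UP":
        # The server answers over the wireless channel but not over the network.
        findings.append("network partition: server is up but not reachable")
    elif sensor_state == "DOWN":
        findings.append("server failure")
    else:
        findings.append("status unknown: no sensor report")
    if location_changed:
        findings.append("alert: server location changed")
    return findings
```

For example, `correlate("UNREACHABLE", "UP")` yields the network-partition finding, which is the first situation shown on the slide.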

Page 23: Ithings2012 20nov

Conclusions

• Extended Nagios to take advantage of an auto-configuring WSN.

Easy to deploy.

Low capital costs.

Helps administrators by:

o Collecting sensor data and monitoring status.

o Alerting them in case of location changes.

o Identifying types of failure.

o Supporting sophisticated correlation of DC states.

• In line with trends in server management technology.

Page 25: Ithings2012 20nov

Security Considerations

• 128-bit symmetric key encryption (AES)

• Hardware support by Zigbee on top of IEEE 802.15.4

• Coordinator performs key management (trust center)

Page 26: Ithings2012 20nov

Nagios event correlation

Page 27: Ithings2012 20nov

Implementation - Extending Nagios

Status code | Explanation and status message

OK | The plugin was able to check the service and it appeared to be functioning properly: “Signal-Fine Distance + distance (m)”

Warning | The plugin was able to check the service, but it appeared to violate a warning threshold or not to be working properly: “Signal-Low Distance + distance (m)” or “Sensor Changed Position + distance (m)”

Critical | The plugin detected that either the service was not running or it was violating a critical threshold: “Sensor Disconnected!”

Unknown | Invalid command line arguments were supplied to the plugin, or low-level failures internal to the plugin (such as being unable to fork or to open a TCP socket) prevented it from performing the specified operation: “Unknown State!”

• WSN plugin for localization.
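
The status codes above map onto the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The following is a minimal sketch of how a localization check might choose among them. The function name, its parameters, the `low_signal` threshold of -80, and the exact message formatting are assumptions for illustration and are not taken from the authors' plugin.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_sensor(connected, rssi=None, distance_m=None, moved=False, low_signal=-80):
    """Return (exit_code, status_message) for one wireless sensor.

    connected:  whether the sensor is currently reporting.
    rssi:       latest RSSI reading, or None if unavailable.
    distance_m: estimated distance in meters, or None if unavailable.
    moved:      whether a position change was detected.
    low_signal: hypothetical RSSI warning threshold.
    """
    if not connected:
        return CRITICAL, "Sensor Disconnected!"
    if rssi is None or distance_m is None:
        return UNKNOWN, "Unknown State!"
    if moved:
        return WARNING, f"Sensor Changed Position {distance_m:.1f} m"
    if rssi < low_signal:
        return WARNING, f"Signal-Low Distance {distance_m:.1f} m"
    return OK, f"Signal-Fine Distance {distance_m:.1f} m"
```

A real plugin would print the message on standard output and terminate with the code, for example `code, msg = check_sensor(True, -55, 3.2); print(msg); sys.exit(code)`, so that Nagios can read both.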