Managing E-Science Cyber-Infrastructures: A Case Study



Srinath Perera, Suresh Marru, Dennis Gannon

Indiana University

{hperera, smarru, gannon}@cs.indiana.edu

The Problem

1. Robust – election-based; the system recovers provided there is a constant-length period during which no manager or communication failures occur.

2. Scalable (close to 100,000 resources).

3. Supports user-defined management rules (see the example rule).

4. Local control (decisions based on a single resource) and global control (decisions based on many resources) using rules.
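The distinction between local and global control in item 4 can be illustrated with a minimal Python sketch. This is not Hasthi's rule language, which the poster does not show. The `Resource`, `LocalRule`, `GlobalRule`, and `evaluate` names, the metric names, and the thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Resource:
    """Hypothetical snapshot of one managed resource."""
    name: str
    metrics: Dict[str, float]


@dataclass
class LocalRule:
    """Local control: the condition sees a single resource."""
    name: str
    condition: Callable[[Resource], bool]
    action: str


@dataclass
class GlobalRule:
    """Global control: the condition sees all resources at once."""
    name: str
    condition: Callable[[List[Resource]], bool]
    action: str


def evaluate(resources: List[Resource],
             local_rules: List[LocalRule],
             global_rules: List[GlobalRule]) -> List[Tuple[str, str, str]]:
    """Return (rule name, action, target) for every rule that fires."""
    decisions = []
    for rule in local_rules:
        for resource in resources:
            if rule.condition(resource):
                decisions.append((rule.name, rule.action, resource.name))
    for rule in global_rules:
        if rule.condition(resources):
            decisions.append((rule.name, rule.action, "*"))  # "*" = system-wide
    return decisions


# Illustrative data only; metric names and thresholds are assumptions.
resources = [
    Resource("svc-a", {"up": 1.0, "pending_requests": 120}),
    Resource("svc-b", {"up": 0.0, "pending_requests": 0}),
    Resource("svc-c", {"up": 1.0, "pending_requests": 300}),
]
restart_if_down = LocalRule(
    "restart-if-down", lambda r: r.metrics["up"] == 0.0, "restart")
add_instance = GlobalRule(
    "add-instance-if-overloaded",
    lambda rs: sum(r.metrics["pending_requests"] for r in rs) > 400,
    "create-new")

decisions = evaluate(resources, [restart_if_down], [add_instance])
```

Here the local rule targets only `svc-b`, while the global rule fires once for the whole system because its condition aggregates over every resource.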

Methodology for Managing LEAD

This work is primarily supported by the National Science Foundation under Cooperative Agreement ATM03-0331480.

1. S. Perera and D. Gannon. Employing Automated Management to Administer a Large-Scale E-Science Cyber-Infrastructure. Retrieved Oct. 21, 2008, from http://www.extreme.indiana.edu/hasthi/lead-2008.pdf.

2. http://www.extreme.indiana.edu/hasthi/lead/

Hasthi Framework

The LEAD infrastructure enables students, faculty, and researchers to perform numerical weather predictions by logging in to the portal, searching for data, and running workflows that mine data, produce forecasts, or post-process data.

Supported Actions

1. Create, restart, or relocate a service (using shell scripts).

2. Tune and configure services (using WSDM).

3. Shut down services.

4. Interact with users by sending an email; when the user makes a choice by clicking a link in the email, the corresponding action is executed.

Supported Sensors

1. Host Agent – monitors a host.

2. In-Memory Agent – runs in the same address space as the services and monitors them.

3. Polling-Based Agent – periodically reads an HTML page that shows the resource state and parses it to extract metrics.

4. Log-Based Agent – listens to the log events of an application to monitor it. Supports any application that uses Log4j.

5. Process Monitor – monitors a UNIX process.
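The polling-based agent listed among the sensors can be sketched in Python with the standard library. This is a sketch under stated assumptions, not Hasthi's implementation. It assumes the status page shows each metric as a two-cell table row (name, value); the page layout, `MetricTableParser`, `parse_metrics`, `poll`, and the sample metric names are hypothetical.

```python
import time
from html.parser import HTMLParser
from typing import Callable, Dict, List


class MetricTableParser(HTMLParser):
    """Collects numeric metrics from <tr><td>name</td><td>value</td></tr> rows."""

    def __init__(self) -> None:
        super().__init__()
        self._in_cell = False
        self._row: List[str] = []
        self.metrics: Dict[str, float] = {}

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and len(self._row) == 2:
            name, value = (cell.strip() for cell in self._row)
            try:
                self.metrics[name] = float(value)
            except ValueError:
                pass  # skip non-numeric rows such as a header row


def parse_metrics(html: str) -> Dict[str, float]:
    parser = MetricTableParser()
    parser.feed(html)
    parser.close()
    return parser.metrics


def poll(fetch: Callable[[], str], interval_seconds: float, rounds: int,
         sleep: Callable[[float], None] = time.sleep) -> List[Dict[str, float]]:
    """Fetch and parse the page `rounds` times, sleeping between reads."""
    samples = []
    for i in range(rounds):
        samples.append(parse_metrics(fetch()))
        if i + 1 < rounds:
            sleep(interval_seconds)
    return samples


# Hypothetical status page used in place of a real HTTP fetch.
SAMPLE_PAGE = """
<html><body><table>
<tr><td>Metric</td><td>Value</td></tr>
<tr><td>pending_requests</td><td>12</td></tr>
<tr><td>heap_used_mb</td><td>256.5</td></tr>
</table></body></html>
"""

latest = parse_metrics(SAMPLE_PAGE)
```

In a real deployment, `fetch` would wrap an HTTP request to the resource's status page. Passing it in as a callable keeps the parsing logic testable without a network.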

This work will be available through the OGCE project (http://www.collab-ogce.org); more details can be found in [1] and [2].