Resource Monitoring & Service Discovery in GeneGrid. Sachin Wasnik, Belfast e-Science Centre.

Post on 01-Jan-2016


http://www.qub.ac.uk/escience

Resource Monitoring & Service Discovery in GeneGrid

Sachin Wasnik

Belfast e-Science Centre

GeneGrid Project

• Collaborative Industrial R&D project

• Stakeholders

– Fusion Antibodies

– Amtec Medical

– Support from BT plc

– £820,000 (DTI funding £406,000)

GeneGrid: Objectives

• Grid Based Framework for Bioinformatics Analysis

• Integration of Existing Technologies & Data Sets

• Production of a ‘Virtual Bioinformatics Laboratory’

• Platform for scientists to access collective skills and experiences in a secure, reliable and scalable manner

• in silico knowledge discovery

GeneGrid: Components

• Application Integration & Management (GAM)

• Data Access, Integration & Storage (GDM)

• Resource Monitoring & Service Discovery (GRM)

• Workflow & Process Management (GWM)

• Portal

Resource Monitoring & Service Discovery

• Built upon the Belfast e-Science Grid Manager project which consists of

1) GeneGrid Application & Resources Registry (GARR)

– Registry service, GT3 based

2) GeneGrid Node Monitors (GNM)

– Lightweight adapter present on all nodes

GeneGrid Overview

[Figure: overview of a GeneGrid Environment alongside GeneGrid Environments #2 to #n. Components shown: GeneGrid Portal, GeneGrid Workflow Manager, GeneGrid App & Resource Registry (GARR), GDM Services for the GeneGrid Workflow Definition and GeneGrid STRIP databases, and several GAM Services. Sites labelled: Belfast e-Science Centre, QUB, BT Data Centre, SDSC, University Melbourne. Applications and databases labelled: BLAST, bl2seq, ClustalW, HMMER, EMBOSS, GeneWise, Primer3, SignalP, TMHMM, RP, RP Eliminator, DB query, Swissprot, EMBL. Hosts labelled: 4p SMP Linux, i686 Linux, Sparc (Solaris 8), 6p SMP Sparc (Solaris 7), 32 x Sun Blade Linux.]

GeneGrid Environment

• GWMSF and both GDMSFs in the GE register their existence with the GARR

• GWMSF and the GeneGrid Portal are both configured with the location of the GARR service

• Upon start up, the Portal connects to the GARR to discover the location of the GDMSF for both the GWDD and the GSTRIP databases
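The registration and discovery steps above can be sketched as follows. This is a minimal in-memory illustration, not the GT3-based GARR itself: the `Registry` and `Portal` classes, their methods, and the `example.org` locations are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Registry:
    """Hypothetical in-memory stand-in for the GARR."""
    # Maps a registered name (e.g. "GWDD") to the location of its service.
    entries: Dict[str, str] = field(default_factory=dict)

    def register(self, name: str, location: str) -> None:
        """Record that a service exists and where it can be reached."""
        self.entries[name] = location

    def lookup(self, name: str) -> Optional[str]:
        """Return the registered location, or None if nothing is registered."""
        return self.entries.get(name)


@dataclass
class Portal:
    """Hypothetical Portal that is configured only with the registry location."""
    registry: Registry
    data_services: Dict[str, str] = field(default_factory=dict)

    def start_up(self) -> None:
        # Discover the data service behind each database at start-up.
        for database in ("GWDD", "GSTRIP"):
            location = self.registry.lookup(database)
            if location is None:
                raise LookupError(f"no GDMSF registered for {database}")
            self.data_services[database] = location


# The workflow and data services register themselves first (assumed URLs).
garr = Registry()
garr.register("GWMSF", "https://ge1.example.org/gwmsf")
garr.register("GWDD", "https://ge1.example.org/gdmsf/gwdd")
garr.register("GSTRIP", "https://ge1.example.org/gdmsf/gstrip")

# The Portal then discovers the data services through the registry.
portal = Portal(registry=garr)
portal.start_up()
print(portal.data_services)
```

The point of the design is that the Portal needs only one configured location, that of the registry; every other service location is discovered at run time.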

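[Figure: a GeneGrid Environment showing the GARR, the Portal, the GWMSF, and the GDMSFs for the GWDD and GSTRIP databases, with GNMs on the nodes. Caption: GNM on all GeneGrid Environment nodes registering with the GARR.]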

GeneGrid Portal

GARR Service

• GARR is the central service that mediates service discovery by publishing information about the available services

• Provides an interface for querying that information

• Captures the information sent by the GNMs

• Stores the information in the GARR database
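The capture, store, and query roles listed above can be illustrated with a small sketch. It assumes an in-memory SQLite table in place of the GARR database; the `RegistryStore` class, the table layout, the report fields, and the `example.org` handles are hypothetical and are not taken from the GeneGrid implementation.

```python
import sqlite3
import time


class RegistryStore:
    """Hypothetical GARR-like store: captures node reports and answers queries."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE resources ("
            " hostname TEXT, name TEXT, type TEXT, handle TEXT,"
            " cpu_load REAL, reported_at REAL,"
            " PRIMARY KEY (hostname, name))"
        )

    def capture(self, report: dict) -> None:
        """Store a report sent by a node monitor, replacing any older one."""
        self.db.execute(
            "INSERT OR REPLACE INTO resources VALUES (?, ?, ?, ?, ?, ?)",
            (report["hostname"], report["name"], report["type"],
             report["handle"], report["cpu_load"], time.time()),
        )
        self.db.commit()

    def query(self, resource_type: str) -> list:
        """Return (name, handle) pairs of one type, least loaded first."""
        rows = self.db.execute(
            "SELECT name, handle FROM resources WHERE type = ? ORDER BY cpu_load",
            (resource_type,),
        )
        return rows.fetchall()


garr = RegistryStore()
garr.capture({"hostname": "node1.example.org", "name": "BLAST",
              "type": "application", "cpu_load": 0.75,
              "handle": "https://node1.example.org/gamsf/blast"})
garr.capture({"hostname": "node2.example.org", "name": "TMHMM",
              "type": "application", "cpu_load": 0.10,
              "handle": "https://node2.example.org/gamsf/tmhmm"})
garr.capture({"hostname": "node2.example.org", "name": "EMBL",
              "type": "database", "cpu_load": 0.10,
              "handle": "https://node2.example.org/gdmsf/embl"})

applications = garr.query("application")
print(applications)
```

Keying each row on host and resource name means a repeated report from the same node overwrites the previous one, so the store always holds the most recent state.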

GeneGrid Node Monitor

System Information

» Hardware address

» System time

» IP address

» CPU speed

» CPU load

» Total memory

» Free memory

» Operating system's name

» Operating system's version

» Uptime

» Hostname

» System architecture

» Number of processors

» Load average for the last 1 minute, 5 minutes, and 15 minutes

» Custom data

Application Data

» Name of the resource

» Type of the resource

» Grid Service Handle (GSH)
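As an illustration of the kind of report a node monitor could assemble, the sketch below gathers part of the system information listed above using only the Python standard library. It is an assumption-laden example, not the GNM itself: the function names, the dictionary layout, and the sample handle are hypothetical, and CPU speed, memory figures, and uptime are left out because the standard library has no portable call for them.

```python
import os
import platform
import socket
import time
import uuid


def collect_system_information() -> dict:
    """Gather the portable subset of the system information."""
    hostname = socket.gethostname()
    try:
        ip_address = socket.gethostbyname(hostname)
    except OSError:
        ip_address = None  # the hostname may not resolve on every machine
    try:
        load_average = os.getloadavg()  # 1, 5, and 15 minute averages
    except (AttributeError, OSError):
        load_average = None  # not available on every platform
    return {
        "hardware_address": f"{uuid.getnode():012x}",
        "system_time": time.time(),
        "ip_address": ip_address,
        "os_name": platform.system(),
        "os_version": platform.release(),
        "hostname": hostname,
        "architecture": platform.machine(),
        "processors": os.cpu_count(),
        "load_average": load_average,
    }


def build_report(name, resource_type, handle, custom_data=None) -> dict:
    """Combine system, application, and custom data into one report."""
    return {
        "system": collect_system_information(),
        "application": {"name": name, "type": resource_type, "gsh": handle},
        "custom": custom_data or {},
    }


# Hypothetical resource description; the handle is an assumed URL.
report = build_report("BLAST", "application",
                      "https://node1.example.org/gamsf/blast")
print(report["application"], report["system"]["hostname"])
```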

[Figure: GAMSF and GDMSF resources (BLAST, EMBL) at BeSC and SDSC, each with a GNM, and a GARR. Caption: GNM on multiple resources across administrative domains registering resource information securely with a GARR.]

User Interface

Shared Resources

• GAM and GDM services make up the GeneGrid Shared Resources

• A GNM can register with many GARR services across multiple GEs, allowing resources to be shared between organisations

• Organisations have complete control over what resources, if any, they wish to share with other organisations, forming dynamic virtual organisations

[Figure: two GARRs, one for VO 1 and one for VO 2, and two resources, Resource A and Resource B, each with a GAMSF and a GNM.]

Resource A registers with both VO1 & VO2.

Resource B registers with VO1 only.
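The sharing arrangement above, in which each resource owner decides which registries to register with, can be sketched as follows. The `VORegistry` and `NodeMonitor` classes are hypothetical stand-ins for a GARR and a GNM, and the sketch leaves out the secure transport that a real deployment would need.

```python
class VORegistry:
    """Hypothetical stand-in for the GARR of one virtual organisation."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.resources = set()

    def add(self, resource: str) -> None:
        self.resources.add(resource)


class NodeMonitor:
    """Hypothetical stand-in for a GNM; its owner chooses the registries."""

    def __init__(self, resource: str, registries: list) -> None:
        self.resource = resource
        self.registries = registries

    def register(self) -> None:
        # Register only with the registries the owning organisation listed.
        for registry in self.registries:
            registry.add(self.resource)


vo1 = VORegistry("VO1")
vo2 = VORegistry("VO2")

# Resource A is shared with both organisations, Resource B with VO1 only.
NodeMonitor("Resource A", [vo1, vo2]).register()
NodeMonitor("Resource B", [vo1]).register()

print(sorted(vo1.resources), sorted(vo2.resources))
```

Because the list of registries lives with the node monitor rather than with any registry, withdrawing a resource from an organisation is a local configuration change on the resource side.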

Performance

[Figure: Response Time (GARR with 50 entries). X-axis: Clients (1 to 5). Y-axis: Time in milliseconds (0 to 3500). Series: c0, c01, c05, c06, c07.]

Performance 2

[Figure: Response Time (GARR with 100 entries). X-axis: Clients (1 to 5). Y-axis: Time in milliseconds (0 to 8000). Series: c0, c01, c05, c06, c07.]
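A response-time measurement of this shape, with one to five concurrent clients querying a registry of fixed size, could be set up roughly as below. The sketch is a generic timing harness under stated assumptions: `sample_query` is a hypothetical stand-in that filters a local list of 100 entries, so it does not reproduce the measurements in the charts, and it makes no claim about what the c0 to c07 series represent.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(query) -> float:
    """Run one query and return its response time in milliseconds."""
    start = time.perf_counter()
    query()
    return (time.perf_counter() - start) * 1000.0


def measure(query, clients: int) -> list:
    """Issue one query per client concurrently and collect the timings."""
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(timed_call, query) for _ in range(clients)]
        return [future.result() for future in futures]


# Stand-in registry content: 100 entries, as in the second chart.
entries = [{"name": f"resource-{i}", "type": "application"} for i in range(100)]


def sample_query() -> list:
    """Hypothetical query; a real test would call the registry service."""
    return [entry for entry in entries if entry["type"] == "application"]


results = {clients: measure(sample_query, clients) for clients in range(1, 6)}
for clients, timings in results.items():
    print(f"{clients} client(s): slowest response {max(timings):.3f} ms")
```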

Future Work

• Capture network information so that resources can be used effectively when large file transfers are to be performed

• Predict the performance of resources based on the data stored in the GARR database

• Add metadata about the registered services

Contact

• Project Manager: Dr Paul Donachy

– p.donachy@qub.ac.uk

• Bioinformatician: P.V. Jithesh

– p.jithesh@qub.ac.uk

• Grid Programmer: Sachin Wasnik

– s.wasnik@qub.ac.uk

• More information: http://www.qub.ac.uk/escience/projects/genegrid

http://www.qub.ac.uk/escience

Thank You!