Outline

TRANSPARENT GRID ENABLEMENT OF WEATHER RESEARCH AND FORECASTING

S. MASOUD SADJADI1, LIANA FONG6, ROSA M. BADIA2, JAVIER FIGUEROA1,9, JAVIER DELGADO1, XABRIEL J. COLLAZO-MOJICA8, KHALID SALEEM1, RAJU RANGASWAMI1, SHU SHIMIZU4, HECTOR A. DURAN LIMON5, PAT WELSH3, SANDEEP PATTNAIK10, ANTHONY PRAINO6, DAVID VILLEGAS1, SELIM KALAYCI1, GARGI DASGUPTA7, ONYEKA EZENWOYE1, JUAN CARLOS MARTINEZ1, IVAN RODERO2, SHUYI CHEN9, JAVIER MUÑOZ1, DIEGO LOPEZ1, JULITA CORBALAN2, HUGH WILLOUGHBY1, MICHAEL MCFAIL1, CHRISTINE LISETTI1, AND MALEK ADJOUADI1

1: FLORIDA INTERNATIONAL UNIVERSITY (FIU), MIAMI, FLORIDA, USA; 2: BARCELONA SUPERCOMPUTING CENTER, BARCELONA, SPAIN; 3: UNIVERSITY OF NORTH FLORIDA, JACKSONVILLE, FLORIDA, USA; 4: IBM TOKYO RESEARCH LABORATORY, TOKYO, JAPAN; 5: UNIVERSITY OF GUADALAJARA, CUCEA, MEXICO; 6: IBM T. J. WATSON, NY, USA; 7: IBM IRL, INDIA; 8: UNIVERSITY OF PUERTO RICO, MAYAGUEZ CAMPUS, PUERTO RICO; 9: UNIVERSITY OF MIAMI, CORAL GABLES, FLORIDA, USA; 10: FLORIDA STATE UNIVERSITY, TALLAHASSEE, FLORIDA, USA

CONTACT: [email protected]



Page 2: Outline

OUTLINE

Motivation

Grid Enablement

Application and Scenario

System Overview

Remaining Challenges & Lessons Learned

Page 3: Outline

MOTIVATION

Weather Prediction can:
  Save Lives
  Help Business Owners

How?
  Accurate Results
  Precise Location Information

What do we have?
  WRF – Weather Research and Forecasting
  “The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs.”

Page 4: Outline

MOTIVATION (CONT.)

WRF Status
  Single Machine/Cluster
  Single Domain
  Fine Resolution -> Resource Requirements

How to Overcome this?
  Through Grid Enablement

Expected Benefits to WRF
  More available resources – Different Domains
  Faster results
  Improved Accuracy

Page 5: Outline

GRID ENABLEMENT

“Grid-enabling is the practice of taking existing applications, which currently run on a single node or on a cluster of homogeneous nodes, and adapt them (either automatically or manually) so that they can be deployed over non-homogeneous computing resources connected through the Internet across multiple organizational boundaries (e.g., multiple clusters from different organizations) without major modifications to the underlying source code.”

The Grid-enablement process is successful if the resulting Grid-enabled application “performs better” than the original application.

“Performs better” can be interpreted in different ways: improved execution time, better resource utilization, enabling collaboration, …

Page 6: Outline

APPLICATION AND SCENARIO
THREE-LAYER NESTED DOMAIN

04/22/23

Page 7: Outline

APPLICATION AND SCENARIO
THREE-LAYER NESTED DOMAIN

[Figure: three nested domains labeled 15 km, 5 km, and 1 km]
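The nesting on this slide can be made concrete with a little arithmetic. The sketch below assumes only the three grid spacings shown (15 km, 5 km, 1 km) and computes the refinement ratio between each parent domain and its nest, plus the number of nest cells that cover one parent cell horizontally. The function names are illustrative and are not part of WRF.

```python
from fractions import Fraction

# Grid spacings of the three nested domains, outermost first (from the slide).
SPACINGS_KM = [15, 5, 1]


def refinement_ratios(spacings_km):
    """Return the parent-to-nest spacing ratio for each consecutive pair."""
    return [Fraction(parent, child)
            for parent, child in zip(spacings_km, spacings_km[1:])]


def cells_per_parent_cell(ratio):
    """Number of nest cells covering one parent cell (horizontal, 2-D)."""
    return ratio ** 2


if __name__ == "__main__":
    pairs = zip(SPACINGS_KM, SPACINGS_KM[1:])
    for (parent, child), ratio in zip(pairs, refinement_ratios(SPACINGS_KM)):
        print(f"{parent} km -> {child} km: ratio {ratio}, "
              f"{cells_per_parent_cell(ratio)} nest cells per parent cell")
```

Each step inward multiplies the horizontal cell count per unit area (by 9, then by 25 here), which is one reason fine resolution drives up resource requirements.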

Page 8: Outline

APPLICATION AND SCENARIO
THREE-LAYER NESTED DOMAIN

Page 9: Outline

SYSTEM OVERVIEW

Web-Based Portal

Grid Middleware (Plumbing)
  Job-Flow Management
  Meta-Scheduling
  Profiling and Benchmarking

Development Tools and Environments
  Transparent Grid Enablement (TGE)
    TRAP: Static and Dynamic adaptation of programs
      TRAP/BPEL, TRAP/J, TRAP.NET, etc.
    GRID superscalar: Programming Paradigm for parallelizing a sequential application dynamically in a Computational Grid

Page 10: Outline

System Architecture

[Figure: system architecture diagram; label shown: Grid Middleware]

Page 11: Outline

Web-Based Portal Screenshot
Meteorologist Login Interface

Page 12: Outline

Web-Based Portal Screenshot
Business Owners'/Emergency Officials' Login Interface

Page 13: Outline

GRID MIDDLEWARE

Middleware: “A layer between network operating systems and applications that aims to resolve heterogeneity and distribution”
  Examples: CORBA, Java’s RMI and .NET.

Grid Middleware: Middleware for Grid Enablement
  Examples: Globus, Legion, Condor-G, etc.

Page 14: Outline

PEER-TO-PEER INTER-DOMAIN INTERACTIONS

[Figure: two peer domains, BSC and FIU. Components shown in each domain: Meteorologist, Web-Based Portal, Job-Flow Manager, Meta-Scheduler, Resource Policies, Local Schedulers, Local Resources. The two domains are connected by Peer-to-peer Protocols.]

Page 15: Outline

PEER-TO-PEER INTER-DOMAIN INTERACTIONS

[Figure: the BSC and FIU domains with the same components (Meteorologist, Web-Based Portal, Job-Flow Manager, Meta-Scheduler, Resource Policies, Local Schedulers, Local Resources, Peer-to-peer Protocols), annotated with interaction steps numbered 1 to 7]

Page 16: Outline

PEER-TO-PEER INTER-DOMAIN INTERACTIONS

[Figure: three peer domains, IBM, BSC, and FIU, each with a Job-Flow Manager and a Meta-Scheduler, linked by peer-to-peer connections. Other labels shown: TDWB, IBM-USA, IBM-India, BSCgrid, CEPBA, GCB, GCBViz, Fork, SGE, LL/Fork.]

Page 17: Outline

FAULT-TOLERANT JOB-FLOW MANAGEMENT

[Figure: two domains, Domain 1: IBM and Domain 2: FIU. Components shown: Portal Client, IBM’s WebSphere Process Server, ActiveBPEL Engine, Generic Proxy, IBM TDWB Meta-Scheduler, Globus GridWay Meta-Scheduler, Local Schedulers. Recovery actions shown: re-submit job to remote domain; re-poll job at remote domain.]
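The two recovery actions named on this slide, re-polling a job and re-submitting it to a remote domain, can be sketched as a small failover loop. This is only an illustration under stated assumptions: the `submit`/`poll` interface, the status strings, and the `FakeDomain` class are hypothetical and do not reflect the API of TDWB, GridWay, or any BPEL engine.

```python
import time


class JobFailed(Exception):
    """Raised when a job cannot be completed (hypothetical error type)."""


def run_with_failover(job, domains, max_polls=3, poll_interval_s=0.0):
    """Try each domain in order: re-poll a submitted job, then fail over.

    `domains` is a list of objects exposing submit(job) -> job_id and
    poll(job_id) -> "DONE" | "RUNNING" | "FAILED". This interface is an
    assumption made for illustration only.
    """
    for domain in domains:
        try:
            job_id = domain.submit(job)
        except JobFailed:
            continue  # submission failed: re-submit to the next domain
        for _ in range(max_polls):
            status = domain.poll(job_id)
            if status == "DONE":
                return domain.name, job_id
            if status == "FAILED":
                break  # stop polling and re-submit elsewhere
            time.sleep(poll_interval_s)  # still running: re-poll later
    raise JobFailed(f"job {job!r} did not complete in any domain")


class FakeDomain:
    """In-memory stand-in for a domain, used only to exercise the sketch."""

    def __init__(self, name, statuses):
        self.name = name
        self._statuses = iter(statuses)

    def submit(self, job):
        return f"{self.name}-{job}"

    def poll(self, job_id):
        return next(self._statuses)


if __name__ == "__main__":
    ibm = FakeDomain("IBM", ["RUNNING", "FAILED"])
    fiu = FakeDomain("FIU", ["RUNNING", "DONE"])
    print(run_with_failover("wrf-run", [ibm, fiu]))
```

In this sketch a job that is still running after `max_polls` polls is treated like a failure and moved to the next domain; a real job-flow manager would more likely make that timeout policy configurable.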

Page 18: Outline

JOB FLOW MANAGEMENT ARCHITECTURE

[Figure: job-flow management architecture spanning Deployment Time and Run Time. Components shown: Rule Editor, Patterns, Policies, Logs, Flow Adapter (input job flow to adapted job flow), Job Flow Manager (FM) with Monitor, Recovery, and Correlator, Generic Proxy, Meta-Scheduler (MS). Interfaces shown: Proxy::Generic Invoke, FM::Notification, MS::Job Submission and Monitoring, MS::Notification.]

Sample job flow (WS-BPEL + JSDL)

To adapt:
  Operation: submitJob
  PartnerLink: MS_JobSubmissionService

After adaptation:
  Operation: submitJob
  PartnerLink: Proxy_JobSubmissionService

After adaptation:
  Operation: genericInvoke
  PartnerLink: Proxy_GenericInvoke
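The adaptation step on this slide, redirecting a `submitJob` call from `MS_JobSubmissionService` to a proxy partner link, can be illustrated with a small XML rewrite. This is a minimal sketch, not the project's Flow Adapter: the sample document is a simplified, namespace-free stand-in for a WS-BPEL process, and the `adapt_flow` function and its `generic` flag are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for a WS-BPEL process (illustrative).
SAMPLE_FLOW = """
<process name="wrfJobFlow">
  <sequence>
    <invoke name="runWRF" partnerLink="MS_JobSubmissionService"
            operation="submitJob"/>
  </sequence>
</process>
"""


def adapt_flow(xml_text, generic=False):
    """Redirect meta-scheduler invocations to the proxy.

    With generic=False only the partner link changes and the operation is
    kept; with generic=True the call becomes a genericInvoke on the proxy.
    """
    root = ET.fromstring(xml_text)
    for invoke in root.iter("invoke"):
        if invoke.get("partnerLink") != "MS_JobSubmissionService":
            continue  # leave unrelated invocations untouched
        if generic:
            invoke.set("partnerLink", "Proxy_GenericInvoke")
            invoke.set("operation", "genericInvoke")
        else:
            invoke.set("partnerLink", "Proxy_JobSubmissionService")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(adapt_flow(SAMPLE_FLOW))
    print(adapt_flow(SAMPLE_FLOW, generic=True))
```

Routing calls through a proxy in this way gives the job-flow manager a single place to observe and recover from failures without changing the rest of the flow definition.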

Page 19: Outline

THE META-SCHEDULING PROTOCOL

[Figure: a Consumer (Site A) and a Producer (Site B), each with Connection Management, Job Management, and Resource Management components. APIs between the sites: Connection API, Job Management API, Resource Exchange API. Resource Exchange calls shown: requestResourceData(), resourceData(); modes shown: PUSH MODE, PULL MODE.]
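A rough sketch of how a consumer and a producer might exchange resource data in the two modes follows. Only the names `requestResourceData` and `resourceData` come from the slide. Mapping the former to pull mode and the latter to push mode is an assumption, and `connect`, `push_update`, `submit_job`, and the `free_cpus` field are hypothetical placeholders for the Connection and Job Management APIs.

```python
class Producer:
    """Site that owns resources (Site B on the slide)."""

    def __init__(self, name, resources):
        self.name = name
        self.resources = dict(resources)
        self.subscribers = []

    # Connection API (assumed shape): a consumer registers with the producer.
    def connect(self, consumer):
        self.subscribers.append(consumer)

    # Resource Exchange API, pull mode (assumed): the consumer asks for data.
    def requestResourceData(self):
        return dict(self.resources)

    # Resource Exchange API, push mode (assumed): the producer sends updates.
    def push_update(self, **changes):
        self.resources.update(changes)
        for consumer in self.subscribers:
            consumer.resourceData(self.name, dict(self.resources))

    # Job Management API (assumed shape): accept a job if CPUs are free.
    def submit_job(self, cpus):
        if cpus > self.resources.get("free_cpus", 0):
            return False
        self.resources["free_cpus"] -= cpus
        return True


class Consumer:
    """Site that wants to use remote resources (Site A on the slide)."""

    def __init__(self):
        self.view = {}  # last known resource data, keyed by producer name

    def resourceData(self, producer_name, data):
        self.view[producer_name] = data

    def pull(self, producer):
        self.view[producer.name] = producer.requestResourceData()


if __name__ == "__main__":
    site_b = Producer("B", {"free_cpus": 8})
    site_a = Consumer()
    site_b.connect(site_a)
    site_a.pull(site_b)              # pull mode
    site_b.push_update(free_cpus=4)  # push mode
    print(site_a.view)
```

Pull mode keeps the consumer in control of when it refreshes its view, while push mode lets the producer report changes as they happen; the consumer's view can be stale in either case, so a job submission may still be refused.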

Page 20: Outline

FIU: META-SCHEDULER INTERNAL ARCHITECTURE

[Figure: internal architecture of the FIU meta-scheduler. Labels shown: User Client, JSDL, WS Client, Global Scheduling manager, Site scheduling manager, Resource manager, Connection Management, Job Management, Resource Management, GridWay, Globus, SGE, Fork, GCB Cluster, LA Grid Cluster.]

Page 21: Outline

BETTER SCHEDULING BY MODELING WRF BEHAVIOR

[Figure: "Modeling WRF Behavior" as an iterative, incremental process. Stages shown: Profiling, Code Inspection & Modeling, Mathematical Modeling, Parameter Estimation. A further formula with CPU, cache, memory, disk, and network terms is not recoverable.]

$T_{exe} = (a_0 + a_1 / \#nodes)\,(b_0 + b_1 / clock)$, with the fitted coefficients written here as $a_0, a_1, b_0, b_1$
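The execution-time model on this slide is a product of two factors, one in 1/#nodes and one in 1/clock. A short sketch shows what that form implies: for a fixed node count, execution time is a straight line in 1/clock. The coefficient names `a0`, `a1`, `b0`, `b1` and any values passed to these functions are placeholders, not parameters reported in the presentation.

```python
def t_exe(nodes, clock_ghz, a0, a1, b0, b1):
    """Execution-time model: (a0 + a1 / nodes) * (b0 + b1 / clock)."""
    return (a0 + a1 / nodes) * (b0 + b1 / clock_ghz)


def line_in_inverse_clock(nodes, a0, a1, b0, b1):
    """For a fixed node count, return (intercept, slope) of T_exe vs 1/clock."""
    node_factor = a0 + a1 / nodes
    return node_factor * b0, node_factor * b1


if __name__ == "__main__":
    # Arbitrary placeholder coefficients, chosen only to exercise the code.
    coeffs = dict(a0=1.0, a1=8.0, b0=10.0, b1=500.0)
    intercept, slope = line_in_inverse_clock(4, **coeffs)
    for clock in (1.0, 2.0, 3.0):
        direct = t_exe(4, clock, **coeffs)
        via_line = intercept + slope / clock
        print(f"clock={clock} GHz: model={direct:.1f}, line={via_line:.1f}")
```

Because the node count only scales the line's intercept and slope, each node count yields its own straight line when time is plotted against inverse clock speed.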

Page 22: Outline

RESULTS
EXECUTION TIME VS ALLOCATED CPU

Page 23: Outline

RESULTS
MODEL VALIDATION: A LINEAR MODEL!

[Figure: Computation Time (seconds), axis from 0 to 30000, plotted against "Inverse CPU (GHz)", axis from 0 to 3, with one series per node count from 2 to 15 nodes]
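One way to check the linearity this plot suggests is to fit a straight line of computation time against inverse clock speed for each node count and look at the goodness of fit. The sketch below uses ordinary least squares in plain Python. The sample points are synthetic values generated for illustration and are not measurements from the presentation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys, intercept, slope):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        return 1.0  # constant data is fitted exactly by a flat line
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic, perfectly linear sample for one hypothetical node count.
    inverse_clock = [0.5, 1.0, 1.5, 2.0, 2.5]
    times = [200.0 + 4000.0 * x for x in inverse_clock]
    intercept, slope = fit_line(inverse_clock, times)
    print(intercept, slope, r_squared(inverse_clock, times, intercept, slope))
```

With real profiling data, an R² close to 1 for every node count would support the linear model, and the fitted slopes and intercepts could then feed the parameter-estimation step.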

Page 24: Outline

CHALLENGES THAT REMAIN TO BE ADDRESSED

High latency of the Internet compared to high-speed LANs

High overhead of the Grid middleware software

Risk of incompatibility with future WRF versions

High volume of WRF source code

Compiling WRF on unsupported platforms

Page 25: Outline

LESSONS LEARNED

No current and complete methodology for Grid Enablement

Grid enabling cluster applications
  Issues: LAN vs WAN

WRF: lack of sufficient documentation, old programming techniques

Mathematical Model – May Optimize Speedup but also Error Margin – More Clusters Needed

Still at an early stage of a Concrete Scenario for Forecast Ensemble

Page 26: Outline

ACKNOWLEDGEMENTS

We are thankful to the following individuals for their contributions to some of the ideas presented in this paper: Yanbin Liu, Norman Bobroff, Balaji Viswanathan, Steve Luis, Shu-Ching Chen, Lloyd Trinish, Jason Liu, Alex Orta, T. N. Krishnamurti, Eric Johnson, and Donald Llopis.

This work was supported in part by IBM (SUR and Student Support awards) and the National Science Foundation (grants OISE-0730065, OCI-0636031, REU-0552555, and HRD-0317692). This work is part of the Latin American Grid (LA Grid) project.

Page 27: Outline

Contact Information: S. Masoud Sadjadi

http://www.cs.fiu.edu/~sadjadi/

[email protected]

Thank you!

and

Questions?