GRID Task Team Status

Yonsook Enloe
May 12, 2003

Yonsook@harp.gsfc.nasa.gov

CEOS: Committee on Earth Observation Satellites
Working Group on Information Systems and Services


GRID Talks This Week

• Before this session:
– ESA Grid Activities – Luigi Fusco

• At this GRID session:
– Overview and Status – Yonsook Enloe
– Grid Testing and Monitoring – Yonsook Enloe
– Tour of the working (wiki) website – Allan Doyle
– Certificate Authority Procedure – Allan Doyle
– Firewalls Best Practices – Jeff Smith

• At the Data Services Meeting (Tuesday):
– GridFTP and USGS Application – Stu Doescher

• At the ICS (CINTEX) Meeting (Wednesday):
– Grid Catalog Methods – Liping Di


Who Is On the GRID Task Team?

• NASA: Yonsook Enloe, Allan Doyle, Jeff Smith, Dick DesJardin, Ananth Rao, Dave Hartzell, R. Suresh, Gene Major, Dave Kendig, …

• NOAA NOMADS: Glenn Rutledge (NGDC), Danny Brinegar, Ted Smith, …

• ESA Data Integration: Pedro Goncalves, Luigi Fusco, Ivan Petiteville, Christophe Caspar, …

• George Mason University ECS Data Pools: Liping Di, Aijun Chen

• GSFC Advanced Data Grid: Debbie Ladwig, Bob Harberts, Sam Gasster, …

• Univ of Alabama Data Mining: Sara Graves, Helen Conover, Sandi Redman, Mike McEniry, …

• USGS Data Delivery: Stu Doescher, Mike Neiers, Tim Smith, …

• IPG (Grid Experts): Judith Utley, Tom Hinke, Jana Nguyen, …

• Observers and future participants: Wyn Cudlip, …


Application: USGS Data Delivery

• Goal: Explore use of GRID technologies (primarily GridFTP and Certificate Authority) for the delivery and reception of earth science data.

• Application focus:
– Delivery of earth science data from EDC to the scientific user community.
– Receiving data into the archive from producer/reception sites.

• Explore how Grid technologies could replace the technologies currently in use:
– Physical media (tape cartridges, CD-ROMs) and primitive network protocols (semi-anonymous FTP and limited FTP push).
– Security is a major concern.

• POC: Stu Doescher ([email protected])
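
To make the GridFTP delivery goal more concrete, here is a minimal sketch (not the USGS implementation) of how a transfer might be scripted around the Globus Toolkit's `globus-url-copy` client. The host name, file paths, and stream count are hypothetical. The script only builds and prints the command line, because actually running it would require a GridFTP server and a valid proxy certificate.

```python
import shlex


def build_gridftp_command(source_url, dest_url, parallel_streams=4):
    """Build a globus-url-copy command line for one transfer (not executed)."""
    if parallel_streams < 1:
        raise ValueError("parallel_streams must be at least 1")
    return [
        "globus-url-copy",
        "-vb",                         # report transfer performance
        "-p", str(parallel_streams),   # number of parallel data streams
        source_url,
        dest_url,
    ]


# Hypothetical endpoints, for illustration only.
cmd = build_gridftp_command(
    "gsiftp://gridftp.example.gov/data/scene_001.hdf",
    "file:///tmp/scene_001.hdf",
)
print(shlex.join(cmd))
```

Keeping the command as a list of arguments, rather than one string, avoids shell quoting problems if it is later passed to `subprocess.run`.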


Application: NOAA NOMADS

• NOAA Operational Model Archive and Distribution System (NOMADS) goals:

– Develop distributed Grid framework, promoting standards across multiple institutions.

– Provide access to climate and numerical weather prediction (NWP) models for analysis and intercomparison.

– Foster research within geoscience communities to study complex earth systems using multiple collections of distributed data.

• Led by National Climatic Data Center (NCDC), with support from National Centers for Environmental Prediction (NCEP), Geophysical Fluid Dynamics Laboratory (GFDL), and over a dozen other major collaborators.

• Grid technologies: GridFTP, Grid Information Service (GIS), Certificates.

• URL: www.ncdc.noaa.gov/oa/climate/nomads/nomads.html

• POC: [email protected]


Application: ESA Data Integration

• Led by European Space Agency (ESA) European Space Research Institute (ESRIN).

• Developing Grid Portal for Earth Science Applications Browser:
– Interfacing to EU DataGrid, DOE Earth System Grid, other data warehouses, OpenGIS Consortium (OGC) Web Services (OWS).

• Interfaces CEOS interoperability technologies with Grid environments to support on-demand user-driven data integration:
– Catalogue Interoperability Protocol (CIP), Web Map Server (WMS), archive data management, selection and transfer of data, on-demand data product generation, data product visualization.

• HTML user interface implemented using client application with generic functions developed in JavaScript.

• URL: giserver.esrin.esa.int

• POC: [email protected]


Application: NASA GSFC Advanced Data Grid (ADG)

• Led by NASA Goddard Space Flight Center (GSFC):
– Systems engineering, architecture and implementation support from Aerospace Corporation and GST Inc.
– Grid support (Certificate Authority services and Grid resources and services) from NASA Ames Research Center (ARC) Information Power Grid.
– Relationship with EOSDIS Data Pools Project.

• Primary Goals:
– Assess scalability of Grid architecture/implementation for Earth Science Data Segment data life cycle management and workflow (primary focus on Data Grid issues, not Compute Grid issues).
– Demonstrate realistic science application of relevance to NPP mission (www.jointmission.gsfc.nasa.gov) in fully Grid-enabled environment.

• Technologies:
– Globus Toolkit
– Storage Resource Management: SDSC SRB/MCAT, LBNL SRM, Globus MCS, related tools.
– Grid monitoring tools as required (e.g., Ganglia).

• Data and Metadata:
– Primarily EOS Data (MODIS) from Terra and Aqua Satellites, ECS Metadata Schema.

• POC: [email protected]


Application: NASA GSFC/GMU EOSDIS Data Pools

• Led by NASA Earth Observing System (EOS) Data and Information System (DIS) Project at NASA Goddard Space Flight Center (GSFC), with technology development and testbed at George Mason University (GMU).

• Goal is to demonstrate integration of Grid and OpenGIS Consortium (OGC) Web Services (OWS):

– Provide interoperable, personalized, on-demand data access and services.

– Initial focus is on the NASA/EOSDIS Data Pools environment at four EOS Distributed Active Archive Centers (DAACs): Goddard Space Flight Center (GSFC), Langley Research Center (LaRC), National Snow and Ice Data Center (NSIDC) at University of Colorado at Boulder, EROS Data Center (EDC).

– Technology development site is at GMU Laboratory for Advanced Information Technology and Standards (LAITS).

• Integrate NASA HDF-EOS (EOSDIS standard data format) Web GIS Software Suite (NWGISS), which provides OGC web map, coverage and registries services, with Grid technologies which provide security, resource access and management, Grid information/monitoring, data access/transfer.

• Work with Grid teams at Argonne National Laboratory (ANL) and NASA Ames Research Center to make Globus geospatially enabled and compatible with OGC interfaces.

• URL: laits.gmu.edu

• POC: [email protected]


Application: UAH/NSSTC Scientific Data Mining

• Led by University of Alabama in Huntsville (UAH) Information Technology and Systems Center (ITSC), applying its data mining tools to Earth Science data from the National Space Science and Technology Center (NSSTC) and other data centers.

• Explore use of Grid software tools and resources for compute-intensive data mining and machine learning applications in the earth sciences:

– Investigate Grid-enabled data mining issues, e.g., Grid resource monitoring and intelligent scheduling, to manage distributed data and compute resources in support of scientific data mining.

– Science focus is on developing supervised classifier of storm characteristics to identify dangerous storms with potential for heavy lightning.

– Leverage substantial UAH data mining expertise and software.

– Leverage ITSC testbed for NSF Middleware Initiative (NMI), to provide visibility into NMI for CEOS Grid developers, and to provide earth science and spatial data requirements and feedback to NMI middleware development and support team.

• Grid technologies: Globus Toolkit (Globus Packaging Technology (GPT), Grid Resource Information Service (GRIS), Grid Resource Allocation Manager (GRAM), GridFTP, Monitoring and Discovery System (MDS), Grid Security Infrastructure (GSI)), Network Weather Service (NWS), Condor-G.

• URL: www.itsc.uah.edu/about.html

• POC: Sara Graves ([email protected])


What Do We Want to Do?

[Figure: timeline chart. Time axis: 10/02, 04/03, 10/03, 04/04, 10/04. Stages: Test Suite, Pilot Apps, Full Apps, Full Grid. Elements: Technology Core; NOMADS (NOAA); Data Delivery (USGS); Data Integration (ESA); Advanced Data Grid (GSFC); EOSDIS Data Pools; Scientific Data Mining (UAH); WTFs.]


What Do We Want to Do?

• Oct 2002-March 2003, Phase 1: Establish CEOS Grid Technology Core Testbed
– Objectives:
  • Establish an immediate Grid capability base within participating CEOS agencies:
    – Grid software
    – Access to existing Grids
    – Pilot applications
    – Knowledgeable people

• April 2003-Sept 2003, Phase 2: Demonstrate CEOS Grid-enabled Applications
– Objectives:
  • Demonstrate Grid-enabled applications, each involving at least two CEOS agency sites.
  • Show proof of concept.
  • Evaluate benefits.
  • Obtain lessons learned from infusion of Grid technologies from the Technology Core into real CEOS agency information systems and applications.


What Do We Want to Do? (cont)

• Fall 2003: Presentation to WGISS, decide whether to continue
– Objectives:
  • Report to WGISS on accomplishments and "So what?" from first year.
  • Present second-year work plan and get approval to continue to second year.

• Oct 2003-Sept 2004, Phase 3: Create persistent CEOS Grid within WGISS Test Facilities (WTFs)
– Objectives:
  • Infuse applicable Grid technologies into selected CEOS agency information systems and WTFs, to create a persistent CEOS Grid that would be available to support future CEOS agency initiatives.


CEOS Grid

[Figure: CEOS Grid diagram with nodes NOAA NCDC, USGS EDC, NASA GSFC, ESA ESRIN, UAH, and GSFC/GMU.]


How Does the GRID Task Team Work?

• Task Team – overall coordination, identify issues and key technical areas of interest, initiate and staff tiger teams, coordinate implementation schedules, make general agreements, provide peer pressure to accomplish!

• Network Team – supports network issues; e.g. bandwidth testing, study firewall issues

• Tech Team – get technical expertise to provide tech support, identify technical areas of interest, and implement grid capabilities

• Small focussed tiger teams to explore specific topics and issues


How Does the Grid Team Work?

• Monthly Task Team telecons for everyone

• Bi-weekly Tech Team telecons for tech team

• Frequent as-needed telecons for specific tiger teams to study specific topics/issues

• Multiple email lists

• Public Task Team website at http://harp.gsfc.nasa.gov/grid

• Password protected working website at http://grid-tech.ceos.org/gridwiki


Issues

• CEOS Grid issues:
– Six application projects with widely differing application areas.
– Many issues are common to all six projects.
– Project team is working together to gain insight into these common problems.

• Issue 1: Lack of how-to documents for installing and using Grid software:
– Team is producing how-to documentation: Grid Cookbook pages.
– 1st cookbook page: How to install and configure Globus 2.2.
– 2nd cookbook page: How to install and configure GridFTP with multiple hosts and multiple clients.
– 3rd cookbook page (in progress): How to put simple applications on the Grid, e.g., Web Map Server application on the Grid.

• Issue 2: Lack of Grid expertise by participants:
– Grid Experts (IPG, . . .) are acting as consultants on various specialty topics.
– Formed Tech Team to help each other and help later participants.


Main Issues, Cont’d.

• Issue 3: Most agencies have firewalls. How should these be handled and configured to allow access?

– Network Team is gathering requirements for firewalls and is drafting a "CEOS Grid Firewall Best Practices" document. Jeff Smith will give a talk on this.

– Technical POCs interested in this issue will review the document with their firewall administrators and will iterate on the document.

• Issue 4: Grid Monitoring:

– Network bandwidth performance testing and checkout of network routing are being performed between testbed nodes. Results of initial network bandwidth testing by Andy Germain are accessible online.

– Several Grid Monitoring tools (Map Center, Ganglia, NWS) are being studied and tried out.

– Map Center monitoring tool can monitor host machine ports and perform process level monitoring.

– CEOS Grid application sites are linking to this tool to try it out (work in progress).
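
As an illustration of the port-level monitoring mentioned above, the following is a minimal sketch of a reachability check. It is not the Map Center implementation. The port numbers are the conventional Globus Toolkit 2.x defaults (gatekeeper 2119, MDS 2135, GridFTP 2811) and are an assumption here, since a site may configure different ones. The function names are hypothetical.

```python
import socket

# Conventional default ports for Globus Toolkit 2.x services (assumed;
# individual sites may run these services on other ports).
DEFAULT_GRID_PORTS = {"gatekeeper": 2119, "mds": 2135, "gridftp": 2811}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_host(host, ports=DEFAULT_GRID_PORTS, timeout=3.0):
    """Map each service name to whether its port accepted a connection."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}
```

A call such as `check_host("grid-node.example.org")` (a placeholder host name) returns a dictionary of service names to booleans. A successful TCP connection only shows that the port is reachable through any firewalls in between; it does not confirm that the Grid service behind it is healthy.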


Main Issues, Cont’d.

• Issue 5: Certificate Authority (CA): How should host and user certificates be implemented in an international multi-agency consortium?
– Small tiger team formed to study issue with Grid expert.
– Procedure for CA has been drafted and is being reviewed.
– IPG Certificate software being tested.
– Planning to use certificates from multiple sources (work in progress).
– Allan Doyle will give a talk on this.

• Issue 6: Catalog Issues: Because EO data have huge volumes from many sources, a searchable and scalable product catalog is needed. What kinds of catalog components are available on the Grid, and do these components have the necessary capabilities for CEOS catalogs?
– Grid experts on SDSC SRB/MCAT and Globus MCS invited to give presentations to entire team.
– Catalog tiger team formed (small team to study and analyze catalog issues and report back to the main team; work in progress). Liping Di will give the Grid Catalog report at the ICS meeting on Wednesday.

• Issue 7: Putting EO applications on the Grid:
– Small tiger team formed to study this issue and prototype at least one approach.
– Prototyping OGC Web Map Server (WMS) and Web Coverage Server (WCS) on the Grid.
– Will generate Cookbook pages on putting WMS and WCS on the Grid.
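
For readers unfamiliar with the WMS interface being prototyped under Issue 7, here is a minimal sketch of how a client might assemble an OGC WMS 1.1.1 GetMap request. The endpoint URL and layer name are hypothetical, and nothing here reflects how the tiger team's Grid-hosted prototype is actually addressed. Only the request parameter names come from the WMS 1.1.1 specification.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getmap_url(base_url, layers, bbox, width, height,
                     srs="EPSG:4326", image_format="image/png"):
    """Build an OGC WMS 1.1.1 GetMap request URL."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                              # empty means default styles
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),    # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = build_getmap_url(
    "http://wms.example.org/cgi-bin/wms",
    layers=["modis_ndvi"],
    bbox=(-180, -90, 180, 90),
    width=720,
    height=360,
)
# Parse the query string back to confirm the parameters survive encoding.
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(query["BBOX"][0])
```

Because the whole request is an ordinary HTTP GET, a Grid wrapper around such a service mainly has to add authentication and resource management around it, not change the request itself.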


Grid Network Team

• Led by Jeff Smith

• "Virtual" CEOS Grid Prototyping Network is actually made up of connectivity from several High Performance Research and Education Networks (HPRENs), e.g., NASA Research and Education Network (NREN), Energy Science Network (ESnet), Internet2 Abilene, European HPRENs.

• Network team works to ensure adequate connectivity between testbed nodes:
– Identify connectivity requirements (testbed network map).
– Perform network performance testing.
– Work to solve specific network connectivity problems as needed.

• Developed CEOS Grid Firewall Best Common Practices (BCP) Document:
– Working with USGS to implement, test and refine document.
– Will work with other organizations to test and refine document further.

• POCs: [email protected], Dave Hartzell ([email protected])
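
The network performance testing listed above comes down to timing transfers of known size between testbed nodes. The sketch below shows that arithmetic only. The function names and sample figures are hypothetical and are not results from the CEOS testbed.

```python
def throughput_mbps(bytes_transferred, seconds):
    """Convert a timed transfer into megabits per second (1 Mbit = 10**6 bits)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_transferred * 8 / seconds / 1e6


def summarize(samples):
    """Return (min, mean, max) throughput in Mbit/s for (bytes, seconds) samples."""
    rates = [throughput_mbps(b, s) for b, s in samples]
    return min(rates), sum(rates) / len(rates), max(rates)
```

Reporting the minimum and maximum alongside the mean is useful on shared research networks, where throughput between the same two nodes can vary widely from run to run.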


Grid Tech Team

• Led by Allan Doyle

• Focus is on:

– Establishing CEOS Grid Technology Core Testbed, including defining, establishing, extending and documenting a base level of functionality at each participating testbed node and organization.

• Summary Technical Work Plan:

– Learn from existing Grid contacts.

– Download free Grid software and install in testbed nodes.

– Connect testbed nodes into Initial CEOS Grid Virtual Organization (VO).


GRID Tech Team

• Summary Technical Work Plan, Continued:

– Define and execute core technology interoperability test suite (automated for regular testing and measurement).

– Provide CEOS Grid Virtual Organization certificates to participants (certificates will be supplied by NASA Information Power Grid), and help applications negotiate access agreements with existing Grid VOs.

– Assist application team leads to interconnect and interoperate their application sites with existing Grid VOs and CEOS partner sites.

– Identify representatives to attend:
  • Global Grid Forum (GGF) Applications and Testbeds Research Group
  • OGC EO WG and Architecture SIG (which is beginning to focus on Grid)
  • APAN (Grid WG and Earth Monitoring WG)
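
The work plan calls for an interoperability test suite that is automated for regular testing and measurement. A minimal sketch of such a runner follows. It is an assumption about the general shape of the harness, not the team's actual suite. The individual checks would be supplied per site, and the names used here are hypothetical.

```python
import time


def run_suite(checks):
    """Run each (name, callable) check and collect results.

    A check passes if it returns a truthy value and raises no exception.
    Returns a list of dicts with name, passed, detail, and elapsed seconds.
    """
    results = []
    for name, check in checks:
        started = time.time()
        try:
            passed, detail = bool(check()), ""
        except Exception as exc:  # one failing node should not stop the suite
            passed, detail = False, repr(exc)
        results.append({
            "name": name,
            "passed": passed,
            "detail": detail,
            "seconds": time.time() - started,
        })
    return results
```

Recording the elapsed time for each check lets the same run serve both purposes named in the plan: pass/fail testing and ongoing measurement.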


Grid Task Team Accomplishments

• Strong team culture of cooperation established and working

• Initial startup difficulties at all agency application projects overcome. All groups have new staff and new machines.

• Gaining Grid technical expertise

• Small focussed teams (tech team, network team, tiger teams, ...) working well

• Started the CEOS GRID Cookbook (how to implement/install various Grid capabilities and applications): 1st page, how to install Globus 2.x; 2nd page, how to install GridFTP with multiple hosts and multiple clients

• Initial network bandwidth testing completed

• CEOS GRID Monitoring Tool web prototype working that tests grid access to CEOS host machines

• Draft CA Procedure with IPG certificates completed. Testing software implementation with IPG staff

• Analysis and study of GRID Catalog issues initiated

• Analysis and study of how to install EO applications as Grid Web services started


Future Work

• Continue to add to the CEOS Grid Cookbook. IPG expert staff have expressed interest in using this cookbook

• Continue to add additional monitoring capabilities in Mapcenter prototype or through other tools

• Finalize CA Procedure with CEOS Grid team. Complete software testing with IPG Certificates

• Continue study and analysis of Grid Web Services applications for use with EO services/applications

• Facilitate progress on Agency Application Projects

• Work towards multiple GRID Application Demos at the Sept CEOS meeting

• Analysis of benefits of Grid Technologies for CEOS EO systems

• Grid Tech infusion to other CEOS agencies and projects

• Plan for next year activities