UK e-Science Program Core Centres 2001 (EPSRC) Research Council Pilot projects Godiva Ocean grid...

UK e-Science Program Core Centres 2001 (EPSRC) Research Council Pilot projects Godiva Ocean grid (NERC) Genie Earth System (NERC) e-Minerals (NERC) e-Biodiversity (BBSRC) Open EPSRC call for new e-Science Centres Reading e-Science Centre (Nov. 2003) Resources: Access Grid Node Technical Director: Jon Blower

Page 1: UK e-Science Program  Core Centres 2001 (EPSRC)  Research Council Pilot projects Godiva Ocean grid (NERC) Genie Earth System (NERC) e-Minerals (NERC)

UK e-Science Program

Core Centres 2001 (EPSRC)

Research Council Pilot projects

Godiva Ocean grid (NERC)

Genie Earth System (NERC)

e-Minerals (NERC)

e-Biodiversity (BBSRC)

Open EPSRC call for new e-Science Centres

Reading e-Science Centre (Nov. 2003)

Resources: Access Grid Node

Technical Director: Jon Blower

The Reading e-Science Centre (ReSC)

Jon Blower

Technical Director

http://www.resc.rdg.ac.uk

[email protected]

Aims of the ReSC

Promote e-Science methods in the environmental science community

– CGAM, DARC, ESSC, JCMM, NCAS all at Reading

Act as a focus for all e-Science activities in Reading

Provide expertise, help and support for these activities

Reach out into government agencies and industry

– esp. Met Office, Environment Agency

– British Maritime Technology

What is e-Science?

“science increasingly done through distributed global collaborations enabled by the Internet, using very large data collections, terascale computing resources and high performance visualization”

What is e-Science? (2)

Easier definition: “Collaborative science using distributed computing”

Who can benefit?

– Users of lots of computing power

– Users of large datasets

– Users of very distributed datasets

– Scientists who work across geographical and institutional boundaries

Easier to explain with some concrete examples

Case Studies

Case 1: Ensemble modelling

The Problem:

– Climate is sensitive to very many factors. How do we work out which factors are most important in determining our future climate?

The Solution:

– Run (fairly simple) simulations many, many times over with different parameters (an ensemble run)

– climateprediction.net: participants all over the world run the model on their home PCs
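
The ensemble idea can be sketched in a few lines. This is only an illustration and not the climateprediction.net model: it assumes a toy zero-dimensional energy-balance relation (warming = forcing / feedback), a forcing of 3.7 W/m² for doubled CO2, and a hypothetical range for the feedback parameter. All names are made up for the example.

```python
import random
import statistics

F_2XCO2 = 3.7  # W/m^2, commonly quoted forcing for doubled CO2 (assumed here)

def toy_model(feedback):
    """Equilibrium warming (K) of a toy energy-balance model: dT = F / feedback."""
    return F_2XCO2 / feedback

def run_ensemble(n_members, seed=0):
    """Run the toy model many times, each with a different feedback parameter."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_members):
        feedback = rng.uniform(0.4, 2.0)  # W/m^2/K, hypothetical parameter range
        results.append((feedback, toy_model(feedback)))
    return results

ensemble = run_ensemble(1000)
warmings = [dt for _, dt in ensemble]
print(f"members: {len(warmings)}")
print(f"min / median / max warming: {min(warmings):.1f} / "
      f"{statistics.median(warmings):.1f} / {max(warmings):.1f} K")
```

Each ensemble member is independent of the others, which is why the work can be farmed out to many separate machines.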

climateprediction.net results

Already largest climate model ensemble ever (by factor of >200)

>45,000 users, >15,000 complete model runs, >1,000,000 model years in ~3 months (this is equivalent to 1.5 Earth Simulators)

Large range of sensitivities found:

[Figure: range of sensitivities found; axis labels 10K and 2K]

• Global outreach (participants in all 7 continents, inc. Antarctica!)

• Generated much interest in schools (coolkidsforacoolclimate.com)

Case 2: Sharing large datasets

The Problem:

– There are many different models of ocean circulation and we would like to compare and visualize the results. But there are lots of different data formats, and there’s lots of data!

The Solution:

– Create an Internet-based service that allows users to cut out just the data they want, and get it in the format they want (this is called Grid Access Data Service, GADS)

– Developed under the GODIVA project
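
A minimal sketch of the "cut out just the data you want, in the format you want" idea follows. It is not the GADS interface: the grid, the field values and the function names `subset` and `deliver` are hypothetical, and the data live in memory rather than behind an Internet service.

```python
import csv
import io
import json

# Hypothetical gridded field: FIELD[lat_index][lon_index] on a 10-degree grid.
LATS = list(range(-90, 91, 10))
LONS = list(range(0, 360, 10))
FIELD = [[round(30 - abs(lat) / 3 + lon / 360, 2) for lon in LONS] for lat in LATS]

def subset(lat_min, lat_max, lon_min, lon_max):
    """Cut out only the grid points inside the requested box."""
    rows = []
    for i, lat in enumerate(LATS):
        if lat_min <= lat <= lat_max:
            for j, lon in enumerate(LONS):
                if lon_min <= lon <= lon_max:
                    rows.append({"lat": lat, "lon": lon, "value": FIELD[i][j]})
    return rows

def deliver(rows, fmt):
    """Return the subset in the format the user asked for."""
    if fmt == "json":
        return json.dumps(rows)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=["lat", "lon", "value"])
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")

region = subset(30, 60, 300, 350)
print(len(region), "of", len(LATS) * len(LONS), "points requested")
print(deliver(region, "csv").splitlines()[0])
```

Only the requested box travels to the user, and the delivery format is chosen independently of how the field is stored.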

GODIVA Web Portal

• Allows users to interactively select data for download using a GUI

• Users can create movies on the fly

• cf. Live Access Server

Case 3: Highly distributed data

The Problem:

– In order to study the genetic origins of a disease it is necessary to interrogate many data sources to perform in silico experiments to test hypotheses

The Solution:

– Provide Web Services to access these data sources and a means for combining these Services into workflows.

– These workflows can be shared between scientists, and experiments can be easily repeated

– myGrid project is doing just this (www.mygrid.org.uk)

The Taverna workbench

Each blob on the diagram is a Web Service

Flexible way of creating a distributed application

taverna.sourceforge.net

e-Science concepts

e-Science buzzwords

The GRID

– highly heterogeneous network of supercomputers, clusters and commodity machines (and one PS2!)

– cf. power grids (long way off!)

– not all e-Science is done on The GRID (in fact, most isn’t at the moment)

Interoperability / standards

– absolutely necessary for working together and avoiding duplication of effort

Metadata and Semantics (“The Semantic Web”)

– Metadata = “data about data”, vital for discovering data resources

– Meaning of data (semantics) must be precisely specified

The tools of the trade

Middleware

– software that “glues together” existing systems and connects people with distant resources

Condor

– Manages task of running jobs over several computers

Globus (Toolkit)

– Most popular middleware, handles authentication, job submission, etc.

– version 3 very different from previous versions; it’s based on…

Web Services

Web Services

“Black box” subroutine that can be accessed over the Internet

Platform and language neutral

– for example, code can run on Solaris, but be called from Mac, Windows, Linux etc., any language

Huge industry backing

– IBM, Microsoft, Sun, etc.

Grid Services extend WS for long-lived jobs

– notification of progress, persistence of data etc.
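
The "black box subroutine" idea can be illustrated with a tiny remote call. This sketch uses XML-RPC from the Python standard library as a stand-in; the slides do not name a protocol, and the function `mean_temperature` is hypothetical. Server and client run in one process here purely so the example is self-contained.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def mean_temperature(values):
    """The 'black box': callers see only the name, inputs and output."""
    return sum(values) / len(values)

# Server side: bind to an OS-chosen free port on localhost (illustration only).
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(mean_temperature)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: could be a different machine, operating system or language.
host, port = server.server_address
client = ServerProxy(f"http://{host}:{port}/")
result = client.mean_temperature([14.0, 15.5, 16.5])
print(result)

server.shutdown()
server.server_close()
```

The client needs only the service address and the function name; it never sees how or where the calculation is implemented.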

Workflows

Web Services can be composed into “workflows” to create a distributed application

– hot topic of research and debate in e-Science

Lots of standards and tools to do this, but no one clear “winner” yet

BPEL is popular, but really designed for business-to-business (B2B) interaction

Example workflow

Compare datasets

Visualize results

Perform diagnostics

Extract dataset 1

Extract dataset 2

Convert format
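
One way to picture a workflow like this is as a dependency graph whose steps run in an order that respects the dependencies. The sketch below uses the six step names from the slide, but the wiring between them is an assumption, since the original diagram's arrows did not survive in the transcript. It uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on (assumed wiring).
WORKFLOW = {
    "extract dataset 1": set(),
    "extract dataset 2": set(),
    "convert format": {"extract dataset 2"},
    "compare datasets": {"extract dataset 1", "convert format"},
    "perform diagnostics": {"compare datasets"},
    "visualize results": {"perform diagnostics"},
}

def run(workflow):
    """Execute steps in an order that respects every dependency."""
    order = list(TopologicalSorter(workflow).static_order())
    for step in order:
        print("running:", step)  # a real workflow would call a Web Service here
    return order

order = run(WORKFLOW)
```

Steps with no dependency on each other, such as the two extractions, could run in parallel on different machines.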

Visualization

Key component of many e-Science projects

Vital for validating models and finding features of interest

– not just “pretty pictures”

Can do collaborative visualization

– several groups can look at the same thing at the same time

– e.g. mammography in hospitals

Real-time visualization of model results permits computational steering

– RealityGrid (www.realitygrid.org)

– explore parameter space much more quickly

GODIVA visualization

Adaptive meshing gives data compression with little visible degradation

60 x 60 x 66 data points (~¼ million), reduced by a factor of ~10
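
The arithmetic behind these figures is quick to check. The reduction factor is quoted only as "~10", so treating it as exactly 10 is an assumption made for illustration.

```python
points = 60 * 60 * 66   # grid size quoted on the slide
reduced = points / 10   # "factor of ~10" taken as exactly 10 for illustration
print(points)           # 237600, roughly a quarter of a million
print(round(reduced))   # 23760
```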

Back to the ReSC...

Why ReSC?

Centre of Excellence in Environmental e-Science

Reading Uni has strong links with Met Office and Environment Agency

Support existing Reading e-Science activities

– in ESSC, Comp Sci, Plant Sciences, etc.

– acts as focus and central point of contact

– not just environmental e-Science

Complements NIEeS

– National Institute for Environmental e-Science in Cambridge

– www.niees.ac.uk

Who are we?

Two co-Directors

– Keith Haines (ESSC)

– Rachel Harrison (Computer Science)

Technical Director (first point of contact)

– Jon Blower (ESSC)

Many Associates

– Mike Evans, Lizzie Froude, Kevin Hodges, Chunlei Liu, Kecheng Liu, Adit Santokhee

– join us!

What are we doing?

Building Reading e-Science community

– Comp Sci, Met Dept, CGAM, DARC, Plant Sciences

Building infrastructure

– Building Condor pool between ESSC and Comp Sci, further in future

– Bidding for dedicated compute cluster

Building software

– Web Services for environmental data access and manipulation

Outreach into govt agencies and industry

– BMT, ECMWF, MCA, SEEDA

– using Reading Enterprise Hub

ReSC projects

Flexible Online Environmental Data Systems (EDAS)

– SEEDA project

– delivery of live Met Office data to end users

– e.g. BMT for search and rescue / oil spill mitigation

GODIVA

– Grid for Ocean Diagnostics, Interactive Visualization and Analysis

GADS

– Grid Access Data Service

Lizzie Froude’s PhD studentship

– storm tracking diagnostics on large, distributed data sets

Lots more going on in Reading

– e.g. BiodiversityWorld

– Computer Science

How you can get involved

Talk to us!

Join the Reading University e-Science mailing list

– [email protected]

Read our website: www.resc.rdg.ac.uk

Use the Wiki site to share ideas

– Register expertise and interests

– Share documents that might be of general use

What we can do for you

Provide technical expertise

– e.g. on Web Services, workflow, etc.

Provide advice on getting funding

Help find collaborators, resources etc

Provide computational resources

Provide live data

Provide Access Grid for use

The Access Grid

[email protected]

What is the Access Grid?

(not to be confused with The GRID!)

State-of-the-art videoconferencing suite

Can hold meetings with many sites at once

– everyone can see and hear everyone else

Reduces travel costs and saves lots of time

Uses high-speed internet

– no running costs!

Easy to operate

– don’t need dedicated technician

In conclusion…

ReSC is here to support all Reading e-Science activity

We specialise in environmental e-Science

We’re always looking for new projects to be involved in

Many potential future projects

– especially in area of delivery of real-time Met Office or Environment Agency data

– engage GIS community

Let us know what you would like us to do!

– [email protected]

Other environmental e-Science projects

GENIE

Grid-Enabled Integrated Earth System model

Aims to create a distributed, component-based model of the earth system

Will study long-term climate change and palaeoclimate

Will incorporate components representing atmosphere, ocean, land surface, ice, ocean and land biogeochemistry, ocean sediments

Developing novel computing techniques for model framework, integration, data management, visualization

www.genie.ac.uk

GENIE (contd.)

Response of Atlantic circulation to freshwater forcing

New ways of working:

– Web Portal for composing + executing simulations, retrieving results

– Use of flocked Condor pools (London, Soton) and Beowulf clusters

– Data client for post-processing

GENIE (contd.)

3 international collaborators (Japan, US, Switzerland)

Involvement in international projects: PRISM, EMIC, GAIM

4 oral, 2 poster presentations at EUG/AGU (Nice), IUGG (Japan), AHM 03

4 refereed journal papers (1 in press, 3 submitted)

Engagement with industry (50K each from Intel, Compusys for meetings)

~20 people at present using shared code repository

– Tyndall Centre will use code in integrated assessment model

GODIVA

Grid for Ocean Diagnostics, Interactive Visualisation and Analysis

Aims to quantify the thermohaline circulation via analysis of model results and observational data

Developing Web Services for performing common tasks on oceanographic data:

– Data extraction, processing, analysis, visualisation

These Services will be composed into “workflows” to create flexible, distributed applications

– collaborating with other e-Science projects (e.g. myGrid) in this matter

GODIVA progress

Talks/demonstrations at All Hands meeting and SCGlobal 2003

Created prototype client application:

– extracts live data and performs 3-D rendering

Also created data portal providing global access to data (next slide)

Will engage GIS community (e.g. MarineGIS project in Ireland)

MENTION irregular mesh

www.nerc-essc.ac.uk/godiva

GODIVA Data Portal

Web-based, similar to Live Access Server

Users select area of interest and can download data or create movies in a matter of seconds or minutes

Uses distributed computing for visualisation

NERC Data Grid

Objective is to build a grid which makes data discovery, delivery and use much easier than it is now

Standards compliant (ISO 19115, 19118), semantic data model for maximum interoperability

Data can be stored in many different ways (flat files, databases…)

Clear separation between discovery and use of data

1 PI, 2 co-Investigators, 4 FTE staff, 3 registered US collaborators

ndg.nerc.ac.uk

NERC Data Grid progress

Involved in many UK events (All Hands, Met Soc, NIEeS workshops etc)

Generated much international interest (US, France, Netherlands, Australia…)

Major challenges:

– Influencing OGC and ISO to support the complex requirements of the climate simulation community

– Developing a “feature-registry” to allow semantics of data types to be well understood by different communities

climateprediction.net

Have created extremely powerful and distributed climate modelling facility by running model simulation on home computers (cf. SETI@home)

Launch ensemble of coupled simulations of 1950-2000 and compare with observations.

Run on to 2050 under a range of natural and anthropogenic forcing scenarios.

Investigates sensitivity of climate system to increasing CO2 with range of parameter values

Have collaborated with other universities and industry to build system

e-Minerals

Models the atomistic processes involved in environmental issues (radioactive waste disposal, pollution, weathering)

– Simulation of radiation damage (Daresbury)

– Order-N quantum mechanical model of fluids (Cambridge)

– Complex fluid-mineral interfaces: crystal growth and dissolution (Bath)

Developing new methods

– embedded clusters: links simulations of various sophistication to cover greater ranges of scales

– first use of quantum Monte Carlo techniques in mineral sciences

eminerals.org

e-Minerals (contd.)

Have constructed minigrid across institutions to run code

– ~30 scientists in 8 institutions

Users submit jobs using a Web Portal

– This integrates the CCLRC Data Portal with the HPC Portal

Developing tools for collaborative visualisation across the virtual organisation

Collaborating with Peter Murray-Rust to extend the Chemical Markup Language (CML) for computational chemistry

NIEeS

National Institute for Environmental e-Science

Promotes and supports the use of e-science and grid technologies within the UK environmental science community

Holds workshops, courses, training events, visitor programmes, demonstration projects

Industry event forthcoming (Feb 12th)

– generating much interest

www.niees.ac.uk

NIEeS (contd.)

Up to end of 2003 (since launch in July 2002):

– 14 events held

– 901 participants

e.g. Earth Systems Modelling workshop (Oct 03) received coverage in national press and engaged Earth Simulator community in Japan

Event sponsorship from BNFL, LaserScan

In-kind support from EDINA, ICE, IEMA, MIRO

Additional help from Hi Consulting

Illustration of an e-Science problem

SOC’s latest OCCAM model runs at 1/12 degree resolution, covering the entire globe

Every model day, model outputs 8GB of data

– Hence whole data set will be several TB in size

How do we work with this data set?

– Might want to do analysis, visualisation etc.

– Extract just the data you want and work with it

– OR move the programs (code) to the data, not vice-versa

These are two key principles of e-Science
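
A back-of-the-envelope calculation shows why these principles matter. The 8GB per model day comes from the slide; the run length of one model year and the 1% subset fraction are assumptions chosen only to illustrate the scale.

```python
GB_PER_MODEL_DAY = 8  # from the slide

def volume_tb(model_days, gb_per_day=GB_PER_MODEL_DAY):
    """Total output in TB, using 1 TB = 1000 GB."""
    return model_days * gb_per_day / 1000

one_year = volume_tb(365)      # assumed run length: one model year
# Extracting a subset instead: a hypothetical 1% of each day's output.
subset_year = one_year * 0.01
print(f"{one_year:.2f} TB per model year, "
      f"{subset_year * 1000:.1f} GB after subsetting")
```

Under these assumptions a single model year already approaches 3 TB, while a small extracted subset is a few tens of GB and far more practical to move over a network.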

Working with large data sets

• Subset / resample

• Transform / regrid / rotate

• Analyse

• Compare

UK e-Science Centres

National e-Science Centre (NeSC)

National Institute for Environmental e-Science (NIEeS)

GADS: Background

• Climate scientists have a need to access large datasets:

– Model data and satellite observations

– Data in a variety of formats (netCDF, HDF, GRIB, more), grids, naming conventions

– Model intercomparisons (MERSEA)

• Existing standards (DODS/OPeNDAP) are limited

Advantages of GADS

Data are abstracted from storage

Data can be exposed with standard variable names, even if data files do not conform to standards

Data can be delivered in many formats, irrespective of internal storage format

Deployed as Web Service

– Platform-independent

– Compatible with current e-Science advances
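
The point about standard variable names can be sketched as a simple lookup layer between the user and the stored files. This is not the GADS implementation: the file names, internal variable names, standard names and values below are all hypothetical, and in-memory dictionaries stand in for real data files.

```python
# Hypothetical catalogue: how each file names the same physical quantity.
NAME_MAP = {
    "model_a.nc": {"sea_water_temperature": "TEMP", "sea_water_salinity": "SAL"},
    "model_b.grb": {"sea_water_temperature": "t_ocean", "sea_water_salinity": "s_ocean"},
}

# Stand-in for the files' contents, keyed by each file's internal variable names.
STORAGE = {
    "model_a.nc": {"TEMP": [12.1, 12.4], "SAL": [35.0, 35.1]},
    "model_b.grb": {"t_ocean": [11.9, 12.6], "s_ocean": [34.9, 35.2]},
}

def read(dataset, standard_name):
    """Look up data by standard name; the caller never sees internal names."""
    try:
        internal = NAME_MAP[dataset][standard_name]
    except KeyError:
        raise KeyError(f"{standard_name!r} not available in {dataset!r}") from None
    return STORAGE[dataset][internal]

for dataset in NAME_MAP:
    print(dataset, read(dataset, "sea_water_temperature"))
```

Because the mapping lives in the service rather than in the files, a non-conforming file can be exposed under a standard name without being rewritten.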