Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Page 1: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Enabling Grids for E-sciencE

INFSO-RI-508833

Grid Tutorial 2008

Page 2: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

What do you think is a GRID?

dr. M. Bouwhuis

The word ‘grid’ is (over)used a lot (HYPE):
– Oracle databases ;(
– cluster computing
– cycle scavenging
– “If a customer calls it a ‘grid’, then it is a grid”
– cross-domain resource and data sharing

Is there a clear definition?
• Coordinate resources that are not under central control
• The use of standards, open and generic protocols & interfaces
• Delivering a non-trivial amount of collective services

When do you need a grid?
• More than one computer
• More than one use (sharing)
• More than one location (collaborating)
• More than one company/division
• More than one community

In general: more than ONE. (J. Templon)

Page 3: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

The Grid metaphor

[Figure: the Grid metaphor. GRID MIDDLEWARE connects visualising, workstations, mobile access, supercomputers and PC clusters, data storage, sensors and experiments, over the Internet and other networks.]

Grid Tutorial, Utrecht, 2008

Page 4: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

www.eu-egee.org

Introduction to GRID computing

Introduction GRID Tutorial
Maurice Bouwhuis

SARA

Grid Tutorial, SURFnet, November 2008

Page 5: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

dr. M. Bouwhuis

Problem 1: High Energy Physics

ATLAS (45 m), one of the four LHC detectors

Online system: a multi-level trigger filters out background and reduces the data volume.
– 40 MHz (40 TB/sec) → level 1 - special hardware
– 75 kHz (75 GB/sec) → level 2 - embedded processors
– 5 kHz (5 GB/sec) → level 3 - PCs
– 100 Hz (100 MB/sec) → data recording & offline analysis

In NL:
– ~3000 CPUs
– 3,000,000 GB disk
– 3,000,000 GB tape per year
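The trigger rates on this slide can be checked with a little arithmetic. The sketch below is illustrative only: the stage labels and the pairing of each event rate with its data rate are a reading of the flattened slide, and the function names are hypothetical. It computes the event size implied at each stage and the factor by which each trigger level cuts the event rate.

```python
# Trigger stages as (label, event rate in Hz, data rate in bytes/sec).
# The numbers come from the slide; the labels are assumed.
STAGES = [
    ("detector output", 40_000_000, 40 * 10**12),  # 40 MHz, 40 TB/sec
    ("after level 1", 75_000, 75 * 10**9),         # 75 kHz, 75 GB/sec
    ("after level 2", 5_000, 5 * 10**9),           # 5 kHz, 5 GB/sec
    ("after level 3", 100, 100 * 10**6),           # 100 Hz, 100 MB/sec
]


def event_size_bytes(rate_hz, bytes_per_sec):
    """Average size of one event implied by an event rate and a data rate."""
    return bytes_per_sec / rate_hz


def reduction_factors(stages):
    """Factor by which each trigger level reduces the event rate."""
    return [prev[1] / cur[1] for prev, cur in zip(stages, stages[1:])]


if __name__ == "__main__":
    for label, rate, data in STAGES:
        size_mb = event_size_bytes(rate, data) / 1e6
        print(f"{label}: {size_mb:.1f} MB per event")
    print("per-level reduction:", reduction_factors(STAGES))
    print("overall reduction:", STAGES[0][1] / STAGES[-1][1])
```

Under these assumptions every stage implies about 1 MB per event, so the trigger reduces the data volume by discarding events rather than by shrinking them, and the overall event rate drops by a factor of 400,000.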

Grid Tutorial, SURFnet, September 2007

Page 6: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Life and Medical Science

• Automation and increased resolution → more data AND more complex data

• Many different sources

• Hypothesis-driven → data-driven

[Figure: growth of the number of GenBank sequence entries, 1965 to 2000; vertical axis from 0 to 6,000,000.]

Over 5 million sequence entries in GenBank

Over 3 billion bases from 41,000 species

Grid Tutorial, Utrecht, November 2008

Page 7: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

And the rest

• Astronomy (LOFAR et al.)
• Climate research
• Earth observation
• Humanities and social sciences
– Storage and long-term archiving
– Analysis of digital files
• Ecology
• Food and health
• Medical instrumentation design
• Archaeology
• …………………………

Where are you from?

Page 8: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

State of GRID today

• So what is happening today?
– Scale! Grid infrastructures operate worldwide
• International infrastructures: EGEE, DEISA, NorduGrid, OSG, TeraGrid
• National: NAREGI (Japan), UK e-Science, D-Grid, NLGrid
– Interoperability
– Availability of middleware: Globus Toolkit, UNICORE, NAREGI, schedulers

Page 9: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

State of GRID today

• Some basic requirements for a grid infrastructure
– Transparent user administration: single sign-on (single grid identity), authorisation and accounting based on grid identity (AAA facilities)
– Job scheduling that can handle different environments
– Global data access
– Global information services: job information, data information, resource information

• Interoperability!
– Standards needed for federation of infrastructures: GGF, IETF, …

Page 10: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

It all starts with Networking

• Developments in network connectivity (high bandwidths) and tools play an important role

– 10 Gbps WAN links available today, both shared links and dedicated lightpaths (based on lambda technology)

– 1 Gbps network adapters are commodity items on systems today and 10GE adapters available
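To get a feel for what these link speeds mean for data movement, here is a minimal sketch that estimates the ideal transfer time of a dataset. The 1 TB dataset and the function name are hypothetical, and the calculation ignores protocol overhead and assumes the link is fully available, so real transfers will be slower.

```python
def transfer_time_seconds(size_bytes, link_gbps):
    """Ideal time to move size_bytes over a link of link_gbps gigabits/sec.

    Assumes no protocol overhead and an otherwise idle link.
    """
    bits = size_bytes * 8
    return bits / (link_gbps * 10**9)


if __name__ == "__main__":
    one_terabyte = 10**12  # hypothetical dataset size, decimal terabyte
    for gbps in (1, 10):
        seconds = transfer_time_seconds(one_terabyte, gbps)
        print(f"1 TB over {gbps} Gbps: {seconds / 60:.1f} minutes")
```

A commodity 1 Gbps adapter is therefore the bottleneck for a single host on a 10 Gbps wide-area link: the same terabyte takes ten times longer.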

Page 11: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

SURFnet6 DWDM on dark fiber

[Figure: map of the SURFnet 6 infrastructure. Five DWDM subnetworks (1: Green, 2: Dark blue, 3: Red, 4: Purple, 5: Grey) connect sites across the Netherlands, from Groningen, Leeuwarden and Den Helder to Maastricht, Middelburg and Enschede, with a link to Muenster.]

Page 12: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

GEANT2 topology

Page 13: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Global Lambda Integrated Facility (GLIF)

World Map

www.glif.is
Visualization courtesy of Bob Patterson, NCSA/University of Illinois at Urbana-Champaign. Data compilation by Maxine Brown, University of Illinois at Chicago. Earth texture from NASA.

Page 14: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Univ. Linz - March 2008

EGEE

Main Objectives
• Operate a large-scale, production quality grid infrastructure for e-Science
• Attract new resources and users from industry as well as the sciences

Flagship grid infrastructure project co-funded by the European Commission. Now in its 3rd phase.

Page 15: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

EGEE – What do we deliver?

• Infrastructure operation
– Sites distributed across many countries
• Large quantity of CPUs and storage
• Continuous monitoring of grid services & automated site configuration/management
• Support multiple Virtual Organisations from diverse research disciplines

• Middleware
– Production quality middleware distributed under business friendly open source licence
• Implements a service-oriented architecture that virtualises resources
• Adheres to recommendations on web service inter-operability and evolving towards emerging standards

• User Support - Managed process from first contact through to production usage
– Training
– Expertise in grid-enabling applications
– Online helpdesk
– Networking events (User Forum, Conferences etc.)

Page 16: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

• 250 sites
• 48 countries
• 50,000 CPUs
• 13 PetaBytes
• >5000 users
• >200 VOs
• >140,000 jobs/day

Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …

32%

Page 17: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Grid Tutorial 2007, courtesy of Bob Jones (EGEE director)

[Figure: two charts covering April 2004 to December 2006. Number of CPUs (axis 0 to 40,000) and number of sites (axis 0 to 250).]

Production Usage Status

• ~19 million jobs run (8200 cpu-years, ~50K jobs/day) in 2006

• Non-physics usage is 10K jobs/day (same as whole of EGEE in 2005)

• Continuous usage of between ¼ and ⅓ of the available resources

• 24% of resources are contributed by groups external to the project

• Grid Operations report: https://edms.cern.ch/document/726140
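The 2006 usage figures above can be cross-checked against each other. The following sketch is an illustration, not part of the original slides: the function names are hypothetical and a 365-day year is assumed. It derives the daily job rate and the mean CPU time per job from the yearly totals.

```python
JOBS_2006 = 19_000_000      # "~19 million jobs run" (from the slide)
CPU_YEARS_2006 = 8_200      # "8200 cpu-years" (from the slide)
DAYS_PER_YEAR = 365         # assumption: non-leap year
HOURS_PER_YEAR = DAYS_PER_YEAR * 24


def jobs_per_day(jobs, days=DAYS_PER_YEAR):
    """Average number of jobs run per day."""
    return jobs / days


def mean_job_cpu_hours(cpu_years, jobs):
    """Average CPU time consumed by one job, in hours."""
    return cpu_years * HOURS_PER_YEAR / jobs


if __name__ == "__main__":
    print(f"jobs/day: {jobs_per_day(JOBS_2006):,.0f}")
    print(f"CPU-hours per job: {mean_job_cpu_hours(CPU_YEARS_2006, JOBS_2006):.2f}")
```

The result, roughly 52,000 jobs per day, agrees with the "~50K jobs/day" quoted on the slide, and the average job comes out at just under four CPU-hours. Since 8200 cpu-years were delivered within a single year, on average about 8200 CPUs were busy at any moment.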

Page 18: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

EGEE & SEE-GRID Summer School, Budapest, June 30th, 2007

Registered Collaborating Projects

• Applications: improved services for academia, industry and the public
• Support Actions: key complementary functions
• Infrastructures: geographical or thematic coverage

24 projects had registered as of February 2007

Page 19: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Grid Tutorial, Groningen, September 2006

User Support Activities

Page 20: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

User support in NE region

• User support: contact user support at your local site or mail to [email protected]
– NE uses a ticketing system monitored by different partners from our region. In NL, NIKHEF, RC-RuG and SARA are responsible.
– Tickets from GGUS are also imported into the NE system

• Application support
– NA4 activity. In NL: RC-RuG, SARA

Page 21: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

A Selection of Monitoring tools

1. GIIS Monitor
2. GIIS Monitor graphs
3. GOC Data Base
4. Scheduled Downtimes
5. GridIce – VO view
6. Live Job Monitor

Page 22: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.
Page 23: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

BiG Grid

• Strengthen the existing national grid infrastructure in the Netherlands (NL-GRID by NCF)

• Subsidy of 28 M€ for hardware and peopleware (expertise and support)

• Core partners
– NCF
– Nikhef (High Energy Physics)
– NBIC (BioInformatics)

• Central and distributed facilities

Grid Tutorial, SURFnet, September 2007

Page 24: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.
Page 25: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Infrastructure

• O(5000) CPUs
• O(10) PB disk storage
• O(20) PB tape storage
• O(10) Life Science Grid clusters

Page 26: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Combination of ‘push’ and ‘pull’

• Application support:
– expertise: Application Domain Analysts
– help desk and operations centre

• Uniform software suite

• Collaboration
– Standards: Open Grid Forum
– EGEE: ‘production’ grid (40 disciplines)
– BSIK VL-E project

Page 27: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Other Projects: DEISA

• European super-computing grid

• Shared global file system

• Job migration

• Co-scheduling

Page 28: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

[Figure: layered architecture. At the top, several applications, each consisting of an application-specific part, a potential generic part and management of communication & computing. Below them, the Virtual Laboratory layer provides application-oriented services, and the Grid layer harnesses multi-domain distributed resources. Virtual Laboratories cover distributed computing, visualization & collaboration, knowledge, and data & information.]

Page 29: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Virtual Laboratory for e-Science

[Figure: structure of the Virtual Laboratory for e-Science. Application domains: Bio-diversity, Bio-informatics, Telescience, Data Intensive Science, Food Informatics, Medical diagnosis & imaging. Generic layers: User interfaces & virtual reality based visualization; Adaptive information disclosure; Collaborative information management; Interactive PSE; Virtual lab. & system integration; Security & generic AAA; High-performance distributed computing; Optical networking.]

Page 30: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

The VL-e project

• 40 M€ (20 M€ BSIK funding)
• 2004 - 2008
• 20 partners
• Academic - Industrial

vrije Universiteit

Page 31: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

The Grid Tutorial

• Day 1
– Grid Certificates and Virtual Organizations
– Job Submission
– Online multimedia collaboration by SURFnet
– Security and authentication
– Drinks

• Day 2
– Data handling
– User Scenarios (SciaGrid and BioMed)

• Handout and USB stick
– Paper handout
– USB stick: demoXX, Imation, info.txt, tut_exercises.tgz, tutorial.pdf, tut_vmimage.zip

Page 32: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Sponsors

The tutorial is free thanks to the support of our sponsors:

• Gridforum.nl
• Netherlands Center for BioInformatics
• BigGrid
• SURFnet

Page 33: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Page 34: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Introduction to GRID computing
Bringing It All Together

Page 35: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Summary

You have seen and played with:

• Authentication --- X509 certificate, VO
• Job Submission --- Use the compute resources
• Data Management --- Moving data around
• Use Cases --- plans and achievements on the Grid

This Tutorial

• Web Service
• Service Oriented Architecture
• WorkFlow Systems
• Databases
• Ontologies
• Taverna/myGrid

Embrace/BioMed and other EU projects

Page 36: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Bringing it all Together

• Try your own application on the Grid
– Need help? Ask us, and we will work with you

• Talk to the experts
– We will walk around to answer “any” question

• What does this type of Grid mean for BioInformatics?
– Working session by Victor de Jager (NBIC), Machiel Jansen (SARA) and Pieter van Beek (SARA)

Page 37: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

22-05-2007, ICT Delta, dr. M. Bouwhuis

Extra Extra Extra Extra Extra

Page 38: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

To remember

• The grid is already available, in production and in use

• BIG GRID supplies the hardware and peopleware

• Plenty of storage and computing power

Page 39: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

• Also visit the Expo
– Virtual Laboratory for e-Science
– Nationale Computer Faciliteiten
– SARA
– ………

Thank you for your attention

Page 40: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Why in the Netherlands?

• Grid pioneers:
– leading role in grid development & standardisation
– hosted the 1st Global Grid Forum (March 2001)
– coordination of worldwide grid identity management
– two ‘area directors’ of the Open Grid Forum (David Groep – security, Cees de Laat – infrastructure)

• SURFnet: world leader in networking

• The Netherlands is a world leader in:
– bio banking
– digital archiving
– computational science
– radio astronomy

Page 41: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

A wide variety of sciences

• Humanities and social sciences: DANS, MPI Nijmegen

• Bio-science: BioAssist

• Elementary particle physics (HEP): the wLCG project (CERN) realises a grid prototype

• Pan-disciplinary in the Netherlands: VL-e gives shape to Dutch e-Science

Philips jobs at NIKHEF

Page 42: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

NL – LHC/Tier-1

Collaboration between NIKHEF and SARA.

Commitments laid down in the wLCG MoU (March 2006):

Resource                  2006    2007 (installed Apr. ‘07)    2008    2009    2010
CPU (kSI2K)                306    1680 (1878 [1])              4380    7540   12300
Disk (Tbytes)              170    1100 (270)                   2500    3800    6100
Tape (Tbytes)              143     719 (120)                   1810    3550    5740
Nominal WAN (Mbits/sec)  10000   10000                        10000   10000   10000

Co-financed by the NCF project ‘pilot BIG GRID’. Emphasis on generic services!

[1] Gross capacity (no ‘fair share’ applied); including 20% of LISA @ SARA
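To make the scale of these commitments concrete, the sketch below computes how fast each committed resource grows between 2006 and 2010. It is an illustration only: it uses just the committed values from the table (not the installed ones), and the variable and function names are hypothetical.

```python
YEARS = [2006, 2007, 2008, 2009, 2010]

# Committed capacity per year, copied from the table on this slide.
COMMITMENTS = {
    "CPU (kSI2K)": [306, 1680, 4380, 7540, 12300],
    "Disk (Tbytes)": [170, 1100, 2500, 3800, 6100],
    "Tape (Tbytes)": [143, 719, 1810, 3550, 5740],
}


def growth_factor(series):
    """Ratio between the last and the first committed value."""
    return series[-1] / series[0]


def year_on_year(series):
    """Growth ratio from each year to the next."""
    return [later / earlier for earlier, later in zip(series, series[1:])]


if __name__ == "__main__":
    for name, series in COMMITMENTS.items():
        steps = ", ".join(f"{ratio:.1f}x" for ratio in year_on_year(series))
        print(f"{name}: {growth_factor(series):.1f}x overall ({steps})")
```

The committed CPU capacity grows by roughly a factor of 40 over four years, with the largest single jump between 2006 and 2007, while the nominal WAN capacity stays fixed at 10000 Mbits/sec.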

Grid Tutorial, SURFnet, September 2007

Page 43: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Definition of Grid

• From an EU brochure:
– It doesn’t matter if your team is modeling the Earth’s atmosphere, designing cars, creating animated films or finding new medicines, the basic principle is the same: your Grid supplies all the computing power, software, data and knowledge you need in one integrated package, and helps project teams work more closely together

• The analogy with the power grid:
– Just as you can plug into the power grid anywhere without knowing where your energy is coming from, you can plug into the grid without knowing where your (computing) resources are coming from.

Page 44: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

History (1)

• From a news item in 1991
– “Smarr describes the metacomputer as a network of heterogeneous, computational resources linked by software in such a way that they can be used as easily as a personal computer”
– So the concept was already introduced in the early 90s, known as metacomputing.
– Motivation was the emergence of computer networks.

Page 45: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Example (1)

The following is an example of the kind of initiative started close by in those years:

In 1996 a project was started in Amsterdam:

The Amsterdam Metacomputing project is an ongoing effort from the University of Amsterdam (UvA), the Free University (VU) and "Academic Computing Services Amsterdam" (SARA) to develop a Metacomputer environment on the Amsterdam campus. Important components of this environment will be: automatic distribution and monitoring of jobs over a network of computer systems, uniform access to files of other users from each place to work and to each computer system incorporated in the environment, distributed storage of data on various fileservers, automatic backup, migration and archiving, general availability of both commercial and public domain software on software servers, and a minimum of system management tasks. In this way scientists will be able to devote all of their time to their actual task: science.

Page 46: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Example (2)

An extensive package of services will gradually be implemented and finally include the following components:

• fileservers and distributed, transparent file-systems;
• backup, migration and archiving services;
• batch-queueing systems, designed for efficient use of local systems, and if desired, of computational servers supplied by SARA;
• public domain and specialist (commercial) software servers.

All components will be accessible from the scientist's desktop. A client-server architecture will play an important role. Combining components will be a relatively easy task, enhancing efficiency in terms of man-hours needed to accomplish a given task. These pages, as well as the Metacomputer are still in a development stage ……..

Page 47: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Example (3)

Systems available at SARA in 1996

• Parsytec CC: 56 CPUs
• IBM SP2: 76 CPUs
• CRAY YMP: vector system

Page 48: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Example (4)

• SARA news item on 16-6-1998
– Foundation laid for the meta-environment.
– Since 4 May, SARA's IBM RS/6000 SP parallel supercomputer has been using the DCE/DFS environment, a file system that makes a transparent computing environment possible. With the new file system, the files of DCE/DFS users are accessible worldwide from other computer systems that have DCE/DFS, which lays an important foundation for the meta-environment.
– Users at the VU science faculty now have uniform access to their files, regardless of whether they work on the RS/6000 SP or on a local workstation. The same holds for users of the Parsytec CC system at SARA: all files are directly accessible to the user from both the Parsytec and the RS/6000 SP.

Page 49: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Example (5)

• A web interface was developed for submitting jobs to the metacomputing environment; a meta job language was also used.

• Job migration between systems and MPI over two systems were also investigated
– This was the first time we heard about Globus, now one of the well-known building blocks for grid infrastructures.
– The network link between the systems was a problem: only an FE link, Gbit not available, HiPPI (800 Mbps) not available for the Parsytec.

Page 50: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

NetherLight – Lightpath connections to the Netherlands

3rd quarter 2005

622M GLORIAD

GLORIAD-RU @NIKHEF

GE

Page 51: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Operational Organisation

Page 52: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

EGEE-II Federations and Countries

Page 53: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Networking (2)

– GridFTP can use multiple streams in order to take full advantage of available bandwidth

– Parallel file systems can take full advantage of underlying high speed networks: throughput can be in the order of 100 MByte/s and more

– Tuning of WAN TCP needs attention, e.g. latencies are in the order of milliseconds (~20 in Europe), and the defaults on most systems are not suited for bulk data transports.
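Why default TCP settings and single streams fall short on such links follows from the bandwidth-delay product. The sketch below is a back-of-the-envelope illustration under stated assumptions: the ~20 ms latency from the slide is treated as the round-trip time, the 64 KiB window is an assumed untuned default, the 1 Gbps link is an example, and all function names are hypothetical.

```python
import math


def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bps * rtt_ms / 1000 / 8


def stream_throughput_bps(window_bytes, rtt_ms):
    """Upper bound on one TCP stream: one window per round trip."""
    return window_bytes * 8 * 1000 / rtt_ms


def streams_needed(bandwidth_bps, window_bytes, rtt_ms):
    """Parallel streams needed to fill the link with a fixed window size."""
    return math.ceil(bandwidth_bps / stream_throughput_bps(window_bytes, rtt_ms))


if __name__ == "__main__":
    link_bps = 10**9          # assumed 1 Gbps path
    rtt_ms = 20               # "~20 in Europe", treated here as round-trip time
    window = 64 * 1024        # assumed untuned 64 KiB TCP window
    print("window needed to fill the link:", bdp_bytes(link_bps, rtt_ms), "bytes")
    print("one untuned stream:", stream_throughput_bps(window, rtt_ms) / 1e6, "Mbit/s")
    print("streams needed:", streams_needed(link_bps, window, rtt_ms))
```

Under these assumptions a single untuned stream reaches only about 26 Mbit/s of a 1 Gbps path. Either the TCP window is raised towards the 2.5 MB bandwidth-delay product, or several streams are run in parallel, which is the approach GridFTP offers.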

Page 54: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Page 55: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

The DEISA project

• Objective: To enable Europe’s terascale science by the integration of Europe’s most powerful supercomputing systems.

• DEISA is a European Supercomputing Service built on top of existing national services. This service is based on the deployment and operation of a persistent, production quality, distributed supercomputing environment with continental scope.

Page 56: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

THE DEISA SUPERCOMPUTING GRID

[Figure: the DEISA supercomputing grid. An AIX distributed super-cluster, vector systems (NEC, …) and Linux systems (SGI, IBM, …) connected through GEANT.]

Page 57: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Participating Sites

BSC         Barcelona Supercomputing Centre                                               Spain
CINECA      Consorzio Interuniversitario per il Calcolo Automatico                        Italy
CSC         Finnish Information Technology Centre for Science                             Finland
EPCC/HPCx   University of Edinburgh and CCLRC                                             UK
ECMWF       European Centre for Medium-Range Weather Forecasts                            UK (int)
FZJ         Research Centre Juelich                                                       Germany
HLRS        High Performance Computing Centre Stuttgart                                   Germany
IDRIS       Institut du Développement et des Ressources en Informatique Scientifique - CNRS   France
LRZ         Leibniz Rechenzentrum Munich                                                  Germany
RZG         Rechenzentrum Garching of the Max Planck Society                              Germany
SARA        Dutch National High Performance Computing and Networking centre               The Netherlands

Page 58: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

DEISA technologies

• GPFS: parallel filesystem for transparent file access from all systems
– dedicated European network used for high throughput

• LoadLeveler-MC for job submission on AIX systems

• UNICORE for job submission to all systems

• Common Programming Environment (CPE) on all systems for DEISA users

• Single username on all systems

Page 59: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Access

• Users can submit proposals for access to DEISA resources through DECI (DEISA Extreme Computing Initiative) calls

• Proposals are evaluated by national committees and, depending on their ranking, get access to resources

• Most partners contribute about 10% of their resources for DEISA applications

• URL: www.deisa.org

Page 60: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Arnold Meijster & Fokke Dijkstra, RC/RuG

Grid Tutorial, Groningen, 18&19 September 2006

Infrastructure overview

Page 61: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Overview

National and international grid infrastructures

• EGEE
• NL Grid
• VL-e
• DEISA
• BIG grid (future)

Page 62: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

EGEE

• EU project for worldwide Grid

• Operational infrastructure
– Support for sites and users
– Monitoring

• Multiple VOs
– Many scientific communities

• VOs start locally and can grow worldwide

• gLite middleware

• EGEE organised in multiple regions

Page 63: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

EGEE

Page 64: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

EGEE (Europe)

• 177 sites
• 27759 CPUs
• 44 PB Storage

Page 65: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

EGEE live!

• http://gridportal.hep.ph.ic.ac.uk/rtm/applet.html

Page 66: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Quality Assurance

• Site Functional Tests
– Run several times a day
– Status page: https://lcg-sft.cern.ch/sft/lastreport.cgi (needs a certificate in the browser)

Page 67: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Support

• Regional support model
– Multiple regions in Europe

• For support in the Northern Region (NL, BE, SE, DK, NO, FI) mail to: [email protected]

Page 68: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

LHC tiering model

• Large Hadron Collider starts in 2007

• Petabytes of data!

• Huge computational demand

• Tiering model for distribution of data
– CERN Tier0
– Several Tier1 sites, including Nikhef/SARA
– Many Tier2 sites

• Computation near data

Page 69: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Tier model

Page 70: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

NL-Grid

• National infrastructure

• Clusters at Nikhef, SARA and RuG

• Currently about 400 CPUs

• Lisa cluster at SARA (1500 CPUs) also available

• Special bioinformatics clusters to be installed at several sites

• DAS-3 research clusters

Page 71: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

NL-Grid storage

• Storage coupled to clusters at Nikhef and SARA
– Several Terabytes

• Tape storage at SARA coupled to grid
– Disk pool in front
– Tape storage coupled via SAN
– Multiple petabytes and growing!

• In the near future all storage will support the Storage Resource Management (SRM) interface

• San Diego Storage Resource Broker (SRB) also available at SARA

Page 72: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

VL-e

Mission: To boost e-Science by creating an e-Science environment and carrying out research on methodologies.

Page 73: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

DEISA

• European super-computing grid

• Shared global file system

• Job migration

• Co-scheduling

Page 74: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Big Grid

• Improving the national grid infrastructure

• Large grant: ~30 million euro

• Goals:
– 27,600 kSI2k in 2009 (~35,000 high-end PCs)
– 14.1 Petabytes of storage in 2009

• Support many scientific disciplines:
– Astronomy, Biological & medical sciences, High energy physics, Linguistics, Climate research
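The kSI2k figure is easier to picture as a number of machines. The sketch below derives the conversion ratio implied by the Big Grid goal on this slide (27,600 kSI2k for about 35,000 high-end PCs) and applies it to other capacities. The ratio is only as accurate as the slide's rounded figures, and the names are hypothetical.

```python
BIG_GRID_KSI2K = 27_600   # 2009 goal, from the slide
BIG_GRID_PCS = 35_000     # approximate PC equivalent, from the slide

# Implied compute power of one "high-end PC" of that era, in kSI2k.
KSI2K_PER_PC = BIG_GRID_KSI2K / BIG_GRID_PCS


def pc_equivalents(ksi2k):
    """Number of high-end PCs that match a capacity given in kSI2k."""
    return ksi2k / KSI2K_PER_PC


if __name__ == "__main__":
    print(f"one PC is about {KSI2K_PER_PC:.2f} kSI2k")
    # Hypothetical example: a 1000 kSI2k cluster expressed in PCs.
    print(f"1000 kSI2k is about {pc_equivalents(1000):.0f} PCs")
```

With this ratio a single high-end PC counts for a little under 0.8 kSI2k, which gives a quick way to translate capacity planning numbers into hardware counts.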

Page 75: Enabling Grids for E-sciencE INFSO-RI-508833 Grid Tutorial 2008.

Conclusions

• Many projects!

• What should I do?

• Go to your nearest expert site and ask for help

• What are your demands?
– Computing
– Storage
– Other, e.g. real-time access

• Collaborate!
– National
– With other people all over the world
– Start a VO