Neil Geddes
NeSC, May 2004
Compute Grids, Data Grids and Service Grids
Dr Neil Geddes CCLRC Head of e-Science
Director of the UK Grid Operations Centre
- What they are
- What they can do
- Where they can be found
- What the future holds in this arena
What are they?
SIAC 2000, Wright State University, August 21, 2000
What is a computational grid?
• A pool of computational resources that can be “plugged into” via standard interfaces.
• Processors
• Data storage devices
• Instruments
Compute Grids
• Focus on high throughput computing
  – Clusters of computers
• Some very big
  – Clusters of clusters
  – HPC meta-computing
  – HPC + pre + post processing
• Grids enable coordination across administrative boundaries
• Key components:
  – Authentication, Authorisation
  – Resource discovery
  – Job submission/retrieval
  – Networking
NASA Information Power Grid
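As an illustration only, the key components listed above (authentication, authorisation, resource discovery, job submission/retrieval) can be sketched as a toy in-memory model. Every name here (`ToyGrid`, `Resource`, the sites and the user) is hypothetical and does not correspond to any real grid middleware API; a real deployment would use certificates, remote schedulers and a network.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A compute resource advertised by one administrative domain (hypothetical)."""
    name: str
    site: str
    free_cpus: int


@dataclass
class ToyGrid:
    """In-memory stand-in for grid middleware; not a real API."""
    trusted_users: set    # authentication: identities the grid recognises
    allowed_sites: dict   # authorisation: user -> set of sites they may use
    resources: list       # resource discovery: advertised resources
    results: dict = field(default_factory=dict)

    def authenticate(self, user: str) -> bool:
        return user in self.trusted_users

    def authorised(self, user: str, resource: Resource) -> bool:
        return resource.site in self.allowed_sites.get(user, set())

    def discover(self, user: str, cpus: int) -> list:
        """Return resources the user may use that have enough free CPUs."""
        return [r for r in self.resources
                if self.authorised(user, r) and r.free_cpus >= cpus]

    def submit(self, user: str, job, cpus: int) -> str:
        """Run `job` on the first suitable resource and return a job id."""
        if not self.authenticate(user):
            raise PermissionError(f"unknown user: {user}")
        candidates = self.discover(user, cpus)
        if not candidates:
            raise RuntimeError("no authorised resource with enough free CPUs")
        target = candidates[0]
        job_id = f"{target.name}-{len(self.results)}"
        # A real grid would run the job remotely; here it runs locally.
        self.results[job_id] = job()
        return job_id

    def retrieve(self, job_id: str):
        return self.results[job_id]


grid = ToyGrid(
    trusted_users={"alice"},
    allowed_sites={"alice": {"site-a", "site-b"}},
    resources=[Resource("cluster-1", "site-a", 8),
               Resource("cluster-2", "site-b", 64),
               Resource("cluster-3", "site-c", 512)],
)
job_id = grid.submit("alice", lambda: sum(range(10)), cpus=32)
print(job_id, grid.retrieve(job_id))  # cluster-2-0 45
```

The point of the sketch is the ordering: identity is checked first, then each site's own authorisation policy filters what discovery returns, and only then is a job placed. That is how coordination across administrative boundaries can work without any single site giving up control of its resources.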
Data Grids
• Focus on
  – Large data volumes
  – Coordinated data access
• Heterogeneous and distributed data
  – Importance of metadata
• e.g.
  – Virtual Observatories
  – Medical images
• Important components
  – Authentication, Authorisation
  – Resource discovery
  – Data transfer
  – Confidentiality
  – Networking
X-ray, optical, infra-red, radio
Service Grids
• Focus on
  – Everything else:
  – What you want to do rather than how it is done
  – Integrate audio visual tools
  – Remote control and tele-presence
    • Microscopes, beamlines, test equipment
• Integrated with compute and data grid
• Integrate with other services
  – Journal archives, website management
• Service based architectures
  – Web services
• Important components
  – Authentication, Authorisation
  – Resource discovery
  – Data transfer
  – Confidentiality
  – Common Interfaces
Common Grid Features
– Authentication
– Authorisation
– Accounting
– Resource discovery
– Data transfer
– Confidentiality
– Security
– Automation
Different emphasis for different deployments/problems
Grid computing is about common standards/interfaces to enable inter-enterprise, collaborative computing.
What can they do?
Where can they be found?
(some) US Grid Projects:
• Information Power Grid (IPG): production Grid for aerosciences and other NASA missions.
• Network for Earthquake Engineering Simulation Grid (NEESGrid): production Grid for earthquake engineering.
• National Virtual Observatory (NVO): production Grids for data analysis in astronomy.
• Particle Physics Data Grid (PPDG): production Grids for data analysis in high energy and nuclear physics.
• Southern California Earthquake Center 2: full geophysics modeling using Grids and knowledge-based systems.
• TeraGrid: U.S. science infrastructure linking four major resource sites at 40 Gb/s.
• DOE Science Grid (DOESG): supplies persistent Grid services.
• EdGrid: promotes applications of modeling and visualization in science and mathematics education, and remote control of instruments (electron microscope) for K-12.
• Biomedical Informatics Research Network (BIRN): an NCRR initiative aimed at creating a testbed to address biomedical researchers' need to access and analyze data at a variety of levels of aggregation located at diverse sites throughout the country.
UK eScience Projects
CLEF: A Co-operative Clinical e-Science Framework
BiosimGRID: A GRID Database for biomolecular simulations
e-HPTX: An e-Science resource for High Throughput Protein Crystallography
AstroGrid: A Virtual Observatory for the UK
BAIR: Biological Atlas of Insulin Resistance
ClimatePrediction.com: Distributed computing for a global climate (NERC Pilot)
DAME: Distributed Aircraft Maintenance Environment
e-Protein: A distributed pipeline for structural-based proteome annotation using GRID technology
e-Minerals: Environment from the molecular level: an e-Science proposal for modelling the atomistic processes involved in environmental issues
Integrative Biology: A robust and fault tolerant Grid infrastructure for biomedical science
GENIE: Grid Enabled Integrated Earth system model
GEODISE: Grid Enabled Optimisation & Design Search for Engineering
myGrid: Directly Supporting the E-Scientist
Comb-e-Chem: Structure-Property Mapping: Combination Chemistry & the Grid
NERC DataGrid: Data discovery and delivery for the NERC community
GridPP: The Grid for UK Particle Physics
EU funded Grid Projects
e-science and the UK GRID
LHC Computing Grid Project (experiments shown: LHCb, ATLAS, CMS)
climateprediction.net
• Launch ensemble of coupled simulations of 1950-2000 and compare with observations.
• Largest climate model ensemble ever (by factor of >200)
• >45,000 users, >15,000 complete model runs, >1,000,000 model years in ~3 months (this is equivalent to 1.5 Earth Simulators)
• “Screensaver” requires
  – 10 CPU days on a 1.4 GHz P4, >128 MB memory, 600 MB disk space
• Global outreach (participants on all 7 continents, inc. Antarctica!)
• Generated much interest in schools (coolkidsforacoolclimate.com)
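To put the quoted throughput in perspective, the short calculation below derives a few rates from the slide's figures. It is a rough sketch only. It treats the lower bounds (">45,000", ">15,000", ">1,000,000") as exact values and takes "~3 months" to mean 90 days; both are assumptions, not statements from the project.

```python
# Figures quoted on the slide, treated as exact although given as lower bounds.
users = 45_000
completed_runs = 15_000
model_years = 1_000_000
earth_simulators = 1.5   # the slide's stated equivalence
days = 90                # assumption: "~3 months" taken as 90 days

model_years_per_day = model_years / days
model_years_per_user = model_years / users
runs_per_user = completed_runs / users
model_years_per_earth_simulator = model_years / earth_simulators

print(f"{model_years_per_day:,.0f} model years simulated per day")
print(f"{model_years_per_user:.1f} model years per user")
print(f"{runs_per_user:.2f} completed runs per user")
print(f"{model_years_per_earth_simulator:,.0f} model years per Earth Simulator "
      f"equivalent over the same period")
```

Under these assumptions, roughly one user in three had completed a run after three months. That fits a workload in which each contribution takes many CPU days on a desktop PC.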
What is BIRN?
http://www.nbirn.net
• Testbed for a biomedical knowledge infrastructure
  – Creation and support of federated bioscience databases
  – Data integration
  – Interoperable analysis tools
  – Datamining software
  – Scalable and extensible
• Driven by research needs pull, not technology push
BIRN Today
• Established three neuroscience testbeds building on previously funded R01 research projects:
  – Mouse BIRN
  – Morph BIRN
  – Functional BIRN
  – BIRN Coordinating Center
• Integrating the activities of the advanced biomedical imaging and clinical research centers in the US.
• Developing hardware and software infrastructure for managing distributed data: creation of data grids.
• Exploring data using “intelligent” query engines that can make inferences upon locating “interesting” data.
• Building bridges across tools and data formats.
• Changing the use pattern for research data from the individual laboratory/project to shared use.
IT infrastructure to hasten the derivation of new understanding and treatment of disease through use of distributed knowledge
BIRN Network
Through the NEESgrid, researchers will:
• perform tele-observation and tele-operation of experiments;
• publish to and make use of a curated data repository using standardized markup;
• access computational resources and open-source analytical tools;
• access collaborative tools for experiment planning, execution, analysis, and publication.
The components of the NEESgrid system will be completed by September 2004, when management and operation of the NEES system will be turned over to a consortium of earthquake engineering researchers and practitioners.
NEESgrid will link earthquake researchers across the U.S. with leading-edge computing resources and research equipment, allowing collaborative teams (including remote participants) to plan, perform, and publish their experiments.
[Figure: Generic Experiment in Progress (an instance or “test”). Components shown: Web Portal, Simulation Programs, User Interface, NEES Equipment, Remote Collaborator, Equipment Specialist, Remote Investigator, camera, Collaboration (e-mail, VTC, web pages), Local Storage, Local Video Processor (Internet Appliance), Video Server, NEESPOP, Grid Services, LabVIEW DAQ, Operation and Control Lines, Collaboration Services, Tele-Presence and Video Servers, Streaming Data Server. Links carry data and (live) video streams.]
What the future holds?
Global Grid Forum (GGF): a community-initiated forum of thousands of individuals from industry and research leading the global standardization effort for grid computing. GGF's primary objectives are to promote and support the development, deployment, and implementation of Grid technologies and applications via the creation and documentation of "best practices": technical specifications, user experiences, and implementation guidelines.
The drive toward standardisation
• Horizontal and e-business framework
• Web Services
• Security
• Public Sector
• Vertical industry applications
• WS-RF (from GGF)
OASIS is a not-for-profit, global consortium that drives the development, convergence and adoption of e-business standards
Enabling Grids for E-science in Europe for Everyone
EGEE - Consortia
10 European Consortia (incl. GEANT/TERENA/DANTE) + US + Russia
UK e-Science: PPARC + Core Programme
Oxford and Leeds (White Rose Grid)
Manchester and CCLRC-RAL
Also includes: http://www.csar.cfs.ac.uk/
• 256 Itanium2 processor SGI Altix
• 512 processor Origin3800
• Thus, the NGS provides access to over 2000 processors, over 36 TB of "data-grid" capacity, common scientific applications and extensive data archives.
• Other resource providers anticipated to join in the future …
http://www.hpcx.ac.uk/
• Full installation = 1600 IBM p690+ Regatta processors; currently 1236 processors
EMBL Nucleotide Sequences
NCBI, BLAST, EMBOSS, FASTA, Gaussian
More than just computation and data resources…
In future will include services to facilitate collaborative (grid) computing
• Authentication (PKI X509)
• Job submission/batch service
• Authorisation
• Certificate management
• Virtual Organisation management
• Data access/integration services (SRB/OGSA-DAI/DQPS)
• Information service
• National Registry (of registries)
• Data replication
• Data caching
• Grid monitoring
• Accounting
Concluding Remarks
• Huge worldwide research activity
• Push towards standardisation and intersection with e-Business (web services)
• Increasing grid infrastructure deployed
‘[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’
Tony Blair, 2002
The End
Response of Atlantic circulation to freshwater forcing
The Particle Physics Challenge
Storage – Raw recording rate 0.1 – 1 GByte/sec
Accumulating at ~10 PetaBytes/year
10 PetaBytes of disk
Processing – >100,000 of today’s fastest PCs
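The storage figures above can be cross-checked with a short calculation. This is a sketch under stated assumptions: decimal units (1 GByte = 10^9 bytes, 1 PetaByte = 10^15 bytes) and recording spread evenly over a full calendar year. The slide gives neither the unit convention nor the detector live time, so the function name and constants below are illustrative only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # 31,557,600 s in a calendar year
GB = 10**9                               # assumption: decimal gigabyte
PB = 10**15                              # assumption: decimal petabyte


def petabytes_per_year(rate_gb_per_s: float) -> float:
    """Volume accumulated in one year at a constant rate in GByte/sec."""
    return rate_gb_per_s * GB * SECONDS_PER_YEAR / PB


low = petabytes_per_year(0.1)    # lower end of the quoted recording rate
high = petabytes_per_year(1.0)   # upper end of the quoted recording rate

# Constant rate that would accumulate the quoted ~10 PetaBytes/year.
average_rate = 10 * PB / SECONDS_PER_YEAR / GB

print(f"0.1 GByte/sec all year: {low:.1f} PB")
print(f"1.0 GByte/sec all year: {high:.1f} PB")
print(f"10 PB/year corresponds to {average_rate:.2f} GByte/sec sustained")
```

Under these assumptions the quoted ~10 PetaBytes/year corresponds to a sustained average of roughly 0.3 GByte/sec, which sits inside the 0.1 – 1 GByte/sec raw recording range.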
CMS
LHCb
ATLAS
CERN/LHC Community
Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users