Introduction to grids
Taavi Hupponen, CSC
Definition?
There are as many definitions as there are grids…
The power grid analogy really isn’t a very good one
Grids aim to provide easy, efficient and secure access to distributed resources
How to recognize a grid?
• Resource sharing (CPU, storage…)
• Spans organization borders
• Security
• Based on open standards
Grid types
Categorization of grids is more or less artificial; most grids fall into several categories
Computational grids
• The traditional grid
• Connecting clusters, workstations and supercomputers
• Examples: EGEE, DEISA, SETI@home
Data grids
• Easy, efficient and powerful access to data
• Uniform interface, distribution and replication of large data sets
• Examples: Bridges, BIRN, peer-to-peer file sharing networks like BitTorrent?
Knowledge grids, services grids
Building blocks
Most grids are built from the same basic blocks, including:
• Computing elements
• Storage elements
• User interface
• Job management
• User management
• Security
Middleware
The building blocks are implemented by the middleware of the grid
Middleware acts between an application and the operating systems of the grid nodes
The term ’middleware’ is used quite loosely; it can mean almost anything
Examples:
• LCG-2 and gLite (EGEE)
• Nordugrid ARC (SweGrid, M-grid)
• Unicore (DEISA)
• Globus Toolkit
Unfortunately, different middlewares don’t work very well together; work is being done to improve grid interoperability
Common grid user interfaces
Command-line interfaces
• Still the most common way of using grids
• Almost like using a batch job system in a local cluster: write the job description, submit the job, poll for status, get the results
• In addition: certificate handling
Graphical clients
• Often include workflow features
Web portals
• Either hide or expose the grid middleware
• One portal for one or more grids (P-GRADE)
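The command-line workflow listed above (write the job description, submit the job, poll for status, get the results) can be sketched in a few lines. This is only an illustration: ToyGrid, JobDescription and run_job are hypothetical names, and the in-memory "grid" simply advances a job one state per poll. It is not the client interface of any real middleware.

```python
from dataclasses import dataclass
from enum import Enum
import itertools


class Status(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    FINISHED = "finished"


@dataclass
class JobDescription:
    """Hypothetical job description: what to run and with which arguments."""
    executable: str
    arguments: list


class ToyGrid:
    """In-memory stand-in for a middleware client (an assumption, not a real API)."""

    _ORDER = [Status.QUEUED, Status.RUNNING, Status.FINISHED]

    def __init__(self):
        self._jobs = {}
        self._ids = itertools.count(1)

    def submit(self, description):
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = {"description": description, "polls": 0}
        return job_id

    def status(self, job_id):
        # Pretend the job moves one state forward each time it is polled.
        job = self._jobs[job_id]
        job["polls"] += 1
        return self._ORDER[min(job["polls"] - 1, len(self._ORDER) - 1)]

    def get_results(self, job_id):
        job = self._jobs[job_id]
        if job["polls"] < len(self._ORDER):
            raise RuntimeError("job not finished yet")
        d = job["description"]
        return f"{d.executable} {' '.join(d.arguments)}: done"


def run_job(grid, description):
    """Submit, poll until finished, then fetch the results."""
    job_id = grid.submit(description)
    history = []
    while True:
        state = grid.status(job_id)  # a real client would sleep between polls
        history.append(state)
        if state is Status.FINISHED:
            break
    return history, grid.get_results(job_id)


history, output = run_job(ToyGrid(), JobDescription("simulate", ["--steps", "100"]))
```

With a real grid the same four steps are usually separate commands, and a valid certificate is needed before the first of them.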
Security in grids
In most grids, security has been considered from the beginning, unlike, for example, in the World Wide Web
Grid security:
• Is based on Public Key Infrastructure (PKI), a robust security mechanism built on the same public-key cryptography as, for example, SSH and SSL
• Usernames and passwords are replaced by certificates
• Certificates are provided by trusted entities called Certificate Authorities
• PKI provides authentication, integrity and confidentiality
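To make the role of a Certificate Authority more concrete, here is a toy sketch of a CA signing a certificate and a grid node verifying it. Everything in it is an assumption for illustration: the key is textbook RSA built from two small well-known primes, the certificate is a plain string, and the names ca_sign and verify are hypothetical. Real grids use X.509 certificates and proper cryptographic libraries, and this sketch must not be used for actual security. It shows authentication and integrity only, not confidentiality.

```python
import hashlib

# Toy CA key: textbook RSA from two Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**31 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(message: str) -> int:
    """SHA-256 of the message, reduced modulo N so it fits the toy key."""
    return int.from_bytes(hashlib.sha256(message.encode()).digest(), "big") % N


def ca_sign(message: str) -> int:
    """Only the CA can do this: it requires the private exponent D."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Anyone can do this: it needs only the CA's public values N and E."""
    return pow(signature, E, N) == digest(message)


# A made-up certificate binding a user identity to a (fake) public key.
certificate = "subject=/O=Example Grid/CN=Alice;public_key=abc123"
signature = ca_sign(certificate)

genuine_ok = verify(certificate, signature)
forged_ok = verify(certificate.replace("Alice", "Mallory"), signature)
```

A node that trusts the CA accepts the genuine certificate, while a certificate altered after signing fails verification because its digest no longer matches the signature.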
Virtual organisations
Access to grid resources is often controlled at the Virtual Organisation (VO) level rather than per individual user
VOs are based on collaboration, geographical location or scientific field
Example: Biomed VO in EGEE
In its simplest form a VO is a list of user identities; it can also include the programs that are to be used
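A VO in the simple form described above can be modelled as a set of user identities plus an optional set of programs. The sketch below is an assumption about how such a check might look: the class, the may_run method and the example identities are hypothetical, and treating an empty program set as "no restriction" is a choice made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganisation:
    """Toy VO: a list of user identities and, optionally, allowed programs."""
    name: str
    members: set = field(default_factory=set)
    programs: set = field(default_factory=set)  # empty set: no program restriction

    def may_run(self, user_identity: str, program: str) -> bool:
        # Access is decided by VO membership, not by per-user rules on each site.
        if user_identity not in self.members:
            return False
        return not self.programs or program in self.programs


# Made-up identities and program names, for illustration only.
biomed = VirtualOrganisation(
    name="biomed",
    members={"/O=Example Grid/CN=Alice", "/O=Example Grid/CN=Bob"},
    programs={"blast", "autodock"},
)

alice_blast = biomed.may_run("/O=Example Grid/CN=Alice", "blast")
alice_other = biomed.may_run("/O=Example Grid/CN=Alice", "render")
carol_blast = biomed.may_run("/O=Example Grid/CN=Carol", "blast")
```

A site then only needs to decide which VOs it admits; adding or removing a user is handled in the VO's member list.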
Putting programs into the grid
Programs installed by grid admins
• Either at all or only at some nodes
• There usually is a common set of programs that can be found on each node of a grid (basic utilities, compilers etc.)
• Nodes have mechanisms for advertising which programs are installed
Programs installed by grid users
• Program is sent to the node with the job description and input data
• You need to consider hardware architecture, operating system and library issues
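Both cases above come down to matching a job against what each node offers: the software it advertises and, for user-supplied programs, its hardware architecture and operating system. A minimal sketch, assuming nodes publish this information as simple records (the Node fields, the matching_nodes function and the node data are all hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """What a node advertises about itself (hypothetical record layout)."""
    name: str
    architecture: str
    operating_system: str
    software: frozenset


def matching_nodes(nodes, required_software=(), architecture=None, operating_system=None):
    """Return the names of nodes that satisfy every stated requirement."""
    required = set(required_software)
    result = []
    for node in nodes:
        if not required <= node.software:
            continue  # some required program is not installed there
        if architecture is not None and node.architecture != architecture:
            continue  # a binary built for another architecture would not run
        if operating_system is not None and node.operating_system != operating_system:
            continue
        result.append(node.name)
    return result


# Made-up nodes for illustration.
nodes = [
    Node("alpha", "x86_64", "linux", frozenset({"gcc", "python", "blast"})),
    Node("beta", "x86_64", "linux", frozenset({"gcc", "python"})),
    Node("gamma", "ppc64", "aix", frozenset({"gcc", "blast"})),
]

preinstalled = matching_nodes(nodes, required_software={"blast"})
own_binary = matching_nodes(nodes, architecture="x86_64", operating_system="linux")
```

The first query finds nodes where the admins have already installed the program; the second finds nodes where a user-supplied binary built for x86_64 Linux has a chance of running, library issues aside.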
What kind of problems fit into grids?
Non-parallel problems
• As if running on a local workstation
Embarrassingly parallel problems
• The problem is easily split into smaller independent jobs that can be distributed inside a site or even among several sites
• Very well suited for grids
Most problems are in-between and are best executed inside one site
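The embarrassingly parallel case can be illustrated with a small sketch: the input is split into independent chunks, each chunk is processed as its own job, and the partial results are combined at the end. The thread pool here is only a local stand-in for grid nodes, and the function names and the sum-of-squares workload are assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, n_jobs):
    """Cut the input into at most n_jobs contiguous, independent chunks."""
    if not items:
        return []
    size = -(-len(items) // n_jobs)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def job(chunk):
    """One independent job: it needs nothing from the other chunks."""
    return sum(x * x for x in chunk)


def run_distributed(items, n_jobs):
    """Run every chunk as a separate job and combine the partial results."""
    chunks = split(items, n_jobs)
    # The thread pool stands in for sending each job to a different node.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        partials = list(pool.map(job, chunks))
    return sum(partials)


total = run_distributed(list(range(10)), n_jobs=3)
```

Because the jobs never communicate with each other, it does not matter whether they run on one cluster or on several sites. Problems whose parts must exchange data while running do not split this cleanly.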
Grid examples
EGEE
• Grid of heterogeneous clusters and workstations
• Over 30,000 CPUs, 5 Petabytes of storage
• EGEE project ended in March 2006, EGEE II started in April 2006
• Funded by EU FP6
• http://www.eu-egee.org
DEISA
• Grid of supercomputers (mostly IBM)
• For High Performance Computing applications
• Funded by EU FP6
• http://www.deisa.org
Challenges
Constant development makes it challenging for users and admins to keep up
Distribution adds overhead, decreases control and transparency
Usability issues
Grid interoperability issues
Benefits
Grids don’t increase resources – they make usage of existing resources more efficient
• Load balancing, putting idle resources to use
Handling of large computations or data sets that aren’t possible within a single site (for example CERN LHC)
Increased collaboration
Grids are still developing, but already offer good opportunities.