Large Cluster Workshop

Large-Scale Cluster Computing Workshop
held at Fermilab, 22-25th May 2001

Alan Silverman and Dane Skow
CHEP 2001, Beijing
7th September 2001
Outline
Background and Goals
Attendees
The Challenge to be faced
Format of the Workshop
Panel Summaries
Conclusions
References
Background
Sponsored by HEPIX, in particular by the Large Cluster SIG
In background reading on Grid technologies, we found many papers and USENIX-type talks on cluster techniques, methods and tools.
But often with results and conclusions based on small numbers of nodes.
What is the “real world” doing? Gathering practical experience was the primary goal.
Goals
Understand what exists and what might scale to large clusters (1000-5000 nodes and up).
And by implication, predict what might not scale.
Produce the definitive guide to building and running a cluster: how to choose, select and test the hardware; software installation and upgrade tools; performance management, logging, accounting, alarms, security, and so on.
Maintain this guide.
The Attendees
Participation was targeted at sites with a minimum cluster size (100-200 nodes)
Invitations were sent not only to HENP sites but also to other sciences, including biophysics. We also invited participation by technical representatives from commercial firms (salespeople refused!)
Our target was 50-60 people, chosen to optimise interaction and discussion
64 people registered, 60 attended
The Challenge
Fermilab Run II and CERN LHC experiments will need clusters measured in thousands of nodes
[Figure: Level-1 trigger rate (Hz) against event size (bytes) for HEP experiments, spanning UA1 and LEP through KLOE, HERA-B, KTeV, NA49, H1, ZEUS, CDF, CDF IIa, D0 IIa and LHCB up to ALICE, ATLAS and CMS. The LHC experiments sit at the extreme: high Level-1 trigger rate (1 MHz), high channel count and bandwidth (500 Gbit/s), high data archive (petabytes). For comparison, the plot also marks the rate corresponding to 1 billion people surfing the Web.]
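To put rough numbers on this (my own arithmetic, using round values read off the figure rather than figures quoted at the workshop): an LHC-class experiment with a Level-1 rate of \(10^{5}\) Hz and an event size of \(10^{6}\) bytes would produce

\[ 10^{5}\ \text{Hz} \times 10^{6}\ \text{bytes} = 10^{11}\ \text{bytes/s} = 100\ \text{GB/s} \approx 800\ \text{Gbit/s}, \]

the same order as the 500 Gbit/s bandwidth in the figure; even after heavy filtering, a running year of \(10^{7}\) s of such a stream reaches the petabyte archive scale.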
The Challenge (continued)
Fermilab Run II and CERN LHC experiments will need clusters measured in thousands of nodes
Should or could a cluster emulate a mainframe?
How much can HENP compute models be adjusted to make the most efficient use of clusters?
Where do clusters not make sense?
What is the real total cost of ownership of clusters?
Can we harness the unused CPU power of desktops?
How to use clusters for high I/O applications?
How to design clusters for high availability?
LHC Computing Plans
As proposed by MONARC, computing will be arranged in Tiers where Tier 0 runs at CERN, Tier 1 are Regional Centres, Tier 2 are National Centres and so on down to Tier 4 on desks
Grid Computing will be an important constituent.
But we will still need to manage very large clusters; do we have the tools and the resources?
Many sessions at this conference have already covered MONARC, Grid Computing and associated tools, but few at the fabric level.
Workshop Layout
Apart from a few plenary sessions, typically to set the scale of the problem as compared to where we are today, the workshop was arranged in 2 streams of highly-interactive panels
Each panel was presented with some initial questions to consider as a starting point
Each panel was “seeded” with 2 or 3 short informal talks relevant to the panel topic
The panels were summarised on the last day.
Proceedings are in preparation and will be published soon.
Examples of Cluster Acquisition Procedures
FNAL and CERN have formal tender cycles with technical evaluations. But FNAL can select the bidders, whereas CERN must invite bids Europe-wide and the lowest valid bid wins.
Also, FNAL qualifies N suppliers for 18-24 months, while CERN rebids each major order, lowest bid wins. Variety is the spice of life?
KEK's funding agency demands long-term leases; the switch to PCs was delayed by in-place leases with RISC vendors.
NERSC is funded by user groups buying CPU slices and disc space, but NERSC still decides the configurations and still owns the systems.
Panel A1 - Configuration Management
Identified a number of key tools in use (for example, VA Linux’s VACM and SystemImager, and the Chiba City tools) and some, strangely, not much used in HENP (e.g. cfengine; its convergent style is sketched below)
Tool sharing is not so common - historical constraints, different local environments, less of an intellectual challenge
Almost no prior modelling - previous experience is much more the prime planning “method”
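cfengine's main idea, and perhaps why it deserved more HENP attention, is convergent configuration: declare the desired state of a node and run a tool that repeatedly pulls the node towards it, so drift is repaired rather than accumulating. A minimal sketch of that idea in Python (the managed file and its contents are invented for illustration; a real tool would manage system files such as /etc/motd):

```python
import os

# Desired state: path -> (mode, expected content). Illustrative values only.
DESIRED = {
    "./motd.demo": (0o644, "Authorised users only.\n"),
}

def converge(desired=DESIRED):
    """Pull this node towards the desired state; safe to run repeatedly."""
    for path, (mode, content) in desired.items():
        try:
            with open(path) as f:
                current = f.read()
        except OSError:
            current = None
        if current != content:
            with open(path, "w") as f:   # repair a drifted or missing file
                f.write(content)
        os.chmod(path, mode)             # enforce permissions on every run
        print(f"{path}: converged")

if __name__ == "__main__":
    converge()   # idempotent: a second run changes nothing
```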
Panel A2 - Installation, Upgrading
Dolly+ at KEK uses a logical ring structure for speed (presented earlier this week at CHEP); the ring idea is sketched below
The ROCKS toolkit at San Diego uses vendor tools and stores everything in packages; if you doubt the validity of a node’s configuration, re-install the node
European DataGrid WP4 - a major challenge for its first milestone (due mid-Oct); an interim solution was chosen
Burn-in tests are rare (FNAL and NERSC, yes) - but look at CTCS from VA Linux (handle with care!)
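The attraction of the logical ring is that every node forwards the image to its successor while still receiving it from its predecessor, so the server feeds only one node and total time grows only weakly with cluster size. A back-of-the-envelope comparison in Python (the pipelined-ring model, image size and link speed are my own illustrative choices, not figures from the Dolly+ talk):

```python
def star_time(nodes, image_gb, link_gbps):
    """Central server feeds every node over one shared link: time scales with N."""
    transfer = image_gb * 8 / link_gbps          # seconds for one full copy
    return nodes * transfer

def ring_time(nodes, image_gb, link_gbps, chunk_mb=64):
    """Pipelined ring: each node forwards chunks as they arrive.
    Completion time ~ one full transfer + (N-1) chunk-hop delays."""
    transfer = image_gb * 8 / link_gbps
    chunk = chunk_mb * 8 / 1000 / link_gbps      # seconds per chunk hop
    return transfer + (nodes - 1) * chunk

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"{n:5d} nodes: star {star_time(n, 4, 1)/3600:7.2f} h, "
              f"ring {ring_time(n, 4, 1)/60:6.1f} min")
```

With these assumed numbers (4 GB image, 1 Gbit/s links), a 1000-node install drops from roughly nine hours in the star model to about nine minutes in the ring model.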
Panel A3 - Monitoring
BNL wrote their own tools but use vendors’ tools where possible (e.g. for the AFS and LSF services)
FNAL and CERN started projects (NGOP and PEM respectively) when market surveys produced no tool sufficiently flexible, scalable or affordable
Bought-in tools in this area, at our scales of cluster, are expensive and a lot of work to implement; but one must not forget the ongoing support costs of in-house developments. (The collect-compare-alarm loop common to all these tools is sketched below.)
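Whatever the tool, the skeleton is the same: collect metrics from every node, compare against thresholds, raise alarms. A toy version of that loop in Python (the host names, the fake sampler and the thresholds are all invented; NGOP and PEM are of course far more elaborate):

```python
import random
import time

NODES = [f"node{i:03d}" for i in range(8)]       # hypothetical host names
THRESHOLDS = {"load": 10.0, "disk_pct": 90.0}    # alarm above these values

def sample(node):
    """Stand-in for a real probe (agent, rsh, SNMP); returns fake metrics."""
    return {"load": random.uniform(0, 12), "disk_pct": random.uniform(50, 99)}

def poll_once():
    for node in NODES:
        for name, value in sample(node).items():
            if value > THRESHOLDS[name]:
                print(f"ALARM {node}: {name}={value:.1f} "
                      f"exceeds {THRESHOLDS[name]}")

if __name__ == "__main__":
    for _ in range(3):        # a real monitor would loop forever
        poll_once()
        time.sleep(1)
```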
Panel A4 – Grid Computing
Three relevant efforts – European DataGrid, PPDG and GriPhyN. Refer to presentations earlier in this conference for details of these
Clear parallels and overlaps – it will be important to keep these in mind to avoid developing conflicting schemes which will have common (LHC) users
No PPDG or GriPhyN equivalent of the European DataGrid Work Package 4 – Fabric Management; is this a problem/risk?
Panel B1 – Data Access
Future direction is heavily related to Grid activities.
All tools must be freely available.
Network bandwidth and error rates/recovery can be the bottleneck.
“A single active physics collaborator can generate up to 20 TB of data per year” (Kors Bos, NIKHEF).
A genomics team at the University of Minnesota needed to access “opportunistic cycles” on desktops via Condor: their science has moved so fast that resources scheduled for 2008 are needed now.
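In sustained-rate terms (my own arithmetic, not a figure from the talk), 20 TB per year is

\[ \frac{2 \times 10^{13}\ \text{bytes}}{3.15 \times 10^{7}\ \text{s}} \approx 0.6\ \text{MB/s}, \]

modest as an average, but bursty in practice and multiplied by every active collaborator.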
Panel B2 – CPU, Resource Allocation
30% of the workshop audience used LSF, 30% used PBS, 20% used Condor
FNAL developed FBS and then FBSng; CCIN2P3 developed BQS.
The usual trade-off: the resources needed to develop one's own tool or adapt public-domain tools, against the cost of a commercial tool and less flexibility with regard to features.
Platform (who were represented) claimed to be listening and to understand the issue as regards LSF. (A toy sketch of one scheduling policy, fair share, follows below.)
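At their core, all these systems match a queue of pending jobs to free slots under some policy. A toy fair-share policy in Python (the groups, jobs and charging model are invented for illustration; this is not how LSF, PBS, Condor, FBS or BQS are implemented):

```python
from collections import defaultdict

class FairShareQueue:
    """Toy fair share: dispatch next from the group with least CPU used so far."""
    def __init__(self):
        self.usage = defaultdict(float)   # group -> CPU-hours consumed so far
        self.pending = []                 # (group, job) pairs, FIFO within group

    def submit(self, group, job):
        self.pending.append((group, job))

    def dispatch(self, cpu_hours):
        # Pick the earliest-submitted job of the least-charged group.
        group, job = min(self.pending, key=lambda gj: self.usage[gj[0]])
        self.pending.remove((group, job))
        self.usage[group] += cpu_hours    # charge the group for this job
        return group, job

q = FairShareQueue()
for g, j in [("cms", "reco1"), ("cms", "reco2"), ("atlas", "sim1"), ("cms", "reco3")]:
    q.submit(g, j)
while q.pending:
    print(q.dispatch(cpu_hours=2.0))
```

Dispatching from the least-charged group interleaves work across groups; real schedulers add decay of historical usage, priorities and per-queue limits on top of this.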
Panel B3 - Security
BNL and FNAL were (are) adopting formal Kerberos-based pilot security schemes
Elsewhere the usual procedures are in place – CRACK password checking, firewalls, local security response teams, etc
Many sites, especially those seriously hacked, forbid access from offsite with clear text passwords
Smart cards and certificates are starting to be used
Panel B4 – Load Balancing
For distributed application sharing, use remote file sharing or perform local node re-synchronisation?
Link applications to libraries dynamically (the user’s usual preference) or statically (normally the sys admin’s choice)?
Frequent use of a cluster alias and DNS for load balancing; some quite clever algorithms in use (one simple selection scheme is sketched after this list)
Delegate queue management to users – peer pressure works much better on abusers
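The cluster-alias approach: a single name resolves, lookup by lookup, to whichever member node the balancer currently prefers. One simple selection scheme in Python (the host names, load table and two-candidate rule are illustrative assumptions, not a description of any site's algorithm):

```python
import random

# Hypothetical member nodes of the alias and their last-reported load averages.
LOADS = {"lx01": 2.1, "lx02": 0.4, "lx03": 7.9, "lx04": 0.6}

def pick_target(loads=LOADS, candidates=2):
    """Return the node the alias should resolve to next.
    Choosing randomly among the k least-loaded nodes avoids a stampede
    on a single 'best' host between load updates."""
    best = sorted(loads, key=loads.get)[:candidates]
    return random.choice(best)

if __name__ == "__main__":
    # A DNS front end would publish this answer with a short TTL,
    # then refresh LOADS from the nodes on each update cycle.
    for _ in range(5):
        print(pick_target())
```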
Other Highlights
Introduction to the IEEE Task Force on Cluster Computing: most of us did not know it existed!
Description of the issues facing biophysicists, such as those at the Sanger Centre in the UK
Quote of the week - “a cluster is a great error amplifier” (Chuck Boeheim, SLAC)
Report from the Supercomputer Scalable Cluster Conference: they seem to consist largely (wholly?) of ASCI sites. They are already at the multi-thousand node cluster level but for them money seems to be little problem. They promised to keep in touch though.
Conclusions
How to produce conclusions when the goal was to share experiences and discuss technologies?
Each delegate will have his/her own conclusions, suggestions to follow-up, ideas to investigate, tools to experiment with, and so on.
My Conclusions
It was a valuable sharing of experiences.
Many tools were exposed, some frequently mentioned, others new to many in the audience.
Clusters are here to stay, but they don’t solve every problem and they bring their own, especially in the area of systems administration.
Growing awareness of in-house development costs, but also of management and operational costs.
Don’t forget the resources locked up in desktops.
Cluster Builders Guide
A framework covering all (we hope) aspects of designing, configuring, acquiring, building, installing, administering, monitoring and upgrading a cluster.
Not the only way to do it but it should make cluster owners think of the correct questions to ask and hopefully where to start looking for answers.
Section headings to be filled in as we gain experience.
1. Cluster Design Considerations
   1.1 What are the characteristics of the computational problems?
       1.1.1 Is there a “natural” unit of work?
             1.1.1.1 Executable size
             1.1.1.2 Input data size
             ...
   1.2 What are the characteristics of the budget available?
       1.2.1 What initial investment is available?
       1.2.2 What is the annual budget available?
       ...
...
5. Operations
   5.1 Usage
   5.2 Management
       5.2.1 Installation
       5.2.2 Testing
       ...
Future Meetings
HEPiX (and HEPNT) - Oct 15 to 18, NERSC, LBNL (Berkeley, California); see the web site http://wwwinfo.cern.ch/hepix/ for details.
IEEE TFCC - Oct 8 to 11, Newport Beach, California.
Large Cluster Workshop - late 2002 or early 2003; by invitation, contact me if you are interested in receiving news.
References
Most of the overheads presented at the workshop can be found on the web site http://conferences.fnal.gov/lccws/
There you will also find the full programme and, soon (end October?), the Proceedings (now in preparation), plus some useful cluster links (including many links within the Proceedings).
Other useful links for clusters:
IEEE Cluster Task Force http://www.ieeetfcc.org
Top500 Clusters http://clusters.top500.org