The CRI compute cluster – CRUK Cambridge Research Institute
The CRUK Cambridge Research Institute
Founded to enable translational research:
- Basic biology
- Early phase clinical trials
- Late phase translational studies
Leveraging the specialist experience and facilities provided by:
- The University of Cambridge
- Addenbrooke’s Hospital
A CRUK facility, hosting:
- CRUK core services (including Information Systems)
- CRUK Groups and Group Leaders
- Cambridge University Groups and Group Leaders
Research objectives with significant Information Systems demands
- Genomics: clonal sequencing (Solexa) generating ~32TB per annum per sequencer, using 8-16 CPU cores full time
- Histopathology: scanners generating 16TB per annum
- Microscopy: generating 8+TB per annum; processing time series sequences
- In vivo imaging: MRI, PET-CT
- Systems Biology: 20+ systems biology researchers working on expression data, network models etc.
Multiple groups, similar requirements
[Diagram: MRI imaging, Genomics, Bioinformatics, the Tavare Group and the Institute all require high performance compute and long term storage.]
2007/2008 Architectural consolidation
[Diagram: MRI imaging, Genomics and Bioinformatics workloads consolidated onto an HP blade cluster with HP Lustre SFS storage, a MacOS X SAN for I/O storage, and long term storage.]
“Virtual” group infrastructure using LSF
[Diagram: the same hardware (HP blade cluster, HP Lustre SFS storage, MacOS X SAN I/O storage, long term storage) partitioned into per-group “virtual” infrastructures for Genomics, MRI imaging, Bioinformatics, the Tavare Group and the Institute, using storage policies and the Platform LSF job scheduler.]
2008/2009 Storage consolidation
[Diagram: I/O storage consolidated onto an EMC SAN alongside the HP blade cluster, HP Lustre SFS storage and long term storage; Genomics, MRI imaging, Bioinformatics, the Tavare Group and the Institute remain partitioned by storage policies and the Platform LSF job scheduler.]
The CRUK CRI cluster
[Diagram: head node, blade nodes and I/O nodes connected by the cluster networking to SFS storage, I/O storage, and dedicated Solexa, Aperio and Ariol storage.]
[Diagram: a desktop client delivers input files and collects output files, submitting work through LSF; the Linux home directories hold shared binaries for the blades, /data is used for input and output to the network, /usr/local/bin holds shared binaries, and /lustre provides high performance storage.]
Seeing the cluster from the desktop
The I/O storage and Linux home directories are visible from the CRI network:
Filesystems
- /home (100GB): Linux home directories. Visible from all the cluster nodes. Use for local code, scripts etc. Backed up.
- /data (2.7TB): Use for delivering data to and from the cluster. Lower performance to the blades – not used for processing. Not backed up; files over 2 weeks old may be deleted without warning.
- /lustre (16TB): High performance; use for processing. Not backed up; files over 1 month old may be deleted without warning (the retention windows are illustrated in the sketch below).
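The slides state these retention windows as policy but do not describe the cleanup mechanism itself, so the following is only an illustrative sketch of what “files over N days old may be deleted without warning” means in practice: a Python sweep that reports (rather than deletes) files on /data older than two weeks and files on /lustre older than one month. The paths and day limits come from the list above; everything else is an assumption for illustration.

```python
import time
from pathlib import Path

# Retention windows taken from the filesystem list above: roughly two weeks
# on /data and one month on /lustre. This sketch only reports candidate
# files; the Institute's actual cleanup process is not described on the slides.
RETENTION_DAYS = {"/data": 14, "/lustre": 30}

now = time.time()
for root, max_days in RETENTION_DAYS.items():
    cutoff = now - max_days * 24 * 3600
    for path in Path(root).rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            print(f"older than {max_days} days, eligible for deletion: {path}")
```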
Platform LSF - Queue structure
Ownership of blades:
Core facilities
- Genomics: Genomics (6x8 cores)
- Imaging: Imaging (3x8 cores)
- Bioinformatics: bioinformatics
- Information Systems: information_systems
Groups
- Tavare Lab: stlab (18x8 cores), high_memory (2x8 cores)
- Other Groups: cluster (4x8 cores)
…But ownership doesn’t necessarily match daily usage patterns (an example queue-targeted submission follows below).
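As a concrete example of using these queues, here is a minimal submission sketch written in Python (an equivalent bsub command line at a shell prompt would do just as well). The queue choice, core count, script name and data paths are hypothetical; the bsub options shown (-q, -n, -J, -o, -e) are standard LSF options, and the filesystem roles follow the previous slide: input delivered via /data, processing on /lustre, logs kept in the backed-up home directory.

```python
import subprocess
from pathlib import Path

# Hypothetical job: a Tavare Lab member targeting the stlab queue.
queue = "stlab"                                  # use the queue your group owns
input_file = "/data/myuser/sample_001.fastq"     # delivered from the desktop via /data
scratch_dir = "/lustre/myuser/sample_001"        # created by the job script on the blades
log_dir = Path.home() / "logs"                   # home is backed up and visible everywhere
log_dir.mkdir(parents=True, exist_ok=True)

# run_analysis.sh is a placeholder for the real pipeline; it should live in the
# home directory (or /usr/local/bin), both of which every blade can see.
subprocess.run(
    [
        "bsub",
        "-q", queue,
        "-n", "8",                               # one blade's worth of cores
        "-J", "sample_001",                      # job name
        "-o", str(log_dir / "sample_001.%J.out"),
        "-e", str(log_dir / "sample_001.%J.err"),
        "./run_analysis.sh", input_file, scratch_dir,
    ],
    check=True,
)
```

The submission is assumed to run from the head node (or any machine where LSF and the cluster filesystems are mounted); the desktop itself only sees the I/O storage and the Linux home directories.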
Balanced Scheduling – Fairshare Policy

Group                 Share
Simon Tavaré Group       20
Genomics                  5
Bioinformatics            6
Imaging                   3
General                   4
DynamicPriority = number_of_shares / (cpu_time * CPU_TIME_FACTOR + run_time * RUN_TIME_FACTOR + (1 + job_slots) * RUN_JOB_FACTOR)
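This is LSF’s fairshare expression: accumulated CPU time, run time, and currently running job slots all erode a group’s priority relative to its allocated shares, so lightly used groups win the next dispatch. The sketch below evaluates it for two hypothetical usage profiles using shares from the table above; the factor values are illustrative placeholders, not the Institute’s actual tuning.

```python
# Illustrative evaluation of the fairshare dynamic priority formula above.
# The *_FACTOR values are placeholders chosen for the example; the real values
# are set by the cluster administrators and are not given on the slides.
CPU_TIME_FACTOR = 0.7
RUN_TIME_FACTOR = 0.7
RUN_JOB_FACTOR = 3.0

def dynamic_priority(shares, cpu_time, run_time, job_slots):
    """number_of_shares / (cpu_time*CPU_TIME_FACTOR + run_time*RUN_TIME_FACTOR
    + (1 + job_slots)*RUN_JOB_FACTOR), as on the slide."""
    return shares / (
        cpu_time * CPU_TIME_FACTOR
        + run_time * RUN_TIME_FACTOR
        + (1 + job_slots) * RUN_JOB_FACTOR
    )

# Hypothetical usage: the Tavaré group (20 shares) has been running heavily,
# Imaging (3 shares) has been idle, so Imaging's next job gets higher priority.
print(dynamic_priority(shares=20, cpu_time=5000, run_time=2000, job_slots=100))  # ~0.004
print(dynamic_priority(shares=3,  cpu_time=0,    run_time=0,    job_slots=0))    # 1.0
```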
Information Systems Processes
User accounts
- Managed via the central Service Desk
- Linux accounts bound to AD (Windows/Mac) accounts
Troubleshooting
- Linux support in London and Cambridge
- Accessed via the Service Desk
Software installation
- All blades share binaries
- Users can put local code in their home directory
- The IS department will install common code in /usr/local/bin
Summary: The CRUK Cambridge Research Institute is delivering a shared computational science infrastructure
Principle
• “Virtualisation” to make scalable, easy to administer systems
• Common architecture to deliver cost and service benefits
Practice
• Blade architecture suitable for most computing needs
• Networking and storage need careful design
Benefits
• Optimal use of resources
• Low wastage
• Excess capacity “buffers” new experimental and development techniques
• …to date, provision of compute power hasn’t limited science at the CRI