Merit Allocation Training 2022 - support.pawsey.org.au
Table of Contents
Photo Credit: The Rottnest Island Authority
• Capital Refresh Overview
• Setonix Overview (Hardware and Software available)
• Accounting Model
• Merit Allocation Schemes
• How do I apply?
Capital Refresh Overview
• Capital Refresh Overview
• Computational Capability
• Setonix Availability Timeline
Capital Refresh Overview
• Several initial procurements completed
  • Garrawarla compute cluster
  • Astronomy high speed storage
  • Cloud high throughput computing
• Acacia object storage system installation in progress
• Setonix will be arriving soon
  • Phase 1: CPU-based
  • Phase 2: GPU-based and additional CPUs
Computational Capability
What is the increase in computational capacity?
• Merit allocations are not bound to specific partitions, which supports large-scale jobs using most of Setonix
• Significant increase in computational capacity for the available schemes on Setonix
• Double-precision floating-point capacity by system:
System             Capacity
Magnus             1.1 petaflops
Setonix Phase 1    2.7 petaflops
Setonix Phase 2    50 petaflops
Setonix Availability Timeline
How does it affect merit applications?
• Setonix's full capacity will be available to researchers in the 2023 allocation schemes
• In 2022, allocations will go through a transition from the current model to the new model
Setonix Overview
• Setonix Hardware Overview
• Key Hardware Changes
• Storage Overview
• Software Overview & Changes
Setonix Hardware Overview
Phase 1 provides
• CPU compute
• Fast interconnect
• High Memory & Visualisation Nodes
• High-performance filesystems (Lustre)
Phase 2 will add
• Additional CPU compute
• Production-level GPU compute
• Slingshot upgrade to 200+ Gb/s
Acacia system
• Large-volume storage (Object Store, Ceph, S3)
Key Hardware Changes
• Moving from 24-core Intel nodes to 128-core AMD nodes
• Node memory increases from 64 GB to 256 GB (more memory per node)
• Memory per core decreases from 2.5 GB to 2 GB (slightly less memory per core)
• Moving from exclusive node access to shared node access
• Project storage on /group will move to the Acacia object store
• Software installations on /group will move to the /software file system
Storage Overview
Supercomputing File Systems
/home
• Like current Pawsey systems, minimal storage (NFS)
/software
• Lustre storage used for software; replaces some functionality of /group
/scratch
• Fast Lustre workflow storage. Data should be moved in from, and out to, the Object Store or offsite storage
Acacia Object Store
• Large-volume project storage, uses S3 interface
IMPORTANT: /group will no longer exist
Software Overview & Changes
Overview
• HPE/Cray provides optimised compilers, libraries and tools
• Pawsey supports software used in many scientific domains
Key Changes
• Moving from Intel architecture to AMD means a move from Intel compilers to AMD & Clang-based compilers
• Move from OpenACC to OpenMP
• Move from the MAALI installation tool to Spack
Accounting Model
• What is an accounting model?
• The Previous Accounting Model: Magnus
• The New Accounting Model: Setonix
• Setonix Accounting Model Examples
• The Accounting Model: Setonix vs Gadi
What is an accounting model?
The accounting model determines what a user is charged for, and how much.
• Traditionally, the consumable resource is the hourly usage of CPU cores of a supercomputer.
• Service Unit (SU) is the unit of measure for consumable supercomputing resources.
  • 1 SU is equivalent to 1 hour use of 1 CPU core.
  • Cost of a job (SU): number of CPU cores requested × wall time (h).
Examples:
• 1 SU = 1 hour use of 1 CPU core = ½ hour use of 2 CPU cores.
• 576 SU = 24 hours on 1 Magnus node (24 CPU cores) = 4.5 hours on 1 Setonix node (128 CPU cores)
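The basic cost rule can be checked with a few lines of Python. This is a minimal sketch of the cores × wall time formula from this slide; the function name job_cost_su is hypothetical and is not part of any Pawsey tool.

```python
def job_cost_su(cores: int, walltime_hours: float) -> float:
    """Cost of a job in SU: requested CPU cores multiplied by wall time in hours."""
    return cores * walltime_hours


# The examples from the slide:
print(job_cost_su(1, 1.0))    # 1 core for 1 hour
print(job_cost_su(2, 0.5))    # 2 cores for half an hour
print(job_cost_su(24, 24.0))  # 1 Magnus node (24 cores) for 24 hours
print(job_cost_su(128, 4.5))  # 1 Setonix node (128 cores) for 4.5 hours
```

The last two calls give the same 576 SU, which is why a full Setonix node consumes a day's worth of Magnus node budget in 4.5 hours.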
The Previous Accounting Model: Magnus
• Consumable resource: hourly usage of CPU cores.
• 1 SU = 1 hour use of 1 CPU core
Exclusive node usage:
• Resources are allocated and charged for at a compute node granularity.
• At any time, at most one job has access to cores in a node.
• If a job doesn't use all the cores in a node, the consumable resource is wasted.
• Hence, it is also charged for time on idle cores.
• Cost of a job (SU): 24 × nodes × wall time.
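The effect of exclusive node charging can be sketched as follows. The function name is hypothetical, and the sketch assumes a job is packed onto as few 24-core nodes as possible; real jobs may request nodes differently.

```python
import math

MAGNUS_CORES_PER_NODE = 24  # from the slide: 24 CPU cores per Magnus node


def magnus_job_cost_su(cores_needed: int, walltime_hours: float) -> float:
    """Exclusive-node charging: whole nodes are billed, idle cores included."""
    # Assumption: the job is packed onto the fewest possible nodes.
    nodes = math.ceil(cores_needed / MAGNUS_CORES_PER_NODE)
    return MAGNUS_CORES_PER_NODE * nodes * walltime_hours


# A job that needs only 30 cores still occupies 2 full nodes (48 cores).
print(magnus_job_cost_su(30, 2.0))
```

Here the 30-core job is charged for 48 cores over 2 hours, so 18 idle cores are paid for as well.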
[Figure: Exclusive Node Use on Magnus]
The New Accounting Model: Setonix
• Consumable resource: hourly usage of CPU cores (CPUs).
• 1 SU = 1 hour use of 1 CPU core
Proportional node usage:
• Resources are allocated at a sub-node granularity.
• Multiple jobs can run on the same node.
• A job is charged for the largest fraction of resources used.
• RAM consumption is mapped to CPU consumption.
• RAM consumption by 1 job may affect other jobs on the same node.
• Cost of a job (SU): 128 × largest fraction × nodes × wall time (h).
  • Min: 1 SU per hour
  • Max: 128 SU per hour per node
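A minimal sketch of proportional charging, under stated assumptions: the largest fraction is scaled by the 128 cores of a node, so that a fully used node costs the stated maximum of 128 SU per hour; the RAM fraction is taken against 256 GB per node (the figure quoted under Key Hardware Changes, the memory actually available to jobs may be lower); and the 1 SU per hour minimum is applied to the whole job. The function name and signature are hypothetical.

```python
SETONIX_CORES_PER_NODE = 128   # from the slides
SETONIX_MEMORY_GB_PER_NODE = 256  # assumption: full node memory is schedulable


def setonix_job_cost_su(cores_per_node: int, memory_gb_per_node: float,
                        walltime_hours: float, nodes: int = 1) -> float:
    """Charge for the larger of the CPU and RAM fractions requested on each node."""
    cpu_fraction = cores_per_node / SETONIX_CORES_PER_NODE
    ram_fraction = memory_gb_per_node / SETONIX_MEMORY_GB_PER_NODE
    largest_fraction = max(cpu_fraction, ram_fraction)
    su_per_hour = SETONIX_CORES_PER_NODE * largest_fraction * nodes
    # Assumption: the stated minimum of 1 SU per hour applies per job.
    return max(1.0, su_per_hour) * walltime_hours


# 16 cores but half of the node's memory: the RAM fraction (1/2) sets the charge.
print(setonix_job_cost_su(16, 128, 2.0))
```

In this example the job uses only one eighth of the cores, yet it is charged as if it used 64 cores, because its memory request blocks half of the node for other jobs.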
[Figure: Proportional Node Use on Setonix Phase 1]
Setonix Accounting Model Examples
Examples of Setonix proportional node usage
Example 1: RAM proportion (2/3) is bigger than CPU cores proportion (½).
Example 2: CPU cores proportion (2/3) is bigger than RAM proportion (½).
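The two examples can be worked through numerically. This sketch assumes the charged fraction is scaled by the 128 cores of a Setonix node to give SU per node hour; the helper name charged_fraction is hypothetical.

```python
from fractions import Fraction

CORES_PER_NODE = 128  # Setonix node size, from the slides


def charged_fraction(cpu_fraction: Fraction, ram_fraction: Fraction) -> Fraction:
    """The job is charged for whichever proportion of the node is larger."""
    return max(cpu_fraction, ram_fraction)


# (CPU cores proportion, RAM proportion) for each example on the slide
examples = {
    "Example 1": (Fraction(1, 2), Fraction(2, 3)),
    "Example 2": (Fraction(2, 3), Fraction(1, 2)),
}

for name, (cpu, ram) in examples.items():
    fraction = charged_fraction(cpu, ram)
    su_per_node_hour = CORES_PER_NODE * fraction
    print(f"{name}: charged fraction {fraction}, "
          f"about {float(su_per_node_hour):.1f} SU per node hour")
```

Both examples are charged for two thirds of a node, one because of RAM and the other because of CPU cores.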
The Accounting Model – Setonix vs Gadi
Comparing the accounting models of Setonix and Gadi
When applying for NCMAS, remember that NCI charges 2 service units for 1 hour use of 1 core on Gadi.
Service Units charged per resource:
Resources         Setonix (128 cores per node)   Gadi (48 cores per node)
1 CPU core / h    1                              2
1 CPU / h         64                             48
1 Node / h        128                            96
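When preparing an NCMAS application that covers both systems, the same core-hour estimate translates into different SU figures. The sketch below uses only the rates and node sizes from the table above; the function and dictionary names are hypothetical.

```python
# Rates and node sizes copied from the comparison table
SU_PER_CORE_HOUR = {"Setonix": 1, "Gadi": 2}
CORES_PER_NODE = {"Setonix": 128, "Gadi": 48}


def service_units(system: str, cores: int, hours: float) -> float:
    """SU charged on the given system for using `cores` cores for `hours` hours."""
    return SU_PER_CORE_HOUR[system] * cores * hours


def node_hour_su(system: str) -> float:
    """SU charged for one full node for one hour."""
    return service_units(system, CORES_PER_NODE[system], 1)


for system in ("Setonix", "Gadi"):
    print(system, "node hour:", node_hour_su(system), "SU;",
          "10,000 core hours:", service_units(system, 10_000, 1), "SU")
```

A workload of 10,000 core hours therefore needs 10,000 SU on Setonix but 20,000 SU on Gadi.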
Merit Allocation Schemes
• Merit Allocation Schemes on Setonix
• Timeline
• Early Access and GPU Migration
Merit Allocation Schemes on Setonix
Applications for 2022 merit allocations are open for the Setonix CPU partition:
The National Computational Merit Allocation Scheme (NCMAS)
• Annual allocation call open to the whole Australian research community
• Meritorious, computational research projects
The Pawsey Partner Merit Allocation Scheme
• Annual call open to researchers in Pawsey Partner institutions
• Meritorious, computational research projects
• Partner institutions: CSIRO, Curtin University, Edith Cowan University, Murdoch University and The University of Western Australia
NOTE: The Pawsey Energy & Resources Merit Allocation Scheme will be discontinued. From 2022, there are no more calls for this Scheme. Researchers from the Australian energy and resources research community are encouraged to apply through the NCMAS and Pawsey Partner schemes.
Timeline
Dates                        Milestone
18 August 2021               Applications open
5 October 2021               Applications close (5pm AWST)
1-3 December 2021            Allocation Committee meeting
21 December 2021             Allocations announced
1 Jan 2022                   Access to allocations commences
The National Computational Merit Allocation Scheme (NCMAS)
The Pawsey Partner Merit Allocation Scheme
Dates                        Milestone
31 August 2021               Applications open
11 October 2021              Applications close (5pm AoE - Anywhere on Earth Time)
2nd half of December 2021    Allocation Committee meeting
22 December 2021             Allocations announced
1 Jan 2022                   Access to allocations commences
Early Access and GPU Migration
Researchers can apply for early access to Setonix resources separately to the Merit Allocation Calls:
• Setonix CPU Early Adopters EOI will be sent to all current Magnus projects in Q4 2021
• Setonix GPU Early Science Call will be available for 2H 2022
Topaz GPU cluster will be available for GPU migration purposes:
• Access will be provided to Merit Allocation projects on request
• Programming environment supports AMD GPU porting (with HIP and OpenMP)
• Container environment supports AI/ML workloads
How do I apply?
• Setonix Allocation Requests in 2022
• Examples: 1st and 2nd Requests
• Benchmarking and Scaling
• Magnus vs Setonix Benchmarks Comparison
• Application Portal Information
• Help & Further Assistance
• Questions?
Setonix Allocation Requests in 2022
Scheme           Item              1st Request (full year)   2nd Request (2H 2022 pro rata)
NCMAS            Total capacity    100 MSU                   150 MSU
NCMAS            Minimum request   250 kSU                   1 MSU
Pawsey Partner   Total capacity    110 MSU                   190 MSU
Pawsey Partner   Minimum request   100 kSU                   1 MSU
In 2022 researchers applying through NCMAS and Pawsey Partner Schemes will do so separately for Setonix Phase 1 (1st Request) and Setonix Phase 2 (2nd Request) CPU allocations.
NOTE: 1 kSU = 1,000 SU, 1 MSU = 1,000,000 SU
Examples: 1st and 2nd Requests
Example 1: Research Group A
Research Group A was awarded:
• 1st Request: 2 MSU, and
• 2nd Request: 10 MSU
Setonix Phase 2 becomes available for researchers on the first day of Q4 2022.
The real allocation of Research Group A is:
• Setonix Phase 1 available throughout the year: 2 MSU
• Setonix Phase 2 available in Q4 2022: 5 MSU
Example 2: Research Group B
Research Group B was awarded:
• 1st Request: 1 MSU, and
• 2nd Request: 5 MSU
Setonix Phase 2 becomes available for researchers on the first day of 2H 2022.
The real allocation of Research Group B is:
• Setonix Phase 1 available throughout the year: 1 MSU
• Setonix Phase 2 available in 2H 2022: 5 MSU
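The pro rata adjustment in the two examples can be reproduced with a short calculation. This sketch assumes the 2nd Request is awarded for the six months of 2H 2022 and is scaled linearly by the number of months Setonix Phase 2 is actually available; the function name and the linear rule are assumptions, not a statement of how the schemes compute it.

```python
from fractions import Fraction


def effective_second_request(awarded_msu: int, months_available: int,
                             period_months: int = 6) -> Fraction:
    """Scale a 2H 2022 award by the share of the half year Phase 2 is available."""
    return awarded_msu * Fraction(months_available, period_months)


# Research Group A: 10 MSU awarded, Phase 2 available from Q4 (3 of 6 months)
group_a = effective_second_request(10, 3)
# Research Group B: 5 MSU awarded, Phase 2 available for all of 2H (6 of 6 months)
group_b = effective_second_request(5, 6)

print("Group A, Phase 2 allocation (MSU):", group_a)
print("Group B, Phase 2 allocation (MSU):", group_b)
print("Group A total for 2022 (MSU):", 2 + group_a)
print("Group B total for 2022 (MSU):", 1 + group_b)
```

Under these assumptions both groups end up with 5 MSU on Phase 2, matching the slide.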
Benchmarking and Scaling
• Benchmarking and Scaling information is an important part of the application
  • Demonstrates efficient use of a supercomputer
• Include scaling information for a typical job you will run
  • Ideally provide scaling tests of your jobs on Magnus
  • At minimum provide scaling tests of the software on a system
Real-world example using NWChem
Cores   Nodes   Walltime (hours)   Cost (node hours)
128     8       11.2               90
256     16      4.4                70
512     32      2.1                67
1024    64      1.0                64
2048    128     0.91               116
4096    256     0.75               192
• Which configuration is best for you and why?
• The 1024-core run is the most efficient use of core hours and also gives a good job walltime
• Use the chosen configuration to calculate the total allocation you require
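The cost column of the NWChem table can be recomputed, and the cheapest configuration picked, with a short script. This is a sketch using only the table values; the cost is taken as nodes × walltime, which reproduces the rounded node-hour figures on the slide.

```python
# (cores, nodes, walltime_hours) rows copied from the NWChem scaling table
runs = [
    (128, 8, 11.2),
    (256, 16, 4.4),
    (512, 32, 2.1),
    (1024, 64, 1.0),
    (2048, 128, 0.91),
    (4096, 256, 0.75),
]


def node_hours(nodes: int, walltime_hours: float) -> float:
    """Cost of one run in node hours."""
    return nodes * walltime_hours


costs = {cores: node_hours(nodes, hours) for cores, nodes, hours in runs}
best_cores = min(costs, key=costs.get)

for cores, nodes, hours in runs:
    print(f"{cores:5d} cores: {costs[cores]:6.1f} node hours, walltime {hours} h")
print("Lowest cost configuration:", best_cores, "cores")
```

Beyond 1024 cores the walltime barely improves while the cost roughly doubles with each step, which is the pattern reviewers look for in scaling data.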
Magnus vs Setonix Benchmarks Comparison
• Setonix will not be available for benchmarking of codes and workflows for 2022 allocation requests.
• The SU cost per simulation may differ between Magnus and Setonix.
• For codes benchmarked by Pawsey and HPE Cray we have noted:
  • Average of 20% increase in SU cost
  • Shorter time to solution per simulation (up to 3.7x)
We recommend:
• 1st SU request: calculate based on Magnus SU cost per simulation
• 2nd SU request:
  • Ask for additional capacity
  • Add 20% of your first SU request
IMPORTANT: Setonix allows for overuse with reduced priority. Overuse is capped at 50% of the original allocation.
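One possible reading of these recommendations is sketched below. It assumes the 2nd Request is the additional capacity you want plus 20% of the 1st Request, and that overuse lets a project consume up to 150% of its allocation; all function names and the example figures are hypothetical.

```python
def first_request_su(simulations: int, magnus_su_per_simulation: int) -> int:
    """1st Request: planned simulations priced at their Magnus SU cost."""
    return simulations * magnus_su_per_simulation


def second_request_su(additional_su: int, first_request: int,
                      uplift_percent: int = 20) -> int:
    """2nd Request: additional capacity plus 20% of the 1st Request (assumed reading)."""
    return additional_su + first_request * uplift_percent // 100


def usage_ceiling_su(allocation_su: int, overuse_cap_percent: int = 50) -> int:
    """Largest usage possible once overuse (at reduced priority) is included."""
    return allocation_su + allocation_su * overuse_cap_percent // 100


# Hypothetical project: 400 simulations at 5,000 SU each on Magnus
first = first_request_su(400, 5_000)
second = second_request_su(3_000_000, first)
print("1st Request (SU):", first)
print("2nd Request (SU):", second)
print("Usage ceiling on the 1st Request (SU):", usage_ceiling_su(first))
```

Integer arithmetic is used throughout so the SU figures stay exact.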
NCMAS - Application Portal
• Apply for Setonix access online through the NCMAS scheme: http://ncmas.nci.org.au/
Pawsey Partner Scheme - Application Portal
• Apply for Setonix access online via the Pawsey Partner scheme: https://ssl.linklings.net/organizations/pawsey/
Help & Further Assistance
Changes in Supercomputing Services for 2022: https://support.pawsey.org.au/documentation/display/US/Changes+in+Supercomputing+Services+for+2022
Pawsey webpage: https://pawsey.org.au
Pawsey Friends mailing list: https://pawsey.org.au/pawsey-friends/
Pawsey Twitter feed: @PawseyCentre
Pawsey YouTube Channel: https://www.youtube.com/pawseysupercomputingcentre
User Support Portal: https://pawsey.org.au/support/