Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.
![Page 1: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/1.jpg)
Using GENI for computational science
Ilya Baldin
RENCI, UNC – Chapel Hill
![Page 2: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/2.jpg)
Networked Clouds
Cloud and Network Providers
Observatory
Wind tunnel
Science Workflows
![Page 3: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/3.jpg)
ExoGENI Testbed
![Page 4: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/4.jpg)
Computational/Data Science Projects on ExoGENI
• ADAMANT – Building tools for enabling workflow-based scientific applications on dynamic infrastructure (RENCI, Duke, USC/ISI)
• RADII – Building tools for supporting collaborative data-driven science (RENCI)
• GENI ScienceShakedown – ADCIRC storm surge modeling on GENI
• Goal of this presentation: to demonstrate some of the things that are possible with GENI today
![Page 5: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/5.jpg)
ADAMANT
![Page 6: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/6.jpg)
Scientific Workflows – Dynamic Use Case
![Page 7: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/7.jpg)
CC-NIE ADAMANT – Pegasus/ExoGENI
• Network Infrastructure-as-a-Service (NIaaS) for workflow-driven applications
– Tools for workflows integrated with adaptive infrastructure
• Workflows triggering adaptive infrastructure
– Pegasus workflows using ExoGENI
– Adapt to application demands (compute, network, storage)
– Integrate data movement into NIaaS (on-ramps)
• Target applications
– Montage Galactic plane ensemble: Astronomy mosaics
– Genomics: High-Throughput Sequencing
![Page 8: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/8.jpg)
ExoGENI: Enabling Features for Workflows
• On-Ramps / Stitchports
– Connect ExoGENI to existing static infrastructure to import/export
• Storage slivering
– Networked storage: iSCSI target on dataplane
– Neuca tools attach LUN, format and mount filesystem
• Inter-domain links, multipoint broadcast networks
![Page 9: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/9.jpg)
Computational workflows in Genomics
• Several versions as we scaled:
– Single machine
– Cluster based
– MapSeq: specialized code & Condor
– Pegasus & Condor
RNA-Seq
WGS
![Page 10: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/10.jpg)
Goal: learning to use NIaaS for biomedical research
Figure: VMs drawn from cloud providers (compute, data) and network providers, grouped into user- or workflow-provisioned and isolated slices (Slice 1, Slice 2)
![Page 11: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/11.jpg)
Goal: Management of data flows in NIaaS
Figure: iRODS Data Grid (iCAT, RE) at RENCI and UNC, connected to the VMs of Slice 2
• Layer 2 connection within the slice
• Metadata control
– Lab X can compute on Project Y data in the cloud
– User X can move data from Study A to the cloud
– Data from Study W cannot remain on cloud resources
• Ease of access
• Control over access
• Auditing
• Provenance
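The metadata-control rules above can be read as per-dataset access policies. The following is a minimal sketch of that idea in Python, not the iRODS rule language and not the project's actual implementation: the `Policy` class, the `POLICIES` table, the action names and the default-deny behaviour are all assumptions made for illustration. Only the three example rules come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical metadata rule attached to one dataset."""
    compute_allowed: frozenset  # who may compute on the data in the cloud
    move_allowed: frozenset     # who may move the data to the cloud
    may_persist: bool           # whether copies may remain on cloud resources


# The three example rules from the slide, encoded as metadata.
# Fields the slide does not mention are left empty (an assumption).
POLICIES = {
    "Project Y": Policy(frozenset({"Lab X"}), frozenset(), True),
    "Study A": Policy(frozenset(), frozenset({"User X"}), True),
    "Study W": Policy(frozenset(), frozenset(), False),
}

AUDIT_LOG = []  # every decision is recorded, in the spirit of "Auditing"


def is_allowed(principal, action, dataset):
    """Return True if `principal` may perform `action` on `dataset`."""
    policy = POLICIES.get(dataset)
    if policy is None:
        allowed = False  # default deny for datasets without metadata
    elif action == "compute":
        allowed = principal in policy.compute_allowed
    elif action == "move_to_cloud":
        allowed = principal in policy.move_allowed
    elif action == "persist":
        allowed = policy.may_persist
    else:
        allowed = False  # unknown actions are denied
    AUDIT_LOG.append((principal, action, dataset, allowed))
    return allowed


if __name__ == "__main__":
    print(is_allowed("Lab X", "compute", "Project Y"))
    print(is_allowed("User X", "move_to_cloud", "Study A"))
    print(is_allowed("User X", "persist", "Study W"))
```

Keeping the rules as data rather than code is one way to let the same checks drive both access control and the audit trail.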
![Page 12: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/12.jpg)
Example ExoGENI requests auto-generated
![Page 13: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/13.jpg)
Application to NIaaS - Architecture
![Page 14: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/14.jpg)
RADII
![Page 15: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/15.jpg)
RADII
• RADII: Resource Aware Data-centric Collaboration Infrastructure
– Middleware to facilitate data-driven collaborations for domain researchers and a commodity to the science community
– Reducing the large gap between procuring the required infrastructure and managing data transfers efficiently
• Integration of data-grid (iRODS) and NIaaS (ORCA) technologies on ExoGENI infrastructure
– Novel tools to map data processes, computations, storage and organization entities onto infrastructure with an intuitive GUI-based application
– Novel data-centric resource management mechanisms for provisioning and de-provisioning resources dynamically throughout the lifecycle of collaborations
![Page 16: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/16.jpg)
Why iRODS in RADII?
– RADII Policies to iRODS Rule Language
• Easy to map policies to iRODS Dynamic PEP
• Reduced complexity for RADII
– Distributed and Elastic Data Grid
– Resource Monitoring Framework
– Geo-aware Resource hierarchy creation via composable iRODS
– Metadata tagging
![Page 17: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/17.jpg)
Resource Awareness
• iRODS RMS provides node-specific resource utilization
• End-to-end parameters such as throughput and current network flow are important for judicious placement, replication and retrieval decisions
• Created end-to-end monitoring of throughput, latency and instantaneous transfer rate (RX/TX per second)
• The best server is selected based on an end-to-end utility value:
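The utility formula itself appears only in the slide image. As a rough illustration of utility-based server selection, the sketch below assumes a simple weighted sum of normalized throughput, latency and node utilization. The weights, the normalization bounds and all names (`ServerMetrics`, `utility`, `best_server`) are hypothetical and are not taken from RADII.

```python
from dataclasses import dataclass


@dataclass
class ServerMetrics:
    name: str
    throughput_mbps: float  # measured end-to-end throughput to the server
    latency_ms: float       # measured end-to-end latency to the server
    cpu_util: float         # node-level utilization in [0, 1]


def utility(m, w_tp=0.5, w_lat=0.3, w_cpu=0.2, max_tp=10_000.0, max_lat=200.0):
    """Assumed utility: higher throughput, lower latency, idler node score higher."""
    tp_score = min(m.throughput_mbps / max_tp, 1.0)
    lat_score = 1.0 - min(m.latency_ms / max_lat, 1.0)
    idle_score = 1.0 - m.cpu_util
    return w_tp * tp_score + w_lat * lat_score + w_cpu * idle_score


def best_server(candidates):
    """Pick the candidate with the highest utility value."""
    return max(candidates, key=utility)


if __name__ == "__main__":
    # Made-up measurements for two placeholder servers.
    candidates = [
        ServerMetrics("server-a", throughput_mbps=900.0, latency_ms=40.0, cpu_util=0.2),
        ServerMetrics("server-b", throughput_mbps=300.0, latency_ms=80.0, cpu_util=0.7),
    ]
    print(best_server(candidates).name)
```

A weighted sum is only one possible choice; any monotone scoring of the monitored parameters would fit the same selection step.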
![Page 18: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/18.jpg)
Experiment Topology
Figure: Experimental Setup Topology
![Page 19: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/19.jpg)
Experimental Setup
• The sites were: UCD, SL, UH, FIU
• Parallel and multithreaded file ingestion from each of the clients
• Total 400GB file ingestion from each client
• One copy at the edge node and another replicated based on utility value
![Page 20: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/20.jpg)
Edge Put and Remote Replication Time
Figure: Edge Node Put Time
Figure: Remote Replication Time
![Page 21: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/21.jpg)
ScienceShakedown
![Page 22: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/22.jpg)
Motivation
Hurricane Sandy (2012)
![Page 23: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/23.jpg)
Motivation
Real-time, on-demand computations of storm surge impacts
• Hazards to coastal areas a major concern
• Hazard/Threat Information needed ASAP (Urgently)
• Critical need for:
– detailed high spatial resolution
– large compute resources
• Federal Forecast cycle every 6 hrs
• Must be well within Cycle to be relevant/useful
• I.e., New information at 5:59 is already old!!!
![Page 24: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/24.jpg)
Computing Storm Surge
• ADCIRC Storm Surge Model
– FEMA-approved for Coastal Flood Insurance Studies
– Very high spatial resolution (millions of triangles)
– Typically use 256-1024 cores for real-time (one simulation!)
ADCIRC grid for coastal North Carolina
![Page 25: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/25.jpg)
Tackling Uncertainty
Research Ensemble
NSF Hazards SEES project
22 members, H. Floyd (1999)
One simulation is NOT enough!
Probabilistic Assessment of Hurricanes
A “few” likely hurricanes
Fully dynamic atmosphere (WRF)
![Page 26: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/26.jpg)
Why GENI?
• Current limitations: Real-time demands for compute resources
– Large demands for real-time compute resources during storms
– Not enough demand to dedicate a cluster year-round
![Page 27: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/27.jpg)
Why GENI?
• GENI enables
– Federation of resources
– Cloud bursting, urgent, on-demand
– High-speed data transfers to/from/between remote resources
– Replicate data/compute across geographic areas
• Resiliency, performance
![Page 28: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/28.jpg)
Storm Surge Workflow
Parallel task (32 Core MPI)
• Each ensemble member is a high-performance parallel task that calculates one storm
Figure: 32 compute cores making up one parallel task
![Page 29: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/29.jpg)
Slice Topology
• 11 GENI sites (1 ensemble manager, 10 compute sites)
• Topology: 92 VMs (368 cores), 10 inter-domain VLANs, 1 TB iSCSI storage
• HPC compute nodes: 80 compute nodes (320 cores) from 10 sites
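The slice figures above are consistent with one 32-core MPI task per compute site. The short calculation below checks that arithmetic. It assumes that all VMs have the same number of cores and that the 80 compute nodes are spread evenly across the 10 compute sites, neither of which is stated on the slide.

```python
# Figures taken from the slides.
compute_sites = 10
compute_nodes = 80
compute_cores = 320
total_vms = 92
total_cores = 368
cores_per_task = 32  # "Parallel task (32 Core MPI)"

# Derived values, assuming uniform VM sizes and an even spread across sites.
cores_per_vm = total_cores // total_vms           # 368 / 92 = 4
cores_per_node = compute_cores // compute_nodes   # 320 / 80 = 4
nodes_per_site = compute_nodes // compute_sites   # 80 / 10 = 8
cores_per_site = nodes_per_site * cores_per_node  # 8 * 4 = 32
concurrent_tasks = compute_cores // cores_per_task  # 320 / 32 = 10
other_vms = total_vms - compute_nodes             # 92 - 80 = 12 non-compute VMs

print(f"cores per VM: {cores_per_vm}")
print(f"compute nodes per site: {nodes_per_site}")
print(f"cores per site: {cores_per_site}")
print(f"32-core tasks that fit at once: {concurrent_tasks}")
print(f"VMs not used as compute nodes: {other_vms}")
```

Under these assumptions each compute site holds exactly one ensemble member at a time, so ten storms can be simulated concurrently.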
![Page 30: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/30.jpg)
ADCIRC Results from GENI
Storm Surge for 6 simulations
Figure: storm surge results for ensemble members N01, N11, N14, N16, N17 and N20, ranging from small threat to big threat
![Page 31: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/31.jpg)
Conclusions
• The GENI testbed is a kind of shared infrastructure suitable for prototyping solutions for some computational science domains
• GENI technologies are a collection of enabling mechanisms that can provide a foundation for future federated science cyberinfrastructure
• Different members of the GENI federation offer different capabilities to their users, suitable for a variety of problems
![Page 32: Using GENI for computational science Ilya Baldin RENCI, UNC – Chapel Hill.](https://reader036.fdocuments.net/reader036/viewer/2022081603/56649f355503460f94c53191/html5/thumbnails/32.jpg)
Thank you!
• Funders
• Partners