Post on 10-Nov-2021

RAMSES: Robust Analytical Models for Science at Extreme Scale

Presenter: Raj Kettimuthu (Argonne)

PI: Ian Foster (Argonne)

Co-PIs: Gagan Agrawal (Ohio State), Nagi Rao (ORNL), Brad Settlemyer (LANL), Brian Tierney (LBL), and Don Towsley (UMass)

Project Overview

§ Tools: Develop easy-to-use tools to provide end-users with actionable advice
§ Estimation: Develop and apply data-driven estimation methods: differential regression, surrogate models, etc.
§ Modeling: Develop, evaluate, and refine component and end-to-end models
§ Experiments: Conduct extensive, automated experiments to test models and build database

[Diagram components: Experiments, Database, Modeling, Estimation, Advisor, Estimators, Evaluators, Tester, Tools]

Exemplar Science Workflows

§ Five science workflows
§ Span a broad range of DOE science domains and modeling problems
§ File Transfer
§ Light Source Workflows
– Tomographic Reconstruction, Diffuse Scattering
§ Distributed MapReduce
§ In-situ Analysis
§ Exascale Simulations

TCP Throughput Profiles

[Figure: Throughput (Gbps) vs. RTT (ms), with a concave region and a convex region marked]

§ Most common TCP throughput profile: a convex function of RTT
§ Observed dual-mode profiles on emulated connections with 0-366 ms RTT
– CUBIC, STCP: smaller RTT gives a concave region, larger RTT gives a convex region
§ Concave regions are very desirable
– Throughput does not decay as fast, rate of decrease slows down as rtt
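The concave and convex regions of a measured profile can be located numerically. The following is a minimal sketch, assuming the sign of the second divided difference is an adequate test. The function name `classify_regions` and the sample values are hypothetical and are not measurements from the study.

```python
def classify_regions(rtt_ms, gbps, tol=1e-9):
    """Label each interior sample as 'concave', 'convex', or 'linear'.

    The label comes from the change in slope across the sample:
    a slope that becomes more negative means concave, less negative means convex.
    """
    labels = []
    for i in range(1, len(rtt_ms) - 1):
        left = (gbps[i] - gbps[i - 1]) / (rtt_ms[i] - rtt_ms[i - 1])
        right = (gbps[i + 1] - gbps[i]) / (rtt_ms[i + 1] - rtt_ms[i])
        change = right - left
        if change < -tol:
            labels.append("concave")
        elif change > tol:
            labels.append("convex")
        else:
            labels.append("linear")
    return labels


# Made-up dual-mode profile (illustration only, not data from the slides).
sample_rtt = [0, 50, 100, 200, 300, 366]
sample_gbps = [9.5, 9.3, 8.5, 5.0, 3.5, 3.0]
sample_labels = classify_regions(sample_rtt, sample_gbps)
```

With these made-up values, the labels switch from concave to convex between the 100 ms and 200 ms samples, which is the kind of transition point the dual-mode profile describes.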

Models of single TCP connections

[Figure panels: STCP, CUBIC]

• Models account for increase/decrease rules, RTT, link capacity, max receive window size
• Validation against measurements
• Useful for selecting best version and troubleshooting
• Future directions: other versions (e.g., UDT), multiple connections, accounting for I/O interactions
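To make the listed ingredients concrete, here is a rough sketch of how an increase rule, RTT, link capacity, and a maximum receive window can combine into a throughput figure. It uses the CUBIC window function and constants from RFC 8312 and is not the model from the slides. The helper names and the 1460-byte segment size are assumptions.

```python
C = 0.4      # CUBIC scaling constant (RFC 8312)
BETA = 0.7   # CUBIC multiplicative decrease factor (RFC 8312)


def cubic_window(t_s, w_max):
    """Congestion window (segments) t_s seconds after a loss at window w_max."""
    k = (w_max * (1 - BETA) / C) ** (1 / 3)  # time needed to climb back to w_max
    return C * (t_s - k) ** 3 + w_max


def throughput_gbps(window, rtt_s, capacity_gbps, max_rwnd, mss_bytes=1460):
    """Window-limited rate, capped by the receive window and the link capacity."""
    effective = min(window, max_rwnd)
    rate = effective * mss_bytes * 8 / rtt_s / 1e9
    return min(rate, capacity_gbps)


# Example: window right after a loss at 1000 segments, on a 100 ms path.
w0 = cubic_window(0.0, 1000)
rate0 = throughput_gbps(w0, rtt_s=0.1, capacity_gbps=10, max_rwnd=10**6)
```

Right after a loss the window restarts at `BETA * w_max`, and the rate scales inversely with RTT until the receive window or the link capacity becomes the limit.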

UDP-Based Transport: UDT

§ For dedicated 10G links, UDT provides higher throughput than CUBIC (Linux default)
§ TCP and UDT throughput transition point depends on connection parameters
– RTT, loss rate, host
– NIC parameters, IP and UDP parameters
§ Disk-to-disk transfers (xdd) have lower transfer rate

[Figure legend: xdd-read, xdd-write, CUBIC single stream, UDT]

Data Driven Models for File Transfer

§ Combines historical data with a correction term for current external load
§ Takes three pieces of input
§ Signature for a given transfer
– Concurrency level
– Total known concurrency at source (“known load at source”)
– Total known concurrency at destination (“known load at destination”)
– File size
§ Historical data
– Transfer concurrency, known loads, and observed throughput for the source-destination pair
§ Signatures and observed throughputs from the most recent transfers for the source-destination pair
§ Produces an estimated throughput as output
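One way to read the three inputs above is as a nearest-neighbour lookup in the historical data, scaled by a correction ratio computed from the most recent transfers. The sketch below assumes that reading. The slides do not give the actual estimator, so the distance measure, the ratio-based correction, and all names here are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Signature:
    concurrency: int
    src_load: int        # total known concurrency at source
    dst_load: int        # total known concurrency at destination
    file_size_gb: float


def distance(a, b):
    """Crude L1 distance between signatures (an assumption, not from the source)."""
    return (abs(a.concurrency - b.concurrency) + abs(a.src_load - b.src_load)
            + abs(a.dst_load - b.dst_load) + abs(a.file_size_gb - b.file_size_gb))


def historical_estimate(sig, history, k=3):
    """Mean observed throughput of the k historical transfers closest to sig."""
    nearest = sorted(history, key=lambda rec: distance(sig, rec[0]))[:k]
    return mean([tput for _, tput in nearest])


def estimate_throughput(sig, history, recent, k=3):
    """Historical estimate scaled by how recent transfers deviated from history."""
    base = historical_estimate(sig, history, k)
    if not recent:
        return base
    ratios = [obs / historical_estimate(s, history, k) for s, obs in recent]
    return base * mean(ratios)
```

Here `history` and `recent` are lists of `(Signature, observed_throughput)` pairs for one source-destination pair. If recent transfers ran at half their historical rate, the estimate for a new transfer is halved as well, which stands in for the correction term for unknown external load.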

Data Driven Models for File Transfer

§ Transfer Scheduling Algorithms
– SEAL: Schedule transfers to minimize average transfer slowdown
– STEAL: Minimize slowdown for best-effort transfers and maximize bandwidth utilization for batch transfers

Destination    ≤1GB    >1GB, ≤10GB    >10GB    Overall
Gordon         4.26    9.3            6.67     8.31
Mason          3.55    9.4            8.22     8.76
Yellowstone    2.78    8.0            8.1      6.84
Blacklight     5.96    4.41           5.27     4.93
Darter         7.70    4.03           2.63     4.73

SEAL Evaluation – Turnaround Time 60% Load
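The slowdown objective that SEAL minimizes can be illustrated with a small calculation. The sketch below assumes a common definition, turnaround time (wait plus transfer time) divided by the transfer time under no load. The slides do not state the exact definition used, and the function names are hypothetical.

```python
def slowdown(wait_s, transfer_s, ideal_s):
    """Turnaround time relative to the ideal (unloaded) transfer time."""
    return (wait_s + transfer_s) / ideal_s


def average_slowdown(transfers):
    """Mean slowdown over (wait_s, transfer_s, ideal_s) tuples."""
    return sum(slowdown(*t) for t in transfers) / len(transfers)


# Two made-up transfers: one unimpeded, one that waited and ran at half speed.
example = [(0.0, 10.0, 10.0), (10.0, 20.0, 10.0)]
avg = average_slowdown(example)
```

A slowdown of 1 means the transfer ran as if the system were idle. Under this definition, queueing delay and contention both push the value above 1.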

Modeling In-situ Analysis

§ How often should we perform the analyses?

§ How often should the analyses output be written?

§ Analysis parameters
– Time (Initialization, Auxiliary, Output)
– Memory (Fixed, Auxiliary, Output)
– Minimum interval between consecutive steps
– Importance
– Threshold time for analyses
§ System parameters
– I/O bandwidth
– Rate of computation
– Available memory
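The question of how often to run each analysis can be posed as a small optimization over these parameters. The sketch below is a simplified illustration, not the project's formulation. It uses only per-run time, importance, minimum interval, and the threshold time, ignores memory and I/O bandwidth, and searches exhaustively. All names are hypothetical.

```python
from itertools import product


def best_frequencies(analyses, total_steps, threshold_s):
    """Choose how many times to run each analysis within a time threshold.

    analyses: list of (time_per_run_s, importance, min_interval_steps).
    Returns (counts, score) maximizing importance-weighted runs subject to
    total analysis time <= threshold_s.
    """
    # The minimum interval bounds how many runs fit in the simulation.
    ranges = [range(total_steps // interval + 1) for _, _, interval in analyses]
    best_counts, best_score = None, -1.0
    for counts in product(*ranges):
        cost = sum(n * t for n, (t, _, _) in zip(counts, analyses))
        if cost > threshold_s:
            continue  # exceeds the allowed analysis time
        score = sum(n * w for n, (_, w, _) in zip(counts, analyses))
        if score > best_score:
            best_counts, best_score = counts, score
    return best_counts, best_score
```

Exhaustive search is only practical for a handful of analyses. It is used here because it makes the constraint and the objective easy to see.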

[Figure axes: Problem Size; Network bandwidth / Process count]

Results: Scheduling Analyses within Threshold

Total threshold (sec)    R1    R2    R3    % within threshold
200                      10    4     7     94.59
100                      10    2     3     85.99
60                       10    1     2     86.01
20                       10    1     0     86.11
10                       10    0     0     0.3

R1: Radius of gyration
R2: Membrane density profile 2D histogram
R3: Protein density profile 2D histogram

Table: Analysis frequencies, analysis times, and corresponding thresholds for 1 billion atoms rhodopsin simulation (1000 steps) in LAMMPS on 32768 cores (2048 nodes) of Mira.

Simulation: Rhodopsin protein benchmark, which consists of a protein embedded in a membrane and solvated with water and ions using LAMMPS.

Observation: More than 80% of the allowed threshold is used for analyses, when threshold > 20 s.

[Figure: Time and Memory for analyses R1, R2, R3]

Two In-Situ Modes

Time Sharing Mode: Minimizes memory consumption

Space Sharing Mode: Enhances resource utilization when simulation reaches its scalability bottleneck

§ Model computational part (MapReduce-like processing)
§ Model memory
– Data locality between simulation and analytics (initial work only)

Performance Modeling with Disk Model for K-means with MATE (File Size = 1GB, K = 50, Num of Iterations = 1)

Modeling Computational Component in MATE/Smart

Modeling Computation Time for Parallel Tomographic Reconstruction

§ Computation
– Number of intersected rays, and horizontal and vertical lines
– t x col^2 x (|sin(θ)| + |cos(θ)|)
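The expression above can be evaluated per projection angle to see how the computational load varies with θ. This is a minimal sketch. The slide does not define `t`, so it is treated here as a multiplicative constant, and the function names are hypothetical.

```python
import math


def projection_cost(theta_deg, cols, t=1.0):
    """Evaluate t * cols^2 * (|sin(theta)| + |cos(theta)|) for one angle.

    t is not defined in the source; it is assumed to be a constant factor.
    """
    theta = math.radians(theta_deg)
    return t * cols ** 2 * (abs(math.sin(theta)) + abs(math.cos(theta)))


def total_cost(angles_deg, cols, t=1.0):
    """Sum of per-angle costs over a set of projection angles."""
    return sum(projection_cost(a, cols, t) for a in angles_deg)


# Cost for angles 0, 10, ..., 170 degrees with a hypothetical 100-column detector.
costs = [projection_cost(a, cols=100) for a in range(0, 180, 10)]
```

The factor |sin(θ)| + |cos(θ)| ranges from 1 at 0 and 90 degrees to √2 at 45 and 135 degrees, so under this expression projections near the diagonals cost roughly 41% more than axis-aligned ones.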

[Figure: series Horizontal, Vertical, Total; x-axis ticks from 0 to 170]

[Figure: series Estimated, Real, Error Ratio; x-axis ticks from 1 to 171]

[Diagram: P0, P1, P2, … Pn and T0, T1, T2, … Tn]

Estimated Execution Time vs. Real Reconstruction Time

[Figure: series Real Time, Estimated Exec Time (wrt 2K), Estimated Computation; x-axis ticks from 128 to 2048]

Questions?

Additive increase and additive decrease (AIAD) for optimal stream

For every step c (fixed number of epoch), do the following:
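The per-step procedure itself is not included in this transcript. As a generic illustration of additive increase and additive decrease applied to a stream count, and not the procedure from the slide, the sketch below moves the count by a fixed amount each step and reverses direction when measured throughput drops. All names, bounds, and the throughput function used in the example are hypothetical.

```python
def aiad_step(streams, prev_tput, cur_tput, direction, step=1, lo=1, hi=64):
    """Return (new_stream_count, new_direction) after one control step."""
    if cur_tput < prev_tput:
        direction = -direction  # last change hurt throughput: turn around
    proposed = max(lo, min(hi, streams + direction * step))
    if proposed == streams:
        # Pinned at a bound: reverse so the controller keeps probing.
        direction = -direction
        proposed = max(lo, min(hi, streams + direction * step))
    return proposed, direction


def tune(measure, start=1, steps=20):
    """Run the controller against a throughput-measuring callable."""
    streams, direction = start, +1
    prev = measure(streams)
    streams += direction
    for _ in range(steps):
        cur = measure(streams)
        next_streams, direction = aiad_step(streams, prev, cur, direction)
        prev, streams = cur, next_streams
    return streams


# Made-up throughput curve that peaks at 8 streams.
final_streams = tune(lambda n: 100 - (n - 8) ** 2)
```

Because both the increase and the decrease are additive, the controller settles into a small oscillation around the best stream count instead of converging to a single value.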

Tubes (ANL) to DMZ (UChicago)
