Applying Automated Memory Analysis to improve the iterative solver in the Parallel Ocean Program

John M. Dennis: [email protected]
Elizabeth R. Jessup: [email protected]
April 5, 2006
Petascale Computation for the Geosciences Workshop

Page 2: Motivation

Outgrowth of PhD thesis on memory-efficient iterative solvers:
- Data movement is expensive
- Developed techniques to improve memory efficiency

Apply Automated Memory Analysis to POP:
- The Parallel Ocean Program (POP) solver accounts for a large percentage of run time
- The solver has scalability issues

Page 3: Outline

- Motivation
- Background
- Data movement
- Serial Performance
- Parallel Performance
- Space-Filling Curves
- Conclusions

Page 4: Automated Memory Analysis?

- Analyzes an algorithm written in Matlab
- Predicts the data movement if the algorithm were written in C/C++ or Fortran -> the minimum required
- Predictions allow us to:
  - Evaluate design choices
  - Guide performance tuning

Page 5: POP using 20x24 blocks (gx1v3)

POP data structure:
- Flexible block structure
- Land 'block' elimination

Small blocks:
- Better load balance and land-block elimination
- Larger halo overhead (quantified in the sketch below)

Larger blocks:
- Smaller halo overhead
- Load imbalance
- No land-block elimination

Grid resolutions: test (128x192), gx1v3 (320x384)
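To make the halo trade-off concrete, here is a minimal sketch that computes the fraction of a block's storage consumed by halo points for a few block sizes. The halo width of 2 ghost cells is an assumption for illustration, not a figure taken from the slides.

    program halo_overhead
       implicit none
       integer, parameter :: nghost = 2        ! assumed halo width
       integer :: nx(3) = (/ 16, 20, 40 /)
       integer :: ny(3) = (/ 16, 24, 48 /)
       integer :: k, interior, total
       do k = 1, 3
          interior = nx(k)*ny(k)
          total    = (nx(k)+2*nghost) * (ny(k)+2*nghost)
          print '(i2,a,i2,a,f5.1,a)', nx(k), 'x', ny(k), &
               ' block: halo = ', 100.0*(total-interior)/total, '% of storage'
       end do
    end program halo_overhead

Under this assumption a 16x16 block spends about 36% of its storage on halo points while a 40x48 block spends about 16%, which is the small-versus-large trade-off the slide describes.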

Page 6: Alternate Data Structure

2D data structure:
- Advantages: regular stride-1 access; compact form of the stencil operator
- Disadvantages: includes land points; problem-specific data structure

1D data structure:
- Advantages: no more land points; general data structure
- Disadvantages: indirect addressing; larger stencil operator

The two access patterns are sketched below.
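A minimal sketch of the two access patterns, using a 5-point stencil for brevity (POP's barotropic operator is wider, which is one reason the slide notes a "larger stencil operator" for the 1D form). All array and table names here are illustrative, not taken from solvers.F90.

    ! 2D form: regular stride-1 access over the block; land points included
    subroutine stencil_2d(nx, ny, ac, an, as, ae, aw, p, q)
       implicit none
       integer, intent(in) :: nx, ny
       real, dimension(nx,ny), intent(in)  :: ac, an, as, ae, aw, p
       real, dimension(nx,ny), intent(out) :: q
       integer :: i, j
       do j = 2, ny-1
          do i = 2, nx-1
             q(i,j) = ac(i,j)*p(i,j)                      &
                    + an(i,j)*p(i,j+1) + as(i,j)*p(i,j-1) &
                    + ae(i,j)*p(i+1,j) + aw(i,j)*p(i-1,j)
          end do
       end do
    end subroutine stencil_2d

    ! 1D form: land points compressed out; neighbors reached by indirect
    ! addressing through a precomputed table nbr(1:4,1:nocn)
    subroutine stencil_1d(nocn, a1, nbr, p1, q1)
       implicit none
       integer, intent(in) :: nocn
       real,    intent(in) :: a1(0:4,nocn), p1(*)   ! p1 includes neighbor/halo points
       integer, intent(in) :: nbr(4,nocn)           ! precomputed neighbor indices
       real,    intent(out):: q1(nocn)
       integer :: n
       do n = 1, nocn
          q1(n) = a1(0,n)*p1(n)                               &
                + a1(1,n)*p1(nbr(1,n)) + a1(2,n)*p1(nbr(2,n)) &
                + a1(3,n)*p1(nbr(3,n)) + a1(4,n)*p1(nbr(4,n))
       end do
    end subroutine stencil_1d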

Page 7: Outline

- Motivation
- Background
- Data movement
- Serial Performance
- Parallel Performance
- Space-Filling Curves
- Conclusions

Page 8: Data movement

- Working-set load size (WSL): data moved from main memory into the L1 cache
- Measure it using PAPI (WSL_M); a measurement sketch follows
- Compute platforms:
  - Sun Ultra II (400 MHz)
  - IBM POWER4 (1.3 GHz)
  - SGI R14K (500 MHz)
- Compare with the prediction (WSL_P)
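One plausible way to obtain WSL_M, sketched with PAPI's high-level Fortran counter interface: count L1 data-cache misses around the solver and convert misses to bytes. The measured region (solver_step), the 32-byte line size, and the neglect of write traffic are all assumptions to check against your platform; the slides do not show the instrumentation itself.

    subroutine measure_wslm
       implicit none
       include 'fpapi.h'                            ! PAPI Fortran constants
       integer   :: events(1), check
       integer*8 :: values(1)
       real      :: wslm_kb
       events(1) = PAPI_L1_DCM                      ! L1 data-cache misses
       call PAPIF_start_counters(events, 1, check)
       call solver_step()                           ! hypothetical region under test
       call PAPIF_stop_counters(values, 1, check)
       ! misses * line size ~ bytes loaded from main memory into L1
       wslm_kb = real(values(1)) * 32.0 / 1024.0    ! assumes 32-byte L1 lines
       print *, 'WSL_M ~ ', wslm_kb, ' Kbytes'
    end subroutine measure_wslm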

Page 9: Predicting Data Movement

- solver w/2D data structure (Matlab): 4902 Kbytes
- solver w/1D data structure (Matlab): 3218 Kbytes

These Matlab analyses give the predicted working-set load sizes (WSL_P). The 1D data structure yields a 34% reduction in predicted data movement ((4902 - 3218)/4902 ~ 34%).

Page 10: Measured versus Predicted data movement

Solver        WSL_P    Ultra II        POWER4          R14K
                       WSL_M    err    WSL_M    err    WSL_M    err
PCG2+2D v1    4902     5163     5%     5068     3%     5728     17%
PCG2+2D v2    4902     4905     0%     4865    -1%     4854     -1%
PCG2+1D       3218     3164    -2%     3335     4%     3473      8%

(All sizes in Kbytes; err = (WSL_M - WSL_P) / WSL_P.)

Page 11: Measured versus Predicted data movement

Solver        WSL_P    Ultra II        POWER4          R14K
                       WSL_M    err    WSL_M    err    WSL_M    err
PCG2+2D v1    4902     5163     5%     5068     3%     5728     17%
PCG2+2D v2    4902     4905     0%     4865    -1%     4854     -1%
PCG2+1D       3218     3164    -2%     3335     4%     3473      8%

Note the excessive measured data movement for PCG2+2D v1.

Page 12: Two blocks of source code

PCG2+2D v1 (w0 array accessed after the loop!):

    do i=1,nblocks
       p(:,:,i) = z(:,:,i) + p(:,:,i)*beta
       q(:,:,i) = A*p(:,:,i)              ! apply the stencil operator
       w0(:,:,i) = q(:,:,i)*p(:,:,i)      ! full 3D w0 written here ...
    enddo
    delta = gsum(w0,lmask)                ! ... then re-read for the reduction

PCG2+2D v2 (extra access of w0 eliminated):

    ldelta = 0
    do i=1,nblocks
       p(:,:,i) = z(:,:,i) + p(:,:,i)*beta
       q(:,:,i) = A*p(:,:,i)
       w0 = q(:,:,i)*p(:,:,i)             ! w0 is now a single 2D scratch block
       ldelta = ldelta + lsum(w0,lmask)   ! partial sum while the block is in cache
    enddo
    delta = gsum(ldelta)

In v1 the whole 3D w0 array streams through the cache twice, once on the write and again in gsum; v2 reduces each block while it is still cache-resident, eliminating that extra pass.

Page 13: Measured versus Predicted data movement

Solver        WSL_P    Ultra II        POWER4          R14K
                       WSL_M    err    WSL_M    err    WSL_M    err
PCG2+2D v1    4902     5163     5%     5068     3%     5728     17%
PCG2+2D v2    4902     4905     0%     4865    -1%     4854     -1%
PCG2+1D       3218     3164    -2%     3335     4%     3473      8%

With the fix, measured data movement matches the prediction!

Page 14: Outline

- Motivation
- Background
- Data movement
- Serial Performance
- Parallel Performance
- Space-Filling Curves
- Conclusions

Page 15: Using 1D data structures in POP2 solver (serial)

- Replace solvers.F90
- Measure execution time on cache-based microprocessors
- Examine two CG algorithms with a diagonal preconditioner:
  - PCG2 (2 inner products)
  - PCG1 (1 inner product) [D'Azevedo 93]
- Grid: test (128x192 grid points) with 16x16 blocks

A sketch of the PCG2 iteration and its two reductions follows.
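For reference, a textbook sketch of one diagonally preconditioned PCG2 iteration, written in the array style of the slide's code; apply_A and diagA are illustrative names, and gsum is the global-sum reduction from the slides, not POP's actual solvers.F90. The two gsum calls are the two inner products that name the method; the D'Azevedo PCG1 variant rearranges the recurrences so that a single combined reduction per iteration suffices.

    ! one PCG2 iteration: two global reductions per pass
    q     = apply_A(p)                 ! stencil operator (illustrative name)
    alpha = rz / gsum(p*q, lmask)      ! inner product 1: p'q
    x     = x + alpha*p
    r     = r - alpha*q
    z     = r / diagA                  ! diagonal preconditioner
    rznew = gsum(r*z, lmask)           ! inner product 2: r'z
    beta  = rznew / rz
    p     = z + beta*p
    rz    = rznew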

Page 16: Serial execution time on IBM POWER4 (test)

[Bar chart: seconds for 20 timesteps on the IBM POWER4 (1.3 GHz) for PCG2+2D, PCG1+2D, PCG2+1D, and PCG1+1D; y-axis 0-6 seconds.]

The 1D solvers deliver a 56% reduction in cost per iteration.

Page 17: Outline

- Motivation
- Background
- Data movement
- Serial Performance
- Parallel Performance
- Space-Filling Curves
- Conclusions

Page 18: Using 1D data structure in POP2 solver (parallel)

- New parallel halo update (sketch below)
- Examine several CG algorithms with a diagonal preconditioner:
  - PCG2 (2 inner products)
  - PCG1 (1 inner product)
- Existing solver/preconditioner technology: Hypre (LLNL)
  - http://www.llnl.gov/CASC/linear_solvers
  - PCG solver
  - Preconditioners: diagonal
  - Hypre integration -> work in progress
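A minimal sketch of what a halo update for the 1D structure can look like: post receives, gather boundary points through index lists, exchange, and scatter into the contiguous halo region. The schedule arrays (send_rank/recv_rank, sptr/rptr buffer offsets, the sidx gather list) are assumed to be precomputed, and none of these names come from POP's actual routine.

    subroutine halo_update_1d(x, nocn, nhalo, nsend, nrecv, &
                              send_rank, recv_rank, sptr, rptr, sidx)
       use mpi
       implicit none
       integer, intent(in) :: nocn, nhalo, nsend, nrecv
       integer, intent(in) :: send_rank(nsend), recv_rank(nrecv)
       integer, intent(in) :: sptr(nsend+1), rptr(nrecv+1)   ! buffer offsets
       integer, intent(in) :: sidx(sptr(nsend+1)-1)          ! gather index list
       real*8,  intent(inout) :: x(nocn+nhalo)               ! owned points, then halo
       real*8  :: sbuf(sptr(nsend+1)-1), rbuf(nhalo)
       integer :: m, ierr, tag
       integer :: sreq(nsend), rreq(nrecv)
       tag = 100
       do m = 1, nrecv                        ! post all receives first
          call MPI_Irecv(rbuf(rptr(m)), rptr(m+1)-rptr(m), MPI_REAL8, &
                         recv_rank(m), tag, MPI_COMM_WORLD, rreq(m), ierr)
       end do
       do m = 1, nsend                        ! gather boundary points, then send
          sbuf(sptr(m):sptr(m+1)-1) = x(sidx(sptr(m):sptr(m+1)-1))
          call MPI_Isend(sbuf(sptr(m)), sptr(m+1)-sptr(m), MPI_REAL8, &
                         send_rank(m), tag, MPI_COMM_WORLD, sreq(m), ierr)
       end do
       call MPI_Waitall(nrecv, rreq, MPI_STATUSES_IGNORE, ierr)
       x(nocn+1:nocn+nhalo) = rbuf(1:nhalo)   ! halo points stored contiguously
       call MPI_Waitall(nsend, sreq, MPI_STATUSES_IGNORE, ierr)
    end subroutine halo_update_1d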

Page 19: Solver execution time for POP2 (20x24) on BG/L (gx1v3)

[Bar chart: seconds for 200 timesteps on 64 processors for PCG2+2D, PCG1+2D, PCG2+1D, PCG1+1D, and Hypre (PCG+Diag); y-axis 0-40 seconds. Annotations: 48% cost/iteration; 27% cost/iteration.]

Page 20: 64 processors != PetaScale

Page 21: Outline

- Motivation
- Background
- Data movement
- Serial Performance
- Parallel Performance
- Space-Filling Curves
- Conclusions

Page 22: 0.1 degree POP

- Global eddy-resolving configuration
- Computational grid: 3600 x 2400 x 40
- Land creates problems: load imbalance, poor scalability
- Alternative partitioning algorithm: space-filling curves
- Evaluate using benchmark: 1 day / internal grid / 7-minute timestep

Page 23: Partitioning with Space-filling Curves

- Map the 2D block grid to 1D
- Curves for a variety of block-grid sizes Nb:
  - Hilbert (Nb = 2^n)
  - Peano (Nb = 3^m)
  - Cinco (Nb = 5^p) [new]
  - Hilbert-Peano (Nb = 2^n 3^m)
  - Hilbert-Peano-Cinco (Nb = 2^n 3^m 5^p) [new]
- Partition the resulting 1D array (sketch after this list)
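Once a curve orders the ocean blocks in 1D (with land blocks already removed), the partitioning step reduces to cutting that list into npes contiguous, near-equal pieces; contiguous segments of the curve stay spatially compact in 2D, which keeps halo-exchange partners few and nearby. A minimal sketch, assuming unit work per ocean block:

    ! assign the i-th ocean block along the curve to a processor rank
    subroutine sfc_partition(nocn, npes, owner)
       implicit none
       integer, intent(in)  :: nocn, npes
       integer, intent(out) :: owner(nocn)   ! owner(i) = rank of i-th block on curve
       integer :: i
       do i = 1, nocn
          owner(i) = (npes*(i-1)) / nocn     ! ranks 0..npes-1, remainder spread evenly
       end do
    end subroutine sfc_partition

Weighted work per block (e.g., counting ocean points rather than blocks) fits the same pattern by cutting at equal cumulative weight instead of equal counts.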

Page 24: Partitioning with SFC

[Figure: example partition for 3 processors.]

Page 25: POP using 20x24 blocks (gx1v3)

[Figure.]

Page 26: POP (gx1v3) + Space-filling curve

[Figure.]

Page 27: Space-filling curve (Hilbert, Nb = 2^4)

[Figure.]

Page 28: Remove land blocks

[Figure.]

Page 29: Space-filling curve partition for 8 processors

[Figure.]

Page 30: POP 0.1 degree benchmark on Blue Gene/L

[Figure.]

Page 31: POP 0.1 degree benchmark

[Figure.] Courtesy of Y. Yoshida, M. Taylor, P. Worley

Page 32: Conclusions

1D data structures in the barotropic solver:
- No more land points
- Reduce execution time versus the 2D data structure:
  - 48% reduction in solver time! (64 procs, BG/L)
  - 9.5% reduction in total time! (64 procs, POWER4)
- Allow use of solver/preconditioner packages
- Implementation quality is critical!

Automated Memory Analysis (SLAMM):
- Evaluate design choices
- Guide performance tuning

Page 33: Conclusions (cont'd)

- Good scalability to 32K processors on BG/L
- Simulation rate increased by 2x on 32K processors:
  - SFC partitioning
  - 1D data structure in solver
  - Modified 7 source files

Future work:
- Improve scalability (55% efficiency from 1K to 32K processors)
- Better preconditioners
- Improve load balance:
  - Different block sizes
  - Improved partitioning algorithm

Page 34: Acknowledgements/Questions?

Thanks to: F. Bryan (NCAR), J. Edwards (IBM), P. Jones (LANL), K. Lindsay (NCAR), M. Taylor (SNL), H. Tufo (NCAR), W. Waite (CU), S. Weese (NCAR)

Blue Gene/L time:
- NSF MRI Grant
- NCAR
- University of Colorado
- IBM (SUR) program
- BGW Consortium Days
- IBM Research (Watson)

Page 35: Serial execution time on multiple platforms (test)

[Bar chart: seconds for 20 timesteps for PCG2+2D, PCG1+2D, PCG2+1D, and PCG1+1D on IBM POWER4 (1.3 GHz), IBM POWER5 (1.9 GHz), IBM PPC 440 (700 MHz), AMD Opteron (2.2 GHz), and Intel P4 (2.0 GHz); y-axis 0-10 seconds.]

Page 36: Total execution time for POP2 (40x48) on POWER4 (gx1v3)

[Bar chart: seconds for 200 timesteps on 64 processors for PCG2+2D, PCG1+2D, PCG2+1D, and PCG1+1D; y-axis 66-88 seconds.]

The 1D solver yields a 9.5% reduction in total time, eliminating the need for ~216,000 CPU hours per year at NCAR.

Page 37: POP 0.1 degree blocksize

blocksize   Nb    Nb^2    Max ||
36x24       100   10000   7545
30x20       120   14400   10705
24x16       150   22500   16528
18x12       200   40000   28972
15x10       240   57600   41352
12x8        300   90000   64074

(Nb is the number of blocks per grid dimension, Nb^2 the total block count, and Max || the blocks remaining after land elimination, i.e., the maximum available parallelism. Moving down the table: increasing parallelism -->, decreasing overhead -->.)