Investigating Adaptive Compilation using the MIPSpro Compiler

Keith D. Cooper Todd Waterman

Department of Computer Science

Rice University

Houston, TX USA

Motivation

• Despite astonishing increases in processor performance, certain applications still require a heroic compiler effort
  • Scientific applications: weather, earthquake, and nuclear physics simulations

• High-quality compilation is difficult
  • Many of the underlying problems are NP-complete
  • Many decisions that impact performance must be made
  • The correct choice can depend on the target machine, source program, and input data
  • Exhaustively determining the correct choices is impractical

• Typical compilers use a single preset sequence of decisions

• How do we determine the correct sequence for each context?

Adaptive Compilation

• An adaptive compiler experimentally explores the decision space
  • Uses a process of feedback-driven iterative refinement
  • Program is compiled repeatedly with a different sequence of optimization decisions
  • Performance is evaluated using either execution or estimation
  • Performance results are used to determine future sequences
  • Sequence of compiler decisions is customized to always provide a high level of performance
  • Compiler easily accounts for different input programs, target machines and input data

• Can current compilers be used for adaptive compilation?
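
The feedback loop on this slide can be sketched roughly as follows. This is a minimal illustration, not the search used in the talk: the decision names (`block_size`, `unroll`), the `toy_cost` function, and the simple hill-climbing strategy are all assumptions made for the example. In a real system, `evaluate` would compile the program with the chosen decisions and then either run it or estimate its running time.

```python
import random
from typing import Callable, Dict, List, Tuple

# One point in the decision space: decision name -> chosen value.
Decisions = Dict[str, int]


def adaptive_search(
    start: Decisions,
    choices: Dict[str, List[int]],
    evaluate: Callable[[Decisions], float],
    steps: int,
    seed: int = 0,
) -> Tuple[Decisions, float]:
    """Feedback-driven refinement: keep a change only if it measured faster."""
    rng = random.Random(seed)
    best = dict(start)
    best_cost = evaluate(best)
    names = sorted(choices)
    for _ in range(steps):
        # Change one decision at random, then re-evaluate.
        trial = dict(best)
        name = rng.choice(names)
        trial[name] = rng.choice(choices[name])
        cost = evaluate(trial)
        # The measured result decides which sequence is explored next.
        if cost < best_cost:
            best, best_cost = trial, cost
    return best, best_cost


def toy_cost(d: Decisions) -> float:
    # Hypothetical stand-in for "compile, run, and time the program".
    return abs(d["block_size"] - 40) + (0.0 if d["unroll"] == 4 else 5.0)


if __name__ == "__main__":
    start = {"block_size": 10, "unroll": 1}
    choices = {"block_size": list(range(10, 90, 10)), "unroll": [1, 2, 4, 8]}
    print(adaptive_search(start, choices, toy_cost, steps=500))
```

Hill climbing is only one possible strategy; the point of the sketch is that every evaluation feeds back into the choice of the next candidate.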

Experimental Setup

• Searched for certain properties in a compiler
  • Produces high-quality executables
  • Performs high-level optimizations
  • Command-line flags that control optimization

• Selected the MIPSpro compiler
  • Initial experiments showed that changing blocking sizes could improve running times

• Loop Blocking
  • A memory hierarchy transformation that reorders array accesses to improve spatial and temporal locality
  • Major impact on array-based codes
  • Includes DGEMM, a general matrix multiply routine
  • Allows comparison with ATLAS
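
To make the loop blocking transformation concrete, here is a small sketch of a blocked matrix multiply. It is an illustration only: the function name `blocked_matmul` and the single square `block` parameter are assumptions for the example, and it is not the code that MIPSpro or ATLAS generates. Pure Python will not show the cache effect itself; the sketch only shows how blocking reorders the iteration space so that each tile of the arrays is reused before moving on.

```python
from typing import List

Matrix = List[List[float]]


def blocked_matmul(a: Matrix, b: Matrix, block: int) -> Matrix:
    """Compute C = A * B, visiting the iteration space tile by tile."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer three loops walk over tiles; inner three loops work inside a tile.
    for ii in range(0, n, block):
        for kk in range(0, k, block):
            for jj in range(0, m, block):
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for p in range(kk, min(kk + block, k)):
                        a_ip, row_b = row_a[p], b[p]
                        for j in range(jj, min(jj + block, m)):
                            row_c[j] += a_ip * row_b[j]
    return c
```

Any `block` of at least 1 gives the same product; only the order of the array accesses changes, which is what makes the block size a pure performance parameter.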

ATLAS

• Automatically tuned linear algebra software

• Goal is to achieve hand-coded performance for linear algebra kernels without a programmer modifying the code for each processor
  • Kernel is modified and parameterized once by a programmer
  • When ATLAS is installed on a machine, experiments are run to determine the proper parameters for the kernel

• Saves human time at the expense of additional machine time

• Adaptive compilation aims to take this tradeoff one step further

Adjusting Blocking Size

• Compare three versions of DGEMM
  • Compiled with MIPSpro using varying, explicitly specified block sizes
  • Built by ATLAS
  • Compiled with MIPSpro using the built-in blocking heuristic

• Test machine: SGI MIPS R10000
  • 195 MHz processor
  • 256 MB memory
  • 32 KB L1 data cache
  • 1 MB unified L2 cache
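
As a rough back-of-the-envelope check on why the block size matters on this machine, the sketch below computes the largest square tile for which three tiles of double-precision values (one each from A, B, and C) fit in a cache of a given size. This is a simplifying assumption made for illustration: it ignores associativity conflicts, cache line size, and other data, and it is not the heuristic that MIPSpro uses. The helper name `max_square_block` is hypothetical.

```python
import math


def max_square_block(cache_bytes: int, elem_bytes: int = 8, tiles: int = 3) -> int:
    """Largest b such that `tiles` b-by-b tiles of `elem_bytes` elements fit."""
    return math.isqrt(cache_bytes // (elem_bytes * tiles))


if __name__ == "__main__":
    print(max_square_block(32 * 1024))    # 32 KB L1 data cache -> 36
    print(max_square_block(1024 * 1024))  # 1 MB unified L2 cache -> 209
```

Under these assumptions the capacity bound is about 36 for L1 and about 209 for L2, which suggests why measuring a range of block sizes, rather than trusting a single formula, can pay off.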

DGEMM running time for 500 x 500 arrays

DGEMM running time for 1000 x 1000 arrays

DGEMM running time for 1500 x 1500 arrays

DGEMM running times for square matrices

Relative DGEMM running times

L1 Cache Misses for DGEMM

L2 Cache Misses for DGEMM

Adjusting Blocking Size

• The performance of MIPSpro using the built-in blocking heuristic drops off substantially when the array size reaches 900 x 900
  • Far more L1 cache misses
  • Fewer L2 cache misses
  • Heuristic uses a rectangular blocking size that increases as the total array size increases

• MIPSpro with adaptively chosen blocking sizes delivers performance close to ATLAS level
  • Remains close as array size increases
  • Fewer L1 and L2 cache misses than ATLAS

• Similar results were observed for non-square matrices as well

Determining Blocking Size

• Exhaustively searching for blocking sizes is expensive

• Intelligent exploration of blocking sizes can find very good blocking sizes while only examining a few block sizes

• Our approach:
  • Determine the result for block size 50
  • Sample higher and lower block sizes in increments of ten until results are more than 10% from optimal
  • Examine all of the block sizes within five of the best found in the previous step

• This approach always found the best block size in our experiments

• Quicker approaches could be found at the expense of finding less ideal block sizes
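
The search steps on this slide can be sketched as below. Several details are assumptions for the sake of a runnable example: "optimal" is read as the best time measured so far, the `max_size` bound is added so the upward scan always terminates, and `measure` stands in for compiling with a given block size and timing the result. The function name `find_block_size` and the synthetic timing curve in the usage example are hypothetical.

```python
from typing import Callable, Dict, Tuple


def find_block_size(
    measure: Callable[[int], float],
    start: int = 50,
    step: int = 10,
    tolerance: float = 0.10,
    radius: int = 5,
    min_size: int = 1,
    max_size: int = 200,
) -> Tuple[int, Dict[int, float]]:
    """Coarse scan outward from `start`, then a fine scan near the best point."""
    results = {start: measure(start)}
    # Coarse phase: step away from the start in both directions until the
    # measured time is more than `tolerance` worse than the best seen so far.
    for direction in (1, -1):
        size = start + direction * step
        while min_size <= size <= max_size:
            results[size] = measure(size)
            if results[size] > (1.0 + tolerance) * min(results.values()):
                break
            size += direction * step
    # Fine phase: try every block size within `radius` of the coarse winner.
    coarse_best = min(results, key=results.get)
    low = max(min_size, coarse_best - radius)
    high = min(max_size, coarse_best + radius)
    for size in range(low, high + 1):
        if size not in results:
            results[size] = measure(size)
    return min(results, key=results.get), results


if __name__ == "__main__":
    # Synthetic timing curve with its minimum at block size 37.
    best, tried = find_block_size(lambda b: 1.0 + ((b - 37) / 40) ** 2)
    print(best, len(tried))
```

On the synthetic curve above, the sketch measures 15 block sizes instead of all 200 candidates. Real timing data is noisy and may have several local minima, so the stopping rule would need more care in practice.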

Search time required

Making Adaptive Compilation General

• Making adaptive compilation general will require changing how compilers work

• Adaptive compilation is limited by the decisions the compiler exposes
  • If the MIPSpro compiler only allowed blocking to be turned on and off, our experiments would not have been possible

• The interface between adaptive system and compiler needs to allow complex communication
  • Which transformations are applied
  • Granularity
  • Optimization scope
  • Detailed parameter settings

Conclusions

• Adaptively selecting the appropriate blocking size for DGEMM provides performance close to ATLAS
  • The standard compiler's performance drops off for larger array sizes
  • Only a small portion of possible block sizes needs to be examined

• Making adaptive compilation a successful technique for a wide variety of applications will require changes to the design of compilers

Extra slides begin here.

DGEMM running times for varying M

DGEMM running times for varying N

DGEMM running times for varying K