Super-optimizing LLVM IR
Duncan Sands
DeepBlueCapital / CNRS
Source: llvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdf
Thanks to Google for sponsorship
Super optimization
● Optimization → Improve code
● Super-optimization → Obtain perfect code
Super-optimization → automatically find code improvements
Idea from LLVM OpenProjects web-page (suggested by John Regehr)
Goal
Automatically find simplifications missed by the LLVM optimizers
- And have a human implement them in LLVM
Non-goal: Directly optimize programs
It doesn't matter if the simplifications found are sometimes wrong
Examples
Missed simplifications found in “fully optimized” code:
• X - (X - Y) → Y (not done because of operand uses)
• (X<<1) - X → X (not done because of operand uses)
• non-negative number + power-of-two != 0 → true (new!)
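The three identities above can be checked by brute force at small bit widths. The following is a minimal sketch, not part of the talk: the function name `check_identities` is made up for illustration, and it assumes wrap-around (two's complement) arithmetic with "non-negative" meaning the sign bit is clear.

```python
from itertools import product

def check_identities(width):
    """Exhaustively test the three slide identities at one bit width."""
    mask = (1 << width) - 1          # values wrap modulo 2**width
    sign_bit = 1 << (width - 1)
    powers_of_two = [1 << k for k in range(width)]
    for x, y in product(range(mask + 1), repeat=2):
        # X - (X - Y) -> Y
        if (x - ((x - y) & mask)) & mask != y:
            return False
        # (X << 1) - X -> X
        if (((x << 1) & mask) - x) & mask != x:
            return False
    for x in range(sign_bit):        # non-negative: sign bit clear
        for p in powers_of_two:
            # non-negative + power-of-two != 0 -> true
            if (x + p) & mask == 0:
                return False
    return True
```

Calling `check_identities(w)` for a handful of small widths is cheap. Passing at small widths is only evidence, in the same spirit as the talk's remark that found simplifications may sometimes be wrong.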
Process
● Compile program to bitcode
● Run optimizers on bitcode
● Harvest interesting expressions
● Analyse them for missing simplifications
● Implement the simplifications in LLVM
● Profit!
Repeat
Inspired by “Automatic Generation of Peephole Superoptimizers” by Bansal & Aiken (Computer Systems Lab, Stanford)
Harvesting
$ opt -load=./harvest.so -std-compile-opts -harvest -details \
      -disable-output bzip2.bc
@07:@09
{ ; In function: "mainGtU()", BB: "entry"
  %0 = zext i32 %i1 to i64
}
07:@07:@3c:12:@3c:@06:@07:24:28:20:@29
{ ; In function: "bsPutUInt32()", BB: "bsW.exit"
  %28 = lshr i32 %u, 16
  %29 = and i32 %28, 255
  %49 = sub i32 24, %48 ; From BB: "bsW.exit24"
  %50 = shl i32 %29, %49 ; From BB: "bsW.exit24"
  %51 = or i32 %50, %47 ; From BB: "bsW.exit24"
}
...
● Plugin pass that harvests code sequences
● Harvest code sequences after running standard optimizers
● Code sequences: the two entries shown in the output
● Code sequence = maximal connected subgraph of the LLVM IR containing only supported operations
● Normalized expressions: the encoded lines (such as @07:@09)
● Explanatory annotations (ignored): the text in braces
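The definition of a code sequence (a maximal connected subgraph containing only supported operations) can be sketched with a union-find over def-use edges. This is an illustration only, not the plugin's implementation: the `SUPPORTED` set, the `harvest` function and the dictionary representation of instructions are all assumptions made for the example.

```python
# Hypothetical set of supported opcodes; the real plugin's list is not given here.
SUPPORTED = {"add", "sub", "and", "or", "shl", "lshr", "zext", "sext", "icmp"}

def harvest(instructions):
    """Group supported instructions into maximal connected subgraphs.

    `instructions` maps a result name to (opcode, operand names). Two
    supported instructions are connected when one uses the other's result.
    """
    parent = {n: n for n, (op, _) in instructions.items() if op in SUPPORTED}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]   # path halving
            n = parent[n]
        return n

    for name, (_, operands) in instructions.items():
        if name not in parent:
            continue                        # unsupported: never part of a sequence
        for operand in operands:
            if operand in parent:           # operand is a supported instruction
                parent[find(operand)] = find(name)

    groups = {}
    for name in parent:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())

# Instructions from the slide, merged into one table for illustration.
# The "load" defining %48 is a made-up unsupported instruction.
example = {
    "%0": ("zext", ["%i1"]),
    "%28": ("lshr", ["%u", "16"]),
    "%29": ("and", ["%28", "255"]),
    "%48": ("load", ["%p"]),
    "%49": ("sub", ["24", "%48"]),
    "%50": ("shl", ["%29", "%49"]),
    "%51": ("or", ["%50", "%47"]),
}
sequences = harvest(example)
```

Here `sequences` holds two groups: the lone `zext`, and the five instructions from `%28` to `%51`. The unsupported `load` is left out and does not link anything together.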
Harvesting
$ opt -load=./harvest.so -std-compile-opts -harvest \
      -disable-output bzip2.bc
@07:@09
07:@07:@3c:12:@3c:@06:@07:24:28:20:@29
...
Normalized & encoded form allows textual comparisons:
$ opt -load=./harvest.so -std-compile-opts -harvest \
      -disable-output bzip2.bc | sort | uniq -c | sort -r -n
265 @00:07:@2b
178 @01:07:@0f
120 @00:@07:@2b
...
Ordered by frequency of occurrence
Harvesting
Most common expressions in unoptimized bitcode from the LLVM testsuite:
07:0a → sext X
00:07:2c → X != 0
07:09 → zext X
05:07:0f → X +nsw -1
00:07:2b → X == 0
07:07:13 → X -nsw Y
07:07:32 → X >=s Y
01:07:0f → X +nsw 1
06:07:0a:16 → (sext X) * power-of-2
sext = sign-extend
zext = zero-extend
+nsw = add with no signed wrap
-nsw = sub with no signed wrap
>=s = signed greater than or equal
power-of-2 = constant that is a power of two
Expressions
[Figure: expression graph. ICMP_SLT at the root with operands ZExt and Zero; ZExt applied to Add; Add applied to two Registers]
● Directed acyclic graph - no loops!
● Integer operations only - no floating point!
● No memory operations (load/store)!
● No types!
● Limited set of constants (eg: Zero, One, SignBit)
Most integer operations supported (eg: ctlz, overflow intrinsics).
Doesn't support byteswap (because of lack of types).
Analysing expressions
Four modes:
● Constant folding
  zext x <s 0 → 0 (i.e. false)
  Fast! Always a win!
● Reduce to sub-expression
  ((x + z) *nsw y) /s y → x + z
  Fast! Often a win!
● Unused variables
  x - (x + y) → 0 - y
  Result does not depend on x. Can replace x with (eg) 0.
  Fast! Sometimes a win!
● Rule reduction
  Repeatedly apply rules from a list. Search minimum of cost function.
  Rafael Auler's GSOC project
  Slow! Work in progress!
Implement in LLVM's InstructionSimplify analysis
Implement in LLVM's InstCombine transform
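The "unused variables" mode can be illustrated with a small exhaustive search. This is a sketch under assumptions, not the talk's tool: expressions are plain Python functions taking their variables plus a bit mask, and `unused_variables` and `example` are names invented here. A variable counts as unused if replacing it with 0 never changes the result.

```python
from itertools import product

def unused_variables(expr, num_vars, width=3):
    """Indices of variables that can be replaced by 0 without changing expr."""
    mask = (1 << width) - 1
    values = range(mask + 1)
    unused = []
    for i in range(num_vars):
        if all(
            expr(*args, mask) == expr(*(args[:i] + (0,) + args[i + 1:]), mask)
            for args in product(values, repeat=num_vars)
        ):
            unused.append(i)
    return unused

def example(x, y, mask):
    # x - (x + y), with wrap-around arithmetic at the width given by mask
    return (x - ((x + y) & mask)) & mask
```

For `example`, the search reports that variable 0 (x) is unused, which corresponds to the rewrite x - (x + y) → 0 - y on the slide. As with the other modes, the check runs at a small width only, so a hit is a candidate for a human to confirm.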
Constant folding
[Figure: expression graph. ICMP_SLT at the root with operands ZExt and Zero; ZExt applied to Add; Add applied to two Registers]
● Assign types to nodes
  Some types are no choice: ICMP_SLT is i1, and Zero has the same type as ZExt.
  Others are a choice (chose smallest): i1 for the Add operands, i2 for ZExt.
  Strategies: (1) Random choice; (2) All small types.
● Assign values to terminal nodes & propagate up
  Register values are a choice; Zero is no choice.
  Strategies: (1) Random inputs; (2) Every possible input.
● Result at the root always the same → found a constant fold
Repeat many times:
  Registers i1 0, i1 1 → Add i1 1 → ZExt i2 1; Zero i2 0 → ICMP_SLT i1 0
  Registers i2 1, i2 1 → Add i2 2 → ZExt i3 2; Zero i3 0 → ICMP_SLT i1 0
Always zero
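The procedure above can be sketched for the example graph as follows. This is an illustrative toy, not the talk's code: the tuple encoding of nodes and the names `evaluate` and `constant_fold` are assumptions. It uses the "all small types" strategy (registers of 1 to 3 bits, ZExt to the next wider type) together with "every possible input".

```python
from itertools import product

def to_signed(value, width):
    """Interpret a width-bit unsigned value as two's complement."""
    return value - (1 << width) if value >> (width - 1) else value

def evaluate(node, regs, width):
    """Return (bit width, value) of a node; registers have `width` bits."""
    op = node[0]
    if op == "reg":
        return width, regs[node[1]]
    if op == "add":
        w, a = evaluate(node[1], regs, width)
        _, b = evaluate(node[2], regs, width)
        return w, (a + b) & ((1 << w) - 1)
    if op == "zext":
        w, a = evaluate(node[1], regs, width)
        return w + 1, a                      # smallest strictly wider type
    if op == "icmp_slt":
        w, a = evaluate(node[1], regs, width)
        # Zero has no choice of type: it takes the type of the other operand.
        b = 0 if node[2] == ("zero",) else evaluate(node[2], regs, width)[1]
        return 1, int(to_signed(a, w) < to_signed(b, w))
    raise ValueError(f"unsupported op: {op}")

def constant_fold(root, num_regs, max_width=3):
    """Return the constant the root always produces, or None."""
    seen = set()
    for width in range(1, max_width + 1):
        for regs in product(range(1 << width), repeat=num_regs):
            seen.add(evaluate(root, regs, width)[1])
            if len(seen) > 1:
                return None
    return seen.pop()

slide_expr = ("icmp_slt", ("zext", ("add", ("reg", 0), ("reg", 1))), ("zero",))
no_zext = ("icmp_slt", ("add", ("reg", 0), ("reg", 1)), ("zero",))
```

With these definitions `constant_fold(slide_expr, 2)` finds the constant 0, matching "Always zero" on the slide. Dropping the ZExt (`no_zext`) gives a result that varies with the inputs, so no fold is reported.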
False positives
Eg: A | (B + 1) | (C - 1) == 0
Mostly evaluates to “false”
A, B and C have i8 type → 1 / 2^24 chance of seeing “true”
A, B and C have i1 type → 1 / 8 chance of seeing “true”
Use of small types hugely reduces the number of false positives
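The odds quoted above can be checked by brute force. The sketch below counts, for a few small bit widths, how many inputs make `A | (B + 1) | (C - 1) == 0` true; the function name `true_fraction` is made up for this illustration, and i8 is not enumerated here because 2^24 inputs is slow in pure Python.

```python
from fractions import Fraction
from itertools import product

def true_fraction(bits):
    """Fraction of inputs for which A | (B + 1) | (C - 1) == 0 holds at i<bits>."""
    mask = (1 << bits) - 1
    hits = sum(1 for a, b, c in product(range(1 << bits), repeat=3)
               if (a | ((b + 1) & mask) | ((c - 1) & mask)) == 0)
    return Fraction(hits, (1 << bits) ** 3)

for bits in (1, 2, 3):
    print(bits, true_fraction(bits))
```

Exactly one input triple makes the comparison true at each width, so the chance is 1 / 2^(3·bits): 1 / 8 for i1 and, by the same formula, 1 / 2^24 for i8. Random testing at i8 would almost never see "true" and would wrongly report a constant fold.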
![Page 65: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/65.jpg)
Examples
Constant folds found in “fully optimized” code:
● ( ( (X + Y) >>L power-of-two ) & Z ) + power-of-two == 0 → false
Implemented as: “non-negative-number + power-of-two != 0”
![Page 67: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/67.jpg)
Examples
Constant folds found in “fully optimized” code:
● ( ( (X + Y) >>L power-of-two ) & Z ) + power-of-two == 0 → false
● ( (X >s Y) ? X : Y ) >=s X → true
“max(X, Y) >= X”. Implemented several max/min folds.
![Page 69: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/69.jpg)
Examples
Constant folds found in “fully optimized” code:
● ( ( (X + Y) >>L power-of-two ) & Z ) + power-of-two == 0 → false
● ( (X >s Y) ? X : Y ) >=s X → true
● X rem ( Y ? X : 1 ) → 0
● (Y /u X) >u Y → false
Require reasoning about undefined behaviour
![Page 73: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/73.jpg)
Undefined behaviour
(X /u Y) >u X → false
Expression tree, with the value of each node in this example:
ICMP_UGT: undefined
  UDiv: undefined
    Register (X): i8 42
    Register (Y): i8 0
Any operation with an undef operand gets an undef result
● Avoids false negatives
● May result in subtle false positives
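A minimal sketch of this undef-propagation rule, assuming `None` as a stand-in for "undefined" and exhaustive evaluation at a small width. The helper names (`udiv`, `icmp_ugt`, `observed_results`, `folds_to`) are hypothetical and only illustrate the idea.

```python
from itertools import product

UNDEF = None  # stand-in for an undefined result (assumption of this sketch)

def udiv(x, y):
    """Unsigned division; dividing by zero or by/with undef gives undef."""
    if x is UNDEF or y is UNDEF or y == 0:
        return UNDEF
    return x // y

def icmp_ugt(a, b):
    """Unsigned greater-than; any undef operand makes the result undef."""
    if a is UNDEF or b is UNDEF:
        return UNDEF
    return int(a > b)

def observed_results(bits):
    """All values seen at the root of (X /u Y) >u X over every i<bits> input."""
    return {icmp_ugt(udiv(x, y), x)
            for x, y in product(range(1 << bits), repeat=2)}

def folds_to(bits):
    """Ignore undef results; report the constant if only one defined value remains."""
    defined = observed_results(bits) - {UNDEF}
    return defined.pop() if len(defined) == 1 else None

print(icmp_ugt(udiv(42, 0), 42), folds_to(3))
```

Ignoring the undefined results is what lets the fold to "false" be found at all: without it, the Y = 0 inputs would count as a differing result (a false negative).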
![Page 77: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/77.jpg)
Reduce to subexpression
(X *nsw Y) /s Y → X
Expression tree, with the value of each node in this example:
SDiv: i3 2
  MulNSW: i3 2
    Register (X): i3 2
    Register (Y): i3 1
● Assign types to nodes
Strategies: (1) Random choice; (2) All small types.
● Assign values to terminal nodes & propagate up
Strategies: (1) Random inputs; (2) Every possible input.
● See if some node always has same value as root (or undef)
Same (root and Register (X))
![Page 78: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/78.jpg)
Reduce to subexpression
(X *nsw Y) /s Y → X
Expression tree, with the value of each node in this example:
SDiv: i3 1
  MulNSW: i3 2
    Register (X): i3 1
    Register (Y): i3 2
● Assign types to nodes
Strategies: (1) Random choice; (2) All small types.
● Assign values to terminal nodes & propagate up
Strategies: (1) Random inputs; (2) Every possible input.
● See if some node always has same value as root (or undef)
Repeat many times
Same (root and Register (X))
![Page 79: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/79.jpg)
Reduce to subexpression
(X *nsw Y) /s Y → X
Expression tree:
SDiv
  MulNSW
    Register (X)
    Register (Y)
● Assign types to nodes
Strategies: (1) Random choice; (2) All small types.
● Assign values to terminal nodes & propagate up
Strategies: (1) Random inputs; (2) Every possible input.
● See if some node always has same value as root (or undef) → found a subexpression reduction
Repeat many times
Always same
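The same idea as a small Python sketch, assuming i3 values, exhaustive inputs, and that signed overflow of `*nsw` and division by zero are both modelled as "undefined" (`None`). The names `mul_nsw`, `sdiv` and `reducible_nodes` are hypothetical.

```python
from itertools import product

UNDEF = None                      # stand-in for an undefined result
BITS = 3
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1   # signed i3 range: -4..3

def mul_nsw(x, y):
    """Signed multiply; overflow is treated as undefined in this sketch."""
    if x is UNDEF or y is UNDEF:
        return UNDEF
    r = x * y
    return r if LO <= r <= HI else UNDEF

def sdiv(x, y):
    """Signed division truncating toward zero; /0 and overflow are undefined."""
    if x is UNDEF or y is UNDEF or y == 0:
        return UNDEF
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q
    return q if LO <= q <= HI else UNDEF

def reducible_nodes():
    """Nodes whose value equals the root on every input where the root is defined."""
    candidates = {"X", "Y", "Mul"}
    for x, y in product(range(LO, HI + 1), repeat=2):
        nodes = {"X": x, "Y": y, "Mul": mul_nsw(x, y)}
        root = sdiv(nodes["Mul"], y)
        if root is UNDEF:
            continue              # undefined roots match anything
        candidates = {n for n in candidates if nodes[n] == root}
    return candidates

print(reducible_nodes())
```

Only the node for X survives every input, which corresponds to the reduction (X *nsw Y) /s Y → X.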
![Page 83: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/83.jpg)
Register pressure
(X *nsw Y) /s Y → X
Is this always a win?
Z = X *nsw Y
...
W = Z /s Y
call @foo(W, Y, Z)
X not used again
Two registers needed (for Y, Z)
![Page 84: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/84.jpg)
Register pressure
(X *nsw Y) /s Y → X
Is this always a win?
Before:
Z = X *nsw Y
...
W = Z /s Y
call @foo(W, Y, Z)
Transform: W → X
After:
Z = X *nsw Y
...
... W not computed ...
call @foo(X, Y, Z)
![Page 85: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/85.jpg)
Register pressure
(X *nsw Y) /s Y → X
Is this always a win?
Z = X *nsw Y
...
... W not computed ...
call @foo(X, Y, Z)
Three registers needed (for X, Y, Z)
![Page 86: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/86.jpg)
Register pressure
(X *nsw Y) /s Y → X
Is this always a win?
Transform increases the number of long lived registers by one. May require spilling to the stack.
![Page 90: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/90.jpg)
Unused variables
X +nsw Z >=s Z +nsw Y
Z is an “unused variable”
For every choice of the other variables (X, Y) the result of the expression does not depend on the value of Z (or is undefined)
Replaced Z with 0
Transform: X +nsw Z >=s Z +nsw Y → X >=s Y
Detect similarly to constant folding etc.
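A sketch of how such a check might look, assuming i3 values, exhaustive inputs, and signed overflow of `+nsw` modelled as "undefined" (`None`). The names `add_nsw`, `expr` and `is_unused_z` are hypothetical.

```python
from itertools import product

UNDEF = None                      # stand-in for an undefined result
BITS = 3
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1   # signed i3 range: -4..3
VALUES = range(LO, HI + 1)

def add_nsw(a, b):
    """Signed add; overflow is treated as undefined in this sketch."""
    r = a + b
    return r if LO <= r <= HI else UNDEF

def expr(x, y, z):
    """Evaluate X +nsw Z >=s Z +nsw Y, propagating undef."""
    lhs, rhs = add_nsw(x, z), add_nsw(z, y)
    if lhs is UNDEF or rhs is UNDEF:
        return UNDEF
    return int(lhs >= rhs)

def is_unused_z():
    """True if, for each (X, Y), all defined results agree across every Z."""
    for x, y in product(VALUES, repeat=2):
        defined = {expr(x, y, z) for z in VALUES} - {UNDEF}
        if len(defined) > 1:
            return False
    return True

print(is_unused_z())
```

Once Z is known to be unused, substituting Z = 0 gives the simpler expression: `expr(x, y, 0)` equals `int(x >= y)` for every X and Y in range.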
![Page 91: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/91.jpg)
Examples
Unused variables found in “fully optimized” code:
● X >=s X +nsw Y
● ((X + Y) + -1) == X
● Y >>exact X == 0
● Y <<nsw X == 0
X is unused
![Page 97: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/97.jpg)
Problems with unused variables
● More false positives than other modes
● May increase register pressure
● May increase the amount of computation
Eg: (A + B) * (C + D) == B * C + B * D
B is an unused variable
Transforms to: A * C + A * D == 0
Requires computing A*C, A*D etc.
![Page 107: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/107.jpg)
Rule reduction
Requires a list of rules, eg:
rule (0 And 1) => (1 And 0); // Commutativity
rule (0 And AllBitsSet) <=> 0; // AllBitsSet is And-identity
rule ((0 Or 1) And 2) <=> ((0 And 2) Or (1 And 2)); // Distributivity
rule (0 Or AllBitsSet) => AllBitsSet; // AllBitsSet is Or-annihilator.
(X & Y) | Y Cost: 22
(X & Y) | (Y & AllOnesValue) Cost: 30
(X & Y) | (AllOnesValue & Y) Cost: 30
(X | AllOnesValue) & Y Cost: 22
AllOnesValue & Y Cost: 11
Y Cost: 3
Time: 1 minute
SubExpr: 0.05 secs UnusedVar: 0.08 secs
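To make the search concrete, here is a toy rewriting sketch in Python. It is not the tool's implementation: expressions are nested tuples, integers in a rule are pattern variables, the search is a plain breadth-first exploration, and "cost" is simply the node count (the cost function behind the numbers above is not given in the slides). All names are made up for this sketch.

```python
ALL = "AllBitsSet"

# (lhs, rhs, bidirectional); integers are pattern variables, as in the rule list.
RULES = [
    (("And", 0, 1), ("And", 1, 0), False),                                  # commutativity
    (("And", 0, ALL), 0, True),                                             # And-identity
    (("And", ("Or", 0, 1), 2), ("Or", ("And", 0, 2), ("And", 1, 2)), True), # distributivity
    (("Or", 0, ALL), ALL, False),                                           # Or-annihilator
]

def match(pat, expr, env):
    """Match expr against pat, recording pattern-variable bindings in env."""
    if isinstance(pat, int):
        if pat in env:
            return env[pat] == expr
        env[pat] = expr
        return True
    if isinstance(pat, str):
        return pat == expr
    return (isinstance(expr, tuple) and len(expr) == len(pat)
            and expr[0] == pat[0]
            and all(match(p, e, env) for p, e in zip(pat[1:], expr[1:])))

def subst(pat, env):
    """Instantiate pat using the bindings in env."""
    if isinstance(pat, int):
        return env[pat]
    if isinstance(pat, str):
        return pat
    return (pat[0],) + tuple(subst(p, env) for p in pat[1:])

def rewrites(expr):
    """Yield every expression obtained by one rule application anywhere in expr."""
    for lhs, rhs, both in RULES:
        directions = [(lhs, rhs), (rhs, lhs)] if both else [(lhs, rhs)]
        for src, dst in directions:
            env = {}
            if match(src, expr, env):
                yield subst(dst, env)
    if isinstance(expr, tuple):
        for i in range(1, len(expr)):
            for new in rewrites(expr[i]):
                yield expr[:i] + (new,) + expr[i + 1:]

def size(expr):
    """Node count, used here as a stand-in cost."""
    return 1 + sum(size(e) for e in expr[1:]) if isinstance(expr, tuple) else 1

def simplify(start, max_size=7, max_steps=6):
    """Breadth-first search over rewrites; return the smallest expression found."""
    seen, frontier, best = {start}, [start], start
    for _ in range(max_steps):
        nxt = []
        for e in frontier:
            for new in rewrites(e):
                if new not in seen and size(new) <= max_size:
                    seen.add(new)
                    nxt.append(new)
                    if size(new) < size(best):
                        best = new
        frontier = nxt
    return best

print(simplify(("Or", ("And", "X", "Y"), "Y")))
```

The search has to pass through larger intermediate expressions before reaching `Y`, mirroring the cost sequence above that rises before it falls; this is one reason the approach is slow compared with the evaluation-based modes.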
![Page 111: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/111.jpg)
Rule reduction problems
● Slow
● Needs more rules
● Can this approach find unexpected simplifications?
(zext X) + power-of-two == 0 → false
Needs more work!
![Page 112: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/112.jpg)
Profit!
![Page 113: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/113.jpg)
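Profit?
Approximate % speed-up: constant folds
[Bar chart. Y-axis: approximate % speed-up, from -2 to 2. X-axis: 400.perlbench, 401.bzip2, 403.gcc, 429.mcf, 445.gobmk, 456.hmmer, 458.sjeng, 462.libquantum, 464.h264ref, 471.omnetpp, 473.astar, 483.xalancbmk. Bar heights are not recoverable from the transcript.]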
![Page 114: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/114.jpg)
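Profit?!
Approximate % speed-up: constant folds & reduce to sub-expr:
[Bar chart. Y-axis: approximate % speed-up, from -2 to 4. X-axis: 400.perlbench, 401.bzip2, 403.gcc, 429.mcf, 445.gobmk, 456.hmmer, 458.sjeng, 462.libquantum, 464.h264ref, 471.omnetpp, 473.astar, 483.xalancbmk. Bar heights are not recoverable from the transcript.]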
![Page 117: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/117.jpg)
Improvements
● Work directly with LLVM IR
define i64 @combine(i64 %x) {
  %xl = trunc i64 %x to i32
  %h = lshr i64 %x, 32
  %xh = trunc i64 %h to i32
  %eh = zext i32 %xh to i64
  %el = zext i32 %xl to i64
  %h2 = shl i64 %eh, 32
  %r = or i64 %h2, %el
  ret i64 %r
}
Simplifies to: ret %x
((zext (trunc (X >>l pow-2))) << pow-2) | (zext (trunc X))
Impossible to find, due to
● Type-free expressions
● Limited number of constants
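That `@combine` returns its argument unchanged can be spot-checked by modelling each instruction with Python integers. This is an illustrative check on sample inputs only, not a proof; the Python function and constant names are made up for the sketch.

```python
import random

MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def combine(x):
    """Mirror @combine instruction by instruction on an unsigned 64-bit value."""
    xl = x & MASK32             # %xl = trunc i64 %x to i32
    h = x >> 32                 # %h  = lshr i64 %x, 32
    xh = h & MASK32             # %xh = trunc i64 %h to i32
    eh = xh                     # %eh = zext i32 %xh to i64
    el = xl                     # %el = zext i32 %xl to i64
    h2 = (eh << 32) & MASK64    # %h2 = shl i64 %eh, 32
    return h2 | el              # %r  = or i64 %h2, %el

rng = random.Random(0)
samples = [0, 1, MASK32, MASK32 + 1, MASK64] + [rng.getrandbits(64) for _ in range(1000)]
print(all(combine(x) == x for x in samples))
```

The identity depends on the specific widths (i64, i32) and the constant 32 lining up, which is exactly the information lost in the type-free encoded expression with a generic power-of-two constant.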
![Page 120: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/120.jpg)
Improvements
● Work directly with LLVM IR
(Constant folding, subexpression reduction, unused variables)
How to avoid many false positives?
● Sort expressions by execution frequency rather than textual frequency
Eg: generate fake debug info using the encoded expression for the “function”.
Hottest “functions” reported by profiling tools are the hottest expressions!
![Page 121: Super-optimizing LLVM IRllvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdfSuper optimization Optimization → Improve code Super-optimization → Obtain perfect code Super-optimization](https://reader034.fdocuments.net/reader034/viewer/2022050521/5fa48f4eb4e68b19b277a4e2/html5/thumbnails/121.jpg)
Getting it
svn://topo.math.u-psud.fr/harvest