Ruby World

Evan Phoenix's presentation at the Ruby World conference, Sept 7th, 2009.

Transcript of Ruby World

Page 1: Ruby World

Applying Compiler Technology to Ruby

Sept 8, 2009

Evan Phoenix

Wednesday, September 16, 2009

Page 2: Ruby World

What makes Ruby great can make Ruby slow.

Page 3: Ruby World

‣ Highly Dynamic

Page 4: Ruby World

‣ Highly Dynamic

• Very high level operations

• New code can be introduced at any time

• Dynamic typing

• Exclusively late-bound method calls

• Easier to implement as an interpreter
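As a small illustration of the points above, the sketch below shows one call site reaching different code after a method is redefined at runtime. The class and method names (Greeter, call_greet) are made up for this example and do not come from the talk.

```ruby
# Hypothetical example class; not from the talk.
class Greeter
  def greet
    "hello"
  end
end

# A single call site: the method behind `obj.greet` is looked up at call time.
def call_greet(obj)
  obj.greet
end

g = Greeter.new
before = call_greet(g)   # uses the original definition

# New code can be introduced at any time: reopen the class and redefine.
class Greeter
  def greet
    "hi"
  end
end

after = call_greet(g)    # same call site, same object, different method body

puts before
puts after
```

Because the target can change like this, an implementation cannot simply hard-wire the call at parse time.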

Page 5: Ruby World

Haven’t other languages had these same features/weaknesses?

Page 6: Ruby World

‣Prior Work

Page 7: Ruby World

‣Prior Work

• Smalltalk

• 1980-1994: Extensive work to make it fast

• Self

• 1992-1996: A primary research vehicle for making dynamic languages fast

• Java / Hotspot

• 1996-present: A battle-hardened engine for (limited) dynamic dispatch

Page 8: Ruby World

‣What Can We Learn From Them?

Page 9: Ruby World

‣What Can We Learn From Them?

• Compiled code is faster than interpreted code

• It’s very hard (almost impossible) to figure things out statically

• The type profile of a program is stable over time

• Therefore:

• Learn what a program does and optimize based on that

• This is called Type Feedback

Page 10: Ruby World

‣Code Generation (JIT)

• Eliminating interpreter overhead immediately increases performance by a fixed percentage

• Naive code generation yields only a small improvement over the interpreter

• Method calling continues to dominate time

• Need a way to generate better code

• Combine with program type information!

Page 11: Ruby World

‣Type Profile

• As the program executes, it’s possible to see how one method calls other methods

• The relationship between one method and all the methods it calls is the type profile of the method

• Just because you CAN use dynamic dispatch, doesn’t mean you always do.

• It’s common that a call site always calls the same method every time it’s run
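The idea above can be sketched in plain Ruby by recording which receiver classes reach a given call site. CallSiteProfile and its methods are hypothetical names chosen for this illustration; a real VM records this inside its dispatch machinery rather than through a wrapper object.

```ruby
# Hypothetical profiler for one call site; names are illustrative only.
class CallSiteProfile
  attr_reader :seen

  def initialize(name)
    @name = name
    @seen = Hash.new(0)   # receiver class => number of calls observed
  end

  # Record the receiver's class, then perform the normal dynamic send.
  def call(receiver, *args)
    @seen[receiver.class] += 1
    receiver.public_send(@name, *args)
  end

  # True when only one receiver class has been observed so far.
  def monomorphic?
    @seen.size == 1
  end
end

size_site = CallSiteProfile.new(:size)
[[1, 2], [3], []].each { |ary| size_site.call(ary) }    # only Arrays reach this site

to_s_site = CallSiteProfile.new(:to_s)
[1, "two", :three].each { |obj| to_s_site.call(obj) }   # Integer, String, Symbol

puts size_site.monomorphic?   # true
puts to_s_site.monomorphic?   # false
```

A site like size_site could use dynamic dispatch to reach many methods, but in practice it keeps reaching the same one.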

Page 12: Ruby World

Call sites running Array specs, by number of receiver classes seen at each call site (pie chart):

• 1 class: 25245 (98%)

• 2: 275 (1%)

• 3: 86

• 4: 50

• 5: 35

• 6: 6

• 7: 10

• 8: 5

• 9: 5

• 10: 2

• 10+: 34

Page 13: Ruby World

‣Type Profiling (Cont.)

• 98% of all method calls are to the same method every time

• In other words, 98% of all method calls are statically bound
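The 98% figure can be checked against the call-site counts from the "Call sites running Array specs" chart. The snippet below is only a worked calculation; the constant name CALL_SITES is made up for it.

```ruby
# Call-site counts from the "Call sites running Array specs" chart,
# keyed by the number of receiver classes seen at the site.
CALL_SITES = {
  "1" => 25245, "2" => 275, "3" => 86, "4" => 50, "5" => 35,
  "6" => 6, "7" => 10, "8" => 5, "9" => 5, "10" => 2, "10+" => 34
}

total = CALL_SITES.values.sum
single_class_share = 100.0 * CALL_SITES["1"] / total

puts total                         # 25753
puts single_class_share.round(2)   # 98.03
```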

Page 14: Ruby World

‣Type Feedback

• Optimize a semi-static relationship to generate faster code

• Semi-static relationships are found by profiling all call sites

• Allow JIT to make vastly better decisions

• Most common optimization: Method Inlining

Page 15: Ruby World

‣Method Inlining

• Rather than emit a call to a target method, copy its body at the call site

• Eliminates code to look up and begin execution of target method

• Simplifies (or eliminates) setup for target method

• Allows for type propagation, as well as providing a wider horizon for optimization.

• A wider horizon means better generated code, which means less work to do per method == faster execution.
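To make the transformation concrete, here is a hand-written before/after in Ruby. It is only a conceptual sketch: a JIT performs this on its internal representation, not on source code, and the Circle class, the area_inlined method and the class guard are assumptions made for this illustration.

```ruby
# Hypothetical example class; not from the talk.
class Circle
  def initialize(r)
    @r = r
  end

  def radius
    @r
  end

  # Before: two dynamic calls to `radius` on every invocation.
  def area
    Math::PI * radius * radius
  end

  # After (conceptually): the profile says self is always a plain Circle,
  # so the body of `radius` is copied in behind a cheap class check.
  def area_inlined
    if instance_of?(Circle)        # guard based on the observed type profile
      Math::PI * @r * @r           # inlined body of Circle#radius
    else
      Math::PI * radius * radius   # fallback: ordinary dynamic dispatch
    end
  end
end

# A subclass that overrides `radius` still gets the right answer via the fallback.
class BigCircle < Circle
  def radius
    super * 2
  end
end

puts Circle.new(2).area == Circle.new(2).area_inlined         # true
puts BigCircle.new(2).area == BigCircle.new(2).area_inlined   # true
```

The guard is what keeps the optimization safe: flexibility is preserved because unexpected receivers take the slow path.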

Page 16: Ruby World

Implementation

Page 17: Ruby World

‣Code Generation (JIT)

• Early experimentation with a custom JIT

• Realized we weren’t experts

• Would take years to generate good code

• Switched to LLVM

Page 18: Ruby World

‣LLVM

• Provides an internal AST (LLVM IR) for describing work to be done

• Text representation of AST allows for easy debugging

• Provides ability to compile AST to machine code in memory

• Contains thousands of optimizations

• Competitive with GCC

Page 19: Ruby World

‣Type Profiling

• All call sites use a class called InlineCache, one per call site

• InlineCache accelerates method dispatch by caching the previously used method

• In addition, tracks a fixed number of receiver classes seen when there is a cache miss

• When compiling a method using LLVM, all InlineCaches for a method can be read

• InlineCaches with good information can be used to accurately find a method to inline
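A rough Ruby model of such a cache is sketched below. It is not the actual Rubinius InlineCache (which lives inside the VM); the class name InlineCacheSketch, the MAX_TRACKED limit and the inline_candidate method are assumptions made for this illustration.

```ruby
# Illustrative model only; not the real Rubinius InlineCache.
class InlineCacheSketch
  MAX_TRACKED = 3   # assumed limit on remembered receiver classes

  attr_reader :seen_classes, :hits, :misses

  def initialize(method_name)
    @method_name   = method_name
    @cached_class  = nil
    @cached_method = nil
    @seen_classes  = []
    @hits = 0
    @misses = 0
  end

  def call(receiver, *args)
    klass = receiver.class
    if klass.equal?(@cached_class)
      @hits += 1                        # fast path: reuse the cached method
    else
      @misses += 1                      # slow path: full lookup, then re-cache
      @cached_class  = klass
      @cached_method = klass.instance_method(@method_name)
      # On a miss, remember the receiver class (up to a fixed number).
      if !@seen_classes.include?(klass) && @seen_classes.size < MAX_TRACKED
        @seen_classes << klass
      end
    end
    @cached_method.bind(receiver).call(*args)
  end

  # What a compiler could ask: is there a single, stable target to inline?
  def inline_candidate
    @seen_classes.size == 1 ? @cached_method : nil
  end
end

cache = InlineCacheSketch.new(:size)
cache.call([1, 2, 3])
cache.call([])
puts cache.hits                     # 1
puts cache.misses                   # 1
puts cache.inline_candidate.owner   # Array
```

A site that has seen exactly one class offers an obvious inlining target; a site that has seen several does not.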

Page 20: Ruby World

‣When To Compile

• It takes time for a method’s type information to settle down

• Compiling too early means not having enough type info

• Compiling too late means lost performance

• Use simple call counters to allow a method to “heat up”

• Each invocation of a method increments counter

• When counter reaches a certain value, method is queued for compilation.

• Threshold value is tunable: -Xjit.call_til_compile

• Still experimenting with good default values
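The counter scheme above might look roughly like this in Ruby. HotMethodCounter is a hypothetical name, and the threshold of 3 is arbitrary; it is not a Rubinius default.

```ruby
# Hypothetical sketch of call counting; names and threshold are assumptions.
class HotMethodCounter
  attr_reader :queue

  def initialize(threshold)
    @threshold = threshold
    @counts = Hash.new(0)   # method name => invocation count
    @queue = []             # methods waiting to be compiled
  end

  # Called on every invocation of `method_name`.
  def record_call(method_name)
    @counts[method_name] += 1
    # Enqueue exactly once, at the moment the counter reaches the threshold.
    @queue << method_name if @counts[method_name] == @threshold
  end
end

counter = HotMethodCounter.new(3)
5.times { counter.record_call(:foo) }   # :foo becomes hot on its third call
counter.record_call(:bar)               # :bar stays cold
p counter.queue                         # [:foo]
```

A lower threshold compiles sooner with less type information; a higher one gathers more information but leaves the method interpreted for longer.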

Page 21: Ruby World

‣How to Compile

• To impact runtime as little as possible, all JIT compilation happens in a background OS thread

• Methods are queued, and background thread reads queue to find methods to compile

• After compiling, function pointers to JIT generated code are installed in methods

• All future invocations of method use JIT code
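The queue-and-install flow can be modelled with a Ruby thread and a Queue. This is a sketch under stated assumptions: MethodEntry is a hypothetical class, and "compilation" is faked by wrapping the interpreted body in a lambda where a real VM would install a pointer to generated machine code.

```ruby
require "thread"   # Queue; a no-op on recent Rubies

# Hypothetical stand-in for a VM's method object.
class MethodEntry
  attr_reader :name, :interpreted_body

  def initialize(name, &interpreted_body)
    @name = name
    @interpreted_body = interpreted_body
    @compiled = nil   # would be a function pointer in a real VM
  end

  # Use the compiled code once installed, otherwise keep interpreting.
  def call(*args)
    (@compiled || @interpreted_body).call(*args)
  end

  def install(compiled)
    @compiled = compiled
  end

  def compiled?
    !@compiled.nil?
  end
end

compile_queue = Queue.new

# Background compiler: pops methods off the queue and installs their code.
compiler = Thread.new do
  while (entry = compile_queue.pop)   # blocks until work arrives; nil ends the loop
    body = entry.interpreted_body
    entry.install(->(*args) { body.call(*args) })   # stand-in for generated code
  end
end

square = MethodEntry.new(:square) { |x| x * x }
compile_queue << square   # in a real VM, queued when the call counter trips
compile_queue << nil      # stop signal, for this demo only
compiler.join

puts square.compiled?   # true
puts square.call(7)     # 49
```

The caller never waits on the compiler: until install runs, call simply keeps using the interpreted body.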

Page 22: Ruby World

‣Benchmarks

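Benchmark, 300,000 times (bar chart; y-axis: Seconds, 0 to 9):

def foo()
  ary = []
  100.times { |i| ary << i }
end

Implementations compared: 1.8, 1.9, rbx, rbx jit, rbx jit +blocks

Bar values in seconds (chart order not preserved): 2.59, 3.60, 5.90, 5.30, 8.02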

Page 23: Ruby World

‣Benchmarks

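Benchmark, 100,000 times (bar chart; y-axis: Seconds, 0 to 30):

def foo()
  hsh = {}
  100.times { |i| hsh[i] = 0 }
end

Implementations compared: 1.8, 1.9, rbx, rbx jit, rbx jit +blocks

Bar values in seconds (chart order not preserved): 12.01, 12.54, 25.36, 5.26, 4.85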

Page 24: Ruby World

‣Benchmarks

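Benchmark, 100,000 times (bar chart; y-axis: Seconds, 0 to 7):

def foo()
  hsh = { 47 => true }
  100.times { |i| hsh[i] }
end

Implementations compared: 1.8, 1.9, rbx, rbx jit, rbx jit +blocks

Bar values in seconds (chart order not preserved): 2.66, 2.68, 6.26, 2.09, 3.64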

Page 25: Ruby World

‣Benchmarks

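Benchmark: tak(18, 9, 0) (bar chart; y-axis: Seconds, 0 to 8)

Implementations compared: 1.8, 1.9, jruby, rbx, rbx jit, rbx jit +blocks

Bar values in seconds (chart order not preserved): 1.53, 1.53, 7.27, 1.89, 1.58, 7.36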

Page 26: Ruby World

‣Conclusion

• Ruby is a wonderful language because it is organized for humans

• By gathering and using information about a running program, it’s possible to make that program much faster without impacting flexibility

• Thank You!
