Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for...

28
Fast numerics in Python - NumPy and PyPy Maciej Fijalkowski SEA, NCAR 22 February 2012 fijal PyPy

Transcript of Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for...

Page 1: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Fast numerics in Python - NumPyand PyPy

Maciej Fijałkowski

SEA, NCAR

22 February 2012

fijal PyPy

Page 2: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

What is this talk about?

What is PyPy and why?Numeric landscape in PythonWhat we achieved in PyPyWhere we’re going?

fijal PyPy

Page 3: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

What is PyPy?

An efficient implementation of PythonlanguageA framework for writing efficient dynamiclanguage implementationsAn open source project with a lot of volunteereffort, released under the MIT licenseAgile development, 13000 unit tests, continuousintegration, sprints, distributed teamI’ll talk today about the first part (mostly)

fijal PyPy

Page 4: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

PyPy status right now

An efficient just in time compiler for thePython languageRelatively “good’ on numerics (compared toother dynamic languages)Example - real time video processing2-300x faster on Python code

fijal PyPy

Page 5: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Why should you care?

If I write this stuff in C/fortran/assembler it’ll befaster anywaymaybe, but ...

fijal PyPy

Page 6: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Why should you care? (2)

Experimentation is importantImplementing something faster, in human time,leaves more time for optimizations andimprovementsFor novel algorithms, clearer implementationmakes them easier to evaluate (Python often iscleaner than C)

Sometimes makes it possible in the first place

fijal PyPy

Page 7: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Why should you care? (2)

Experimentation is importantImplementing something faster, in human time,leaves more time for optimizations andimprovementsFor novel algorithms, clearer implementationmakes them easier to evaluate (Python often iscleaner than C)

Sometimes makes it possible in the first place

fijal PyPy

Page 8: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Why would you care even more?

Growing communityEverything is for free with reasonable licensingThere are many smart people out thereaddressing hard problems

fijal PyPy

Page 9: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Example of why would you care

You spend a year writing optimized algorithmsfor a GPUNext year a new generation of GPUs come alongYour algorithms are no longer optimized

Alternative - express your algorithmsLeave low-level details to people who havenothing better to do

... like me (I don’t know enough Physics to dothe other part)

fijal PyPy

Page 10: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Example of why would you care

You spend a year writing optimized algorithmsfor a GPUNext year a new generation of GPUs come alongYour algorithms are no longer optimized

Alternative - express your algorithmsLeave low-level details to people who havenothing better to do

... like me (I don’t know enough Physics to dothe other part)

fijal PyPy

Page 11: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Example of why would you care

You spend a year writing optimized algorithmsfor a GPUNext year a new generation of GPUs come alongYour algorithms are no longer optimized

Alternative - express your algorithmsLeave low-level details to people who havenothing better to do

... like me (I don’t know enough Physics to dothe other part)

fijal PyPy

Page 12: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Numerics in Python

numpy - for array operationsscipy, scikits - various algorithms, alsoexposing C/fortran librariesmatplotlib - pretty picturesipython

fijal PyPy

Page 13: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

There is an entire ecosystem!

Which I don’t even know very wellPyCUDA

pandas

mayavi

fijal PyPy

Page 14: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

What’s important?

There is an entire ecosystem built by peopleIt’s available for free, no shady licensingIt’s being expandedIt’s growingIt’ll keep up with hardware advancments

fijal PyPy

Page 15: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Problems with numerics in python

Stuff is reasonably fast, but...

Only if you don’t actually write much PythonArray operations are fine as long as they’revectorizedNot everything is expressable that wayNumpy allocates intermediates for eachoperation, suboptimal

fijal PyPy

Page 16: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Problems with numerics in python

Stuff is reasonably fast, but...

Only if you don’t actually write much PythonArray operations are fine as long as they’revectorizedNot everything is expressable that wayNumpy allocates intermediates for eachoperation, suboptimal

fijal PyPy

Page 17: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Our approach

Build a tree of operationsCompile assembler specialized for aliasing andoperationsExecute the specialized assembler

fijal PyPy

Page 18: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Examples

a, b, c are single dimensional arraysa+a would generate different code than a+ba+b*c is as fast as a loop

fijal PyPy

Page 19: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Performance comparison

NumPyPyPy GCCa+b 0.6s 0.4s 0.3s

(0.25s)a+b+c 1.9s 0.5s 0.7s

(0.32s)5+ 3.2s 0.8s 1.7s

(0.51s)

Pathscale is actually slower

fijal PyPy

Page 20: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Performance comparion SSE

Branch only so far!

PyPySSE

PyPy GCC

a+b 0.3s 0.4s 0.3s(0.25s)

a+b+c 0.35s 0.5s 0.7s(0.32s)

5+ 0.36s 0.8s 1.7s(0.51s)

fijal PyPy

Page 21: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Status

This works reasonably wellFar from implementing the entire numpy,although it’s in progressAssembler generation backend needs worksVectorization in progress

fijal PyPy

Page 22: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Status benchmarks - slightly morecomplex

laplace solutionsolutions:

NumPy: 4.3slooped: too long to run ~2100s

PyPy: 1.6slooped: 2.5s

C: 0.9s

fijal PyPy

Page 23: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Progress plan

Express operations in high-level languagesLet us deal with low level details

However, retain knobs and buttons for advancedusersDon’t get penalized too much for not using them

fijal PyPy

Page 24: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Progress plan

Express operations in high-level languagesLet us deal with low level details

However, retain knobs and buttons for advancedusersDon’t get penalized too much for not using them

fijal PyPy

Page 25: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Few words about the future

Predictions are hard

Especially when it comes to futureTake this with a grain of salt

fijal PyPy

Page 26: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Few words about the future

Predictions are hard

Especially when it comes to futureTake this with a grain of salt

fijal PyPy

Page 27: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

This is just the beginning...

PyPy is an easy platform to experiment withWe did not spend a whole lot of time dealing withthe low-level optimizationsAutomatic vectorization over multiple threadsSSE, GPU, dynamic offloadingOptimizations based on machine cache sizeWe’re running a fundraiser, make your employerdonate money

fijal PyPy

Page 28: Maciej Fijałkowski - SEA · An efficient implementation of Python language A framework for writing efficient dynamic language implementations An open source project with a lot

Q&A

http://pypy.org/

http://buildbot.pypy.org/numpy-status/latest.html

http://morepypy.blogspot.com/

Any questions?

fijal PyPy