Conda: A Cross-Platform Package Manager for Any Binary Distribution (SciPy 2014)


Transcript of Conda: A Cross-Platform Package Manager for Any Binary Distribution (SciPy 2014)

Page 1: Conda: A Cross-Platform Package Manager for Any Binary Distribution (SciPy 2014)

Conda: A Cross-Platform Package Manager for Any Binary Distribution

Aaron Meurer

Ilan Schnell

Continuum Analytics, Inc

Page 2:

or, Solving the Packaging Problem

Page 3:

What is the packaging problem?

Page 4:

History

Page 5:

Two sides

• Installing

• Building

Page 6:

Two sides

• Installing (user)

• Building (developer)

Page 7:

Installing

• setup.py install

• easy_install

• pip

• apt-get

• rpm

• emerge

• homebrew

• port

• fink

• …

Page 8:

setup.py install

• fine if it’s pure Python, not so much if it isn’t

• you have to have compilers installed

distutils.errors.DistutilsError: Setup script exited with error: command 'gcc' failed with exit status 1

Page 9:

setup.py install

You are your own package manager

Page 10:

pip

• Only works with Python

• Not so great for scientific packages that depend on big C libraries

• Try installing h5py if you don’t have HDF5

Page 11:

pip

You are a “self integrator”

Page 12:

Building

Page 13:

Problems

• distutils is not really designed for compiled packages

• numpy.distutils “fork”

• setuptools is overcomplicated

• import setuptools monkeypatches distutils

• Entry points require pkg_resources

• pkg_resources.DistributionNotFound: flake8==2.1.0

• Each egg adds an entry to sys.path

• import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)

Page 14:

Package maintainers hate having packages that no one can install

Page 15:

What is the packaging problem?

Page 16:
Page 17:

What about wheels?

• Python package specific

• Can’t build wheels for C libraries

• Can’t make a wheel for Python itself

• Still doesn’t address the problem that some metadata is only in the package itself

• You are still a “self integrator”

Page 18:

System Packaging solutions

• Linux: yum (rpm), apt-get (dpkg)

• OS X: macports, homebrew, fink

• Windows: chocolatey, npackd

Page 19:

System Packaging solutions

• Linux: yum (rpm), apt-get (dpkg)

• OS X: macports, homebrew, fink

• Windows: chocolatey, npackd

• Cross-platform: conda

Page 20:

Conda

• System level package manager (Python agnostic)

• Python, hdf5, and h5py are all conda packages

• Cross platform (works on Windows, OS X, and Linux)

• Doesn’t require administrator privileges

• Installs binaries (no more compiler woes)

• Metadata stored separately in the repository index

• Uses a SAT solver to resolve dependencies before packages are installed
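To give a feel for what treating dependency resolution as a satisfiability problem means, here is a minimal sketch. It is a brute-force search, not conda's actual solver, and the index contents (package names, versions, requirements) are hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical repository index: name -> version -> requirements, where
# requirements map a dependency name to its set of acceptable versions.
INDEX = {
    "python": {"3.4": {}, "2.7": {}},
    "hdf5": {"1.8.13": {}, "1.8.9": {}},
    "h5py": {"2.3": {"python": {"3.4", "2.7"}, "hdf5": {"1.8.13"}}},
}


def is_consistent(assignment, requested, index):
    """Check one candidate assignment of name -> version (or None)."""
    # Every requested package must be present at an acceptable version.
    for name, allowed in requested.items():
        version = assignment.get(name)
        if version is None or (allowed is not None and version not in allowed):
            return False
    # Every present package must have all of its dependencies met.
    for name, version in assignment.items():
        if version is None:
            continue
        for dep, allowed in index[name][version].items():
            if assignment.get(dep) not in allowed:
                return False
    return True


def solve(requested, index=INDEX):
    """Return the first consistent {name: version} mapping, or None."""
    names = list(index)
    # Each package is either present at one version or absent (None).
    choices = [list(index[name]) + [None] for name in names]
    for candidate in product(*choices):
        assignment = dict(zip(names, candidate))
        if is_consistent(assignment, requested, index):
            return {n: v for n, v in assignment.items() if v is not None}
    return None


any_python = solve({"h5py": None})
pinned_python = solve({"h5py": None, "python": {"2.7"}})
conflict = solve({"h5py": None, "hdf5": {"1.8.9"}})
```

The point of solving before installing is visible in the last call: the conflicting request is rejected as a whole (`conflict` is `None`) instead of failing halfway through an installation. A real SAT solver reaches the same kind of answer without enumerating every combination.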

Page 21:

Basic conda usage

• Install a package: conda install sympy

• List all installed packages: conda list

• Search for packages: conda search llvm

• Create a new environment: conda create -n py3k python=3

• Remove a package: conda remove nose

• Get help: conda install --help

Page 22:

Advanced usage

• Install a package in an environment: conda install -n py3k sympy

• Update all packages: conda update --all

• Export list of packages: conda list --export > packages.txt

• Install packages from an export: conda install --file packages.txt

• See package history: conda list --revisions

• Revert to a revision: conda install --revision 23

• Remove unused packages and cached tarballs: conda clean -pt

Page 23:

What is a conda package?

Page 24:

What is a conda package?

Just a tar.bz2 file with the files from the package, and some metadata

/lib

/include

/bin

/man

/info
    files
    index.json

Page 25:

What is a conda package?

Just a tar.bz2 file with the files from the package, and some metadata

/lib

/include

/bin

/man

/info
    files
    index.json

Files are not Python specific.

Any kind of program at all can be a conda package.

Metadata is static.
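Because the metadata is a static file inside the archive, it can be read without running any code from the package. A minimal sketch using only the standard library follows; the demo package is built in memory, and its name, version, and file contents are hypothetical.

```python
import io
import json
import tarfile


def read_index(package_file):
    """Return the parsed info/index.json from a conda-style tar.bz2."""
    with tarfile.open(fileobj=package_file, mode="r:bz2") as tar:
        member = tar.extractfile("info/index.json")
        return json.load(member)


def make_demo_package():
    """Build a tiny in-memory tar.bz2 laid out like the slide describes."""
    # Hypothetical metadata, for illustration only.
    index = {"name": "demo", "version": "1.0", "build": "0", "depends": []}
    files = {
        "info/index.json": json.dumps(index).encode(),
        "info/files": b"bin/demo\n",
        "bin/demo": b"#!/bin/sh\necho demo\n",
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:bz2") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


metadata = read_index(make_demo_package())
print(metadata["name"], metadata["version"])
```

Contrast this with setup.py, where some metadata is only available by executing the package's own code.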

Page 26:

Python Agnostic

• A conda package can be anything

• Python packages

• Python itself

• C libraries (GDAL, netCDF4, dynd, …)

• R

• Node JS

• Perl

Page 27:

Installation

• The tarball is unarchived in the pkgs directory

• Files are hard-linked to the install path

• Shebang lines and other instances of a place-holder prefix are replaced with the install prefix

• The metadata is updated, so that conda knows that it is installed

• post-link script is run (these are rare)

And that’s it

conda install sympy
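The linking and prefix-replacement steps can be sketched in a few lines. This is an illustration under stated assumptions, not conda's implementation: `PLACEHOLDER` is a made-up value, the package is assumed to be already unarchived, and the sketch chooses to write a patched copy of any file that embeds the prefix so that the shared file in the pkgs directory stays untouched.

```python
import os
import tempfile
from pathlib import Path

PLACEHOLDER = "/placeholder/prefix"  # hypothetical, not conda's real value


def link_package(extracted_dir, prefix):
    """Link every file of an unarchived package into an install prefix."""
    extracted_dir, prefix = Path(extracted_dir), Path(prefix)
    for source in sorted(p for p in extracted_dir.rglob("*") if p.is_file()):
        target = prefix / source.relative_to(extracted_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        data = source.read_bytes()
        if PLACEHOLDER.encode() in data:
            # The file embeds the prefix: write a patched private copy.
            patched = data.replace(PLACEHOLDER.encode(), str(prefix).encode())
            target.write_bytes(patched)
        else:
            os.link(source, target)  # hard link: no file data is copied


with tempfile.TemporaryDirectory() as root:
    pkg = Path(root, "pkgs", "demo-1.0-0")
    (pkg / "bin").mkdir(parents=True)
    (pkg / "lib").mkdir()
    (pkg / "bin" / "demo").write_text("#!" + PLACEHOLDER + "/bin/python\n")
    (pkg / "lib" / "data.txt").write_text("payload\n")

    env = Path(root, "envs", "test")
    link_package(pkg, env)

    shebang = (env / "bin" / "demo").read_text()
    same_file = os.path.samefile(pkg / "lib" / "data.txt",
                                 env / "lib" / "data.txt")
```

After the call, `same_file` is true because both paths name one file on disk, and `shebang` points at the environment's own interpreter path.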

Page 28:

Installation

And that’s it

conda install sympy

Page 29:

Environments

• Environments are simple: just link the package to a different directory

• Hard-links are very cheap, and very fast

• Conda environments are completely independent installations of everything

• No fiddling with PYTHONPATH or symlinking site-packages

• “Activating” an environment just means changing your PATH so that its bin/ or Scripts/ comes first.

conda create -n py3k python=3.4

• Unix: source activate py3k

• Windows: activate py3k
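Since activation is only a PATH change, its effect can be sketched as a pure function. The function name and the example paths below are hypothetical; the real activate scripts are shell and batch files.

```python
def activate(path, env_prefix, windows=False):
    """Return PATH with the environment's bin/ (or Scripts/) put first."""
    pathsep = ";" if windows else ":"
    subdir = "\\Scripts" if windows else "/bin"
    env_bin = env_prefix.rstrip("/\\") + subdir
    # Drop any earlier copy so the entry is not duplicated.
    rest = [p for p in path.split(pathsep) if p and p != env_bin]
    return pathsep.join([env_bin] + rest)


unix_path = activate("/usr/local/bin:/usr/bin",
                     "/home/user/anaconda/envs/py3k")
windows_path = activate("C:\\Windows", "C:\\Anaconda\\envs\\py3k",
                        windows=True)
print(unix_path)
```

With the environment's directory first, a bare `python` resolves to the environment's interpreter, and nothing else about the system is modified.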

Page 30:

Environments

/pkgs
    /python-3.4.1-0
        /bin/python
    /sympy-0.7.5-0
        /bin/isympy
        /lib/python3.4/site-packages/sympy

/envs
    /sympy-env
        /bin/python
        /bin/isympy
        /lib/python3.4/site-packages/sympy
    /test
        /bin/python

Hard links connect the files under /envs to the files under /pkgs.

Page 31:

Environments

Uses:

• Testing (python 2.6, 2.7, 3.3)

• Development

• Trying new packages from PyPI

• Separating deployed apps with different dependency needs

• Trying new versions of Python

• Reproducible science

Page 32:

Building

Page 33:

Conda Recipes

• meta.yaml contains metadata

• build.sh is the build script for Unix and bld.bat is the build script for Windows

Recipe files: meta.yaml, build.sh, bld.bat

Optional files: fix.patch, run_test.py, post-link.sh

conda build path/to/recipe/

Page 34:

Example meta.yaml

Page 35:

Conda Recipes

• Lots more

• Command line entry points

• Fine-grained control over conda’s relocation logic

• Inequalities for versions of dependencies (like >=1.2,<2.0)

• “Preprocessing selectors” allow using the same meta.yaml for many platforms

• See http://conda.pydata.org/docs/build.html for full documentation

conda build path/to/recipe/
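To show what a version inequality such as >=1.2,<2.0 expresses, here is a minimal sketch of a matcher. It assumes purely numeric dotted versions and is not conda's own version-matching code; the function names are illustrative.

```python
import operator
import re

OPERATORS = {
    ">=": operator.ge, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">": operator.gt, "<": operator.lt,
}
CLAUSE = re.compile(r"\s*(>=|<=|==|!=|>|<)\s*(\d+(?:\.\d+)*)\s*")


def parse_version(text):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def pad(a, b):
    """Pad both tuples with zeros so that 1.2 and 1.2.0 compare equal."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def matches(version, spec):
    """Check a version against a comma-separated spec like '>=1.2,<2.0'."""
    for clause in spec.split(","):
        found = CLAUSE.fullmatch(clause)
        if found is None:
            raise ValueError("unsupported clause: %r" % clause)
        op, bound = found.groups()
        left, right = pad(parse_version(version), parse_version(bound))
        if not OPERATORS[op](left, right):
            return False  # every clause must hold
    return True


print(matches("1.5", ">=1.2,<2.0"), matches("2.0", ">=1.2,<2.0"))
```

Comparing integer tuples rather than strings matters: as text, "1.10" sorts before "1.2", but as a version it is newer.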

Page 36:

• conda build is only a convenient wrapper

• You can also build packages manually just by following the package specification (http://conda.pydata.org/docs/spec.html)

Page 37:

Sharing

• Once you have a conda package, the easiest way to share it is to upload it to Binstar

• Others can install your package with

conda install -c binstar_username package

• Or add your channel to their configuration with

conda config --add channels binstar_username

Page 38:

Self Hosting

• You can also self-host

• Store packages in a directory by platform (osx-64, linux-32, linux-64, win-32, win-64)

• Run conda index on that directory to generate the repodata.json

• Serve this up, or use a file:// url as a channel

• Binstar is just a very convenient hosted wrapper around conda index

conda index directory/osx-64
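A self-hosted channel is therefore just a directory tree with one repodata.json per platform. The sketch below walks such a tree and lists what each platform index contains. The index is written by hand here in place of running conda index, and the package entry is hypothetical; the sketch assumes the index keeps its entries under a "packages" key keyed by file name.

```python
import json
import tempfile
from pathlib import Path


def list_channel(channel_dir):
    """Map each platform subdirectory to the (name, version) pairs it indexes."""
    listing = {}
    for repodata in sorted(Path(channel_dir).glob("*/repodata.json")):
        packages = json.loads(repodata.read_text())["packages"]
        listing[repodata.parent.name] = sorted(
            (meta["name"], meta["version"]) for meta in packages.values()
        )
    return listing


with tempfile.TemporaryDirectory() as channel:
    platform_dir = Path(channel, "osx-64")
    platform_dir.mkdir()
    # Hypothetical index content, for illustration only.
    index = {
        "packages": {
            "demo-1.0-0.tar.bz2": {
                "name": "demo", "version": "1.0", "build": "0", "depends": [],
            }
        }
    }
    (platform_dir / "repodata.json").write_text(json.dumps(index))

    listing = list_channel(channel)
    # A local directory can be named as a channel through a file:// URL.
    channel_url = Path(channel).resolve().as_uri()

print(listing)
```

The resulting `channel_url` is the kind of file:// value that can be given as a channel.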

Page 39:

Final words

• conda is completely open source (BSD) https://github.com/conda/conda

• We have a mailing list ([email protected])

• A big thanks to Continuum for paying me to work on open source

Page 40:

Thanks!

Sean Ross-Ross (principal binstar.org developer)

Bryan Van de Ven (original conda author)

Ilan Schnell (principal conda developer)

Travis Oliphant (Continuum CEO)