Conda: A Cross-Platform Package Manager for Any Binary Distribution (SciPy 2014)
Conda: A Cross-Platform Package Manager for Any Binary Distribution
Aaron Meurer
Ilan Schnell
Continuum Analytics, Inc
or,
Solving the Packaging Problem
What is the packaging problem?
History
Two sides
• Installing (the user)
• Building (the developer)
Installing
• setup.py install
• easy_install
• pip
• apt-get
• rpm
• emerge
• homebrew
• port
• fink
• …
setup.py install
• fine if it’s pure Python, not so much if it isn’t
• you have to have compilers installed
distutils.errors.DistutilsError: Setup script exited with error: command 'gcc' failed with exit status 1
setup.py install
You are your own package manager
pip
• Only works with Python
• Not so great for scientific packages that depend on big C libraries
• Try installing h5py if you don’t have HDF5
pip
You are a “self integrator”
Building
Problems
• distutils is not really designed for compiled packages
• numpy.distutils “fork”
• setuptools is overcomplicated
• import setuptools monkeypatches distutils
• Entry points require pkg_resources
• pkg_resources.DistributionNotFound: flake8==2.1.0
• Each egg adds an entry to sys.path
• import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
Package maintainers hate having packages that no one can install
What is the packaging problem?
What about wheels?
• Python package specific
• Can’t build wheels for C libraries
• Can’t make a wheel for Python itself
• Still doesn’t address the problem that some metadata is only in the package itself
• You are still a “self integrator”
System Packaging solutions
Linux
• yum (rpm)
• apt-get (dpkg)
OS X
• macports
• homebrew
• fink
Windows
• chocolatey
• npackd
Cross-platform
• conda
Conda
• System-level package manager (Python agnostic)
• Python, hdf5, and h5py are all conda packages
• Cross platform (works on Windows, OS X, and Linux)
• Doesn’t require administrator privileges
• Installs binaries (no more compiler woes)
• Metadata stored separately in the repository index
• Uses a SAT solver to resolve dependencies before packages are installed
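To make "resolve dependencies before packages are installed" concrete, here is a minimal sketch in Python. It is not conda's solver: it replaces the SAT solver with brute-force enumeration over a tiny, made-up index, and the package versions and dependency sets in INDEX are assumptions for illustration only. The point it shows is that a complete, consistent set of packages is chosen up front, or the request is rejected, before anything touches the disk.

```python
from itertools import product

# Hypothetical toy index (not real repository data):
# package name -> {version: {dependency name: set of acceptable versions}}
INDEX = {
    "h5py": {"2.3": {"hdf5": {"1.8"}, "python": {"2.7", "3.4"}}},
    "hdf5": {"1.8": {}},
    "python": {"2.7": {}, "3.4": {}},
    "sympy": {"0.7.5": {"python": {"3.4"}}},
}


def resolve(requested):
    """Return a smallest consistent {name: version} set containing every
    requested package, or None if no consistent set exists."""
    names = sorted(INDEX)
    # Each package is either absent (None) or present at exactly one version.
    choices = [[None] + sorted(INDEX[name], reverse=True) for name in names]
    best = None
    for combo in product(*choices):
        selected = {n: v for n, v in zip(names, combo) if v is not None}
        if not all(r in selected for r in requested):
            continue
        # Every dependency of every selected package must itself be selected
        # at one of the versions that package accepts.
        consistent = all(
            dep in selected and selected[dep] in allowed
            for name, version in selected.items()
            for dep, allowed in INDEX[name][version].items()
        )
        if consistent and (best is None or len(selected) < len(best)):
            best = selected
    return best
```

Asking for both h5py and sympy forces Python 3.4 in this toy index, because that is the only Python version both accept. A request for a package that is not in the index returns None instead of a half-finished install.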
Basic conda usage
Install a package              conda install sympy
List all installed packages    conda list
Search for packages            conda search llvm
Create a new environment       conda create -n py3k python=3
Remove a package               conda remove nose
Get help                       conda install --help
Advanced usage
Install a package in an environment          conda install -n py3k sympy
Update all packages                          conda update --all
Export list of packages                      conda list --export > packages.txt
Install packages from an export              conda install --file packages.txt
See package history                          conda list --revisions
Revert to a revision                         conda install --revision 23
Remove unused packages and cached tarballs   conda clean -pt
What is a conda package?
Just a tar.bz2 file with the files from the package, and some metadata
/lib
/include
/bin
/man
/info
    files
    index.json
Files are not Python specific.
Any kind of program at all can be a conda package.
Metadata is static.
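Since a package is just a tar.bz2 archive with an info directory, one can be assembled with nothing but the Python standard library. The following is a rough sketch, not conda build: the function name write_package is hypothetical, and the metadata keys in the usage note are illustrative. The full set of required fields is in the package specification.

```python
import io
import json
import tarfile


def write_package(path, files, index):
    """Write a tar.bz2 archive holding `files` plus static metadata.

    files: {archive path: bytes}, e.g. {"bin/hello": b"..."}
    index: dict stored as info/index.json
    """
    members = dict(files)
    # info/files lists the payload paths, one per line.
    members["info/files"] = ("\n".join(sorted(files)) + "\n").encode()
    members["info/index.json"] = json.dumps(index, indent=2, sort_keys=True).encode()
    with tarfile.open(path, "w:bz2") as tar:
        for name, data in sorted(members.items()):
            entry = tarfile.TarInfo(name)
            entry.size = len(data)
            tar.addfile(entry, io.BytesIO(data))
```

For example, write_package("hello-1.0-0.tar.bz2", {"bin/hello": b"#!/bin/sh\necho hello\n"}, {"name": "hello", "version": "1.0", "build": "0", "depends": []}) produces an archive whose payload is a shell script. Nothing in the format assumes Python.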
Python Agnostic
• A conda package can be anything
• Python packages
• Python itself
• C libraries (GDAL, netCDF4, dynd, …)
• R
• Node JS
• Perl
Installation
• The tarball is unarchived in the pkgs directory
• Files are hard-linked to the install path
• Shebang lines and other instances of a place-holder prefix are replaced with the install prefix
• The metadata is updated, so that conda knows that it is installed
• post-link script is run (these are rare)
And that’s it
conda install sympy
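The linking and prefix-replacement steps can be sketched in a few lines of Python. This is a simplified illustration, not conda's code: PLACEHOLDER is a made-up value, link_package is a hypothetical name, and the caller is assumed to know which files contain the placeholder. Those files are written as fresh copies in this sketch, because rewriting a hard-linked file in place would also change the copy in the pkgs directory.

```python
import os

# Hypothetical placeholder; the real build prefix used by conda differs.
PLACEHOLDER = "/opt/placeholder_prefix"


def link_package(pkg_dir, prefix, text_files=()):
    """Hard-link every file under pkg_dir into prefix.

    Files whose relative path (with "/" separators) is in text_files are
    copied with PLACEHOLDER replaced by prefix instead of being linked.
    """
    for root, _dirs, files in os.walk(pkg_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, pkg_dir)
            dst = os.path.join(prefix, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            if rel.replace(os.sep, "/") in text_files:
                with open(src) as f:
                    text = f.read()
                with open(dst, "w") as f:
                    f.write(text.replace(PLACEHOLDER, prefix))
            else:
                # Same inode, no data copied: cheap and fast.
                os.link(src, dst)
```

Hard links only work within a single filesystem, so this sketch assumes pkg_dir and prefix live on the same one.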
Environments
• Environments are simple: just link the package to a different directory
• Hard-links are very cheap, and very fast
• Conda environments are completely independent installations of everything
• No fiddling with PYTHONPATH or symlinking site-packages
• “Activating” an environment just means changing your PATH so that its bin/ or Scripts/ comes first.
conda create -n py3k python=3.4
• Unix: source activate py3k
• Windows: activate py3k
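Activation as described above amounts to a single string operation on PATH. Here is a small sketch of that idea in Python. The function name activate and the example paths are assumptions, and the real activate scripts are shell and batch scripts that also handle deactivation.

```python
import ntpath
import posixpath


def activate(env_prefix, path, windows=False):
    """Return a PATH value with the environment's bin/ (Unix) or
    Scripts/ (Windows) directory moved to the front."""
    pathmod = ntpath if windows else posixpath
    sep = ";" if windows else ":"
    bin_dir = pathmod.join(env_prefix, "Scripts" if windows else "bin")
    # Drop empty entries and any earlier occurrence of the same directory.
    rest = [p for p in path.split(sep) if p and p != bin_dir]
    return sep.join([bin_dir] + rest)
```

With a hypothetical environment at /home/me/envs/py3k, activate("/home/me/envs/py3k", "/usr/bin:/bin") returns "/home/me/envs/py3k/bin:/usr/bin:/bin", so the environment's python is found first.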
Environments
/pkgs
    /python-3.4.1-0
        /bin/python
    /sympy-0.7.5-0
        /bin/isympy
        /lib/python3.4/site-packages/sympy
/envs
    /sympy-env
        /bin/python
        /bin/isympy
        /lib/python3.4/site-packages/sympy
    /test
        /bin/python
(Files under /envs are hard links to the files under /pkgs.)
Environments
Uses:
• Testing (python 2.6, 2.7, 3.3)
• Development
• Trying new packages from PyPI
• Separating deployed apps with different dependency needs
• Trying new versions of Python
• Reproducible science
Building
Conda Recipes
• meta.yaml contains metadata
• build.sh is the build script for Unix and bld.bat is the build script for Windows
Recipe files:
meta.yaml
build.sh
bld.bat
Optional:
fix.patch
run_test.py
post-link.sh
conda build path/to/recipe/
Example meta.yaml
Conda Recipes
• Lots more
• Command line entry points
• Fine-grained control over conda’s relocation logic
• Inequalities for versions of dependencies (like >=1.2,<2.0)
• “Preprocessing selectors” allow using the same meta.yaml for many platforms
• See http://conda.pydata.org/docs/build.html for full documentation
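A version inequality such as >=1.2,<2.0 reads as "every comma-separated clause must hold". The sketch below shows that reading in Python under simplifying assumptions: versions are purely numeric and dot-separated, and the helper names are hypothetical. Conda's real version matching handles more forms than this.

```python
import operator
import re

OPS = {
    ">=": operator.ge, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">": operator.gt, "<": operator.lt,
}


def parse_version(text):
    """Turn "1.10.3" into (1, 10, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def _pad(a, b):
    """Right-pad the shorter tuple with zeros so that 2 == 2.0."""
    n = max(len(a), len(b))
    return a + (0,) * (n - len(a)), b + (0,) * (n - len(b))


def matches(version, spec):
    """True if `version` satisfies every comma-separated clause in `spec`."""
    v = parse_version(version)
    for clause in spec.split(","):
        m = re.fullmatch(r"\s*(>=|<=|==|!=|>|<)\s*([\d.]+)\s*", clause)
        if m is None:
            raise ValueError("unsupported clause: %r" % clause)
        op, bound = m.groups()
        if not OPS[op](*_pad(v, parse_version(bound))):
            return False
    return True
```

Comparing tuples of integers rather than strings is what makes 1.10 sort after 1.2, which plain string comparison would get wrong.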
conda build path/to/recipe/
• conda build is only a convenient wrapper
• You can also build packages manually just by following the package specification (http://conda.pydata.org/docs/spec.html)
Sharing
• Once you have a conda package, the easiest way to share it is to upload it to Binstar
• Others can install your package with
conda install -c binstar_username package
• Or add your channel to their configuration with
conda config --add channels binstar_username
Self Hosting
• You can also self-host
• Store packages in a directory by platform (osx-64, linux-32, linux-64, win-32, win-64)
• Run conda index on that directory to generate the repodata.json
• Serve this up, or use a file:// url as a channel
• Binstar is just a very convenient hosted wrapper around conda index
conda index directory/osx-64
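What the index step does can be approximated in Python: collect the static metadata from every package in a platform directory into one JSON file. This is a sketch under assumptions, not the conda index implementation. The function names are hypothetical and the real repodata.json carries additional fields beyond each package's index.json.

```python
import json
import os
import tarfile


def build_repodata(platform_dir):
    """Map each *.tar.bz2 filename in platform_dir to its info/index.json."""
    packages = {}
    for filename in sorted(os.listdir(platform_dir)):
        if not filename.endswith(".tar.bz2"):
            continue
        archive = os.path.join(platform_dir, filename)
        with tarfile.open(archive, "r:bz2") as tar:
            packages[filename] = json.load(tar.extractfile("info/index.json"))
    return {"packages": packages}


def write_repodata(platform_dir):
    """Write the collected metadata to platform_dir/repodata.json."""
    repodata = build_repodata(platform_dir)
    with open(os.path.join(platform_dir, "repodata.json"), "w") as f:
        json.dump(repodata, f, indent=2, sort_keys=True)
    return repodata
```

Because the metadata sits in one file next to the packages, a client can work out what to install by downloading only that file, without opening any package.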
Final words
• conda is completely open source (BSD) https://github.com/conda/conda
• We have a mailing list ([email protected])
• A big thanks to Continuum for paying me to work on open source
Thanks!
Sean Ross-Ross (principal binstar.org developer)
Bryan Van de Ven (original conda author)
Ilan Schnell (principal conda developer)
Travis Oliphant (Continuum CEO)