Managing Open Source software in the Docker era
Agenda
▷ Overview of the Docker platform
▷ New OSS governance challenges
▷ OSS governance solutions
▷ Automating compliance for Docker
What is Docker?
▷ A software platform that allows you to package an application with all of its dependencies into a standardized unit for software development and deployment.
▷ The concept is borrowed from shipping containers, which define a standard to ship goods globally.
Build Ship Run
Development Deployment
Why Docker now?
▷ Provides a workflow, tools and a repository system that make it very easy to create and deploy self-contained applications in a lightweight container
▷ Open source under a permissive license and well-integrated with most popular open source tools used for software development and deployment
▷ Built for the Cloud
▷ Riding on top of the virtualization wave
In most cases, applications interact directly with the OS so there is little or no overhead from the Docker Engine.
Docker Containers vs. Virtual Machines
Virtual Machines Containers
Docker basics - terminology
▷ Container - a runtime instance of a Docker image.
▷ Image - an ordered collection of root filesystem changes and execution parameters for use within a running container.
▷ Layer - a “slice” of the container filesystem; one layer is created by each Dockerfile instruction.
▷ Dockerfile - a text script that contains the commands you execute to build a Docker image.
▷ Registry - a repository of images. It can be public or private. This is a web service.
Docker images and containers
Dockerfile
Base image file
FROM xxx
ADD xyz
RUN foo
CMD bar
Running container
Docker “build”
Docker “run”
Image file
At runtime, the container sees files from the top layer down.
The orange file "masks" the red file at the same path in the lower layer. The red file is still present in the lower layer but is not visible.
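The masking rule can be sketched in a few lines of Python. This is a simplified model, not how Docker's storage drivers are implemented: each layer is a plain dictionary from path to content, and deleted files are ignored. The names `visible_files`, `lower` and `upper` are illustrative assumptions.

```python
from typing import Dict, List

Layer = Dict[str, str]  # path -> file content (a stand-in for real file data)


def visible_files(layers: List[Layer]) -> Dict[str, str]:
    """Merge layers bottom-to-top; a higher layer masks the same path below."""
    view: Dict[str, str] = {}
    for layer in layers:      # layers[0] is the bottom layer
        view.update(layer)    # paths in later (higher) layers overwrite earlier ones
    return view


lower = {"/etc/app.conf": "red", "/bin/tool": "tool-v1"}
upper = {"/etc/app.conf": "orange"}

view = visible_files([lower, upper])
print(view["/etc/app.conf"])   # the container sees the top-layer file
print(lower["/etc/app.conf"])  # the masked file still exists in the lower layer
```

For governance purposes, the second print is the important one: a masked file still ships inside the image even though the running container never sees it.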
Docker layers
Dockerfile
FROM xxx
ADD xyz
COPY baz
RUN foo
CMD bar
[Diagram: image layers numbered 0 to 7]
Each Dockerfile instruction line creates a new layer.
The FROM xxx base image and its layers become the bottom layers (0 to 3).
New layers (4 to 7) built from the remaining Dockerfile instructions become the top layers.
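A minimal Python sketch of this numbering, following the slide's model of one layer per instruction. The function name `number_layers`, the `base_layer_count` parameter and the sample Dockerfile are hypothetical; the sketch does not handle line continuations or instructions placed before FROM.

```python
from typing import List, Tuple


def number_layers(dockerfile: str, base_layer_count: int) -> List[Tuple[int, str]]:
    """Pair each instruction after FROM with the index of the layer it adds."""
    instructions = [
        line.strip()
        for line in dockerfile.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    if not instructions or not instructions[0].upper().startswith("FROM "):
        raise ValueError("expected the Dockerfile to start with FROM")
    # The base image contributes layers 0 .. base_layer_count - 1;
    # every later instruction is stacked on top of them in order.
    return [
        (base_layer_count + offset, instruction)
        for offset, instruction in enumerate(instructions[1:])
    ]


example = "FROM base\nCOPY app /app\nRUN make\n"
for index, instruction in number_layers(example, base_layer_count=2):
    print(index, instruction)
```

The base layer count is passed in because it cannot be read from the Dockerfile alone; it depends on how the base image was built.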
New OSS governance challenges
▷ By design Docker crosses traditional boundaries between development and distribution/deployment
▷ Traditional control points for distribution/deployment may not apply
○ Packaging a product for distribution
○ DevOps processes for deployment (internal or Cloud)
▷ Reuse of base images from the public Docker Hub is very common in order to get a quick start
Issues - Component traceability
▷ Creating a Docker image is similar to a traditional build process, but
○ Components may be pulled from any mix of public and private repos
○ Artifacts from Dockerfile instructions may be hidden inside the image
○ There is no build log
▷ Once an image is built, there is no easy way to trace its layers back to the Dockerfile that created them
Issues - Public Docker images
▷ Current Docker usage often starts with public images from Docker Hub
▷ Unlike a standard Linux Distro, these images are designed to be as small as possible and may be missing:
○ License and copyright notices
○ Copies of the corresponding source for Redistribution
▷ When you ship or deploy an image you are responsible for compliance for all layers included in that image
Issues - Private Docker images
▷ Dockerfile instructions to install or update packages may pull code from unexpected locations:
○ Internal or public repositories
○ Complexity compounded by dependencies
▷ There is no audit trail for what you installed with Dockerfile instructions (i.e. no build log)
▷ When you ship or deploy an image you are responsible for compliance for all layers included in that image
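One way to start building the missing audit trail is to flag the Dockerfile instructions that can bring in outside code. The Python sketch below is a rough heuristic, not a complete scanner: the list of fetch commands in `FETCH_PATTERN` is an assumption to adapt, and the function name and sample Dockerfile are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed list of commands that commonly fetch components at build time.
FETCH_PATTERN = re.compile(r"\b(apt-get|yum|apk|pip|npm|gem|curl|wget|git)\b")


def flag_external_sources(dockerfile: str) -> List[Tuple[int, str]]:
    """Return (line number, instruction) pairs that may pull in outside code."""
    flagged = []
    for number, raw in enumerate(dockerfile.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].upper()
        rest = parts[1] if len(parts) > 1 else ""
        if keyword == "FROM":
            flagged.append((number, line))   # the base image and all its layers
        elif keyword == "ADD" and re.search(r"https?://", rest):
            flagged.append((number, line))   # ADD can download from a URL
        elif keyword == "RUN" and FETCH_PATTERN.search(rest):
            flagged.append((number, line))   # package manager or download tool
    return flagged


sample = (
    "FROM debian:8\n"
    "RUN apt-get update && apt-get install -y curl\n"
    "COPY app /app\n"
    "ADD http://example.com/lib.tar.gz /opt/\n"
    'CMD ["/app/run"]\n'
)
for number, line in flag_external_sources(sample):
    print(number, line)
```

A flagged line only says where to look. Dependencies pulled in transitively by a package manager still have to be captured during the build itself.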
Key OSS governance questions
▷ Which OSS components are included in each Docker image and what are their licenses?
▷ What are my Attribution obligations?
▷ What are my Redistribution obligations?
○ You will be distributing Copyleft-licensed Linux user space packages in Docker images
○ Much more complex than with a standard Linux Distro
▷ How do I organize compliance by product/application?
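For the first question, a starting point on a Debian-based image is the package database. The Python sketch below assumes the database text has already been copied out of the image (on Debian-style systems it lives at /var/lib/dpkg/status) and lists the installed packages with their versions. The function name and sample data are illustrative. The database does not record licenses, so those still have to be looked up per package, and components installed outside the package manager will not appear at all.

```python
from typing import Dict, List


def parse_dpkg_status(text: str) -> List[Dict[str, str]]:
    """List name and version for each installed package in a dpkg status file."""
    packages = []
    for stanza in text.strip().split("\n\n"):   # stanzas are separated by blank lines
        fields: Dict[str, str] = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace; this sketch skips them.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "Package" in fields and fields.get("Status", "").endswith(" installed"):
            packages.append(
                {"name": fields["Package"], "version": fields.get("Version", "")}
            )
    return packages


# Illustrative sample data, not taken from a real image.
sample_status = (
    "Package: bash\n"
    "Status: install ok installed\n"
    "Version: 4.3-11\n"
    "Description: GNU Bourne Again SHell\n"
    " An sh-compatible command language interpreter.\n"
    "\n"
    "Package: oldpkg\n"
    "Status: deinstall ok config-files\n"
    "Version: 1.0\n"
)
print(parse_dpkg_status(sample_status))
```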
OSS Governance - Risks
▷ Your risk level for OSS compliance depends on use case(s):
○ Internal development --> Low
○ Internal deployment --> Medium
○ Cloud deployment to customers --> Higher
○ Product distribution to customers --> Highest
▷ New risks related to security vulnerabilities
○ Similar component traceability challenges
OSS Governance - Audits
▷ Rear-view software audits are no longer practical
○ More complex by an order of magnitude
○ Thousands of software components are common
○ Rate of software updates is much higher
▷ The scope of the challenge depends on:
○ The controls applied by the team that produces a Docker image,
○ and by the team that created its base FROM image,
○ all the way down to the original root filesystem.
OSS Governance solutions
▷ Update development processes
▷ Update compliance processes
▷ Update provisioning controls
▷ Instrument Docker build processes
Update development processes
▷ Update development process standards
○ Define specific standards for building and deploying Docker images
○ Apply comparable standards to any Docker images from a supplier
▷ Consider that when you distribute Docker images, you have effectively become a Linux Distro supplier
○ Or more precisely a supplier of multiple Distros
○ The best case is to minimize the size and number of these Distros
Update compliance processes
▷ Define how Attribution notices will be provided with:
○ Cloud deployment to customers
○ Product distribution to customers
▷ Define how source code for Copyleft-licensed component-versions will be collected
▷ Define expectations for supplier-provided Docker images (including from OSS projects)
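Once a component inventory exists, selecting the component-versions that need source collection can be a simple filter. The Python sketch below is illustrative only: the `COPYLEFT_PREFIXES` tuple is an assumed placeholder that your legal team would define, the license strings are assumed to be SPDX-style identifiers, and the inventory entries are sample data.

```python
from typing import Dict, List, Tuple

# Assumption: which license families trigger source collection is a policy
# decision; this tuple is only a placeholder.
COPYLEFT_PREFIXES = ("GPL-", "LGPL-", "AGPL-")


def source_to_collect(components: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return sorted (name, version) pairs whose license matches the policy."""
    return sorted(
        (component["name"], component["version"])
        for component in components
        if component["license"].startswith(COPYLEFT_PREFIXES)
    )


# Illustrative inventory, not taken from a real image.
inventory = [
    {"name": "bash", "version": "4.3-11", "license": "GPL-3.0-or-later"},
    {"name": "zlib", "version": "1.2.8", "license": "Zlib"},
    {"name": "glibc", "version": "2.19", "license": "LGPL-2.1-or-later"},
]
print(source_to_collect(inventory))
```

Keying on name and version together matters because the corresponding source must match the exact version that was shipped.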
Update provisioning controls
▷ Only use public or third-party Docker images with clear provenance and documented components
▷ Consider building your own base images
○ Limit components to those you need
○ Control the update/refresh cycle
▷ Set clear standards for how components are provisioned inside each image
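A provisioning control of this kind can be checked mechanically. The Python sketch below compares every FROM instruction against an approved list. The function name, the registry name and the approved set are hypothetical, and the check matches image references as plain strings without resolving tags or digests.

```python
from typing import List, Set


def unapproved_base_images(dockerfile: str, approved: Set[str]) -> List[str]:
    """Return base images named in FROM instructions that are not approved."""
    found = []
    for raw in dockerfile.splitlines():
        parts = raw.strip().split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Skip option flags such as --platform=... that may precede the image.
        image = next((p for p in parts[1:] if not p.startswith("--")), None)
        if image is not None and image not in approved:
            found.append(image)
    return found


# Hypothetical internal registry and image names.
approved_images = {"registry.example.com/base/debian:8"}
candidate = "FROM ubuntu:14.04\nRUN apt-get update\n"
print(unapproved_base_images(candidate, approved_images))
```

Run as part of a build pipeline, a non-empty result could fail the build before an image with unknown provenance is ever produced.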
Instrument Docker build processes
▷ Capture origin and license for components as they are added or updated during the image build
○ Use verbose logging for image builds (default)
○ Collect copies of all components as installed or ADDed
○ Document each Dockerfile instruction including why/how components are provisioned
▷ Keep as much provenance data as possible within each image
○ Do not remove existing notices or licenses
○ Use MAINTAINER and LABEL Dockerfile instructions
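To keep provenance data inside the image, the build tooling can generate LABEL instructions from whatever metadata it has captured. The Python sketch below is one possible approach for simple single-line values; the label keys under `com.example.` and the function name `render_labels` are hypothetical.

```python
import json
from typing import Dict


def render_labels(metadata: Dict[str, str]) -> str:
    """Render one LABEL instruction per metadata entry, in sorted key order."""
    lines = []
    for key in sorted(metadata):
        # json.dumps produces a double-quoted string and escapes embedded
        # quotes and backslashes; intended for single-line values only.
        value = json.dumps(metadata[key], ensure_ascii=False)
        lines.append(f"LABEL {key}={value}")
    return "\n".join(lines)


# Hypothetical label keys and values.
provenance = {
    "com.example.source-url": "https://example.com/src/app-1.0.tar.gz",
    "com.example.license": "MIT",
}
print(render_labels(provenance))
```

The generated lines can be appended to the Dockerfile before the build so that the metadata travels with every copy of the image.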
Summary
▷ Docker is a powerful new open source technology
○ Accelerates development to deployment
○ Especially well-suited for the Cloud
○ Extremely rapid adoption
▷ Like any new technology, it requires adopters to update and adapt policies, processes and tools
○ First understand the new form of old risks
○ Update policies, processes and tools before you use Docker for product deployment or distribution
○ The Docker community will likely provide solutions for the missing tools sooner rather than later
Resources
▷ Why Docker? http://blog.codeship.com/why-docker/
▷ Introduction to Docker (Twitter) https://www.youtube.com/watch?v=4W2YY-qBla0
▷ When and How to Use Docker https://youtu.be/OgiyiuqqOuk
▷ Contain Yourself (Harvard CS50 course) https://live.cs50.net/docker
Credits
Special thanks to the people who made these awesome free resources:
▷ Presentation template by SlidesCarnival
▷ Photographs by Unsplash
▷ Images from the Noun Project
○ Shipping container by Zahi Asa
○ Polaroid by Michael Stüker
○ Server by Viktor Minuvi
○ Layers by David Swanson
○ Spreadsheet by Hello Many
○ Checklist by Prasad Ghone
○ Check List by Julynn B.
○ Box by Mourad Mokrane