AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)

Marshall Tappen and Ernesto Gonzalez

Amazon Fulfillment Technologies

November 30, 2016

What to Expect from the Session

• A description of how Amazon Fulfillment Technologies has used computer vision to improve our processes.

• A walk-through of how we combined deep learning and traditional computer vision to automate an industrial process.

• What are the challenges and opportunities created by deep learning classifiers?

Overview of fulfillment process

One thing you have to understand about fulfillment centers

Bins can hold anything

Misplaced inventory “disappears”

An associate rearranged inventory while picking items.

We call this an inventory defect

Items fall out of pods

Our solution: use computer vision to locate inventory defects

First step: get a physical system to capture images

[Figure: station layout. Labels: Station, Outbound, Inbound frame, Totes and conveyance]

Capture a set of images as the pod arrives at the station

[Figure labels: Arrival Image, Departure Image, Tower, Station]

Associate interacts with pod

Photographed again as pod leaves

General strategy

• We want to take advantage of deep learning.

• The cameras capture images of an entire pod, but we need data at the bin level.

• We will have a two-step process:

1. Extracting bins from images

2. Analyzing bin images

Computer vision step 1: pod image to bin images

No problem, use 2-D barcodes!

Bands block the barcodes

Solution: if we can detect the trays

And we can detect the sides

We have a set of points to match with a recipe of the pod’s geometry

Map the coordinate system of the database to the face of the pod in the image

Detecting the side of a pod: downsample image and convert to grayscale

2046 × 2046 image → 512 × 512 image
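The downsample-and-grayscale step above can be sketched in plain NumPy. This is a minimal illustration, not the production code: the function names are hypothetical, the luma weights are the common BT.601 values, RGB channel order is assumed, and nearest-neighbor sampling stands in for whatever resampling the real system uses.

```python
import numpy as np

def downsample(image, out_size=512):
    """Resize the first two axes to out_size x out_size by nearest-neighbor sampling."""
    rows = np.linspace(0, image.shape[0] - 1, out_size).round().astype(int)
    cols = np.linspace(0, image.shape[1] - 1, out_size).round().astype(int)
    return image[np.ix_(rows, cols)]

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array to an H x W float array (BT.601 luma weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights

# Stand-in for a captured pod image (random pixels, same size as on the slide).
rng = np.random.default_rng(0)
pod_image = rng.integers(0, 256, size=(2046, 2046, 3), dtype=np.uint8)

# Downsample first so the grayscale conversion touches fewer pixels.
small_gray = to_grayscale(downsample(pod_image))
print(small_gray.shape)
```

In an OpenCV pipeline, `cv2.resize` with `cv2.INTER_AREA` would be the usual choice for shrinking, since it averages pixels instead of skipping them.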

Correlate* with left rail template

Filter

* In practice, we use normalized cross-correlation
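As a rough sketch of the correlation step, here is a direct NumPy version of zero-mean normalized cross-correlation. The zero-mean variant is an assumption (the slides only say "normalized cross-correlation"), and the function name and test data are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def normalized_cross_correlation(image, template):
    """Return the score map of a 2-D template slid over a 2-D image.

    Scores lie in [-1, 1]; 1 means the window matches the template up to
    brightness and contrast.
    """
    image = image.astype(np.float64)
    template = template.astype(np.float64)
    # One window per valid template position: shape (H-h+1, W-w+1, h, w).
    windows = sliding_window_view(image, template.shape)
    win_zero = windows - windows.mean(axis=(2, 3), keepdims=True)
    tmpl_zero = template - template.mean()
    numerator = (win_zero * tmpl_zero).sum(axis=(2, 3))
    denominator = np.sqrt((win_zero ** 2).sum(axis=(2, 3)) * (tmpl_zero ** 2).sum())
    # Flat windows have zero variance; give them a score of 0.
    return np.divide(numerator, denominator,
                     out=np.zeros_like(numerator), where=denominator > 0)

# Toy check: cut a patch out of a random image and find it again.
rng = np.random.default_rng(1)
image = rng.random((40, 40))
template = image[12:20, 7:12].copy()
scores = normalized_cross_correlation(image, template)
peak = np.unravel_index(np.argmax(scores), scores.shape)
print(peak)
```

This version builds every window explicitly, so it is only practical for small images. OpenCV's `cv2.matchTemplate` with `cv2.TM_CCOEFF_NORMED` computes the same kind of score far more efficiently.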

Threshold

Fit a line (similar process for the other side)
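The threshold and line-fit steps might look like the following. It is a sketch under assumptions: the threshold value and the synthetic score map are invented, and a plain least-squares fit is used where a production system might prefer something more robust to outliers.

```python
import numpy as np

def fit_rail_line(scores, threshold):
    """Fit x = slope * y + intercept through score-map pixels above threshold.

    The rail is close to vertical, so x is modeled as a function of y;
    fitting y as a function of x would be ill-conditioned.
    """
    ys, xs = np.nonzero(scores > threshold)
    if len(ys) < 2:
        raise ValueError("not enough pixels above threshold to fit a line")
    slope, intercept = np.polyfit(ys, xs, deg=1)
    return float(slope), float(intercept)

# Synthetic score map: strong responses along a slightly tilted rail.
scores = np.zeros((100, 100))
for y in range(100):
    scores[y, int(round(0.1 * y + 20))] = 1.0

slope, intercept = fit_rail_line(scores, threshold=0.5)
print(slope, intercept)
```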

We can detect trays in the same way

Now we have locations to tie the virtual template to the image!

The transformation between image and pod physical coordinates is called a homography
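To make the homography concrete, here is a small sketch that estimates one from four point correspondences with the direct linear transform and then maps a bin's corners into the image. All coordinates are invented for illustration, and the function names are hypothetical.

```python
import numpy as np

def estimate_homography(src_pts, dst_pts):
    """Estimate the 3x3 homography mapping src_pts to dst_pts (4 or more pairs)."""
    rows = []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # Each correspondence contributes two linear constraints on the 9 entries.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def apply_homography(h, pts):
    """Map N x 2 points through h using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=np.float64)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical pod-face corners in physical units and where they appear in the image.
pod_corners = [(0, 0), (100, 0), (100, 200), (0, 200)]
image_corners = [(50, 40), (460, 60), (440, 480), (70, 470)]
H = estimate_homography(pod_corners, image_corners)

# Boundary of one hypothetical bin, given in pod coordinates.
bin_in_pod = [(10, 20), (45, 20), (45, 50), (10, 50)]
bin_in_image = apply_homography(H, bin_in_pod)
print(bin_in_image.round(1))
```

With OpenCV available, `cv2.findHomography` performs the estimation and can also reject outlier points.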

We can verify that it works by calculating the boundary of each bin in the image and coloring it in.

How can we use computer vision?

• Automatic identification of every item? (TOO HARD)

• Automatic counting of every item?

What does computer vision need to tell us?

• Automatic identification of every item? (TOO HARD)

• Automatic counting of every item? (TOO HARD)

Instead, we can look for changes

[Figure: bin images inbound to the station and outbound from the station]

Our first attempt was with hand-engineered computer vision

It’s hard!

Must be robust to items rolling or shuffling inside the bin, illumination changes, specularity, etc.

The big insight

• We realized our problem was just binary classification.

• Two images in, one label out.

• Why not try this deep-learning thing?

We did the simplest thing possible

• Take the first image, convert it to grayscale, and put it in the red channel of a new image

• Take the second image and put it in the blue channel

• Now, we have a single image to pass to the neural network
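The channel-stacking trick described above can be sketched as follows. Several details are assumptions not stated on the slides: the second image is also converted to grayscale, the green channel is left at zero, and the arrays are in RGB channel order. The function names are hypothetical.

```python
import numpy as np

def to_gray_uint8(rgb):
    """Convert an H x W x 3 RGB array to an H x W uint8 grayscale array."""
    gray = rgb.astype(np.float64) @ np.array([0.299, 0.587, 0.114])
    return np.clip(gray.round(), 0, 255).astype(np.uint8)

def stack_before_after(before_rgb, after_rgb):
    """Pack two bin images into one 3-channel image for a standard CNN."""
    if before_rgb.shape != after_rgb.shape:
        raise ValueError("before and after images must have the same shape")
    combined = np.zeros(before_rgb.shape[:2] + (3,), dtype=np.uint8)
    combined[..., 0] = to_gray_uint8(before_rgb)  # red channel: first image
    combined[..., 2] = to_gray_uint8(after_rgb)   # blue channel: second image
    return combined                               # green channel stays zero (assumption)

# Toy inputs: an all-white "before" image and an all-black "after" image.
before = np.full((4, 4, 3), 255, dtype=np.uint8)
after = np.zeros((4, 4, 3), dtype=np.uint8)
pair = stack_before_after(before, after)
print(pair[0, 0])
```

Packing the pair this way lets an off-the-shelf image classifier take two images in and give one label out without any change to its input layer.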

It worked great!

[Chart labels: Best Hand-Engineered Model, CIFAR CNN, Krizhevsky’s CNN]

Processing pipeline

Pod Image → Bin Extraction → Bin Images → Defect Detection

Implementation details

• Implemented in OpenCV in Python

• C++ extensions for some steps

• Neural net uses Caffe

• Trained on G2 instances

• Runs on CPU in FC server room

• Can tolerate latency in our current usage pattern

Software architecture

[Architecture diagram spanning the site server room and AWS. Labels: Inventory Correlator Service; Remote Website (Defect Detection); Inventory Bin Count Elimination; Get Bin Defect Result; Get Bin Space Available; Capture Router; Extraction Process; Storage Service; Pod Face Images; Put Bin Images; Get Pod Images; Camera Controller; File Pusher; Barcode Extraction; Device(s); Get Bin; Applications; DynamoDB; Get Work for Remote]

Counting

• Automatic identification of every item? (TOO HARD)

• Automatic counting of every item?

Could we just count the number of items in the bin?

• At this point, we have lots of data.

• Some of it has errors from inventory defects, but networks have proven resilient to this kind of thing.

• Why not just train a network to directly count the items in a bin?

Using a convolutional neural network

• We used the Caffe implementation of GoogLeNet [1]

[1] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

Maps cleanly onto classification paradigm

• Treat it as a multi-class classification problem

[Figure label: Neural Network]
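Treating counting as multi-class classification means each possible count is its own class and the network's final scores are read off with a softmax. The sketch below shows only that last step. The scores are made up, and the cap on the number of classes is a hypothetical choice, not something stated in the talk.

```python
import numpy as np

MAX_COUNT = 5  # hypothetical cap: classes are "0 items" through "5 items"

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def predict_count(logits):
    """Return (predicted item count, probability assigned to that count)."""
    probs = softmax(np.asarray(logits, dtype=np.float64))
    count = int(np.argmax(probs))
    return count, float(probs[count])

# Made-up network output for one bin image, one score per count class.
logits = [0.1, 0.3, 2.5, 0.9, -1.0, -2.0]
count, confidence = predict_count(logits)
print(count, round(confidence, 3))
```

One reason a design like this is convenient: the predicted probability doubles as a confidence score, so low-confidence bins can be routed elsewhere instead of being trusted blindly.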

This saved the project

• Hit the targets we needed

• Eliminated a lot of hardware (no more before/after shots needed)

• Made the project cost effective

• Here is what we learned:

• Don’t focus on algorithms, focus on DATA

How else can we use this data?

• We want to find free space in the bin without having to label data.

• We can guess from dimensions of items.

• But where is the space?

Train a model to predict emptiness from an image

[Figure: GoogLeNet convolutional network producing an emptiness score]

This is a noisy, probably incorrect estimate!

But we can use layers in the network to find where the space actually is!

[Figure labels: GoogLeNet, emptiness score, 1024 channels]

[Figure panels: Original image, Activation map, Binary map]
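One plausible way to get from a 1024-channel layer to an activation map and then a binary map is sketched below. The slides do not say how the channels are combined, so the weighted sum, the normalization, the 0.5 threshold and the 7 × 7 grid size are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def activation_map(features, channel_weights=None):
    """Collapse a (C, H, W) feature tensor into one H x W map scaled to [0, 1].

    channel_weights could come from the layer that produces the emptiness
    score; with None, all channels are averaged equally.
    """
    features = np.asarray(features, dtype=np.float64)
    if channel_weights is None:
        channel_weights = np.full(features.shape[0], 1.0 / features.shape[0])
    combined = np.tensordot(channel_weights, features, axes=1)  # shape (H, W)
    combined = combined - combined.min()
    peak = combined.max()
    return combined / peak if peak > 0 else combined

def binary_map(act_map, threshold=0.5):
    """Mark locations whose activation reaches the threshold as empty space."""
    return act_map >= threshold

# Toy features: 1024 channels on a 7 x 7 grid, active only over one region.
features = np.zeros((1024, 7, 7))
features[:, 2:5, 3:6] = 1.0

mask = binary_map(activation_map(features))
print(mask.astype(int))
```

The resulting mask is at the resolution of the feature grid, so it would need to be upsampled to the bin image size before overlaying it on the original image.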

And it works!

We are releasing a dataset

Takeaways

• We have great pattern recognition machinery now.

• Focus on the data:

• How can you get lots of it?

• What can you get for free?

• How much labeling do you really need?

• Is there a proxy problem?

Thank you!

Remember to complete your evaluations!
