Transcript of BDA 301 An Introduction to Amazon Rekognition, for Deep Learning-based Computer Vision
- 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All
rights reserved. David Pearson, AWS AI Services April 2017 Amazon
Rekognition Extract Rich Image Metadata from Visual Content
- 2. Amazon AI Intelligent Services Powered By Deep Learning
- 3. Rich Metadata Index objects, scenes, facial attributes,
persons Amazon Rekognition Deep Learning-Based Image Recognition
Service
- 4. Deer 98.8% Wildlife 95.1% Conifer 95.1% Spruce 95.1% Wood
78.3% Tree 63.5% Forest 63.5% Vegetation 61.9% Pine 60.6% Outdoors
54.0% Flower 53.9% Plant 52.9% Nature 50.7% Field 50.7% Grass
50.7%
- 5. { "Image": { "Bytes": blob, "S3Object": { "Bucket":
"string", "Name": "string", "Version": "string" } }, "MaxLabels":
number, "MinConfidence": number } DetectLabels Amazon S3 Image
Bucket
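The request shape on this slide maps onto keyword arguments in boto3. A minimal sketch, assuming an S3-hosted image: the bucket name, object key, and the helper `build_detect_labels_request` are hypothetical, and the actual call is left as a comment because it needs AWS credentials.

```python
def build_detect_labels_request(bucket, name, max_labels=10,
                                min_confidence=50.0, version=None):
    """Assemble keyword arguments for DetectLabels on an S3-hosted image."""
    s3_object = {"Bucket": bucket, "Name": name}
    if version is not None:
        # "Version" is optional and only applies to versioned buckets.
        s3_object["Version"] = version
    return {
        "Image": {"S3Object": s3_object},
        "MaxLabels": max_labels,
        "MinConfidence": min_confidence,
    }


# Hypothetical bucket and key, for illustration only.
request = build_detect_labels_request(
    "example-bucket", "photos/deer.jpg", max_labels=15, min_confidence=50.0
)

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   response = boto3.client("rekognition").detect_labels(**request)
```

The slide also shows a `Bytes` field; an image is supplied either as raw bytes or as an S3 object, so this sketch covers only the S3 case.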
- 6. DetectLabels "Labels": [ { "Confidence": 98.9294204711914,
"Name": "Moss" }, { "Confidence": 98.9294204711914, "Name": "Plant"
}, { "Confidence": 97.35887908935547, "Name": "Creek" }, {
"Confidence": 97.35887908935547, "Name": "Outdoors" }, {
"Confidence": 97.35887908935547, "Name": "Stream" }, {
"Confidence": 97.35887908935547, "Name": "Water" },
- 7. Age Range 38-59 Beard: False 84.3% Emotion: Happy 86.5%
Eyeglasses: False 99.6% Eyes Open: True 99.9% Gender: Male 99.9%
Mouth Open: False 86.2% Mustache: False 98.4% Smile: True 95.9%
Sunglasses: False 99.8% Bounding Box Height: 0.36716.. Left:
0.40222.. Top: 0.23582.. Width: 0.27222.. Landmarks EyeLeft
EyeRight Nose MouthLeft MouthRight LeftPupil RightPupil
LeftEyeBrowLeft LeftEyeBrowRight LeftEyeBrowUp : Quality Brightness
52.5% Sharpness 99.9%
- 8. "BoundingBox": { "Height": 0.3449999988079071, "Left":
0.09666666388511658, "Top": 0.27166667580604553, "Width":
0.23000000417232513 }, "Confidence": 100, "Emotions": [
{"Confidence": 99.1335220336914, "Type": "HAPPY" }, {"Confidence":
3.3275485038757324, "Type": "CALM"}, {"Confidence":
0.31517744064331055, "Type": "SAD"} ], "Eyeglasses": {"Confidence":
99.8050537109375, "Value": false}, "EyesOpen": {"Confidence":
99.99979400634766, "Value": true}, "Gender": {"Confidence": 100,
"Value": "Female"} DetectFaces smart cropping & ad overlays
sentiment capture demographic analysis face editing &
pixelation
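One way the DetectFaces output might feed the smart cropping and sentiment capture uses listed above. Bounding box values are ratios of the image dimensions, so they need scaling to pixels. The sample values are the ones on the slide; the 600 x 600 image size and the helper names are assumptions for illustration.

```python
def crop_box(bounding_box, image_width, image_height):
    """Convert a ratio-based BoundingBox to (left, top, right, bottom) pixels."""
    left = round(bounding_box["Left"] * image_width)
    top = round(bounding_box["Top"] * image_height)
    right = left + round(bounding_box["Width"] * image_width)
    bottom = top + round(bounding_box["Height"] * image_height)
    return left, top, right, bottom


def dominant_emotion(face_detail):
    """Return the emotion type that carries the highest confidence."""
    return max(face_detail["Emotions"], key=lambda e: e["Confidence"])["Type"]


# One face detail, copied from the slide's sample response.
face = {
    "BoundingBox": {
        "Height": 0.3449999988079071,
        "Left": 0.09666666388511658,
        "Top": 0.27166667580604553,
        "Width": 0.23000000417232513,
    },
    "Emotions": [
        {"Confidence": 99.1335220336914, "Type": "HAPPY"},
        {"Confidence": 3.3275485038757324, "Type": "CALM"},
        {"Confidence": 0.31517744064331055, "Type": "SAD"},
    ],
}

box = crop_box(face["BoundingBox"], 600, 600)  # assumed image size
mood = dominant_emotion(face)
```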
- 9. Similarity 93% Similarity 0%
- 10. "FaceMatches": [ {"Face": {"BoundingBox": { "Height":
0.2683333456516266, "Left": 0.5099999904632568, "Top":
0.1783333271741867, "Width": 0.17888888716697693}, "Confidence":
99.99845123291016}, "Similarity": 96 }, {"Face": {"BoundingBox": {
"Height": 0.2383333295583725, "Left": 0.6233333349227905, "Top":
0.3016666769981384, "Width": 0.15888889133930206}, "Confidence":
99.71249389648438}, "Similarity": 0 } ], "SourceImageFace":
{"BoundingBox": { "Height": 0.23983436822891235, "Left":
0.28333333134651184, "Top": 0.351423978805542, "Width":
0.1599999964237213}, "Confidence": 99.99344635009766} }
CompareFaces
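A CompareFaces response like the one above can be reduced to the matches worth acting on. A minimal sketch: the 80.0 cutoff and the helper name `confident_matches` are assumptions, and the bounding boxes are omitted from the sample for brevity.

```python
def confident_matches(response, min_similarity=80.0):
    """Keep the target faces whose Similarity meets the threshold."""
    return [
        match for match in response["FaceMatches"]
        if match["Similarity"] >= min_similarity
    ]


# Trimmed copy of the slide's response: two target faces, one real match.
response = {
    "FaceMatches": [
        {"Face": {"Confidence": 99.99845123291016}, "Similarity": 96},
        {"Face": {"Confidence": 99.71249389648438}, "Similarity": 0},
    ],
    "SourceImageFace": {"Confidence": 99.99344635009766},
}

matches = confident_matches(response)
```

The same filtering can also be requested server-side through the `SimilarityThreshold` parameter of the call.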
- 11. Collection IndexFaces SearchFacesByImage Nearest neighbor
search FaceID: 4c55926e-69b3-5c80-8c9b-78ea01d30690 Similarity: 97
FaceID: 02e56305-1579-5b39-ba57-9afb0fd8782d Similarity: 92 FaceID:
02e56305-1579-5b39-ba57-9afb0fd8782d Similarity: 85
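The index-then-search flow on this slide could be sketched as follows. The functions take a `client` argument that is assumed to be a boto3 Rekognition client (`boto3.client("rekognition")`). The `StubRekognition` class is a hypothetical offline stand-in that returns canned responses so the sketch runs without AWS access; the collection name, bucket, and keys are also made up.

```python
def s3_image(bucket, name):
    """Build the Image argument for an S3-hosted picture."""
    return {"S3Object": {"Bucket": bucket, "Name": name}}


def index_image(client, collection_id, bucket, name):
    """Add the faces found in one image to a collection; return their FaceIds."""
    response = client.index_faces(
        CollectionId=collection_id, Image=s3_image(bucket, name)
    )
    return [record["Face"]["FaceId"] for record in response["FaceRecords"]]


def search_collection(client, collection_id, bucket, name,
                      threshold=80.0, max_faces=3):
    """Return (FaceId, Similarity) pairs, most similar first."""
    response = client.search_faces_by_image(
        CollectionId=collection_id,
        Image=s3_image(bucket, name),
        FaceMatchThreshold=threshold,
        MaxFaces=max_faces,
    )
    pairs = [(m["Face"]["FaceId"], m["Similarity"])
             for m in response["FaceMatches"]]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


class StubRekognition:
    """Hypothetical stand-in returning canned, response-shaped dictionaries."""

    def index_faces(self, **kwargs):
        return {"FaceRecords": [
            {"Face": {"FaceId": "4c55926e-69b3-5c80-8c9b-78ea01d30690"}},
        ]}

    def search_faces_by_image(self, **kwargs):
        return {"FaceMatches": [
            {"Similarity": 92.0,
             "Face": {"FaceId": "02e56305-1579-5b39-ba57-9afb0fd8782d"}},
            {"Similarity": 97.0,
             "Face": {"FaceId": "4c55926e-69b3-5c80-8c9b-78ea01d30690"}},
        ]}


client = StubRekognition()
indexed = index_image(client, "example-collection", "example-bucket", "a.jpg")
results = search_collection(client, "example-collection", "example-bucket", "b.jpg")
```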
- 12. Collections and Access Patterns Logging (public events;
daily visitor logs; digital libraries) One potentially large
collection per event / time period Enables wide searches Social
Tagging (photo storage and sharing) One collection per application
user Enables automated friend tagging Person Verification (employee
gate check) One collection for each person to be verified Enables
detection of stolen/shared IDs
- 13. Collection and Access Patterns [Chart: # Collections vs. # Faces
per Collection, placing Person Verification, Social Friend Tagging,
and Event Logging / Wide Search; axis marked at 1M]
- 14. Amazon Rekognition Console
https://console.aws.amazon.com/rekognition/home
- 15. Amazon Rekognition Customers Law Enforcement and Public
Safety Travel and Hospitality Digital Marketing and Advertising
Media and Entertainment Internet of Things (IoT)
- 16. Law Enforcement and Public Safety Washington County Sheriff
(OR) To follow leads from citizens & security cameras, a person
spends days manually searching thousands of images The mobile and
web app powered by Amazon Rekognition compares new images with
photos of previous offenders: Helps identify unknown theft suspects
from security footage Provides leads by identifying possible
witnesses & accomplices Identifies persons of interest who do
not have identification
- 17. Travel and Hospitality Anticipatory guest experiences for
hotels using Amazon Rekognition for facial recognition and
sentiment capture Kaliber is using Amazon Rekognition to help front
desk agents enhance relationships with guests: Recognize guests
early for instant and personalized service Receive rich,
contextualized guest information in real time Track guest sentiment
throughout their stay Drive an 80% increase in guest satisfaction
scores
- 18. Guest Workflow Walk in Be recognized Be greeted Capture
sentiment to trigger actions Enjoy personalized service Leave with a
fond farewell "Kaliber allows us to bond with our guests from the
second they walk in my hotel." GM of a 5-star property
- 19. hotel Simplified Architecture One master guest collection
enables single-workflow deployment across all properties Guest
recognition triggers real-time information retrieval Automated
pipeline processing in AWS improves reliability Automated image
sampling constantly improves recognition quality
- 20. Influencer Marketing Associate influencers with objects and
scenes in social media images in order to create high impact
campaigns for clients Using Amazon Rekognition for metadata
extraction: Create rich media indexes of images from social media
feeds, which the application associates with influencers Enable
analytics to profile environments where influence is strongest
Connect client brands with the influencers most likely to have
impact
- 21. Media and Entertainment Identify who is on camera for each
of 8 networks so that recorded video can be indexed and searched
Video frame-sampling facial recognition solution using Amazon
Rekognition: Indexed 97,000 people into a face collection in 1 day
Sample frames every 6 secs and test for image variance Upload
images to Amazon S3 and call Amazon Rekognition to find best facial
match Store time stamp and faceID metadata
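The sampling step above (one frame every 6 seconds, then a variance test before upload) might look like this in outline. The slides do not say how variance is measured, so the mean absolute pixel difference, the threshold of 20, and all function names here are assumptions. Frames are represented as flat lists of grayscale values to keep the sketch dependency-free.

```python
def sample_times(duration_seconds, interval_seconds=6):
    """Timestamps (in seconds) at which a frame is extracted."""
    return list(range(0, duration_seconds, interval_seconds))


def mean_abs_diff(frame_a, frame_b):
    """Average per-pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def frames_to_upload(frames, threshold=20.0):
    """Indexes of frames that differ enough from the last frame kept."""
    kept = []
    last = None
    for index, frame in enumerate(frames):
        if last is None or mean_abs_diff(last, frame) >= threshold:
            kept.append(index)
            last = frame
    return kept


times = sample_times(20)

# Three tiny 4-pixel "frames": the second barely changes, the third does.
frames = [
    [10, 10, 10, 10],
    [11, 10, 10, 10],
    [200, 200, 10, 10],
]
selected = frames_to_upload(frames)
```

Only the selected frames would then be uploaded to Amazon S3 and searched against the face collection, which keeps the number of calls down when the picture is static.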
- 22. C-SPAN Indexing Architecture Video feeds encoded from 8
locations (3 networks and 5 federal courthouses) Frames extracted
into JPGs and hosted in Amazon S3 Amazon SQS provides asynchronous
decoupling Search Amazon Rekognition collection for high similarity
matches Results cache drives search and discovery requests R3
hashing detects if a scene significantly changes
- 23. Amazon Rekognition Customers Digital Asset Management Media
and Entertainment Travel and Hospitality Influencer Marketing
Systems Integration Digital Advertising Consumer Storage Law
Enforcement Public Safety eCommerce Education
- 24. Amazon Rekognition for Media Metadata Generation Shane
Murphy, Cloud Solutions Engineer Mark Kelly, Director Cloud
Operations
- 25. Company Background Scripps Networks Interactive Lifestyle
Media Develop web and video content for distribution to
international audiences in 6 continents 190 million+ consumers each
month Dozens of digital platforms, hundreds of thousands of images,
and petabytes of video. 2016: Digital content grew 700% 2017: Will
produce 2,500 hours of linear television content
- 26. Media Metadata Attributes Easy Size, resolution, name, etc.
Harder Classification. Room type, color scheme, brand category,
furniture style, etc. Must be fast Must be good (enough)
- 27. Problem Description Media management is core to our
business. Manually creating metadata is time intensive, tedious,
and expensive. Automation is amazing! But how?
- 28. Classification - Current State Cutting edge neural networks
Example MIT Places for Scene Recognition
http://places.csail.mit.edu/ Complicated, bloated, computationally
infeasible, static Only one problem type, but we have many classes
to identify
- 29. Let's Simplify! Our Strategy Divide and Conquer 1. Use
Amazon Rekognition to generate text labels for easy processing 2.
Use supervised machine learning to train multiple predictive models
3. Set up multiple fan-out pipelines for automated classification
workloads
- 30. Amazon Rekognition Step 1: Generate Labels Python (boto3)
example: for img in training_images: labels = rekognize.detect_labels(
Image={'S3Object': {'Bucket': SOURCE_BUCKET, 'Name': img}},
MinConfidence=MIN_CONFIDENCE)['Labels'] labels[0] = 'Plant Potted
Plant Furniture Indoors Interior Design Room Kitchen'
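A runnable take on the loop from this slide. It is a sketch under assumptions: `client` stands for a boto3 Rekognition client, label names are joined into one space-separated string per image as the slide's output suggests, and `StubRekognition` is a hypothetical stand-in so the example runs offline. The bucket and key names are made up.

```python
def labels_for_images(client, bucket, image_keys, min_confidence=50.0):
    """Map each image key to a space-separated string of its label names."""
    labels = {}
    for key in image_keys:
        response = client.detect_labels(
            Image={"S3Object": {"Bucket": bucket, "Name": key}},
            MinConfidence=min_confidence,
        )
        labels[key] = " ".join(label["Name"] for label in response["Labels"])
    return labels


class StubRekognition:
    """Hypothetical stand-in returning a canned DetectLabels-shaped response."""

    def detect_labels(self, **kwargs):
        return {"Labels": [
            {"Name": "Plant", "Confidence": 98.9},
            {"Name": "Potted Plant", "Confidence": 97.3},
            {"Name": "Furniture", "Confidence": 91.0},
        ]}


training_labels = labels_for_images(
    StubRekognition(), "example-bucket", ["kitchen.jpg"]
)
```

In real use the stub would be replaced by `boto3.client("rekognition")`.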
- 31. Step 2a: Transform Labels Plant Room Table Lamp Furniture
AttributeN Image0 2 1 0 1 2 Image1 1 3 1 1 0 Image2 1 1 1 0 1 Bag
of Words
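The bag-of-words table above can be produced from the label strings with a word count per image. A minimal sketch using only the standard library; the two sample strings are hypothetical. Note that multi-word labels such as "Potted Plant" are split into words, which is why "Plant" can count more than once for a single image.

```python
from collections import Counter


def bag_of_words(label_strings):
    """Return (vocabulary, matrix) with one row of word counts per image."""
    counts = [Counter(text.split()) for text in label_strings]
    vocabulary = sorted(set().union(*counts))
    matrix = [[count[word] for word in vocabulary] for count in counts]
    return vocabulary, matrix


# Hypothetical label strings for two images.
label_strings = [
    "Plant Potted Plant Furniture Room",
    "Lamp Table Lamp Room",
]
vocabulary, matrix = bag_of_words(label_strings)
```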
- 32. Step 2b: Derive Relationships Split the training data, use
most of it to train, some to test Options - Decision trees, random
forest, k nearest neighbors, multinomial logistic regression
Specifics determined by problem description and tuning (art and
science)
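To make the split-train-test idea concrete, here is a small sketch using k nearest neighbors, one of the options listed above. It is not the presenters' model: the feature vectors (counts for Plant, Lamp, Table), the room labels, the 75/25 split, and k=3 are all invented for illustration.

```python
import random
from collections import Counter


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, vector, k=3):
    """Majority vote among the k training rows closest to the vector."""
    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], vector))

    nearest = sorted(train, key=squared_distance)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Share of held-out rows whose predicted label matches the true one."""
    hits = sum(knn_predict(train, vector, k) == label for vector, label in test)
    return hits / len(test)


# Hypothetical bag-of-words rows: (Plant, Lamp, Table) counts and a room type.
ROWS = [
    ((0, 0, 3), "Kitchen"), ((0, 1, 3), "Kitchen"),
    ((1, 0, 2), "Kitchen"), ((0, 0, 2), "Kitchen"),
    ((0, 3, 0), "Bedroom"), ((1, 3, 0), "Bedroom"),
    ((0, 2, 1), "Bedroom"), ((0, 2, 0), "Bedroom"),
]

train, test = split_rows(ROWS)
score = accuracy(train, test)
```

Predicting metadata for a new image is then a single `knn_predict(train, vector)` call on that image's bag-of-words vector.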
- 33. Step 3: Predict New Metadata Input Labels = Plant Potted
Plant Indoors Interior Design Room Bedroom Lamp Lampshade Table
Lamp Apartment Housing Lighting Dining Room Shelf Furniture Table
Tabletop Dining Room
- 34. Let's Simplify! Strategy Transform to easier use case Sample
video frames -> feed through Amazon Rekognition, classifiers,
and other analysis engines and parsers
- 35. Use Case Fanout Video Pipeline Amazon Rekognition Amazon
Elasticsearch Service Amazon S3
- 36. So what? Room type classification: initial results 75%
accurate Immediate savings in image and video classification:
$500,000 Time to market: thousands of hours saved per year Content
Grouping and Dynamic Generation
- 37. Challenges in Our Approach Integration with Amazon Machine
Learning Lack of Optical Character Recognition Model Management and
Lifecycles Real time generation
- 38. Future Directions Revenue Opportunities!!! Product
placement, logos, etc. Facial Recognition Landmark Detection
Cultural Sensitivity Compliance and Terms of Service
- 39. Thank You!
- 40. Amazon Rekognition Availability and Pricing Free Tier: 5,000
images processed per month for first 12 months General Availability
in 3 regions: US East (N. Virginia), US West (Oregon), EU (Ireland)
Image Analysis Tiers Price per 1000 images processed First 1
million images processed* per month $1.00 Next 9 million images
processed* per month $0.80 Next 90 million images processed* per
month $0.60 Over 100 million images processed* per month $0.40
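The tiered prices above are marginal: each rate applies only to the images that fall inside its tier. A short worked sketch of the arithmetic, using the figures from this slide as of April 2017. The function name is hypothetical and the free tier is deliberately ignored.

```python
from decimal import Decimal

# (tier size in images, USD per 1,000 images), taken from the slide.
TIERS = [
    (1_000_000, Decimal("1.00")),
    (9_000_000, Decimal("0.80")),
    (90_000_000, Decimal("0.60")),
    (None, Decimal("0.40")),  # everything over 100 million
]


def monthly_cost(images):
    """Cost in USD for a month's image volume, free tier not applied."""
    remaining = images
    total = Decimal("0")
    for size, price in TIERS:
        in_tier = remaining if size is None else min(remaining, size)
        total += Decimal(in_tier) / 1000 * price
        remaining -= in_tier
        if remaining == 0:
            break
    return total


# 2.5 million images: 1,000,000 at $1.00 plus 1,500,000 at $0.80 per 1,000.
cost = monthly_cost(2_500_000)
```

For 2.5 million images this works out to $1,000 for the first tier plus $1,200 for the next 1.5 million, or $2,200 in total.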
- 41. Developer Resources and more
https://aws.amazon.com/blogs/ai/
https://aws.amazon.com/rekognition
- 42. IoT Use Case real-time facial recognition at the edge AWS
Advanced Consulting Partner Migrations DevOps Managed Services
Software & Hardware Engineering User Experience & Visual
Design Rapid Prototyping AWS Competencies: DevOps, IoT,
Healthcare
- 43. NERF CS-18 N-Strike Elite Rapidstrike Adafruit 2.8" PiTFT
display Raspberry Pi 3 Amazon Rekognition Training Image
- 44. https://sturdy.cloud/sting/
- 45. Thank You! pearsond@amazon.com