BDA 301 An Introduction to Amazon Rekognition, for Deep Learning-based Computer Vision

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. David Pearson, AWS AI Services April 2017 Amazon Rekognition Extract Rich Image Metadata from Visual Content

Transcript of BDA 301 An Introduction to Amazon Rekognition, for Deep Learning-based Computer Vision

  1. 1. 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. David Pearson, AWS AI Services April 2017 Amazon Rekognition Extract Rich Image Metadata from Visual Content
  2. 2. Amazon AI Intelligent Services Powered By Deep Learning
  3. 3. Rich Metadata Index objects, scenes, facial attributes, persons Amazon Rekognition Deep Learning-Based Image Recognition Service
  4. 4. Deer 98.8% Wildlife 95.1% Conifer 95.1% Spruce 95.1% Wood 78.3% Tree 63.5% Forest 63.5% Vegetation 61.9% Pine 60.6% Outdoors 54.0% Flower 53.9% Plant 52.9% Nature 50.7% Field 50.7% Grass 50.7%
  5. 5. { "Image": { "Bytes": blob, "S3Object": { "Bucket": "string", "Name": "string", "Version": "string" } }, "MaxLabels": number, "MinConfidence": number } DetectLabels Amazon S3 Image Bucket
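The request shape on this slide maps directly onto keyword arguments of the boto3 `detect_labels` call. A minimal sketch, assuming the image already sits in S3; the helper names, bucket name, and default values are illustrative and not taken from the deck:

```python
def build_detect_labels_request(bucket, name, max_labels=10,
                                min_confidence=50.0, version=None):
    """Build the DetectLabels parameters for an image stored in S3."""
    s3_object = {"Bucket": bucket, "Name": name}
    if version is not None:
        # Version is optional and only meaningful for versioned buckets.
        s3_object["Version"] = version
    return {
        "Image": {"S3Object": s3_object},
        "MaxLabels": max_labels,
        "MinConfidence": min_confidence,
    }


def detect_labels(client, bucket, name, **options):
    """Call DetectLabels; `client` is assumed to be boto3.client('rekognition')."""
    request = build_detect_labels_request(bucket, name, **options)
    return client.detect_labels(**request)["Labels"]


# Usage (needs AWS credentials; the bucket and key are hypothetical):
#   import boto3
#   labels = detect_labels(boto3.client("rekognition"),
#                          "my-image-bucket", "deer.jpg", max_labels=15)
```

Passing the client in as an argument keeps the request-building logic testable without network access.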
  6. 6. DetectLabels "Labels": [ { "Confidence": 98.9294204711914, "Name": "Moss" }, { "Confidence": 98.9294204711914, "Name": "Plant" }, { "Confidence": 97.35887908935547, "Name": "Creek" }, { "Confidence": 97.35887908935547, "Name": "Outdoors" }, { "Confidence": 97.35887908935547, "Name": "Stream" }, { "Confidence": 97.35887908935547, "Name": "Water" },
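The `Labels` list in this response is plain data, so downstream code usually just filters it. A small sketch using the values shown on the slide; the function name and the threshold of 98 are assumptions made for illustration:

```python
def label_names(labels, min_confidence=90.0):
    """Return label names at or above min_confidence, highest confidence first."""
    kept = [label for label in labels if label["Confidence"] >= min_confidence]
    # Sort by descending confidence, then by name so ties are deterministic.
    kept.sort(key=lambda label: (-label["Confidence"], label["Name"]))
    return [label["Name"] for label in kept]


# Values copied from the slide's DetectLabels response.
slide_labels = [
    {"Confidence": 98.9294204711914, "Name": "Moss"},
    {"Confidence": 98.9294204711914, "Name": "Plant"},
    {"Confidence": 97.35887908935547, "Name": "Creek"},
    {"Confidence": 97.35887908935547, "Name": "Outdoors"},
    {"Confidence": 97.35887908935547, "Name": "Stream"},
    {"Confidence": 97.35887908935547, "Name": "Water"},
]

print(label_names(slide_labels, min_confidence=98.0))  # ['Moss', 'Plant']
```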
  7. 7. Age Range 38-59 Beard: False 84.3% Emotion: Happy 86.5% Eyeglasses: False 99.6% Eyes Open: True 99.9% Gender: Male 99.9% Mouth Open: False 86.2% Mustache: False 98.4% Smile: True 95.9% Sunglasses: False 99.8% Bounding Box Height: 0.36716.. Left: 0.40222.. Top: 0.23582.. Width: 0.27222.. Landmarks EyeLeft EyeRight Nose MouthLeft MouthRight LeftPupil RightPupil LeftEyeBrowLeft LeftEyeBrowRight LeftEyeBrowUp : Quality Brightness 52.5% Sharpness 99.9%
  8. 8. "BoundingBox": { "Height": 0.3449999988079071, "Left": 0.09666666388511658, "Top": 0.27166667580604553, "Width": 0.23000000417232513 }, "Confidence": 100, "Emotions": [ {"Confidence": 99.1335220336914, "Type": "HAPPY" }, {"Confidence": 3.3275485038757324, "Type": "CALM"}, {"Confidence": 0.31517744064331055, "Type": "SAD"} ], "Eyeglasses": {"Confidence": 99.8050537109375, "Value": false}, "EyesOpen": {"Confidence": 99.99979400634766, "Value": true}, "Gender": {"Confidence": 100, "Value": "Female"} DetectFaces: smart cropping & ad overlays, sentiment capture, demographic analysis, face editing & pixelation
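Two of the uses listed on this slide, smart cropping and sentiment capture, come down to reading `BoundingBox` and `Emotions` from each face detail. Bounding box values are ratios of the overall image dimensions, so they have to be scaled to pixels before cropping. A sketch under the assumption of a 900 x 600 pixel image; the image size and function names are illustrative:

```python
def box_to_pixels(box, image_width, image_height):
    """Convert a ratio-based BoundingBox to (left, top, right, bottom) pixels."""
    left = round(box["Left"] * image_width)
    top = round(box["Top"] * image_height)
    right = round((box["Left"] + box["Width"]) * image_width)
    bottom = round((box["Top"] + box["Height"]) * image_height)
    # Boxes can extend past the image edge, so clamp to the image bounds.
    return (max(0, left), max(0, top),
            min(image_width, right), min(image_height, bottom))


def dominant_emotion(face_detail):
    """Return the emotion type with the highest confidence."""
    return max(face_detail["Emotions"], key=lambda e: e["Confidence"])["Type"]


# Values copied from the slide's DetectFaces response.
face = {
    "BoundingBox": {"Height": 0.3449999988079071, "Left": 0.09666666388511658,
                    "Top": 0.27166667580604553, "Width": 0.23000000417232513},
    "Emotions": [{"Confidence": 99.1335220336914, "Type": "HAPPY"},
                 {"Confidence": 3.3275485038757324, "Type": "CALM"},
                 {"Confidence": 0.31517744064331055, "Type": "SAD"}],
}

crop = box_to_pixels(face["BoundingBox"], 900, 600)
mood = dominant_emotion(face)
```

The resulting tuple uses the (left, top, right, bottom) order that common imaging libraries expect for a crop rectangle.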
  9. 9. Similarity 93% Similarity 0%
  10. 10. "FaceMatches": [ {"Face": {"BoundingBox": { "Height": 0.2683333456516266, "Left": 0.5099999904632568, "Top": 0.1783333271741867, "Width": 0.17888888716697693}, "Confidence": 99.99845123291016}, "Similarity": 96 }, {"Face": {"BoundingBox": { "Height": 0.2383333295583725, "Left": 0.6233333349227905, "Top": 0.3016666769981384, "Width": 0.15888889133930206}, "Confidence": 99.71249389648438}, "Similarity": 0 } ], "SourceImageFace": {"BoundingBox": { "Height": 0.23983436822891235, "Left": 0.28333333134651184, "Top": 0.351423978805542, "Width": 0.1599999964237213}, "Confidence": 99.99344635009766} } CompareFaces
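Each entry in `FaceMatches` carries its own `Similarity`, so the caller decides what counts as a match. A minimal sketch that keeps only strong matches; the cutoff of 80 is an assumed value, not one stated in the deck:

```python
def strong_matches(response, min_similarity=80.0):
    """Keep the CompareFaces matches at or above min_similarity."""
    return [match for match in response.get("FaceMatches", [])
            if match["Similarity"] >= min_similarity]


# Trimmed version of the slide's response: one strong match, one non-match.
slide_response = {
    "FaceMatches": [
        {"Face": {"Confidence": 99.99845123291016}, "Similarity": 96},
        {"Face": {"Confidence": 99.71249389648438}, "Similarity": 0},
    ],
    "SourceImageFace": {"Confidence": 99.99344635009766},
}

matches = strong_matches(slide_response)
print(len(matches))  # 1
```

The same filtering can be pushed to the service by passing `SimilarityThreshold` to the boto3 `compare_faces` call.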
  11. 11. Collection IndexFaces SearchFacesByImage Nearest neighbor search FaceID: 4c55926e-69b3-5c80-8c9b-78ea01d30690 Similarity: 97 FaceID: 02e56305-1579-5b39-ba57-9afb0fd8782d Similarity: 92 FaceID: 02e56305-1579-5b39-ba57-9afb0fd8782d Similarity: 85
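The collection workflow on this slide has two halves: add faces with IndexFaces, then query with SearchFacesByImage. A sketch of both through boto3, assuming the collection already exists and the images are in S3; the function names, the threshold, and the `max_faces` default are illustrative choices:

```python
def index_face(client, collection_id, bucket, name, external_id):
    """Add the faces found in an S3 image to a collection; return their FaceIds."""
    response = client.index_faces(
        CollectionId=collection_id,
        Image={"S3Object": {"Bucket": bucket, "Name": name}},
        ExternalImageId=external_id,  # caller-chosen tag stored with the face
    )
    return [record["Face"]["FaceId"] for record in response["FaceRecords"]]


def search_face(client, collection_id, bucket, name,
                threshold=80.0, max_faces=3):
    """Search a collection for faces similar to the largest face in an S3 image."""
    response = client.search_faces_by_image(
        CollectionId=collection_id,
        Image={"S3Object": {"Bucket": bucket, "Name": name}},
        FaceMatchThreshold=threshold,
        MaxFaces=max_faces,
    )
    return [(match["Face"]["FaceId"], match["Similarity"])
            for match in response["FaceMatches"]]


# Usage (needs AWS credentials; all names below are hypothetical):
#   import boto3
#   client = boto3.client("rekognition")
#   client.create_collection(CollectionId="guests")
#   index_face(client, "guests", "my-bucket", "alice.jpg", "alice")
#   search_face(client, "guests", "my-bucket", "lobby-frame.jpg")
```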
  12. 12. Collections and Access Patterns. Logging (public events; daily visitor logs; digital libraries): one potentially large collection per event / time period; enables wide searches. Social Tagging (photo storage and sharing): one collection per application user; enables automated friend tagging. Person Verification (employee gate check): one collection for each person to be verified; enables detection of stolen/shared IDs.
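One way to make these three patterns concrete is to derive the collection ID from whatever the pattern partitions on: the event, the application user, or the person. The naming scheme below is purely hypothetical. It also assumes collection IDs are limited to letters, digits, underscore, period, and hyphen, so other characters are replaced:

```python
import re

# Characters outside this set are replaced (assumed CollectionId restriction).
_DISALLOWED = re.compile(r"[^a-zA-Z0-9_.\-]")

# Hypothetical prefixes, one per access pattern on the slide.
_PREFIXES = {
    "logging": "event",              # one collection per event / time period
    "social_tagging": "user",        # one collection per application user
    "person_verification": "person"  # one collection per person to verify
}


def collection_id(pattern, key):
    """Build a collection ID for the given access pattern and partition key."""
    safe_key = _DISALLOWED.sub("-", key)
    return f"{_PREFIXES[pattern]}-{safe_key}"


print(collection_id("logging", "expo 2017/04"))         # event-expo-2017-04
print(collection_id("person_verification", "emp.1042"))  # person-emp.1042
```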
  13. 13. Collection and Access Patterns [Chart: # Collections vs. # Faces per Collection, placing Person Verification, Social Friend Tagging, and Event Logging / Wide Search; scale marked at 1M]
  14. 14. Amazon Rekognition Console https://console.aws.amazon.com/rekognition/home
  15. 15. Amazon Rekognition Customers Law Enforcement and Public Safety Travel and Hospitality Digital Marketing and Advertising Media and Entertainment Internet of Things (IoT)
  16. 16. Law Enforcement and Public Safety: Washington County Sheriff (OR). To follow leads from citizens & security cameras, a person spends days manually searching thousands of images. The mobile and web app powered by Amazon Rekognition compares new images with photos of previous offenders: helps identify unknown theft suspects from security footage; provides leads by identifying possible witnesses & accomplices; identifies persons of interest who do not have identification.
  17. 17. Travel and Hospitality: anticipatory guest experiences for hotels using Amazon Rekognition for facial recognition and sentiment capture. Kaliber is using Amazon Rekognition to help front desk agents enhance relationships with guests: recognize guests early for instant and personalized service; receive rich, contextualized guest information in real time; track guest sentiment throughout their stay; drive an 80% increase in guest satisfaction scores.
  18. 18. Guest Workflow: Walk in; Be recognized; Be greeted; Capture sentiment to trigger actions; Enjoy personalized service; Leave with a fond farewell. "Kaliber allows us to bond with our guests from the second they walk in my hotel." (GM of a 5-star property)
  19. 19. hotel Simplified Architecture One master guest collection enables single-workflow deployment across all properties Guest recognition triggers real-time information retrieval Automated pipeline processing in AWS improves reliability Automated image sampling constantly improves recognition quality
  20. 20. Influencer Marketing Associate influencers with objects and scenes in social media images in order to create high impact campaigns for clients Using Amazon Rekognition for metadata extraction: Create rich media indexes of images from social media feeds, which the application associates with influencers Enable analytics to profile environments where influence is strongest Connect client brands with the influencers most likely to have impact
  21. 21. Media and Entertainment: identify who is on camera for each of 8 networks so that recorded video can be indexed and searched. Video frame-sampling facial recognition solution using Amazon Rekognition: indexed 97,000 people into a face collection in 1 day; sample frames every 6 secs and test for image variance; upload images to Amazon S3 and call Amazon Rekognition to find best facial match; store time stamp and faceID metadata.
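The sampling step on this slide can be sketched in plain Python. The deck does not say how image variance is tested, so this example assumes a simple rule: a sampled frame is only sent on if it differs enough from the last frame that was sent. Frames are modeled as flat lists of grayscale pixel values, and the threshold is an arbitrary illustrative number:

```python
def sample_times(duration_seconds, interval_seconds=6):
    """Timestamps (in seconds) at which to grab a frame."""
    return list(range(0, int(duration_seconds), interval_seconds))


def mean_abs_diff(frame_a, frame_b):
    """Average per-pixel absolute difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def frames_worth_sending(frames, threshold=10.0):
    """Indices of frames that differ enough from the last frame kept."""
    kept = []
    last_kept = None
    for index, frame in enumerate(frames):
        if last_kept is None or mean_abs_diff(last_kept, frame) > threshold:
            kept.append(index)
            last_kept = frame
    return kept


# Four tiny 2x2 "frames": the second and fourth barely change.
frames = [[0, 0, 0, 0], [1, 0, 1, 0], [100, 100, 100, 100], [101, 100, 99, 100]]
print(sample_times(20))              # [0, 6, 12, 18]
print(frames_worth_sending(frames))  # [0, 2]
```

Skipping near-duplicate frames keeps the number of search calls proportional to scene changes rather than to video length.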
  22. 22. C-SPAN Indexing Architecture: video feeds encoded from 8 locations (3 networks and 5 federal courthouses); frames extracted into JPGs and hosted in Amazon S3; Amazon SQS provides asynchronous decoupling; search Amazon Rekognition collection for high similarity matches; results cache drives search and discovery requests; R3 hashing detects if a scene significantly changes.
  23. 23. Amazon Rekognition Customers Digital Asset Management Media and Entertainment Travel and Hospitality Influencer Marketing Systems Integration Digital Advertising Consumer Storage Law Enforcement Public Safety eCommerce Education
  24. 24. Amazon Rekognition for Media Metadata Generation Shane Murphy, Cloud Solutions Engineer Mark Kelly, Director Cloud Operations
  25. 25. Company Background Scripps Networks Interactive Lifestyle Media Develop web and video content for distribution to international audiences in 6 continents 190 million+ consumers each month Dozens of digital platforms, hundreds of thousands of images, and petabytes of video. 2016: Digital content grew 700% 2017: Will produce 2,500 hours of linear television content
  26. 26. Media Metadata Attributes Easy Size, resolution, name, etc. Harder Classification. Room type, color scheme, brand category, furniture style, etc. Must be fast Must be good (enough)
  27. 27. Problem Description Media management is core to our business. Manually creating metadata is time intensive, tedious, and expensive. Automation is amazing! But how?
  28. 28. Classification - Current State Cutting edge neural networks Example MIT Places for Scene Recognition http://places.csail.mit.edu/ Complicated, bloated, computationally infeasible, static Only one problem type, but we have many classes to identify
  29. 29. Let's Simplify! Our Strategy: Divide and Conquer. 1. Use Amazon Rekognition to generate text labels for easy processing. 2. Use supervised machine learning to train multiple predictive models. 3. Set up multiple fan-out pipelines for automated classification workloads.
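  30. 30. Amazon Rekognition Step 1: Generate Labels. Python (boto3) example:
        import boto3
        # Assumed setup: the slide uses `rekognize` without showing how it is created.
        rekognize = boto3.client('rekognition')
        image_labels = {}
        for img in training_images:
            labels = rekognize.detect_labels(
                Image={'S3Object': {'Bucket': SOURCE_BUCKET, 'Name': img}},
                MinConfidence=MIN_CONFIDENCE)['Labels']
            # Join the label names into one text string per image.
            image_labels[img] = ' '.join(label['Name'] for label in labels)
        # Example string from the slide:
        # 'Plant Potted Plant Furniture Indoors Interior Design Room Kitchen'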
  31. 31. Step 2a: Transform Labels Plant Room Table Lamp Furniture AttributeN Image0 2 1 0 1 2 Image1 1 3 1 1 0 Image2 1 1 1 0 1 Bag of Words
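The bag-of-words table on this slide counts how often each word appears across an image's label names, which is why an image labeled both "Plant" and "Potted Plant" scores 2 for Plant. A small sketch with the standard library; the function name and the sample label lists are illustrative:

```python
from collections import Counter


def bag_of_words(label_lists):
    """Turn per-image label names into (vocabulary, count rows)."""
    # Split multi-word labels so "Potted Plant" contributes to "Plant" too.
    counts = [Counter(word for name in names for word in name.split())
              for names in label_lists]
    vocabulary = sorted(set().union(*counts)) if counts else []
    rows = [[count[word] for word in vocabulary] for count in counts]
    return vocabulary, rows


images = [
    ["Plant", "Potted Plant", "Furniture", "Room", "Kitchen"],
    ["Room", "Bedroom", "Lamp", "Table Lamp"],
]
vocabulary, rows = bag_of_words(images)
print(vocabulary)
print(rows)
```

Each row has one column per vocabulary word, so the rows can be fed to most tabular classifiers as they are.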
  32. 32. Step 2b: Derive Relationships. Split the training data: use most of it to train, some to test. Options: decision trees, random forest, k nearest neighbors, multinomial logistic regression. Specifics determined by problem description and tuning (art and science).
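To illustrate the split-then-train idea, here is a dependency-free sketch using k nearest neighbors, one of the options named on the slide. It is not the presenters' model: the 75/25 split, k=3, the seed, and the toy count vectors are all assumptions, and in practice a library such as scikit-learn would usually replace this hand-rolled version:

```python
import random
from collections import Counter


def split_indices(n_rows, test_fraction=0.25, seed=0):
    """Shuffle row indices and split them into (train, test) lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(n_rows * (1 - test_fraction))
    return indices[:cut], indices[cut:]


def knn_predict(train_rows, train_targets, row, k=3):
    """Predict the majority target among the k nearest training rows."""
    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(train_rows[i], row))
    nearest = sorted(range(len(train_rows)), key=squared_distance)[:k]
    votes = Counter(train_targets[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy bag-of-words rows (columns are hypothetical word counts).
train_rows = [[2, 1, 0], [2, 0, 0], [0, 0, 2], [0, 1, 2]]
train_targets = ["Kitchen", "Kitchen", "Bedroom", "Bedroom"]
prediction = knn_predict(train_rows, train_targets, [2, 1, 1], k=3)
print(prediction)  # Kitchen
```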
  33. 33. Step 3 Predict New Metadata Input Labels = Plant Potted Plant Indoors Interior Design Room Bedroom Lamp Lampshade Table Lamp Apartment Housing Lighting Dining Room Shelf Furniture Table Tabletop Dining Room
  34. 34. Let's Simplify! Strategy: transform to easier use case. Sample video frames -> feed through Amazon Rekognition, classifiers, and other analysis engines and parsers.
  35. 35. Use Case Fanout Video Pipeline Amazon Rekognition Amazon Elasticsearch Service Amazon S3
  36. 36. So what? Room type classification: initial results 75% accurate. Immediate savings in image and video classification: $500,000. Time to market: thousands of hours saved per year. Content Grouping and Dynamic Generation.
  37. 37. Challenges in Our Approach Integration with Amazon Machine Learning Lack of Optical Character Recognition Model Management and Lifecycles Real time generation
  38. 38. Future Directions Revenue Opportunities!!! Product placement, logos, etc. Facial Recognition Landmark Detection Cultural Sensitivity Compliance and Terms of Service
  39. 39. Thank You!
  40. 40. Amazon Rekognition Availability and Pricing. Free Tier: 5,000 images processed per month for first 12 months. General Availability in 3 regions: US East (N. Virginia), US West (Oregon), EU (Ireland). Image Analysis Tiers (price per 1,000 images processed): first 1 million images processed* per month, $1.00; next 9 million images processed* per month, $0.80; next 90 million images processed* per month, $0.60; over 100 million images processed* per month, $0.40.
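The tiers on this slide are marginal: each price applies only to the images that fall inside that tier. A worked sketch using the slide's April 2017 prices, which may no longer be current. It ignores the free tier, and the function name is illustrative:

```python
# (images in tier, USD per 1,000 images); None marks the open-ended last tier.
TIERS = [
    (1_000_000, 1.00),
    (9_000_000, 0.80),
    (90_000_000, 0.60),
    (None, 0.40),
]


def monthly_cost(images):
    """Cost in USD for processing `images` images in one month."""
    remaining = images
    cost = 0.0
    for tier_size, price_per_thousand in TIERS:
        in_tier = remaining if tier_size is None else min(remaining, tier_size)
        cost += in_tier / 1000 * price_per_thousand
        remaining -= in_tier
        if remaining <= 0:
            break
    return cost


# 2.5 million images: 1,000,000 at $1.00 plus 1,500,000 at $0.80 per thousand.
print(round(monthly_cost(2_500_000), 2))  # 2200.0
```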
  41. 41. Developer Resources and more https://aws.amazon.com/blogs/ai/ https://aws.amazon.com/rekognition
  42. 42. IoT Use Case real-time facial recognition at the edge AWS Advanced Consulting Partner Migrations DevOps Managed Services Software & Hardware Engineering User Experience & Visual Design Rapid Prototyping AWS Competencies: DevOps, IoT, Healthcare
  43. 43. NERF CS-18 N-Strike Elite Rapidstrike Adafruit 2.8 PiTFT display Raspberry Pi 3 Amazon Rekognition Training Image
  44. 44. https://sturdy.cloud/sting/
  45. 45. Thank You! [email protected]