Lumos: A self-serve computer vision platform at AI NEXT conference

Lumos: A Self-Serve Computer Vision Platform

Fei Yang, Research Scientist, Computer Vision, AML, Facebook

Why Computer Vision?


• Enhanced photo / video search

Search photos posted by my friends containing a black bear



• Detecting malicious content




• Helping visually impaired people





• Smart camera

Challenges of a CV Platform

• Large Scale

• Low Latency

• Reliability

• Flexibility

Lumos: Facebook’s Self-Serve Computer Vision Platform

Runs on billions of images
• Describes photos to the blind
• Resurfaces notable memories
• Provides better image and video search results
• Protects people from objectionable content

More than 200 visual models
• Currently trained and deployed
• Dozens of teams across the company self-serve and build their own models

100+ million examples in Lumos datasets, and growing fast

Lumos
[Figure: deep residual network with Task (T1)]

Lumos
[Figure: Task (T1). Training: weeks]

Lumos
[Figure: Task (T1) and Task (T2). Training: weeks]

Lumos
[Figure: deep residual network with layers 1, 2, ..., n-1, n, and Task (T1) and Task (T2). Labels: less compute / less accuracy; more compute / more accuracy]

Lumos
[Figure: layers 1, 2, ..., n-1, n with Task (T1) and Task (T2). Compute time: 1-2 days. Accuracy: less]

Lumos
[Figure: layers 1, 2, ..., n-1, n with Task (T1) and Task (T2). Compute time: 1 month. Accuracy: more]

Lumos
[Figure: layers 1, 2, ..., n-1, n with Task (T1), Task (T2), Task (T3), Task (T4), ..., Task (Tm)]

Lumos
[Figure: plot of accuracy against compute]

Lumos
[Figure: Task (T1), Task (T2), Task (T3), Task (T4), ..., Task (Tm)]
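The slides above show many tasks sharing one deep network. A minimal sketch of that idea follows, assuming each task is a small classifier that reads features from one layer of a shared trunk, so the expensive forward pass runs once per image. `shared_trunk`, `TaskHead`, and `run_all_tasks` are hypothetical names, and the toy transform stands in for a real residual network; none of this is taken from the Lumos implementation.

```python
import math
import random

random.seed(0)


def shared_trunk(image, num_layers=4):
    """Toy stand-in for the shared deep network (hypothetical).

    Returns one feature vector per layer, from shallow to deep.
    """
    features = []
    x = list(image)
    for _ in range(num_layers):
        # Each "layer" mixes neighbouring values and applies a nonlinearity.
        x = [math.tanh(v + 0.5 * x[(i + 1) % len(x)]) for i, v in enumerate(x)]
        features.append(x)
    return features


class TaskHead:
    """A small linear classifier attached to one layer of the trunk."""

    def __init__(self, name, layer, dim):
        self.name = name
        self.layer = layer  # which trunk layer this task reads from
        # Random weights stand in for weights learned from training data.
        self.weights = [random.uniform(-1.0, 1.0) for _ in range(dim)]
        self.bias = 0.0

    def score(self, layer_features):
        x = layer_features[self.layer]
        z = sum(w * v for w, v in zip(self.weights, x)) + self.bias
        return 1.0 / (1.0 + math.exp(-z))  # probability-like score


def run_all_tasks(image, heads):
    """Run the trunk once, then evaluate every task head on its layer."""
    layer_features = shared_trunk(image)
    return {head.name: head.score(layer_features) for head in heads}


heads = [TaskHead("T1", layer=0, dim=3), TaskHead("T2", layer=3, dim=3)]
scores = run_all_tasks([0.1, 0.2, 0.3], heads)
```

Adding a task in this sketch means adding one more `TaskHead`; the trunk is left untouched.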

Lumos allows everyone at Facebook to build and deploy new computer vision models on the fly

• Collect training data for your new model
• Train your new model at the right accuracy/computational cost tradeoff
• Refine your model based on live performance
• Deploy your model to production
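One way to picture the accuracy/computational cost tradeoff is as a choice among candidate layers, each with a measured cost and validation accuracy. The sketch below picks the cheapest option that meets a target accuracy. `LayerOption`, `cheapest_layer_meeting`, and every number in `options` are hypothetical illustrations, not measurements or APIs from Lumos.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LayerOption:
    layer: int            # trunk layer the model reads features from
    relative_cost: float  # compute cost relative to the shallowest layer
    val_accuracy: float   # accuracy on a held-out validation set


def cheapest_layer_meeting(
    options: List[LayerOption], target_accuracy: float
) -> Optional[LayerOption]:
    """Return the lowest-cost option whose accuracy meets the target, if any."""
    good_enough = [o for o in options if o.val_accuracy >= target_accuracy]
    if not good_enough:
        return None
    return min(good_enough, key=lambda o: o.relative_cost)


# Hypothetical numbers: deeper layers cost more and score higher.
options = [
    LayerOption(layer=1, relative_cost=1.0, val_accuracy=0.71),
    LayerOption(layer=2, relative_cost=2.5, val_accuracy=0.80),
    LayerOption(layer=3, relative_cost=6.0, val_accuracy=0.88),
    LayerOption(layer=4, relative_cost=15.0, val_accuracy=0.91),
]

choice = cheapest_layer_meeting(options, target_accuracy=0.85)
```

With these made-up numbers, a target of 0.85 selects layer 3, and a target no option can reach returns `None`.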

Lumos
• On This Day
• Accessibility
• 360 Media Team
• Connectivity Lab
• Protect and Care
• Moments
• News Feed
• Photo Search

Lumos

Continuous Stream of Photos

Automatic Alt Text
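As a rough illustration of how alt text could be assembled from model outputs, the sketch below keeps only concepts whose score clears a confidence threshold and joins them into a sentence. The function name, the threshold, the concept limit, and the exact wording are assumptions for illustration; the slides do not describe the production logic.

```python
def build_alt_text(predictions, threshold=0.8, max_concepts=5):
    """Turn {concept: confidence} scores into a short description.

    Only concepts at or above `threshold` are mentioned, most confident first.
    """
    confident = [(label, score) for label, score in predictions.items()
                 if score >= threshold]
    confident.sort(key=lambda pair: pair[1], reverse=True)
    labels = [label for label, _ in confident[:max_concepts]]
    if not labels:
        return "Image"  # nothing confident enough to describe
    return "Image may contain: " + ", ".join(labels)


alt = build_alt_text({"dog": 0.95, "grass": 0.85, "car": 0.30})
```

A high threshold trades coverage for precision, which matters when the reader cannot see the photo to catch a wrong label.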

Connectivity

[Figure: original GPW4 map compared with Facebook high-res map]

Detect Houses

• Indexing billions of photos
• Finding similar photos in microseconds

Binary encoding

1011001011…0101
1110101101…0010
0001111000…1010
1111111001…0001
1010101010…1001
0001111110…1010
0101101001…1111
1001111000…1010
0001001001…0010

Compact representations

Query image

Similar images
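Binary codes make similarity search cheap because two codes can be compared with an XOR and a bit count (Hamming distance). The sketch below is a minimal illustration, assuming codes are produced by thresholding feature values at zero and searched by linear scan. `encode`, `hamming`, and `most_similar` are hypothetical names, and the real encoder and index are not described in the slides.

```python
def encode(features):
    """Binarize a feature vector: one bit per value (1 if positive, else 0).

    A simple stand-in for a learned similarity-preserving hash.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a, b):
    """Number of bit positions where two codes differ."""
    return bin(a ^ b).count("1")


def most_similar(query_code, index, k=3):
    """Return the k photo ids whose codes are closest to the query code."""
    return sorted(index, key=lambda photo_id: hamming(query_code, index[photo_id]))[:k]


# Toy index of photo id -> 8-bit code (hypothetical data).
index = {
    "photo_a": 0b10110010,
    "photo_b": 0b10110011,
    "photo_c": 0b01001101,
}
neighbours = most_similar(0b10110010, index, k=2)
```

A linear scan like this would not scale to billions of photos; it only shows the distance computation that an index would accelerate.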

• Clusters hundreds of millions of photos into millions of clusters
• Approach: a fast binary k-means algorithm
– Works directly on similarity-preserving binary hashes of images.
– Clusters image hashes into binary centers.
– Builds hash indexes of binary centers to speed up computation.
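The clustering steps above can be sketched as k-means in Hamming space: assign each hash to the nearest binary center, then recompute each center by a per-bit majority vote. This is a minimal sketch under those assumptions, with hypothetical function names. It finds the nearest center by linear scan, whereas the slide mentions hash indexes over the centers to speed up that step.

```python
def hamming(a, b):
    """Number of bit positions where two codes differ."""
    return bin(a ^ b).count("1")


def majority_center(codes, num_bits):
    """Binary center: each bit is set if more than half of the codes set it."""
    center = 0
    for bit in range(num_bits):
        ones = sum((code >> bit) & 1 for code in codes)
        if 2 * ones > len(codes):  # ties resolve to 0
            center |= 1 << bit
    return center


def nearest_center(code, centers):
    """Index of the center closest to `code` (linear scan)."""
    return min(range(len(centers)), key=lambda i: hamming(code, centers[i]))


def binary_kmeans(codes, initial_centers, num_bits, iterations=10):
    """Cluster integer-encoded binary hashes; returns (centers, assignments)."""
    centers = list(initial_centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for code in codes:
            clusters[nearest_center(code, centers)].append(code)
        # Empty clusters keep their previous center.
        new_centers = [
            majority_center(members, num_bits) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    assignments = [nearest_center(code, centers) for code in codes]
    return centers, assignments


# Toy 8-bit hashes forming two obvious groups (hypothetical data).
codes = [0b00000000, 0b00000001, 0b00000010,
         0b11111111, 0b11111110, 0b11111101]
centers, assignments = binary_kmeans(codes, [codes[0], codes[3]], num_bits=8)
```

Because both the data and the centers stay binary, every distance is an XOR plus a bit count, which is what makes the approach fast.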

Video Understanding

Objects: Dog, Cat, ...

Shot boundary detection

Caption: Dog chasing cat in garden while people are laughing

Action: Chasing

Scene: Garden

Summarization

Saliency Detection

Dynamic Compression

Future Prediction

Video Q&A

Beating humans on identifying sports

Continuous stream of videos

Mobile Vision

Accuracy

Speed

Size

Small, Fast, Accurate models

Mobile Vision

Pose estimation


3D Point Cloud

Thank you!
