Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout

Transcript of Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout

Page 1: Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout

KAFKA + Building the World's Realtime Transit Infrastructure

Chinmay Soman
Chinmay to come up with a gif for this
Nora Sermez
What tool did you use to capture this?
Chinmay Soman
I got it from one of Nico's RFCs: https://docs.google.com/document/d/1BGA4SfsYIB7vkAEOGd1GcaGSInasCb6EYI6hQ_WF6G8/edit#
Aaron Schildkrout
do we want to add stuff around work we've done around schemas, watchtower, etc - I think the more we can SHOW visually our innovations to make Kafka work at scale the better.
Chinmay Soman
Added a slide for Watchtower
Aaron Schildkrout
In general - bigger visuals are better. Think TED talk style.
Chinmay Soman
Any idea if we have some slides related to Heaven already?
Nicolas Garcia Belmonte
I added something. It's not comms-approved but seems to be obfuscated enough. I would check though.
Aaron Schildkrout
how do we do that?
Aaron Schildkrout
do we want to add stuff around work we've done around schemas, watchtower, etc - I think the more we can SHOW visually our innovations to make Kafka work at scale the better.
Chinmay Soman
I'll add a slide for watchtower + schemas
Aaron Schildkrout
not following these next two slides and what point i'm making
Aaron Schildkrout
another: maybe fraud in china?
Aaron Schildkrout
In general the background color of these slides feels a little dull and deadening. Whiter is probably better.
Chinmay Soman
Can you take care of this, please?
Aaron Schildkrout
This and the previous slide should be merged I think - and then we should think through them. My instinct is that we actually want 1 slide for each key innovation that allowed us to make kafka work at scale.
Chinmay Soman
Adding 1 slide for each contribution
Aaron Schildkrout
Can I show heaven live for NYC?
Aaron Schildkrout
I can speak about kafka powering surge - we don't have to show the architecture. I think what's much more interesting is the actual use cases - so here ideally we would show an animated view of hex surge. I bet we have this lying around.
Aaron Schildkrout
need better visual here - to show something more hourly - like the nice new hourly views in data.dot
Aaron Schildkrout
an animated view would be good - maybe get from Nico or Abhishek
Chinmay Soman
Can you help me get a good slide for this?
Aaron Schildkrout
I want a slide that lists all the key components of our system that rely on RT/stream processing - and something like a kafka implementation. These would animate in as I explain them.
Page 2

For Illustration only

Page 3
Page 4

SURGE - CIRCA 2013

Page 5

SURGE - CIRCA 2016

Chinmay Soman
This is where we can talk about the new "hex" surge
Aaron Schildkrout
pls fill in the XYZ and X in the notes.
Aaron Schildkrout
can we show hexagonal surge in alloy?
Chinmay Soman
Similar question: any idea who might have such slides? I did a general scan on Google Drive, but to no avail.
Nicolas Garcia Belmonte
I've seen a few shots here: https://newsroom.uber.com/new-partner-app/
Aaron Schildkrout
for surge, allenkey, etc - we can ask ted moran
Chinmay Soman
I reached out to Ted
Aaron Schildkrout
also pls fill in xyz in notes below.
Page 6

DATA CONSUMERS

Real-time, Fast Analytics

BATCH PIPELINE

Storm

Applications

Data Science

Analytics

Reporting

KAFKA

VERTICA

RIDER APP

DRIVER APP

API / SERVICES

DISPATCH (gps logs)

Mapping & Logistics

Ad-hoc exploration

ELK

Samza

Alerts, Dashboards

Debugging

REAL-TIME PIPELINE

HADOOP

Surge

Mobile App

DATA PRODUCERS

KAFKA 8 ECOSYSTEM @UBER

Page 7

Product Features

Predictive Models

Operational Analytics

Business Intelligence

INFRASTRUCTURE ECOSYSTEM

Page 8

NEAR REALTIME PRICE SURGING

PRODUCT FEATURES

Page 9

FRAUD - ANOMALY DETECTION

PREDICTIVE MODELS

Page 10

PREDICTIVE MODELS

ETA

Page 11

OPERATIONAL ANALYTICS

Page 12

UberEATS

OPERATIONAL ANALYTICS

Page 13

XP

OPERATIONAL ANALYTICS

Page 14

BUSINESS INTELLIGENCE

Page 15

KAFKA 8

KAFKA 7 MIGRATOR

Limited Availability

Difficult to Scale

Not multi-DC

Multi-lang incompatibility

Multi-DC, multi-language support

2013

2014

2015 - 2016

KAFKA 7 WORLD

Difficult to Operate

Producer Scale Issues

High Availability

High Scalability

Kafka 7 + Mirrormaker

Deployed everywhere

Kafka 7 migrator

Deployed everywhere

New Kafka 8 pipeline

Page 16

Kafka 7

Mirrormaker 2.0

REST architecture

Data Audit

Automated Topic Mgmt

Page 17

Logs Business events

Async REST library

Data Audit

Local spooling

High throughput custom protocol

REST ARCHITECTURE

REST Proxy
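
The labels on this slide (a REST client library, local spooling, a REST proxy in front of Kafka) suggest a producer that buffers messages and falls back to local disk when the proxy is unreachable. The following is a minimal sketch of that idea only, not Uber's implementation: `SpoolingProducer`, `send_batch`, `spool_path`, and `batch_size` are hypothetical names, `send_batch` stands in for whatever call would post a batch to the proxy, and the background thread an asynchronous library would use is omitted so that flushing is explicit.

```python
import json
import os
import tempfile


class SpoolingProducer:
    """Buffers messages in memory and spools them to disk if sending fails."""

    def __init__(self, send_batch, spool_path, batch_size=100):
        self.send_batch = send_batch  # hypothetical callable: posts a list of messages
        self.spool_path = spool_path  # hypothetical local spool file (JSON lines)
        self.batch_size = batch_size
        self.buffer = []

    def produce(self, topic, payload):
        """Queue one message; flush once a full batch has accumulated."""
        self.buffer.append({"topic": topic, "payload": payload})
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Try to send the buffered batch; on any error, append it to the spool."""
        batch, self.buffer = self.buffer, []
        if not batch:
            return
        try:
            self.send_batch(batch)
        except Exception:
            with open(self.spool_path, "a", encoding="utf-8") as f:
                for msg in batch:
                    f.write(json.dumps(msg) + "\n")

    def replay_spool(self):
        """Resend spooled messages; the spool is deleted only after a successful send."""
        if not os.path.exists(self.spool_path):
            return 0
        with open(self.spool_path, encoding="utf-8") as f:
            batch = [json.loads(line) for line in f if line.strip()]
        if batch:
            self.send_batch(batch)  # raises if the proxy is still down
        os.remove(self.spool_path)
        return len(batch)
```

With a `send_batch` that raises while the proxy is down, `produce` leaves the messages in the spool file, and a later `replay_spool()` delivers them in their original order once sending succeeds.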

Page 18

Automated Schema and Topic Management

Page 19

Mirrormaker 2.0

Robust

Data Audit

Dynamic topics

MIRROR MAKER 2.0

Destination DC

Source DC

Page 20

Msg counts across multiple DCs

End-to-end latencies across multiple DCs

DATA AUDIT FOR KAFKA MESSAGES
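
One way to read the two metrics on this slide is as a per-window comparison: count messages per topic and time window in each data center, then flag windows where the destination saw fewer messages than the source. The sketch below illustrates that reading under stated assumptions; it is not the audit system from the talk. The function names, the 10-minute `WINDOW_SECONDS`, and the `(topic, event_ts)` message shape are all hypothetical.

```python
from collections import Counter

WINDOW_SECONDS = 600  # hypothetical 10-minute audit window


def bucket(ts):
    """Start of the audit window that contains timestamp ts (in seconds)."""
    return int(ts // WINDOW_SECONDS) * WINDOW_SECONDS


def count_by_window(messages):
    """Count messages per (topic, window); each message is (topic, event_ts)."""
    counts = Counter()
    for topic, event_ts in messages:
        counts[(topic, bucket(event_ts))] += 1
    return counts


def audit(source_counts, dest_counts):
    """Map each (topic, window) to the number of messages missing at the destination."""
    return {
        key: n - dest_counts.get(key, 0)
        for key, n in source_counts.items()
        if dest_counts.get(key, 0) < n
    }


def end_to_end_latency(event_ts, arrival_ts):
    """Seconds between event creation and arrival in the destination DC."""
    return arrival_ts - event_ts


source = [("trips", 5.0), ("trips", 30.0), ("trips", 610.0)]
dest = [("trips", 5.0), ("trips", 610.0)]
print(audit(count_by_window(source), count_by_window(dest)))  # {('trips', 0): 1}
```

Bucketing by event time rather than arrival time means a delayed message still lands in the same window on both sides, so the count comparison and the latency measurement stay independent.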

Page 21

Mirrormaker 2.0

REST architecture

Data Audit

Kafka 8

Automated Topic Mgmt

Page 22

A ROBUST FUTURE

0 data loss messaging system

Data discovery and lineage

Quota management

Self-correcting brokers

Active-active data pipelines

Page 23

Real-time Data

Dynamic SQL(ish)

Real-time decision

THE FUTURE

Real-time Data

Custom Application

Real-time decision

THE PRESENT

Page 24

TELEMATICS

Page 25

SELF DRIVING CAR

Page 26
Page 27

Thank you, Kafka Community!