New Metrics Engine to Help Drive UBER

Transcript of New Metrics Engine to Help Drive UBER

Page 1: New Metrics Engine to Help Drive UBER

t.uber.com/scala2016

Sasha Ovsankin, Uber

New Metrics Engine to Help Drive Uber

November 2016

Page 2: New Metrics Engine to Help Drive UBER

“Transportation as reliable as running water, everywhere, for everyone”

Page 3: New Metrics Engine to Help Drive UBER

About Me

Data & Analytics Engineer

Mathematical Physics, Lomonosov Moscow University

Contact

● [email protected]
● linkedin.com/in/sashao
● http://t.uber.com/scala2016

Page 4: New Metrics Engine to Help Drive UBER

What Do We Work On

Experimentation Platform: Fuel Uber’s innovation, make the software release cycle more robust and data-driven

Uber Data Platform: Cutting-edge data platforms powering Uber’s intelligence

Page 5: New Metrics Engine to Help Drive UBER

What This Talk Is About

Building a company-wide Metrics Platform is possible and practical, and you should do it

Page 6: New Metrics Engine to Help Drive UBER

Agenda

● Why Metrics Platform
● Technology
● Process
● Conclusion

Page 7: New Metrics Engine to Help Drive UBER

How Do You Want Your Metrics?

Aligned

Reliable

Trusted

Page 8: New Metrics Engine to Help Drive UBER

Uber Situation

● Over 450 cities in over 70 countries

● Lots of growth:
  ○ 1B rides by Dec 2015, 2B rides by June 2016

● Teams have a high level of independence

Page 9: New Metrics Engine to Help Drive UBER

How do you make data-driven decisions in a business like that?

Page 10: New Metrics Engine to Help Drive UBER

Metrics Platform = Technology + Process

Page 11: New Metrics Engine to Help Drive UBER

Our Metrics Platform: Architecture and Process

[Architecture diagram. Components: Engines (Spark / Hive / Real Time), Registry, Council, Web-UI, BI Tool UI. Users: DS / Ops / Product. Interfaces: Definition DSL, Query DSL.]

Page 12: New Metrics Engine to Help Drive UBER

Our Metrics Platform

Easy & Powerful

Integrated

Lightweight Process

Page 13: New Metrics Engine to Help Drive UBER

Metric Walkthrough

Metric: hours active

English description: Hours spent by drivers logged in and online in the driver app

SQL:

    select sum(ac.minutes_active / 60) / count(*)
    from derived.driver_activity ac
    right join dim.driver dr on dr.id = ac.driver_id

Page 14: New Metrics Engine to Help Drive UBER

Metric Walkthrough, Continued

Add date:

    select sum(ac.minutes_active / 60) / count(*)
    from derived.driver_activity ac
    right join dim.driver dr on dr.id = ac.driver_id
    where ac.timestamp >= '2016-10-01' and ac.timestamp < '2016-11-01'

In San Francisco:

    select sum(ac.minutes_active / 60) / count(*)
    from derived.driver_activity ac
    right join dim.driver dr on dr.id = ac.driver_id
    join dim.city c on c.id = ac.city_id
    where ac.timestamp >= '2016-10-01' and ac.timestamp < '2016-11-01'
      and c.name = 'San Francisco'

Page 15: New Metrics Engine to Help Drive UBER

Metric Walkthrough, Continued #2

Group by experiment treatment:

    select sum(ac.minutes_active / 60) / count(*)
    from derived.driver_activity ac
    right join dim.driver dr on dr.id = ac.driver_id
    join dim.city c on c.id = ac.city_id
    join xp.user_experiment xp on xp.user_id = dr.id
    where ac.timestamp >= '2016-10-01' and ac.timestamp < '2016-11-01'
      and c.name = 'San Francisco'
      and xp.experiment_key = 'crm_driveronboarding_wcdrip'
    group by xp.treatment

Group by driver type:

    select sum(ac.minutes_active / 60) / count(*)
    from derived.driver_activity ac
    right join dim.driver dr on dr.id = ac.driver_id
    join dim.city c on c.id = ac.city_id
    join xp.user_experiment xp on xp.user_id = dr.id
    join model.driver dm on dm.id = dr.id
    where ac.timestamp >= '2016-10-01' and ac.timestamp < '2016-11-01'
      and c.name = 'San Francisco'
      and xp.experiment_key = 'crm_driveronboarding_wcdrip'
    group by xp.treatment, dm.type

Page 16: New Metrics Engine to Help Drive UBER

Complicated

Unmanageable

Fragile

Page 17: New Metrics Engine to Help Drive UBER

Anatomy of a Metric

[Diagram. Pipeline stages: Input, Pre-aggregation transformations, Aggregation, Final Join, Post-aggregation formulas, Results. Other labels: Dimensions, Metric definitions, Filters. Tables are drawn with columns dim1, dim2, m1, m2, ...]

    select sum(ac.minutes_active / 60) / count(*)
    from derived.driver_activity ac
    right join dim.driver dr on dr.id = ac.driver_id
    where ac.timestamp >= '2016-10-01' and ac.timestamp < '2016-11-01'
    group by dr.city
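The stages named on this slide can be sketched on in-memory collections. This is only an illustration, not the engine from the talk: the case classes, the sample rows, and the `avgHoursActiveByCity` helper are hypothetical, and plain Scala collections stand in for Spark or Hive. The denominator is the number of drivers per city, following the formula `sum(minutes_active / 60) / count(dim.driver)` used in the talk.

```scala
object MetricAnatomy {
  // Hypothetical input rows standing in for derived.driver_activity and dim.driver.
  final case class Activity(driverId: Int, minutesActive: Double)
  final case class Driver(id: Int, city: String)

  val activity: Seq[Activity] =
    Seq(Activity(1, 120.0), Activity(1, 60.0), Activity(2, 30.0))
  val drivers: Seq[Driver] =
    Seq(Driver(1, "SF"), Driver(2, "SF"), Driver(3, "NYC"))

  def avgHoursActiveByCity(activity: Seq[Activity], drivers: Seq[Driver]): Map[String, Double] = {
    val cityOf: Map[Int, String] = drivers.map(d => d.id -> d.city).toMap

    // Pre-aggregation transformation: minutes -> hours, tagged with the dimension value.
    val hoursWithCity: Seq[(String, Double)] =
      activity.flatMap(a => cityOf.get(a.driverId).map(city => city -> a.minutesActive / 60.0))

    // Aggregation: one aggregate per input, keyed by the dimension.
    val hoursByCity: Map[String, Double] =
      hoursWithCity.groupBy(_._1).map { case (city, rows) => city -> rows.map(_._2).sum }
    val driversByCity: Map[String, Int] =
      drivers.groupBy(_.city).map { case (city, ds) => city -> ds.size }

    // Final join on the dimension, then the post-aggregation formula (sum / count).
    driversByCity.map { case (city, n) => city -> hoursByCity.getOrElse(city, 0.0) / n }
  }

  val result: Map[String, Double] = avgHoursActiveByCity(activity, drivers)
}
```

Cities with drivers but no activity still appear in the result with a value of 0.0, which mirrors what the right join on dim.driver is meant to achieve in the SQL version.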

Page 18: New Metrics Engine to Help Drive UBER

Metric = Formula + Query

Query:

    select(core.avg_driver_hours_active)
      where(dim_city.name === "San Francisco")
      over(days(7) upto today)
      groupBy(driver_model.category, user_experiment.treatment)

SQL:

    select sum(ac.minutes_active / 60) / count(*)
    from derived.driver_activity ac
    right join dim.driver dr on dr.id = ac.driver_id
    join dim.city c on c.id = ac.city_id
    where ac.timestamp >= '2016-10-01' and ac.timestamp < '2016-11-01'
      and c.name = 'San Francisco'
    ...

Formula:

    avg_driver_hours_active = sum(agg.driver_activity.minutes_active / 60) / count(dim.driver)

Page 19: New Metrics Engine to Help Drive UBER

The Metrics DSL: Formula

    val hours_online = driver_activity.minutes_active / 60
    val all_drivers = count(dim_driver)
    val avg_driver_hours_online = sum(hours_online) / all_drivers

[Expression tree: "/" at the root over sum and count; sum wraps "/" of driver_activity.minutes_active and 60; count wraps dim_driver.]
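One way a formula like this can become an expression tree is operator overloading on a small algebraic data type. The sketch below is an assumption about the general technique, not the engine's actual types: `Expr`, `Column`, `Table`, `Const`, `Div`, `Sum`, `Count`, and `render` are all hypothetical names introduced for illustration.

```scala
object FormulaSketch {
  // Hypothetical expression tree; the talk does not show the engine's real types.
  sealed trait Expr {
    def /(that: Expr): Expr = Div(this, that)
  }
  final case class Column(table: String, name: String) extends Expr
  final case class Table(name: String) extends Expr
  final case class Const(value: Double) extends Expr
  final case class Div(left: Expr, right: Expr) extends Expr
  final case class Sum(arg: Expr) extends Expr
  final case class Count(arg: Expr) extends Expr

  def sum(e: Expr): Expr = Sum(e)
  def count(e: Expr): Expr = Count(e)

  // Lets an integer literal appear where an Expr is expected, as in `minutes_active / 60`.
  import scala.language.implicitConversions
  implicit def intToConst(i: Int): Expr = Const(i.toDouble)

  // Render the tree back to a SQL-like string.
  def render(e: Expr): String = e match {
    case Column(t, n) => s"$t.$n"
    case Table(n)     => n
    case Const(v)     => if (v == v.toLong) v.toLong.toString else v.toString
    case Div(l, r)    => s"(${render(l)} / ${render(r)})"
    case Sum(a)       => s"sum(${render(a)})"
    case Count(a)     => s"count(${render(a)})"
  }

  // The three definitions from the slide, built on the hypothetical types above.
  val hours_online = Column("driver_activity", "minutes_active") / 60
  val all_drivers = count(Table("dim_driver"))
  val avg_driver_hours_online = sum(hours_online) / all_drivers
}
```

Because the formula is plain data, the same tree could in principle be rendered for different backends, which fits the slide's split between formula and query.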

Page 20: New Metrics Engine to Help Drive UBER

The Metrics DSL: Query

    val query = (select(avg_driver_hours_online)
      where (dimDriver.partner_city_id === "San Francisco")
      over (days * 7 towards today)
      groupBy (dimDriver.partner_city_id))

    val df = engine.toDF(query)

[Diagram labels: DSL, DataFrame, Output]
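The infix style of the query DSL can be approximated with an immutable builder whose clause methods each return a new query. This is a sketch under assumptions: `Field`, `Filter`, `Window`, and `Query` are hypothetical types, the metric is referenced by name rather than by a formula value, and the time window is simplified to `days(7)` instead of the slide's `days * 7 towards today`.

```scala
object QuerySketch {
  // Hypothetical model of a query; names mirror the slide, types are assumptions.
  final case class Field(table: String, name: String) {
    def ===(value: String): Filter = Filter(this, value)
  }
  final case class Filter(field: Field, value: String)
  final case class Window(days: Int)

  final case class Query(
      metric: String,
      filters: List[Filter] = Nil,
      window: Option[Window] = None,
      dimensions: List[Field] = Nil) {
    // Each clause returns a new immutable Query, so clauses chain in infix style.
    def where(f: Filter): Query = copy(filters = filters :+ f)
    def over(w: Window): Query = copy(window = Some(w))
    def groupBy(fields: Field*): Query = copy(dimensions = dimensions ++ fields)
  }

  def select(metric: String): Query = Query(metric)
  def days(n: Int): Window = Window(n)

  val city = Field("dimDriver", "partner_city_id")

  // The outer parentheses let the infix chain span several lines.
  val query: Query = (select("avg_driver_hours_online")
    where (city === "San Francisco")
    over (days(7))
    groupBy (city))
}
```

The resulting `Query` value is only a description of what to compute; turning it into a DataFrame, as `engine.toDF(query)` does on the slide, would be the job of a separate engine.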

Page 21: New Metrics Engine to Help Drive UBER

✔ Easy & Powerful

Integrated

Lightweight Process

Page 22: New Metrics Engine to Help Drive UBER

The Engine Core

Company Schema Repository

Page 23: New Metrics Engine to Help Drive UBER

The Engine Core

Company Schema Repository

Table Schemas

Foreign key Relationships

Page 24: New Metrics Engine to Help Drive UBER

The Engine Core

Company Schema Repository

Table Schemas

Foreign key Relationships

Engine Configuration

Page 25: New Metrics Engine to Help Drive UBER

The Engine Core

Company Schema Repository

Table Schemas

Foreign key Relationships

Engine Configuration

Engine Core

Query

Page 26: New Metrics Engine to Help Drive UBER

The Engine Core

Company Schema Repository

Table Schemas

Foreign key Relationships

Engine Configuration

Engine Core

Query

Execution Plan
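The talk does not show how the Engine Core turns a query into an execution plan. One plausible ingredient, given that the schema repository stores foreign key relationships, is deriving join clauses automatically by searching the foreign-key graph. The sketch below assumes exactly that; `ForeignKey`, `foreignKeys`, and `joinPath` are hypothetical, and the table and column names are taken from the SQL shown earlier in the talk.

```scala
object JoinPlanSketch {
  // Hypothetical foreign-key edge: fromTable.fromColumn references toTable.toColumn.
  final case class ForeignKey(fromTable: String, fromColumn: String, toTable: String, toColumn: String)

  // Toy schema repository (an assumption, not Uber's actual schema metadata).
  val foreignKeys: List[ForeignKey] = List(
    ForeignKey("derived.driver_activity", "driver_id", "dim.driver", "id"),
    ForeignKey("derived.driver_activity", "city_id", "dim.city", "id"),
    ForeignKey("xp.user_experiment", "user_id", "dim.driver", "id")
  )

  // Breadth-first search over the foreign-key graph (edges usable in both directions)
  // to find the join clauses that connect `start` to `target`.
  def joinPath(start: String, target: String): Option[List[String]] = {
    @annotation.tailrec
    def loop(frontier: List[(String, List[String])], seen: Set[String]): Option[List[String]] =
      frontier match {
        case Nil => None
        case (table, path) :: _ if table == target => Some(path.reverse)
        case (table, path) :: rest =>
          val next: List[(String, List[String])] = foreignKeys.collect {
            case ForeignKey(a, ac, b, bc) if a == table && !seen(b) =>
              b -> (s"join $b on $b.$bc = $a.$ac" :: path)
            case ForeignKey(a, ac, b, bc) if b == table && !seen(a) =>
              a -> (s"join $a on $a.$ac = $b.$bc" :: path)
          }
          loop(rest ++ next, seen ++ next.map(_._1))
      }
    loop(List((start, List.empty[String])), Set(start))
  }
}
```

With a lookup like this, a query author names only the metric and the dimensions, and the joins that made the hand-written SQL grow on each walkthrough slide are derived from metadata.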

Page 27: New Metrics Engine to Help Drive UBER

✔ Easy & Powerful

✔ Integrated

Lightweight Process

Page 28: New Metrics Engine to Help Drive UBER

Our Metrics Platform: Architecture and Process

[Architecture diagram. Components: Engines (Spark / Hive / Real Time), Registry, Council, Web-UI, BI Tool UI. Users: DS / Ops / Product. Interfaces: Definition DSL, Query DSL.]

Page 29: New Metrics Engine to Help Drive UBER

Metric Creation Process

Page 30: New Metrics Engine to Help Drive UBER

Metric Management Web UI

Video link: https://youtu.be/we3q6O4eZIg

Page 31: New Metrics Engine to Help Drive UBER

Our Metrics Platform: Technology

✔ Easy & Powerful

✔ Integrated

✔ Lightweight Process

Page 32: New Metrics Engine to Help Drive UBER

Users

● Experimentation
● Product groups
● Financial reporting
● Real time decision making
● Fraud detection

Page 33: New Metrics Engine to Help Drive UBER

Future Direction

● Further adoption within Uber
● Further work on DSL
● More Engines
● Real Time
● Open Source?

Interested?

● http://t.uber.com/scala2016
● [email protected]

Page 34: New Metrics Engine to Help Drive UBER

What this talk was about

Building a company-wide Metrics Platform is possible and practical, and you should do it

Page 35: New Metrics Engine to Help Drive UBER

The Metrics Platform Team

Contact us:
● http://t.uber.com/scala2016
● [email protected]

We are hiring!

Page 36: New Metrics Engine to Help Drive UBER

Questions?

Page 37: New Metrics Engine to Help Drive UBER

Thank you

Proprietary and confidential © 2016 Uber Technologies, Inc. All rights reserved. No part of this document may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval systems, without permission in writing from Uber. This document is intended only for the use of the individual or entity to whom it is addressed and contains information that is privileged, confidential or otherwise exempt from disclosure under applicable law. All recipients of this document are notified that the information contained herein includes proprietary and confidential information of Uber, and recipient may not make use of, disseminate, or in any way disclose this document or any of the enclosed information to any person other than employees of addressee to the extent necessary for consultations with authorized personnel of Uber.

Image credits:
● Erik bij de Vaate
● Bernard Spragg