Evolving the Netflix API

Evolving the Netflix API Katharina Probst Engineering Manager, API October 2015

Transcript of Evolving the Netflix API

Page 1: Evolving the Netflix API

Evolving the Netflix API
Katharina Probst
Engineering Manager, API
October 2015

Page 2: Evolving the Netflix API

What is Netflix?

Page 3: Evolving the Netflix API

> 1000 Devices

Page 5: Evolving the Netflix API

We’re going global!

Page 6: Evolving the Netflix API

Source: https://help.netflix.com/en/node/14164

Recent additions: Spain, Portugal, Italy

Current availability

Page 7: Evolving the Netflix API

Netflix Originals

Page 8: Evolving the Netflix API

Do we need a Netflix API?

Page 9: Evolving the Netflix API

API

Personalization Engine

User Info, Ratings, Similar Movies, A/B Test Engine, ...

Page 10: Evolving the Netflix API

Uses

❏ Discovery
❏ Signup
❏ Playback

❏ Internal teams only

API

Page 11: Evolving the Netflix API

Goals

❏ Flexibility
❏ Resiliency
❏ Scalability
❏ Excellent tools

API

Page 12: Evolving the Netflix API

Goals

❏ Flexibility
❏ Resiliency
❏ Scalability
❏ Excellent tools

API

Page 13: Evolving the Netflix API

Lots of devices, lots of variety

Page 14: Evolving the Netflix API

Different interaction models

Page 15: Evolving the Netflix API

And just to make things a little more interesting...

❏ A/B tests
❏ profiles
❏ localization

Page 16: Evolving the Netflix API

What we felt we had

What we needed

Page 17: Evolving the Netflix API

❏ Reduce network chattiness

❏ Support device optimizations

❏ Enable faster development for internal users

Page 18: Evolving the Netflix API

Remote API: GET /users/{user_id}/lists

Local Method: apiGateway.getLists(userId)

Page 19: Evolving the Netflix API
Page 20: Evolving the Netflix API

Discrete HTTP requests pay network tax repeatedly

Page 21: Evolving the Netflix API

Single, optimized request; pay network tax once

Page 22: Evolving the Netflix API

Single, optimized request; pay network tax once

Client data assembly logic pushed to server
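The contrast between discrete requests and a single optimized request can be sketched roughly as follows. This is only an illustration: the `services` object, its three methods, and the `homeEndpoint` handler are hypothetical stand-ins, and the round-trip counter simply makes the "network tax" visible.

```javascript
// Hypothetical local service methods (assumed names, invented data).
const services = {
  getUser: (userId) => ({ id: userId, name: "Example User" }),
  getLists: (userId) => [
    { id: "continue-watching", videoIds: [1, 2] },
    { id: "top-picks", videoIds: [2, 3] },
  ],
  getVideo: (videoId) => ({ id: videoId, title: `Title ${videoId}` }),
};

// Chatty style: the device assembles the screen itself and pays one
// round trip per call. The counter stands in for real network requests.
function chattyClient(userId) {
  let roundTrips = 0;
  const remote = (fn, arg) => {
    roundTrips += 1;
    return fn(arg);
  };
  const user = remote(services.getUser, userId);
  const lists = remote(services.getLists, userId);
  const rows = lists.map((list) => ({
    id: list.id,
    videos: list.videoIds.map((id) => remote(services.getVideo, id)),
  }));
  return { roundTrips, payload: { user: user.name, rows } };
}

// Server-side endpoint: the same assembly logic, but running next to the
// services, so the calls are local method calls.
function homeEndpoint(userId) {
  const user = services.getUser(userId);
  const rows = services.getLists(userId).map((list) => ({
    id: list.id,
    videos: list.videoIds.map((id) => services.getVideo(id)),
  }));
  return { user: user.name, rows };
}

// Optimized style: the device makes a single request to that endpoint.
function optimizedClient(userId) {
  return { roundTrips: 1, payload: homeEndpoint(userId) };
}

console.log(chattyClient("u1").roundTrips, optimizedClient("u1").roundTrips);
```

Both clients end up with the same payload; only the number of trips across the network differs.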

Page 23: Evolving the Netflix API

Add server-side scripting capability

❏ Enable independent development & device optimization

❏ Profit

Page 24: Evolving the Netflix API

❏ UI (script) changes can happen independently

❏ Script changes can be pushed to running servers, so decoupled from API push schedule

❏ Server+UI changes usually involve API team

Impact on velocity and collaboration
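One way to picture scripts being pushed to running servers is a registry that swaps an endpoint's handler without restarting the process. The sketch below is an assumption-laden toy: `ScriptRegistry`, `publish`, and `handle` are invented names, and the real mechanism is not described on the slides.

```javascript
// Hypothetical registry of endpoint scripts, keyed by route.
class ScriptRegistry {
  constructor() {
    this.scripts = new Map();
  }

  // Publishing replaces the handler for a route while the server keeps running.
  publish(route, version, handler) {
    this.scripts.set(route, { version, handler });
  }

  // Dispatch a request to whichever script version is currently published.
  handle(route, request) {
    const entry = this.scripts.get(route);
    if (!entry) return { status: 404, body: `no script for ${route}` };
    return { status: 200, version: entry.version, body: entry.handler(request) };
  }
}

const registry = new ScriptRegistry();

// A UI team publishes version 1 of its endpoint script.
registry.publish("/tv/home", 1, (req) => ({ rows: 3, user: req.userId }));
const before = registry.handle("/tv/home", { userId: "u1" });

// Later the same team publishes version 2; no server deployment is involved.
registry.publish("/tv/home", 2, (req) => ({ rows: 5, user: req.userId }));
const after = registry.handle("/tv/home", { userId: "u1" });

console.log(before.version, after.version);
```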

Page 25: Evolving the Netflix API

RxJava Hystrix

Java Service Layer

Mid-tier Services

UI Teams

Client Server

Internet

Application

/tv/home

API Team

Service Teams

Page 26: Evolving the Netflix API

ELB, Zuul, Mid-tier Services

Scriptable Backend

Scriptable Backend

+

API Layer

Page 27: Evolving the Netflix API

Goals

❏ Flexibility
❏ Resiliency
❏ Scalability
❏ Excellent tools

API

Page 28: Evolving the Netflix API

https://github.com/Netflix/Hystrix

resilience patterns for distributed systems

Page 29: Evolving the Netflix API

Hystrix Primer

❏ Protection from and control over latency and failure from dependencies
❏ Stop cascading failures in a complex distributed system
❏ Fail fast and rapidly recover
❏ Fall back and gracefully degrade
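The four points above can be illustrated with a small circuit breaker. This is a minimal sketch of the general pattern, not the Hystrix library or its API: `CircuitBreaker`, its options, and the `getRatings` dependency are all hypothetical, and the clock is injected so the behaviour is easy to follow.

```javascript
// Minimal circuit breaker: fail fast while open, fall back on any failure,
// and allow a trial call once the cool-down has passed.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetAfterMs = 1000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null;
  }

  isOpen() {
    if (this.openedAt === null) return false;
    // After the cool-down the circuit lets one call through again.
    return this.now() - this.openedAt < this.resetAfterMs;
  }

  execute(command, fallback) {
    // Fail fast: do not touch the dependency while the circuit is open.
    if (this.isOpen()) return fallback(new Error("circuit open"));
    try {
      const result = command();
      this.failures = 0;
      this.openedAt = null;
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) this.openedAt = this.now();
      // Degrade gracefully instead of propagating the failure.
      return fallback(err);
    }
  }
}

// Usage with a fake clock and a dependency that is down at first.
let clock = 0;
let calls = 0;
let healthy = false;
const breaker = new CircuitBreaker({ failureThreshold: 2, resetAfterMs: 100, now: () => clock });
const getRatings = () => {
  calls += 1;
  if (!healthy) throw new Error("timeout");
  return [5, 4];
};
const noRatings = () => [];

const first = breaker.execute(getRatings, noRatings);   // dependency fails, fallback used
const second = breaker.execute(getRatings, noRatings);  // second failure opens the circuit
const third = breaker.execute(getRatings, noRatings);   // short-circuited, dependency not called
const callsWhileOpen = calls;

clock = 150;     // cool-down has elapsed
healthy = true;  // dependency has recovered
const fourth = breaker.execute(getRatings, noRatings);  // trial call succeeds, circuit closes
```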

Page 30: Evolving the Netflix API

Personalization Engine

Similar Movies

Movie Metadata, Ratings, User Info, Instant Queue, A/B Test Engine

API

Page 31: Evolving the Netflix API

Personalization Engine

Similar Movies

Movie Metadata, Ratings, User Info, Instant Queue, A/B Test Engine

API

Page 32: Evolving the Netflix API

API

Personalization Engine

Similar Movies

Movie Metadata, Ratings, User Info, Instant Queue, A/B Test Engine

Beware Cascading Failure!

Page 33: Evolving the Netflix API

Personalization Engine

Similar Movies

Movie Metadata, Ratings, User Info, Instant Queue, A/B Test Engine

API

Page 34: Evolving the Netflix API

Personalization Engine

Similar Movies

Movie Metadata, Ratings, User Info, Instant Queue, A/B Test Engine

Fallback Response

Local Fallback Avoids Cascading Failure!

API

Page 35: Evolving the Netflix API

Personalization Engine

Similar Movies

Movie Metadata, Ratings, User Info, Instant Queue, A/B Test Engine

Fallback Response

Use FIT to test such failures

API
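Testing a fallback by injecting a failure might look roughly like this. The sketch assumes the injected failure travels with the request in a context object; that mechanism, and the names `callDependency`, `tvHome`, and `failures`, are illustrative assumptions rather than a description of how FIT is built.

```javascript
// Hypothetical dependencies of an API endpoint (invented data).
const dependencies = {
  personalization: (userId) => [`row-for-${userId}`],
  ratings: (userId) => ({ userId, stars: 4 }),
};

// Injection point: if the request context marks a dependency as failed,
// throw instead of calling it.
function callDependency(name, ctx, userId) {
  if (ctx.failures && ctx.failures.has(name)) {
    throw new Error(`injected failure: ${name}`);
  }
  return dependencies[name](userId);
}

// Endpoint under test: every dependency has a local fallback, so a single
// failing dependency degrades the response instead of failing it.
function tvHome(userId, ctx = {}) {
  const withFallback = (name, fallback) => {
    try {
      return { value: callDependency(name, ctx, userId), degraded: false };
    } catch (err) {
      return { value: fallback, degraded: true };
    }
  };
  const rows = withFallback("personalization", ["popular-titles"]);
  const ratings = withFallback("ratings", null);
  return {
    rows: rows.value,
    ratings: ratings.value,
    degraded: rows.degraded || ratings.degraded,
  };
}

// Normal request, then the same request with personalization failed on purpose.
const normal = tvHome("u1");
const injected = tvHome("u1", { failures: new Set(["personalization"]) });

console.log(normal.rows, injected.rows);
```

The injected request still gets a usable response: unpersonalized rows from the fallback, real ratings from the healthy dependency.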

Page 36: Evolving the Netflix API

Goals

❏ Flexibility
❏ Resiliency
❏ Scalability
❏ Excellent tools

API

Page 37: Evolving the Netflix API

Autoscaling & Capacity Management

http://nflx.it/1LvqLUi

Page 38: Evolving the Netflix API

AWS Controls: reactive, does not scale up fast enough

Page 39: Evolving the Netflix API

Fine-grained Control with Scryer

Complements AWS Controls

❏ Faster scale-up, improved cost
❏ Use reactive policy for organic scale down
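A rough sketch of how a predictive plan can complement a reactive one: size the cluster ahead of the forecast, but never below what current load requires. Everything here is an assumption for illustration, including the per-instance capacity, the headroom factor, and the forecasting method (mean of the same hour on previous days), which is a placeholder and not Scryer's algorithm.

```javascript
// Assumed capacity: requests per second that one instance can serve.
const RPS_PER_INSTANCE = 100;

// Predictive part: forecast load for an hour of the day as the mean of that
// hour on previous days. `history` is an array of days, each an array of RPS.
function predictRps(history, hourOfDay) {
  const samples = history.map((day) => day[hourOfDay]);
  return samples.reduce((sum, rps) => sum + rps, 0) / samples.length;
}

// Reactive part: size the cluster for the load observed right now.
function reactiveInstances(currentRps) {
  return Math.ceil(currentRps / RPS_PER_INSTANCE);
}

// Combined plan: scale up ahead of predicted demand, with some headroom,
// and let the reactive number win whenever current load is higher.
function plannedInstances(history, nextHour, currentRps, headroom = 1.2) {
  const predictedRps = predictRps(history, nextHour) * headroom;
  const predicted = Math.ceil(predictedRps / RPS_PER_INSTANCE);
  return Math.max(predicted, reactiveInstances(currentRps));
}

// Two days of history with two hourly buckets each (invented numbers).
const history = [
  [100, 1000],
  [100, 1200],
];

// Current load is low, but the next hour is historically busy.
const aheadOfPeak = plannedInstances(history, 1, 300);

// Current load already exceeds the forecast, so the reactive number wins.
const unexpectedSpike = plannedInstances(history, 1, 2000);

console.log(aheadOfPeak, unexpectedSpike);
```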

Page 40: Evolving the Netflix API

Goals

❏ Flexibility
❏ Resiliency
❏ Scalability
❏ Excellent tools

API

Page 41: Evolving the Netflix API

Run 1% of your traffic on the new code and see how it does

Page 42: Evolving the Netflix API

So you’ve run a canary. Now what?

❏ Errors: 2xx, 4xx, 5xx
❏ latency
❏ network
❏ busy threads
❏ load
❏ ...

Control vs. Canary
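Comparing canary and control could be sketched as a per-metric ratio check that rolls up into a score. The metric values, the 1.2 tolerance, and the scoring rule below are invented for illustration; the slides do not say how the comparison is actually scored.

```javascript
// Metrics where a higher value is worse. Names echo the slide; values are invented.
const control = { errorRate5xx: 0.002, latencyMs: 120, busyThreads: 40, load: 1.5 };
const canary = { errorRate5xx: 0.0021, latencyMs: 180, busyThreads: 42, load: 1.6 };

// Flag a metric when the canary is worse than control by more than maxRatio.
// Assumes every control value is greater than zero.
function compareMetrics(controlMetrics, canaryMetrics, maxRatio = 1.2) {
  return Object.keys(controlMetrics).map((name) => {
    const ratio = canaryMetrics[name] / controlMetrics[name];
    return { name, ratio, pass: ratio <= maxRatio };
  });
}

// Score: percentage of metrics that stayed within tolerance.
function canaryScore(results) {
  const passed = results.filter((r) => r.pass).length;
  return Math.round((100 * passed) / results.length);
}

const results = compareMetrics(control, canary);
const failing = results.filter((r) => !r.pass).map((r) => r.name);
const score = canaryScore(results);

console.log(score, failing);
```

With these numbers only latency regresses beyond the tolerance, so three of four metrics pass.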

Page 43: Evolving the Netflix API
Page 44: Evolving the Netflix API

Successful canary

red/black push
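A red/black push (often called blue/green elsewhere) can be modelled as bringing up a new server group, moving traffic to it, and keeping the old group around, disabled, for a quick rollback. The `Cluster` class below is a hypothetical toy model under that assumption, not a deployment tool's API.

```javascript
// Toy model of a cluster made of versioned server groups.
class Cluster {
  constructor() {
    this.groups = [];
  }

  // Versions currently receiving traffic.
  active() {
    return this.groups.filter((g) => g.enabled).map((g) => g.version);
  }

  // Bring up the new group, send traffic to it, and disable (but keep)
  // the groups that were serving before.
  redBlackPush(version) {
    const previous = this.groups.filter((g) => g.enabled);
    this.groups.push({ version, enabled: true });
    previous.forEach((g) => {
      g.enabled = false;
    });
  }

  // Re-enable the most recently disabled group and disable the current one.
  rollback() {
    const current = this.groups.filter((g) => g.enabled).pop();
    const previous = this.groups.filter((g) => !g.enabled).pop();
    if (!current || !previous) throw new Error("nothing to roll back to");
    current.enabled = false;
    previous.enabled = true;
  }
}

const cluster = new Cluster();
cluster.redBlackPush("v1");
cluster.redBlackPush("v2");
const afterPush = cluster.active();

cluster.rollback();
const afterRollback = cluster.active();

console.log(afterPush, afterRollback);
```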

Page 45: Evolving the Netflix API

Continuous Delivery

http://techblog.netflix.com/2015/09/moving-from-asgard-to-spinnaker.html

Page 46: Evolving the Netflix API

Quickly see status of all clusters

http://techblog.netflix.com/2015/09/moving-from-asgard-to-spinnaker.html

Page 47: Evolving the Netflix API

Script Management

Page 48: Evolving the Netflix API
Page 49: Evolving the Netflix API

Deployment & Ops

Page 50: Evolving the Netflix API

Deployment & Ops

Page 51: Evolving the Netflix API

Deployment & Ops

Page 52: Evolving the Netflix API

Real-time analysis

http://www.slideshare.net/g9yuayon/qcon-talk-on-netflix-mantis-a-stream-processing-system

Submit a query, see requests in real time.
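The idea of submitting a query and watching matching requests go by can be sketched with a lazy filter over an event stream. This is only an analogy in plain JavaScript: the event fields, `requestEvents`, and `runQuery` are invented, and nothing here reflects Mantis's actual query language or API.

```javascript
// Hypothetical request events. A real stream would be unbounded.
function* requestEvents() {
  yield { path: "/tv/home", status: 200, latencyMs: 80, device: "tv" };
  yield { path: "/tv/home", status: 500, latencyMs: 900, device: "tv" };
  yield { path: "/phone/home", status: 200, latencyMs: 60, device: "phone" };
  yield { path: "/tv/home", status: 503, latencyMs: 1200, device: "tv" };
}

// A "query" here is a predicate plus a projection, applied lazily as
// events flow past, so nothing is stored beyond the current event.
function* runQuery(events, { where, select }) {
  for (const event of events) {
    if (where(event)) yield select(event);
  }
}

// Query: server errors seen by TV devices, keeping only path and status.
const matches = [
  ...runQuery(requestEvents(), {
    where: (e) => e.device === "tv" && e.status >= 500,
    select: (e) => ({ path: e.path, status: e.status }),
  }),
];

console.log(matches);
```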

Page 53: Evolving the Netflix API

Looking ahead - current challenges

❏ Breaking up the monolith
❏ Script isolation
❏ Thin client libraries
❏ New interaction models

Page 54: Evolving the Netflix API

Looking ahead

Source: http://techcrunch.com/2014/03/08/success-reality-and-the-myth-of-up-and-to-the-right/

Page 55: Evolving the Netflix API

Looking ahead

❏ Breaking up the monolith
❏ Script isolation
❏ Thin client libraries
❏ New interaction models

Page 56: Evolving the Netflix API

Breaking up the monolith

● > 900 active endpoints
● ~ 30 client libraries
● 78 thread pools
● high memory usage

Page 57: Evolving the Netflix API

Script isolation & node

❏ Groovy scripts run as part of API process

❏ UI teams would like to use other languages (in particular node.js)

API remote service layer

Service client libraries

UI/device scripts (node)

Falcor

var response = model.get("todos[0..2]['name','done']");
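To show what a path set such as the one in the Falcor call above selects, here is a small stand-alone evaluator. It does not use the Falcor library: `expandKeys`, `getPathSet`, and the sample `store` are hypothetical, the path is given in array form instead of being parsed from a string, and the sketch is synchronous and ignores references, both of which the real library handles differently.

```javascript
// Expand one path-set element into concrete keys: a single key,
// a list of keys, or an inclusive { from, to } range.
function expandKeys(element) {
  if (Array.isArray(element)) return element;
  if (element !== null && typeof element === "object") {
    const keys = [];
    for (let i = element.from; i <= element.to; i += 1) keys.push(i);
    return keys;
  }
  return [element];
}

// Walk the data along every path the set describes and copy out only
// the requested values. Keys missing from the data are skipped.
function getPathSet(data, pathSet) {
  if (pathSet.length === 0) return data;
  const [head, ...rest] = pathSet;
  const out = {};
  for (const key of expandKeys(head)) {
    if (data !== null && typeof data === "object" && key in data) {
      out[key] = getPathSet(data[key], rest);
    }
  }
  return out;
}

// Invented sample data.
const store = {
  todos: [
    { name: "get milk", done: false, note: "2%" },
    { name: "withdraw money", done: true },
    { name: "pay bills", done: false },
    { name: "not requested", done: false },
  ],
};

// Array form of todos[0..2]['name','done'].
const result = getPathSet(store, ["todos", { from: 0, to: 2 }, ["name", "done"]]);

console.log(JSON.stringify(result, null, 2));
```

A single path set pulls two fields from three items in one request, which is the appeal for device scripts that would otherwise issue several calls.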

Page 58: Evolving the Netflix API

Thin client libraries

❏ Many client libraries contain a lot of business logic and have a lot of dependencies

❏ Move business logic and dependencies to server

API remote service layer

Service client libraries

UI/device scripts (node)

Falcor

Page 59: Evolving the Netflix API

Looking ahead

❏ Breaking up the monolith
❏ Script isolation
❏ Thin client libraries
❏ New interaction models

Page 60: Evolving the Netflix API

New interaction models

❏ request/response
❏ request/stream
❏ fire-and-forget
❏ event subscription
❏ channel

API remote service layer

Service client libraries

UI/device scripts (node)

Falcor

http://reactivesocket.io
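The five interaction models differ mainly in how many messages flow in each direction. The in-process sketch below shows those shapes only; `createConnection` and every method on it are illustrative names, not the ReactiveSocket API, and no networking is involved.

```javascript
// In-process stand-in for a connection, one member per interaction model.
function createConnection() {
  const log = [];
  const subscribers = new Set();
  return {
    log,
    // request/response: one request, exactly one response.
    requestResponse: (req) => `echo:${req}`,
    // request/stream: one request, a finite sequence of responses.
    requestStream: function* (req) {
      for (let i = 1; i <= 3; i += 1) yield `${req}#${i}`;
    },
    // fire-and-forget: one request, no response at all.
    fireAndForget: (req) => {
      log.push(req);
    },
    // event subscription: one request, events pushed whenever they occur.
    // Returns a function that cancels the subscription.
    subscribe: (onEvent) => {
      subscribers.add(onEvent);
      return () => subscribers.delete(onEvent);
    },
    // Server-side helper that emits an event to current subscribers.
    publish: (event) => {
      subscribers.forEach((fn) => fn(event));
    },
    // channel: both sides send sequences. Here each inbound message
    // produces one outbound message.
    channel: function* (inbound) {
      for (const msg of inbound) yield msg.toUpperCase();
    },
  };
}

const conn = createConnection();
const single = conn.requestResponse("ping");
const streamed = [...conn.requestStream("row")];
conn.fireAndForget("impression");

const seen = [];
const unsubscribe = conn.subscribe((event) => seen.push(event));
conn.publish("title-added");
unsubscribe();
conn.publish("ignored");

const echoed = [...conn.channel(["a", "b"])];

console.log(single, streamed, seen, echoed);
```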

Page 61: Evolving the Netflix API

In the beginning...

Katharina Probst | www.linkedin.com/in/katharinaprobst