Scaling the Netflix API - From Atlassian Dev Den

Scaling the Netflix API Daniel Jacobson @daniel_jacobson http://www.linkedin.com/in/danieljacobson http://www.slideshare.net/danieljacobson

description

The term "scale" in engineering is often used to describe systems and their ability to grow with the needs of their users. This is clearly an important aspect of scaling, but there are many other areas in which an engineering organization needs to scale to be successful in the long term. This presentation discusses some of those other areas and details how Netflix (and specifically the API team) addresses them.

Transcript of Scaling the Netflix API - From Atlassian Dev Den

Page 1: Scaling the Netflix API - From Atlassian Dev Den

Scaling the Netflix API

Daniel Jacobson

@daniel_jacobson

http://www.linkedin.com/in/danieljacobson

http://www.slideshare.net/danieljacobson

Page 2: Scaling the Netflix API - From Atlassian Dev Den

Please read the notes associated with each slide for the full context of the presentation

Page 3: Scaling the Netflix API - From Atlassian Dev Den

What do I mean by “scale”?

Page 4: Scaling the Netflix API - From Atlassian Dev Den
Page 5: Scaling the Netflix API - From Atlassian Dev Den

But There Are Many Ways to Scale!

Organization

Systems

Devices

Development

Testing

Page 6: Scaling the Netflix API - From Atlassian Dev Den

But first, some background…

Page 7: Scaling the Netflix API - From Atlassian Dev Den

Global Streaming Video for TV Shows and Movies

Page 8: Scaling the Netflix API - From Atlassian Dev Den

More than 36 Million Subscribers

More than 40 Countries

Page 9: Scaling the Netflix API - From Atlassian Dev Den

Netflix Accounts for 33% of Peak Internet Traffic in North America

Netflix subscribers are watching more than 1 billion hours a month

Page 10: Scaling the Netflix API - From Atlassian Dev Den
Page 11: Scaling the Netflix API - From Atlassian Dev Den

2007

Page 12: Scaling the Netflix API - From Atlassian Dev Den

Netflix REST API: One-Size-Fits-All (OSFA) Solution

Page 13: Scaling the Netflix API - From Atlassian Dev Den

Image courtesy of Jay Mac 3 on Flickr

Page 14: Scaling the Netflix API - From Atlassian Dev Den

Netflix API Requests by Audience

At Launch in 2008

External Developers

Page 15: Scaling the Netflix API - From Atlassian Dev Den
Page 16: Scaling the Netflix API - From Atlassian Dev Den
Page 17: Scaling the Netflix API - From Atlassian Dev Den

Image courtesy of Jay Mac 3 on Flickr

Page 18: Scaling the Netflix API - From Atlassian Dev Den

Netflix API Requests by Audience

From 2011

External Developers

Page 19: Scaling the Netflix API - From Atlassian Dev Den

Global Streaming Product

Three aspects of the Streaming Product:

• Discovery

• Sign-Up

• Streaming

Page 20: Scaling the Netflix API - From Atlassian Dev Den

Member Sign-Up

Page 21: Scaling the Netflix API - From Atlassian Dev Den

Discovery

Page 22: Scaling the Netflix API - From Atlassian Dev Den

Discovery

Page 23: Scaling the Netflix API - From Atlassian Dev Den

Today, Netflix API Supports Discovery and Sign-Up

Page 24: Scaling the Netflix API - From Atlassian Dev Den

But Soon, Will Support Streaming

Page 25: Scaling the Netflix API - From Atlassian Dev Den

Scaling…

Organization

Systems

Devices

Development

Testing

Page 26: Scaling the Netflix API - From Atlassian Dev Den

Distributed Architecture

Page 27: Scaling the Netflix API - From Atlassian Dev Den
Page 28: Scaling the Netflix API - From Atlassian Dev Den

1000+ Device Types

Page 29: Scaling the Netflix API - From Atlassian Dev Den

Dozens of Dependencies

[Diagram: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 30: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 31: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 32: Scaling the Netflix API - From Atlassian Dev Den

http://www.slideshare.net/reed2001/culture-1798664

Page 33: Scaling the Netflix API - From Atlassian Dev Den

Scaling…

Organization

Systems

Devices

Development

Testing

Page 34: Scaling the Netflix API - From Atlassian Dev Den

System Resiliency

Page 35: Scaling the Netflix API - From Atlassian Dev Den

Distributed Architecture

Page 36: Scaling the Netflix API - From Atlassian Dev Den

Dependency Relationships

Page 37: Scaling the Netflix API - From Atlassian Dev Den

2,000,000,000 Requests Per Day to the Netflix API

Page 38: Scaling the Netflix API - From Atlassian Dev Den

30 Distinct Dependent Services for the Netflix API

Page 39: Scaling the Netflix API - From Atlassian Dev Den

14,000,000,000 Netflix API Calls Per Day to those Dependent Services

Page 40: Scaling the Netflix API - From Atlassian Dev Den

0 Dependent Services with 100% SLA

Page 41: Scaling the Netflix API - From Atlassian Dev Den

$(99.99\%)^{30} = 99.7\%$

0.3% of 2B = 6M failures per day

2+ Hours of Downtime Per Month

Page 42: Scaling the Netflix API - From Atlassian Dev Den

$(99.99\%)^{30} = 99.7\%$

0.3% of 2B = 6M failures per day

2+ Hours of Downtime Per Month

Page 43: Scaling the Netflix API - From Atlassian Dev Den

$(99.9\%)^{30} = 97\%$

3% of 2B = 60M failures per day

20+ Hours of Downtime Per Month
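The arithmetic on these slides can be reproduced with a short sketch. It assumes that one API request needs all 30 dependent services to succeed, that their failures are independent, and that a month has 30 days; the function and key names are illustrative and do not come from the presentation.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month


def compound_availability(per_service: float, n_services: int) -> float:
    """Availability of a request that needs every one of n independent services."""
    return per_service ** n_services


def summarize(per_service: float,
              n_services: int = 30,
              requests_per_day: int = 2_000_000_000) -> dict:
    """Turn a per-service availability into the figures quoted on the slides."""
    overall = compound_availability(per_service, n_services)
    failure_rate = 1.0 - overall
    return {
        "overall_availability": overall,
        "failed_requests_per_day": failure_rate * requests_per_day,
        "downtime_hours_per_month": failure_rate * HOURS_PER_MONTH,
    }


if __name__ == "__main__":
    for sla in (0.9999, 0.999):
        print(sla, summarize(sla))
```

The slides round the failure rates to 0.3% and 3% before multiplying, so the unrounded results from this sketch land slightly below 6M and 60M failures per day.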

Page 44: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 45: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 46: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 47: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 48: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 49: Scaling the Netflix API - From Atlassian Dev Den
Page 50: Scaling the Netflix API - From Atlassian Dev Den

Circuit Breaker Dashboard

Page 51: Scaling the Netflix API - From Atlassian Dev Den
Page 52: Scaling the Netflix API - From Atlassian Dev Den

Call Volume and Health / Last 10 Seconds

Page 53: Scaling the Netflix API - From Atlassian Dev Den

Call Volume / Last 2 Minutes

Page 54: Scaling the Netflix API - From Atlassian Dev Den

Successful Requests

Page 55: Scaling the Netflix API - From Atlassian Dev Den

Successful, But Slower Than Expected

Page 56: Scaling the Netflix API - From Atlassian Dev Den

Short-Circuited Requests, Delivering Fallbacks

Page 57: Scaling the Netflix API - From Atlassian Dev Den

Timeouts, Delivering Fallbacks

Page 58: Scaling the Netflix API - From Atlassian Dev Den

Thread Pool & Task Queue Full, Delivering Fallbacks

Page 59: Scaling the Netflix API - From Atlassian Dev Den

Exceptions, Delivering Fallbacks

Page 60: Scaling the Netflix API - From Atlassian Dev Den

Error Rate

(# + # + # + #) / (# + # + # + # + #) = Error Rate
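A minimal sketch of this calculation follows. It assumes the four numerator terms are the four fallback-delivering categories from the preceding slides (short-circuited, timeouts, thread pool and task queue full, exceptions) and that the fifth denominator term is successful requests. The class and field names are hypothetical and are not taken from the dashboard itself.

```python
from dataclasses import dataclass


@dataclass
class WindowCounts:
    """Request counts for one dependency over a rolling window (hypothetical names)."""
    success: int
    short_circuited: int
    timeout: int
    rejected: int    # thread pool and task queue full
    exception: int

    def error_rate(self) -> float:
        # Numerator: every category that ended in a fallback.
        failures = (self.short_circuited + self.timeout
                    + self.rejected + self.exception)
        # Denominator: the same four categories plus successes.
        total = self.success + failures
        return failures / total if total else 0.0


if __name__ == "__main__":
    window = WindowCounts(success=90, short_circuited=4, timeout=3,
                          rejected=2, exception=1)
    print(f"{window.error_rate():.1%}")
```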

Page 61: Scaling the Netflix API - From Atlassian Dev Den

Status of Fallback Circuit

Page 62: Scaling the Netflix API - From Atlassian Dev Den

Requests per Second, Over Last 10 Seconds

Page 63: Scaling the Netflix API - From Atlassian Dev Den

SLA Information

Page 64: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 65: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 66: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 67: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies, with a Fallback: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]

Page 68: Scaling the Netflix API - From Atlassian Dev Den

[Diagram: API and its dependencies, with a Fallback: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine]
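The circuit-breaker-with-fallback idea shown across these slides can be sketched as follows. This is a simplified illustration, not the Netflix implementation: the class name, the consecutive-failure threshold, and the reset window are all assumptions chosen to keep the example short.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Wraps calls to one dependency and serves a fallback when it is unhealthy."""

    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold  # assumed trip condition
        self.reset_after_s = reset_after_s          # assumed cool-down period
        self.clock = clock
        self.consecutive_failures = 0
        self.opened_at: Optional[float] = None      # None means the circuit is closed

    def is_open(self) -> bool:
        if self.opened_at is None:
            return False
        # After the cool-down, let one trial call through to probe the dependency.
        return self.clock() - self.opened_at < self.reset_after_s

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.is_open():
            return fallback()                       # short-circuited request
        try:
            result = primary()
        except Exception:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.opened_at = self.clock()       # trip (or re-trip) the circuit
            return fallback()                       # failed request, fallback served
        self.consecutive_failures = 0
        self.opened_at = None                       # healthy again, close the circuit
        return result
```

While the circuit is open, the failing dependency is not called at all, so the API keeps answering with the fallback instead of tying up resources on a service that is already struggling.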

Page 69: Scaling the Netflix API - From Atlassian Dev Den

System Infrastructure

Page 70: Scaling the Netflix API - From Atlassian Dev Den

AWS Cloud

Page 71: Scaling the Netflix API - From Atlassian Dev Den
Page 72: Scaling the Netflix API - From Atlassian Dev Den
Page 73: Scaling the Netflix API - From Atlassian Dev Den

Autoscaling

Page 74: Scaling the Netflix API - From Atlassian Dev Den

Autoscaling

Page 75: Scaling the Netflix API - From Atlassian Dev Den

Forced Failure

Page 76: Scaling the Netflix API - From Atlassian Dev Den
Page 77: Scaling the Netflix API - From Atlassian Dev Den

Global System

Page 78: Scaling the Netflix API - From Atlassian Dev Den

More than 36 Million Subscribers

More than 40 Countries

Page 79: Scaling the Netflix API - From Atlassian Dev Den

Zuul

Gatekeeper for the Netflix Streaming Application

Page 80: Scaling the Netflix API - From Atlassian Dev Den

Zuul

• Multi-Region Resiliency

• Insights

• Stress Testing

• Canary Testing

• Dynamic Routing

• Load Shedding

• Security

• Static Response Handling

• Authentication

Page 81: Scaling the Netflix API - From Atlassian Dev Den

Isthmus

Page 82: Scaling the Netflix API - From Atlassian Dev Den

Scaling…

Organization

Systems

Devices

Development

Testing

Page 83: Scaling the Netflix API - From Atlassian Dev Den
Page 84: Scaling the Netflix API - From Atlassian Dev Den
Page 85: Scaling the Netflix API - From Atlassian Dev Den

Screen Real Estate

Page 86: Scaling the Netflix API - From Atlassian Dev Den

Controller

Page 87: Scaling the Netflix API - From Atlassian Dev Den

Technical Capabilities

Page 88: Scaling the Netflix API - From Atlassian Dev Den

One-Size-Fits-All API

[Diagram: 16 separate "Request" labels surrounding the API]

Page 89: Scaling the Netflix API - From Atlassian Dev Den

Scaling…

Organization

Systems

Devices

Development

Testing

Page 90: Scaling the Netflix API - From Atlassian Dev Den

Courtesy of South Florida Classical Review

Page 91: Scaling the Netflix API - From Atlassian Dev Den
Page 92: Scaling the Netflix API - From Atlassian Dev Den

Resource-Based API

vs.

Experience-Based API

Page 93: Scaling the Netflix API - From Atlassian Dev Den

Resource-Based Requests

• /users/<id>/ratings/title

• /users/<id>/queues

• /users/<id>/queues/instant

• /users/<id>/recommendations

• /catalog/titles/movie

• /catalog/titles/series

• /catalog/people

Page 94: Scaling the Netflix API - From Atlassian Dev Den

REST API

[Diagram: services RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS; two lines labeled Network Border]

Page 95: Scaling the Netflix API - From Atlassian Dev Den

OSFA API

[Diagram: services RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS; two lines labeled Network Border; regions labeled SERVER CODE and CLIENT CODE]

Page 96: Scaling the Netflix API - From Atlassian Dev Den

OSFA API

[Diagram: services RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS; two lines labeled Network Border; regions labeled DATA GATHERING, FORMATTING, AND DELIVERY and USER INTERFACE RENDERING]

Page 97: Scaling the Netflix API - From Atlassian Dev Den
Page 98: Scaling the Netflix API - From Atlassian Dev Den
Page 99: Scaling the Netflix API - From Atlassian Dev Den

Experience-Based Requests

• /ps3/homescreen
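To contrast this with the resource-based requests listed earlier, here is a rough sketch of what a single experience-based endpoint might do on the server: gather data from several backend services and return one payload shaped for one device screen. Every function, field, and row name below is a hypothetical stand-in; the presentation does not show the actual endpoint code.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for backend services behind the API.
def fetch_recommendations(user_id: int) -> list:
    return [{"id": 1, "title": "Title A"}, {"id": 2, "title": "Title B"}]


def fetch_instant_queue(user_id: int) -> list:
    return [{"id": 3, "title": "Title C"}]


def fetch_ratings(user_id: int) -> dict:
    return {1: 4.5, 3: 3.0}


def ps3_homescreen(user_id: int) -> dict:
    """One request from the device; the fan-out happens on the server side."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        recs_future = pool.submit(fetch_recommendations, user_id)
        queue_future = pool.submit(fetch_instant_queue, user_id)
        ratings_future = pool.submit(fetch_ratings, user_id)
        recs = recs_future.result()
        queue = queue_future.result()
        ratings = ratings_future.result()

    def row(name: str, titles: list) -> dict:
        # Format only what this particular screen needs.
        items = [{"title": t["title"], "rating": ratings.get(t["id"])}
                 for t in titles]
        return {"row": name, "items": items}

    return {"rows": [row("Recommended", recs), row("Instant Queue", queue)]}


if __name__ == "__main__":
    print(ps3_homescreen(42))
```

With resource-based requests, the device would issue the three calls itself across the network border and merge the results on the client.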

Page 100: Scaling the Netflix API - From Atlassian Dev Den

JAVA API

Groovy Layer

[Diagram: services RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS; two lines labeled Network Border]

Page 101: Scaling the Netflix API - From Atlassian Dev Den
Page 102: Scaling the Netflix API - From Atlassian Dev Den

JAVA API

[Diagram: services RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS; two lines labeled Network Border; regions labeled SERVER CODE, CLIENT CODE, and CLIENT ADAPTER CODE (WRITTEN BY CLIENT TEAMS, DYNAMICALLY UPLOADED TO SERVER)]

Page 103: Scaling the Netflix API - From Atlassian Dev Den

JAVA API

[Diagram: services RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS; two lines labeled Network Border; regions labeled DATA GATHERING, DATA FORMATTING AND DELIVERY, and USER INTERFACE RENDERING]

Page 104: Scaling the Netflix API - From Atlassian Dev Den
Page 105: Scaling the Netflix API - From Atlassian Dev Den

Scaling…

Organization

Systems

Devices

Development

Testing

Page 106: Scaling the Netflix API - From Atlassian Dev Den

Dependency Relationships

Page 107: Scaling the Netflix API - From Atlassian Dev Den
Page 108: Scaling the Netflix API - From Atlassian Dev Den

Testing Philosophy:

Act Fast, React Fast

Page 109: Scaling the Netflix API - From Atlassian Dev Den

That Doesn’t Mean We Don’t Test

• Unit tests

• Functional tests

• Regression scripts

• Continuous integration

• Capacity planning

• Load / Performance tests

Page 110: Scaling the Netflix API - From Atlassian Dev Den

Cloud-Based Deployment Techniques

Page 111: Scaling the Netflix API - From Atlassian Dev Den

Current Code

In Production

API Requests from the Internet

Page 112: Scaling the Netflix API - From Atlassian Dev Den

Single Canary Instance

To Test New Code with Production Traffic (around 1% or less of traffic)

Current Code

In Production

API Requests from the Internet

Error!
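A canary check of this kind can be sketched in two small pieces: sending roughly 1% of requests to the canary instance, and comparing its error rate with the production baseline. The function names, the 1% default, and the tolerance value are assumptions for illustration; the presentation does not describe the actual routing or comparison logic.

```python
import random


def pick_target(rng: random.Random, canary_fraction: float = 0.01) -> str:
    """Route about canary_fraction of requests to the canary instance."""
    return "canary" if rng.random() < canary_fraction else "production"


def canary_is_healthy(canary_errors: int, canary_requests: int,
                      baseline_errors: int, baseline_requests: int,
                      tolerance: float = 0.01) -> bool:
    """Accept the canary only if its error rate stays close to the baseline."""
    if canary_requests == 0:
        return False  # no evidence yet, so do not promote
    canary_rate = canary_errors / canary_requests
    baseline_rate = (baseline_errors / baseline_requests
                     if baseline_requests else 0.0)
    return canary_rate <= baseline_rate + tolerance


if __name__ == "__main__":
    rng = random.Random(0)
    targets = [pick_target(rng) for _ in range(100_000)]
    print("canary share:", targets.count("canary") / len(targets))
    print("healthy:", canary_is_healthy(50, 1_000, 100, 99_000))
```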

Page 113: Scaling the Netflix API - From Atlassian Dev Den

Current Code

In Production

API Requests from the Internet

Page 114: Scaling the Netflix API - From Atlassian Dev Den

Current Code

In Production

API Requests from the Internet

Perfect!

Page 115: Scaling the Netflix API - From Atlassian Dev Den

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 116: Scaling the Netflix API - From Atlassian Dev Den

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 117: Scaling the Netflix API - From Atlassian Dev Den

Error!

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 118: Scaling the Netflix API - From Atlassian Dev Den

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 119: Scaling the Netflix API - From Atlassian Dev Den

Current Code

In Production

API Requests from the Internet

Perfect!

Page 120: Scaling the Netflix API - From Atlassian Dev Den

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 121: Scaling the Netflix API - From Atlassian Dev Den

Current Code

In Production

API Requests from the Internet

New Code

Getting Prepared for Production

Page 122: Scaling the Netflix API - From Atlassian Dev Den

API Requests from the Internet

New Code

Getting Prepared for Production
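The build sequence on the preceding slides appears to show new code being brought up alongside the current code, traffic being moved over, and the old code being kept until the new code is confirmed healthy. A minimal sketch of that flow is below; the class, the method names, and the health-check callback are hypothetical and only illustrate the idea of keeping a warm rollback target.

```python
from typing import Callable, Optional


class TrafficRouter:
    """Tracks which version receives traffic and which one is kept on standby."""

    def __init__(self, current: str) -> None:
        self.live = current
        self.standby: Optional[str] = None

    def stage(self, new_version: str) -> None:
        # New code is started next to the current code but gets no traffic yet.
        self.standby = new_version

    def switch(self) -> None:
        if self.standby is None:
            raise RuntimeError("nothing staged")
        self.live, self.standby = self.standby, self.live

    def retire_standby(self) -> None:
        self.standby = None


def deploy(router: TrafficRouter, new_version: str,
           healthy: Callable[[str], bool]) -> bool:
    """Move traffic to new_version; roll back if the health check fails."""
    router.stage(new_version)
    router.switch()              # new code takes traffic, old code stays warm
    if healthy(new_version):
        router.retire_standby()  # old code is removed only after success
        return True
    router.switch()              # error: send traffic back to the old code
    router.retire_standby()      # discard the failed new version
    return False
```

Because the old version stays running until the check passes, rolling back is a routing change rather than a redeploy.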

Page 123: Scaling the Netflix API - From Atlassian Dev Den

https://www.github.com/Netflix

Page 124: Scaling the Netflix API - From Atlassian Dev Den

Scaling the Netflix API

Daniel Jacobson

@daniel_jacobson

http://www.linkedin.com/in/danieljacobson

http://www.slideshare.net/danieljacobson

Help Wanted!