MicroServices at Netflix - challenges of scale

MicroServices at NETFLIX
Best Practices & Tools of the trade
Sudhir Tonse, Manager, Cloud Platform, @stonse, http://linkedin.com/in/sudhirtonse
Nitesh Kant, Platform Architect, @NiteshKant, http://linkedin.com/in/niteshkant

Old DataCenter (2008): Everything in one WebApp (.war)
AWS Cloud (2012): 100s of Fine Grained Services

Positives
Isolation brings better Availability*
Independent Speed of Delivery (by different teams)
Decentralized Governance (DevOps)

Challenges
Distributed Systems are inherently Complex

Operational Overhead (100s of services; DevOps model absolutely required)

Service Interface Versioning, Mismatches?

Testing (Need the entire ecosystem to test)

Fan out of Requests -> Increases network traffic

Claim
MicroServices increase your overall availability

True?
Yes, but wait!

One missing ; brought down ALL of Netflix

Introduced MicroServices ...

Uptime SLA
Assume a Monolithic Service with 99.99% availability

What if you have ... ~30 Microservices (each with 99.99% SLA)?

Reality
One rogue (dependency) microservice CAN bring your whole site down!

How?

Service Hosed!!

Combined Effective SLA (Availability)
== 2 HOURS of downtime per month
== 99.7% uptime!!
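The arithmetic behind these numbers can be sketched as follows. This is a rough illustration that assumes every request depends on all ~30 services and that their failures are independent; the function names and the 30-day month are assumptions, not part of the talk.

```python
# Assumption: a request succeeds only if all N dependencies are up,
# and the dependencies fail independently of one another.
def combined_availability(per_service: float, n_services: int) -> float:
    """Availability of a chain of n_services, each with the same availability."""
    return per_service ** n_services


def monthly_downtime_hours(availability: float, days_in_month: int = 30) -> float:
    """Expected downtime in hours over a month of the given length."""
    return (1.0 - availability) * days_in_month * 24


combined = combined_availability(0.9999, 30)
downtime = monthly_downtime_hours(combined)
print(f"combined availability: {combined:.2%}")
print(f"downtime per month: {downtime:.1f} hours")
```

Under these assumptions the combined figure lands near 99.7% uptime, or roughly two hours of downtime per month, which matches the slide.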

But what if I want better?
MicroServices do not automatically mean better Availability, unless you have a Fault Tolerant Architecture

Guard your Service!
Use Hystrix (http://github.com/netflix/hystrix)
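Hystrix is a Java library, so the snippet below is not its API. It is a minimal Python sketch of the circuit-breaker idea that Hystrix implements: after repeated failures, stop calling the broken dependency for a while and serve a fallback instead. The class name, thresholds, and helper functions are all hypothetical.

```python
import time


class CircuitBreaker:
    """Hypothetical, minimal circuit breaker; not the Hystrix API."""

    def __init__(self, failure_threshold=3, reset_timeout=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        # While the circuit is open, skip the dependency entirely
        # until the reset timeout has elapsed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                return fallback()
            # Timeout elapsed: let one trial call through (half-open).
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip (or re-trip) the circuit
            return fallback()
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


calls = {"count": 0}


def flaky_dependency():
    calls["count"] += 1
    raise RuntimeError("dependency is down")


def cached_default():
    return "fallback"


breaker = CircuitBreaker(failure_threshold=3, reset_timeout=60.0)
results = [breaker.call(flaky_dependency, cached_default) for _ in range(5)]
print(results, "dependency calls made:", calls["count"])
```

In this run the caller always gets an answer, and once the circuit trips the failing dependency is no longer called, so one rogue service does not tie up its callers.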

Service Discovery & Loadbalancers
Choice
1. Central Loadbalancer? (H/W or S/W)

OR

2. Client based S/W Loadbalancer?

Client based Smart Loadbalancer
Use Ribbon (http://github.com/netflix/ribbon)
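To make the client-based option concrete, here is a small sketch of a round-robin balancer that lives inside the caller and skips servers it has marked as down. It is written in Python for illustration and is not the Ribbon API; the class, its methods, and the addresses are hypothetical, and in practice the server list would come from a registry such as Eureka.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical client-side balancer; not the Ribbon API."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._counter = itertools.count()

    def mark_down(self, server):
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def choose(self):
        # Try each server at most once per call, rotating through the list.
        for _ in range(len(self.servers)):
            server = self.servers[next(self._counter) % len(self.servers)]
            if server not in self.down:
                return server
        raise RuntimeError("no healthy servers available")


# Assumed addresses, standing in for instances fetched from a service registry.
balancer = RoundRobinBalancer(["10.0.0.1:7001", "10.0.0.2:7001", "10.0.0.3:7001"])
picks = [balancer.choose() for _ in range(4)]

balancer.mark_down("10.0.0.2:7001")
after_failure = [balancer.choose() for _ in range(2)]
print(picks, after_failure)
```

Because the choice is made in the client, there is no extra network hop through a central balancer, and each client can react to failures it observes itself.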

Tools of the Trade

OR

Service Dependency View

Distributed Tracing

Chattiness (and Fan Out)
~2 Billion Requests per day on Edge Service

Results in ~20 Billion Fan out requests in ~100 MicroServices
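A quick back-of-the-envelope calculation shows what these figures imply. The per-service and per-second averages below assume the load is spread evenly across services and across the day, which is a simplification made only for illustration.

```python
# Figures from the slides (approximate).
edge_requests_per_day = 2_000_000_000
fanout_requests_per_day = 20_000_000_000
service_count = 100

# Each edge request triggers this many internal requests on average.
fanout_factor = fanout_requests_per_day / edge_requests_per_day

# Assumption: load is spread evenly across services and across the day.
avg_per_service_per_day = fanout_requests_per_day / service_count
avg_fanout_per_second = fanout_requests_per_day / (24 * 60 * 60)

print(f"fan-out factor: {fanout_factor:.0f}x")
print(f"average per service per day: {avg_per_service_per_day:,.0f}")
print(f"average fan-out requests per second: {avg_fanout_per_second:,.0f}")
```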

Fan out

IPC 2.0 .. the next frontier @NiteshKant

Netflix IPC Stack (1.0)

[Diagram: Netflix IPC Stack (1.0)]
Client: Hystrix, EVCache, Ribbon (Load Balancing, Eureka Integration, Metrics (Servo)), Apache HTTP Client
Server (Karyon): Bootstrapping (Governator), Metrics (Servo), Admin Console, HTTP, Eureka Integration, Apache Tomcat
Eureka (Service Registry): Registration, Fetch Registry

A Blocking Architecture

Netflix IPC Stack (2.0)

[Diagram: Netflix IPC Stack (2.0)]
Client (Ribbon 2.0): Ribbon, Hystrix, EVCache, Ribbon Transport (Load Balancing, Eureka Integration, Metrics (Servo)), RxNetty
Server (Karyon): Bootstrapping (Governator), Metrics (Servo), Admin Console, HTTP, Eureka Integration, RxNetty
Eureka (Service Registry): Registration, Fetch Registry
RxNetty: UDP, TCP, WebSockets, SSE

A Completely Reactive Architecture

Synchronous Applications
Stages: Tomcat Connector | Application code | Hystrix | Apache HTTP Client
Conn 1: Thread 1 | Thread 1 | Thread 1* | Thread 1
Conn 2: Thread 2 | Thread 2 | Thread 2* | Thread 2
....
Conn n: Thread n | Thread n | Thread n* | Thread n
*If there isn't any application-driven thread change

Large # of connections / Large # of external dependencies => tons of threads.
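The thread-per-connection behaviour can be sketched with plain Python threads. This is an illustration of the blocking model, not Tomcat or Apache HTTP Client code; the function name and the barrier (which stands in for a slow downstream call that keeps every connection in flight at once) are assumptions.

```python
import threading


def serve_blocking(connection_count):
    """Thread-per-connection sketch: each connection blocks its own thread."""
    seen_threads = set()
    lock = threading.Lock()
    # The barrier stands in for a slow downstream call: no handler finishes
    # until every connection is being served at the same time.
    all_in_flight = threading.Barrier(connection_count)

    def handle(conn_id):
        with lock:
            seen_threads.add(threading.get_ident())
        all_in_flight.wait()

    workers = [
        threading.Thread(target=handle, args=(i,)) for i in range(connection_count)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return len(seen_threads)


blocking_threads_used = serve_blocking(50)
print("threads needed for 50 concurrent connections:", blocking_threads_used)
```

The number of threads grows one-for-one with the number of concurrent connections, which is the cost the slide points at.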

Asynchronous Applications
[Diagram: Application code, RxNetty, Hystrix, and RxNetty stages running on Eventloop 1 through Eventloop 4*]
*If there isn't any application-driven thread change
N connections per eventloop

Request processing happens in the eventloop.
Hystrix is used for throttling, not for achieving asynchronicity.
Eventloops are shared between IN & OUT.

Asynchronous Applications
[Diagram: Application code, RxNetty, Hystrix, and RxNetty stages running on Eventloop 1 through Eventloop 4*]
*If there isn't any application-driven thread change

# of processors => # of eventloops. No dependence on # of connections.
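The same point can be illustrated with Python's asyncio, used here as a stand-in for RxNetty. The sketch runs a single eventloop rather than one per processor, and the handler and function names are hypothetical; the short sleep stands in for non-blocking I/O to a dependency.

```python
import asyncio
import threading


async def handle(conn_id, seen_threads):
    seen_threads.add(threading.get_ident())
    await asyncio.sleep(0.01)  # stands in for non-blocking I/O to a dependency
    seen_threads.add(threading.get_ident())
    return conn_id


async def serve_async(connection_count):
    """Serve all connections concurrently on the one running eventloop."""
    seen_threads = set()
    results = await asyncio.gather(
        *(handle(i, seen_threads) for i in range(connection_count))
    )
    return len(results), len(seen_threads)


handled, async_threads_used = asyncio.run(serve_async(1000))
print(f"handled {handled} connections on {async_threads_used} thread(s)")
```

All connections are handled without the thread count growing, so the number of threads depends on the number of eventloops, not on the number of connections.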

Takeaway
MicroServices is a better architecture compared to Monolithic Apps

However

Be aware of the challenges - Use Best Practices and battle-tested OSS components

http://netflix.github.co