AWS Re:Invent 2012 - Chaos Monkey & The Netflix Simian Army
Ariel Tseitlin
Chaos Monkey & The Simian Army
About Netflix
With more than 30 million streaming members in the United States, Canada, Latin America, the United Kingdom, Ireland and the Nordics, Netflix, Inc. (NASDAQ: NFLX) is the world's leading internet subscription service for enjoying movies and TV programs [1].
[1] http://ir.netflix.com/
Personalization Engine
User Info
Movie Metadata
Movie Ratings
Similar Movies
API
Reviews
A/B Test Engine
2B requests per day into the Netflix API
12B outbound requests per day to API dependencies
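As a rough back-of-the-envelope check on the two figures above, the sketch below derives the average fan-out and per-second rates. It assumes traffic is spread evenly across the day, which real traffic is not, so these are averages rather than peak numbers; the variable names are illustrative only.

```python
# Figures quoted on the slide.
inbound_per_day = 2_000_000_000      # requests into the Netflix API
outbound_per_day = 12_000_000_000    # requests from the API to its dependencies

SECONDS_PER_DAY = 24 * 60 * 60

# Average number of dependency calls triggered by one incoming API request.
fan_out = outbound_per_day / inbound_per_day

# Average rates, assuming (unrealistically) a flat load over 24 hours.
avg_inbound_per_sec = inbound_per_day / SECONDS_PER_DAY
avg_outbound_per_sec = outbound_per_day / SECONDS_PER_DAY

print(f"average fan-out: {fan_out:.0f} dependency calls per API request")
print(f"average inbound rate: {avg_inbound_per_sec:,.0f} requests/s")
print(f"average outbound rate: {avg_outbound_per_sec:,.0f} requests/s")
```

On average, each incoming request fans out to six dependency calls, which is why a single slow or failing dependency can affect a large share of API traffic.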
A complex distributed system
Growth is good (and scary)
30x growth in two years!
Things will break
Chaos Monkey taught us…
• State is bad
• Clusters are good
• Surviving instance failure is a low bar
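The lessons above can be illustrated with a toy simulation. This is a minimal sketch, not Netflix's Chaos Monkey: `Instance`, `Cluster`, and `chaos_monkey` are hypothetical names, and "termination" is just a flag flip rather than a call to any cloud API. It shows why a cluster of stateless instances keeps serving after one member is killed at random.

```python
import random


class Instance:
    """A stateless worker: everything it needs arrives with the request."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Cluster:
    """A group of interchangeable instances behind one entry point."""

    def __init__(self, size):
        self.instances = [Instance(f"i-{n}") for n in range(size)]

    def handle(self, request):
        # Any healthy instance can serve, because none of them holds state.
        for instance in self.instances:
            try:
                return instance.handle(request)
            except ConnectionError:
                continue
        raise RuntimeError("no healthy instances left")


def chaos_monkey(cluster, rng):
    """Terminate one randomly chosen live instance and return its name."""
    live = [i for i in cluster.instances if i.alive]
    victim = rng.choice(live)
    victim.alive = False
    return victim.name


rng = random.Random(2012)  # fixed seed so the run is repeatable
cluster = Cluster(size=3)
killed = chaos_monkey(cluster, rng)
response = cluster.handle("GET /movies")
print(f"killed {killed}; {response}")
```

If the killed instance had been the only holder of session data, the request would have failed even though other instances were healthy, which is the sense in which state is bad.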
The Sick and Wounded
Latency Monkey
Latency Monkey taught us
• Startup resiliency is often missed
• An ongoing unified approach to runtime dependency management is important (visibility & transparency get missed otherwise)
• Know thy neighbor (unknown dependencies)
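One way to picture latency injection is sketched below, under stated assumptions: `similar_movies` is a hypothetical stand-in for a dependency call, `injected_delay` plays the role of the artificial delay, and the caller protects itself with a timeout and a fallback value. This is an illustration of the idea, not the actual Latency Monkey implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def similar_movies(movie_id, injected_delay=0.0):
    """Hypothetical dependency call; injected_delay simulates a slow neighbor."""
    time.sleep(injected_delay)
    return [f"{movie_id}-similar-1", f"{movie_id}-similar-2"]


def call_with_timeout(pool, fn, timeout, fallback, *args, **kwargs):
    """Run fn in the pool; return fallback if it does not finish in time."""
    future = pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        return fallback


with ThreadPoolExecutor(max_workers=2) as pool:
    # Healthy dependency: answers well inside the 0.2 s budget.
    healthy = call_with_timeout(pool, similar_movies, 0.2, [], "m42")
    # Degraded dependency: 0.5 s of injected latency exceeds the budget,
    # so the caller degrades to an empty list instead of hanging.
    degraded = call_with_timeout(
        pool, similar_movies, 0.2, [], "m42", injected_delay=0.5
    )

print("healthy:", healthy)
print("degraded:", degraded)
```

Running a test like this against every dependency is one way to surface the unknown neighbors mentioned above: a call path with no timeout or fallback shows up as a hang rather than a degraded response.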
Clutter happens
Janitor Monkey taught us…
• Label everything
• Clutter builds up
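A clean-up pass in this spirit might look like the sketch below. Everything here is assumed for illustration: the `Resource` record, the `owner` tag, and the 30-day idle threshold are hypothetical and are not taken from Janitor Monkey's actual rules. The point is only that unlabeled resources cannot be attributed to anyone, so labeling is what makes clutter safe to find and remove.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Resource:
    """Hypothetical inventory record for a cloud resource."""
    resource_id: str
    last_used: datetime
    tags: dict = field(default_factory=dict)


def find_clutter(resources, now, max_idle=timedelta(days=30), required_tag="owner"):
    """Return {resource_id: [reasons]} for resources that look like clutter."""
    flagged = {}
    for resource in resources:
        reasons = []
        if required_tag not in resource.tags:
            reasons.append(f"missing '{required_tag}' tag")
        if now - resource.last_used > max_idle:
            reasons.append("idle too long")
        if reasons:
            flagged[resource.resource_id] = reasons
    return flagged


now = datetime(2012, 11, 28)
inventory = [
    Resource("vol-1", now - timedelta(days=2), {"owner": "api-team"}),
    Resource("vol-2", now - timedelta(days=90), {"owner": "api-team"}),
    Resource("vol-3", now - timedelta(days=1)),
]
clutter = find_clutter(inventory, now)
for resource_id, reasons in clutter.items():
    print(resource_id, "->", ", ".join(reasons))
```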
Ranks of the Simian Army
• Chaos Monkey
• Chaos Gorilla
• Latency Monkey
• Janitor Monkey
• Conformity Monkey
• Circus Monkey
• Doctor Monkey
• Howler Monkey
• Security Monkey
• Chaos Kong
• Efficiency Monkey
Big impact on availability
• Results of the monkeys
Open