Painful success - lessons learned while scaling up

PAINFUL SUCCESS Jesper Richter-Reichhelm (@jrirei) / session 4853


Pain

Flash client Backend

7M players / month

10K API calls / second

200K DB operations / second

100K DB writes / second

2 devops since 2009
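
A quick back-of-the-envelope reading of the figures above, as a minimal sketch. It assumes the per-second numbers were measured at the same point in time, which the slide does not state.

```python
# Figures quoted on the slide.
api_calls_per_second = 10_000
db_ops_per_second = 200_000
db_writes_per_second = 100_000

# Assumption: all three rates describe the same moment of load.
db_ops_per_api_call = db_ops_per_second / api_calls_per_second
write_share = db_writes_per_second / db_ops_per_second

print(f"DB operations per API call: {db_ops_per_api_call:.0f}")
print(f"Share of DB operations that are writes: {write_share:.0%}")
```

Under that assumption each API call fans out into about 20 database operations, and half of all database operations are writes.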

Success

Painful Success


2009: Decisions

What is the right software stack?

[Diagram: Browser, load balancer (lb), app servers, SQL databases]

2010: Scaling Up

[Chart, three builds: vertical axis 0% to 100%, horizontal axis 0 to 40]

Small things...

AMF responses

Checking connection ‘status’

ActiveRecord caching

Lesson 1: Always check back on reality!

Big Things...

Lots of tuning

and more memory

[Diagram, several builds: app with master and slave databases; new masters are added and the old ones removed]

1. create new DBs

2. set up replication

3. start using masters

4. cut replication

5. remove old DBs
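
The order of these steps matters. As a minimal sketch, a migration script could refuse to run a step out of sequence. The class and step names below are hypothetical and are not the speaker's tooling.

```python
# Hypothetical step names mirroring the slide.
STEPS = [
    "create new DBs",
    "set up replication",
    "start using masters",
    "cut replication",
    "remove old DBs",
]


class MigrationPlan:
    """Tracks progress and rejects any step that is not the next one."""

    def __init__(self):
        self.done = []

    def run(self, step):
        if len(self.done) == len(STEPS):
            raise RuntimeError("migration already complete")
        expected = STEPS[len(self.done)]
        if step != expected:
            raise RuntimeError(f"refusing {step!r}: next step must be {expected!r}")
        self.done.append(step)
        return step


plan = MigrationPlan()
for step in STEPS[:3]:
    plan.run(step)

# Trying to remove the old DBs before cutting replication is rejected.
try:
    plan.run("remove old DBs")
    blocked = None
except RuntimeError as error:
    blocked = str(error)
print(blocked)
```

A guard like this only encodes the intended order; it does not verify that replication has really stopped on the servers.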

[Diagram, several builds: app, masters and slaves]

1. create new DBs

2. set up replication

3. start using masters

4. remove old DBs

?

5. cut replication

[Diagram, several builds: app, masters and slaves; in the final builds only the app remains]

1. create new DBs

2. set up replication

3. start using masters

4. drop database;

5. Nooooooooo!
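
The slides do not spell out the failure. One reading is that the drop ran on the old master while replication was still active, so it was replayed on the new masters. The toy model below illustrates that reading only; `ToyDatabase` is a hypothetical in-memory stand-in, not a real database client.

```python
class ToyDatabase:
    """In-memory stand-in for a database server (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.tables = {"users": ["alice", "bob"]}
        self.replicas = []  # servers that replay our statements

    def execute(self, statement):
        if statement == "drop database":
            self.tables.clear()
        # While replication is active, every statement is replayed downstream.
        for replica in self.replicas:
            replica.execute(statement)


# Wrong order: drop the old database while the replication link is still up.
old_master = ToyDatabase("old")
new_master = ToyDatabase("new")
old_master.replicas.append(new_master)      # set up replication
old_master.execute("drop database")
data_after_wrong_order = dict(new_master.tables)

# Right order: cut replication first, then drop.
old_master = ToyDatabase("old")
new_master = ToyDatabase("new")
old_master.replicas.append(new_master)      # set up replication
old_master.replicas.remove(new_master)      # cut replication
old_master.execute("drop database")         # remove old DBs
data_after_right_order = dict(new_master.tables)

print(data_after_wrong_order)
print(data_after_right_order)
```

In the first run the new master ends up empty; in the second it keeps its copy of the data.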

Lesson 2: You will make mistakes!

More things...

[Diagram, several builds: app, masters and slaves; the databases are labelled with data sets A and B]

1. create new DBs

2. set up replication

3. start using masters

4. cut replication

5. truncate data
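
A minimal sketch of the final "truncate data" step: after replication every new master holds both halves of the data, and each one then deletes the half it is not responsible for. The slides do not say how A and B are divided; splitting by user id modulo the shard count is an assumption made here for illustration, and all names are hypothetical.

```python
def shard_for(user_id, shard_count=2):
    """Hypothetical partitioning rule: shard index derived from the user id."""
    return user_id % shard_count


def truncate_foreign_rows(rows, own_shard, shard_count=2):
    """Keep only the rows this master is responsible for."""
    return {
        user_id: data
        for user_id, data in rows.items()
        if shard_for(user_id, shard_count) == own_shard
    }


# After replication both new masters hold the full data set (A and B).
full_copy = {1: "garden-1", 2: "garden-2", 3: "garden-3", 4: "garden-4"}

master_a = truncate_foreign_rows(full_copy, own_shard=0)
master_b = truncate_foreign_rows(full_copy, own_shard=1)

print(sorted(master_a), sorted(master_b))
```

Together the two masters still cover every row, and no row lives on both.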

[Diagram, several builds: app with four masters and their slaves; the databases are labelled A C and B D]

The right thing...

Lesson 3: Software is easy, data is hard!

24 GB

44 GB

in 8 days

Not enough memory

=> no backups

=> no new slaves

=> big problem

24 GB

38 GB

in 3 days

Fixed in v2.2
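
A rough calculation of what these numbers imply, as a sketch only. It assumes the slides describe memory use growing from the first figure to the second over the stated number of days, which is an interpretation and not something the slides state.

```python
def growth_per_day(start_gb, end_gb, days):
    """Average growth in GB per day between two measurements."""
    return (end_gb - start_gb) / days


# Assumed reading: 24 GB grew to 44 GB in 8 days, and 24 GB to 38 GB in 3 days.
first_case = growth_per_day(24, 44, 8)
second_case = growth_per_day(24, 38, 3)

print(f"{first_case:.2f} GB/day, {second_case:.2f} GB/day")
```

Under that reading the first case grows by 2.5 GB per day and the second by roughly 4.7 GB per day.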

2011: Operation Hell

[Diagram, three builds: load balancers (lb), a large grid of app servers, and SQL and Rd databases in pairs; the last build adds EBS volumes]

[Diagram, eight builds: Browser, load balancers (lb), app servers, SQL, EBS]

[Chart, two builds: vertical axis 0 to 2,000,000, horizontal axis Apr 2010 to Oct 2011]

AWS outage in Ireland

[Diagram, two builds: load balancers, app servers, and SQL and Rd databases; the second build adds five more SQL databases]

Installed in 3 years:

98 MySQL machines

195 Redis machines

2012/2013: Refactoring

Lessons learned

[Diagram, several builds: Server and Database over One Game Session]

[Diagram, three builds: the architecture shrinks from the full grid of app servers with SQL and Rd databases to load balancers, a few app servers, and a single Rd pair]
1. Separate user and world DBs

2. Migrate to JRuby

3. Introduce life cycle

4. Remove (user) DBs

5. Enjoy!

Separate user DBs

[Diagram, four builds: load balancers, app servers, and SQL and Rd databases, with additional SQL and Rd databases appearing alongside them]

JRuby

JRuby v2

[Chart, built up over several slides: Throughput (axis 0 to 200) for four configurations]

Rails 2.3.11, MRI 1.8.7

Rails 3.2.8, MRI 1.8.7

Rails 3.2.8, MRI 1.9.3

Rails 3.2.8, JRuby 1.7.2

Annotations: no benchmark, multithreaded, tuning, CPU saturation

[Diagram, several builds: load balancers, app servers, and SQL and Rd databases; app servers are labelled NEW and OLD]

Life cycle

[Diagram, built up step by step: Load Balancer, app (servlet, archiver), User Registry, World Data, SQL, Rd, S3]
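
The slides only name the components; how they interact is not stated. The sketch below shows one plausible reading, purely as an assumption: user state is loaded when a session starts, kept in memory on one app server, and written back by an archiver when the session ends, while a registry records which server currently holds the user. Every class and name here is hypothetical, and a plain dict stands in for S3.

```python
import json

archive = {}  # stand-in for S3: user_id -> serialized state


class UserRegistry:
    """Hypothetical registry of which app server currently holds a user."""

    def __init__(self):
        self._owner = {}

    def claim(self, user_id, server_name):
        owner = self._owner.setdefault(user_id, server_name)
        if owner != server_name:
            raise RuntimeError(f"user {user_id} is already live on {owner}")

    def release(self, user_id):
        self._owner.pop(user_id, None)


class AppServer:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry
        self.live = {}  # user_id -> in-memory state

    def start_session(self, user_id):
        self.registry.claim(user_id, self.name)
        stored = archive.get(user_id)
        self.live[user_id] = json.loads(stored) if stored else {"coins": 0}
        return self.live[user_id]

    def end_session(self, user_id):
        # Archiver role: persist the state once, when the session ends.
        archive[user_id] = json.dumps(self.live.pop(user_id))
        self.registry.release(user_id)


registry = UserRegistry()
server = AppServer("app-1", registry)
state = server.start_session(42)
state["coins"] += 10
server.end_session(42)

# A later session on another server sees the archived state.
restored = AppServer("app-2", registry).start_session(42)
print(restored)
```

In this reading the database is touched only at the edges of a session, not on every request.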

[Diagram, two builds: load balancers, app servers, and SQL and Rd databases]

Removing DBs

[Diagram, three builds: SQL databases are removed step by step while the Rd databases remain]

Enjoy

[Diagram, three builds: the remaining SQL databases disappear; load balancers, app servers, and Rd databases are left]

Looking Back

What is the right software stack?


What kind of application are we building?

[Diagram, several builds: User with Avatar, Customer, Garden, Tiles, and Decorations]

http://en.wikipedia.org/wiki/File:Columbus_Breaking_the_Egg%27_(Christopher_Columbus)_by_William_Hogarth.jpg


What kind of application are we building?

Plain File System

Handle state the right way!

Lesson 3: Software is easy, data is hard!

Lesson 2: You will make mistakes!

Lesson 1: Always check back on reality!

Lesson 1: Always check back on your assumptions!

Thank you

Jesper Richter-Reichhelm

@jrirei

woo.ga/backend

wooga.com/jobs
