How to Handle 1,000,000 Daily Users Without Using a Cache (RailsWayCon 2012)
HOW TO HANDLE
1,000,000 DAILY USERS (without using a cache)
Jesper Richter-Reichhelm, @jrirei
Tuesday, June 5, 2012

The overall architecture is not that complex
Flash client: Game Session, Asynch. Communication
Backend: State Changes, Validation, Persistence

But the scale is interesting
14 billion requests / month
>100,000 DB operations / second
>50,000 DB updates / second

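As a rough sanity check of the scale figures, the sketch below converts the monthly request count into an average per-second rate. The 30-day month and the evenly spread traffic are assumptions, not figures from the talk; real traffic has peaks well above the average.

```python
# Back-of-the-envelope check of the slide figures.
# Assumption (not from the talk): a 30-day month with evenly spread traffic.
REQUESTS_PER_MONTH = 14_000_000_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000

avg_requests_per_second = REQUESTS_PER_MONTH / SECONDS_PER_MONTH

# The slide gives lower bounds (">"), so this share is only indicative.
DB_OPS_PER_SECOND = 100_000
DB_UPDATES_PER_SECOND = 50_000
update_share = DB_UPDATES_PER_SECOND / DB_OPS_PER_SECOND

print(f"average requests/s: {avg_requests_per_second:,.0f}")       # 5,401
print(f"share of DB operations that are updates: {update_share:.0%}")  # 50%
```

Roughly half of all database operations being writes is what makes a read cache a poor fit for this workload.
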
A journey to 1,000,000 daily users
Start of the journey
6 weeks of pain
Paradise
Conclusion

October 2009: wooga’s first simulation game

Instead of PHP we used Ruby

Our database was MySQL
Sharded by user id: even user ids on one database, odd user ids on the other

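The even/odd split can be sketched as a parity lookup. This is a minimal illustration; the shard names are hypothetical, since the slide only labels the two databases "even user ids" and "odd user ids".

```python
# Hypothetical shard names; the slide only shows "even user ids" and "odd user ids".
SHARDS = {0: "db_even", 1: "db_odd"}


def shard_for_user(user_id: int) -> str:
    """Pick the database by user id parity."""
    return SHARDS[user_id % 2]


print(shard_for_user(42))  # db_even
print(shard_for_user(7))   # db_odd
```
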
And we went into the cloud

Master-slave replication for DBs worked fine
[Diagram: 3 app servers, 1 load balancer (lb), 2 db]

We added a few application servers over time
[Diagram: 9 app servers, 1 load balancer (lb), 2 db]

250K daily users and no problems
[Chart: daily active users over time; annotation: "Life was good"]

Life was good and I went on a nice vacation
<picture: Jesper in slot canyon>

Our bane: MySQL hiccups
[Chart: only axis ticks survive, 0% to 100% against 0 to 40]

A journey to 1,000,000 daily users
Start of the journey
6 weeks of pain
Paradise
Conclusion

SQL queries generated by Rubyamf gem
AMF responses to Flash client
Wrong config...
... so associated data was included, too
=> Easy to fix

More traffic using the same cluster
[Diagram: 9 app servers, 1 load balancer (lb), 2 db]

Config tweaks brought us to 300K DAU
[Chart: daily active users over time; annotation: "Config fixes"]

ActiveRecord’s checks caused 20% extra DB load
Checking connection state
MySQL process list full of ‘status’ calls
=> Fixed by 1 line of code

I/O on MySQL masters was still the bottleneck
New Relic: 60% of all UPDATEs on ‘tiles’ table

Tiles are part of the core game loop
Core game loop
1) plant
2) wait
3) harvest

We started to shard on model, too
Adding new shards
1) Set up new masters as slaves of the old ones
2) Start using new masters
3) Cut replication
4) Truncate
[Diagram: old master, old slave, new master, new slave]

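Sharding on model in addition to user id amounts to a two-level routing decision: first by model, then by user id. The sketch below is only an illustration of that idea. The cluster names are hypothetical, and it assumes each cluster keeps the even/odd split by user id, which the talk does not state explicitly.

```python
# Hypothetical names: the talk only says that the 'tiles' table got its own masters.
MODEL_CLUSTERS = {"tiles": "tiles_db"}   # models moved to dedicated masters
DEFAULT_CLUSTER = "db"                   # everything else stays on the old masters


def database_for(model: str, user_id: int) -> str:
    """Route by model first, then by user id parity within that cluster."""
    cluster = MODEL_CLUSTERS.get(model, DEFAULT_CLUSTER)
    parity = "even" if user_id % 2 == 0 else "odd"
    return f"{cluster}_{parity}"


print(database_for("tiles", 42))  # tiles_db_even
print(database_for("users", 7))   # db_odd
```
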
4 DB masters and a few more servers
[Diagram: 16 app servers, 1 load balancer (lb), 2 tiles db, 2 db]

Sharding by model brought us to 400K DAU
[Chart: daily active users over time; annotation: "Shard by model"]

We improved our MySQL setup
RAID-0 of EBS volumes
Using XtraDB
Tweaking my.cnf

Sharding gem circumvented AR’s internal cache
ActiveRecord caches SQL queries...
... only in our development environment!
=> Fixed by 2 lines of code

I/O was still not fast enough
If 2 + 2 is not enough, ...
... perhaps 4 + 4 masters will do?

It’s no fun to handle 8+8 MySQL DBs
[Diagram: 18 app servers, 1 load balancer (lb), 4 tiles db, 4 db]

At 500K DAU we were at a dead end
[Chart: daily active users over time]

I/O remained the bottleneck for MySQL UPDATEs
Each DB master could do about 1000 DB writes/s.
That’s not enough!

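The arithmetic behind the dead end can be sketched as follows. The per-master figure and the 4 + 4 masters come from the slides; the perfectly even spread of writes across masters is an idealising assumption, and the 50,000 writes/s target is taken from the scale slide purely as an illustrative input.

```python
import math

WRITES_PER_MASTER = 1_000      # "about 1000 DB writes/s" per master (from the slide)
MASTERS = 8                    # the 4 + 4 masters of the "8+8" setup

cluster_writes_per_second = MASTERS * WRITES_PER_MASTER


def masters_needed(target_writes_per_second: int) -> int:
    """Masters required if write load spreads perfectly evenly (an idealisation)."""
    return math.ceil(target_writes_per_second / WRITES_PER_MASTER)


print(cluster_writes_per_second)   # 8000
print(masters_needed(50_000))      # 50
```

Every additional master also needs a slave, so scaling writes this way doubles the number of MySQL boxes to operate.
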
Pick the right tool for the job!

Redis is fast but goes beyond simple key/value
Redis is a key-value store
Hashes, Sets, Sorted Sets, Lists
Atomic operations like set, get, increment
50,000 transactions/s on EC2
Writes are as fast as reads

We could learn from another team using Redis

Shelf tiles: An ideal candidate for using Redis
Shelf tiles: { plant1 => 184, plant2 => 141, plant3 => 130, plant4 => 112, … }
Redis Hash: HGETALL, HGET, HSET, HINCRBY, …

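To show how the shelf tiles map onto a Redis hash, here is a small sketch. `TinyHashStore` is a hypothetical in-memory stand-in that mimics a few hash commands so the example runs without a server. The key layout and the meaning of the numbers (treated here as counts) are assumptions as well.

```python
class TinyHashStore:
    """In-memory stand-in for a Redis server, hash commands only (illustration)."""

    def __init__(self):
        self._hashes = {}

    def hset(self, key, field, value):
        self._hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        return self._hashes.get(key, {}).get(field)

    def hincrby(self, key, field, amount=1):
        h = self._hashes.setdefault(key, {})
        h[field] = int(h.get(field, 0)) + amount
        return h[field]

    def hgetall(self, key):
        return dict(self._hashes.get(key, {}))


store = TinyHashStore()
key = "user:42:shelf_tiles"  # hypothetical key layout

# One hash per user, one field per plant type (values from the slide).
for plant, count in {"plant1": 184, "plant2": 141, "plant3": 130, "plant4": 112}.items():
    store.hset(key, plant, count)

store.hincrby(key, "plant1", -1)   # a single-field update, no read-modify-write
shelf = store.hgetall(key)         # the whole shelf in one call
print(shelf["plant1"])             # 183
```

A real Redis server returns values as strings or bytes, so a client would convert them back to integers.
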
On-demand migrations from MySQL to Redis

Typical migration throughput over 3 days

Migrate on the fly - and clean up later
1. Let migration run until everything cools down
2. Migrate the rest manually
3. Remove migration code
4. Wait until no fallback necessary
5. Remove SQL table

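The on-demand migration can be sketched as a read path that falls back to MySQL and copies the data over on first access. This is a minimal illustration, not the code used in the game: plain dicts stand in for both stores, and the function name and key layout are hypothetical.

```python
def load_tiles(user_id, redis_store, mysql_rows):
    """Return a user's tiles, migrating them from MySQL on first access."""
    key = f"user:{user_id}:tiles"               # hypothetical key layout
    if key in redis_store:                      # already migrated
        return redis_store[key]
    tiles = dict(mysql_rows.get(user_id, {}))   # fallback read from the old table
    redis_store[key] = tiles                    # migrate on the fly
    return tiles


# Plain dicts stand in for both stores.
mysql_rows = {1: {"plant1": 3}, 2: {"plant2": 5}}
redis_store = {}

load_tiles(1, redis_store, mysql_rows)      # user 1 is active and gets migrated
migrated = sorted(redis_store)              # user 2 stays in MySQL until first access
print(migrated)                             # ['user:1:tiles']
```

Active users migrate themselves this way; inactive ones are what step 2 ("Migrate the rest manually") picks up.
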
A journey to 1,000,000 daily users
Start of the journey
6 weeks of pain
Paradise (or not?)
Conclusion

Again: Tiles are part of the core game loop
Core game loop
1) plant
2) wait
3) harvest

Size matters for migrations
Migration check overload
=> Migration only on startup
Overlooked an edge case
=> Only migrate 1% of users
=> Continue if everything is ok

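Migrating only 1% of users first can be sketched as a percentage gate on the user id. The modulo rule and the function name are assumptions for illustration. The rule gives a stable subset, which is close to the requested percentage only if user ids are spread evenly.

```python
def should_migrate(user_id: int, rollout_percent: int) -> bool:
    """True for a stable subset of users covering roughly rollout_percent percent."""
    return user_id % 100 < rollout_percent


users = range(1, 10_001)
first_wave = [u for u in users if should_migrate(u, 1)]    # 1% of users
everyone = [u for u in users if should_migrate(u, 100)]    # after raising the limit

print(len(first_wave))  # 100
print(len(everyone))    # 10000
```

Because the subset is stable, raising the percentage only adds users and never moves an already migrated user back.
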
In-memory DBs don’t like dumping to disk
Dumping to disk
SAVE is blocking
BGSAVE needs free RAM
Latency increase by 100%
=> BGSAVE on slaves every 15 minutes

Redis replication starts with a BGSAVE
Starting up a new slave by replication
BGSAVE on master
Slave imports dumped file
=> No RAM means no new slaves

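Why BGSAVE needs free RAM: it forks the Redis process, and every memory page written during the snapshot is copied, so a write-heavy instance can temporarily need close to twice its dataset size. The sketch below is a rough headroom estimate under that model. The machine size, dataset sizes and rewritten fractions are hypothetical inputs, not measurements from the talk.

```python
def bgsave_headroom_gb(total_ram_gb: float, dataset_gb: float,
                       rewritten_fraction: float) -> float:
    """RAM left over during a BGSAVE.

    rewritten_fraction is the (assumed) share of the dataset's memory pages
    modified while the snapshot runs; each such page is copied once.
    """
    extra_gb = dataset_gb * rewritten_fraction
    return total_ram_gb - dataset_gb - extra_gb


# Hypothetical box with 68 GB RAM; dataset sizes are illustrative as well.
print(bgsave_headroom_gb(68, 24, 0.5))   # 32.0
print(bgsave_headroom_gb(68, 44, 1.0))   # -20.0: the snapshot would not fit
```

A negative result means neither a background dump nor a new slave can be started on that box, since replication also begins with a BGSAVE on the master.
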
Redis had a memory fragmentation problem
24 GB / 44 GB in 8 days
24 GB / 38 GB in 3 days
Fixed in v2.2

If MySQL is a truck, Redis is a race car
MySQL: fast enough, disk based, robust
Redis: super fast, RAM based, fragile

Big and static data in MySQL, rest goes to Redis
Redis: 60 GB data, 50% writes
MySQL: 256 GB data, 10% writes
http://www.flickr.com/photos/erix/245657047/

Lots of boxes, but automation helps a lot!
[Diagram: 39 app servers, 2 load balancers (lb), 5 redis, 5 db]

We reached 1 million daily users!
[Chart: daily active users over time; annotation: "1,000,000 - Big party!"]

We started archiving inactive users
[Chart: daily active users over time; annotation: "50% DB reduction"]

We even survived a complete data center loss
[Chart: daily active users over time; annotation: "EBS no more!"]

We improved our MySQL schema on-the-fly
[Chart: daily active users over time; annotation: "30% DB reduction"]

Meanwhile we have more than 2M daily users
[Chart: daily active users over time]

A journey to 1,000,000 daily users
Start of the journey
6 weeks of pain
Paradise (or not?)
Conclusion

EVOLUTION / REVOLUTION of software
[Chart: daily active users over time; annotation: "Evolution every week"]

Works for teams ...

Each new game is a revolution

Works for teams and for companies

Questions?
Jesper Richter-Reichhelm, @jrirei
slideshare.net/wooga
wooga.com/jobs