Last.fm vs Xbox

Last.fm vs. Xbox David Singleton last.fm/user/underpangs twitter.com/dsingleton

Given at DIBI, Newcastle, Apr 2010. http://dibiconference.com/

Transcript of Last.fm vs Xbox

Page 2: Last.fm vs Xbox

Music discovery powered by Scrobbling

Personalised radio, social network, events, a “wikipedia of music”

High traffic

Monthly visitors: 40 million

Monthly page views: 500,000 million

Page 3: Last.fm vs Xbox

Last.fm Architecture

Load balancer

HTTP Cache

Web Server

Object Cache

Database

Page 4: Last.fm vs Xbox

Xbox Live Platform, millions of users

Last.fm Radio App

Built by Microsoft

Powered by our API

Launched alongside Facebook & Twitter

Page 5: Last.fm vs Xbox

Last.fm vs. Xbox

Page 6: Last.fm vs Xbox

So, what’s there to talk about?

Good Things™

New users. It’s really cool.

Bad Things™

Lots of new users, traffic spikes

A very important, high profile, launch

How did Last.fm approach this?

Page 7: Last.fm vs Xbox

Xbox Live: 15 million users

assuming a 10% take-up rate = 1,500,000 users

startup: 5 requests + starting radio: 5 requests + 15 minutes of radio: 60 requests

1 hour of radio = 250 requests per user

an hour of radio per user is a rough averaged guess

1,500,000 users = 375,000,000 requests over 24 hours

assuming an even distribution = 4,500 requests / second

Likely peaking at more than triple = 15,000 requests / second

Last.fm: 2,000 requests/sec

based on number of servers and Apache configuration

estimated max capacity of 3,500 requests per second
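
The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch using only the figures quoted above; the variable names are illustrative. Note that the exact average works out slightly below the slide's rounded 4,500 requests / second, and a plain tripling lands below the slide's 15,000, which allows for "more than triple".

```python
# Back-of-envelope launch estimate using the figures quoted on the slide.
XBOX_LIVE_USERS = 15_000_000
TAKE_UP_PERCENT = 10              # assumed take-up rate from the slide
STARTUP_REQUESTS = 5
START_RADIO_REQUESTS = 5
REQUESTS_PER_15_MIN_RADIO = 60
SECONDS_PER_DAY = 24 * 60 * 60
MAX_CAPACITY_RPS = 3_500          # estimated capacity from the slide

users = XBOX_LIVE_USERS * TAKE_UP_PERCENT // 100

# One hour of radio: startup + starting radio + four 15-minute blocks.
requests_per_user_hour = (
    STARTUP_REQUESTS + START_RADIO_REQUESTS + 4 * REQUESTS_PER_15_MIN_RADIO
)

total_requests = users * requests_per_user_hour
average_rps = total_requests / SECONDS_PER_DAY   # assumes an even spread over 24 hours
peak_rps = 3 * average_rps                       # "triple" is the slide's rough peak factor

print(f"users:              {users:,}")
print(f"requests per user:  {requests_per_user_hour}")
print(f"total requests:     {total_requests:,}")
print(f"average req/sec:    {average_rps:,.0f}")
print(f"peak req/sec:       {peak_rps:,.0f}")
print(f"peak vs capacity:   {peak_rps / MAX_CAPACITY_RPS:.1f}x")
```

Even with the unrounded numbers, the estimated peak is several times the estimated capacity, which is the point of the slide that follows.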

Page 8: Last.fm vs Xbox

Oh fuck

Page 9: Last.fm vs Xbox

What next?

Picked a metric: requests per second

Estimated traffic increase vs capacity

Selected our goals:

Serve requests faster

Reduce number of requests

Page 10: Last.fm vs Xbox

Profiling traffic

Used traffic generated during beta testing

Web server request logs

Common, widely supported format

Hundreds of existing tools

We generated some stats using AWK...
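
The talk used AWK for this step; the same per-method count can be sketched in Python. The log lines below are invented samples in Common Log Format, and the assumption that the API method arrives as a `method` query parameter is illustrative rather than taken from the slides.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Hypothetical sample lines; real input would be read from the web server's access log.
SAMPLE_LOG = """\
10.0.0.1 - - [01/Nov/2009:10:00:00 +0000] "GET /2.0/?method=track.getInfo&artist=A HTTP/1.1" 200 512
10.0.0.2 - - [01/Nov/2009:10:00:01 +0000] "GET /2.0/?method=artist.getImages&artist=A HTTP/1.1" 200 2048
10.0.0.1 - - [01/Nov/2009:10:00:02 +0000] "GET /2.0/?method=track.getInfo&artist=B HTTP/1.1" 200 498
10.0.0.3 - - [01/Nov/2009:10:00:03 +0000] "GET /favicon.ico HTTP/1.1" 404 0
"""

# Captures the request path (with query string) from the quoted request line.
REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')


def count_methods(lines):
    """Count requests per API method, skipping lines without a method parameter."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        query = parse_qs(urlsplit(match.group(1)).query)
        for method in query.get("method", []):
            counts[method] += 1
    return counts


counts = count_methods(SAMPLE_LOG.splitlines())
for method, calls in counts.most_common():
    print(f"{calls:>6} {method}")
```

Sorting by call count, as `most_common()` does, gives the same shape of output as the method table on the next slide.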

Page 11: Last.fm vs Xbox

Which API requests were made?

Calls  Method
71638  track.getInfo
53941  artist.getImages
15150  radio.getPlaylist
 7308  library.getArtists
 5020  user.getRecentStations
 4979  ads.getVideos
 4205  radio.tune
 3155  track.love
 1507  artist.getInfo
 1258  user.getRecommendedArtists
 1135  user.getInfo
 1130  geo.getTopArtists
 1128  radio.gamerStations
 1102  tag.getTopArtists
 1021  track.ban
 1006  user.getLovedTracks
  340  library.addArtist
  206  auth.getMobileSession
  136  user.signUp
  123  user.terms

Page 12: Last.fm vs Xbox

Which API requests were made?

Page 13: Last.fm vs Xbox

Raw data from beta

Calls  Method                      Total  Average
53941  artist.getImages            19647     0.36
71638  track.getInfo               15789     0.22
15150  radio.getPlaylist            6962     0.46
 7308  library.getArtists           2402     0.33
 4979  ads.getVideos                1810     0.36
 5020  user.getRecentStations       1674     0.33
 1102  tag.getTopArtists            1488     1.35
 1258  user.getRecommendedArtists   1457     1.16
 4205  radio.tune                    923     0.22
 1130  geo.getTopArtists             575     0.51
 1507  artist.getInfo                440     0.29
 1128  radio.gamerStations           298     0.26
 1006  user.getLovedTracks           271     0.27
 1135  user.getInfo                  171     0.15
  206  auth.getMobileSession          38     0.19
  136  user.signUp                    32     0.24
  123  user.terms                     16     0.13
 3155  track.love                      0     0.00
 1021  track.ban                       0     0.00
  340  library.addArtist               0     0.00
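
The Average column is the Total divided by the number of calls, and the two columns rank the methods differently. A short sketch with a handful of rows copied from the beta data shows this; the units of Total are not stated in the slides, so none are assumed here.

```python
# (calls, method, total) rows copied from the beta data; only a subset is included.
ROWS = [
    (53941, "artist.getImages", 19647),
    (71638, "track.getInfo", 15789),
    (15150, "radio.getPlaylist", 6962),
    (7308, "library.getArtists", 2402),
    (1102, "tag.getTopArtists", 1488),
    (1258, "user.getRecommendedArtists", 1457),
]


def average(calls, total):
    """Average cost per call, as in the table's Average column."""
    return total / calls


# Highest total cost: where most server time goes overall.
by_total = sorted(ROWS, key=lambda row: row[2], reverse=True)

# Highest average cost: the slowest individual calls.
by_average = sorted(ROWS, key=lambda row: average(row[0], row[2]), reverse=True)

print("By total:  ", [method for _, method, _ in by_total[:2]])
print("By average:", [method for _, method, _ in by_average[:2]])
```

Frequent but cheap calls dominate the total, which makes them candidates for caching or removal, while rarely called but slow methods are candidates for profiling the request itself.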

Page 14: Last.fm vs Xbox

How long did each method take?

Page 15: Last.fm vs Xbox

Why so many track.getInfo calls?

A tiny UI tweak...

...responsible for 25% of calls.

Arrggghhhhhh

Added that information to a sensible API call

Microsoft kindly updated the app

Page 16: Last.fm vs Xbox

What next?

Page 17: Last.fm vs Xbox

What about the getImages calls?

Powers an artist slideshow visualisation

Results of this call won’t change often

Set an HTTP cache timeout

Set caching on a few other calls too
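
Setting a cache timeout amounts to sending a `Cache-Control` header with a `max-age` on responses that rarely change. The sketch below is illustrative only: the lifetimes and the choice of which methods are cacheable are assumptions, since the slides do not give the real values.

```python
# Hypothetical cache lifetimes in seconds; the real values are not given in the talk.
CACHE_TTL_SECONDS = {
    "artist.getImages": 24 * 60 * 60,   # slideshow images rarely change
    "geo.getTopArtists": 60 * 60,
}


def cache_headers(method):
    """Return response headers telling HTTP caches how long to keep a response."""
    ttl = CACHE_TTL_SECONDS.get(method)
    if ttl is None:
        # Anything not explicitly listed (writes, per-user data) is not cached.
        return [("Cache-Control", "no-store")]
    return [("Cache-Control", f"public, max-age={ttl}")]


print(cache_headers("artist.getImages"))
print(cache_headers("track.love"))
```

With headers like these, an HTTP cache in front of the web servers can answer repeat requests itself, so they never reach the web server, object cache or database.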

Page 18: Last.fm vs Xbox

4

Cached Requests

Page 19: Last.fm vs Xbox

Request generation

Calls  Method                      Total  Average
 1258  user.getRecommendedArtists   1457     1.16
15150  radio.getPlaylist            6962     0.46

Page 20: Last.fm vs Xbox

webgrind: http://code.google.com/p/webgrind/

kcachegrind: http://kcachegrind.sourceforge.net

Page 21: Last.fm vs Xbox

What happens if things break?

Simulated failing calls

Highlighted essential calls

Acted as a dry-run for launch day failures

Informed our backup plans
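
One way to simulate failing calls is to wrap the API client so that chosen methods raise an error, then run a user flow once per method and see which failures break it. This is a minimal sketch under assumed names: the handlers, the `start_radio` flow and the idea that artwork is optional are all hypothetical, not details from the talk.

```python
class SimulatedFailure(Exception):
    """Raised in place of a real response for a method under test."""


def make_client(handlers, failing):
    """Build a call function that fails for every method named in `failing`."""
    def call(method, **params):
        if method in failing:
            raise SimulatedFailure(method)
        return handlers[method](**params)
    return call


def start_radio(call):
    """Hypothetical app flow: tuning and the playlist are required, images are optional."""
    call("radio.tune", station="example")
    playlist = call("radio.getPlaylist")
    try:
        images = call("artist.getImages", artist=playlist[0])
    except SimulatedFailure:
        images = []          # degrade gracefully: play without a slideshow
    return playlist, images


def essential_methods(handlers, flow):
    """Fail each method in turn; a method is essential if the flow cannot survive it."""
    essential = []
    for method in handlers:
        try:
            flow(make_client(handlers, failing={method}))
        except SimulatedFailure:
            essential.append(method)
    return essential


# Stub handlers standing in for the real API.
HANDLERS = {
    "radio.tune": lambda station: True,
    "radio.getPlaylist": lambda: ["Example Artist"],
    "artist.getImages": lambda artist: ["image-1"],
}

print(essential_methods(HANDLERS, start_radio))
```

The methods that come back as essential are the ones a backup plan must keep serving; everything else is a candidate for being dropped under load.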

Page 22: Last.fm vs Xbox

Only essential requests

Page 23: Last.fm vs Xbox

Prepare for the worst

Unexpected problems we’ve had:

Servers overheating (twice)

Hardware (almost) stolen from data-centers

Power outage in the office

Page 24: Last.fm vs Xbox

Backup plans, AKA The “Kill List”

Plan                               Effect                          Severity
Disable radio DB-backing           Faster calls                    Minor
Disable Flash Player               Save 200 req/sec                Major
Drop non-essential Xbox API calls  Reduce Xbox traffic by 0 - 50%  Extreme
Drop X% of radio tune calls        Reduce Xbox traffic by X%       Nuclear
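
The last two entries on the kill list can be pictured as switches checked before a request is handled. The sketch below is an assumption about how such switches might look, not the mechanism Last.fm used; the set of essential methods and the function name are hypothetical.

```python
import random

# Assumption: which calls count as essential is illustrative, not taken from the talk.
ESSENTIAL_METHODS = {"radio.tune", "radio.getPlaylist", "auth.getMobileSession"}


def should_serve(method, drop_non_essential=False, tune_drop_percent=0,
                 rng=random.random):
    """Decide whether to handle a request under the current kill-list settings."""
    # "Extreme": serve only the calls the app cannot work without.
    if drop_non_essential and method not in ESSENTIAL_METHODS:
        return False
    # "Nuclear": refuse a random X% of radio.tune calls to shed new listeners.
    if method == "radio.tune" and rng() * 100 < tune_drop_percent:
        return False
    return True


print(should_serve("artist.getImages"))
print(should_serve("artist.getImages", drop_non_essential=True))
```

Passing the random source in as `rng` keeps the percentage drop testable; in production the default `random.random` would be used and the switches flipped by whoever is on call.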

Page 25: Last.fm vs Xbox

Communication

Page 26: Last.fm vs Xbox

Last.fm: Launch Day (When traffic attacks)

Page 27: Last.fm vs Xbox

How did it go?

Our estimate was about 50% over

Didn’t exceed capacity (but got quite close)

Profiling and caching were essential

Without them we would have gone down

Page 28: Last.fm vs Xbox

What did we learn?

Use timezones to roll out slowly

Traffic will follow daily trends

Live monitoring is essential

Backup plans are comforting

Pre-fill caches before launch
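
Live monitoring of the chosen metric, requests per second, can be as simple as a sliding-window counter. This is a minimal single-process sketch with an assumed class name; a real deployment would aggregate across servers with existing monitoring tools.

```python
import time
from collections import deque


class RequestRateMonitor:
    """Tracks requests per second over a sliding time window."""

    def __init__(self, window_seconds=10, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock            # injectable so the monitor can be tested
        self.timestamps = deque()

    def record(self):
        """Call once per handled request."""
        self.timestamps.append(self.clock())

    def rate(self):
        """Average requests per second over the last `window_seconds`."""
        cutoff = self.clock() - self.window_seconds
        # Discard requests that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps) / self.window_seconds


monitor = RequestRateMonitor(window_seconds=10)
for _ in range(100):
    monitor.record()
print(f"{monitor.rate():.1f} requests/sec")
```

Comparing this rate against the estimated capacity gives an early signal for when to reach for a backup plan.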

Page 29: Last.fm vs Xbox

So, how does this help me?

Page 30: Last.fm vs Xbox

1. Estimate

Choose your metric

Estimate launch traffic

Compare against capacity

Make performance targets

Know your limitations

Page 31: Last.fm vs Xbox

2. Profile requests

Start with a sample of traffic

Extract data for your metric

Visualise the results

Identify expensive requests for your metric

Use profiling tools on individual requests

Page 32: Last.fm vs Xbox

3. Optimise

Reduce number of requests

Set the right HTTP caching headers

Combine with a reverse web proxy

Prime caches for common calls

Use an object cache

Avoid language-level optimisation
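
An object cache and cache priming fit together: the cache stores the result of an expensive lookup, and priming fills it for common keys before real traffic arrives. The sketch below uses an in-memory dictionary as a stand-in for a real object cache, with no expiry; the class and function names are illustrative, and the talk does not name the cache Last.fm used.

```python
class ObjectCache:
    """In-memory stand-in for a shared object cache (no expiry, single process)."""

    def __init__(self):
        self._store = {}
        self.misses = 0

    def get_or_compute(self, key, compute):
        """Return the cached value, computing and storing it on a miss."""
        if key not in self._store:
            self.misses += 1
            self._store[key] = compute()
        return self._store[key]


def prime(cache, keys, loader):
    """Warm the cache for common keys before launch so early requests are hits."""
    for key in keys:
        # Bind key as a default argument so each lambda loads its own key.
        cache.get_or_compute(key, lambda key=key: loader(key))


def load_top_artists(country):
    """Hypothetical expensive database query."""
    return [f"top artist in {country}"]


cache = ObjectCache()
prime(cache, ["uk", "us", "de"], load_top_artists)

# A request after priming is served from the cache without touching the database.
result = cache.get_or_compute("uk", lambda: load_top_artists("uk"))
print(result, "misses:", cache.misses)
```

Priming moves the cost of the first miss for each key to before launch, when the database is not yet under load.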

Page 33: Last.fm vs Xbox

Load balancer

HTTP Cache

Web Server

Object Cache

Database

Web Request

Page 36: Last.fm vs Xbox

4. Plan for failure

Simulate failures

Know your weak spots

Prepare backup plans

Communicate with users and partners

Page 37: Last.fm vs Xbox

5. Launch it!

Roll out slowly, if you can

Set up live monitoring

If something goes wrong:

Don’t panic

Keep people updated

Have some champagne on ice

Page 38: Last.fm vs Xbox

1. Start with an estimate

2. Profile your traffic

3. Make optimisations

4. Prepare for the worst

5. Launch it!

Page 39: Last.fm vs Xbox

Last.fm vs. Xbox

David Singleton last.fm/user/underpangs twitter.com/dsingleton

Questions?