Netlog: What we learned about scalability & high availability
![Page 1: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/1.jpg)
What we learned about scalability & high availability
Folke Lemaitre, Director of Development
http://nl.netlog.com/folke
27 May 2008
![Page 2: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/2.jpg)
Overview
‣ What is Netlog?
‣ Translations
‣ Network topology
‣ Scaling Databases
‣ Caching
‣ Search
‣ Q&A
![Page 3: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/3.jpg)
What is Netlog?
![Page 4: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/4.jpg)
Social Network
‣ Create your own profile
‣ Discover your friendsʼ activity
‣ Communicate
‣ Explore new content
‣ Applications
![Page 5: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/5.jpg)
Your Profile
![Page 6: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/6.jpg)
What: itʼs personal
‣ You rule: itʼs yours
[Diagram labels: YOU, Videos, Blogs, Relations, Photos, People, Games, Music, ANOTHER]
![Page 7: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/7.jpg)
Friend Activity
‣ Share & discover friendsʼ activity
Toon Coppens uploads a new photo
Jaak Noukens and Jo are now friends
Mari . comments on her photo
Pinguke V changes her profile photo
Stijn Symons uploads a new photo
Jan Maarten Willems signs the guestbook of nico b
Kenny Gryp signs the guestbook of Lorenz Bogaert
![Page 8: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/8.jpg)
Communication: Shouts
![Page 9: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/9.jpg)
Communication: Ratings & Comments
![Page 10: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/10.jpg)
Communication: Private messaging
![Page 11: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/11.jpg)
Communication: Chat
![Page 12: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/12.jpg)
Communication: Clans
![Page 13: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/13.jpg)
Explore
Profiles
Videos
Photos
Clans
Events
Blogs
Pages
Music
Applications
![Page 14: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/14.jpg)
Applications
‣ OpenSocial
• sandbox: http://nl.netlog.com/go/developer/opensocial/sandbox=1
‣ Officially announced tomorrow @ Google I/O
• Stay tuned!
‣ Public launch planned for June
![Page 15: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/15.jpg)
Developer Pages
http://nl.netlog.com/go/developer
![Page 16: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/16.jpg)
Itʼs going pretty good
‣ More than 35,000,000 unique members
‣ More than 4,000,000,000 pageviews/month
‣ 19 languages and more coming up
‣ More than 20 countries
‣ Current Alexa Top-100 ranking (most visited web sites in the world)
‣ Current ComScore Europe Top-10 ranking
![Page 17: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/17.jpg)
Itʼs going pretty good
[Chart: Monthly Unique Visitors, January 2007 to April 2008, y-axis 0 to 40,000,000]
[Pie chart by region: Western Europe 46%, Southern Europe 22%, Western Asia 16%, Americas 10%, Northern Europe 3%, Eastern Europe 3%]
[Chart: Monthly Visits, January 2007 to April 2008, y-axis 0 to 200,000,000]
[Chart: Monthly Page Requests, January 2007 to April 2008, y-axis 0 to 5,000,000,000]
![Page 18: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/18.jpg)
Itʼs going pretty good
![Page 19: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/19.jpg)
Translations
![Page 20: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/20.jpg)
19 languages and a lot more coming!
Català
中文
česky
Dansk
Nederlands
English
Eesti
suomi
français
Deutsch
Italiano
Lietuvių kalba
Norsk (bokmål)
Polski
Português
Română
Русский
slovenščina
Español
Svenska
Türkçe
Afrikaans
български
Hrvatski
Magyar
Latviešu valoda
Slovenčina
![Page 21: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/21.jpg)
Translate Tool
![Page 22: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/22.jpg)
Template
![Page 23: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/23.jpg)
Parsed Template
![Page 24: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/24.jpg)
Translated Template
![Page 25: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/25.jpg)
Generated PHP code
![Page 26: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/26.jpg)
Template Code
![Page 27: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/27.jpg)
Template Output
![Page 28: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/28.jpg)
Network Topology
![Page 29: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/29.jpg)
Overview
Internet
CDN
Netlog Datacenters
‣ Firewall
‣ Web Load Balancer
‣ Web Cluster
‣ Memcache Pools
• Session Cache
• General Cache
• HTML Cache
‣ Database Pools (each pool: one Master, two Slaves)
• Primary Pool
• User Pool
• Activity Pool
• Friendships Pool
• ...
‣ Static Load Balancer
‣ Storage Servers
![Page 30: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/30.jpg)
Web Servers
‣ Software
• Apache 2
• PHP 5.2.6
• eAccelerator 0.9.5.2 for bytecode caching
• Keepalived for high availability
‣ 200 servers
‣ 450,000 requests per second
![Page 31: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/31.jpg)
Database Servers
‣ MySQL Enterprise 4.1.22
‣ 200 database servers
‣ 40 thousand tables
‣ 70 billion records
‣ 60 thousand queries per second
![Page 32: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/32.jpg)
Memcache Servers
‣ Memcached 1.2.4
‣ 60 servers
‣ 250 thousand requests/second
‣ 450 GB of memory
![Page 33: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/33.jpg)
Static servers
‣ Software:
• Lighttpd
• Nginx
‣ Used for:
• static files: css/javascript/images/...
• user content: photos, videos
‣ Content Delivery Network: Akamai & Panther
![Page 34: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/34.jpg)
Other servers
‣ OpenSocial:
• Shindig
• Tomcat
‣ Search:
• Sphinx
![Page 35: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/35.jpg)
Scaling Databases
![Page 36: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/36.jpg)
Database & Scalability
‣ Database pools
‣ Replication
‣ Partitioning
![Page 37: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/37.jpg)
Database Pools
‣ Different data on different database pools:
• messaging
• friendships
• blogs
• music
• videos
• ...
![Page 38: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/38.jpg)
Replication
‣ write to one master
‣ read from multiple slaves (and master)
‣ pros
• easy to implement
• read intensive applications scale very well
‣ cons
• write intensive applications donʼt scale
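The read/write split described on this slide can be sketched in a few lines. This is a minimal illustration, not Netlog's implementation: the class name `ReplicatedPool`, the server names, and the rule "anything that is not a SELECT goes to the master" are all assumptions made for the example.

```python
import random


class ReplicatedPool:
    """Hypothetical router: writes go to the master, reads to any server."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def server_for(self, sql):
        # Assumption: a statement that starts with SELECT is a read;
        # everything else modifies data and must reach the master.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read:
            # Reads may be served by any slave, or by the master as well.
            return random.choice(self.slaves + [self.master])
        return self.master


pool = ReplicatedPool("db-master", ["db-slave1", "db-slave2"])
print(pool.server_for("UPDATE user SET nick = 'x' WHERE userid = 1"))  # db-master
```

Adding slaves raises read capacity, but every write still lands on the single master, which is the scaling limit named under "cons".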
![Page 39: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/39.jpg)
Partitioning (sharding)
‣ Divide data on primary key:
• all user data for users with id 1 - 10 in database1
• all user data for users with id 11 - 20 in database2
• ...
‣ Best scaling possible
‣ How?
• managed in code
• MySQL partitioning (available from version 5.1)
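The "managed in code" option amounts to a lookup from primary key to database. The sketch below uses the range size from the slide (10 ids per database); the function name `shard_for_user` and the `databaseN` naming are assumptions for illustration only.

```python
USERS_PER_SHARD = 10  # matches the slide: ids 1-10, 11-20, ...


def shard_for_user(user_id, users_per_shard=USERS_PER_SHARD):
    """Return the name of the database holding all data for this user id."""
    if user_id < 1:
        raise ValueError("user ids start at 1")
    # Ids 1..10 map to shard 1, 11..20 to shard 2, and so on.
    shard_number = (user_id - 1) // users_per_shard + 1
    return "database%d" % shard_number


print(shard_for_user(7))   # database1
print(shard_for_user(11))  # database2
```

A fixed range formula like this is the simplest scheme; a lookup table from id range to database would allow moving users between shards, at the cost of an extra lookup.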
![Page 40: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/40.jpg)
Analyse, analyse, analyse!
‣ Tag your queries
• SELECT * FROM USER WHERE userid = 123 /* User::getUser():11 */
‣ Analyse MySQL slow logs
‣ Analyse process lists
‣ Analyse based on tags
• 1023 User::getUser():230
• 512 User::isOnline():124
• 10 Activities::getActivity():320
‣ Cron job runs every minute and checks for “too many connections”
• if “too many connections”, log the process list
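Query tagging and tag-based analysis can be sketched as follows. The helper names `tag_query` and `count_tags` are hypothetical; the tag format (`Class::method():line` inside an SQL comment) follows the slide, and the input to `count_tags` is assumed to be a list of query strings taken from a slow log or a logged process list.

```python
import re
from collections import Counter

TAG_RE = re.compile(r"/\*\s*(.+?)\s*\*/")


def tag_query(sql, cls, method, line):
    """Append a comment naming the code location that issued the query."""
    return "%s /* %s::%s():%d */" % (sql, cls, method, line)


def count_tags(queries):
    """Count queries per tag, most frequent first."""
    counts = Counter()
    for query in queries:
        match = TAG_RE.search(query)
        counts[match.group(1) if match else "(untagged)"] += 1
    return counts.most_common()


logged = [
    tag_query("SELECT * FROM USER WHERE userid = 123", "User", "getUser", 11),
    tag_query("SELECT * FROM USER WHERE userid = 456", "User", "getUser", 11),
    tag_query("SELECT online FROM USER WHERE userid = 1", "User", "isOnline", 124),
]
for tag, count in count_tags(logged):
    print(count, tag)
```

Because the tag travels inside the SQL text, it shows up wherever the query does, so a spike in one tag points directly at the method responsible.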
![Page 41: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/41.jpg)
Caching
![Page 42: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/42.jpg)
Introduction to memcached
‣ Developed by Danga Interactive:
• http://www.danga.com/
‣ Initially developed for LiveJournal:
• http://www.livejournal.com/
‣ Open source
![Page 43: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/43.jpg)
Introduction to memcached
‣ Least Recently Used
‣ Fast!
‣ Distributed
‣ Automatic failover
‣ Big Hash table: set/add/get/delete
![Page 44: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/44.jpg)
What to cache?
‣ sessions
‣ query caching
‣ processed data
‣ generated html
![Page 45: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/45.jpg)
Session Cache
‣ 99% hit ratio
‣ Time to live is 20 minutes
‣ Faster than session database
![Page 46: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/46.jpg)
Query Cache
‣ Why memcache and not the MySQL query cache?
• MySQL invalidates all cached queries for a table on every update to that table
• each replicated database keeps its own separate query cache
‣ Add to generic database classes
• Cache key is query
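A query cache keyed on the query text might look like the sketch below. `FakeCache` is an in-memory stand-in for a memcached client and `run_query` is a hypothetical callable that executes SQL; neither comes from the slides. The query is hashed because memcached keys are limited to 250 bytes and may not contain whitespace, so raw SQL cannot be used as a key directly.

```python
import hashlib


class FakeCache:
    """In-memory stand-in for a memcached client (assumption for the example)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


def cache_key(sql):
    # Hash the SQL text so the key is short and contains no spaces.
    return "query:" + hashlib.md5(sql.encode("utf-8")).hexdigest()


def cached_query(cache, run_query, sql):
    """Return cached rows for this exact query, running it only on a miss."""
    key = cache_key(sql)
    rows = cache.get(key)
    if rows is None:
        rows = run_query(sql)
        cache.set(key, rows)
    return rows
```

Placing this in a generic database class means every caller benefits without changes, and all web servers share one cache instead of each replicated database holding its own.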
![Page 47: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/47.jpg)
Processed data
‣ Better to cache processed data than query results
![Page 48: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/48.jpg)
HTML Caching
![Page 49: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/49.jpg)
HTML Caching
‣ Profile blocks are fully cached
‣ Data needed to generate html is also cached
‣ When data changes, html is invalidated and the cached data is updated
‣ High cache hit rate on profile pages
![Page 50: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/50.jpg)
3 ways of caching
‣ Cache with TTL
‣ Cache forever with invalidate
‣ Cache forever with update
![Page 51: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/51.jpg)
Cache with TTL
‣ The good:
• Quickly achieve better performance on existing code
‣ The bad:
• Users see outdated information
• TTL cannot be high
• Caching efficiency is minimal
![Page 52: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/52.jpg)
Cache with TTL
‣ Cache friends for 5 minutes
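A sketch of the "cache friends for 5 minutes" case. `TTLCache` is an in-memory stand-in for memcached with an injectable clock so expiry can be demonstrated, and `load_friends_from_db` is a hypothetical loader; both are assumptions for the example.

```python
import time


class TTLCache:
    """In-memory stand-in for memcached with per-key expiry (assumption)."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.data = {}

    def set(self, key, value, ttl):
        self.data[key] = (value, self.clock() + ttl)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.data[key]  # expired: behave like a miss
            return None
        return value


FRIENDS_TTL = 5 * 60  # 5 minutes, in seconds


def get_friends(cache, load_friends_from_db, user_id):
    key = "friends:%d" % user_id
    friends = cache.get(key)
    if friends is None:
        friends = load_friends_from_db(user_id)
        cache.set(key, friends, FRIENDS_TTL)
    return friends
```

Nothing removes the entry when a friendship changes, so a user can see a friends list that is up to 5 minutes old, which is the "outdated information" drawback listed for this approach.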
![Page 53: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/53.jpg)
Cache forever with invalidate
‣ The Good:
• fairly easy to implement
• user never sees outdated data
![Page 54: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/54.jpg)
Cache friends forever
‣ For memcached this means ttl=0
![Page 55: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/55.jpg)
Invalidate Cache
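Caching forever and invalidating on change could look roughly like this. `FakeCache` is an in-memory stand-in for a memcached client and `db` is a plain dict standing in for the friendships database; both are assumptions, as are the function names.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (assumption for the example)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ttl=0):
        # ttl=0 means "never expire" in memcached; the stand-in ignores ttl.
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


def friends_key(user_id):
    return "friends:%d" % user_id


def get_friends(cache, db, user_id):
    friends = cache.get(friends_key(user_id))
    if friends is None:
        friends = list(db.get(user_id, []))  # hypothetical db: dict of id -> friend ids
        cache.set(friends_key(user_id), friends, ttl=0)
    return friends


def add_friend(cache, db, user_id, friend_id):
    db.setdefault(user_id, []).append(friend_id)
    # Invalidate: the next get_friends() call reloads fresh data.
    cache.delete(friends_key(user_id))
```

Every code path that writes friendships has to call the invalidation, otherwise the cached list stays stale indefinitely. Note that with a real memcached, ttl=0 entries can still be evicted by the LRU when memory runs out.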
![Page 56: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/56.jpg)
Cache forever with update
‣ The Good:
• Best caching possible
• Can reduce your select queries to the minimum
![Page 57: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/57.jpg)
Update Cache (array)
‣ Only update cache when no db queries needed
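One reading of "only update cache when no db queries needed": if the array is already cached, modify it in place; if it is not, leave the cache empty and let the next read populate it, rather than querying the database just to rebuild it. The names below (`FakeCache`, the dict-based `db`, `add_friend`) are assumptions for the sketch.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (assumption for the example)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


def add_friend(cache, db, user_id, friend_id):
    db.setdefault(user_id, []).append(friend_id)  # hypothetical db write
    key = "friends:%d" % user_id
    cached = cache.get(key)
    if cached is not None:
        # The cached array exists: extend it, no select query needed.
        cache.set(key, cached + [friend_id])
    # Otherwise do nothing: rebuilding the array here would cost a db query.
```

The get-then-set pair is not atomic, so two concurrent updates can overwrite each other; a lock around the update is one way to guard against that.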
![Page 58: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/58.jpg)
Update Cache (simple value)
‣ No need to check cache
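For a simple value the new value fully replaces the old one, so the cache can be overwritten without reading it first. In this sketch the cached value is a user's status text; that choice, along with `FakeCache` and the dict-based `db`, is an assumption for illustration.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (assumption for the example)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


def set_status(cache, db, user_id, status):
    db[user_id] = status  # hypothetical db write
    # Overwrite unconditionally: the new value does not depend on the old one.
    cache.set("status:%d" % user_id, status)
```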
![Page 59: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/59.jpg)
Global Locking
‣ Use memcache as locking mechanism
![Page 60: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/60.jpg)
Global Locking: Chat Example
‣ Example: add new message to cached shared chat thread
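A lock can be built on memcached's `add` command, which stores a key only if it does not exist yet, so exactly one caller succeeds. The sketch below applies that to the chat example. `FakeCache` is an in-memory stand-in, and the key names, retry count, and delay are assumptions; with a real memcached the lock key should also get a short expiry so a crashed process cannot hold the lock forever (omitted in the stand-in).

```python
import time


class FakeCache:
    """In-memory stand-in for a memcached client (assumption for the example)."""

    def __init__(self):
        self.data = {}

    def add(self, key, value):
        # Like memcached "add": store only if the key is absent.
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


def acquire_lock(cache, name, retries=50, delay=0.01):
    for _ in range(retries):
        if cache.add("lock:" + name, 1):
            return True
        time.sleep(delay)  # someone else holds the lock: wait and retry
    return False


def release_lock(cache, name):
    cache.delete("lock:" + name)


def add_chat_message(cache, thread_id, message):
    name = "chat:%d" % thread_id
    if not acquire_lock(cache, name):
        return False
    try:
        thread = cache.get(name) or []
        cache.set(name, thread + [message])  # read-modify-write under the lock
    finally:
        release_lock(cache, name)
    return True
```

The lock makes the read-modify-write of the shared thread safe across web servers, since all of them talk to the same cache.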
![Page 61: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/61.jpg)
Flooding detection
‣ User can only redo action A after a timeout
• a guestbook message can only be posted once every 2 minutes
‣ User cannot do action A more than X times in T minutes
• only 12 failed login attempts per hour are allowed
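Both rules map onto memcached primitives: a marker key with an expiry for the timeout rule, and a counter with an expiry for the "X times in T minutes" rule. The sketch below is one possible implementation, not necessarily the one shown in the talk. `FakeCache` is an in-memory stand-in with an injectable clock, and the key names and function names are assumptions.

```python
import time


class FakeCache:
    """In-memory stand-in for memcached add/incr with expiry (assumption)."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.data = {}

    def _live(self, key):
        entry = self.data.get(key)
        if entry is not None and self.clock() < entry[1]:
            return entry
        self.data.pop(key, None)  # drop expired entries
        return None

    def add(self, key, value, ttl):
        if self._live(key) is not None:
            return False
        self.data[key] = (value, self.clock() + ttl)
        return True

    def incr(self, key):
        entry = self._live(key)
        if entry is None:
            return None  # like memcached: incr fails on a missing key
        self.data[key] = (entry[0] + 1, entry[1])
        return entry[0] + 1


def allow_after_timeout(cache, user_id, action, timeout):
    # The add only succeeds when no unexpired marker exists for this action.
    return cache.add("flood:%s:%d" % (action, user_id), 1, timeout)


def allow_x_per_window(cache, user_id, action, max_count, window):
    key = "floodcount:%s:%d" % (action, user_id)
    cache.add(key, 0, window)  # starts a new window if none is active
    count = cache.incr(key)
    # If the key expired between add and incr, the window just ended: allow.
    return count is None or count <= max_count
```

For the examples on the slide, `allow_after_timeout(cache, uid, "guestbook", 120)` would gate guestbook posts, and `allow_x_per_window(cache, uid, "failed_login", 12, 3600)` would be called on each failed login. The counter uses a fixed window that starts at the first attempt, not a sliding one.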
![Page 62: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/62.jpg)
Flooding detection
![Page 63: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/63.jpg)
Flooding detection
‣ User can only redo action A after a timeout• a guestbook message can only be posted once every
2 minutes
‣ User can not do action A more than X times in T minutes• only 12 failed login attempts per hour are allowed
![Page 64: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/64.jpg)
Search
![Page 65: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/65.jpg)
MySQL full-text search
‣ Initially used for our search
• can be very slow
• extra load on most of our databases, since most content is searchable
‣ Better search engine needed
• Sphinx!
• Open source search engine developed by Andrew Aksyonoff (http://sphinxsearch.com/)
![Page 66: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/66.jpg)
Sphinx Features
‣ very fast indexing
‣ very fast searching
• 0.04 seconds average
• 5 million searches / day
• 60 searches / second
‣ distributed
‣ document fields
‣ stopwords
‣ API available in many languages
• PHP, Java, Python, Ruby, Perl, C++, ...
![Page 67: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/67.jpg)
Sphinx Indexer
‣ Index is read-only (except for attributes)
‣ Build new index while searching old one
‣ How we index:
• rebuild full index from data once in a while (daily, weekly)
• generate delta indexes often (every minute, 5 minutes); a delta contains the changes for the search index since the last full index merge
• full index merge of previous index and delta (every hour)
![Page 68: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/68.jpg)
Sphinx Search
‣ Search query returns list of ids
‣ For every result page shown, we fetch data associated with ids
• data is cached with memcache for every id
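The two-step flow (search returns ids, the page then loads data per id through the cache) can be sketched as below. The Sphinx call itself is left out; `ids` is assumed to be the id list it returned. `FakeCache`, `load_from_db`, the `item:` key prefix, and the page size are assumptions for the example.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (assumption for the example)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


def page_of(ids, page, per_page=10):
    """Slice the id list returned by the search down to one result page."""
    start = (page - 1) * per_page
    return ids[start:start + per_page]


def fetch_results(cache, load_from_db, ids):
    """Load the data for each id, going to the database only on a cache miss."""
    results = []
    for item_id in ids:
        key = "item:%d" % item_id
        item = cache.get(key)
        if item is None:
            item = load_from_db(item_id)
            cache.set(key, item)
        results.append(item)
    return results
```

Caching per id rather than per search query means the same cached item is reused by every search that returns it, and only the ids on the page being shown are ever loaded.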
![Page 69: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/69.jpg)
![Page 70: Netlog: What we learned about scalability & high availability](https://reader034.fdocuments.net/reader034/viewer/2022051207/540e3c6b8d7f72927e8b4b63/html5/thumbnails/70.jpg)
Thank you!
Questions?