Scalability at GROU.PS Emre Sokullu. Disclaimer We’re not fully there yet We hire:...

Page 1: Scalability at GROU.PS Emre Sokullu. Disclaimer We’re not fully there yet We hire: jobs@groups-inc.com.

Scalability at GROU.PS

Emre Sokullu

Disclaimer

• We’re not fully there yet
• We hire: jobs@groups-inc.com

Challenges @ GROU.PS

• 3M unique visitors per month
• 120M page views
• 1PB assets to be served every month
  – Video, Photos, Files
• Support for 5Gbit/s
• Very dynamic pages:
  – With social networks; p(u,t) = HTML
  – p(g,u,t) = HTML -> WHERE group_id = ? AND …
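As a rough sanity check on the traffic figures above, the sketch below converts 1 PB per month into an average bit rate, and 5 Gbit/s into a monthly volume. It assumes a 30-day month and decimal units (1 PB = 10^15 bytes); both are assumptions for illustration, not figures from the slides.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month


def average_gbit_per_s(bytes_per_month: float) -> float:
    """Average sustained rate in Gbit/s needed to serve the given monthly volume."""
    bits = bytes_per_month * 8
    return bits / SECONDS_PER_MONTH / 1e9


def monthly_petabytes(gbit_per_s: float) -> float:
    """Volume in PB that a link moves in a month if it is kept fully busy."""
    bytes_per_second = gbit_per_s * 1e9 / 8
    return bytes_per_second * SECONDS_PER_MONTH / 1e15


if __name__ == "__main__":
    print(f"{average_gbit_per_s(1e15):.2f} Gbit/s average for 1 PB/month")
    print(f"{monthly_petabytes(5):.2f} PB/month at a constant 5 Gbit/s")
```

Under these assumptions 1 PB per month averages out to roughly 3.1 Gbit/s, so a 5 Gbit/s capacity leaves some headroom for peaks.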

What is GROU.PS ?

Distributed Architecture

• 25+ servers, S3 cloud, EdgeCast CDN
• 4+ cores
• All Linux: Red Hat, some Debian, Ubuntu, CentOS

Amazon Technologies

• S3
• CloudFront
• EC2 (elastic IP and persistent storage)
• SimpleDB
• Queue technologies, distributed Hadoop and more…

Amazon Technologies

• Downside:
  – Not so cheap
  – Bad database performance

Serving Content?

• Use MogileFS
  – Distributed file serving
• Use CDN
  – Hot content served from local servers
• Sysctl tunings needed!

Our typical sysctl additions

• net.ipv4.tcp_syncookies = 1
• net.ipv4.tcp_synack_retries = 2
• ## Emre edited
• # http://www.oracle-base.com/articles/11g/OracleDB11gR1InstallationOnFedora8.php
• kernel.shmall = 2097152
• kernel.shmmax = 2147483648
• kernel.shmmni = 4096
• # semaphores: semmsl, semmns, semopm, semmni
• kernel.sem = 250 32000 100 128
• net.ipv4.ip_local_port_range = 1024 65000
• net.core.rmem_default=4194304
• #net.core.rmem_max=4194304
• net.core.wmem_default=262144
• #net.core.wmem_max=262144
• fs.file-max=5049800
• vm.swappiness=10
• ## Emre edited
• # from http://forums.softlayer.com/showthread.php?t=3252
• net.ipv4.tcp_rmem = 4096 87380 8388608
• net.ipv4.tcp_wmem = 4096 87380 8388608
• net.core.rmem_max = 8388608
• net.core.wmem_max = 8388608
• net.core.netdev_max_backlog = 5000
• net.ipv4.tcp_window_scaling = 1
• net.ipv4.ip_nonlocal_bind=1
• # http://rackerhacker.com/2007/08/24/apache-no-space-left-on-device-couldnt-create-accept-lock/
• kernel.msgmni = 1024
• kernel.sem = 250 256000 32 1024
• net.ipv4.ip_conntrack_max = 524288
• net.ipv4.netfilter.ip_conntrack_max = 524288
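In the list above kernel.sem is assigned twice. Settings in a sysctl file are applied in order, so the last assignment is the one that stays in effect. The sketch below is a small, hypothetical helper (it is not part of the sysctl tooling) that reads sysctl.conf-style text and reports the effective value per key and which keys were overridden. The sample text is a shortened excerpt of the list.

```python
def effective_sysctl(text: str):
    """Return (effective, overridden) for sysctl.conf-style text.

    effective  maps each key to its last assigned value.
    overridden lists keys assigned more than once, in first-seen order.
    """
    effective = {}
    overridden = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if "=" not in line:
            continue  # not a "key = value" line
        key, value = (part.strip() for part in line.split("=", 1))
        if key in effective and key not in overridden:
            overridden.append(key)
        effective[key] = " ".join(value.split())  # normalize inner whitespace
    return effective, overridden


SAMPLE = """
kernel.sem = 250 32000 100 128
#net.core.rmem_max=4194304
net.core.rmem_max = 8388608
kernel.sem = 250 256000 32 1024
"""

if __name__ == "__main__":
    values, dupes = effective_sysctl(SAMPLE)
    print(values["kernel.sem"])  # last assignment wins
    print(dupes)
```

A check like this can be run before loading a file to spot values that silently replace earlier ones.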

MySQL

• Load off via memcache
  – $memcache->set("group_by_name.jtpd", 1122, false, 0);
  – $memcache->set("home_module_html.1122", …, true, 30);
  – function getGroupID($group_name) {
        global $memcache;
        if (!isset($memcache) || ($res = $memcache->get("group_by_name.{$group_name}")) === false) {
            // cache miss: get it from MySQL, then store it in memcache
        } else {
            return $res; // serve from memcache
        }
    }
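The lookup above follows what is commonly called the cache-aside pattern: try the cache, fall back to the database on a miss, then store the result. The sketch below shows the full round trip in Python. The `DictCache` class is an in-memory stand-in for a memcache client and `fake_db_lookup` is a hypothetical database query; neither comes from the slides.

```python
class DictCache:
    """In-memory stand-in for a memcache client (assumption for this sketch)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None signals a miss

    def set(self, key, value):
        self._data[key] = value


def make_get_group_id(cache, load_from_db):
    """Build a lookup that tries the cache first and falls back to the database."""

    def get_group_id(group_name):
        key = f"group_by_name.{group_name}"
        cached = cache.get(key) if cache is not None else None
        if cached is not None:
            return cached  # serve from cache
        group_id = load_from_db(group_name)  # cache miss: ask the database
        if cache is not None and group_id is not None:
            cache.set(key, group_id)  # populate the cache for next time
        return group_id

    return get_group_id


if __name__ == "__main__":
    db_calls = []

    def fake_db_lookup(name):  # hypothetical database query
        db_calls.append(name)
        return {"jtpd": 1122}.get(name)

    get_group_id = make_get_group_id(DictCache(), fake_db_lookup)
    print(get_group_id("jtpd"), get_group_id("jtpd"), len(db_calls))
```

The second call is answered from the cache, so the database is queried only once. As in the original, the lookup still works when no cache is available.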

MySQL

• Replication is easy
• Split reads
• What about writes?
• That’s where sharding comes into play
  – Vertical Sharding
  – Horizontal Sharding
• MMM (Multi-Master Replication Manager for MySQL)
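To make the read/write split and the two sharding styles concrete, here is a minimal routing sketch. All host names, the table-to-cluster map, and the modulo rule are hypothetical; the slides do not say how GROU.PS actually routes queries.

```python
import itertools

# Vertical sharding: whole tables live on different clusters (hypothetical map).
VERTICAL_MAP = {"users": "cluster_users", "photos": "cluster_media"}

# Horizontal sharding: rows of one table are spread over several shards by key.
HORIZONTAL_SHARDS = ["groups_shard_0", "groups_shard_1", "groups_shard_2"]


def vertical_shard(table: str) -> str:
    """Pick the cluster that owns a whole table."""
    return VERTICAL_MAP[table]


def horizontal_shard(group_id: int) -> str:
    """Pick a shard for a row by taking the key modulo the shard count."""
    return HORIZONTAL_SHARDS[group_id % len(HORIZONTAL_SHARDS)]


class ReadWriteSplitter:
    """Send writes to the master and rotate reads across replicas."""

    def __init__(self, master: str, replicas: list):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def route(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.master


if __name__ == "__main__":
    router = ReadWriteSplitter("db-master", ["db-replica-1", "db-replica-2"])
    print(router.route("SELECT * FROM groups WHERE id = 1122"))
    print(router.route("UPDATE groups SET name = 'x' WHERE id = 1122"))
    print(horizontal_shard(1122), vertical_shard("photos"))
```

Replicas absorb the reads, but every write still lands on one master, which is why sharding is needed to spread writes. Modulo routing is the simplest horizontal rule; its known drawback is that changing the shard count moves most keys.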

MySQL

• Runs poorly on multi-cores
• query_cache_size = 0 # on master
• query_cache_type = 0 # on master
• thread_concurrency = 8 # total cores
• max_connections = 750 # shouldn’t exceed that
• innodb_buffer_pool_size = 10G # a little less than the total amount of RAM

MySQL Query Optimization

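• INDEX (group, user)
• Used by WHERE group = ? AND user = ?, and by WHERE group = ? alone
• Not used by WHERE user = ? alone: the column order of the index matters, not the order of conditions in WHERE
• B-tree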
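The leftmost-prefix behaviour of a composite B-tree index can be checked with a query planner. The sketch below uses SQLite from Python's standard library as a stand-in for MySQL, which is an assumption: MySQL's EXPLAIN output looks different, although the same prefix rule applies to its B-tree indexes. The table and index names are hypothetical, and the columns are called group_id and user_id because GROUP is a reserved word in SQL.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the plan detail strings SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # column 3 holds the readable detail


def build_demo_db():
    """Create an in-memory table with a composite index on (group_id, user_id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE membership (group_id INTEGER, user_id INTEGER, role TEXT)"
    )
    conn.execute("CREATE INDEX idx_group_user ON membership (group_id, user_id)")
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    queries = [
        "SELECT * FROM membership WHERE group_id = ? AND user_id = ?",
        "SELECT * FROM membership WHERE user_id = ? AND group_id = ?",
        "SELECT * FROM membership WHERE user_id = ?",
    ]
    for sql in queries:
        params = (1, 2)[: sql.count("?")]  # bind one value per placeholder
        print(sql, "->", query_plan(conn, sql, params))
```

In this sketch the first two queries should both report the composite index regardless of how the conditions are ordered, while the query on user_id alone falls back to a table scan because user_id is not the leading column.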

MySQL Query Optimization

• SHOW PROCESSLIST
• Maatkit, mk-query-digest
• Percona builds

NOSQL

• Voldemort, LinkedIn
• Cassandra, Facebook
• Tokyo Cabinet, mixi

Logging

• Database logging is not the solution
• File system is expensive too
• A legal necessity

Logging

• Solution:
• Scribe & Thrift
• By Facebook
• Eventually consistent

Nginx & libevent

• Handles 10000 connections
• 5Gbit/s
• Rambler
• WordPress
• Grou.ps

Postfix

• Run multiple instances
• Spam Clusters

Monitoring

• Munin + monit
• Other alternatives:
  – Cacti
  – Nagios
  – Hyperic (VMware)

PHP

More to come on my blog

• http://emresokullu.com
• More fine tuning tips
• Become a member of my community
• Love grou.ps ;)
• Convert to PHP
• We’re hiring: jobs@groups-inc.com