Scalability at GROU.PS


Dec 8, 2013 (3 years and 10 months ago)


Emre Sokullu

Disclaimer


We’re not fully there yet


We're hiring: jobs@groups-inc.com

Challenges @ GROU.PS


3M unique visitors per month


120M page views


1PB assets to be served every month


Video, photos, files


Support for 5Gbit/s


Very dynamic pages:


With social networks: p(u, t) = HTML

p(g, u, t) = HTML -> WHERE group_id = ? AND …
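The 1PB-per-month and 5Gbit/s figures above can be related with a quick calculation. The sketch below assumes a decimal petabyte and a 30-day month (neither is stated on the slide) and gives only the average rate; real traffic peaks well above the average, which is presumably why the capacity target is higher.

```python
# Average bandwidth implied by serving 1 PB of assets per month.
# Assumptions (not from the slides): decimal petabyte, 30-day month.
PETABYTE_BYTES = 10**15
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def average_gbit_per_s(bytes_per_month: int) -> float:
    """Convert a monthly transfer volume into an average rate in Gbit/s."""
    bits = bytes_per_month * 8
    return bits / SECONDS_PER_MONTH / 10**9


avg = average_gbit_per_s(PETABYTE_BYTES)
print(f"average: {avg:.2f} Gbit/s")
print(f"headroom at 5 Gbit/s: {5 / avg:.2f}x")
```

Under these assumptions the average comes out at roughly 3 Gbit/s, so a 5 Gbit/s ceiling leaves well under 2x headroom for peaks.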

What is GROU.PS?




Distributed Architecture

25+ servers, S3 cloud, EdgeCast CDN

4+ cores

All Linux: Red Hat

Some Debian, Ubuntu, CentOS

Amazon Technologies


S3


CloudFront


EC2 (elastic IP and persistent storage)


SimpleDB


Queue technologies, distributed Hadoop, and more…

Amazon Technologies


Downsides:


Not so cheap


Bad database performance

Serving Content?


Use MogileFS

Distributed file serving

Use CDN

Hot content served from local servers

Sysctl tuning needed!


Our typical sysctl additions

net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_synack_retries = 2

## Emre edited
# http://www.oracle-base.com/articles/11g/OracleDB11gR1InstallationOnFedora8.php
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
# semaphores: semmsl, semmns, semopm, semmni
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 4194304
# net.core.rmem_max = 4194304
net.core.wmem_default = 262144
# net.core.wmem_max = 262144
fs.file-max = 5049800
vm.swappiness = 10

## Emre edited
# from http://forums.softlayer.com/showthread.php?t=3252
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 87380 8388608
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_window_scaling = 1
net.ipv4.ip_nonlocal_bind = 1

# http://rackerhacker.com/2007/08/24/apache-no-space-left-on-device-couldnt-create-accept-lock/
kernel.msgmni = 1024
kernel.sem = 250 256000 32 1024
net.ipv4.ip_conntrack_max = 524288
net.ipv4.netfilter.ip_conntrack_max = 524288
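The list above sets kernel.sem twice, and the settings are applied in file order, so the later line is the one that takes effect. A minimal sketch of checking a sysctl fragment for such overrides follows; the parser and the SAMPLE text are illustrative assumptions, not a GROU.PS tool, and only a few lines of the list are reproduced.

```python
from collections import OrderedDict


def parse_sysctl(text: str):
    """Return (effective settings, keys that were assigned more than once)."""
    effective = OrderedDict()
    overridden = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        key, value = key.strip(), " ".join(value.split())
        if key in effective and key not in overridden:
            overridden.append(key)
        effective[key] = value  # a later assignment replaces an earlier one
    return effective, overridden


# Hypothetical excerpt mirroring a few lines from the list above.
SAMPLE = """\
kernel.sem = 250 32000 100 128
# net.core.rmem_max=4194304
net.core.rmem_max = 8388608
kernel.sem = 250 256000 32 1024
"""

settings, dupes = parse_sysctl(SAMPLE)
print("effective kernel.sem:", settings["kernel.sem"])
print("assigned more than once:", dupes)
```

Commented-out lines are ignored, so net.core.rmem_max counts as assigned once here, while kernel.sem is reported as overridden.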


MySQL


Offload reads via memcache

$memcache->set("group_by_name.jtpd", 1122, false, 0);
$memcache->set("home_module_html.1122", …, true, 30);

function getGroupID($group_name) {
    global $memcache;
    if (!isset($memcache) || ($res = $memcache->get("group_by_name.{$group_name}")) === false) {
        // cache miss: get it from MySQL, then store it in memcache
    } else {
        return $res; // serve from memcache
    }
}

MySQL

Replication is easy

Split reads

What about writes?

That's where sharding comes into play

Vertical sharding

Horizontal sharding


MMM
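Read splitting and horizontal sharding can be combined in a small router. The sketch below is not the GROU.PS implementation; it assumes shards are chosen by group_id modulo the shard count, that each shard has one master plus replicas, and all server names are invented.

```python
import itertools


class ShardRouter:
    """Pick a shard by group_id; send writes to its master, reads to its replicas."""

    def __init__(self, shards):
        # shards: list of {"master": name, "replicas": [names]}
        self.shards = shards
        # Round-robin over replicas; fall back to the master if there are none.
        self._readers = [
            itertools.cycle(s["replicas"] or [s["master"]]) for s in shards
        ]

    def shard_for(self, group_id: int) -> int:
        return group_id % len(self.shards)

    def writer(self, group_id: int) -> str:
        return self.shards[self.shard_for(group_id)]["master"]

    def reader(self, group_id: int) -> str:
        return next(self._readers[self.shard_for(group_id)])


# Hypothetical topology: two shards, each with its own master and replicas.
router = ShardRouter([
    {"master": "db0-master", "replicas": ["db0-r1", "db0-r2"]},
    {"master": "db1-master", "replicas": ["db1-r1"]},
])
print("write 1122 ->", router.writer(1122))
print("reads 1122 ->", router.reader(1122), router.reader(1122))
```

Modulo routing is the simplest choice but makes adding shards painful, because most keys move when the shard count changes; a lookup table or consistent hashing avoids that at the cost of more moving parts.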

MySQL


Runs poorly on multiple cores

query_cache_size = 0 # on master
query_cache_type = 0 # on master
thread_concurrency = 8 # total cores
max_connections = 750 # shouldn’t exceed that
innodb_buffer_pool_size = 10G # a little less than the total amount of RAM
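MySQL Query Optimization

INDEX (group, user)

WHERE group = ? AND user = ? uses the index

WHERE group = ? alone also uses it (leftmost prefix)

Not WHERE user = ? alone; the order of the AND terms itself does not matter to the optimizer

B-tree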
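How a composite B-tree index gets used can be checked by asking a query planner. The sketch below uses SQLite from Python's standard library as a stand-in for MySQL, purely so it runs without a server; the membership table and the column names group_id and user_id are assumptions for illustration. What matters for a B-tree index is which leading columns are constrained, and both engines follow that leftmost-prefix rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE membership (group_id INTEGER, user_id INTEGER, role TEXT)"
)
# Composite index with group_id as the leading column.
conn.execute("CREATE INDEX idx_group_user ON membership (group_id, user_id)")


def plan(where_clause: str) -> str:
    """Return the planner's description of SELECT * ... WHERE <where_clause>."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM membership WHERE {where_clause}"
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return "; ".join(row[-1] for row in rows)


for clause in (
    "group_id = 1 AND user_id = 2",
    "user_id = 2 AND group_id = 1",
    "user_id = 2",
):
    print(f"{clause:30} -> {plan(clause)}")
```

The first two queries should both report a search on idx_group_user, while the query that constrains only user_id should fall back to a full scan. In MySQL the equivalent check is EXPLAIN on the same statements.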

MySQL Query Optimization

SHOW PROCESSLIST

Maatkit, mk-query-digest

Percona builds

NoSQL

Voldemort, LinkedIn

Cassandra, Facebook

Tokyo Cabinet, mixi

Logging


Database logging is not the solution


File system is expensive too


A legal necessity

Logging


Solution:


Scribe & Thrift


By Facebook


Eventually consistent
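The trade-off behind this slide, accepting delayed delivery in exchange for fewer expensive writes, can be illustrated with a buffering logger. This sketch does not use Scribe or Thrift and makes no claim about their APIs; BufferedLogger, its sink callback, and the batch size are all hypothetical.

```python
class BufferedLogger:
    """Collect log lines in memory and hand them to a sink in batches."""

    def __init__(self, sink, batch_size=3):
        self.sink = sink              # callable that receives a list of lines
        self.batch_size = batch_size
        self._buffer = []

    def log(self, category, message):
        self._buffer.append(f"{category}\t{message}")
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Deliver whatever is buffered; a no-op when the buffer is empty."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self.sink(batch)


batches = []  # stands in for a remote log collector
logger = BufferedLogger(batches.append, batch_size=3)
for i in range(4):
    logger.log("pageview", f"group=1122 hit={i}")
print("delivered batches:", len(batches), "still buffered:", len(logger._buffer))
logger.flush()
print("delivered batches after flush:", len(batches))
```

Lines sitting in the buffer are lost if the process dies before a flush, which is the cost of not writing every event synchronously.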


Nginx & libevent

Handles 10,000 connections

5 Gbit/s

Rambler

WordPress

Grou.ps

Postfix


Run multiple instances


Spam Clusters

Monitoring


Munin + monit

Other alternatives:

Cacti

Nagios

Hyperic

VMware


PHP

More to come on my blog


http://emresokullu.com


More fine-tuning tips


Become a member of my community


Love grou.ps ;)


Convert to PHP


We’re hiring: jobs@groups-inc.com