Scaling PostgreSQL with Skytools
![Page 1: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/1.jpg)
Scaling with SkyTools & More
Scaling-Out Postgres with Skype’s Open-Source Toolset
Gavin M. Roy
September 14th, 2011
![Page 2: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/2.jpg)
About Me
• PostgreSQL user since ~6.5
• CTO @myYearbook.com
• Scaled initial infrastructure
• Not as involved in day-to-day database operations and development
• Twitter: @Crad
![Page 3: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/3.jpg)
Scaling?
![Page 4: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/4.jpg)
Concurrency
[Chart: requests per second, hourly breakdown from 6am to 6am]
![Page 5: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/5.jpg)
Increasing Size-On-Disk
[Chart: size in GB by month, January to December]
![Page 6: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/6.jpg)
Scaling and PostgreSQL Behavior
![Page 7: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/7.jpg)
Size on Disk
![Page 8: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/8.jpg)
Tuples, Indexes, Overhead
![Page 9: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/9.jpg)
Table Size + Size of all combined Indexes
Relations Indexes
![Page 10: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/10.jpg)
Constraints
• Available Memory
• Disk Speed
• IO Bus Speed
![Page 11: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/11.jpg)
Keep it in memory.
![Page 12: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/12.jpg)
Get Fast Disks & I/O.
![Page 13: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/13.jpg)
Process Forking + Locks
![Page 14: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/14.jpg)
Client Connections
![Page 15: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/15.jpg)
One Connection per Concurrent Request
![Page 16: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/16.jpg)
Apache + PHP
One connection per backend for each pg_connect
![Page 17: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/17.jpg)
Python
One connection per connection*
![Page 18: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/18.jpg)
ODBC
One connection to Postgres per ODBC connection
![Page 19: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/19.jpg)
Master Process
Stats Collector
Autovacuum
WAL Writer
WAL Writer
Connection Backend Client Connection
Lock Contention?
Each backend for a connected client has to check for locks
![Page 20: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/20.jpg)
Master Process
Stats Collector
Autovacuum
WAL Writer
WAL Writer
Connection Backend Client Connection
Connection Backend Client Connection
New Client Connection?
• Access Share
• Row Share
• Row Exclusive
• Share Update Exclusive
• Share
• Share Row Exclusive
• Exclusive
• Access Exclusive
![Page 21: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/21.jpg)
Master Process
Stats Collector
Autovacuum
WAL Writer
WAL Writer
Connection Backend Client Connection
Connection Backend Client Connection
Connection Backend Client Connection
...
Too many connections?
Slow performance
![Page 22: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/22.jpg)
250 Apache Backends
× 1 Connection per Backend
× 250 Servers
= 62,500 Connections
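The arithmetic on this slide can be written out as a quick check. This is only a sketch: the function and parameter names are illustrative and do not come from the talk.

```python
def total_connections(backends_per_server: int, conns_per_backend: int, servers: int) -> int:
    """Worst-case number of simultaneous Postgres connections without pooling."""
    return backends_per_server * conns_per_backend * servers


if __name__ == "__main__":
    # Figures from the slide: 250 Apache backends x 1 connection x 250 servers.
    print(total_connections(250, 1, 250))  # 62500
```

Every factor multiplies the total, which is why a pooler that caps server-side connections matters more as the web tier grows.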
![Page 23: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/23.jpg)
Solvable Problems!
![Page 24: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/24.jpg)
The Trailblazers
![Page 25: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/25.jpg)
Solving Concurrency
![Page 26: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/26.jpg)
pgBouncer
![Page 27: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/27.jpg)
Session Pooling
![Page 28: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/28.jpg)
Transactional Pooling
![Page 29: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/29.jpg)
Statement Pooling
![Page 30: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/30.jpg)
Connection Pooling
Clients Clients Clients
Postgres Server #1
pgBouncer
Postgres Server #2
Postgres Server #3
Tens Tens Tens
Hundreds Hundreds Hundreds
![Page 31: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/31.jpg)
Add Local Pooling
Local pgBouncer Local pgBouncer Local pgBouncer
Postgres Server #1
pgBouncer
Postgres Server #2
Postgres Server #3
Clients Clients Clients
Tens Tens Tens
Hundreds Hundreds Hundreds
Tens Tens Tens
![Page 32: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/32.jpg)
Easy to run
    Usage: pgbouncer [OPTION]... config.ini
      -d, --daemon           Run in background (as a daemon)
      -R, --restart          Do a online restart
      -q, --quiet            Run quietly
      -v, --verbose          Increase verbosity
      -u, --user=<username>  Assume identity of <username>
      -V, --version          Show version
      -h, --help             Show this help screen and exit
![Page 33: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/33.jpg)
userlist.txt
    "username" "password"
    "foo" "bar"
![Page 34: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/34.jpg)
pgbouncer.ini
![Page 35: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/35.jpg)
Specifying Connections
    [databases]
    ; foodb over unix socket
    foodb =

    ; redirect bardb to bazdb on localhost
    bardb = host=localhost dbname=bazdb

    ; access to dest database will go with single user
    forcedb = host=127.0.0.1 port=300 user=baz password=foo client_encoding=UNICODE datestyle=ISO connect_query='SELECT 1'
![Page 36: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/36.jpg)
Base Daemon Config
    [pgbouncer]
    logfile = pgbouncer.log
    pidfile = pgbouncer.pid
    ; ip address or * which means all ip-s
    listen_addr = 127.0.0.1
    listen_port = 6432
    ; unix socket is also used for -R.
    ;unix_socket_dir = /tmp
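With this config, clients point at pgBouncer's listen address and port rather than at Postgres itself. A minimal sketch of building such a connection string; the helper name is hypothetical, and `foodb` / `foo` are simply the sample database and user from the earlier slides.

```python
def pgbouncer_dsn(dbname: str, user: str, host: str = "127.0.0.1", port: int = 6432) -> str:
    """Build a libpq-style connection string that targets pgBouncer, not Postgres."""
    # Defaults mirror listen_addr and listen_port from the slide's [pgbouncer] section.
    parts = {"host": host, "port": port, "dbname": dbname, "user": user}
    return " ".join(f"{key}={value}" for key, value in parts.items())


if __name__ == "__main__":
    # A driver that accepts libpq connection strings could be handed this value.
    print(pgbouncer_dsn("foodb", "foo"))
```

The application code otherwise stays the same; only the host and port it connects to change.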
![Page 37: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/37.jpg)
Authentication
    ; any, trust, plain, crypt, md5
    auth_type = trust
    #auth_file = 8.0/main/global/pg_auth
    auth_file = etc/userlist.txt
    admin_users = user2, someadmin, otheradmin
    stats_users = stats, root
![Page 38: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/38.jpg)
Stats Users?
    SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|VERSION
    SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM

    pgbouncer=# SHOW CLIENTS;
     type | user  | database  | state  | addr      | port  | local_addr | local_port | connect_time
    ------+-------+-----------+--------+-----------+-------+------------+------------+---------------------
     C    | stats | pgbouncer | active | 127.0.0.1 | 47229 | 127.0.0.1  | 6000       | 2011-09-13 17:55:46
* Truncated columns for display purposes
![Page 39: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/39.jpg)
psql 9.0+ Problem?
    psql -U stats -p 6432 pgbouncer
    psql: ERROR: Unknown startup parameter
Add to pgbouncer.ini:
ignore_startup_parameters = application_name
![Page 40: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/40.jpg)
Pooling Behavior
    pool_mode = statement

    server_check_query = select 1
    server_check_delay = 10

    max_client_conn = 1000
    default_pool_size = 20

    server_connect_timeout = 15
    server_lifetime = 1200
    server_idle_timeout = 60
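To get a feel for what these numbers mean, the sketch below reads the two sizing settings and reports how many client connections could end up sharing each server connection. It assumes a single user/database pool (pgBouncer sizes pools per user/database pair, while `max_client_conn` is a global cap), and the function name is illustrative.

```python
import configparser

SAMPLE = """
[pgbouncer]
pool_mode = statement
max_client_conn = 1000
default_pool_size = 20
"""


def fan_in(ini_text: str) -> float:
    """Clients per server connection if every client lands in one pool."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["pgbouncer"]
    return section.getint("max_client_conn") / section.getint("default_pool_size")


if __name__ == "__main__":
    print(fan_in(SAMPLE))  # 50.0
```

With several databases or users the real ratio is lower, since each pair gets its own pool of up to `default_pool_size` server connections.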
![Page 41: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/41.jpg)
Skytools
![Page 42: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/42.jpg)
Read Only Copy Read Only Copy Read Only Copy Read Only Copy
Load Balancer
pgBouncer
Canonical Database
Clients Clients Clients Clients
Scale-Out Reads
![Page 43: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/43.jpg)
PGQ
![Page 44: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/44.jpg)
The Ticker
![Page 45: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/45.jpg)
ticker.ini
    [pgqadm]
    job_name = pgopen_ticker
    db = dbname=pgopen

    # how often to run maintenance [seconds]
    maint_delay = 600

    # how often to check for activity [seconds]
    loop_delay = 0.1

    logfile = ~/Source/pgopen_skytools/%(job_name)s.log
    pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
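The `%(job_name)s` placeholders in this file use Python's configparser-style interpolation. The sketch below resolves them with the standard library only; it is not Skytools' own config loader, and the `~` expansion is an assumption about how the paths are meant to be read.

```python
import configparser
import os

TICKER_INI = """
[pgqadm]
job_name = pgopen_ticker
db = dbname=pgopen
maint_delay = 600
loop_delay = 0.1
logfile = ~/Source/pgopen_skytools/%(job_name)s.log
pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
"""


def resolve_paths(ini_text: str) -> dict:
    """Expand %(job_name)s and ~ in the logfile and pidfile settings."""
    parser = configparser.ConfigParser()  # default interpolation expands %(name)s
    parser.read_string(ini_text)
    section = parser["pgqadm"]
    return {key: os.path.expanduser(section[key]) for key in ("logfile", "pidfile")}


if __name__ == "__main__":
    for name, path in resolve_paths(TICKER_INI).items():
        print(name, path)
```

Because the file names derive from `job_name`, each job gets its own log and pid file as long as job names are unique.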
![Page 46: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/46.jpg)
Getting PGQ Running
Setup our ticker:
pgqadm.py ticker.ini install
Run the ticker daemon:
pgqadm.py ticker.ini ticker -d
![Page 47: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/47.jpg)
Londiste
![Page 48: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/48.jpg)
replication.ini
    [londiste]
    job_name = pgopen_to_destination

    provider_db = dbname=pgopen
    subscriber_db = dbname=destination

    # it will be used as sql ident so no dots/spaces
    pgq_queue_name = pgopen

    logfile = ~/Source/pgopen_skytools/%(job_name)s.log
    pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
![Page 49: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/49.jpg)
Install Londiste
londiste.py replication.ini provider install
londiste.py replication.ini subscriber install
![Page 50: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/50.jpg)
Start Replication Daemon
londiste.py replication.ini replay -d
![Page 51: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/51.jpg)
DDL?
![Page 52: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/52.jpg)
Add the Provider
Tables and Sequences
londiste.py replication.ini provider add public.auth_user
![Page 53: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/53.jpg)
Add the Subscriber
Tables and Sequences
londiste.py replication.ini subscriber add public.auth_user
![Page 54: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/54.jpg)
Great Success!
![Page 55: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/55.jpg)
PL/Proxy
![Page 56: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/56.jpg)
Scale-Out Reads & Writes
A-F Server G-L Server M-R Server S-Z Server
plProxy Server
![Page 57: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/57.jpg)
How does it work?
![Page 58: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/58.jpg)
Simple Remote Connection
    CREATE FUNCTION get_user_email(username text)
    RETURNS SETOF text AS $$
        CONNECT 'dbname=remotedb';
        SELECT email FROM users WHERE username = $1;
    $$ LANGUAGE plproxy;
![Page 59: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/59.jpg)
Sharded Request
    CREATE FUNCTION get_user_email(username text)
    RETURNS SETOF text AS $$
        CLUSTER 'usercluster';
        RUN ON hashtext(username);
    $$ LANGUAGE plproxy;
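A rough model of what `RUN ON hashtext(username)` does: PL/Proxy expects a power-of-two number of partitions and uses the low bits of the hash to pick one. The sketch below mimics that routing in Python. `zlib.crc32` is only a stand-in for PostgreSQL's `hashtext()`, so the partition chosen here will not match what a real cluster picks.

```python
import zlib

PARTITIONS = [
    "dbname=part00 host=127.0.0.1",
    "dbname=part01 host=127.0.0.1",
]


def pick_partition(username: str, partitions: list) -> str:
    """Route a key to one partition using the low bits of its hash."""
    count = len(partitions)
    if count == 0 or count & (count - 1):
        raise ValueError("partition count must be a power of two")
    # Stand-in hash: zlib.crc32, NOT PostgreSQL's hashtext().
    hashed = zlib.crc32(username.encode("utf-8"))
    return partitions[hashed & (count - 1)]


if __name__ == "__main__":
    for name in ("alice", "bob", "carol"):
        print(name, "->", pick_partition(name, PARTITIONS))
```

The same key always lands on the same partition, which is what lets both reads and writes for a user go to one server.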
![Page 60: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/60.jpg)
Sharding Setup
• Need 3 Functions:
• plproxy.get_cluster_partitions(cluster_name text)
• plproxy.get_cluster_version(cluster_name text)
• plproxy.get_cluster_config(in cluster_name text, out key text, out val text)
![Page 61: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/61.jpg)
get_cluster_partitions
    CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
    RETURNS SETOF text AS $$
    BEGIN
        IF cluster_name = 'usercluster' THEN
            RETURN NEXT 'dbname=part00 host=127.0.0.1';
            RETURN NEXT 'dbname=part01 host=127.0.0.1';
            RETURN;
        END IF;
        RAISE EXCEPTION 'Unknown cluster';
    END;
    $$ LANGUAGE plpgsql;
![Page 62: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/62.jpg)
get_cluster_version
    CREATE OR REPLACE FUNCTION plproxy.get_cluster_version(cluster_name text)
    RETURNS int4 AS $$
    BEGIN
        IF cluster_name = 'usercluster' THEN
            RETURN 1;
        END IF;
        RAISE EXCEPTION 'Unknown cluster';
    END;
    $$ LANGUAGE plpgsql;
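The version number is generally described as a cache key: PL/Proxy checks it per request and only re-reads the partition list when it changes. The sketch below models that idea in Python. The class and callables are hypothetical and are not part of PL/Proxy.

```python
class ClusterCache:
    """Reload the partition list only when the reported version changes."""

    def __init__(self, get_version, get_partitions):
        self._get_version = get_version
        self._get_partitions = get_partitions
        self._cached_version = None
        self._partitions = []

    def partitions(self, cluster_name: str) -> list:
        version = self._get_version(cluster_name)
        if version != self._cached_version:
            # Version changed (or first call): fetch the partition list again.
            self._partitions = list(self._get_partitions(cluster_name))
            self._cached_version = version
        return self._partitions


if __name__ == "__main__":
    cache = ClusterCache(
        get_version=lambda name: 1,
        get_partitions=lambda name: ["dbname=part00 host=127.0.0.1",
                                     "dbname=part01 host=127.0.0.1"],
    )
    print(cache.partitions("usercluster"))
```

In practice this means that after changing the partition list, the value returned by `get_cluster_version` should be bumped so the change is picked up.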
![Page 63: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/63.jpg)
get_cluster_config
    CREATE OR REPLACE FUNCTION plproxy.get_cluster_config(
        in cluster_name text,
        out key text,
        out val text)
    RETURNS SETOF record AS $$
    BEGIN
        -- lets use same config for all clusters
        key := 'connection_lifetime';
        val := 30*60; -- 30m
        RETURN NEXT;
        RETURN;
    END;
    $$ LANGUAGE plpgsql;
![Page 64: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/64.jpg)
get_cluster_config values
• connection_lifetime
• query_timeout
• disable_binary
• keepalive_idle
• keepalive_interval
• keepalive_count
![Page 65: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/65.jpg)
SQL/MED
![Page 66: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/66.jpg)
SQL/MED Cluster Definition
    CREATE SERVER a_cluster FOREIGN DATA WRAPPER plproxy
    OPTIONS (
        connection_lifetime '1800',
        disable_binary '1',
        p0 'dbname=part00 host=127.0.0.1',
        p1 'dbname=part01 host=127.0.0.1',
        p2 'dbname=part02 host=127.0.0.1',
        p3 'dbname=part03 host=127.0.0.1'
    );
![Page 67: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/67.jpg)
PL/Proxy + SQL/MED Behavior
• PL/Proxy will prefer SQL/MED cluster definitions over the plproxy.get_* functions
• PL/Proxy will fall back to plproxy.get_* functions if there are no SQL/MED clusters
![Page 68: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/68.jpg)
SQL/MED User Mapping
CREATE USER MAPPING FOR bob SERVER a_cluster OPTIONS (user 'bob', password 'secret');
CREATE USER MAPPING FOR public SERVER a_cluster OPTIONS (user 'plproxy', password 'foo');
![Page 69: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/69.jpg)
plproxyrc
https://github.com/myYearbook/plproxyrc
• PL/pgSQL-based API for table-based management of PL/Proxy
• Used to manage complicated PL/Proxy infrastructure @myYearbook
• BSD Licensed
![Page 70: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/70.jpg)
Postgres Server #1
Postgres Server #2
Postgres Server #3
pgBouncer
“Server-to-Server”
![Page 71: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/71.jpg)
![Page 72: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/72.jpg)
Complex PL/Proxy and pgBouncer Environment
Local pgBouncer
Local pgBouncer
Local pgBouncer
Postgres Server #1
pgBouncer
Postgres Server #3
Clients
Clients
Clients pgBouncer
Load Balancer
plProxy Server plProxy Server
Load Balancer
pgBouncer
pgBouncer
Postgres Server #3
![Page 73: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/73.jpg)
Other Tools and Methods?
![Page 74: Scaling PostgreSQL with Skytools](https://reader034.fdocuments.net/reader034/viewer/2022051412/54c6ac354a7959ba518b45c5/html5/thumbnails/74.jpg)
Questions?