PostgreSQL 9 - Hagander · PostgreSQL 9.6. Development schedule June 30, 2015 - branch 9.5 July...

48
PostgreSQL 9.6 PGConf,UK 2016 London, UK Magnus Hagander [email protected]

Transcript of PostgreSQL 9 - Hagander · PostgreSQL 9.6. Development schedule June 30, 2015 - branch 9.5 July...

PostgreSQL 9.6PGConf,UK 2016

London, UK

Magnus Hagander [email protected]

Magnus HaganderRedpill Linpro

Infrastructure servicesPrincipal database consultant

PostgreSQLCore Team memberCommitterPostgreSQL Europe

PostgreSQL 9.6

Development scheduleJune 30, 2015 - branch 9.5July 2015 - CF1September 2015 - CF2November 2015 - CF3January 2016 - CF4March 2016 - CF5June 2016 - Beta2!

Current statusBeta 2

Testing and fixesMay still be removedPlease help!

New FeaturesDBA and administrationDeveloper and SQL featuresReplication and backupPerformance

idle in transaction timeoutSimple: kill idle in transaction sessions

postgres=# set idle_in_transaction_session_timeout = 5000;SETpostgres=# begin;BEGINpostgres=# FATAL: terminating connection due to idle­in­transaction

pg_stat_activityNow has much better wait informationNot just a booleanwaiting column is now gone

Update your scripts!

pg_stat_activitypostgres=# SELECT * FROM pg_stat_activity WHERE wait_event IS NOT NULL­[ RECORD 1 ]­­­­+­­­­­­­­­­­­­­­­­­­­­­­­­­­­­­pid | 4026...state_change | 2016­04­14 14:33:10.621561+02wait_event_type | Lockwait_event | transactionidstate | active...query | select * from a for update;

pg_blocking_pidsReturns array of pids that are blocking xUse on a process in waiting state

Shows who to blamepostgres=# select * from pg_blocking_pids(4026); pg_blocking_pids ­­­­­­­­­­­­­­­­­­ 4021(1 row)

Utility command progresspostgres=# SELECT * FROM pg_stat_progress_vacuum ;­[ RECORD 1 ]­­­­­­+­­­­­­­­­­­­­­pid | 4021datid | 12407datname | postgresrelid | 16402phase | scanning heapheap_blks_total | 4425heap_blks_scanned | 27heap_blks_vacuumed | 0index_vacuum_count | 0max_dead_tuples | 291num_dead_tuples | 0

System informationView: pg_config

Same info as binary pg_configFunctions: pg_control_*

Same info as pg_controldata

Vacuum of frozen pagesTrack all-frozen pagesAvoid VACUUM on all-frozen pages

Anti-wraparound autovacManual freezeCOPY FREEZE

Much lighter on mostly-read tables

postgres_fdwUse remote extensions

Whitelist per serverManually install on remote!Use functions/operators locally

ALTER SERVER foo OPTIONS (extensions 'pgcrypto,tablefunc')

New FeaturesDBA and administrationDeveloper and SQL featuresReplication and backupPerformance

Phrase searchingpostgres=# SELECT plainto_tsquery('quick fox') @@ to_tsvector('the quick brown fox jumped'); ?column? ­­­­­­­­­­ t(1 row)

postgres=# SELECT phraseto_tsquery('quick fox') @@ to_tsvector('the quick brown fox jumped'); ?column? ­­­­­­­­­­ f(1 row)

Phrase searchingpostgres=# SELECT tsquery('quick <­> fox') @@ to_tsvector('the quick brown fox jumped'); ?column? ­­­­­­­­­­ f(1 row)

postgres=# SELECT tsquery('quick <2> fox') @@ to_tsvector('the quick brown fox jumped'); ?column? ­­­­­­­­­­ t(1 row)

New FeaturesDBA and administrationDeveloper and SQL featuresReplication and backupPerformance

wal_level=replicaSame as old hot_standbyarchive has been retired

If specified, maps to replica

pg_stat_wal_receiverOn standbys only"Mirror" of pg_stat_replicationZero or one rows

Replication slotspg_basebackup

Can now create slotOnly used for replication

pg_create_physical_replication_slotCan now reserve WAL directly

Multiple sync standbysRequires more than one server to ack commitIncrease availability in case of multi-node failure

synchronous_standby_names = 'node1'

synchronous_standby_names = '3 (node1, node2, node3, node4)'

synchronous_commit ='remote_apply'

Waits for full WAL apply on standbySlower than 'on'

But not necessarily muchGuarantees data available for slave readCan be combined with multiple sync

New FeaturesDBA and administrationDeveloper and SQL featuresReplication and backupPerformance

Faster time datatypes outputtimestamp, date and timeMuch faster output functionsCopy up to 2x faster!

Single table, single column timestamp

Locking changesEven more...For high concurrency loadsAlso better tracing

Relation extensionUsed to extend by one block

Much blocking in write intensive loadsNow extends multiple blocks at once

20 * number of waiters

Checkpoint sortingI/O at checkpoints no longer random

Sorted by tablespaceThen relfilenodeThen forkThen block

Much more sequential writing

Kernel writeback configIssues with large write cachesOS would buffer writes "too long"And flush all at once

Causing I/O stormsCould be configured on global level

/proc/sys/vm/dirty_background_ratio etc

Kernel writeback configNow configurable in postgresql.confPlatform dependentEnabled by default on Linux only

for nowUsually better to "flush early"

Exception workload:Bigger than shared_buffersSmaller than OS cache

Kernel writeback configcheckpoint_flush_aer

Default: 256Kbbgwriter_flush_aer

Default: 512Kbbackend_flush_aer

Default: 128Kb

postgres_fdwControl fetch_size

Per table or per server(Used to be 100)

postgres_fdwPush down joins

Normal joinsNot anti/semi

Push down orderingTriggers remote ORDER BY

Make direct updates and deletesNo SELECT FOR UPDATE

Parallelism

ParallelismCPU intensive workloadsPreviously, single query=single coreBut we have many cores now...

ParallelismMany different partsMany still remainingBut already very useful!

Parallel seq scansScan a single table using multiple workersIncrease throughputFunctions can be pushed down

Filtering functionsTarget functionsIf marked parallel safe

Foundation for many others

Parallel aggregatesAggregates oen CPU-boundPartial aggregation in workerFinal combination in parentRequires aggregate-specific support

Most built-inExcept string, json, xml, arraysAnd not ordered-sets

Parallel joinsBased in parallel seq scanEach "partition" joined individually

In a separate workerNot all joins

>Only NestLoop and HashOther restrictions

Controlling parallelismmax_worker_processes = n

Globalmax_parallel_degree = n

Max per individual queryLimited by max_worker_processes

Controlling parallelismparallel_setup_cost = nparallel_tuple_cost= nforce_parallel_mode = n

Controlling parallelismALTER TABLE .. SET (parallel_degree = n)

Default determines by relation sizeALTER FUNCTION .. PARALLEL SAFEALTER FUNCTION ... COST

Ok, one last thingAnybody used Oracle?

ORA­01555: snapshot too old

Yup, we have that now

Snapshot too oldConfigured by timeTerminates old transactions

If repeatable_read or higherPrevents bloat buildup

old_snapshot_threshold = <minutes>Default is off

postgres=# SELECT * FROM c;ERROR: snapshot too old

There's always moreLots of smaller fixesPerformance improvementsetc, etcCan't mention them all!

What's your biggest feature?ParallelismVacuum freezeSnapshot Too OldMultiple sync standbyspostgres_fdw improvementsWait/lock monitoringOther?

Thank you!Magnus Hagander

[email protected] @magnushagander

http://www.hagander.net/talks/

This material is licensed