Countdown to Postgres v9.6 – Parallel Query (May 2016)

Post on 17-Feb-2017



Welcome Parallelism to PostgreSQL

Thursday, 19 May 2016

• Current State of Parallelism in PostgreSQL

• What was needed to bring server-side parallelism – work done in v9.4 and v9.5

• Parallel Query in v9.6

• Review some parallel plans

• Parallelism may not always be used

• Parallelism may not always be useful

• Parameters

• Benefits

• Questions

Agenda


• Client-side parallelism – an application can open multiple sessions

• A batch can be run with multiple application threads

• Server side languages can potentially do parallel operations

• I/O activity is offloaded from the main query execution process to the walwriter and bgwriter

• effective_io_concurrency allows page prefetch requests to be issued to the kernel during bitmap heap scans

• But there is no server-side parallelism for dividing a single task among multiple workers

Current State (v9.5) of Parallelism in PostgreSQL

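As a point of reference, the prefetching mentioned above is controlled in postgresql.conf; the value below is purely illustrative, not a recommendation:

```
# postgresql.conf -- illustrative value, not a tuning recommendation
# Number of concurrent prefetch requests issued to the kernel
# during bitmap heap scans; 0 disables prefetching.
effective_io_concurrency = 4
```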

v9.4

• Dynamic background workers

• Dynamic shared memory

• Implementation of shared memory message queues

v9.5

• Message propagation, i.e. error messages from a background worker can be sent to and received by the master

• Synchronization of state (GUC values, XID, CID mapping, current user, current database, etc.)

• Parallel Contexts can be used by backend code to launch worker processes

A lot of work was needed and was done!


• Parallel Sequential Scan

• Parallel Joins

• Parallel Aggregates

• Though these are not yet in their best form and have certain exceptions/limitations, they still work and are quite useful!

v9.6: We have something that users can use!


Basically how parallelism is supposed to work


Let’s look at some plans

Sequential Scan without Parallelism

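The original slide shows a screenshot of the plan; without parallelism, a plan for such a scan has roughly this shape (pgbench table assumed, costs omitted as they vary):

```sql
EXPLAIN SELECT count(*) FROM pgbench_accounts;
--  Aggregate
--    ->  Seq Scan on pgbench_accounts
```

Every row of the table is read and counted by a single backend process.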

Parallel Sequential Scans

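Again the slide is a screenshot; with workers allowed, a plan of roughly this shape appears (illustrative session, costs omitted):

```sql
SET max_parallel_degree = 2;  -- 9.6 beta name for workers per Gather node
EXPLAIN SELECT * FROM pgbench_accounts WHERE filler = 'x';
--  Gather
--    Workers Planned: 2
--    ->  Parallel Seq Scan on pgbench_accounts
--          Filter: (filler = 'x'::bpchar)
```

The Gather node collects tuples from the workers; the leader process also participates in the scan.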

You may not get as many workers as you desire

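EXPLAIN ANALYZE reports both planned and launched workers; if the pool limited by max_worker_processes is nearly exhausted, you may see something like this (numbers illustrative):

```sql
EXPLAIN ANALYZE SELECT * FROM pgbench_accounts WHERE filler = 'x';
--  Gather
--    Workers Planned: 4
--    Workers Launched: 1
--    ->  Parallel Seq Scan on pgbench_accounts
```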

Parallel Aggregate

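A parallel aggregate plan splits the aggregation into partial and finalize steps, roughly like this (costs omitted):

```sql
EXPLAIN SELECT count(*) FROM pgbench_accounts;
--  Finalize Aggregate
--    ->  Gather
--          Workers Planned: 2
--          ->  Partial Aggregate
--                ->  Parallel Seq Scan on pgbench_accounts
```

Each worker computes a partial count over its share of the table; the leader combines the partial results in the Finalize step.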

Parallel Joins

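A parallel join in 9.6 looks roughly like this (pgbench tables assumed, costs omitted):

```sql
EXPLAIN SELECT count(*)
FROM pgbench_accounts a JOIN pgbench_branches b ON a.bid = b.bid;
--  ...
--    ->  Gather
--          ->  Hash Join
--                Hash Cond: (a.bid = b.bid)
--                ->  Parallel Seq Scan on pgbench_accounts a
--                ->  Hash
--                      ->  Seq Scan on pgbench_branches b
```

Note that in 9.6 only the outer side is scanned in parallel; each worker builds its own copy of the inner hash table.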

Wow! So using ‘Parallel Workers’ should be preferred!

No, not really!

Parallel Query May not be used all the time

• The cost of coordinating among multiple worker processes defeats the advantage of parallelism

• The cost of setting up the parallelism infrastructure is too high for the query

• No worker process is available


Example


Parallel Query may not be good all the time


Parallel Query may not be good all the time

• It depends a lot on your hardware resources and process scheduling by your OS

• I tried various degrees of parallelism on a test machine:
• 3 CPUs, 3 GB RAM
• VM running CentOS
• Single I/O disk

• A simple ‘count’ on a table with 100 million rows and 8-byte width:
• explain analyze select count(*) from pgbench_accounts;

• It performs faster with the parallel degree set to 0, as an index scan is performed instead

• Make sure you have tuned your parameters well to help the optimizer decide

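A sketch of the comparison described above (an illustrative session; timings will differ on other hardware):

```sql
SET max_parallel_degree = 0;
EXPLAIN ANALYZE SELECT count(*) FROM pgbench_accounts;
-- planner picks an index scan; fastest on this particular machine

SET max_parallel_degree = 4;
EXPLAIN ANALYZE SELECT count(*) FROM pgbench_accounts;
-- parallel aggregate plan; slower here with only 3 CPUs and a single disk
```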

Parameters Involved

Parameters which govern parallel query execution

• parallel_setup_cost

• parallel_tuple_cost

• max_worker_processes

• max_parallel_degree

• force_parallel_mode

• ALTER TABLE … SET (parallel_degree=n)

• ALTER FUNCTION … PARALLEL SAFE

• ALTER FUNCTION … COST

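The knobs listed above can be exercised roughly as follows (parameter names as in the 9.6 beta; my_func is a hypothetical user-defined function used only for illustration):

```sql
-- Session/instance settings:
SET max_parallel_degree = 2;        -- workers allowed per Gather node
SET force_parallel_mode = on;       -- testing aid; not for production use

-- Per-table override of the planner's choice of worker count:
ALTER TABLE pgbench_accounts SET (parallel_degree = 4);

-- Mark a user-defined function (my_func is hypothetical) as safe to
-- execute inside a parallel worker, and adjust its estimated cost:
ALTER FUNCTION my_func(int) PARALLEL SAFE;
ALTER FUNCTION my_func(int) COST 1000;
```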

Benefits to the users

• Sequential scan on large tables would be faster

• Analytics workloads involving aggregates would be faster

• Faster JOINs between large tables

• PostgreSQL v9.6 can be a good candidate for the backend database of a data warehouse

• More parallel operations to come in future releases


What can you do?

• PostgreSQL 9.6 Beta 1 is out

• Try it out…

• Test it…

• Break it…

• Report it

• Help PostgreSQL community make it better


Further Reading

• PGCon 2014: Implementing Parallelism in PostgreSQL, Robert Haas

• PGConf.US, 2016: PostgreSQL 9.6, Magnus Hagander

• PGCon, Ottawa 2015: Parallel Sequential Scan, Robert Haas and Amit Kapila

• EnterpriseDB Blog: Parallelism Progress, Robert Haas

• Parallel Sequential Scan is Committed, Robert Haas

• EnterpriseDB Blog: Parallelism Becomes a Reality in Postgres, Amit Kapila


Send us your suggestions and questions

success@ashnik.com

Stay Tuned!

Website: www.ashnik.com