Introduction to PostgreSQL

Introduction to PostgreSQL November, 2014 Creative Commons Attribution License

description

This is an introduction to PostgreSQL that provides a brief overview of PostgreSQL's architecture, features, and ecosystem. It was delivered at NYLUG on Nov 24, 2014. http://www.meetup.com/nylug-meetings/events/180533472/

Transcript of Introduction to PostgreSQL

Page 1: Introduction to PostgreSQL

Introduction to PostgreSQL

November, 2014

Creative Commons Attribution License

Page 2: Introduction to PostgreSQL

Who Are We?

● Jim Mlodgenski
– [email protected]
– @jim_mlodgenski
– Co-organizer of NYCPUG - www.nycpug.org
– Director, PgUS - www.postgresql.us
– CTO, OpenSCG - www.openscg.com

● Jonathan S. Katz
– [email protected]
– @jkatz05
– Co-organizer of NYCPUG - www.nycpug.org
– Director, PgUS - www.postgresql.us
– CTO, VenueBook - www.venuebook.com

Page 3: Introduction to PostgreSQL
Page 4: Introduction to PostgreSQL

History

● The world’s most advanced open source database
● Designed for extensibility and customization
● ANSI/ISO compliant SQL support
● Actively developed for almost 30 years
– University POSTGRES (1986-1993)
– Postgres95 (1994-1995)
– PostgreSQL (1996-2014)

Page 5: Introduction to PostgreSQL

Timeline

“Over the past few years, PostgreSQL has become the preferred open source relational database for many enterprise developers and start-ups, powering leading geospatial and mobile applications.” – Jeff Barr, Chief Evangelist, Amazon Web Services

Page 6: Introduction to PostgreSQL

Why PostgreSQL

Affordability

Technology

Security

Flexibility

Stability

Extensibility

Reliability

Predictability

Community

Auditability

Page 7: Introduction to PostgreSQL

Technology

● Full Featured Database
– Mature Server Side Programming Functionality
– Hot Standby High Availability
– Online Backups
– Point In Time Recovery
– Table Partitioning
– Spatial Functionality
– Full Text Search

Page 8: Introduction to PostgreSQL

Security

● Object-level privileges assigned to “Roles & Users”
● Row Level Security
● Many authentication mechanisms
– Kerberos
– LDAP
– PAM
– GSSAPI
● Native SSL support
● Data-level encryption (AES, 3DES, etc.)
● Ability to utilize 3rd party key stores in a full PKI infrastructure
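As a minimal sketch of object-level privileges, the statements below grant read access on one table to a group role and let a login role inherit it. The role names, the password, and the `invoices` table are hypothetical and exist only for this illustration.

```sql
-- Hypothetical group role (cannot log in) and a login role that is a member of it.
CREATE ROLE reporting NOLOGIN;
CREATE ROLE alice LOGIN PASSWORD 'change-me' IN ROLE reporting;

-- Hypothetical table to protect.
CREATE TABLE invoices (
    id       serial PRIMARY KEY,
    customer text    NOT NULL,
    total    numeric NOT NULL
);

-- Members of "reporting" may read the table but not modify it.
GRANT SELECT ON invoices TO reporting;

-- Check what alice can do through her inherited membership.
SELECT has_table_privilege('alice', 'invoices', 'SELECT') AS can_read,
       has_table_privilege('alice', 'invoices', 'INSERT') AS can_write;
```

Granting to a group role rather than to individual users keeps the privilege list short as people join or leave a team.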

Page 9: Introduction to PostgreSQL

Flexibility

● No Vendor Lock-in
– Compliant with the ANSI SQL standard
– Runs on all major platforms using all major languages and middleware
● “BSD-like” license – PostgreSQL License
– Allows businesses to retain the option of commercializing the final product with minimal legal issues
– No fear of “Open Source Viral Infection”

Page 10: Introduction to PostgreSQL

Predictability

● Predictable release cycles
– The average span between major releases over the last 10 years is 13 months
● Quick turnaround on patches
– The average span between minor releases over the last 5 years is 3 months

Version   Release Date
7.3       Nov-02
7.4       Nov-03
8.0       Jan-05
8.1       Nov-05
8.2       Dec-06
8.3       Feb-08
8.4       Jul-09
9.0       Aug-10
9.1       Sep-11
9.2       Sep-12
9.3       Sep-13
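The 13-month figure can be checked against the release table with a short query. This is only a sketch: the slide gives month and year, so the day of month (the 1st) is an assumption, and the window used here is the whole table (7.3 through 9.3) rather than exactly ten calendar years.

```sql
-- Release dates copied from the table; the day of month is assumed to be the 1st.
WITH releases(version, released) AS (
    VALUES ('7.3', DATE '2002-11-01'),
           ('7.4', DATE '2003-11-01'),
           ('8.0', DATE '2005-01-01'),
           ('8.1', DATE '2005-11-01'),
           ('8.2', DATE '2006-12-01'),
           ('8.3', DATE '2008-02-01'),
           ('8.4', DATE '2009-07-01'),
           ('9.0', DATE '2010-08-01'),
           ('9.1', DATE '2011-09-01'),
           ('9.2', DATE '2012-09-01'),
           ('9.3', DATE '2013-09-01')
),
span AS (
    -- Total time from the first to the last release, and the number of gaps between them.
    SELECT age(max(released)::timestamp, min(released)::timestamp) AS total,
           count(*) - 1 AS gaps
    FROM releases
)
SELECT (extract(year FROM total) * 12 + extract(month FROM total)) / gaps
       AS avg_months_between_majors
FROM span;
```

November 2002 to September 2013 is 130 months across 10 gaps, which gives the 13-month average quoted on the slide.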

Page 11: Introduction to PostgreSQL

Community

● Strong Open Source Community
● Independent & Thriving Development Community
– 10+ committers and ~200 reviewers
– 1,500 contributors and 10,000+ members
● Millions of downloads per year
● PostgreSQL is a meritocracy
– Influence comes through the merits (usually technical) of the contributor

Page 12: Introduction to PostgreSQL

Who's Using PostgreSQL

Page 13: Introduction to PostgreSQL

PostgreSQL Success Stories

“…With PostgreSQL we have been successful in growing the databases as the company has grown, both in number of users and in the complexity of services we offer…”
Hannu Krosing – Database Architect, Skype Technologies

“We manage multiple terabytes of data in more than 50 unique production PostgreSQL databases.”

Cisco uses PostgreSQL as the embedded database in all its “Carrier Sensitive Routing” (CSR) products to store carrier details, rules, contacts, and routes, and to perform call routing.

“…Fujitsu is proud of its sponsorship of contributions to PostgreSQL and of its work with the PostgreSQL community. We are committed to helping make PostgreSQL the leading Database Management System…”
Takayuki Nakazawa – Director Database in Software Group

Page 14: Introduction to PostgreSQL

Database 101

● A database stores data
● Clients ( people or applications ) input data into tables ( relations ) in the database and retrieve data from it
● Relational Database Management Systems are responsible for managing the safe storage of data
● RDBMSs are designed to store data in an ACID-compliant way ( all or nothing )
– This is done via transactions

Page 15: Introduction to PostgreSQL

Database 101 - (ACID)

● Atomic – Store data in an 'all-or-nothing' approach

● Consistent – Give me a consistent picture of the data

● Isolated – Prevent concurrent data access from causing me woe

● Durable – When I say 'COMMIT;', make sure the data is safe until I explicitly destroy it

Page 16: Introduction to PostgreSQL

Database 101 - (Transactions)

● All or nothing
● A transaction has
– A Beginning ( BEGIN; )
– Work ( multiple lines of SQL, e.g. INSERT / UPDATE / DELETE )
– An Ending ( END; ), where you would expect one of two cases
    ● COMMIT; ( save everything )
    ● ROLLBACK; ( undo all changes, save nothing )
– Once the transaction has ended, it will have made either ALL of the changes between BEGIN; and COMMIT; or NONE of them ( if there is an error, for example )
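The all-or-nothing behavior can be sketched with two short transactions. The `accounts` table and its values are hypothetical and serve only as an illustration.

```sql
-- Hypothetical table for the illustration.
CREATE TABLE accounts (
    id      integer PRIMARY KEY,
    balance numeric NOT NULL CHECK (balance >= 0)
);
INSERT INTO accounts VALUES (1, 100), (2, 50);

-- Case 1: COMMIT. Both updates are saved together.
BEGIN;
UPDATE accounts SET balance = balance - 30 WHERE id = 1;
UPDATE accounts SET balance = balance + 30 WHERE id = 2;
COMMIT;

-- Case 2: ROLLBACK. The delete is undone and nothing is saved.
BEGIN;
DELETE FROM accounts;
ROLLBACK;

-- Both rows are still present, with the balances left by the committed transfer.
SELECT id, balance FROM accounts ORDER BY id;
```

Other sessions never see the intermediate state in which money has left account 1 but not yet reached account 2.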

Page 17: Introduction to PostgreSQL

PostgreSQL 101

● PostgreSQL meets all of the requirements to be a fully ACID-compliant, transactional database.
● PostgreSQL RDBMS serves a cluster, aka an instance.
– An instance serves one ( and only one ) TCP/IP port
– Contains at least one database
– Has an associated data directory
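The properties of an instance listed above can be inspected from any client session. This is a small sketch; depending on the server version and configuration, `data_directory` may be visible only to privileged roles.

```sql
-- The single TCP/IP port this instance listens on.
SHOW port;

-- The data directory associated with this instance (may require a privileged role).
SHOW data_directory;

-- The databases contained in this instance; there is always at least one.
SELECT datname FROM pg_database ORDER BY datname;
```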

Page 18: Introduction to PostgreSQL

Major Features

● Full network client-server architecture
● ACID compliant
● Transactional ( uses WAL / REDO )
● Partitioning
● Tiered storage via tablespaces
● Multiversion Concurrency Control ( readers don't block writers )
● On-line maintenance operations
● Hot ( read-only ) and Warm ( quick-promote ) standby
● Log-based and trigger-based replication
● SSL
● Full-text search
● Procedural languages
– PL/pgSQL plus other, custom languages

Page 19: Introduction to PostgreSQL

General Limitations

Limit                      Value
Maximum Database Size      Unlimited
Maximum Table Size         32 TB
Maximum Row Size           1.6 TB
Maximum Field Size         1 GB
Maximum Rows / Table       Unlimited
Maximum Columns / Table    250-1600 ( depending on column types )
Maximum Indexes / Table    Unlimited

Page 20: Introduction to PostgreSQL

Client Architecture

Page 21: Introduction to PostgreSQL

Server Overview

● PostgreSQL utilizes a multi-process architecture
● Similar to Oracle's 'Dedicated Server' mode
● Types of processes
– Primary ( postmaster )
– Per-connection backend process
– Utility ( maintenance processes )

Page 22: Introduction to PostgreSQL

Server Architecture

Page 23: Introduction to PostgreSQL

Process Components

Page 24: Introduction to PostgreSQL

Memory Components

Page 25: Introduction to PostgreSQL

On-Disk Components

Page 26: Introduction to PostgreSQL

Data Types

● Building blocks of a schema
● Optimized on-disk format for a specific type of data
● PostgreSQL provides:
– Wide array (no pun intended) of basic to complex data types
– Functional interfaces for ease of manipulation
– Ability to extend and create custom data types

Page 27: Introduction to PostgreSQL

Number Types

Name               Storage Size   Range
smallint           2 bytes        -32768 to +32767
integer            4 bytes        -2147483648 to +2147483647
bigint             8 bytes        -9223372036854775808 to +9223372036854775807
decimal            variable       up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
numeric            variable       up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
real               4 bytes        6 decimal digits precision
double precision   8 bytes        15 decimal digits precision
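The practical difference between the exact and the floating-point types can be sketched with a few one-line queries. The literals are arbitrary examples chosen for illustration.

```sql
-- numeric is exact: decimal fractions add up as expected.
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric AS numeric_is_exact;

-- double precision is binary floating point: the same sum is not exactly 0.3.
SELECT 0.1::double precision + 0.2::double precision = 0.3::double precision AS float_is_exact;

-- Integer types are range-checked: going past the smallint range raises
-- an "out of range" error instead of silently wrapping around.
SELECT 32767::smallint + 1::smallint;
```

For money and other values that must add up exactly, numeric is usually the safer choice; real and double precision trade exactness for speed and compact storage.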

Page 28: Introduction to PostgreSQL

Character Types

Name Description

varchar(n) variable-length with limit

char(n) fixed-length, blank padded

text variable unlimited length

Page 29: Introduction to PostgreSQL

Date/Time Types

Name                          Size       Range                                   Resolution
timestamp without time zone   8 bytes    4713 BC to 294276 AD                    1 microsecond / 14 digits
timestamp with time zone      8 bytes    4713 BC to 294276 AD                    1 microsecond / 14 digits
date                          4 bytes    4713 BC to 5874897 AD                   1 day
time without time zone        8 bytes    00:00:00 to 24:00:00                    1 microsecond / 14 digits
time with time zone           12 bytes   00:00:00+1459 to 24:00:00-1459          1 microsecond / 14 digits
interval                      12 bytes   -178000000 years to 178000000 years     1 microsecond / 14 digits

Page 30: Introduction to PostgreSQL

Specialized Types

Name          Storage Size                                 Range / Description
boolean       1 byte                                       false to true
smallserial   2 bytes                                      1 to 32767
serial        4 bytes                                      1 to 2147483647
bigserial     8 bytes                                      1 to 9223372036854775807
bytea         1 or 4 bytes plus size of binary string      variable-length binary string
cidr          7 or 19 bytes                                IPv4 or IPv6 networks
inet          7 or 19 bytes                                IPv4 or IPv6 hosts or networks
macaddr       6 bytes                                      MAC addresses
uuid          16 bytes                                     Universally Unique Identifiers

Page 31: Introduction to PostgreSQL

“Schema-less” Types

Name Description

xml stores XML data and checks the input values for well-formedness

hstore stores sets of key/value pairs

json stores an exact copy of the input JSON document

jsonb stores a decomposed binary format of the input JSON document
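The difference between json and jsonb described in the table can be sketched as follows. The `events` table and its documents are hypothetical, and jsonb assumes PostgreSQL 9.4 or later.

```sql
-- Hypothetical table holding one JSON document per row.
CREATE TABLE events (
    id      serial PRIMARY KEY,
    payload jsonb NOT NULL
);

INSERT INTO events (payload) VALUES
    ('{"type": "login",    "user": "alice", "tags": ["web"]}'),
    ('{"type": "purchase", "user": "bob",   "total": 42.5}');

-- ->> extracts a field as text; @> tests whether the document contains the given fragment.
SELECT payload->>'user' AS username
FROM events
WHERE payload @> '{"type": "login"}';

-- json keeps the input text verbatim (whitespace and duplicate keys included);
-- jsonb stores a decomposed form, so only the last duplicate key survives.
SELECT '{"a": 1,   "a": 2}'::json  AS stored_as_json,
       '{"a": 1,   "a": 2}'::jsonb AS stored_as_jsonb;
```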

Page 32: Introduction to PostgreSQL

Range Types

● Represents a range of an element type
– Integers
– Numerics
– Times
– Dates
– And more...
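A few one-line queries sketch how range values behave. The dates and numbers are arbitrary examples; ranges built with the two-argument constructors include the lower bound and exclude the upper bound.

```sql
-- && tests for overlap: these two trips share March 16.
SELECT daterange('2012-03-12', '2012-03-17') && daterange('2012-03-16', '2012-03-18') AS overlaps;

-- Because the upper bound is exclusive, back-to-back ranges do not overlap.
SELECT daterange('2012-03-12', '2012-03-17') && daterange('2012-03-17', '2012-03-20') AS overlaps;

-- @> tests whether a range contains a single element.
SELECT int4range(1, 5) @> 3 AS contains_three;
```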

Page 33: Introduction to PostgreSQL

Range Types

CREATE TABLE travel_log (
    id serial PRIMARY KEY,
    name varchar(255),
    trip_range daterange,
    -- no two rows may have overlapping trip ranges
    EXCLUDE USING gist (trip_range WITH &&)
);

INSERT INTO travel_log (name, trip_range) VALUES ('Chicago', daterange('2012-03-12', '2012-03-17'));

INSERT INTO travel_log (name, trip_range) VALUES ('Austin', daterange('2012-03-16', '2012-03-18'));

ERROR: conflicting key value violates exclusion constraint "travel_log_trip_range_excl"
DETAIL: Key (trip_range)=([2012-03-16,2012-03-18)) conflicts with existing key (trip_range)=([2012-03-12,2012-03-17)).

Page 34: Introduction to PostgreSQL

Indexes

● Enhance database performance
● Enforce some types of constraints
– Uniqueness
– Exclusion
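Both roles of an index can be sketched on one small table. The `members` table, its columns, and the index names are hypothetical.

```sql
-- Hypothetical table for the illustration.
CREATE TABLE members (
    id     serial PRIMARY KEY,
    email  text NOT NULL,
    joined date NOT NULL
);

-- Performance: a plain (B-tree) index the planner can use for lookups by date.
CREATE INDEX members_joined_idx ON members (joined);

-- Constraint: a unique index on an expression rejects emails that differ only by case.
CREATE UNIQUE INDEX members_email_lower_key ON members (lower(email));

-- EXPLAIN shows the plan the planner picks; on a tiny table it may still
-- prefer a sequential scan over the index.
EXPLAIN SELECT * FROM members WHERE joined = DATE '2014-11-24';
```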

Page 35: Introduction to PostgreSQL

Index Types

● B-Tree
● Generalized Inverted Index (GIN)
● Generalized Search Tree (GiST)
● Space-Partitioned Generalized Search Tree (SP-GiST)

Coming Soon...
● Block Range Index (BRIN)
● “VODKA”
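The index type is chosen with the USING clause of CREATE INDEX. As a sketch, the GIN index below supports full-text search; the `articles` table and the search term are hypothetical.

```sql
-- Hypothetical table for the illustration.
CREATE TABLE articles (
    id   serial PRIMARY KEY,
    body text NOT NULL
);

-- A GIN index over the text-search representation of the body.
CREATE INDEX articles_body_fts_idx
    ON articles USING gin (to_tsvector('english', body));

-- The query must use the same expression for the index to be considered.
SELECT id
FROM articles
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'replication');
```

Omitting USING gives a B-tree index, which is the right default for equality and range comparisons on scalar values.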

Page 36: Introduction to PostgreSQL

Procedural Languages

● Allow user-defined functionality to be run within the database
– Used as functions or triggers
● Frequent use cases
– Enhance performance
– Increase security
– Centralize business logic
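A minimal sketch of a PL/pgSQL function used as a trigger, here to centralize one small piece of business logic. The `notes` table, the function, and the trigger name are hypothetical.

```sql
-- Hypothetical table for the illustration.
CREATE TABLE notes (
    id         serial PRIMARY KEY,
    body       text NOT NULL,
    updated_at timestamptz
);

-- Trigger function: stamp every inserted or updated row with the current time.
CREATE FUNCTION touch_updated_at() RETURNS trigger AS $$
BEGIN
    NEW.updated_at := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Run the function before each row is written.
CREATE TRIGGER notes_touch
    BEFORE INSERT OR UPDATE ON notes
    FOR EACH ROW EXECUTE PROCEDURE touch_updated_at();

-- The application does not set updated_at; the database fills it in.
INSERT INTO notes (body) VALUES ('first note');
SELECT body, updated_at IS NOT NULL AS stamped FROM notes;
```

Because the rule lives in the database, every client gets the same behavior regardless of the language it is written in.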

Page 37: Introduction to PostgreSQL

Procedural Language Types

● PL/pgSQL
● PL/Perl
● PL/Tcl
● PL/Python
● More available through extensions...

Page 38: Introduction to PostgreSQL

Extensions

● Additional modules that can be plugged into PostgreSQL
● Can be used to add a ton of useful features
– Procedural Languages
– Data Types
– Administration Tools
– Foreign Data Wrappers
● Many found in contrib
● Also www.pgxn.org
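Installing an extension is a single statement once its files are present on the server. The sketch below assumes the contrib package that ships hstore is installed; it is not part of every default installation.

```sql
-- Extensions the server has available to install, and those already installed in this database.
SELECT name, default_version FROM pg_available_extensions ORDER BY name;
SELECT extname, extversion FROM pg_extension;

-- Plug the hstore key/value type into the current database.
CREATE EXTENSION IF NOT EXISTS hstore;

-- The new type and its operators are usable immediately.
SELECT 'a=>1, b=>2'::hstore -> 'a' AS value_of_a;
```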

Page 39: Introduction to PostgreSQL

Procedural Language Extensions

● pl/Java
● pl/v8
● pl/R
● pl/Ruby
● pl/schema
● pl/lolcode
● pl/sh
● pl/Proxy
● pl/psm
● pl/lua
● pl/php

Page 40: Introduction to PostgreSQL

Data Type Extensions

● Hstore
● Case Insensitive Text (citext)
● International Product Numbering Standards (ISN)
● PostGIS (geometry)
● BioPostgres
● SSN
● Email

Page 41: Introduction to PostgreSQL

PostGIS

● PostGIS adds Open Geospatial Consortium (OGC) compliant geometry data types and functions to PostgreSQL

● Combined with PostGIS, PostgreSQL becomes a best-of-breed spatial and raster database

Page 42: Introduction to PostgreSQL

Administration Tool Extensions

● auto_explain
● pageinspect
● pg_buffercache
● pg_stat_statements
● Slony
● OmniPITR
● pg_monitoring
● pgaudit
● pg_partman

Page 43: Introduction to PostgreSQL

What are Foreign Data Wrappers?

● Used with SQL/MED
– New ANSI SQL 2003 extension
– Management of External Data
– Standard way of handling remote objects in SQL databases
● Wrappers are used by SQL/MED to access remote data sources
● Make external data sources look like a PostgreSQL table
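The usual sequence is wrapper, server, user mapping, then foreign table. The sketch below uses postgres_fdw, the contrib wrapper for other PostgreSQL servers (PostgreSQL 9.3 or later); the host, database, credentials, and table names are all hypothetical.

```sql
-- Load the wrapper for remote PostgreSQL servers.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Describe where the remote server lives (hypothetical host and database).
CREATE SERVER remote_pg
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.com', port '5432', dbname 'sales');

-- Map the local user to credentials on the remote side (hypothetical values).
CREATE USER MAPPING FOR CURRENT_USER
    SERVER remote_pg
    OPTIONS (user 'report_reader', password 'change-me');

-- Declare the remote table locally; the column list must match the remote definition.
CREATE FOREIGN TABLE remote_orders (
    id    integer,
    total numeric
)
SERVER remote_pg
OPTIONS (schema_name 'public', table_name 'orders');

-- From here on it is queried like any local table.
SELECT count(*) FROM remote_orders;
```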

Page 44: Introduction to PostgreSQL

FDW Extensions

● PostgreSQL
● Oracle
● MySQL
● Informix
● Firebird
● SQLite
● JDBC
● ODBC
● TDS (Sybase/SQL Server)
● S3
● WWW
● PG-Strom
● Column Store
● Delimited files
● Fixed length files
● JSON files
● Hadoop
● MongoDB
● CouchDB
● MonetDB
● Redis
● Neo4j
● Tycoon
● LDAP

Page 45: Introduction to PostgreSQL

MongoDB FDW

CREATE SERVER mongo_server FOREIGN DATA WRAPPER mongo_fdw
    OPTIONS (address '192.168.122.47', port '27017');

CREATE FOREIGN TABLE databases (
    _id NAME,
    name TEXT
)
SERVER mongo_server
OPTIONS (database 'mydb', collection 'pgData');

test=# select * from databases ;
           _id            |    name
--------------------------+------------
 52fd49bfba3ae4ea54afc459 | mongo
 52fd49bfba3ae4ea54afc45a | postgresql
 52fd49bfba3ae4ea54afc45b | oracle
 52fd49bfba3ae4ea54afc45c | mysql
 52fd49bfba3ae4ea54afc45d | redis
 52fd49bfba3ae4ea54afc45e | db2
(6 rows)

Page 46: Introduction to PostgreSQL

WWW FDW

test=# SELECT * FROM www_fdw_geocoder_google
test-# WHERE address = '731 Lexington Ave, New York, NY';
-[ RECORD 1 ]-----+----------------------------------------------
address           |
type              | street_address
formatted_address | 731 Lexington Avenue, New York, NY 10022, USA
lat               | 40.7619363
lng               | -73.9681017
location_type     | ROOFTOP

Page 47: Introduction to PostgreSQL

PL/Proxy

● Developed by Skype

● Allows for scalability and parallelization

● Uses procedural languages and FDWs

Page 48: Introduction to PostgreSQL

PostgreSQL Replication

● Replicate to read-only databases using native streaming replication

● All writes go to a master server

● Load balance across the pool of servers
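Whether a given server is the master or a read-only replica can be checked from SQL. This is a sketch of two monitoring queries, assuming streaming replication is already configured; the setup itself is not shown here.

```sql
-- On any server: false on the master, true on a read-only standby.
SELECT pg_is_in_recovery();

-- On the master: one row per connected standby, with its streaming state.
SELECT client_addr, state, sync_state
FROM pg_stat_replication;
```

A load balancer or connection pooler can use the first query to decide which servers may receive writes.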

Page 49: Introduction to PostgreSQL

PostgreSQL Scalability

● PostgreSQL scales linearly up to 64 cores

● May scale further but hardware is not available to the community

http://rhaas.blogspot.com/2012/04/did-i-say-32-cores-how-about-64.html

Page 50: Introduction to PostgreSQL

Getting Help

● Community Mailing Lists
– http://www.postgresql.org/list/

● IRC
– irc://irc.freenode.net/postgresql

● NYC PostgreSQL User Group
– http://www.nycpug.org

Page 51: Introduction to PostgreSQL
Page 52: Introduction to PostgreSQL

Questions?