PostgreSQL : Introduction

Transcript of PostgreSQL : Introduction

Page 1: PostgreSQL : Introduction

PostgreSQL

introduction

Introduction Installation The psql client Authentication and privileges Backup and restoration

I

Part 1

www.opensourceschool.fr – Licence Creative Commons (CC BY-SA 3.0 FR) – 2/110

1 Introduction

2 Installation

3 The psql client

4 Authentication and privileges

5 Backup and restoration

Introduction

What is PostgreSQL

Open Source

  BSD-like licence
  Many commercial derivatives (EnterpriseDB, ...)

Uncompromised

  Slow evolution but rock solid
  No for-profit company behind it
  Goes much further than the standards ask for

Relational

  ACID
  Object-relational
  Stored procedures

Cross-platform

  Linux
  Windows
  Most *NIX

History

Evolved from Ingres

Postgres = Ingres + SQL

Postgres released in 1995

1999 : support for real ACID and PL/pgSQL

2005 : optimized enough to become a real contender

2009 : PostgreSQL 8.4

How does PostgreSQL compare to MySQL ?

Respects the standards strictly.

Strict value checking. Example: MySQL 4.x considers 2012-01-00 a valid date, and likewise 2012-02-31; MySQL 5 fixes only the latter.

Really ACID. Examples:

  In PostgreSQL, NOW() is the start time of the current transaction, not the actual current timestamp.
  Transactional DDL : in MySQL, an ALTER silently COMMITs any open transaction.
  In MySQL, foreign key cascades do not fire triggers, and APIs do not fire triggers either.

Maximum sizes of objects

Object       PostgreSQL       MyISAM              InnoDB
Database     Unlimited [1]    Unlimited [2]       Addressable space [3]
Table size   32TB             Max file size [4]   Max file size [4]
Row size     400GB            64kB [5]            64kB [5]
Field size   1GB              8kB [6]             8kB [6]
Columns      up to 1600 [7]   ? [8]               1000 [9]
Indexes      unlimited        256                 256

[1] some 32TB databases exist
[2] but databases over 200GB are quite rare, or quite slow
[3] innodbPageSize * 2^24 with default innodbPageSize = 16kB, for all tables unless innodb_file_per_table=1
[4] 4TB on typical Linux
[5] not counting BLOBs
[6] not counting BLOBs, VARCHAR and VARBINARY
[7] 250 to 1600, depending on field types
[8] constraint is row size
[9] row size constraint still applies

How does PostgreSQL compare to MySQL ?

Deeply object-oriented (database creation is a fork+inheritance, tables can inherit from one another...)

Storage of large objects (BLOB) is automagically "put aside" so as not to clobber tables.

PostgreSQL is renowned in the geographical field : PostGIS

PostgreSQL can be used for data warehousing : OpenStreetMap has a 2.4TB database and it's a small one.

PostgreSQL does not have a query cache : you need an external tool for that

Basic concepts

Cluster

A database cluster is a single PostgreSQL server instance

If you want multiple clusters you will need :

  Different config files
  A different data directory
  A different network port

Usually, servers only have one cluster

Database

A database is a set of objects that can be used together

  Tables
  Functions
  Custom data types
  Views
  Triggers
  ...

Once connected to a database, you stay inside it, and cannot use objects from other databases

A database cluster can have multiple databases

Schema

A schema is a namespace within a database

All databases have a default, public schema

It can be used to separate things

It can also help with access control

A database can have multiple schemas

Tables

Your regular DB table

myschema.mytable : schema-qualified name, designates the table explicitly

mytable : unqualified name, looked up in each schema of the search path in order (the path contains public)
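The lookup rule can be sketched with a toy model. This is only an illustration in Python, not how the server is implemented: the CATALOG dictionary, its contents and the resolve function are made up for the example.

```python
from typing import Dict, List, Optional, Set

# Hypothetical catalog: schema name -> tables it contains (example data only).
CATALOG: Dict[str, Set[str]] = {
    "public": {"customer", "address"},
    "myschema": {"mytable", "customer"},
}


def resolve(name: str, search_path: List[str]) -> Optional[str]:
    """Return the schema-qualified table a reference points to, or None."""
    if "." in name:
        # Qualified reference: the search path is not consulted at all.
        schema, table = name.split(".", 1)
        return name if table in CATALOG.get(schema, set()) else None
    # Unqualified reference: the first schema of the path holding the table wins.
    for schema in search_path:
        if name in CATALOG.get(schema, set()):
            return f"{schema}.{name}"
    return None
```

With a path of myschema then public, resolve("customer", ...) picks myschema.customer; with public alone it picks public.customer, and mytable is no longer found unless it is qualified.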

Tablespaces

The physical location of files on disk. Your friendly sysadmin can choose to put one tablespace on a faster medium (SSD) to increase the performance of this tablespace's tables.

Can be defined per database or per user

Or explicitly on table creation

Any object in the whole cluster can use any tablespace if permissions allow it

Installation

Installing PostgreSQL

Grabbing the bits

Debian : aptitude install postgresql

RedHat :

yum install postgresql-server

chkconfig postgresql on

service postgresql initdb

service postgresql start

You might want to install postgresql-contrib, which contains many community-developed tools : benchmark tools, additional diagnostic tools, ...

First steps : processes

Look at the running processes

postgres -D /var/lib/postgresql/8.4/main -c config_file=/etc/.../postgresql.conf

  Master database process

postgres: writer process

  Writes data blocks back to disk

postgres: wal writer process

  Writes transaction logs

postgres: autovacuum launcher process

  Database housekeeping

postgres: stats collector process

  Stats

postgres: postgres postgres [local] idle

  Each connection to the DB spawns a process

First steps : config files

Each cluster has its own set of configuration files and data files

environment : environment variables for the server

pg_ctl.conf : startup command-line options

pg_hba.conf : authentication

pg_ident.conf : ident maps for authentication

postgresql.conf : main config file

start.conf : whether this cluster is auto started or not

When changing the configuration, there is usually no need to restart, a reload should be enough

First steps : files

That -D parameter was the $datadir : by default, everything is there

On RedHat, everything is there

On Debian, as usual, conf goes to /etc/postgresql and logs go to /var/log/postgresql

Debian has a version/cluster convention that allows you to easily run as many instances of as many versions of PostgreSQL as you want

First steps : files

base : default tablespace

global : global tablespace

pg_tblspc : other tablespaces (can contain symlinks)

pg_log : application logs (admin-readable)

pg_clog, pg_multixact, pg_stat_tmp, pg_subtrans, pg_twophase, pg_xlog : various binary logs and technical directories

Basic administration

su - postgres : required to do any administrative operation on the database with the default configuration

createuser -P myuser : create a new user

You should answer "no" to all its questions : access users should not be able to create anything global

createdb -O myuser mydb : create a database owned by that user, who will have full privileges on it

You can now connect with psql -h localhost -U myuser mydb

The psql client

psql basic usage

Uses a Unix socket by default to connect to the local server; check your ident setup or adapt pg_hba.conf

Connect with psql DBNAME USERNAME -W for interactive auth

Use psql -h 10.1.2.3 to connect to a remote postgres.

psql uses less as a result pager by default (useful)

Once connected,

\l : list databases (equivalent to MySQL's SHOW DATABASES)

\c : connect to database (equivalent to MySQL's USE)

\dt : list tables in current database (equivalent to MySQL's SHOW TABLES).

\dt+ : shows more information like ownerships and comments on each table.

\di : show indexes

\d table : lists columns of a table (equivalent to MySQL's DESC table)

\x : print one column per line

Autocommit

By default, autocommit is set : any command you enter in the psql client is automatically committed, as if it were a mini-transaction

You can \set AUTOCOMMIT off

With autocommit set to off, even if you do not explicitly start a transaction with BEGIN, no permanent change will be made until you COMMIT

Passwords

psql will prompt automatically for a password if required

You cannot give the password on the command line !

.pgpass file in the home directory, one entry per line :

  hostname:port:database:username:password

  e.g. : *:*:mydb:myuser:mypw
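How such an entry is matched can be sketched in Python. This is a simplified model rather than the client library's code: it assumes that * matches anything in the first four fields and that the first matching line wins, and it ignores backslash escaping of colons. The function names are made up for the example.

```python
from typing import Iterable, List, Optional


def parse_pgpass_line(line: str) -> Optional[List[str]]:
    """Split one entry into its five fields; None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Simplification: no support for escaped ':' inside a field.
    fields = line.split(":")
    return fields if len(fields) == 5 else None


def lookup_password(lines: Iterable[str], host: str, port: int,
                    database: str, user: str) -> Optional[str]:
    """Return the password of the first entry matching the connection."""
    wanted = (host, str(port), database, user)
    for line in lines:
        fields = parse_pgpass_line(line)
        if fields is None:
            continue
        *patterns, password = fields
        if all(p == "*" or p == w for p, w in zip(patterns, wanted)):
            return password
    return None
```

Because the first match wins, specific entries should come before wildcard ones. The real file must also be readable by its owner only (mode 0600), otherwise it is ignored.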

Authentication and privileges

Authentication

PostgreSQL allows you to choose between many authentication schemes :

md5 : regular user/password based on embedded account info

ident : get identity from local socket (only for local auth !)

krb5

ldap

cert : SSL client auth

gss/sspi/pam : delegate to external framework

Authentication is configured in pg_hba.conf:

# Database administrative login by UNIX sockets
local   all   postgres                  ident

# "local" is for Unix domain socket connections only
local   all   all                       ident

# IPv4 local connections:
host    all   all       127.0.0.1/32    md5

# IPv6 local connections:
host    all   all       ::1/128         md5

Allowing external connections :

First, change the listen address in postgresql.conf

listen_addresses = '*'

Then, add in pg_hba.conf

# From just one front server
host    all   all   172.16.0.54/32   md5

# From a pool of front servers
host    all   all   172.16.0.0/24    md5

# From everywhere (don't)
host    all   all   0.0.0.0/0        md5
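Records in pg_hba.conf are examined in order and the first one that matches the connection decides the authentication method; a connection matching no record is refused. A rough Python sketch of the address-matching part follows. It ignores the database and user columns, and the rule list (including the reject entry) is a hypothetical variation chosen to make the ordering visible.

```python
import ipaddress
from typing import List, Optional, Tuple

# (address range, auth method) pairs in file order; hypothetical example rules.
HOST_RULES: List[Tuple[str, str]] = [
    ("172.16.0.54/32", "md5"),    # one trusted front server
    ("172.16.0.0/24", "reject"),  # everything else on that subnet
]


def auth_method_for(client_ip: str,
                    rules: List[Tuple[str, str]] = HOST_RULES) -> Optional[str]:
    """Return the method of the first rule whose range contains client_ip."""
    addr = ipaddress.ip_address(client_ip)
    for cidr, method in rules:
        if addr in ipaddress.ip_network(cidr):
            return method
    return None  # no matching record: the connection is refused
```

Swapping the two rules would reject 172.16.0.54 as well, which is why narrow rules belong above broad ones.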

Roles : users and groups

Role management :

  CREATE ROLE something; : not very useful
  CREATE ROLE foobar LOGIN PASSWORD 'toto'; : role can log in
  CREATE USER foobar; : same as CREATE ROLE foobar LOGIN
  CREATE ROLE foobar CREATEDB; : this role can create databases
  CREATE ROLE foobar CREATEROLE; : this role can administer roles
  CREATE ROLE foobar SUPERUSER; : this role is not subject to permission checks
  ALTER ROLE, DROP ROLE

  GRANT group_role TO user; : add a user to a group
  SET ROLE group_role; : gain the privileges of the group

Role display :

  SELECT * FROM pg_roles;
  \du

Privileges

SELECT, INSERT, UPDATE, DELETE

TRUNCATE

REFERENCES, TRIGGER, TEMPORARY : create foreign keys/triggers/temp tables

CREATE : create tables, indexes, etc.

CONNECT : allow a user to connect to the database

USAGE, EXECUTE : allow a user to define and execute functions

ALL/ALL PRIVILEGES : allow everything

Rights to DROP and ALTER are not grantable; they belong to object owners and superusers

GRANT rights ON object TO role;

REVOKE rights ON object FROM role;

\dp : display permissions

Ownership

Owners have default privileges

  Create any table in a db
  Read any column in a table
  etc.

When using createdb to create a new database for your application, do not forget to set the owner

  You can see the owner with \dt
  You can change the owner with ALTER

Backup and restoration

Backups

dumps

pg_dump

  pg_dump is the standard database dump program :
  pg_dump dbname > outfile (as superuser)
  pg_dump -t mytable dbname > outfile : just one table (-b for blobs)
  You can use -Fc to produce an optimized, more flexible dump using an internal format (not SQL)

pg_dumpall

  pg_dumpall dumps all databases
  pg_dumpall > outfile (as superuser)

Backups are consistent and do not block normal reads and writes (MVCC); only heavy modifications such as ALTER conflict with a running dump

Restoration

psql < mydump

pg_restore -d mydb mydump if your dump is in PostgreSQL's custom format

pg_restore mydump converts the dump into SQL format for later use

Physical backup and PITR

Instead of doing dumps, you can "freeze" the database and do a filesystem backup

Then, the WALs are archived

Using the filesystem backup + the WAL, you can restore to any point in time

This requires a specific, complicated config

Stick to good old dumps if you don’t really need that

Internal Architecture Performance optimization Stats and monitoring Logs Replication

II

Part 2

6 Internal Architecture

7 Performance optimization

8 Stats and monitoring

9 Logs

10 Replication

Internal Architecture

Storage engine

Filesystem

The tablespace is a folder that contains oddly-named files

Those files are data blocks (multiples of 8k)

They are mapped in memory in various caches

Cache

PostgreSQL uses two levels of cache

Shared buffers contain a copy of data blocks

  Since PostgreSQL knows what they are used for, it can invalidate only some of them during certain operations

The OS cache (not PostgreSQL memory) also contains a copy of the data blocks

  PostgreSQL relies on it for performance considerations (query planner)
  Invalidation is controlled by the OS, and not smart from a DB point of view
  It is, however, smart from a hardware point of view

Both levels are designed to work together

Write-ahead log

The WAL is an essential component

Stored in pg_xlog

Contains every modification of the database

Optimized for write performance

  Pre-allocated
  Sequential writes

Used to guarantee integrity : on COMMIT (or autocommit), the modification is written and fsync-ed to the log

If a crash occurs, the server re-synchronises the data from the xlog
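The write-ahead principle can be illustrated with a toy key/value store in Python. This is only a sketch of the idea (log, fsync, then apply; replay on startup), not PostgreSQL's WAL format. TinyWalStore, the JSON record layout and the demo function are all invented for the example.

```python
import json
import os
import tempfile


class TinyWalStore:
    """Toy store: every change is logged and fsync-ed before being applied."""

    def __init__(self, wal_path: str) -> None:
        self.wal_path = wal_path
        self.data = {}
        self._replay()

    def _replay(self) -> None:
        # Crash recovery: rebuild the state by re-applying every logged change.
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, "r", encoding="utf-8") as wal:
            for line in wal:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def commit(self, key: str, value) -> None:
        # 1. Append the change to the log (a sequential write).
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")
            wal.flush()
            # 2. Force it to stable storage before reporting success.
            os.fsync(wal.fileno())
        # 3. Only now update the in-memory copy (standing in for data blocks).
        self.data[key] = value


def demo() -> dict:
    """Commit a few changes, drop the in-memory state, recover from the log."""
    with tempfile.TemporaryDirectory() as tmp:
        wal_path = os.path.join(tmp, "toy_wal.log")
        store = TinyWalStore(wal_path)
        store.commit("a", 1)
        store.commit("a", 2)
        store.commit("b", 3)
        # Simulated crash: a fresh instance only has the log to work from.
        recovered = TinyWalStore(wal_path)
        return recovered.data
```

The design point is the ordering inside commit: success is reported only after the log record is durable, so the in-memory state can always be rebuilt.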

Checkpoints and background writer

The background writer's task is to write dirty buffers to the disk (or OS cache)

Checkpointing is the operation that writes all data back to the hard drive (bypassing OS cache)

  Throttled to avoid IO peaks

It occurs automatically

  When the transaction logs are full
  After a certain time

Transactions

MVCC : Multi Version Concurrency Control

MVCC is a method used in many databases, relational or others

  InnoDB, Oracle Database, Berkeley DB, DB2, Sybase (and MSSQL)
  CouchDB
  Subversion
  EHcache
  and of course, PostgreSQL

Rows have an xmin and an xmax value, each containing an xid, or transaction ID

Every transaction increments the current xid

xmin and xmax are internal columns, but you can display them with

SELECT txid_current();

SELECT *,xmin,xmax FROM mytable;

MVCC and vacuuming

UPDATE does not modify rows : it creates a new row with xmin set to the current transaction and sets xmax to the current transaction on the old row

DELETE does not delete rows : it sets the row's xmax to the current transaction

The operation that removes unneeded rows is VACUUM

  Regular VACUUM simply removes unneeded rows by marking them free in the Free Space Map (FSM)
  VACUUM FULL actually deallocates unneeded rows from the tablespace; it is a heavy operation, but can help on very active tables (sessions)

Regular VACUUM is automatically performed by the autovacuum process
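The xmin/xmax bookkeeping can be mimicked with a small Python model. It is a deliberately simplified sketch: every statement is its own transaction, every transaction commits immediately and in xid order, and a snapshot is reduced to a single xid. ToyTable and RowVersion are invented names, not server structures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                   # xid of the transaction that created this version
    xmax: Optional[int] = None  # xid of the transaction that deleted or replaced it


class ToyTable:
    def __init__(self) -> None:
        self.versions: List[RowVersion] = []
        self.next_xid = 1

    def _new_xid(self) -> int:
        xid = self.next_xid
        self.next_xid += 1
        return xid

    def _live(self, value: str) -> RowVersion:
        # The current (not yet deleted) version holding this value.
        return next(v for v in self.versions
                    if v.value == value and v.xmax is None)

    def insert(self, value: str) -> int:
        xid = self._new_xid()
        self.versions.append(RowVersion(value, xmin=xid))
        return xid

    def update(self, old: str, new: str) -> int:
        # UPDATE = mark the old version dead + append a new version.
        xid = self._new_xid()
        self._live(old).xmax = xid
        self.versions.append(RowVersion(new, xmin=xid))
        return xid

    def delete(self, value: str) -> int:
        # DELETE only stamps xmax; the version stays on "disk".
        xid = self._new_xid()
        self._live(value).xmax = xid
        return xid

    def visible(self, snapshot_xid: int) -> List[str]:
        """Values seen by a reader whose snapshot covers xids <= snapshot_xid."""
        return [v.value for v in self.versions
                if v.xmin <= snapshot_xid
                and (v.xmax is None or v.xmax > snapshot_xid)]

    def vacuum(self, oldest_snapshot_xid: int) -> int:
        """Drop versions no reader at or after oldest_snapshot_xid can see."""
        before = len(self.versions)
        self.versions = [v for v in self.versions
                         if v.xmax is None or v.xmax > oldest_snapshot_xid]
        return before - len(self.versions)
```

After an insert, an update and a delete, the table still holds two dead versions; older snapshots keep seeing the state they started with, and only vacuum reclaims the space once no snapshot needs those versions.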

Transaction isolation level

PostgreSQL automatically manages row-level locking as needed

READ COMMITTED (default)

  In the same transaction, you can see a different image of the DB depending on whether another transaction has committed in the meantime
  You will never, however, see changes from a transaction that has not been committed yet
  But you can lock rows explicitly with SELECT FOR UPDATE

SERIALIZABLE

  If a transaction detects that a value was modified since the beginning of the transaction, it will fail

As of PostgreSQL 8.4, the other standard levels (READ UNCOMMITTED and REPEATABLE READ) are mapped to the closest level of those two

Performance optimization

Memory

shared_buffers

The main memory pool used by PostgreSQL

Usually : shared_buffers = totalram/4

  more isn't better

SHM : need to tweak sysctl.conf

  kernel.shmmax : maximum size (in bytes) of a memory segment
  kernel.shmall : maximum size (in pages) of all memory segments
  getconf PAGE_SIZE : gives the page size

effective_cache_size : give it the actual cache size, including OS cache (50-70% of memory), to help the query planner. This will not allocate anything, it's just helping the query planner make the right decisions.
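The arithmetic behind these rules of thumb can be written down as a short Python helper. It is a back-of-the-envelope sketch, not a tuning tool: the 10% headroom added on top of shared_buffers for kernel.shmmax is an arbitrary illustrative margin (the shared segment holds more than the buffers alone), and the default page size of 4096 bytes is an assumption to be checked with getconf.

```python
def memory_settings(total_ram_bytes: int, page_size: int = 4096) -> dict:
    """Derive starting values from total RAM, following the slide's ratios."""
    shared_buffers = total_ram_bytes // 4
    # Assumption: leave ~10% headroom above shared_buffers for the segment.
    shmmax = shared_buffers + shared_buffers // 10
    # kernel.shmall is expressed in pages, so round up to whole pages.
    shmall = -(-shmmax // page_size)
    return {
        "shared_buffers": shared_buffers,
        "kernel.shmmax": shmmax,
        "kernel.shmall": shmall,
        # Planner hint only (50-70% of RAM); nothing is allocated for it.
        "effective_cache_size_low": total_ram_bytes // 2,
        "effective_cache_size_high": total_ram_bytes * 7 // 10,
    }
```

For a machine with 8 GiB of RAM, memory_settings(8 * 1024**3) gives 2 GiB of shared buffers and an effective_cache_size somewhere between 4 GiB and 5.6 GiB.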

work memory

work_mem : memory for sorts, hash tables, etc.

  The more work_mem, the less you need to store temporary tables to disk
  Any single session can run multiple operations concurrently (complex queries)
  so don't put too much there, or you'll run out of memory on high loads

maintenance_work_mem : memory available to maintenance operations (INDEX, VACUUM, ALTER, ...)

  You can give more than work_mem, not many maintenance operations will occur simultaneously
  a good value is totalram/20
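Why a large work_mem is risky becomes clearer with a worst-case estimate. The Python sketch below assumes every connection runs a query with the same number of sort or hash operations at once, which is a pessimistic simplification; the function names are made up for the example.

```python
def worst_case_work_memory(work_mem_bytes: int, max_connections: int,
                           ops_per_query: int) -> int:
    """Upper bound if every session uses work_mem for each of its operations."""
    return work_mem_bytes * max_connections * ops_per_query


def suggested_maintenance_work_mem(total_ram_bytes: int) -> int:
    """The slide's rule of thumb: totalram/20."""
    return total_ram_bytes // 20
```

With 64 MiB of work_mem, 100 connections and 3 operations per query, the bound is 64 MiB * 300, close to 19 GiB, far more than the setting suggests at first glance.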

Disk

Checkpoints are the main source of IO

checkpoint_segments : number of 16MB log files to keep before checkpointing

  recommended value : 10

checkpoint_timeout : maximum time between two checkpoints

  Default : 5min, but you can put more if you need to

checkpoint_completion_target : helps spread the IO load

  default : 0.5
  you can try : 0.9 or something in-between if you want to spread the load more

wal_buffers

  a good value is 16MB, which is the size of a WAL segment
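Which of the two settings actually triggers checkpoints depends on how fast WAL is written. A small Python sketch of that reasoning, assuming a constant write rate (a simplification) and using the slide's values as defaults; the function name is hypothetical.

```python
WAL_SEGMENT_BYTES = 16 * 2**20  # one WAL segment is 16MB


def checkpoint_trigger(wal_bytes_per_sec: float,
                       checkpoint_segments: int = 10,
                       checkpoint_timeout_s: int = 300) -> str:
    """Return 'segments' if the log fills before the timeout, else 'timeout'."""
    if wal_bytes_per_sec <= 0:
        return "timeout"  # an idle server only checkpoints on the timer
    capacity = checkpoint_segments * WAL_SEGMENT_BYTES
    seconds_to_fill = capacity / wal_bytes_per_sec
    return "segments" if seconds_to_fill < checkpoint_timeout_s else "timeout"
```

With 10 segments (160MB) and a 5 minute timeout, a steady 1MB/s of WAL fills the segments in 160 seconds, so checkpoints come more often than the timeout alone would imply.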

Network

max_connections : maximum number of connections

superuser_reserved_connections : number of connection slots reserved for the superuser

If you use PHP, you'll need

$$\text{max\_connections} = \sum_{\text{frontends}} \text{maxclients} + \text{superuser\_reserved\_connections}$$

If you have connection pools :

$$\text{max\_connections} = \sum_{\text{frontends}} \text{poolsize} + \text{superuser\_reserved\_connections}$$
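The sizing rule translates directly into code. A minimal Python sketch, where each list entry is one frontend's maxclients (or pool size) and the default of 3 reserved slots is the stock value of superuser_reserved_connections; the function name is invented for the example.

```python
from typing import Iterable


def required_max_connections(per_frontend_limits: Iterable[int],
                             superuser_reserved: int = 3) -> int:
    """Sum every frontend's connection limit, plus the reserved slots."""
    return sum(per_frontend_limits) + superuser_reserved
```

Three web servers allowing 150 clients each therefore need max_connections = 453, while the same servers behind pools of 20 connections only need 63.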

Query planner

The query planner is used by PostgreSQL to find the most efficient way to process a request

  Can we use indexes
  Which one is the best
  In which order do we execute complex statements
  Can we reformulate the statement into a more efficient one (SELECT WHERE IN becomes a JOIN, etc.)

It uses a cost-based model

And predictions from ANALYZE (autoanalyze is a part of autovacuum)

Sometimes it is wrong (but not very often!)

Don't try to configure it or you will shoot yourself in the foot

You can, however, hint it not to use certain operations (no full scans, no indexes, etc.) and see what happens

What it looks like

EXPLAIN will yield :

A tree of actions

Indications on

  estimated cost for the first row and for all rows
  number of rows expected
  estimated size of a row

EXPLAIN ANALYZE will actually execute the request, yield the query planner info, and :

time it took to produce the first row and all rows (the query would run faster without the instrumentation overhead)

actual number of rows returned

how many times the node was executed
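To make the estimate fields concrete, here is a small sketch that pulls them out of one plan line. The regular expression and the function name are illustrative assumptions, not part of PostgreSQL; the sample line uses the plain EXPLAIN text format.

```python
import re

# Matches the estimate part of a plan node, e.g.
# "(cost=0.00..16.49 rows=15 width=70)"
ESTIMATE_RE = re.compile(
    r"\(cost=(?P<startup>\d+\.\d+)\.\.(?P<total>\d+\.\d+) "
    r"rows=(?P<rows>\d+) width=(?P<width>\d+)\)"
)


def parse_estimate(line):
    """Return the planner estimates found on one EXPLAIN line, or None."""
    match = ESTIMATE_RE.search(line)
    if match is None:
        return None
    return {
        "startup_cost": float(match.group("startup")),  # cost before the first row
        "total_cost": float(match.group("total")),      # cost for all rows
        "rows": int(match.group("rows")),                # expected row count
        "width": int(match.group("width")),              # estimated row size in bytes
    }


print(parse_estimate("Seq Scan on customer  (cost=0.00..16.49 rows=15 width=70)"))
```

Lines without an estimate, such as a Filter line, return None.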


Sequential scans and index scans

# A query using a table scan (bad)
root=> EXPLAIN SELECT * FROM customer WHERE active=0;
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on customer  (cost=0.00..16.49 rows=15 width=70)
   Filter: (active = 0)

# A query using an index (good)
root=> EXPLAIN SELECT * FROM customer WHERE last_name='SMITH';
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Index Scan using idx_last_name on customer  (cost=0.00..8.27 rows=1 width=70)
   Index Cond: ((last_name)::text = 'SMITH'::text)


Query optimization

root=> EXPLAIN SELECT DISTRICT FROM address WHERE address_id IN
       (SELECT address_id FROM customer WHERE active=0);
                              QUERY PLAN
----------------------------------------------------------------------
 Hash Semi Join  (cost=16.68..32.45 rows=15 width=9)
   Hash Cond: (address.address_id = customer.address_id)
   ->  Seq Scan on address  (cost=0.00..14.03 rows=603 width=13)
   ->  Hash  (cost=16.49..16.49 rows=15 width=2)
         ->  Seq Scan on customer  (cost=0.00..16.49 rows=15 width=2)
               Filter: (active = 0)

root=> EXPLAIN SELECT address.district FROM address, customer WHERE
       address.address_id=customer.address_id AND customer.active=0;
                              QUERY PLAN
----------------------------------------------------------------------
 Hash Join  (cost=16.68..33.12 rows=15 width=9)
   Hash Cond: (address.address_id = customer.address_id)
   ->  Seq Scan on address  (cost=0.00..14.03 rows=603 width=13)
   ->  Hash  (cost=16.49..16.49 rows=15 width=2)
         ->  Seq Scan on customer  (cost=0.00..16.49 rows=15 width=2)
               Filter: (active = 0)


Complex queries

root=> EXPLAIN SELECT COUNT(*), address.district FROM address, customer
       WHERE address.address_id=customer.address_id AND customer.active=1
       GROUP BY address.district;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 HashAggregate  (cost=49.01..53.73 rows=378 width=9)
   ->  Hash Join  (cost=21.57..46.09 rows=584 width=9)
         Hash Cond: (customer.address_id = address.address_id)
         ->  Seq Scan on customer  (cost=0.00..16.49 rows=584 width=2)
               Filter: (active = 1)
         ->  Hash  (cost=14.03..14.03 rows=603 width=13)
               ->  Seq Scan on address  (cost=0.00..14.03 rows=603 width=13)


Complex queries (for real)

root=> EXPLAIN ANALYZE SELECT COUNT(*), address.district FROM address, customer
       WHERE address.address_id=customer.address_id AND customer.active=1
       GROUP BY address.district;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 HashAggregate  (cost=49.01..53.73 rows=378 width=9)
                (actual time=4.939..5.384 rows=369 loops=1)
   ->  Hash Join  (cost=21.57..46.09 rows=584 width=9)
                  (actual time=1.632..4.045 rows=584 loops=1)
         Hash Cond: (customer.address_id = address.address_id)
         ->  Seq Scan on customer  (cost=0.00..16.49 rows=584 width=2)
                                   (actual time=0.016..0.935 rows=584 loops=1)
               Filter: (active = 1)
         ->  Hash  (cost=14.03..14.03 rows=603 width=13)
                   (actual time=1.602..1.602 rows=603 loops=1)
               ->  Seq Scan on address  (cost=0.00..14.03 rows=603 width=13)
                                        (actual time=0.007..0.796 rows=603 loops=1)
 Total runtime: 5.883 ms
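One way to read such output is to compare the estimated and actual row counts of each node. The sketch below does that for two nodes of the plan above, with each node joined onto a single line. The regular expression and the function name are illustrative assumptions. Note that actual rows are reported per loop, which makes no difference here since loops=1.

```python
import re

# Two nodes from the EXPLAIN ANALYZE output, one node per line.
plan_lines = [
    "HashAggregate (cost=49.01..53.73 rows=378 width=9) "
    "(actual time=4.939..5.384 rows=369 loops=1)",
    "Hash Join (cost=21.57..46.09 rows=584 width=9) "
    "(actual time=1.632..4.045 rows=584 loops=1)",
]

NODE_RE = re.compile(
    r"^(?P<node>.+?) \(cost=.*? rows=(?P<estimated>\d+) .*?"
    r"\(actual time=.*? rows=(?P<actual>\d+) loops=(?P<loops>\d+)\)"
)


def row_estimate_ratio(line):
    """Return (node name, actual rows / estimated rows) for one plan line."""
    match = NODE_RE.search(line)
    if match is None:
        raise ValueError("not an EXPLAIN ANALYZE node line: " + line)
    estimated = int(match.group("estimated"))
    actual = int(match.group("actual"))
    # Guard against a zero estimate to avoid dividing by zero.
    return match.group("node"), actual / max(estimated, 1)


for line in plan_lines:
    node, ratio = row_estimate_ratio(line)
    print(f"{node}: actual/estimated rows = {ratio:.2f}")
```

A ratio close to 1 means the statistics gathered by ANALYZE describe the data well; a ratio far from 1 is usually where to start looking when a plan is bad.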


Stats and monitoring


Table stats

Table statistics

SELECT * FROM pg_stat_user_tables

root=> \d pg_stat_user_tables;
        View "pg_catalog.pg_stat_user_tables"
      Column      |   Type    | Description
------------------+-----------+---------------------------------------------
 relid            | oid       | Table oid
 schemaname       | name      | Schema name
 relname          | name      | Table name
 seq_scan         | bigint    | Number of sequential scans
 seq_tup_read     | bigint    | Number of rows returned by seq scans
 idx_scan         | bigint    | Number of index scans
 idx_tup_fetch    | bigint    | Number of rows returned by idx scans
 n_tup_ins        | bigint    | Number of inserted rows
 n_tup_upd        | bigint    | Number of updated rows
 n_tup_del        | bigint    | Number of deleted rows
 n_tup_hot_upd    | bigint    | Number of updated rows (with HOT)
 n_live_tup       | bigint    | Number of valid rows
 n_dead_tup       | bigint    | Number of invalid (MVCC) rows
 last_vacuum      | timestamp | Last time VACUUM was started manually
 last_autovacuum  | timestamp | Last time VACUUM was started automatically
 last_analyze     | timestamp | Last time ANALYZE was started manually
 last_autoanalyze | timestamp | Last time ANALYZE was started automatically


IO stats

SELECT * FROM pg_statio_user_tables

  Shows the number of cache reads and page reads
  Shows how efficient the buffer pool is
  Does not show information about the OS cache

SELECT * FROM pg_statio_user_indexes

  Same for indexes


IO stats example

SELECT relname, heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit
FROM pg_statio_user_tables;

  relname  | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit
-----------+----------------+---------------+---------------+--------------
 payment   |             52 |           297 |            54 |        28123
 film      |             58 |           170 |            34 |         8646
 inventory |             27 |            71 |            35 |        17552

SELECT relname,
       cast(heap_blks_hit as numeric) / (heap_blks_hit+heap_blks_read) AS hit_rate
FROM pg_statio_user_tables
WHERE (heap_blks_hit+heap_blks_read)>0;

     relname      |        hit_rate
------------------+------------------------
 country          | 0.50000000000000000000
 payment_p2007_04 | 0.85100286532951289398
 film             | 0.74561403508771929825
 payment_p2007_02 | 0.83898305084745762712
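The hit rate is simply hits divided by hits plus reads. As a sketch, the same arithmetic in Python, using the counters from the first sample output (the function name is an illustrative assumption):

```python
# (relname, heap_blks_read, heap_blks_hit) rows copied from the sample output.
rows = [
    ("payment", 52, 297),
    ("film", 58, 170),
    ("inventory", 27, 71),
]


def hit_rate(blks_read, blks_hit):
    """Fraction of block requests served from shared buffers, or None if no I/O."""
    total = blks_read + blks_hit
    if total == 0:
        return None  # mirrors the WHERE (hit + read) > 0 filter of the query
    return blks_hit / total


for relname, blks_read, blks_hit in rows:
    print(f"{relname}: {hit_rate(blks_read, blks_hit):.3f}")
```

For film this gives 170 / 228, about 0.746, which matches the value returned by the SQL query. A read counted here may still have been served by the OS cache.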


Table sizes

SELECT schemaname, relname,
       pg_size_pretty(pg_relation_size(relid)) AS size,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables;

or \dt+


Activity

The pg_stat_activity view

root=> \d pg_stat_activity;
      View "pg_catalog.pg_stat_activity"
    Column     |   Type    | Description
---------------+-----------+------------------------
 datid         | oid       | Database OID
 datname       | name      | Database name
 procpid       | integer   | PID of server process
 usesysid      | oid       | User OID
 usename       | name      | User name
 current_query | text      | Query
 waiting       | boolean   | Is the query waiting
 xact_start    | timestamp | Transaction start time
 query_start   | timestamp | Query start time
 backend_start | timestamp | Process start time
 client_addr   | inet      | Client IP address
 client_port   | integer   | Client source port

A slightly better view :

SELECT datname, usename, current_query, waiting, client_addr,
       now()-query_start AS running_for
FROM pg_stat_activity ORDER BY running_for DESC;


Locks

Finding locks

Let’s see which queries are blocked

SELECT * FROM pg_stat_activity WHERE waiting=true;
-[ RECORD 1 ]-+--------------------------------------
datid         | 16392
datname       | root
procpid       | 1530
usesysid      | 16390
usename       | root
current_query | SELECT i FROM t WHERE s=1 FOR UPDATE;
waiting       | t
xact_start    | 2012-02-27 14:21:34.723872+01
query_start   | 2012-02-27 14:21:36.195808+01
backend_start | 2012-02-27 14:16:49.203923+01
client_addr   |
client_port   | -1

PID 1530 is waiting for something


Let’s find who is blocking it

SELECT * FROM pg_locks WHERE pid=1530 AND granted=false;
-[ RECORD 1 ]------+--------------
locktype           | transactionid
database           |
relation           |
page               |
tuple              |
virtualxid         |
transactionid      | 688
classid            |
objid              |
objsubid           |
virtualtransaction | 2/39
pid                | 1530
mode               | ShareLock
granted            | f

We are blocked by transaction 688


Let’s find who is transaction 688

SELECT * FROM pg_locks WHERE transactionid=688 AND granted=true;
-[ RECORD 1 ]------+--------------
locktype           | transactionid
database           |
relation           |
page               |
tuple              |
virtualxid         |
transactionid      | 688
classid            |
objid              |
objsubid           |
virtualtransaction | 1/1099
pid                | 1519
mode               | ExclusiveLock
granted            | t

Now you can SELECT * FROM pg_stat_activity WHERE procpid=1519; to find out more about it
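The two lookups can be combined: find the lock a PID is waiting for, then find who holds it. Here is a sketch of that pairing logic on rows shaped like pg_locks, reduced to the columns used above. The function name is an illustrative assumption, and only transactionid locks are handled.

```python
# Rows shaped like pg_locks, with the values from the two records above.
pg_locks = [
    {"locktype": "transactionid", "transactionid": 688, "pid": 1530,
     "mode": "ShareLock", "granted": False},
    {"locktype": "transactionid", "transactionid": 688, "pid": 1519,
     "mode": "ExclusiveLock", "granted": True},
]


def find_blockers(locks, waiting_pid):
    """Return the PIDs holding a transaction lock that waiting_pid waits on."""
    # Step 1: transaction ids that waiting_pid requested but was not granted.
    wanted = {
        lock["transactionid"]
        for lock in locks
        if lock["pid"] == waiting_pid and not lock["granted"]
    }
    # Step 2: other backends that hold a granted lock on those transactions.
    return sorted({
        lock["pid"]
        for lock in locks
        if lock["granted"]
        and lock["transactionid"] in wanted
        and lock["pid"] != waiting_pid
    })


print(find_blockers(pg_locks, 1530))
```

With the sample rows, PID 1530 is blocked by PID 1519, the same answer as the manual search.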


System monitoring

Munin


Logs


Error logs

You can find error logs in /var/log/postgresql

syntax errors, server errors, etc.


Finding slow queries

log_min_duration_statement : defines which queries should be logged

  -1 (default) : do not log anything
  0 : log all queries (aka: ruin your HDD)
  500 : log queries that run for longer than 500 ms

Set it globally in postgresql.conf then reload

Or set it just for one user (the role name is an identifier, so no single quotes) :

  ALTER ROLE myuser SET log_min_duration_statement = 500;

  Will only start logging the next time myuser opens a session

  Then, disable it with :

  ALTER ROLE myuser SET log_min_duration_statement = DEFAULT;


Replication


Basics

PostgreSQL replication is asynchronous by default

synchronous replication is possible, but not usually recommended

log shipping : copy transaction log files to the standby server and replay them

  Delay can be high
  You can set a timeout to make sure you get a new log every X minutes
  You have to set up the copy (using rsync, NFS, etc.)

streaming : direct connection to the master server (like MySQL)


warm standby : the replica is not usable, but can be rapidly enabled during failover

hot standby : the replica can be used for reads

  beware of conflicts due to locking
  must compromise between the ability to run long queries and the freshness of data

On 8.4 : log shipping, warm standby

On 9.1 : log shipping, streaming, hot standby

Many 3rd party products : slony, etc.


Log Shipping

Log Shipping: master

Set up SSH keys, an NFS share, or any other means to share the logs

In postgresql.conf, enable archiving :

  wal_level = archive

  archive_mode = on

  archive_command = 'rsync -a %p host:/dir/%f'

  archive_timeout = 60 (if you want one)
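In archive_command, PostgreSQL replaces %p with the path of the WAL file to archive and %f with its file name only (%% gives a literal %). The sketch below only imitates that substitution to show what the server ends up running. The function name and the segment name are made up for illustration.

```python
import re


def expand_archive_command(template, wal_path, wal_name):
    """Imitate how %p, %f and %% are substituted in archive_command."""
    replacements = {"%p": wal_path, "%f": wal_name, "%%": "%"}
    return re.sub(r"%[pf%]", lambda m: replacements[m.group(0)], template)


# The template comes from the slide; the segment name is a made-up example.
command = expand_archive_command(
    "rsync -a %p host:/dir/%f",
    "pg_xlog/000000010000000000000003",
    "000000010000000000000003",
)
print(command)
```

The command must return a zero exit status only when the file has really been copied, otherwise the server may recycle a segment that was never archived.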


Log Shipping: initializing the replication

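Do a hot backup on the master :

  SELECT pg_start_backup('label');

  rsync -av --delete $datadir/ slave:$datadir/

  SELECT pg_stop_backup();

The trailing slashes make rsync copy the contents of the data directory instead of nesting it inside the destination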


Log Shipping: slave

create a recovery.conf file in the datadir

restore_command = '/usr/lib/postgresql/8.4/bin/pg_standby -d -t /tmp/pgsql_stopstandby /var/lib/postgresql/8.4/main/archivelog %f %p %r 2>>/var/log/postgresql/standby.log'

recovery_end_command = 'rm -f /tmp/pgsql_stopstandby'


Log Shipping: failing over

create the trigger file /tmp/pgsql_stopstandby to finish restoration (pg_standby fails over when the file given with -t exists; recovery_end_command then removes it)

recovery.conf is renamed to recovery.done

You can now access the slave


Streaming

Streaming: master

You can combine streaming and log shipping

Enable WAL senders :

  max_wal_senders = 2

You should make sure to keep a few old log files if you don't use archiving :

  wal_keep_segments = 10

Create a REPLICATION account :

  CREATE ROLE myuser REPLICATION LOGIN PASSWORD 'mypass';

In pg_hba.conf :

  host replication myuser slave/32 md5


Streaming: slave

In recovery.conf on the slave :

  standby_mode = 'on'

  primary_conninfo = 'host=pgmaster port=5432 user=X password=Y'


Streaming: failing over

pg_ctlcluster 9.1 main promote


Hot standby

hot_standby = on (on the slave)

wal_level = hot_standby (on the master)

max_standby_archive_delay = 30s

max_standby_streaming_delay = 30s

It's the time we allow queries running on the slave to block WAL replay (give the unit: a bare number is read as milliseconds)
