PostgreSQL 9.0 HA - RMLL

offbeatlossData Management

Nov 22, 2012 (4 years and 8 months ago)

354 views

Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
PostgreSQL 9.0 HA
Julien Pivotto
Inuits
July 9,2012
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
The mission
Before the migration
Table of content
1
Overview
The mission
Before the migration
2
PostgreSQL 9.0
Intro
Streaming replication
Master configuration
Slave configuration
3
Clustering
Set up of corosync
OCF resource
Synthesis
4
Backups and monitoring
Backups with cron jobs
BackupPC
Nagios
Munin
5
Automation
Puppet module
The node file
6
The end
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
The mission
Before the migration
Who am I

Julien Pivotto
 @roidelapluie

roidelapluie@inuits.eu
 Consultant at Inuits (Belgium) since May 2011

FOSS defender since 2005
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
The mission
Before the migration
A.R.S.I.A.

Association Régionale de Santé et d’Identification Animales
(Belgique)

30 linux servers in several locations

A lot of Open Source:

CentOS,Samba,Open-xchange,mailscanner,Cyrus,

...Puppet,jenkins,foreman,OpenVPN,GLPI,rabbitmq,

...BackupPC,CUPS,icinga,trac,zope,plone,

...solr,pentaho,funambol,munin,squid,asterisk,

...and PostgreSQL,...
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
The mission
Before the migration
C.E.R.I.S.E
 A web application:
 Plone (python)
 15k+ visits,500k+ pages and 2.000.000+ hits each month
 Developped by another open-source company,Affinitic
 Several databases:
 PostgreSQL 9.0
 Oracle database
 Several servers/services:

Reverse proxies

Application servers

PostgreSQL servers

Oracledb server

Development server

Pentaho server
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
The mission
Before the migration
PostgreSQL before the migration
 PostgreSQL 8.3.7

No native support of HA
 High availability with heartbeat 2 and DRBD

Installed on the application servers

Nothing automated
 Failover:Passive node is not even read only
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
The mission
Before the migration
Monitoring before the installation
 Icinga
 Check of the DRBD

Simple connection check to PostgreSQL

Graphing with Cacti:

Size of the databases

Connexions to the database

Checkpoints
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
The mission
Before the migration
Backups before the installation

Hourly local backups
 External backups (on disk and on tape)

Made with pg_dump command
 External backups tool:BackupPC
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
PostgreSQL 9.0

Released in september 2010
 First version with native replication

No native failover tool
 We need to use PostgreSQL + Corosync
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
Write-Ahead Logging

Every change to datafile is first written into a log file
 Less disk writes

More cohesion
 Logs can be replayed
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
What is streaming replication

Ship and apply WAL XLOGS to standby servers

Possibility to have multiple standby servers
 Standby servers can be read-only ("Hot standby")
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
DisadvantagesSpecifications of streaming replication

Asynchronous log-shipping

Delay is often very close to 0

No native HA features in PostgreSQL

We will use corosync

Single-threaded replication

Single-threaded replication...
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
Master configuration
Only one specific configuration file.
Configuration related to SR
wal_level = hot_standby
max_wal_senders = 2
wal_keep_segments = 128
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
Master configuration

wal_level = hot_standby
Allows stanby server to be readable
 max_wal_senders = 2
We allow up to 2 standby nodes
 wal_keep_segments = 128
The minimum wal segments to keep
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
Slave configuration

Two configuration files
 PostgreSQL.conf file
 Recovery.conf file

A trigger file to stop replication can be specified
PostgreSQL.conf - Configuration related to SR
wal_level = hot_standby
hot_standby = on
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
Slave configuration
recovery.conf
standby_mode = ‘on’
primary_conninfo = ‘host=192.168.142.31 user=replicuser’

standby_mode = standby server
 primary_conninfo = connection to the master
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
Replication user
 Super user called replication is created
SQL query to create super user
CREATE USER replication SUPERUSER LOGIN CONNECTION LIMIT 1 ENCRYPTED PASSWORD ‘foobar’;
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
pg_hba.conf
 pg_hba.conf contains some kind of ACLs for the PostgreSQL
connections
 Need to add the nodes as trusted.
pg_hba.conf
hostnossl all all 192.168.142.31/32 trust
hostnossl all all 192.168.142.31/32 trust
hostnossl replication replicuser 192.168.142.32/32 trust
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Intro
Streaming replication
Master configuration
Slave configuration
Setting up a slave

3 commands required to add a slave
Adding a standby server
psql -c"SELECT pg_start_backup(’label’,true)"
rsync -a ${PGDATA}/standby:/srv/pgsql/standby/--exclude postmaster.pid --exclude ‘*-master’
--exclude ‘*-slave’
psql -c"SELECT pg_stop_backup()"
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
Corosync configuration
 Goal:switching between master/slave

One and only one node should be master
 Master need to be connected to the router

Custom master/slave ocf resource
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
crm.conf

Main configuration file of corosync is
/etc/corosync/crm.conf.

Contains all the resources/nodes/etc...
Defining the nodes
node dupont.demo\
attributes standby="off"
node dupond.demo\
attributes standby="on"
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
crm.conf
Primitives
primitive pgsql ocf:inuits:pgsql-ms
primitive virt_ip ocf:heartbeat:IPaddr2\
params nic="eth1"iflabel="0"ip="192.168.142.30"cidr_netmask="24"broadcast="192.168.142.255"
\
meta target-role="Started"is-managed="true"
primitive ping ocf:pacemaker:ping\
params host_list="192.168.142.1"\
op monitor interval="10s"timeout="10s"\
op start interval="0"timeout="45s"\
op stop interval="0"timeout="50s"
 3 primitives:
 pgsql,the PostgreSQL primitive
 virt_ip,the floating IP address
 ping,the primitive that will check that the servers are
connected to the router
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
crm.conf
Configuring the primitives
ms pgsql-ms pgsql\
params pgsqlconfig="/var/lib/pgsql/data/postgresql.conf"\
lsb_script="/etc/init.d/postgresql-9.0"\
pgsqlrecovery="/var/lib/pgsql/data/recovery.conf"\
meta clone-max="2"clone-node-max="1"master-max="1"master-node-max="1"notify="false"
clone clone-ping ping\
meta globally-unique="false"
 PostgreSQL M/S configuration:the init script,the
configuration files...
 Ping resource as a clone (launched on both servers)
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
crm.conf
group and colocation
group PSQL virt_ip
location connected PSQL\
rule $id="connected-rule"-inf:not_defined pingd or pingd lte 0
colocation ip_psql inf:PSQL pgsql-ms:Master
property $id="cib-bootstrap-options"\
cluster-infrastructure="openais"\
expected-quorum-votes="2"\
stonith-enabled="false"\
no-quorum-policy="ignore"\
default-resource-stickiness="INFINITY"
rsc_defaults $id="rsc_defaults-options"\
migration-threshold="INFINITY"\
failure-timeout="10"\
resource-stickiness="INFINITY"

Master should always be on the same node as the floating IP
address

Master should be connected to the router
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
OCF resource
 Custom OCF resource to manage the master/slave
PostgreSQL
 Based on an example of resource written by Andrew Beekhof
from Clusterlabs
 The file is in/usr/lib/ocf/resource.d/inuits/pgsql-ms

Available on github!
https://github.com/roidelapluie/puppet-cluster
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
OCF resource
 The script does the following:
 Move the postgresql.conf-master to postgresql.conf
when a node is promoted/master

Move the postgresql.conf-slave to postgresql.conf
when a node is depromoted/slave

Ensure that recovery.conf-slave is on recovery.conf on
slave and absent on master

Starts/restarts PostgreSQL when needed.
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
Synthesis
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
Synthesis
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Set up of corosync
OCF resource
Synthesis
Synthesis
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Backups with cron jobs
BackupPC
Nagios
Munin
Backups of the databases

The databases are quite small (a few gigabytes)

Hourly on each node (one at minute 0 and one at minute 30)
 Daily external backup of each node

We keep 24 hourly backups and 7 daily backups on disk
 BackupPC:we keep months of backups
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Backups with cron jobs
BackupPC
Nagios
Munin
Hourly backup script
/usr/local/bin/backup_hourly.sh
#!/bin/bash
DATE=$(date +%H)
BACKUP_PATH=/var/lib/backups/hourly
for db in foobar_db foobar2_db
do
/usr/bin/pg_dump $db | gzip > $BACKUP_PATH/${db}_$DATE.pgsql.gz
ln -fs $BACKUP_PATH/${db}_$DATE.pgsql.gz $BACKUP_PATH/${db}_current.pgsql.gz
done
The daily script is almost the same.
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Backups with cron jobs
BackupPC
Nagios
Munin
BackupPC script
/usr/local/bin/backup_backuppc.sh
#!/bin/bash
DATE=$(date +%u)
BACKUP_PATH=/var/lib/backups/backuppc
for db in cerise trackitquality trackit zodb_cerise
do
/usr/bin/pg_dump -U postgres $db | gzip > $BACKUP_PATH/${db}_$DATE.pgsql.gz
ln -fs $BACKUP_PATH/${db}_$DATE.pgsql.gz $BACKUP_PATH/${db}_current.pgsql.gz
done
In the backupPC config,I added the following:
BackupPC config
$Conf{DumpPreUserCmd} = ‘$sshPath -t -q -x -l backuppc $host/usr/local/bin/backup_backuppc.sh’;
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Backups with cron jobs
BackupPC
Nagios
Munin
check_postgres script
 check_postgres.pl is a nagios-compatible perl script

Available on http://www.bucardo.org/check_postgres/
and on Github
 Checks:
 Current connections number
 Replication delay
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Backups with cron jobs
BackupPC
Nagios
Munin
Check hot_standby latency
 The check_postgres.pl script has a check for hot_standby
delay
 We do not know who is the master and the slave

Here is a bash script I wrote to know the M/S order
Master/slave replication check
#!/bin/bash
/usr/lib64/nagios/plugins/check_postgres.pl --db="$1"\
--action hot_standby_delay -w 300 -c 600 --host=$(
crm_resource --resource pgsql-ms --locate|
awk ‘/Master/{master=$6}/$/{slave=$6} END {print master","slave}’
)
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Backups with cron jobs
BackupPC
Nagios
Munin
Munin postgres scripts

Munin is shipped with perl plugins for postgresql
 We use four of them:
 postgres_size,
 postgres_checkpoints,
 postgres_connections_db,
 postgres_cache
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Backups with cron jobs
BackupPC
Nagios
Munin
Munin postgres scripts
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Puppet module
The node file
Puppet module

Puppet-postgres module forked from Kris Buytaert’s github
page

Forked to remove all references to services

Creates the users,the super users,the databases
 Parameterized classes

The cache sizes are parameterized too (useful for Vagrant
boxes)
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Puppet module
The node file
Sqlexec definition
sqlexec.pp
define postgres::sqlexec($username,$database,$sql,$sqlcheck,$host=’localhost’,$password="")
{
exec{"psql -h $host –username=${username} $database
-c
¨
${sql}
¨
&&/bin/sleep 5":
environment =>"PGPASSWORD=${password}",
path => $::path,
timeout => 600,
unless =>"psql -h $host -U $username $database -c $sqlcheck",
require => Service[’postgresql-9.0’],
}
}
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Puppet module
The node file
Example in the node file
Here is the result in the node file:
site.pp
node postgres-server {
include postgres
include site
class {
’cluster’:
cluster_bind_interface => ’eth1’,
cluster_mcastaddr => ’226.94.1.1’,
}
postgres::hba {
$::fqdn:
password => ’yes’,
allowedrules => [
"host all all $::ipaddress/32 password",
’hostnossl replication replicuser 192.168.142.31/24 trust’,
’hostnossl replication replicuser 192.168.142.32/24 trust’,
’hostnossl all all 192.168.142.0/24 password’,
]
}
include site::sshkeys
Postgres::Cluster[$::fqdn] -> Class[’cluster’]
}
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Puppet module
The node file
Example in the node file
dupont.pp
node dupont inherits postgres-server {
postgres::cluster {
$::fqdn:
other_node_ip => ’192.168.142.31’,
listen => ’*’,
max_connections => ’100’,
shared_buffers => ’120MB’,
work_mem => ’3MB’,
effective_cache_size => ’352MB’,
}
class {
’postgres::firstsync’:
remotehost => ’dupont’,
password => ’rmll2012’,
}
Class[’postgres’] -> Postgres::Cluster[$::fqdn]
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Puppet module
The node file
Example in the node file
dupont.pp
exec {
’/bin/sleep 10’:
refreshonly => true,
subscribe => Exec[’load_crm_config’],
}
exec {
’crm node online dupond.demo’:
refreshonly => true,
path => $::path,
subscribe => Exec[’load_crm_config’],
require => Exec[’/bin/sleep 10’,’crm node online dupond.demo’],
}
Class[’site’] -> Class[’cluster’] -> Class[’postgres::firstsync’]
Class[’postgres::firstsync’] -> Postgres::Hba[$::fqdn]
Class[’cluster’] -> Exec[’/bin/sleep 10’] -> Class[’postgres::firstsync’]
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Puppet module
The node file
Example in the node file
dupont.pp
postgres::createuser {
’tintin’:
password => ’rmll2012’,
host => ’192.168.142.30’,
passwd => ’milou’,
require => Exec[’/bin/sleep 10’],
}
postgres::createdb {
’moulinsart’:
password => ’rmll2012’,
host => ’192.168.142.30’,
owner => ’tintin’,
require => [Exec[’/bin/sleep 10’],Postgres::Createuser[’tintin’]],
}
postgres::createsuperuser {
’replicuser’:
password => ’rmll2012’,
host => ’192.168.142.30’,
passwd => ’icanhazapassword’,
require => Exec[’/bin/sleep 10’],
}
}
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Test it yourself
You will need vagrant (http://vagrantup.com).
Test-it-yourself howto
$ git clone
git://github.com/roidelapluie/postgres9-demo-vagrant.git
$ cd postgres9-demo-vagrant
$ git submodule update --init
$ vagrant up
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Any questions?
(I stay all the week @RMLL)
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end

Presentation made with latex-beamer
 PostgreSQL logo is used following the policy
http://wiki.postgresql.org/wiki/Trademark_Policy
 Tango icons have been released into the public domain

Database icon is under CC BY-SA 3.0
http://dracos.deviantart.com/#/d2y5ele
 RMLL logo is under CC BY-SA 3.0
http://2012.rmll.info/kit-media
Julien Pivotto
PostgreSQL 9.0 HA
Overview
PostgreSQL 9.0
Clustering
Backups and monitoring
Automation
The end
Thank you
Julien Pivotto
PostgreSQL 9.0 HA