This is your PostgreSQL on Drugs

pridefulauburnData Management

Dec 16, 2012 (4 years and 10 months ago)

181 views

Aaron
Thul

Electronic
Medical
Office
Logistics
(EMOL)

http://chasingnuts.com/oscon1.08.pdf

Who
am
I?



Computer
&
Database
Geek,
just
like
you



Formerly
a

SysAdmin

at

Autoweb

Communications



PostgreSQL

Build
Your
Car



Presently
a
IT
manager
at
a
EMOL



PostgreSQL

Evangelist



Penguicon
Organizer

With

PostgreSQL

and
other
Open

Source
software
EMOL
is



Allowing
Data
collection
from
EMRs
and

other
sources



Aiding
in
Adherence
to
national
standards



Providing
Physician
and
Practice
level

benchmarking



Data
Brokering



Enabling
Automation
of
National
initiatives,

such
as
the
CMS
PQRI

EMOL

PostgreSQL

Data



Patient
Records



Billing
Records



Lab
Results



Clinical
Records



Inventory
Management



Patient
Reported
Data

Metadata



Physicians
Dictations



Scanned
Documents



Images



XRAYs



MIRIs



CAT
Scans

Metadata
Storage



ReiserFS

with
tail
packing



Each
practice/doctor
has
a
folder



SUN

OpenSolaris

&
ZFS???



Linux
and
XFS???



Netapp

Waffle???

EMOL
Software
Building

Blocks



Ubuntu

Linux
LTS
(8.04)



PostgreSQL

(8.3)



Perl
(5.8.x)



Windows
Unified
Data
Storage
Server
2003

(R2)



Yes
Windows

EMOL
Hardware
Building

Blocks



HP

ProCurve

Switches



Support

considerably

cheaper
than

Smartnet



SonicWall

Firewalls



Support

considerably

cheaper
than

Smartnet



Large
number
of
SCSI
and
SATA
Hard
Drives



iSCSI

Servers
and
DAS
(Direct
Attached

Storage)
Systems

Why

PostgreSQL
?



Capable



Required
Features




Database
Team
Experience



Security



Community




Documentation
Project



Mailing
Lists



IRC



Events
Like
This!

Why
Perl?



P
ractical

E
xtraction
and

R
eport

L
anguage



Development
team
experienced
with
Perl



Unix‐centric,
and
available
for
Windows




Text
parsing
and
normalizing



I
know
it
Perl
is
not
sexy
like



INSERT ‘
new_popular_language
’ INTO
languages;


Rapid
prototyping



Weakly
typed



Interpreted,
though
very
fast



Supports
objects

Who
is
Where?



OS
and

PostgreSQL

binaries
on
local
disks



RAID
1
Mirror



15k
spindle
drives



EXT3



WAL
Buffers
on
local
disks



RAID
1
Mirror



15k
spindle
speed




EXT2



INDEXs



DAS
(Direct
Attached
Storage)
Units



RAID
6



10
k
spindle
speed
SCSI



EXT3



TABLES



Multiple

iSCSI

Servers
on
SANS



4
x
1
Gigabit
Ethernet
Interfaces
Bonded



8
x
1
Terabyte
SATA
drives
per
SAN
Node
RAID
6



EXT3

Data
Daily



Loading
10
GB
data
daily
into

PostgreSQL



Loading

10
GB
metadata
daily


Data
Size

SELECT
relname
, (
relpages
*8)/1024 as MB
FROM
pg_class

ORDER BY
relpages
DESC;
Data
Size

SELECT
relname
, (
relpages
*8)/1024 as MB
FROM
pg_class

ORDER BY
relpages
DESC;
This
does
not
account
for

pg_toast



This
does
provide
more
precision

Data
Size
Really

SELECT
nspname
|| '.' ||
relname
AS "relation",

pg_size_pretty(pg_relation_size(nspname
|| '.' ||
relname
)) AS "size"
FROM
pg_class
C
LEFT JOIN
pg_namespace
N ON (
N.oid
=
C.relnamespace
)
WHERE
nspname
NOT IN ('
pg_catalog
',
'
information_schema
')
AND
nspname
!~ '^
pg_toast
'
AND
pg_relation_size(nspname
|| '.' ||
relname
)>0
ORDER BY
pg_relation_size(nspname
|| '.' ||
relname
)
DESC
How
much
data
are
we
talking



Largest
Table:
1,844.73

GB



Second
Largest
Table:
1,289.36
GB



Largest
Index:
411.91
GB



Second
Largest
Index:
405.08
GB



Total
DB
size
on
disk:
16,800.39
GB

Better
make
sure
we
need
that

INDEX

select

indexrelid::regclass
as index,
relid::regclass
as
table
from

pg_stat_user_indexes

JOIN
pg_index
USING (
indexrelid
)
where

idx_scan
= 0 and
indisunique
is false;
More
details
at:

http://people.planetpostgresql.org/xzilla/index.php?/archives/351‐Index‐pruning‐
techniques.html

Run
it
twice
and
make
it
faster



Maintain
a
1/500
set
of
random
sample
data



ALL
queries
hit
that
data
base
first



Only
once
query
result
is
successful
is
the

query
moved
onto
production
database

server

How
do
I
sleep
at
night



First
Name



Last
Names



Social
Security
Numbers




Birth
Dates



Needed
to
track
people
over
time
and

geography

How
do
I
sleep
at
night

"By
default,
PostgreSQL
is
probably
the
most

security‐aware
database
available
..."






Database
Hacker's
Handbook

Protecting
the
Warehouse



Simple
processes
that
are
followed



Intrusion
Prevention
&
Firewalls



Security
Monitoring
&
Management
‐
MSSP



Encrypted
Communication



Identity
Management
‐
Centralized

management
of
users
and
groups
–
mitigates

vulnerabilities
that
occur
due
to

inconsistencies


Protecting
the
Warehouse



Role‐based
security



Functions
everyplace
we
can



Identity
data
symmetrically
encrypted



Data
is

anonamized


in
all
but
a
few
tables



Role‐based
security



All
data
is

anonamized


before
it
is
sent
out

Lessons
Learned



Server
Ethernet
Cards
are
not
all
made
the

same



With
100+
drives
be
ready
to
RMA
some
disks



You
can
never
have
to
many
DIMM
slots



You
do
get
what
you
pay
for
with
RAID

controllers



You
can’t
have
to
big
a
cache
on
your
RAID

controller

More
Lessons
Learned



pg_resetxlog

is
not
THAT
scary



You
can
never
have
to
many
PCI‐X
Slots



Auto‐vacuum
is
not
always
your
friend

More
Lessons
Learned



Worry
when
a
developer
says
“I
have
an
idea”



Some
mistakes
are
just
to
much
fun
to
make

only
once



I
am
used
to
hearing
“It
seems
like
you
are

doing
something
fundamentally
wrong”



Never
ask
for
directions
from
a
two‐headed

tourist!


‐Big
Bird

Looking
Forward



I
don’t
think
I
need
to
worry
about

PostgreSQL
scaling



Size
matters:
Yahoo
claims
2‐petabyte
database
is

world's
biggest,
busiest



http://
www.computerworld.com/action/
article.do?
command
=
viewArticleBasic&taxonomyId
=18&arti
cleId=9087918&intsrc=
hm_topic

Looking
Forward



GridSQL

from

EnterpriseDB



Built
using
multiple
standard
PostgreSQL
servers



Open
Source
Project

Questions



Web:
http://www.chasingnuts.com



Email:
aaron@chasingnuts.com




IRC:

AaronThul

on
irc.freenode.org




Jabber:

apthul@gmail.com



Twitter:
@
AaronThul



AIM:

AaronThul