Materialized Views in PostgreSQL - HSR-Wiki

disturbedoctopusData Management

Nov 27, 2012 (4 years and 4 months ago)

394 views

09.01.2012

1

Materialized Views in PostgreSQL

Experiments around J. Gardner‘s Proposal

HSR
HOCHSCHULE FÜR TECHNI K
RAPPERSWI L
FH
O
F
a
c
hhoc
h
s
ch
u
l
e
O
s
t
sc
h
w
e
i z
Reto Guadagnini

09.01.2012

2

Contents


What are Materialized Views


Basic Types


Experiments:


Query Performance


Creation of an Eager Materialized View


Discussion

09.01.2012

3

What are Materialized Views?

View


Represents a named query


Query is performed each time the view is accessed



operations on a view are relatively expensive

Materialized View


Real table


Bases on an ordinary view


Contains all tuples of the corresponding view


Acts like a cache between clients and a view

09.01.2012

4

Basic concept of a Materialized View

Table 2

Table n

Table 1



View

Materialized

View

Client

Queries

09.01.2012

5

Support for Materialized Views

DBMS which support Materialized Views:


Oracle Database 10


Microsoft SQL Server 2008: „Indexed Views“



will not be covered in this talk


PostgreSQL:


Currently (version 9.1) there is no built in support


J. Gardner proposed how to handcraft Materialized Views
in PostgreSQL



the rest of this talk will focus on J. Gardner’s proposal

09.01.2012

6

Types of Materialized Views

Types according to J. Gardner:


Snapshot


Eager


Lazy


Very Lazy


The various types differ in the way the Materialized
View is kept in synch with the underlying tables

09.01.2012

7

Snapshot Materialized View

Update strategy:
Update on request

Implementation:


Function refresh_matview()


Deletes the content of the Materialized View


Queries the view and fills the Materialized View again with
the current content of the view

View

Materialized

View

Function: refresh_matview

09.01.2012

8

Eager Materialized View

Update strategy:
Update on change

Implementation:


Two functions


mv_refresh_row(PK)


Updates tuple with Primary Key PK


mv_refresh()


Used if the view bases on mutable functions like “now()”


Triggers


On INSERT, UPDATE and DELETE


Call mv_refresh_row() in an appropriate way

09.01.2012

9

Eager Materialized View

Table 2

Table n

Table 1



View

Materialized

View

mv_refresh_row(PK)

mv_refresh()

Triggers:

Insert

Update

Delete

Triggers:

Insert

Update

Delete

Triggers:

Insert

Update

Delete

09.01.2012

10

Very Lazy Materialized View

Update strategy:
Update on request

Implementation:


Write all changes on the underlying tables in a log

(use some triggers and functions)


Implement a function “
matview_refresh
()” which
reads the log and updates the Materialized View
according the list of changes in the log

09.01.2012

11

Lazy Materialized View

Update strategy:
Update on commit

Implementation:


Similar to Very Lazy Materialized Views



matview_refresh
()” would be called on commit and
not on request


J. Gardner states, that he currently doesn’t know
how to implement them in PostgreSQL

09.01.2012

12

Experiments

In our experiments we covered only Eager Materialized
Views. We performed two experiments:


1.
Query Performance: Materialized View VS ordinary
view


2.
Implementation of an Eager Materialized View on
an example database according to the scheme
proposed by Gardner

09.01.2012

13

Experiment 1: Query Performance


To compare the query performance of ordinary views and
Materialized Views we used the Eager Materialized View
example from Gardner’s proposal:


1

n

b

c

a

b_v

b_mv

b_mv_refresh_row

b_mv_refresh

Triggers:

b_mv_it

b_mv_ut

b_mv_dt

Triggers:

b_mv_it

b_mv_ut

b_mv_dt


Triggers:

b_mv_it

b_mv_ut

b_mv_dt


1

n

09.01.2012

14

Experiment 1: Setting


We wrote PL/
pgSQL

function for test data generation


Our function generates the desired number of tuples
randomly and fills Gardner’s example database with them


For each number of tuples we measured three times the
query time on both the Materialized View and the ordinary
view using the SQL commands:



“SELECT * FROM
b_mv
;”


“SELECT * FROM
b_v
;”



To compensate outliers we calculated the arithmetic mean of
each measurement result triplet

09.01.2012

15

Experiment 1: Setting


Specifications of the system we’ve used for our
experiments:



CPU: Intel Core 2 Duo T9550 @ 2.66 GHz


RAM: 4 GB DDR2 SDRAM @ 1066 MHz


Hard drive: Hitachi HTS 250GB @ 7200 rpm


Operating System: Windows 7 Professional SP 1 (32 bit)


DBMS: PostgreSQL 9.0.5
-
1 with pgAdminIII 1.12.3


09.01.2012

16

Experiment 1: Results

0

20'000

40'000

60'000

80'000

100'000

120'000

140'000

160'000

180'000

Query time [ms]

Number of tuples

Query on view b_v

Query on materialized view b_mv

09.01.2012

17

Experiment 1: Discussion


Materialized Views really increase the query
performance compared to ordinary views if there are
many tuples (more than 100’000 in our experiment)
in the view


For 2 million tuples we measured an improvement of
about factor 4


Materialized Views diminish the performance of data
manipulation operations like INSERT, UPDATE and
DELETE

09.01.2012

18

Experiment 2: Implementation


We wanted to show that Gardner‘s suggestions for
the implementation of an Eager Materialized View
are applicable to a more realistic example database


For this purpose we used the „World“ database from
PgFoundry


The World database consists of 3 tables:

country

countrylanguage

city

1

1

1

n

09.01.2012

19

Experiment 2: Implementation

We performed the following steps to implement an Eager Materialized View
on the „World“ database:


1.
Create a view

2.
Identify the primary key of the view

3.
Create the basic tables/functions “
matviews
”, “
create_matview
()”,

drop_matview
()” and “
refresh_matview
()” published in Gardner’s paper

4.
Create the table for the Materialized View using the “
create_matview
()” function

5.
Implement the “
mv_refresh_row
” function with the primary key of the view as
argument

6.
Implement all INSERT, UPDATE and DELETE triggers on the underlying tables

7.
Test if the Materialized View behaves as expected when INSERT, UPDATE and
DELETE operations are performed on the underlying tables


09.01.2012

20

Experiment 2: Result

country

countrylanguage

city

country_v

country_mv

country_mv_refresh_row

Triggers:

country_mv_it

country_mv_ut

country_mv_dt

Triggers:

country_mv_it

country_mv_ut

country_mv_dt


Triggers:

country_mv_it

country_mv_ut

country_mv_dt


1

1

1

n

09.01.2012

21

Experiment 2:
mv_refresh_row
()

CREATE

FUNCTION


country_mv_refresh_row
(
country_mv
.
cl_code
%
TYPE
,

country_mv
.
cl_language
%
TYPE
)


RETURNS

VOID

SECURITY

DEFINER

LANGUAGE

'
plpgsql
'

AS

'

BEGIN


DELETE

FROM

country_mv

WHERE

cl_code

=

$
1

AND

cl_language

=

$
2
;


INSERT

INTO

country_mv

SELECT

*

FROM

country_v

WHERE

cl_code

=

$
1

AND

cl_language

=

$
2
;


RETURN
;

END

'
;

09.01.2012

22

Experiment 2:
country_mv_city_it
()

CREATE

FUNCTION

country_mv_city_it
()

RETURNS

TRIGGER

SECURITY

DEFINER

LANGUAGE

'
plpgsql
'

AS

'

BEGIN


PERFORM

country_mv_refresh_row
(code,

language)


FROM



(SELECT

*

FROM

country,

countrylanguage

WHERE

code

=

countrycode
)

AS

x


WHERE



capital

=

NEW
.
id
;


RETURN

NULL
;

END

'
;

CREATE

TRIGGER

country_mv_it

AFTER

INSERT

ON

city


FOR

EACH

ROW

EXECUTE

PROCEDURE

country_mv_city_it
()
;


09.01.2012

23

Experiment 2: Discussion


We were able to apply Gardner‘s proposal to a more
realistic example database


The “World” database is still a quite simple example


The implementation of an Eager Materialized View
following Gardner’s proposition is quite complex
even on this simple example

09.01.2012

24

Conclusion


Materialized Views really improve the query
performance


Gardner‘s approach to realize Eager Materialized Views
in PostgreSQL works but it is complicated to implement


Materialized Views need more memory than ordinary
views and some of them diminish the performance of
data manipulation operations


Other DBMS have built in support for Materialized views


Built in support for Materialized Views would be a good
feature for PostgreSQL

09.01.2012

25

Questions?