A Method for Online Reorganization of a Database

basesprocketData Management

Oct 31, 2013 (3 years and 7 months ago)

98 views

Online REORG DB2
1

GHS
11/1/2013

A Method for Online Reorganization
of a Database

Gary Sockut

(joint work with Thomas Beavin & Chung
-
Chia Chang)


Work performed at

IBM Silicon Valley Laboratory, San Jose, CA

Online REORG DB2
2

GHS
11/1/2013

Introduction: reorganization

Reorganization
:
Δ

way arranged (logical / physical).


Any
DBMS

& database might need:


Δ

logical database definition:

generalize 1
-
many
-


many
-
many, split column.


Δ

physical database definition:

construct
index
, split partition.


Restore physical arrangement of instances

without
Δ

definition:

* Compact space.

* Collect garbage.

* Restore
clustering

(store near each other; criteria).


I/O.


Writing degrades; reorganization restores. This method.

Online REORG DB2
3

GHS
11/1/2013

highly available (24 x 7)
(web commerce, armed
forces)

very large

Introduction: offline vs. online reorganization



Reorganize
online

(concurrently with

usage or incrementally within users’

transactions). Importance

.

TIME

USAGE

BATCH WINDOW (offline OK)

USAGE

REORGANIZATION

Traditional: reorganize
offline
:

2 categories:

unacceptable

Online REORG DB2
4

GHS
11/1/2013

Introduction: online reorganization

Method for restoring clustering online.

IBM’s Database 2 (DB2) relational DBMS.


Rest of presentation:


Storage structures


Overview (unload, reload, process
log
); problem; solution


Steps of reorganization & more details


Comparison: previous research

Online REORG DB2
5

GHS
11/1/2013

ID map

Storage structures


Not
unique key
.


Fil攠p慧敳e ind數敳e 敮瑲i敳ein log

u獥s
r散erd id敮瑩fi敲

(
剉R
)

瑯 id敮瑩晹 r散erds:


Page # & # within page. ID map.


Δ

only

during reorganization.


Header of page: position in log (
log record sequence
number

(
LRSN
)
) current when most recently written.

LRSN

data

data

*

*

data record

data record

file page:


Rows

of
tables

in
data records

in
file pages
:


Online REORG DB2
6

GHS
11/1/2013

ID map

ID map

Storage structures


Variable
-
length: grow:
regular

-


pointer

&
overflow
:

LRSN

data

RID

*

*

regular

data record

pointer

data record

file page:

LRSN

data

data

*

*

regular

data record

overflow

data record

file page:

bad!

Online REORG DB2
7

GHS
11/1/2013

Storage structures: writing

Users write rows
-


DBMS writes data records (& log entries).

Usually,
I
nsert /
U
pdate /
D
elete row
-


1 regular data record.

Exceptions

on
U
pdate:


Has regular & new data too large for regular's page
-


I overflow & U regular to pointer.


Has overflow & new data fits on overflow's page
-


U overflow to overflow.


Has overflow & new data too large for overflow's page
-


* Now room on pointer's page
-



U pointer to regular & D overflow.

* Still no room on pointer's page
-



I overflow, D overflow, & U pointer to pointer.

Exception

on
D
elete: has overflow
-


D pointer & overflow.



Backout

(undo if failure) reverses.

Online REORG DB2
8

GHS
11/1/2013

Storage structures: reorganization


Database administrator invokes
reorganization
:

* Restores clustering


(
clustering key
: not necessarily unique).

* Removes overflow.

* Distributes free space.



Improves performance.


Area being reorganized
: set of 1 or more tables

or set of 1 or more partitions of 1 table (parameter).


Offline
:

1.
Read
-
only

(
R/O
)
: unload data, sort.

2.
No access
: reload.


Online
:
Read/write

(
R/W
)
: most.
--


Online REORG DB2
9

GHS
11/1/2013

Overview of method for
online

reorganization

An inspiration:

Fuzzy dumping
: unload data (backup)

while letting users write:

A.
Dumping
: (1) record
current LRSN

for
log;
(2)

unload
.



B.
Recovery
:
(1) reload
;

(2) bring up to date:

apply log entries

(start: recorded
LRSN
).

Ignore log entry whose LRSN


LRSN of indicated page.

data
-

base

back
-

up

log

data
-

base

back
-

up

1

2

-


Fuzzy reorganization
.

Online REORG DB2
10

GHS
11/1/2013

Overview of method for online reorganization

A) Record current LRSN;
unload, sort, reload (reorganize).

(R/W):

users’

data

old copy

of area

new copy

of area

log

READ

WRITE

UNLOAD,

SORT, RELOAD

users’

data

old copy

of area

new copy

of area

log

READ

WRITE

B)
Process log (read; apply).

(
n

iterations R/W; 1 R/O):

PROCESS

users’

data

new copy

of area

log

READ

WRITE

C)
Switch users' accessing to new.

(brief offline; R/W):

name 0

name 1

name 0

name 1

name 0

name 1

Online REORG DB2
11

GHS
11/1/2013

Problem & solution

Problem
: reorganization
Δ

RIDs:


Log entries:
old

RIDs.


Apply: identify data record in new copy (
new

RID).


Solution
:


Temporary table maps
old

&
new

RIDs, stores LRSNs.


Translate log entries before applying.


Novelty
: interaction: maintain table & process log.

Online REORG DB2
12

GHS
11/1/2013

Main steps of reorganization (more detail)

1.
Record current LRSN for log.


2.




3.







4.


5.



6.



7.



8.



9.


--


Delete old.

Start R/W new.

Switch future accesses: exchange names of files

(data, indexes). Users: no access.

Quiesce all access.

Process log. Append changed pages to backup. R/O.

Quiesce writing (block; wait until finished). Record LRSN.

Process log: iterate:

* Read subset of log between recorded LRSNs


(users' writing): translate; apply to new. R/W.

* Record current LRSN. Iterate or next step (criteria).

* Last iteration: append changed pages to backup.

Unload data, sort, reload into new (includes indexes),

create backup copy. Users R/W old.

Table maps: old & new RIDs. Record current LRSN.

Online REORG DB2
13

GHS
11/1/2013

old

RID

new

RID

Unloading, sorting, & reloading (R/W)

2. Sort by clustering key.

old copy

of area

new copy

of area

1. UNLOAD

2. SORT

3. RELOAD

1. ADD ENTRY

3. ADD ENTRY

mapping

table

sort file

1. Unload: scan sequentially:

* Regular or overflow: unload data, old RID, & LRSN of page.

* Pointer: add entry to mapping table.

3. Reload: also add entry to mapping table.

Online REORG DB2
14

GHS
11/1/2013

Processing of log (
n

R/W & 1 R/O) 5 phases

5. APPLY

(insertion: actual)

pointers. sort: LRSN

copy of log (buffer)

log

1. COPY

2. SORT BY OLD RID

(speed access to mapping table)

pointers. sort:
old

RID

copy of log (buffer)

3. TRANSLATE RIDS
(insertion: estimate, for sorting)

pointers. sort:
old

RID

copy of log (buffer)

mapping

table

4. SORT BY NEW RID

(speed access to new copy of area)

pointers. sort:
new

RID

copy of log (buffer)

pointers. sort:
new

RID

copy of log (buffer)

mapping table

new copy

of area

3, 5, control of iterations
--


Online REORG DB2
15

GHS
11/1/2013

Phase 3: translation of RIDs

DBMS's log application (recovery) ignores log entry if LRSN


page’s. Translation ignores log entry if LRSN


mapping table’s.

Insertion
:


R or O: estimate new RID; store in log entry (buffer);

insert entry in mapping table (old, estimated new).


P: delete log entry; insert entry in mapping table (old, no new).

Update
:



R to R or O to O: store new RID in log entry.



P to P: delete log entry.



P to R: ~I: estimate new RID. Log entry: store estimated new



RID; U
-


I. Mapping table: store estimated new RID.



R to P: ~D: log entry: store new RID; U
-


D.

Deletion
:



R or O: store new RID in log entry; delete mapping table’s entry.



P: delete log entry; delete mapping table’s entry.

Online REORG DB2
16

GHS
11/1/2013

Phase 5: application

Scan set of pointers to log entries (sorted by new RID).

For each RID value:

1.
Find all pointers (contiguous).

2.






3.



Apply sequentially:

I
:

* Insert in new copy; obtain actual new RID.

* In mapping table & in log entries for current RID,


estimated new RID
-


actual.

U

or
D
: treat like DBMS's handling of user's.

≥ 1 D log entry
-


delete certain log entries:

* 1st entry is I
-


delete last D & all preceding.

* 1st entry is D or U
-


keep last D; delete all preceding.

Omit log entries for which no entries in mapping table;

omit I U U D.

Online REORG DB2
17

GHS
11/1/2013

HI

LO

Control of iterations of log processing

Iterate (allowing R/W) until:

1.
Estimated time for next


parameter for maximum

R/O
-


next is last (R/O). Parameter:
trade
-
off.
--



2.
Estimated completion for next > deadline

(parameter)
-


cancel reorganization.

3.
Amount of log for next not sufficiently < current

(not catching up)
-


send message to operator:

After delay (parameter), DBMS will continue,

quiesce writing, or cancel (parameter).

During delay, database administrator can

Δ

parameters, adjust priorities,

quiesce writing, cancel, or let DBMS take action.



Display status.

iteration

time

Online REORG DB2
18

GHS
11/1/2013

Scheduling of online reorganization

1.
High tolerance of delay
-


R/O & offline tolerable.

2.
Low rate of writing
-


easy to catch up to log.

3.
No long
-
running transactions
-


quick quiescing.


Trade
-
off.


Online REORG DB2
19

GHS
11/1/2013

Comparison with previous research

Calculation of clustering: not novel;

Sort by clustering key (index); assign to pages.



Compare
mapping tables
,
fuzzy reorganization
:



1)
Mapping tables
:


Omiecinski et al.: reorganization
in place
;

map RIDs to translate entries in leaves of indexes.


Wiener & Naughton: loading data into object database;

map surrogate object identifiers (source file) into object IDs.



2) Several authors mention
fuzzy reorganization

(no detail).

Unique identifier does not
Δ
.


N
o m慰ping 瑡tl攮

M慮礠捵獴sm敲猠di獬i步 r敱uirem敮琮

Fuzzy reorganization when RIDs
Δ

--


Online REORG DB2
20

GHS
11/1/2013

Comparison with previous research

O'Toole et al.
: fuzzy garbage collection

for persistent data.
Forwarding field
:



In database context (not O'Toole),
mapping table
’s advantages:

1.
Reorganization copies, user deletes, DBMS reuses space
(not O'Toole). Mapping table safe; forwarding field gone.

2.
Mapping table entry < data record
.


F敷敲 p慧敳e



Less I/O to read/write forwarding information.

3.
Less locking for old (
unload old
,
reload/map
,
process log
):
Mapping table: (
R
,
no
,
no
). Forwarding field: (
R
,
W
,
R/W?
).

4.
Avoid extra space (permanent) for forwarding field

(O'Toole: space already existed).



Forwarding field
’s advantages:

1.
One field (already existed: O'Toole), not two (temporary).

2.
Seems simpler.

old copy

new copy

data

RID

data

Online REORG DB2
21

GHS
11/1/2013

Possible (but rejected) alternatives

1.
Reorganize
only index

online:

support, but not enough.



3.





4.


record’s

position

record’s

position


Δ
▬►

user’s

position

in scan

reorganization

Reorganize
partition

(index) (support):

fine
-
grained
; reorganize
offline
:

* Correct other indexes.

* Slow routing, increase space for partition descriptors.

* Less uniform growth/shrinkage
-


increase total free space:



coarse
: |
high

growth

shrinkage

low growth

|



fine
: |
high

growth

|
shrinkage

|
low growth

|

Reorganize online
in place

(not

by copying): inaccuracy.
--


Complexity and/or slowness,

or low
-
concurrency locking.

Online REORG DB2
22

GHS
11/1/2013

Publications

Web site: Google GARY SOCKUT

(
http://alum.mit.edu/www/ghs
); click “publications”:
-


pdf.

1.
This work
: G. H. Sockut, T. A. Beavin, & C.
-
C. Chang, “A
Method for On
-
line Reorganization of a Database,”
IBM
Systems Journal
, Vol. 36, No. 3, 1997, pp. 411
-
436;
erratum in Vol. 37, No. 1, 1998, p. 152. Web site: slides.

2.
Survey on online reorganization

(not just IBM, not just
clustering, not just fuzzy): G. H. Sockut & B. R. Iyer,
“Online Reorganization of Databases,”
Computing
Surveys
, Vol. 41, No. 3, ACM, Article 14, July 2009, 136
pages. Web site: table of contents.

Online REORG DB2
23

GHS
11/1/2013

Reorganize online:


Copy data from old to new in reorganized form; R/W.

Table maps between old & new RIDs.


Apply log to new; map.


Switch users' accessing to new.




Summary

Online REORG DB2
24

GHS
11/1/2013

Appendices


Online REORG DB2
25

GHS
11/1/2013

Appendix: effect of fine
-
grained partitioning



population density

(people / square mile)

urban area with
many high
-
rise
apartment buildings
(Manhattan
Community District
8: ~ Upper East Side)

uninhabited area

New

Jersey

Alaska

coarse (state)

fine (square mile)

granularity of partitioning

Population densities in the area covered by the 50 states:

finer

0

~1 (census)

~100K

(demographia

.com)

~1K (census)