
CS257 Summary

By: Rupinder Singh

Chapter 13

SECONDARY STORAGE
MANAGEMENT

13.1.1 Memory Hierarchy


Data storage capacity varies across the levels of the hierarchy

Cost per byte to store data also varies

The devices with the smallest capacity offer the fastest access speed at the highest cost per bit

Memory Hierarchy Diagram (from fastest/smallest to slowest/largest): Cache; Main Memory (main-memory DBMSs, programs); Disk / File System (also backs virtual memory); Tertiary Storage

13.1.1 Memory Hierarchy


Cache

Lowest level of the hierarchy (closest to the processor)

Data items in the cache are copies of certain main-memory locations

Sometimes values in the cache are changed, and the corresponding changes to main memory are delayed

The machine looks in the cache for instructions as well as for the data those instructions operate on

Holds only a limited amount of data

In a single-processor computer there is no need to update main memory immediately

With multiple processors, changed data must be written to main memory immediately; this policy is called write-through


Main Memory


Refers to physical memory that is internal to the computer. The word "main" distinguishes it from external mass-storage devices such as disk drives.

Everything that happens in the computer - instruction execution, data manipulation - works on information that is resident in main memory

Main memory is random access: any byte can be obtained in (roughly) the same amount of time




Secondary storage



Used to store data and programs when they are not being processed

More permanent than main memory: data and programs are retained when the power is turned off

Capacities are much larger than main memory; even a personal computer holds many gigabytes of secondary storage

E.g. magnetic disks (hard disks)

Tertiary Storage


Consists of anywhere from one to several storage drives

A comprehensive storage system that is usually very slow, so it is typically used to archive data that is not accessed frequently

Holds data volumes measured in terabytes

Used for databases much larger than what can be stored on disk


13.1.2 Transfer of Data Between levels


Data moves between adjacent levels of the hierarchy

At the secondary or tertiary levels, accessing the desired data or finding the desired place to store it takes a lot of time

The disk is organized into blocks

Entire blocks are moved to and from a region of main memory called a buffer

A key technique for speeding up database operations is to arrange the data so that when one piece of a data block is needed, it is likely that other data on the same block will be needed at the same time

The same idea applies to the other levels of the hierarchy




13.1.3 Volatile and Non-Volatile Storage

A volatile device forgets the data stored on it when the power is turned off

A non-volatile device holds data even when the device is turned off

Secondary and tertiary storage devices are non-volatile

Main memory is volatile

13.1.4 Virtual Memory


A computer-system technique which gives an application program the impression that it has contiguous working memory (an address space), while in fact it may be physically fragmented and may even overflow onto disk storage

The technique makes programming of large applications easier and uses real physical memory (e.g. RAM) more efficiently

Typical software executes in virtual memory

The address space is typically 32 bits, i.e. 2^32 bytes or 4 GB

Transfer between memory and disk is in units of blocks


13.2.1 Mechanism of Disk



Mechanisms of Disks

The use of secondary storage is one of the important characteristics of a DBMS

A disk consists of two moving assemblies:

1. the disk assembly

2. the head assembly

The disk assembly consists of one or more platters

Platters rotate around a central spindle

Bits are stored on the upper and lower surfaces of the platters

13.2.1 Mechanism of Disk



The disk surface is organized into tracks

The tracks that are at a fixed radius from the center, across all surfaces, form one cylinder

Tracks are organized into sectors

Sectors are segments of the track's circle, separated by gaps






13.2.2 Disk Controller


One or more disks are controlled by disk controllers

Disk controllers are capable of:

Controlling the mechanical actuator that moves the head assembly

Selecting a sector from among all those in the cylinder at which the heads are positioned

Transferring bits between the desired sector and main memory

Possibly buffering an entire track

13.2.3 Disk Access Characteristics


Accessing (reading/writing) a block requires 3 steps, each with an associated delay (see the sketch below):

The disk controller positions the head assembly at the cylinder containing the track on which the block is located. The time to do this is the seek time.

The disk controller waits while the first sector of the block moves under the head. This is the rotational latency.

All the sectors and the gaps between them pass under the head while the disk controller reads or writes the data in these sectors. This is the transfer time.
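A minimal Python sketch of this cost model; the seek, rotation-speed and sector figures used below are illustrative assumptions, not the Megatron 747's actual parameters:

    # latency.py - toy disk access-time model: seek + rotational latency + transfer
    def block_access_time_ms(seek_ms, rpm, sectors_per_track, sectors_per_block):
        rotation_ms = 60_000.0 / rpm                 # time for one full rotation
        rotational_latency_ms = rotation_ms / 2      # on average, half a rotation
        transfer_ms = rotation_ms * sectors_per_block / sectors_per_track
        return seek_ms + rotational_latency_ms + transfer_ms

    # Example: 6.46 ms average seek, 7200 RPM, 256 sectors/track, 16-sector block
    print(round(block_access_time_ms(6.46, 7200, 256, 16), 2), "ms")   # about 11 ms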


13.3 Accelerating Access to Secondary Storage

Several approaches for more efficiently accessing data in secondary storage:

Place blocks that are accessed together in the same cylinder.

Divide the data among multiple disks.

Mirror disks.

Use disk-scheduling algorithms.

Prefetch blocks into main memory.

Scheduling latency: the added delay in accessing data caused by a disk-scheduling algorithm.

Throughput: the number of disk accesses per second that the system can accommodate.


13.3.1 The I/O Model of Computation


The number of block accesses (disk I/Os) is a good approximation of the running time of an algorithm.

The number of disk I/Os is proportional to the time taken, so it should be minimized.

Ex 13.3: You want an index on relation R that identifies the block on which the desired tuple appears, but not where on the block it resides.

For the Megatron 747 (M747) example, it takes about 11 ms to read a 16 KB block.

The delay in searching for the desired tuple within the block, once it is in main memory, is negligible by comparison.

13.3.2 Organizing Data by Cylinders


The first seek time and the first rotational latency can never be avoided.

Ex 13.4: We request 1024 blocks of the M747.

If the data is randomly distributed, the average latency is 10.76 ms per block (Ex 13.2), making the total latency about 11 s.

If all blocks are stored consecutively on one cylinder:

6.46 ms (one average seek) + 8.33 ms (time per rotation) * 16 (number of rotations) = 139 ms

(The sketch below reproduces this arithmetic.)
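A small Python check of the arithmetic in Ex 13.4 (and Ex 13.5 below); the 6.46 ms seek and 8.33 ms rotation figures come from the example, while the assumption that 1024 blocks span 16 rotations (64 blocks per track) is inferred from the stated total:

    # cylinder.py - sequential reads from one cylinder vs. striping over several disks
    SEEK_MS = 6.46          # one average seek
    ROTATION_MS = 8.33      # time for one full rotation
    BLOCKS = 1024
    BLOCKS_PER_TRACK = 64   # assumption: 1024 blocks take 16 full rotations

    rotations = BLOCKS / BLOCKS_PER_TRACK                       # 16 rotations
    one_cylinder_ms = SEEK_MS + ROTATION_MS * rotations
    print("one cylinder:", round(one_cylinder_ms, 1), "ms")     # about 139 ms

    disks = 4                                                   # Ex 13.5: stripe over 4 disks
    striped_ms = SEEK_MS + ROTATION_MS * (rotations / disks)
    print("striped over 4 disks:", round(striped_ms, 1), "ms")  # about 39.8 ms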

13.3.3 Using Multiple Disks


Performance improves by a factor roughly proportional to the number of disks used.

Striping: distributing a relation R across n disks following this pattern (a small sketch of the mapping follows this list):

Data on disk 1: R1, R(1+n), R(1+2n), ...

Data on disk 2: R2, R(2+n), R(2+2n), ...

...

Data on disk n: Rn, R(2n), R(3n), ...
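A minimal Python illustration of this round-robin block-to-disk mapping (disk and block numbering here is 0-based, purely for illustration):

    # striping.py - which disk holds block i when striping over n disks
    def disk_for_block(i: int, n_disks: int) -> int:
        return i % n_disks

    n = 4
    for i in range(12):
        print(f"block R{i} -> disk {disk_for_block(i, n)}")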



Ex 13.5: We request 1024 blocks with n = 4 disks.

6.46 ms (one average seek) + 8.33 ms (time per rotation) * (16/4) (rotations per disk) = 39.8 ms

13.3.4 Mirroring Disks


Mirroring disks: having 2 or more disks hold identical copies of the data.

Benefit 1: If n disks are mirrors of each other, the system can survive crashes of up to n-1 disks.

Benefit 2: With n disks, read performance increases by a factor of n, since reads can be spread over the copies.

Increased performance means increased efficiency.

13.3.5 Disk Scheduling and the Elevator Problem

The disk controller runs a scheduling algorithm to select which of several pending requests to process first. The elevator algorithm sweeps the head in one direction, servicing requests as it passes them, and reverses direction at the end. Pseudocode (a runnable Python version follows):

requests = []            # all not-yet-processed data requests

upon receiving a new data request:
    requests.append(new request)

while requests is not empty:
    move head to the next cylinder in the current direction
    if head is at a cylinder with a request in requests:
        retrieve the data
        remove that request from requests
    if head reaches the last cylinder in this direction:
        reverse head direction
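A compact Python sketch of the elevator strategy. It only orders a set of pending cylinder requests; it does not reproduce the timing model of the worked example below, where requests also arrive while the head is moving:

    # elevator.py - order pending cylinder requests using the elevator (SCAN) strategy
    def elevator_order(head, pending, direction=+1):
        """Return the order in which pending cylinder requests are serviced."""
        order = []
        remaining = sorted(pending)
        while remaining:
            if direction > 0:
                ahead = [c for c in remaining if c >= head]
            else:
                ahead = [c for c in remaining if c <= head]
            if not ahead:
                direction = -direction      # end of sweep: reverse
                continue
            nxt = min(ahead) if direction > 0 else max(ahead)
            order.append(nxt)
            remaining.remove(nxt)
            head = nxt
        return order

    print(elevator_order(0, [8000, 24000, 56000, 16000, 64000, 40000]))
    # [8000, 16000, 24000, 40000, 56000, 64000]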

13.3.5 Disk Scheduling and the Elevator Problem (cont'd)

Example events (cylinders are at 8000, 16000, ..., 64000; the head starts at the low end at time 0):

time 0    : requests arrive for data at cylinders 8000, 24000 and 56000
time 4.3  : get data at 8000
time 10   : request arrives for data at 16000
time 13.6 : get data at 24000
time 20   : request arrives for data at 64000
time 26.9 : get data at 56000
time 30   : request arrives for data at 40000
time 34.2 : get data at 64000 (head reverses direction)
time 45.5 : get data at 40000
time 56.8 : get data at 16000


13.3.5 Disk Scheduling and the Elevator Problem (cont'd)

Elevator algorithm (cylinder : completion time):
8000  :  4.3
24000 : 13.6
56000 : 26.9
64000 : 34.2
40000 : 45.5
16000 : 56.8

FIFO algorithm (cylinder : completion time):
8000  :  4.3
24000 : 13.6
56000 : 26.9
16000 : 42.2
64000 : 59.5
40000 : 70.8

The elevator algorithm finishes all requests at time 56.8, while FIFO, which services requests strictly in arrival order, does not finish until 70.8.

13.3.6 Prefetching and Large-Scale Buffering

If, at the application level, we can predict the order in which blocks will be requested, we can load them into main memory before they are needed.

This reduces access cost and saves time.

13.4 Disk Failures

Intermittent Error: a read or write is unsuccessful on one attempt but succeeds on a retry.

If we try to read a sector and the correct contents of that sector are not delivered to the disk controller, the controller can retry and check whether the sector is good or bad. To check whether a write was correct, a read of the sector is performed; the read operation tells us whether the sector is good or bad.

Checksums: Each sector has some additional bits, called the checksum, that are set depending on the values of the data bits stored in that sector. With a checksum, the probability of accepting a bad read as correct is small. If the data has an odd number of 1's, the parity bit is 1; if it has an even number of 1's, the parity bit is 0. Either way, the total number of 1's, including the parity bit, is always even. (A small parity sketch follows the examples below.)



Examples:

1. Sequence 01101000 has an odd number of 1's, so the parity bit is 1: 011010001

2. Sequence 11101110 has an even number of 1's, so the parity bit is 0: 111011100
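A few lines of Python that implement this even-parity rule:

    # parity.py - append a parity bit so the total number of 1's is even
    def add_parity(bits: str) -> str:
        parity = bits.count("1") % 2        # 1 if the count of 1's is odd, else 0
        return bits + str(parity)

    def parity_ok(bits_with_parity: str) -> bool:
        return bits_with_parity.count("1") % 2 == 0

    print(add_parity("01101000"))   # 011010001
    print(add_parity("11101110"))   # 111011100
    print(parity_ok("011010001"))   # True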



Stable-Storage Writing Policy:

This policy guards against media decay, the failure in which, after we overwrite a sector, the new data cannot be read back correctly. Sectors are paired; each pair represents one logical sector X, with a left copy Xl and a right copy Xr. To read X we try one copy and check its parity; if it is bad we read the other copy, retrying until a good value is returned. Writes update Xl first and then Xr.

The general term for redundancy-based strategies of this kind is RAID, Redundant Arrays of Independent Disks.


Mirroring:

The mirroring scheme is referred to as RAID level 1 protection against data loss. In this scheme we mirror each disk: one of the disks is called the data disk and the other the redundant disk. Data can be lost only if the second disk crashes while the first crash is being repaired.

Parity Blocks:

The RAID level 4 scheme uses only one redundant disk no matter how many data disks there are. On the redundant disk, the i-th block consists of parity checks for the i-th blocks of all the data disks: the j-th bits of the i-th blocks of all the data disks and of the redundant disk must together have an even number of 1's, and the redundant disk's bit is chosen to make this condition true.







Failures: If one of Xl and Xr fails, X can still be read from the other; X is unreadable only if both fail, and the probability of that is very small.

Write Failure: If the power fails during a write,

1. while writing Xl, the old Xr remains good, and X can be read from Xr;

2. after Xl has been written, X can be read from Xl, since Xr may or may not yet hold the correct copy of X.

Recovery from Disk Crashes:

To reduce data loss from disk crashes, we apply schemes that involve redundancy, extending the ideas of parity checks and duplicated sectors.





Parity Block - Writing

When we write a new block on a data disk, we must also change the corresponding block of the redundant disk.

One approach is to read the corresponding blocks of all the other data disks, compute the modulo-2 (bitwise XOR) sum with the new block, and write the result to the redundant disk.

This approach requires n-1 reads of data blocks, one write of the data block, and one write of the redundant block: a total of n+1 disk I/Os. (A sketch of the parity arithmetic follows.)
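A small Python sketch of the RAID 4 parity arithmetic. The update_parity variant at the end (old data XOR new data XOR old parity) is the standard trick for avoiding the n-1 reads; it is included here as an aside, not as part of the summary above.

    # raid4_parity.py - parity block maintenance via bitwise XOR (modulo-2 sums)
    from functools import reduce

    def parity(blocks):
        """Parity block = bytewise XOR of all data blocks."""
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    def update_parity(old_parity, old_block, new_block):
        """Recompute parity after one block changes, without reading other disks."""
        return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_block, new_block))

    d1, d2, d3 = b"\x11\x22", b"\x0f\xf0", b"\xaa\x55"
    p = parity([d1, d2, d3])
    new_d2 = b"\xff\x00"
    assert update_parity(p, d2, new_d2) == parity([d1, new_d2, d3])
    print("parity:", p.hex(), "updated:", update_parity(p, d2, new_d2).hex())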


RAID 5

RAID 4 is effective in preserving data unless there are two simultaneous disk crashes.

Error-correcting codes theory, known as Hamming codes, leads to RAID level 6.

With this strategy, two simultaneous crashes are correctable. For example, with data disks 1-4 and redundant disks 5-7 (a short sketch follows):

The bits of disk 5 are the modulo-2 sum of the corresponding bits of disks 1, 2, and 3.

The bits of disk 6 are the modulo-2 sum of the corresponding bits of disks 1, 2, and 4.

The bits of disk 7 are the modulo-2 sum of the corresponding bits of disks 1, 3, and 4.
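A short Python sketch of this 7-disk layout (4 data disks, 3 redundant), computing each redundant disk as the XOR of the data disks listed above; the one-byte "disks" are just toy values:

    # raid6_hamming.py - redundant disks 5, 6, 7 for data disks 1-4
    def xor_blocks(*blocks):
        return bytes(b1 ^ b2 ^ b3 for b1, b2, b3 in zip(*blocks))

    d1, d2, d3, d4 = b"\x11", b"\x22", b"\x44", b"\x88"
    disk5 = xor_blocks(d1, d2, d3)     # parity of disks 1, 2, 3
    disk6 = xor_blocks(d1, d2, d4)     # parity of disks 1, 2, 4
    disk7 = xor_blocks(d1, d3, d4)     # parity of disks 1, 3, 4
    print(disk5.hex(), disk6.hex(), disk7.hex())   # 77 bb dd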

Coping With Multiple Disk Crashes


Reading/Writing

We may read data from any data disk normally.

To write a block of some data disk, we compute the modulo-2 sum of the new and old versions of that block. These bits are then added, again as a modulo-2 sum, to the corresponding blocks of all those redundant disks that have a 1 in the row in which the written disk also has a 1.

Whatever scheme we use for updating the disks, we need to read and write the redundant disk's block. If there are n data disks, then the number of writes to the redundant disk will be n times the average number of writes to any one data disk.

However, we do not have to treat one disk as the redundant disk and the others as data disks. Rather, we could treat each disk as the redundant disk for some of the blocks. This improvement is often called RAID level 5.



13.5 Arranging Data on Disk

Data elements are represented as records, which are stored in consecutive bytes within the same disk block.

Basic layout technique for storing data: Fixed-Length Records.

Allocation criterion: data should start at a word boundary.

A fixed-length record header contains:

1. A pointer to the record schema.

2. The length of the record.

3. Timestamps indicating when the record was last modified or last read.














Example

CREATE TABLE employee(
    name      CHAR(30) PRIMARY KEY,
    address   VARCHAR(255),
    gender    CHAR(1),
    birthdate DATE
);

Each record should start at a word boundary and contains a header plus the four fields name, address, gender and birthdate. (A sketch of packing such a fixed-length record is given below.)
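An illustrative Python sketch of a fixed-length layout for this record, using the struct module. The header contents (schema pointer, length, timestamp) follow the list above, but the exact byte widths and field encodings are assumptions made for the example.

    # fixed_record.py - pack one employee tuple into a fixed-length byte string
    import struct, time

    # header: schema pointer (4 bytes), record length (4), last-modified timestamp (4)
    # body:   name CHAR(30), address padded to 255 bytes, gender CHAR(1), birthdate 'YYYYMMDD'
    FMT = "<III30s255s1s8s"
    RECORD_LEN = struct.calcsize(FMT)

    def pack_employee(name, address, gender, birthdate, schema_ptr=1):
        return struct.pack(FMT, schema_ptr, RECORD_LEN, int(time.time()),
                           name.encode().ljust(30), address.encode().ljust(255),
                           gender.encode(), birthdate.encode())

    rec = pack_employee("Jane Doe", "1 First St", "F", "19900101")
    print(RECORD_LEN, len(rec))    # both print the same fixed length (306)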



Packing Fixed-Length Records into Blocks

Records are stored in the form of blocks on the disk, and they are moved into main memory when we need to update or access them.

A block header is written first, and it is followed by the series of records.

The block header contains the following information:

Links to one or more other blocks that are part of a network of blocks.

Information about the role played by this block in such a network.

Information about which relation the tuples in this block belong to.

A "directory" giving the offset of each record in the block.

Timestamp(s) indicating the time of the block's last modification and/or access.

Following the header we pack as many records as will fit in one block, as shown in the figure; the remaining space is unused.


13.6 Representing Block and Record Addresses


Address of a Block and of a Record

In main memory:

The address of the block is the virtual-memory address of its first byte.

The address of a record within the block is the virtual-memory address of the first byte of the record.

In secondary memory: a sequence of bytes describes the location of the block in the overall system, e.g. the device ID for the disk, the cylinder number, and so on.




Addresses in Client-Server Systems

Addresses in the address space are represented in two ways:

Physical addresses: byte strings that determine the place within the secondary storage system where the record can be found.

Logical addresses: arbitrary strings of bytes of some fixed length.

Physical address bits are used to indicate:

the host to which the storage is attached,

an identifier for the disk,

the number of the cylinder,

the number of the track,

the offset of the beginning of the record.

A map table relates logical addresses to physical addresses; each entry pairs a logical address with the corresponding physical address.


Logical and Structured Addresses

What is the purpose of a logical address?

It gives more flexibility when we:

move a record around within its block, or

move the record to another block.

It also gives us options for deciding what to do when a record is deleted.


Pointer Swizzling

Having pointers is common in object-relational database systems.

It is important to learn about the management of such pointers.

Every data item (block, record, etc.) has two addresses:

its database address: the address on disk, and

its memory address, if the item is currently in virtual memory.

Translation Table: maps database addresses to memory addresses; each entry pairs a database address (Dbaddr) with the corresponding memory address (Mem-addr).

All addressable items in the database have entries in the map table, while only those items currently in memory are mentioned in the translation table.


A pointer consists of the following two fields:

a bit indicating the type of address, and

the database or memory address itself.

Example 13.17 (figure): a disk holding Block 1 and Block 2, and a memory-resident copy of Block 1, with one swizzled pointer (pointing into memory) and one unswizzled pointer (still a database address).


Example 13.7


Block 1 has a record with pointers to a second record on the
same block and to a record on another block


If Block 1 is copied to the memory


The first pointer which points within Block 1 can be
swizzled so it points directly to the memory address of the
target record


Since Block 2 is not in memory, we cannot swizzle the
second pointer


Three types of swizzling:

Automatic swizzling: as soon as a block is brought into memory, swizzle all relevant pointers.

Swizzling on demand: only swizzle a pointer if and when it is actually followed.

No swizzling: pointers are not swizzled; they are always accessed using the database address.

Unswizzling

When a block is moved from memory back to disk, all pointers must be converted back to database (disk) addresses.

The translation table is used again, so it is important to have an efficient data structure for it (a toy sketch follows).
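A toy Python sketch of a translation table with swizzle/unswizzle helpers; the dictionary-based structure and the names used here are illustrative assumptions, not the book's implementation:

    # swizzle.py - map database (disk) addresses to in-memory objects and back
    translation = {}            # db_addr -> in-memory object
    reverse = {}                # id(object) -> db_addr, used when unswizzling

    def load(db_addr, obj):
        translation[db_addr] = obj
        reverse[id(obj)] = db_addr

    def swizzle(pointer):
        """Replace a ('db', addr) pointer by a ('mem', obj) pointer if possible."""
        kind, target = pointer
        if kind == "db" and target in translation:
            return ("mem", translation[target])
        return pointer           # target block not in memory: leave unswizzled

    def unswizzle(pointer):
        kind, target = pointer
        if kind == "mem":
            return ("db", reverse[id(target)])
        return pointer

    load(0xB1, {"name": "record r2"})
    p = swizzle(("db", 0xB1))
    print(p, unswizzle(p))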




Pinned records and Blocks


A block in memory is said to be pinned if it cannot safely be written back to disk.

If block B1 has a swizzled pointer to an item in block B2, then B2 is pinned.

To unpin a block, we must unswizzle any pointers to it:

keep in the translation table the places in memory that hold swizzled pointers to that item, and

unswizzle those pointers, using the translation table to replace the memory addresses with database (disk) addresses.




13.7 Records With Variable-Length Fields

A simple but effective scheme is to put all fixed-length fields ahead of the variable-length fields. We then place in the record header:

1. The length of the record.

2. Pointers to (i.e., offsets of) the beginnings of all the variable-length fields. However, if the variable-length fields always appear in the same order, then the first of them needs no pointer; we know it immediately follows the fixed-length fields.




Records With Repeating Fields

A similar situation occurs if a record contains a variable number of occurrences of a field F, but the field itself is of fixed length. It is sufficient to group all occurrences of field F together and put in the record header a pointer to the first.

We can locate all the occurrences of field F as follows. Let the number of bytes devoted to one instance of field F be L. We then add to the offset for field F all integer multiples of L, starting at 0, then L, 2L, 3L, and so on. Eventually we reach the offset of the field following F, whereupon we stop.

An alternative representation is to keep the record itself of fixed length, and put the variable-length portion - be it fields of variable length or fields that repeat an indefinite number of times - on a separate block. In the record itself we keep:

1. pointers to the place where each repeating field begins, and

2. either how many repetitions there are, or where the repetitions end.



Variable-Format Records

The simplest representation of variable-format records is a sequence of tagged fields, each of which consists of:

1. Information about the role of this field, such as:

(a) the attribute or field name,

(b) the type of the field, if it is not apparent from the field name and some readily available schema information, and

(c) the length of the field, if it is not apparent from the type.

2. The value of the field.

There are at least two reasons why tagged fields make sense:

1. Information-integration applications. Sometimes a relation has been constructed from several earlier sources, and these sources have different kinds of information. For instance, our movie-star information may have come from several sources, one of which records birthdates, some give addresses, others do not, and so on. If there are not too many fields, we are probably best off leaving NULL those values we do not know.

2. Records with a very flexible schema. If many fields of a record can repeat and/or not appear at all, then even if we know the schema, tagged fields may be useful. For instance, medical records may contain information about many tests, but there are thousands of possible tests, and each patient has results for relatively few of them.


These large values have a variable length, but even if the length is fixed for all values of the type, we need to use some special techniques to represent them. In this section we consider a technique called "spanned records" that can be used to manage records that are larger than blocks.

Spanned records are also useful in situations where records are smaller than blocks, but packing whole records into blocks wastes significant amounts of space.

For both these reasons, it is sometimes desirable to allow records to be split across two or more blocks. The portion of a record that appears in one block is called a record fragment. If records can be spanned, then every record and record fragment requires some extra header information:

1. Each record or fragment header must contain a bit telling whether or not it is a fragment.

2. If it is a fragment, then it needs bits telling whether it is the first or last fragment for its record.

3. If there is a next and/or previous fragment for the same record, then the fragment needs pointers to these other fragments.

Storing spanned records across blocks (figure).

BLOBS

Binary Large OBjectS = BLOBs.

BLOBs can be images, movies, audio files and other very large values that can be stored in files.

Storing BLOBs:

Stored across several blocks.

It is preferable to store them consecutively on a cylinder, or striped across multiple disks, for efficient retrieval.

Retrieving BLOBs:

A client retrieving a 2-hour movie may not want it all at the same time.

Retrieving a specific part of the large value requires an index structure to make it efficient (example: an index by seconds on a movie BLOB).

Column Stores:

An alternative to storing tuples as records is to store each column as a record. Since an entire column of a relation may occupy far more than a single block, these records may span many blocks, much as long files do. If we keep the values in each column in the same order, then we can reconstruct the relation from the column records.


13.8 Record Modifications

Insertion:

Insertion of records without a fixed order:

Records can be placed in a block with empty space or in a new block.

Insertion of records in a fixed order:

If space is available in the block, the new record is placed there.

If no space is available in the block, we must work outside the block; a structured address or a pointer to the record from outside the block lets records be moved without invalidating references.

Insertion in a fixed order, space available within the block:

Use an offset table in the header of each block, with pointers to the location of each record in the block.

The records are slid within the block and the pointers in the offset table are adjusted.

No space available within the block (working outside the block):

Find space on a "nearby" block.

If no space is available on the block, look at the following block in the sorted order of blocks.

If space is available in that block, move the highest records of block 1 to block 2, and slide the records around on both blocks.

Create an overflow block:

Records can be stored in an overflow block.

Each block has a place in its header for a pointer to an overflow block. The overflow block can point to a second overflow block, and so on.



Deletion:

Recovering space after deletion:

When using an offset table, the records can be slid around the block so that there is a single unused region in the center that can be reused.

If we cannot slide records, an available-space list can be maintained in the block header; the list head goes in the block header and the available regions themselves hold the links of the list.

Use of tombstones:

A tombstone is left in place of a deleted record so that pointers to the deleted record do not end up pointing to a new record that reuses its space.

The tombstone is permanent until the entire database is reconstructed.

If pointers go to fixed locations from which the location of the record is found, then we put the tombstone in that fixed location.

Where a tombstone is placed depends on the nature of the record pointers; a map table is used to translate a logical record address to a physical address.


Updating Records

For fixed-length records, an update has no effect on the storage system, since the record occupies exactly the same space as before.

For variable-length records:

If the length increases, we may have to "slide the records", as for insertion, and possibly create an overflow block.

If the length decreases, we update the available-space list and recover the space, or eliminate overflow blocks, as for deletion.

Chapter 18

18.1 Serial and Serializable Schedules

The process of ensuring that transactions preserve consistency when executing simultaneously is called concurrency control.

This consistency is the responsibility of the scheduler.

Concurrency control in a database management system (DBMS) ensures that database transactions are performed concurrently without violating the data integrity of the database.

Executed transactions should follow the ACID rules. The DBMS must guarantee that only serializable (unless serializability is intentionally relaxed) and recoverable schedules are generated.

It also guarantees that no effect of committed transactions is lost, and no effect of aborted (rolled back) transactions remains in the database.


ACID rules

Atomicity - Either the effects of all of a transaction's operations remain, or none of them do, when the transaction completes; to the outside world the transaction appears indivisible, atomic.

Consistency - Every transaction must leave the database in a consistent state.

Isolation - Transactions cannot interfere with each other. Providing isolation is the main goal of concurrency control.

Durability - The effects of successful transactions must persist through crashes.





In the field of databases, a schedule is a list of actions (i.e. reading, writing, aborting, committing) from a set of transactions.

In this example, schedule D consists of 3 transactions T1, T2, T3. The schedule describes the actions of the transactions as seen by the DBMS. T1 reads and writes object X, then T2 reads and writes object Y, and finally T3 reads and writes object Z. This is a serial schedule, because the actions of the 3 transactions are not interleaved.

Serial and Serializable Schedules:

A schedule that is equivalent to a serial schedule has the serializability property.

In schedule E, the order in which the actions of the transactions are executed is not the same as in D, but in the end E gives the same result as D.



Serial Schedule: T1 precedes T2

T1                T2                A      B
                                    50     50
READ (A,t)
t := t+100
WRITE (A,t)                         150
READ (B,t)
t := t+100
WRITE (B,t)                                150
                  READ (A,s)
                  s := s*2
                  WRITE (A,s)       300
                  READ (B,s)
                  s := s*2
                  WRITE (B,s)              300



Non-Serializable Schedule

T1                T2                A      B
                                    50     50
READ (A,t)
t := t+100
WRITE (A,t)                         150
                  READ (A,s)
                  s := s*2
                  WRITE (A,s)       300
                  READ (B,s)
                  s := s*2
                  WRITE (B,s)              100
READ (B,t)
t := t+100
WRITE (B,t)                                200

(A ends at 300 but B ends at 200, so the effect is not that of any serial schedule.)



A Serializable Schedule with details

T1                T2                A      B
                                    50     50
READ (A,t)
t := t+100
WRITE (A,t)                         150
                  READ (A,s)
                  s := s*1
                  WRITE (A,s)       150
                  READ (B,s)
                  s := s*1
                  WRITE (B,s)              50
READ (B,t)
t := t+100
WRITE (B,t)                                150

(Because T2 only multiplies by 1, this interleaving still leaves A = B, so the schedule is serializable even though it has the same shape as the non-serializable one above.)


18.2 Conflict Serializability

Non-Conflicting Actions

Two actions are non-conflicting if, whenever they occur consecutively in a schedule, swapping them does not affect the final state produced by the schedule. Otherwise, they are conflicting.

Conflicting Actions: General Rules

Two actions of the same transaction conflict, e.g.:

r1(A) w1(B)

Two actions on the same database element conflict if at least one of them is a write, e.g.:

r1(A) w2(A)

w1(A) w2(A)


Conflict Serializable:

We may take any schedule and make as many non-conflicting swaps of adjacent actions as we wish, with the goal of turning the schedule into a serial schedule.

If we can do so, then the original schedule is serializable, because its effect on the database state remains the same as we perform each of the non-conflicting swaps.

A schedule is said to be conflict-serializable when it is conflict-equivalent to one or more serial schedules.

An equivalent definition: a schedule is conflict-serializable if and only if there exists an acyclic precedence graph (serializability graph) for the schedule.

For example, a schedule may be conflict-equivalent to the serial schedule <T1,T2>, but not to <T2,T1>.



Conflict equivalent / conflict-serializable

Let Ai and Aj be consecutive non-conflicting actions that belong to different transactions. We can swap Ai and Aj without changing the result.

Two schedules are conflict-equivalent if they can be turned one into the other by a sequence of non-conflicting swaps of adjacent actions.

We call a schedule conflict-serializable if it is conflict-equivalent to a serial schedule.

Test for conflict-serializability

Construct the precedence graph for S and check whether it contains any cycles.

If it does, then S is not conflict-serializable.

Otherwise, S is a conflict-serializable schedule.

Example of a cyclic precedence graph (a small cycle-detection sketch follows):

Consider the schedule

S1: r2(A); r1(B); w2(A); r2(B); r3(A); w1(B); w3(A); w2(B);

Observing the actions on A, we find that T2 <S1 T3.

But when we observe B, we get both T1 <S1 T2 and T2 <S1 T1. Thus the graph has a cycle between T1 and T2, so S1 is not conflict-serializable.
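A compact Python sketch that builds the precedence graph for a schedule and tests it for cycles; the schedule encoding (a list of (transaction, action, element) triples) is just a convenient assumption for this example:

    # precedence.py - conflict-serializability test via the precedence graph
    from collections import defaultdict

    S1 = [(2,'r','A'), (1,'r','B'), (2,'w','A'), (2,'r','B'),
          (3,'r','A'), (1,'w','B'), (3,'w','A'), (2,'w','B')]

    def precedence_graph(schedule):
        edges = defaultdict(set)
        for i, (ti, ai, xi) in enumerate(schedule):
            for tj, aj, xj in schedule[i+1:]:
                if ti != tj and xi == xj and 'w' in (ai, aj):   # conflicting pair
                    edges[ti].add(tj)
        return edges

    def has_cycle(edges):
        seen, on_stack = set(), set()
        def dfs(u):
            seen.add(u); on_stack.add(u)
            for v in edges[u]:
                if v in on_stack or (v not in seen and dfs(v)):
                    return True
            on_stack.discard(u)
            return False
        return any(dfs(u) for u in list(edges) if u not in seen)

    print(has_cycle(precedence_graph(S1)))   # True -> S1 is not conflict-serializable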



Why the precedence-graph test works

A cycle in the graph puts too many constraints on the order of transactions in a hypothetical conflict-equivalent serial schedule.

If there is a cycle involving n transactions T1 -> T2 -> ... -> Tn -> T1, then in the hypothetical serial order the actions of T1 must precede those of T2, which must precede those of T3, and so on up to Tn.

But the actions of Tn are also required to precede those of T1, a contradiction.

So, if there is a cycle in the graph, the schedule is not conflict-serializable.



18.3 Enforcing Serializability by Locks

Locks

The mechanism works as follows:

A transaction sends a request to the scheduler.

The scheduler checks the lock table.

The scheduler generates a serializable schedule of actions.

Consistency of transactions:

Actions and locks must correspond: a transaction can read or write an element only if it holds a lock on it and has not yet released that lock.

Unlocking every locked element is compulsory.

Legality of schedules:

Two transactions cannot hold a lock on the same element unless the first has released it.

Locking scheduler:

Grants lock requests only if the result is a legal schedule.

A lock table stores the information about the current locks on the elements.

A legal schedule of consistent transactions is, unfortunately, not necessarily serializable.

The locking scheduler delays requests that would result in an illegal schedule.

Two-phase locking (2PL)

Guarantees that a legal schedule of consistent transactions is conflict-serializable.

In each transaction, all lock requests precede all unlock requests.

The growing phase: obtain locks; no unlocks allowed.

The shrinking phase: release locks; no new locks allowed.

Failure of 2PL:

2PL provides no protection against deadlocks.




Shared & Exclusive Locks:

Consistency of transactions:

A transaction cannot write without holding an exclusive lock.

A transaction cannot read without holding some lock.

This works on the following principles:

1. Consistency of transactions:

A read action must be preceded by a shared or an exclusive lock on the element.

A write action must be preceded by an exclusive lock on the element.

All locks must be released before the transaction ends.

2. Two-phase locking of transactions:

Locking must precede unlocking.

3. Legality of schedules:

An element may be locked exclusively by one transaction or in shared mode by several, but not both.

Compatibility Matrices:

A compatibility matrix has a row and a column for each lock mode.

Rows correspond to a lock already held on an element by another transaction.

Columns correspond to the mode of the lock requested.

Example: the column for S says that we can grant a shared lock on an element only if the locks currently held on that element are all shared locks.

Upgrading Locks

Suppose a transaction wants to read as well as write an element:

It first acquires a shared lock on the element,

performs its calculations on the element,

and when it is ready to write, it is granted an exclusive lock (an upgrade).

Transactions whose read/write needs are not known in advance can use upgrading.

Indiscriminate use of upgrading can produce a deadlock (a limitation): for example, two transactions both hold shared locks on the same element and both want to upgrade.


18.4 Locking Systems with Several Lock Modes

Locking scheme:

Shared/Read lock (for reading)

Exclusive/Write lock (for writing)

Compatibility matrices

Upgrading locks

Update locks

Increment locks

Shared and exclusive locks, two-phase locking, and legality of schedules are as described in 18.3 above.

Compatibility matrix for shared (S) and exclusive (X) locks (rows: lock held; columns: lock requested):

            S      X
    S      YES     NO
    X      NO      NO

Update Locks

Solve the deadlock that can occur with the lock-upgrading method.

A transaction holding an update lock can read the element but cannot write it.

The update lock can later be converted (upgraded) to an exclusive lock.

An update lock can be granted only when the element holds at most shared locks.

Increment Locks

Used for incrementing and decrementing stored values.

E.g. transferring money between bank accounts, or ticket-selling transactions in which the number of seats is decremented after each sale.

An increment lock enables neither reads nor writes of the element; it only permits increment actions.

Any number of transactions can hold an increment lock on an element at the same time, since increments commute.

Shared and exclusive locks cannot be granted on an element while an increment lock is held on it.


18.5 The Locking Scheduler

The order in which the individual steps of different transactions occur is regulated by the scheduler.

The general process of assuring that transactions preserve consistency when executing simultaneously is called concurrency control.

Architecture of a Locking Scheduler

The transactions themselves do not request locks, or cannot be relied upon to do so. It is the job of the scheduler to insert lock actions into the stream of reads, writes and other actions that access data.

Transactions also do not release locks. Rather, the scheduler releases the locks when the transaction manager tells it that the transaction will commit or abort.

Role of a Scheduler


Lock Table







Lock Table



The lock table is a relation that associates database elements
with locking information about that element.


The table is implemented with a hash table using database
elements as a hash key.


Size of Lock Table


The size of the table is proportional to the number of locked
elements only and not to the entire size of the database since
any element that is not locked does not appear in the table.


Group Mode


The group mode is a summary of the most stringent
conditions that a transaction requesting a new lock on an
element faces. Rather than comparing the lock request with
every lock held by another transaction on the same element,
we can simplify the grant/deny decision by comparing the
request with only the group mode.



Structure of Lock Table Entries


Handling Lock Requests

Suppose transaction T requests a lock on A.

If there is no lock-table entry for A, then surely there are no locks on A, so the entry is created and the request is granted.

If the lock-table entry for A exists, then we use it to guide the decision about the lock request.

Handling Unlocks

If the value of the waiting field is 'yes', then we need to grant one or more locks from the list of requested locks. The different approaches for this are:

First-come-first-served

Priority to shared locks

Priority to upgrading

18.6 Managing Hierarchies of Database Elements

This section focuses on two problems that come up when there is a tree structure to our data:

1. A hierarchy of lockable elements, and how to allow locks on both large elements, like relations, and the elements within them, such as blocks and tuples of a relation, or individual fields.

2. Data that is itself organized in a tree; a major example would be a B-tree index.


Locks With Multiple Granularity

"Database elements" of various sizes can be used as the unit of locking, e.g. tuples, pages or blocks, and whole relations.

Example: a bank database.

Small-granularity locks: greater concurrency can be achieved.

Large-granularity locks: sometimes save us from unserializable behavior.



Warning Locks

The solution to the problem of managing locks at different granularities involves a new kind of lock called a "warning" lock.

It is helpful in hierarchical or nested structures.

It involves both "ordinary" locks and "warning" locks:

Ordinary locks: Shared (S) and Exclusive (X) locks.

Warning locks: Intention to take a shared lock (IS) and Intention to take an exclusive lock (IX).






Warning Protocol

1. To place an ordinary S or X lock on any element, we must begin at the root of the hierarchy.

2. If we are at the element that we want to lock, we need look no further: we request the S or X lock there.

3. If the element we want is further down the hierarchy, we place a warning lock on this node (IS or IX, corresponding to the shared or exclusive lock we ultimately want), move on to the appropriate child, and repeat steps 2 and 3 until we reach the desired element, where we request the shared or exclusive lock.








Compatibility Matrix (rows: lock held; columns: lock requested)

            IS     IX     S      X
    IS     YES    YES    YES    NO
    IX     YES    YES    NO     NO
    S      YES    NO     YES    NO
    X      NO     NO     NO     NO

IS column: conflicts only with a held X lock.

IX column: conflicts with held S and X locks.

S column: conflicts with held IX and X locks.

X column: conflicts with every held lock.

Warning Protocol Example

Consider the relation:

Movie(title, year, length, studioName)

Transaction 1 (T1):

SELECT *
FROM Movie
WHERE title = 'King Kong';

Transaction 2 (T2):

UPDATE Movie
SET year = 1939
WHERE title = 'Gone With the Wind';

T1 places an IS lock on the whole relation and S locks on the tuples it reads; T2 places an IX lock on the relation and an X lock on the tuple it updates. Since IS and IX are compatible, the two transactions can run concurrently.




18.7 The Tree Protocol


ADVANTAGES OF TREE PROTOCOL


Unlocking takes less time as compared to 2PL


Freedom from deadlocks


18.7.1 Motivation for Tree-Based Locking

Consider a B-tree index, treating individual nodes as lockable database elements.

Concurrent use of the B-tree is not practical with the standard set of locks and 2PL.

Therefore, a protocol is needed which can assure serializability while allowing access to elements near the bottom of the tree even though 2PL is violated.

18.7.2 Accessing Tree-Structured Data

Assumptions:

Only one kind of lock.

Consistent transactions.

Legal schedules.

No 2PL requirement on transactions.

Rules:

The first lock may be at any node.

Subsequent locks may be acquired only if the node's parent is currently locked.

Nodes may be unlocked at any time.

A node may not be relocked, even if its parent is still locked.

18.7.3 Why the Tree Protocol Works

The tree protocol implies a serial order on the transactions in the schedule.

Order of precedence: Ti <S Tj if, in schedule S, Ti and Tj lock a node in common and Ti locks it first.

If Ti locks the root before Tj, then Ti locks every node they have in common before Tj does.

18.8 Concurrency control by Timestamps


What is timestamping?

The scheduler assigns each transaction T a unique number, its timestamp TS(T).

Timestamps must be issued in ascending order, at the time a transaction first notifies the scheduler that it is beginning.

Two methods of generating timestamps:

Use the value of the system clock as the timestamp.

Use a logical counter that is incremented after each new timestamp is assigned.

The scheduler maintains a table of currently active transactions and their timestamps, whichever method is used.

Timestamps for a database element X, and the commit bit:

RT(X) - the read time of X: the highest timestamp of a transaction that has read X.

WT(X) - the write time of X: the highest timestamp of a transaction that has written X.

C(X) - the commit bit for X: true if and only if the most recent transaction to write X has already committed.

Physically Unrealizable Behavior

Read too late:

A transaction U that started after transaction T wrote a value for X before T reads X.

(Figure: T starts, U starts, U writes X, then T tries to read X - T reads too late.)

Write too late:

A transaction U that started after T read X before T got a chance to write X.

(Figure: T starts, U starts, U reads X, then T tries to write X - T writes too late.)

Dirty Read:

It is possible that after T reads the value of X written by U, transaction U will abort.

(Figure: U starts, U writes X, T starts, T reads X, then U aborts - T has performed a dirty read.)

Rules for Timestamp-Based Scheduling (a sketch of these rules in code follows)

1. The scheduler receives a request rT(X):

a) If TS(T) >= WT(X), the read is physically realizable.

1. If C(X) is true, grant the request; if TS(T) > RT(X), set RT(X) := TS(T), otherwise do not change RT(X).

2. If C(X) is false, delay T until C(X) becomes true or the transaction that wrote X aborts.

b) If TS(T) < WT(X), the read is physically unrealizable. Roll back T.

2. The scheduler receives a request wT(X):

a) If TS(T) >= RT(X) and TS(T) >= WT(X), the write is physically realizable and must be performed.

1. Write the new value for X,

2. set WT(X) := TS(T), and

3. set C(X) := false.

b) If TS(T) >= RT(X) but TS(T) < WT(X), then the write is physically realizable, but there is already a later value in X.

a. If C(X) is true, then the previous writer of X has committed, and we simply ignore the write by T.

b. If C(X) is false, we must delay T.

c) If TS(T) < RT(X), then the write is physically unrealizable, and T must be rolled back.

3. The scheduler receives a request to commit T. It must find all the database elements X written by T and set C(X) := true. If any transactions are waiting for X to be committed, they are allowed to proceed.

4. The scheduler receives a request to abort T, or decides to roll back T. Then any transaction that was waiting on an element X that T wrote must repeat its attempt to read or write.
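A condensed Python sketch of rules 1 and 2 above for a single element; the element state is kept in a small dict, and the strings 'GRANT', 'DELAY' and 'ROLLBACK' stand in for the scheduler's real reactions (the names and structure are illustrative assumptions):

    # ts_scheduler.py - timestamp-based read/write rules for one element X
    state = {"RT": 0, "WT": 0, "C": True}     # read time, write time, commit bit

    def read(ts):
        if ts < state["WT"]:
            return "ROLLBACK"                  # read too late
        if not state["C"]:
            return "DELAY"                     # writer not yet committed
        state["RT"] = max(state["RT"], ts)
        return "GRANT"

    def write(ts):
        if ts < state["RT"]:
            return "ROLLBACK"                  # write too late
        if ts < state["WT"]:
            return "GRANT (ignored)" if state["C"] else "DELAY"   # rule 2b
        state["WT"], state["C"] = ts, False
        return "GRANT"

    print(write(5), read(3), read(7), write(6))   # GRANT ROLLBACK DELAY GRANT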






Multiversion Timestamps

Multiversion schemes keep old versions of data items to increase concurrency.

Each successful write results in the creation of a new version of the data item written.

Timestamps are used to label the versions.

When a read(X) operation is issued, an appropriate version of X is selected based on the timestamp of the transaction, and the value of that version is returned.


Timestamps and Locking

Generally, timestamping performs better than locking in situations where:

most transactions are read-only, or

it is rare that concurrent transactions try to read and write the same element.

In high-conflict situations, locking performs better than timestamping.


Concurrency Control by Validation (18.9)

What is optimistic concurrency control? It assumes that no unserializable behavior will occur. It includes both timestamp-based scheduling and validation-based scheduling, which allows T to access data without locks.

Validation-based scheduling:

The scheduler keeps a record of what the active transactions are doing.

Each transaction executes in 3 phases:

1. Read - reads the elements in its read set RS(T) and computes results locally.

2. Validate - the scheduler compares the transaction's read and write sets with those of other transactions.

3. Write - writes the elements of its write set WS(T).

The Validation-Based Scheduler

Maintains an assumed serial order of the transactions.

Maintains three sets:

START: the set of transactions that have started but not yet completed validation.

VAL: the set of transactions that have validated but not yet finished the writing phase.

FIN: the set of transactions that have finished.

Expected exceptions (problem cases)

1. Suppose there is a transaction U such that:

U is in VAL or FIN; that is, U has validated;

FIN(U) > START(T); that is, U did not finish before T started;

RS(T) ∩ WS(U) ≠ ∅; let database element X be in both sets. Then T may have missed U's write of X.

2. Suppose there is a transaction U such that:

U is in VAL; U has successfully validated;

FIN(U) > VAL(T); U did not finish before T entered its validation phase;

WS(T) ∩ WS(U) ≠ ∅; let X be in both write sets. Then T's write of X could be applied out of order.

Validation Rules (a small sketch in code follows)

Check that RS(T) ∩ WS(U) = ∅ for any previously validated U that did not finish before T started, i.e. FIN(U) > START(T).

Check that WS(T) ∩ WS(U) = ∅ for any previously validated U that did not finish before T validated, i.e. FIN(U) > VAL(T).

Example

Solution:

Validation of U: nothing to check (no earlier validated transactions).

Validation of T:
WS(U) ∩ RS(T) = {D} ∩ {A,B} = ∅
WS(U) ∩ WS(T) = {D} ∩ {A,C} = ∅

Validation of V:
RS(V) ∩ WS(T) = {B} ∩ {A,C} = ∅
WS(V) ∩ WS(T) = {D,E} ∩ {A,C} = ∅
RS(V) ∩ WS(U) = {B} ∩ {D} = ∅

Validation of W:
RS(W) ∩ WS(T) = {A,D} ∩ {A,C} = {A}
RS(W) ∩ WS(V) = {A,D} ∩ {D,E} = {D}
WS(W) ∩ WS(V) = {A,C} ∩ {D,E} = ∅

(W is not validated, because of the nonempty intersections.)


Comparison

Mechanism   | Storage utilization                                        | Delays

Locks       | Space in the lock table is proportional to the number of   | Delays transactions but
            | database elements locked.                                   | avoids rollbacks.

Timestamps  | Space is needed for read and write times for every database| Does not delay transactions,
            | element, whether or not it is currently accessed.           | but causes rollbacks unless
            |                                                             | interference is low.

Validation  | Space is used for timestamps and read/write sets for each   | Does not delay transactions,
            | currently active transaction, plus a few more transactions  | but causes rollbacks unless
            | that finished after some currently active transaction began.| interference is low.


21.1 Introduction to Information Integration

Need for Information Integration

Ideally, all the data in the world could be put in a single database (the ideal database system).

In reality, databases are created independently, and it is hard to design a database to support future uses.

The use of databases evolves, so we cannot design a database to support every possible future use.

Example: a university database

Registrar: records students and grades.

Bursar: records tuition payments by students.

Human Resources Department: records employees.

Applications were built on these databases, like generation of payroll checks and calculation of taxes and social-security payments to the government.

Inconveniences:

A change in one database would not be reflected in the other databases; it had to be propagated manually.

Recording grades only for students who have paid tuition, or letting students swim in the SJSU aquatic center for free during summer vacation, are requests that cannot be answered from any single one of these databases.


Solution: behave as one database.

How to integrate

Start over:

Build one database that contains all the legacy data, and rewrite all the applications.

Result: painful.

Build a layer of abstraction (middleware):

It sits on top of all the legacy databases.

This layer is often defined by a collection of classes.

BUT, when we try to connect information sources that were developed independently, we invariably find that the sources differ in many ways. Such sources are called heterogeneous, and the problem of integrating them is referred to as the Heterogeneity Problem.

Heterogeneity Problem

What is the heterogeneity problem? Consider the Aardvark Automobile Co.:

1000 dealers have 1000 databases.

To find a car of a given model at another dealer, can we simply use the command:

SELECT * FROM Cars
WHERE model = 'A6';

In general we cannot, because the dealers' databases differ in the ways listed below.

Types of Heterogeneity

Communication heterogeneity

Query-language heterogeneity

Schema heterogeneity

Data-type differences

Value heterogeneity

Semantic heterogeneity

Communication Heterogeneity


Today, it is common to allow access to your
information using HTTP protocols. However,
some dealers may not make their databases
available on net, but instead accept remote
accesses via anonymous FTP.


Suppose there are 1000 dealers of Aardvark
Automobile Co. out of which 900 use HTTP while
the remaining 100 use FTP, so there might be
problems of communication between the dealers
databases.


Query Language Heterogeneity


The manner in which we query or modify a
dealer’s database may vary.


For e.g. Some of the dealers may have
different versions of database like some might
use relational database some might not have
relational database, or some of the dealers
might be using SQL, some might be using Excel
spreadsheets or some other database.


Schema Heterogeneity


Even assuming that the dealers use a
relational DBMS supporting SQL as the query
language there might be still some
heterogeneity at the highest level like
schemas can differ.


For e.g. one dealer might store cars in a single
relation while the other dealer might use a
schema in which options are separated out
into a second relation.


Data-Type Differences

Serial numbers might be represented by character strings of varying length at one source and fixed length at another. The fixed lengths could differ, and some sources might use integers rather than character strings.


Value Heterogeneity


The same concept might be represented by
different constants at different sources. The
color Black might be represented by an integer
code at one source, the string BLACK at
another, and the code BL at a third.


Semantic Heterogeneity

Terms might be given different interpretations at different sources. One dealer might include trucks in its Cars relation, while another puts only automobile data in Cars. One dealer might distinguish station wagons from minivans, while another does not.


21.2 Modes of Information Integration

Federations

The simplest architecture for integrating several DBs.

One-to-one connections between all pairs of DBs.

For n DBs to talk to each other, n(n-1) wrappers are needed.

Good when communication between DBs is limited.

Wrapper: software that translates incoming queries and outgoing answers; it allows an information source to conform to some shared schema.

Federations Diagram: a federated collection of 4 DBs (DB1-DB4) needs 12 translation components: 2 wrappers for each of the 6 pairs of databases.

Example

Car dealers want to share their inventory. Each dealer queries the other's DB to find a needed car.

Dealer 1's DB relation: NeededCars(model, color, autoTrans)

Dealer 2's DB relations: Autos(serial, model, color) and Options(serial, option)

A wrapper sits between Dealer 1's DB and Dealer 2's DB in each direction.

Example (continued)

Dealer 1 queries Dealer 2 for the needed cars:

for (each tuple (:m, :c, :a) in NeededCars) {
    if (:a = TRUE) { /* automatic transmission wanted */
        SELECT serial
        FROM Autos, Options
        WHERE Autos.serial = Options.serial AND Options.option = 'autoTrans'
            AND Autos.model = :m AND Autos.color = :c;
    }
    else { /* automatic transmission not wanted */
        SELECT serial
        FROM Autos
        WHERE Autos.model = :m AND Autos.color = :c AND
            NOT EXISTS (SELECT * FROM Options
                        WHERE serial = Autos.serial AND option = 'autoTrans');
    }
}

Data Warehouse

Sources are translated from their local schema to a global schema and copied to a central DB.

It is user transparent: the user queries the data warehouse just like an ordinary DB.

The user is not allowed to update the data warehouse directly.

Warehouse Diagram: user queries go to the warehouse and results come back; each source (Source 1, Source 2) feeds the warehouse through an extractor, and a combiner loads the extracted data into the warehouse.
Example

Construct a data warehouse from the source DBs of two car dealers:

Dealer 1's schema: Cars(serialNo, model, color, autoTrans, cdPlayer, ...)

Dealer 2's schema: Autos(serial, model, color) and Options(serial, option)

Warehouse's schema: AutosWhse(serialNo, model, color, autoTrans, dealer)

Extractor - query to extract data from Dealer 1's data:

INSERT INTO AutosWhse(serialNo, model, color, autoTrans, dealer)
SELECT serialNo, model, color, autoTrans, 'dealer1'
FROM Cars;



Example (continued)

Extractor - queries to extract data from Dealer 2's data:

INSERT INTO AutosWhse(serialNo, model, color, autoTrans, dealer)
SELECT serial, model, color, 'yes', 'dealer2'
FROM Autos, Options
WHERE Autos.serial = Options.serial AND option = 'autoTrans';

INSERT INTO AutosWhse(serialNo, model, color, autoTrans, dealer)
SELECT serial, model, color, 'no', 'dealer2'
FROM Autos
WHERE NOT EXISTS (SELECT * FROM Options
                  WHERE serial = Autos.serial AND option = 'autoTrans');


Constructing the Data Warehouse

There are mainly 3 ways of constructing the data in the warehouse:

1) Periodically reconstruct the warehouse from the current data in the sources, once a night or at even longer intervals.

Advantages: simple algorithms.

Disadvantages: the warehouse must be shut down during reconstruction, and the data can become out of date.

2) Update the warehouse periodically (e.g. each night) based on the changes made to the sources.

Advantages: involves smaller amounts of data (important when the warehouse is large and needs to be modified in a short period).

Disadvantages: the process of calculating the changes to the warehouse is complex, and the data can still become out of date.

3) Change the warehouse immediately, in response to each change or a small set of changes at one or more of the sources.

Advantages: the data does not become out of date.

Disadvantages: requires too much communication and is therefore generally too expensive (practical only for warehouses whose underlying sources change slowly).


Mediators

A mediator is a virtual warehouse: it supports a virtual view, or a collection of views, that integrates several sources.

The mediator does not store any data.

The mediator's tasks:

1) receive the user's query,

2) send queries to the wrappers,

3) combine the results from the wrappers,

4) send the final result to the user.

Mediator Diagram: the user query goes to the mediator, which sends queries to a wrapper over each source (Source 1, Source 2); results flow back through the wrappers to the mediator and on to the user.

Example

With the same data sources as the data-warehouse example, the mediator integrates the two dealers' sources into a view with schema:

AutoMed(serialNo, model, color, autoTrans, dealer)

When the user issues the query:

SELECT serialNo, model
FROM AutoMed
WHERE color = 'red';

in this simple case the mediator forwards the same query to each of the two wrappers.

Wrapper 1: Cars(serialNo, model, color, autoTrans, cdPlayer, ...)

SELECT serialNo, model
FROM Cars
WHERE color = 'red';

Wrapper 2: Autos(serial, model, color); Options(serial, option)

SELECT serial, model
FROM Autos
WHERE color = 'red';


Example (continued)

There may be different ways for the mediator to forward a user query. For example, the user asks whether there is a car of a specific model and color (e.g. "Gobi", "blue").

The mediator can decide whether the 2nd query is needed based on the result of the 1st query: if dealer 1 has the specific car, the mediator does not have to query dealer 2.

Chapter 21

21.3 Wrappers in Mediator-Based Systems

Wrappers in mediator-based systems are more complicated than those in most data-warehouse systems.

They must be able to accept a variety of queries from the mediator and translate them into the terms of the source.

They must communicate the result back to the mediator.

How to design a wrapper?

Classify the possible queries that the mediator can ask into templates, which are queries with parameters that represent constants.

Templates for Query Patterns:

We use the notation T => S to express the idea that the template T is turned by the wrapper into the source query S.



Example 1

Dealer 1's source: Cars(serialNo, model, color, autoTrans, navi, ...)

For use by a mediator with schema AutoMed(serialNo, model, color, autoTrans, dealer).

If we denote the code representing a color by the parameter $c, then the template is:

SELECT *
FROM AutosMed
WHERE color = '$c';
    =>
SELECT serialNo, model, color, autoTrans, 'dealer1'
FROM Cars
WHERE color = '$c';

(Template T => source query S.)

There will be 2^n templates in total if we have the option of specifying any subset of n attributes.


Wrapper Generators

The wrapper generator creates a table that holds the various query patterns contained in the templates, together with the source queries associated with each.

A driver is used in each wrapper; the task of the driver is to:

Accept a query from the mediator.

Search the table for a template that matches the query.

Send the associated source query to the source, using a plug-in communication mechanism.

Process the response and return it to the mediator. (A toy driver sketch follows.)

Filter

The wrapper can include a filter so that it supports more queries than the templates cover directly.
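A toy Python sketch of such a driver: a template table maps query patterns to source-query builders, and the driver looks up the matching template. The dictionary representation, the function names and the parameter handling are assumptions made for this illustration.

    # wrapper_driver.py - match a mediator query against stored templates
    TEMPLATES = {
        ("color",): lambda p: (
            "SELECT serialNo, model, color, autoTrans, 'dealer1' "
            f"FROM Cars WHERE color = '{p['color']}';"),
        ("color", "model"): lambda p: (
            "SELECT serialNo, model, color, autoTrans, 'dealer1' "
            f"FROM Cars WHERE model = '{p['model']}' AND color = '{p['color']}';"),
    }

    def drive(params):
        key = tuple(sorted(params))
        builder = TEMPLATES.get(key)
        if builder is None:
            raise ValueError("no template matches the query")
        return builder(params)      # source query to send via the plug-in mechanism

    print(drive({"color": "blue"}))
    print(drive({"color": "blue", "model": "Gobi"}))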



Example 2

Suppose the wrapper is designed with a more complicated template whose queries specify both model and color:

SELECT *
FROM AutosMed
WHERE model = '$m' AND color = '$c';
    =>
SELECT serialNo, model, color, autoTrans, 'dealer1'
FROM Cars
WHERE model = '$m' AND color = '$c';

Now suppose the only template we actually have is the one on color alone, but the wrapper is asked by the mediator to find blue cars of the Gobi model.

Solution:

1. Use the color template with $c = 'blue' to find all blue cars and store them in a temporary relation:

TemAutos(serialNo, model, color, autoTrans, dealer)

2. The wrapper then returns to the mediator the desired set of automobiles by executing the local (filtering) query:

SELECT *
FROM TemAutos
WHERE model = 'Gobi';




21.4 Capability-Based Optimization

Introduction

A typical DBMS estimates the cost of each query plan and picks what it believes to be the best one.

A mediator, in contrast, usually has no knowledge of how long its sources will take to answer its queries.

Optimization of mediator queries therefore cannot rely on cost measures alone to select a query plan; optimization by a mediator follows capability-based optimization.

21.4.1 The Problem of Limited Source Capabilities

Many sources have only Web-based interfaces.

Web sources usually allow querying only through a query form.

E.g. the Amazon.com interface allows us to ask about books in many different ways.

But we cannot ask questions that are too general, e.g. SELECT * FROM books;

21.4.1 The Problem of Limited Source