CIS-552 Introduction
Object-Oriented Database
• New Database Applications
• Object-Oriented Data Models
• Object-Oriented Languages
• Persistent Programming Languages
• Persistent C++ Systems

New Database Applications
• Data models designed for data-processing-style applications are not adequate for new technologies such as computer-aided design, computer-aided software engineering, multimedia and image databases, and document/hypertext databases.
• These new applications require the database system to handle features such as:
  – Complex data types
  – Data encapsulation and abstract data structures
  – Novel methods for indexing and querying

Object-Oriented Data Model
• Loosely speaking, an object corresponds to an entity in the E-R model.
• The object-oriented paradigm is based on encapsulating code and data related to an object into a single unit.
• The object-oriented data model is a logical model (like the E-R model).
• Adaptation of the object-oriented programming paradigm (e.g., Smalltalk, C++) to database systems.

Object Identity
• An object retains its identity even if some or all of the values of its variables or the definitions of its methods change over time.
• Object identity is a stronger notion of identity than in programming languages or data models not based on object orientation.
  – Value: data value; used in relational systems.
  – Name: supplied by the user; used for variables in procedures.
  – Built-in: identity built into the data model or programming language.
    • No user-supplied identifier is required.
    • Form of identity used in object-oriented systems.

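The difference between value-based identity and built-in identity can be sketched in C++, where an object's address plays the role of a built-in identity while the program runs. This is only an illustration: the Person class and both helper functions are hypothetical names, not part of any database API.

```cpp
#include <cassert>
#include <string>

// Hypothetical class, used only for illustration.
struct Person {
    std::string name;
    std::string address;
};

// Value-based comparison: true when the stored data match.
bool same_value(const Person& a, const Person& b) {
    return a.name == b.name && a.address == b.address;
}

// Identity-based comparison: true only when both names refer to one object.
bool same_object(const Person& a, const Person& b) {
    return &a == &b;
}
```

Two distinct objects can hold equal values, and one object keeps its identity after its values change; a relational system that identifies tuples by value cannot make that distinction.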
Object Identifiers
Object identifiers are used to uniquely identify objects.
  – Can be stored as a field of an object, to refer to another object.
  – E.g., the spouse field of a person object may be an identifier of another person object.
  – Can be system generated (created by the database) or external (such as a social-security number).

Object Containment
• Each component in a design may contain other components.
• Can be modeled as containment of objects. Objects containing other objects are called complex or composite objects.
• Multiple levels of containment create a containment hierarchy: links are interpreted as is-part-of, not is-a.
• Allows data to be viewed at different granularities by different users.
[Figure: containment hierarchy rooted at bicycle, with the components wheel, brake, frame, gear, rim, lever, cable, spokes, tire, and pad.]

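A containment hierarchy can be sketched as a recursive structure in which each part owns the parts it contains. The Part type and the helper functions are hypothetical, and the grouping of rim, spokes, and tire under wheel (and lever, pad, and cable under brake) is an assumption, since the edges of the figure are not recoverable from the text.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A component that may contain other components (is-part-of links).
struct Part {
    std::string name;
    std::vector<Part> parts;
};

// Counts a part together with everything it contains, at any depth.
int total_parts(const Part& p) {
    int n = 1;
    for (const Part& child : p.parts) n += total_parts(child);
    return n;
}

// Returns true if `name` is `p` itself or appears anywhere inside `p`.
bool contains(const Part& p, const std::string& name) {
    if (p.name == name) return true;
    for (const Part& child : p.parts) {
        if (contains(child, name)) return true;
    }
    return false;
}

// Example hierarchy; the grouping under wheel and brake is assumed.
Part make_bicycle() {
    return Part{"bicycle", {
        Part{"wheel", {Part{"rim", {}}, Part{"spokes", {}}, Part{"tire", {}}}},
        Part{"brake", {Part{"lever", {}}, Part{"pad", {}}, Part{"cable", {}}}},
        Part{"gear", {}},
        Part{"frame", {}},
    }};
}
```

A user interested only in the top level can look at the four direct parts of the bicycle, while another can descend into the wheel; that is the sense in which the hierarchy supports different granularities.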
Object-Oriented Languages
• Object-oriented concepts can be used as a design tool and then encoded into, for example, a relational database (analogous to modeling data with an E-R diagram and then converting it to a set of relations).
• The concepts of object orientation can be incorporated into a programming language that is used to manipulate the database.
  – Object-relational systems: add complex types and object orientation to relational languages.
  – Persistent programming languages: extend an object-oriented programming language to deal with databases by adding concepts such as persistence and collections.

OO-DBMS
• Save objects created by an OOP language to disk (make objects persistent).
• Ensure that if an object is saved, all of the objects it references are saved.
• Allow saved objects (and the objects they reference) to be retrieved from disk.
• Provide transaction management and concurrency control to maintain data integrity.

Persistent Programming Language
• Persistent programming languages:
  – Allow objects to be created and stored in a database without any explicit format changes (format changes are carried out transparently).
  – Allow objects to be manipulated in memory; there is no need to explicitly load from or store to the database.
  – Allow data to be manipulated directly from the programming language without having to go through a data manipulation language like SQL.
• Due to the power of most programming languages, it is easy to make programming errors that damage the database.
• The complexity of the languages makes automatic high-level optimization more difficult.
• They do not support declarative querying very well.

Persistence of Objects
Approaches to make transient objects persistent include establishing persistence by:
  – Class: declare all objects of a class to be persistent; simple but inflexible.
  – Creation: extend the syntax for creating transient objects to create persistent objects.
  – Marking: an object that is to persist beyond program execution is marked as persistent before program termination.
  – Reference: declare (root) persistent objects; objects are persistent if they are referred to (directly or indirectly) from a root object.

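Persistence by reference amounts to a reachability computation from the root objects. The following is a minimal sketch of that idea, assuming a hypothetical in-memory Obj type whose id values are unique; it is not how any particular system implements it.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical in-memory object: each object lists the objects it refers to.
struct Obj {
    int id;                         // assumed unique per object
    std::vector<const Obj*> refs;   // outgoing references
};

// Returns the ids of every object reachable from `root`, root included.
// Under persistence by reference, exactly these objects would persist.
std::unordered_set<int> persistent_closure(const Obj& root) {
    std::unordered_set<int> seen;
    std::vector<const Obj*> stack{&root};
    while (!stack.empty()) {
        const Obj* cur = stack.back();
        stack.pop_back();
        if (!seen.insert(cur->id).second) continue;  // already visited
        for (const Obj* next : cur->refs) {
            assert(next != nullptr);
            stack.push_back(next);
        }
    }
    return seen;
}
```

An object that refers to the root but is not referred to from it stays transient, and cycles among persistent objects are handled by the visited set.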
Object Identity and Pointers
• A persistent object is assigned a persistent object identifier.
• Degrees of permanence of identity:
  – Intraprocedure: identity persists only during the execution of a single procedure.
  – Intraprogram: identity persists only during the execution of a single program or query.
  – Interprogram: identity persists from one program execution to another.
  – Persistent: identity persists through program executions and structural reorganizations of data; required for object-oriented systems.

Object Identity and Pointers (Cont.)
• In O-O languages such as C++, an object identifier is actually an in-memory pointer.
• Persistent pointer: persists beyond program execution; can be thought of as a pointer into the database.

Storage and Access of Persistent Objects
How to find objects in the database:
• Name objects (as you would name files).
  – Cannot scale to a large number of objects.
  – Typically given only to class extents and other collections of objects, but not to objects.
• Expose object identifiers or persistent pointers to the objects.
  – Can be stored externally.
  – All objects have object identifiers.

Storage and Access of Persistent Objects (Cont.)
How to find objects in the database (Cont.):
• Store collections of objects and allow programs to iterate over the collections to find required objects.
  – Model collections of objects as collection types.
  – Class extent: the collection of all objects belonging to the class; usually maintained for all classes that can have persistent objects.

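The idea of a class extent can be sketched in plain C++ by having a class register each of its objects in a shared collection. This is only an in-memory analogy under assumed names (Customer, extent, find_customer); a persistent system would keep the extent in the database rather than in a static variable.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical class that maintains its own extent (the set of live objects).
class Customer {
public:
    explicit Customer(std::string name) : name_(std::move(name)) {
        extent().insert(this);       // every new object joins the extent
    }
    ~Customer() { extent().erase(this); }
    Customer(const Customer&) = delete;
    Customer& operator=(const Customer&) = delete;

    const std::string& name() const { return name_; }

    // The class extent: every Customer currently alive.
    static std::set<Customer*>& extent() {
        static std::set<Customer*> all;
        return all;
    }

private:
    std::string name_;
};

// Finds a customer by iterating over the extent; returns nullptr if absent.
Customer* find_customer(const std::string& name) {
    for (Customer* c : Customer::extent()) {
        if (c->name() == name) return c;
    }
    return nullptr;
}
```

Only the extent needs a well-known name; individual objects are found by iterating over it, which is why naming every object is unnecessary.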
Persistent C++ System
• The C++ language allows support for persistence to be added without changing the language.
  – Declare a class called Persistent_Object with attributes and methods to support persistence.
  – Overloading: the ability to redefine standard function names and operators (e.g., +, -, and the pointer dereference operator ->) when applied to new types.
• Providing persistence without extending the C++ language is
  – relatively easy to implement
  – but more difficult to use.

ODMG C++ Object Definition Language
• Standardized language extensions to C++ to support persistence.
• The ODMG standard attempts to extend C++ as little as possible, providing most functionality via template classes and class libraries.
• Template class Ref<class> is used to specify references (persistent pointers).
• Template class Set<class> is used to define sets of objects. It provides methods such as insert_element and delete_element.
• The C++ object definition language (ODL) extends the C++ type definition syntax in minor ways.
  Example: use the notation inverse to specify referential integrity constraints.

ODMG C++ ODL: Example
    class Person : public Persistent_Object {
    public:
        String name;
        String address;
    };

    class Customer : public Person {
    public:
        Date member_from;
        int customer_id;
        Ref<Branch> home_branch;
        Set<Ref<Account>> accounts inverse Account::owners;
    };

ODMG C++: Example (Cont.)
    class Account : public Persistent_Object {
    private:
        int balance;
    public:
        int number;
        Set<Ref<Customer>> owners inverse Customer::accounts;
        int find_balance();
        int update_balance(int delta);
    };

ODMG C++ Object Manipulation Language
• Uses persistent versions of C++ operators such as new(db).
      Ref<Account> account = new(bank_db) Account;
  new allocates the object in the specified database, rather than in memory.
• The dereference operator ->, when applied to a Ref<Customer> object, loads the referenced object into memory (if not already present) and returns an in-memory pointer to the object.
• Constructor for a class: a special method to initialize objects when they are created; called automatically when new is executed.
• Destructor for a class: a special method that is called when objects in the class are deleted.

ODMG C++ OML: Example
    // Declared void because the body returns no value.
    void create_account_owner(String name, String address) {
        Database * bank_db;
        bank_db = Database::open("Bank-DB");
        Transaction Trans;
        Trans.begin();
        Ref<Account> account = new(bank_db) Account;
        Ref<Customer> cust = new(bank_db) Customer;
        cust->name = name;
        cust->address = address;
        // Keep both sides of the inverse relationship in step.
        cust->accounts.insert_element(account);
        account->owners.insert_element(cust);
        // ... code to initialize customer_id, account number, etc.
        Trans.commit();
    }

ODMG C++ OML: Example of Iterators
    // Declared void because the body returns no value.
    void print_customers() {
        Database * bank_db;
        bank_db = Database::open("Bank-DB");
        Transaction Trans;
        Trans.begin();
        Iterator<Ref<Customer>> iter = Customer::all_customer.create_iterator();
        Ref<Customer> p;
        // next() stores the next element in p and reports whether one existed.
        while (iter.next(p)) {
            print_cust(p);
        }
        Trans.commit();
    }
• The iterator construct helps step through the objects in a collection.

Mapping of Objects to Files
• Mapping objects to files is similar to mapping tuples to files in a relational system; object data can be stored using file structures.
• Objects in O-O databases may lack uniformity and may be very large; such objects have to be managed differently from records in a relational system.
  – Set fields with a small number of elements may be implemented using data structures such as linked lists.
  – Set fields with a larger number of elements may be implemented as B-trees, or as separate relations in the database.
  – Set fields can also be eliminated at the storage level by normalization.

Mapping of Objects to Files (Cont.)
• Objects are identified by an object identifier (OID); the storage system needs a mechanism to locate an object given its OID.
  – Logical identifiers do not directly specify an object's physical location; the system must maintain an index that maps an OID to the object's actual location.
  – Physical identifiers encode the location of the object so the object can be found directly. Physical OIDs typically have the following parts:
    1. a volume or file identifier
    2. a page identifier within the volume or file
    3. an offset within the page
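
The two kinds of identifier can be contrasted in a short sketch. The type names, field widths, and the OidIndex class are assumptions made for illustration; the slides do not prescribe a layout. A physical OID carries the three location parts directly, while a logical OID is an opaque number resolved through an index.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>

// Physical OID: encodes where the object lives. Field widths are assumptions.
struct PhysicalOid {
    std::uint32_t volume;  // 1. volume or file identifier
    std::uint32_t page;    // 2. page identifier within the volume or file
    std::uint32_t offset;  // 3. offset within the page
};

// Logical OID: an opaque number that says nothing about location.
using LogicalOid = std::uint64_t;

// Index that maps logical OIDs to current physical locations.
class OidIndex {
public:
    // Records (or updates) where the object with this logical OID is stored.
    void place(LogicalOid oid, PhysicalOid where) { map_[oid] = where; }

    // Returns the location of `oid`, or std::nullopt if it is unknown.
    std::optional<PhysicalOid> locate(LogicalOid oid) const {
        auto it = map_.find(oid);
        if (it == map_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<LogicalOid, PhysicalOid> map_;
};
```

The trade-off shows in place: an object with a logical OID can be moved by updating one index entry, whereas moving an object with a physical OID would invalidate every stored copy of that OID.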
Management of Persistent Pointers
• Physical OIDs may also contain a unique identifier. This identifier is also stored in the object and is used to detect references via dangling pointers.
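[Figure: (a) General structure: a physical object identifier consists of the fields Vol., Page, Offset, and Unique-Id; an object consists of its Unique-Id followed by its data. (b) Example of use: the object at location 6.32.45608 has unique-id 51. An OID with location 6.32.45608 and unique-id 51 is a good OID; an OID with location 6.32.45608 and unique-id 50 is a bad OID.]
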
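The dangling-pointer check can be sketched as follows, reusing the numbers from the figure. The Location, Oid, StoredObject, and Store types are hypothetical; the point is only that a fetch succeeds when the unique-id in the OID matches the one stored with the object at that location.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <tuple>
#include <utility>

// Physical location of an object: volume, page, offset.
struct Location {
    std::uint32_t volume;
    std::uint32_t page;
    std::uint32_t offset;
    bool operator<(const Location& o) const {
        return std::tie(volume, page, offset) <
               std::tie(o.volume, o.page, o.offset);
    }
};

// Physical OID extended with a unique identifier.
struct Oid {
    Location where;
    std::uint32_t unique_id;
};

// What is stored at a location: the unique-id, then the data.
struct StoredObject {
    std::uint32_t unique_id;
    std::string data;
};

class Store {
public:
    // Stores an object at a location, replacing whatever was there before.
    void put(const Location& where, StoredObject obj) {
        slots_.insert_or_assign(where, std::move(obj));
    }

    // Returns the data only if an object exists at the OID's location
    // and its unique-id matches; otherwise the OID is dangling.
    std::optional<std::string> fetch(const Oid& oid) const {
        auto it = slots_.find(oid.where);
        if (it == slots_.end()) return std::nullopt;
        if (it->second.unique_id != oid.unique_id) return std::nullopt;
        return it->second.data;
    }

private:
    std::map<Location, StoredObject> slots_;
};
```

If the space of a deleted object is reused, the new occupant gets a different unique-id, so an old OID pointing at the same location is rejected instead of silently returning the wrong object.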
Management of Persistent Pointers (Cont.)
• Implement persistent pointers using OIDs; persistent pointers are substantially longer than in-memory pointers.
• Pointer swizzling cuts down on the cost of locating persistent objects already in memory.
• Software swizzling (swizzling on pointer dereference):
  – When a persistent pointer is first dereferenced, it is swizzled (replaced by an in-memory pointer) after the object is located in memory.
  – Subsequent dereferences of the same pointer become cheap.
  – The physical location of an object in memory must not change if swizzled pointers point to it; the solution is to pin pages in memory.
  – When an object is written back to disk, any swizzled pointers it contains need to be unswizzled.

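Swizzling on dereference can be sketched with an overloaded -> operator. All names here (Account, ObjectTable, SwizzlingRef) are hypothetical, and the sketch simplifies in one respect: it keeps the OID next to the cached pointer instead of overwriting the OID in place, which makes unswizzling trivial.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct Account {  // hypothetical persistent class
    int balance;
};

// Stand-in for the database: maps OIDs to loaded objects and counts lookups.
class ObjectTable {
public:
    void add(std::uint64_t oid, Account* obj) { table_[oid] = obj; }

    // Expensive path: find the in-memory copy of the object for an OID.
    Account* locate(std::uint64_t oid) {
        ++lookups;
        auto it = table_.find(oid);
        return it == table_.end() ? nullptr : it->second;
    }

    int lookups = 0;

private:
    std::unordered_map<std::uint64_t, Account*> table_;
};

// Persistent pointer that swizzles itself on first dereference.
class SwizzlingRef {
public:
    SwizzlingRef(ObjectTable& db, std::uint64_t oid) : db_(&db), oid_(oid) {}

    Account* operator->() {
        if (ptr_ == nullptr) {         // not yet swizzled
            ptr_ = db_->locate(oid_);  // looked up once, then cached
            assert(ptr_ != nullptr && "dangling OID");
        }
        return ptr_;
    }

    bool swizzled() const { return ptr_ != nullptr; }

    // Called before the containing object is written back to disk.
    void unswizzle() { ptr_ = nullptr; }

private:
    ObjectTable* db_;
    std::uint64_t oid_;
    Account* ptr_ = nullptr;
};
```

The cached pointer is only safe while the object stays at the same memory address, which is the reason the slides call for pinning pages that swizzled pointers refer to.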
Hardware Swizzling
• Persistent pointers in objects need the same amount of space as in-memory pointers; extra storage external to the object is used to store the rest of the pointer information.
• Uses the virtual memory translation mechanism to efficiently and transparently convert between persistent pointers and in-memory pointers.
• All persistent pointers in a page are swizzled when the page is first read in.
  – Thus programmers have to work with just one type of pointer, i.e., the in-memory pointer.
  – Some of the swizzled pointers may point to virtual memory addresses that are currently not allocated any real memory.

Hardware Swizzling
• A persistent pointer is conceptually split into two parts: a page identifier, and an offset within the page.
  – The page identifier in a pointer is a short indirect pointer: each page has a translation table that provides a mapping from the short page identifiers to full database page identifiers.
  – The translation table for a page is small (at most 1024 pointers in a 4096-byte page with 4-byte pointers).
  – Multiple pointers in a page to the same page share the same entry in the translation table.

Hardware Swizzling (Cont.)
• Page image when on disk (before swizzling)
[Figure: a page holding Object 1, Object 2, and Object 3. The pointers stored in the objects, shown as (Page ID, Off.) pairs, are (2395, 255), (4867, 020), and (4867, 170).]
Translation Table
PageID    FullPageID
2395      679.34.28000
4867      519.56.84000

Hardware Swizzling (Cont.)
• When an in-memory pointer is dereferenced, if the operating system detects that the page it points to has not yet been allocated storage, a segmentation violation occurs.
• A function is registered to be called on a segmentation violation (on Unix, a SIGSEGV signal handler); mmap is used to allocate storage for pages at chosen virtual addresses.
• The function allocates storage for the page and reads in the page from disk.
• Swizzling is then done for all persistent pointers in the page (located using object type information).
  – If a pointer points to a page not already allocated a virtual memory address, a virtual memory address is allocated (preferably the address in the short page identifier if it is unused). Storage is not yet allocated for the page.
  – The page identifier in the pointer (and the translation table entry) is changed to the virtual memory address of the page.

Hardware Swizzling (Cont.)
Page image after swizzling
• Page with short page identifier 2395 was allocated address 5001. Observe the change in pointers and translation table.
• Page with short page identifier 4867 has been allocated address 4867. No change in pointer and translation table.
[Figure: the same page after swizzling. The pointers stored in the objects, shown as (Page ID, Off.) pairs, are now (5001, 255), (4867, 020), and (4867, 170).]
Translation Table
PageID    FullPageID
5001      679.34.28000
4867      519.56.84000

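The rewriting of short page identifiers during hardware swizzling can be simulated without any operating system support. The sketch below models only the bookkeeping: the Page, PagePointer, and AddressSpace types are hypothetical, page addresses are plain integers, and the fallback address 5001 is supplied by the caller so that the numbers match the page images above. No real virtual memory is involved.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A pointer as stored in a page: short page identifier plus offset.
struct PagePointer {
    std::uint32_t page_id;
    std::uint32_t offset;
};

// A page image: the pointers in its objects and its translation table.
struct Page {
    std::vector<PagePointer> pointers;
    std::map<std::uint32_t, std::string> translation;  // short id -> full id
};

// Tracks which virtual page addresses are taken (simulation only).
class AddressSpace {
public:
    explicit AddressSpace(std::uint32_t first_fallback)
        : next_fallback_(first_fallback) {}

    void reserve(std::uint32_t addr) { used_.insert(addr); }

    // Returns the address of the page `full_id`, allocating one if needed.
    // Prefers `preferred` (the short page id) when it is still unused.
    std::uint32_t address_for(const std::string& full_id,
                              std::uint32_t preferred) {
        auto known = by_full_id_.find(full_id);
        if (known != by_full_id_.end()) return known->second;
        std::uint32_t addr = preferred;
        if (used_.count(addr) != 0) {
            while (used_.count(next_fallback_) != 0) ++next_fallback_;
            addr = next_fallback_;
        }
        used_.insert(addr);
        by_full_id_[full_id] = addr;
        return addr;
    }

private:
    std::uint32_t next_fallback_;
    std::set<std::uint32_t> used_;
    std::map<std::string, std::uint32_t> by_full_id_;
};

// Rewrites every pointer and translation table entry in the page so that
// short page ids become the addresses allocated for the pages.
void swizzle_page(Page& page, AddressSpace& vm) {
    std::map<std::uint32_t, std::uint32_t> remap;  // short id -> address
    std::map<std::uint32_t, std::string> new_table;
    for (const auto& entry : page.translation) {
        const std::uint32_t addr = vm.address_for(entry.second, entry.first);
        remap[entry.first] = addr;
        new_table[addr] = entry.second;
    }
    for (PagePointer& p : page.pointers) {
        p.page_id = remap.at(p.page_id);
    }
    page.translation = std::move(new_table);
}
```

When address 2395 is already in use, the page it named is given the fallback address and both the pointer and the table entry change; 4867 is free, so that page keeps its short identifier as its address and nothing in the page needs rewriting for it.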
Hardware Swizzling (Cont.)
• After swizzling, all short page identifiers point to the virtual memory address allocated for the page.
  – Functions accessing the objects need not know that they contain persistent pointers!
  – Can reuse existing code and libraries that use in-memory pointers.
• If all pages are allocated the same address as in the short page identifier, no changes are required in the page!
• No need for deswizzling: a page after swizzling can be saved back directly to disk.
• A process should not access more pages than the size of virtual memory; reuse of virtual memory addresses for other pages is expensive.

Disk versus Memory Structure of Objects
• The format in which objects are stored in memory may be different from the format in which they are stored on disk in the database. Reasons are:
  – software swizzling: the structures of persistent and in-memory pointers are different
  – the database is accessible from different machines, with different data representations
• Make the physical representation of objects in the database independent of the machine and the compiler.
• Can transparently convert from the disk representation to the form required on the specific machine, language, and compiler, when the object (or page) is brought into memory.

Large Objects
• Very large objects are called binary large objects (blobs) because they typically contain binary data. Examples include:
  – text documents
  – graphical data such as images and computer-aided designs
  – audio and video data
• Large objects may need to be stored in a contiguous sequence of bytes when brought into memory.
  – If an object is bigger than a page, contiguous pages of the buffer pool must be allocated to store it.
  – May be preferable to disallow direct access to data, and only allow access through a file-system-like API, to remove the need for contiguous storage.

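A file-system-like interface for large objects can be sketched as a class that stores the bytes in separate fixed-size pages and exposes only append and read calls. The Blob class and its method names are hypothetical; the sketch keeps pages in memory and ignores the buffer pool entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical large object stored as fixed-size pages that need not be
// contiguous; callers see only a byte-offset interface.
class Blob {
public:
    explicit Blob(std::size_t page_size) : page_size_(page_size) {
        assert(page_size > 0);
    }

    // Appends bytes, filling the last page before starting a new one.
    void append(const std::string& bytes) {
        for (char c : bytes) {
            if (pages_.empty() || pages_.back().size() == page_size_) {
                pages_.emplace_back();
                pages_.back().reserve(page_size_);
            }
            pages_.back().push_back(c);
            ++size_;
        }
    }

    // Reads up to `count` bytes starting at `offset`, like a file read call.
    std::string read(std::size_t offset, std::size_t count) const {
        std::string out;
        if (offset >= size_) return out;
        count = std::min(count, size_ - offset);
        for (std::size_t pos = offset; pos < offset + count; ++pos) {
            out.push_back(pages_[pos / page_size_][pos % page_size_]);
        }
        return out;
    }

    std::size_t size() const { return size_; }
    std::size_t page_count() const { return pages_.size(); }

private:
    std::size_t page_size_;
    std::size_t size_ = 0;
    std::vector<std::string> pages_;
};
```

Because callers never receive a pointer into the object, a read that spans a page boundary is assembled from two pages and no contiguous run of buffer pages is ever required.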
Modifying Large Objects
• Use B-tree structures to represent the object: permits reading the entire object as well as updating, inserting, and deleting bytes from specified regions of the object.
• Special-purpose application programs outside the database are used to manipulate large objects:
  – Text data is treated as a byte string manipulated by editors and formatters.
  – Graphical data is represented as a bit map or as a set of geometric objects; can be managed within the database system or by special software (e.g., VLSI design).
  – Audio/video data is typically created and displayed by separate application software and modified using special-purpose editing software.
  – The checkout/checkin method is used for concurrency and version control.