A Middleware Design for Multiple Embedded Database Systems

Jhe-Hao Hu, Chin-Hsien Wu, and Chang-Hong Lin
Department of Electronic Engineering
National Taiwan University of Science and Technology, Taipei, Taiwan
Email: {M9702120, chwu, chlin}@mail.ntust.edu.tw
Abstract—Since embedded systems and consumer electronic
devices are now popular, they have adopted high-capacity storage
systems such as flash-memory cards or solid-state drives (SSDs).
Many embedded database systems (EDBSs) have also emerged to
maintain data on these storage systems. However, it is
complicated and time-consuming to port an application from
one embedded database system to another. In this paper, we
design a middleware for multiple embedded database systems that
takes their different interfaces and overheads into account. With
the help of the middleware, users can conveniently write
applications that easily adopt various embedded database systems.
Furthermore, the middleware can leverage various embedded
database systems for better performance at reasonable cost.
Index Terms—Embedded Systems, Database Systems, Middleware Design
I. INTRODUCTION
Most embedded systems and consumer electronic devices
have adopted NAND flash memory as their storage media
because it offers advantages such as high capacity, low power
consumption, non-volatility, and shock resistance. Many
embedded database systems [4], [5], such as SQLite [1],
Berkeley DB [2], and GDBM [3], have been proposed for
NAND-based storage systems. Each embedded database system
has its own application programming interface (API), so it is
inconvenient to write applications that work well with all
embedded database systems. This observation motivates our
research.
We design a middleware for multiple embedded database
systems that provides a standard API. The middleware
integrates various embedded database systems so that
programmers can write applications that run transparently over
different embedded database systems. The middleware provides
the same basic operations as the embedded database systems
themselves, such as insert, delete, select, and join.
Furthermore, the middleware should reduce programming
overhead and improve system performance when one embedded
database system is replaced by another.
The rest of this paper is organized as follows: Section II
introduces the motivation and related work on some popular
embedded database systems. Section III presents the middleware
design for multiple embedded database systems. Section IV
implements the middleware on two real EDBSs. Section V
evaluates the middleware experimentally. Section VI presents
the conclusion and future work.
II. RELATED WORK AND MOTIVATION
We choose two embedded database systems, SQLite [1] and
Berkeley DB [2], to test the middleware design. SQLite [1] is a
software library that implements a self-contained, serverless,
zero-configuration, transactional SQL database engine. It is a
widely used SQL database engine, and its source code is open.
Berkeley DB [2] is a fast, open-source embedded database that
is used in several well-known open-source products, such as the
Linux and BSD Unix operating systems, the Apache Web server,
the OpenLDAP directory, and the OpenOffice productivity suite.
Fig. 1. Traditional applications and embedded database systems
Traditional embedded database systems each have their own
APIs, so it is inconvenient to write applications that work
well with all of them. Applications and embedded database
systems are tightly coupled: one application can run on only
one embedded database system, as shown in Fig. 1. The
middleware can help applications run on any supported embedded
database system because it provides a standard API.
Programmers can also reduce their programming work when the
embedded database system in use is changed to another one.
III. A MIDDLEWARE DESIGN
A. Overview
Each embedded database system has its own application
programming interface for programmers. If a programmer uses
embedded database system A, he or she has to learn its APIs.
If the system environment does not support embedded database
system A, the programmer might have to modify the original
application for another embedded database system. It is
complicated and time-consuming for programmers to rewrite
applications.
2010 IEEE 14th International Symposium on Consumer Electronics
978-1-4244-6673-3/10/$26.00 ©2010 IEEE
Fig. 2. A middleware structure
We propose a middleware that works well with multiple
embedded database systems. As shown in Fig. 2, applications
can run on EDBS A, B, and C through the middleware. The
middleware plays a transparent communication role between
applications and embedded database systems. Applications do
not need to be modified when one embedded database system is
changed to another. The middleware handles the change process
and reduces its overhead. Furthermore, the middleware can also
choose a better embedded database system under different
access patterns.
B. Design Issues
1) Standard API: The middleware provides programmers
with a standard API to access various embedded database
systems. The standard API includes five basic operations:
create, insert, delete, select, and join. Since the middleware
might support various embedded database systems, it must
understand their operating mechanisms and programming setup.
By considering these issues, a standard API can be designed
well for programmers.
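One way to realize such a standard API in C is a table of function pointers with one entry per embedded database system. The following is a minimal sketch under our own assumptions, not the implementation evaluated in this paper: the names edbs_ops, mw_insert, and mw_delete, the flag values, and the stub backends (which only count calls instead of touching a real database) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend flags; the values are assumptions of this sketch. */
enum { EDBS_SQLITE = 0, EDBS_BERKELEY_DB = 1, EDBS_COUNT = 2 };

/* One adapter per backend: a function pointer for each standard operation. */
struct edbs_ops {
    const char *name;
    int (*insert)(const char *db, const char *table, const char *data);
    int (*del)(const char *db, const char *table);
};

/* Call counters stand in for real database work so routing can be observed. */
static int insert_calls[EDBS_COUNT];
static int delete_calls[EDBS_COUNT];

static int sqlite_insert(const char *db, const char *table, const char *data)
{
    (void)db; (void)table; (void)data;  /* a real adapter would run SQL here */
    insert_calls[EDBS_SQLITE]++;
    return 0;
}

static int sqlite_delete(const char *db, const char *table)
{
    (void)db; (void)table;
    delete_calls[EDBS_SQLITE]++;
    return 0;
}

static int bdb_insert(const char *db, const char *table, const char *data)
{
    (void)db; (void)table; (void)data;  /* a real adapter would store a key/data pair */
    insert_calls[EDBS_BERKELEY_DB]++;
    return 0;
}

static int bdb_delete(const char *db, const char *table)
{
    (void)db; (void)table;
    delete_calls[EDBS_BERKELEY_DB]++;
    return 0;
}

static const struct edbs_ops backends[EDBS_COUNT] = {
    { "SQLite",      sqlite_insert, sqlite_delete },
    { "Berkeley DB", bdb_insert,    bdb_delete    }
};

/* Returns the adapter selected by flag, or NULL for an unknown flag. */
static const struct edbs_ops *select_backend(int flag)
{
    if (flag < 0 || flag >= EDBS_COUNT)
        return NULL;
    return &backends[flag];
}

/* Standard API entry points: applications never call a backend directly. */
int mw_insert(const char *db, const char *table, int flag, const char *data)
{
    const struct edbs_ops *ops = select_backend(flag);
    assert(db != NULL && table != NULL && data != NULL);
    return ops != NULL ? ops->insert(db, table, data) : -1;
}

int mw_delete(const char *db, const char *table, int flag)
{
    const struct edbs_ops *ops = select_backend(flag);
    assert(db != NULL && table != NULL);
    return ops != NULL ? ops->del(db, table) : -1;
}
```

With this layout, supporting another embedded database system would mean adding one more row to the table; the application-facing functions would not change.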
2) Data format: Since different embedded database systems
might use different data formats, the parameters for insert,
delete, select, and join operations might differ. The
middleware should understand these data formats and hide the
backend-specific parameters. When the middleware is used with
multiple embedded database systems, programmers use only a
unified data format and a unified set of parameters to write
applications.
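To make the idea of a unified data format concrete, the sketch below renders one application-level row in two backend-specific forms: an SQL INSERT statement for an SQL engine, and a key plus data buffer for a key/value store. The struct mw_row and the functions row_to_sql and row_to_kv are hypothetical names chosen for illustration; the paper does not specify the actual format. The sketch pastes values directly into the SQL text, which a real middleware would replace with escaping or bound parameters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical unified row: an integer primary key plus the remaining
   column values, already written as an SQL value list. */
struct mw_row {
    int key;
    const char *values;   /* e.g. "'Alice','555-0100'" */
};

/* SQL backend form: writes an INSERT statement into buf.
   Returns 0 on success, -1 if buf is too small. */
int row_to_sql(const struct mw_row *row, const char *table,
               char *buf, size_t len)
{
    int n;
    assert(row != NULL && table != NULL && buf != NULL);
    n = snprintf(buf, len, "INSERT INTO %s VALUES(%d,%s);",
                 table, row->key, row->values);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}

/* Key/value backend form: the key is the decimal text of the primary
   key, and the data is the values string including its terminator.
   Returns 0 on success, -1 if key_buf is too small. */
int row_to_kv(const struct mw_row *row, char *key_buf, size_t key_len,
              const char **data, size_t *data_len)
{
    int n;
    assert(row != NULL && key_buf != NULL && data != NULL && data_len != NULL);
    n = snprintf(key_buf, key_len, "%d", row->key);
    if (n < 0 || (size_t)n >= key_len)
        return -1;
    *data = row->values;
    *data_len = strlen(row->values) + 1;   /* include the terminating NUL */
    return 0;
}
```

The application only ever fills in the unified row; which of the two conversions runs would be decided inside the middleware.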
Fig. 3. Conversion from embedded database system A to embedded database system B
3) Conversion Overhead: When an application is moved to
another embedded database system, the middleware faces a
conversion problem. As shown in Fig. 3, if an application wants
to change its original embedded database system, the
middleware handles the data conversion process: it obtains the
data from the original embedded database system and writes
them into the new one. Obviously, the conversion process
causes overhead, and the middleware should reduce this
overhead as much as possible.
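The conversion step can be pictured as a loop that reads every record from the original store and writes it into the new one. The sketch below uses a small in-memory array as a stand-in for each embedded database system; struct store, store_put, store_get, and convert are hypothetical names, and a real conversion would call the backend APIs instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ROWS 16
#define MAX_TEXT 64

/* Stand-in for one EDBS: a fixed-size array of (key, text) rows. */
struct store {
    int    keys[MAX_ROWS];
    char   text[MAX_ROWS][MAX_TEXT];
    size_t count;
};

/* Inserts a row or overwrites the row with the same key.
   Returns 0 on success, -1 if the store is full or text is too long. */
int store_put(struct store *s, int key, const char *text)
{
    size_t i;
    assert(s != NULL && text != NULL);
    if (strlen(text) >= MAX_TEXT)
        return -1;
    for (i = 0; i < s->count; i++)
        if (s->keys[i] == key)
            break;
    if (i == s->count) {            /* key not present: append a new row */
        if (s->count == MAX_ROWS)
            return -1;
        s->keys[i] = key;
        s->count++;
    }
    strcpy(s->text[i], text);
    return 0;
}

/* Returns the text stored under key, or NULL if the key is absent. */
const char *store_get(const struct store *s, int key)
{
    size_t i;
    assert(s != NULL);
    for (i = 0; i < s->count; i++)
        if (s->keys[i] == key)
            return s->text[i];
    return NULL;
}

/* Conversion: one read from src and one write to dst per record.
   Returns the number of records copied, or -1 on the first failure. */
int convert(const struct store *src, struct store *dst)
{
    size_t i;
    assert(src != NULL && dst != NULL);
    for (i = 0; i < src->count; i++)
        if (store_put(dst, src->keys[i], src->text[i]) != 0)
            return -1;
    return (int)src->count;
}
```

Because each record is read once and written once, the cost of this naive conversion grows with the number of stored records, which is the overhead the middleware should try to keep small.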
4) Benchmark: The middleware can play a smart role by
testing the efficiency of the embedded database systems and
then suggesting a better one, since different embedded
database systems might have different strengths. For example,
some embedded database systems might have fast read and write
performance, some might have short search times, and some
might require fewer system resources. The middleware can
provide related benchmarks for programmers and determine
which embedded database system is suitable.
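A benchmark of this kind could be as simple as timing the same workload on each backend and suggesting the fastest one. The sketch below is an assumption-laden illustration: workload_fn, time_workload, pick_fastest, and count_hit are hypothetical names, only CPU time from the standard clock() function is measured, and other criteria mentioned above (search time, resource usage) are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* A workload is any function that exercises one backend;
   ctx is passed through unchanged. */
typedef void (*workload_fn)(void *ctx);

/* Returns the CPU time, in seconds, consumed by one run of the workload. */
double time_workload(workload_fn fn, void *ctx)
{
    clock_t start, end;
    assert(fn != NULL);
    start = clock();
    fn(ctx);
    end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Returns the index of the smallest measured time, or -1 if n is zero.
   Ties go to the lower index. */
int pick_fastest(const double *seconds, size_t n)
{
    size_t i, best = 0;
    if (n == 0)
        return -1;
    assert(seconds != NULL);
    for (i = 1; i < n; i++)
        if (seconds[i] < seconds[best])
            best = i;
    return (int)best;
}

/* Trivial example workload: records that it ran. A real workload
   would issue inserts or selects against one backend. */
void count_hit(void *ctx)
{
    long *hits = ctx;
    (*hits)++;
}
```

A caller would run time_workload once per backend with the same workload, collect the results in an array indexed by backend, and pass that array to pick_fastest.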
IV. A MIDDLEWARE IMPLEMENTATION
In this section, we implement a middleware that supports
two real EDBSs: SQLite [1] and Berkeley DB [2], which were
introduced in Section II. We first introduce how SQLite and
Berkeley DB work and then present how to design a standard
API for the middleware.
A. SQLite
SQLite is an open-source, lightweight embedded database
system that was created by D. Richard Hipp in the C language.
SQLite has the following features:
• Unlike most other SQL databases, SQLite does not have a
separate server process. SQLite reads and writes directly
to ordinary disk files.
• Transactions are atomic, consistent, isolated, and durable
(ACID), even after system crashes and power failures.
• It implements most of SQL-92.
• Small code footprint: less than 300 KB fully configured,
or less than 180 KB with optional features omitted.
• Sources are in the public domain and can be used for any
purpose.
We introduce how SQLite uses its APIs to access data as
follows:
1) SQLite Structure:
• typedef struct sqlite3 sqlite3;
Each open SQLite database is represented by a pointer
to an instance of the opaque structure named "sqlite3".
It is useful to think of an sqlite3 pointer as an object.
2) SQLite API: SQLite has a basic pointer structure and
basic APIs for opening and closing the database.
• int sqlite3_open(const char *filename, sqlite3 **ppDb);
This function opens the SQLite database file whose name
is given by the filename argument and returns the database
handle through ppDb.
• int sqlite3_close(sqlite3 *);
This function is the destructor for the sqlite3 object.
SQLite can write and read data from databases by using the
SQL-92 language.
• int sqlite3_exec(sqlite3 *, const char *sql, int (*callback)(void *, int, char **, char **), void *, char **errmsg /* Error msg written here */);
Programmers just write SQL-92 statements and use
sqlite3_exec() to control databases. The callback and its
void * argument may be NULL when no result rows need to
be processed. If programmers want to create a table, they
set char *sql to "create table table_name(table_spec)" and
call sqlite3_exec().
B. Berkeley DB
Berkeley DB is an open-source product from Oracle that
provides developers with fast, reliable, local persistence with
zero administration. Often deployed as an embedded database,
Berkeley DB can provide high performance, reliability,
scalability, and availability for applications. However, it
does not support the SQL-92 language. Berkeley DB has the
following features:
• ACID transactions.
• Indexed and sequential retrieval (Btree, Queue, Hash).
• Programmatic administration and management, with zero
human administration.
• Source code is available under an open-source license.
Berkeley DB supports many programming languages, such as
C, C++, Java, Perl, and PHP. We use the C API to design our
middleware and introduce its structures and C APIs as follows.
1) Berkeley DB Structure: Berkeley DB [2] does not support
SQL-92 for controlling databases; instead, it has specific
structures for accessing data in databases.
• typedef struct { void *data; /* a pointer to the data, e.g., a string */ u_int32_t size; /* the length of data, in bytes */ } DBT;
A DBT (shown here with only its two main fields) stores
data and its length. Berkeley DB writes data to and reads
data from a database through DBTs.
• typedef struct DB DB;
DB is the handle for a Berkeley DB database.
2) Berkeley DB API: We introduce how Berkeley DB uses its
APIs to access data as follows. The method signatures are
shown in simplified form; the actual C methods take additional
arguments, such as a transaction handle and flags.
• db_create(DB **dbp, DB_ENV *dbenv, u_int32_t flags);
The db_create() function creates a DB structure that is the
handle for a Berkeley DB database. The function allocates
memory for the structure and returns a pointer to the
structure in the memory to which dbp refers.
• DB->open(DB *db, const char *db_name, DBTYPE type);
The DB->open() method opens the database. The DBTYPE is
Btree, Hash, Queue, or Recno.
• DB->close(DB *db);
The DB->close() method flushes any cached database
information to disk, closes any open cursors, frees any
allocated resources, and closes any underlying files.
• DB->put(DB *db, DBT *key, DBT *data);
The DB->put() method stores key/data pairs in the database.
Its default behavior is to insert the new key/data pair,
replacing any previously existing key if duplicates are
disallowed, or adding a duplicate data item if duplicates
are allowed.
• DB->get(DB *db, DBT *key, DBT *data);
The DB->get() method retrieves key/data pairs from the
database. The address and length of the data associated
with the specified key are returned in the structure to
which data refers.
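The key/data style of access described above can be illustrated without linking against Berkeley DB. The sketch below is a stand-in, not the Berkeley DB API: kv_dbt mirrors only the two DBT fields quoted above, kv_db plays the role of a database handle opened without duplicate support, and kv_put and kv_get are hypothetical functions that imitate the default put and get behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the two DBT fields quoted above. */
typedef struct {
    void        *data;   /* pointer to the bytes */
    unsigned int size;   /* length of data, in bytes */
} kv_dbt;

#define KV_MAX 8

/* Stand-in for a database handle opened without duplicate support. */
typedef struct {
    kv_dbt keys[KV_MAX];
    kv_dbt vals[KV_MAX];
    size_t count;
} kv_db;

/* Keys are equal when they have the same length and the same bytes. */
static int same_key(const kv_dbt *a, const kv_dbt *b)
{
    return a->size == b->size && memcmp(a->data, b->data, a->size) == 0;
}

/* Imitates the default put behavior: insert a new pair, or replace the
   data of an existing key. This sketch stores the caller's pointers
   rather than copying the bytes. Returns 0 on success, -1 if full. */
int kv_put(kv_db *db, const kv_dbt *key, const kv_dbt *data)
{
    size_t i;
    assert(db != NULL && key != NULL && data != NULL);
    for (i = 0; i < db->count; i++) {
        if (same_key(&db->keys[i], key)) {
            db->vals[i] = *data;        /* duplicates disallowed: overwrite */
            return 0;
        }
    }
    if (db->count == KV_MAX)
        return -1;
    db->keys[db->count] = *key;
    db->vals[db->count] = *data;
    db->count++;
    return 0;
}

/* On success, data->data and data->size describe the stored value.
   Returns 0 if the key was found, -1 otherwise. */
int kv_get(const kv_db *db, const kv_dbt *key, kv_dbt *data)
{
    size_t i;
    assert(db != NULL && key != NULL && data != NULL);
    for (i = 0; i < db->count; i++) {
        if (same_key(&db->keys[i], key)) {
            *data = db->vals[i];
            return 0;
        }
    }
    return -1;
}
```

Note that nothing in this interface names a table: a record is only a key and a block of bytes, which is why the middleware has to add the table concept itself.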
C. Standard API Design
The middleware integrates various embedded database
systems, each of which has its own APIs. A standard API should
therefore be designed to work well with all of them.
According to our observations, SQLite supports the SQL-92
language and the concept of tables, whereas Berkeley DB does
not support the table concept. If we want to insert data with
the same primary keys into Berkeley DB, Berkeley DB has to
store these data in different database files. When we design a
standard API, we should add the table concept and resolve
the potential conflicts between different embedded database
systems. We list the standard APIs as follows:

• void create(char *db_name, char *table_name, int flag /* which database will be used */, char *spec /* e.g., (a INTEGER PRIMARY KEY, b INT, e TEXT) */);
This function creates a table and an initial database.
• void insert(char *db_name, char *table_name, int flag, char *data);
This function inserts data into a database.
• void deletedata(char *db_name, char *table_name, int flag);
This function deletes data from a database.
• void selectdata(char *db_name, char *table_name, int flag, int num, char *spec);
This function searches for data in a database.
• void joindata(char *db_name, char *table1_name, int num1, char *table2_name, int num2, int flag);
This function joins two tables.
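Since every function in the standard API takes both a database name and a table name, a key/value backend needs some rule for turning that pair into a place to store records. One possible rule, consistent with storing each table in its own database file, is sketched below. The function table_file_name and the "db.table.db" naming scheme are our own assumptions for illustration; the paper does not state how the files are named.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical naming scheme: one key/value file per (database, table)
   pair, named "<db_name>.<table_name>.db".
   Returns 0 on success, -1 if a name is empty or buf is too small. */
int table_file_name(const char *db_name, const char *table_name,
                    char *buf, size_t len)
{
    int n;
    assert(db_name != NULL && table_name != NULL && buf != NULL);
    if (strlen(db_name) == 0 || strlen(table_name) == 0)
        return -1;                       /* both names are required */
    n = snprintf(buf, len, "%s.%s.db", db_name, table_name);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Under this scheme, two tables that share primary-key values never collide, because their records live in different files; the price is an extra name lookup and file handle per table.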
V. EXPERIMENT
In this section, we test our middleware and measure its
performance and overhead. Our experiments were performed
on a dual-core 2.0 GHz Intel Pentium machine with 3 GB of
RAM running SUSE Linux 2.6.21.6. In the experiments, our
data specification is a phone book of the form (ID INTEGER
PRIMARY KEY, Name VARCHAR(50), PhoneNumber VARCHAR(50)).
We create 1000, 5000, and 10000 records according to this
data specification and execute insert and join operations on
these records.
Using SQLite, we inserted 1000, 5000, and 10000 records.
The execution time differs only slightly between using and not
using the middleware, as shown in Fig. 4. This means that the
middleware caused only a little overhead for SQLite, because
the middleware merely relays communication between the
application and SQLite.
Fig. 4. Insertion time for SQLite with and without the middleware
Using Berkeley DB [2], we also inserted 1000, 5000, and
10000 records, as shown in Fig. 5. We can observe that the
execution time with the middleware was longer than that
without the middleware. This is because Berkeley DB does not
provide the table concept, and the middleware incurs overhead
in adding it. However, this overhead is required when a
standard API is implemented.
Fig. 5. Insertion time for Berkeley DB with and without the middleware
Fig. 6. Join time for SQLite and Berkeley DB using the middleware
For the join function using the middleware, we can observe
that SQLite was faster than Berkeley DB, as shown in Fig. 6.
Since Berkeley DB does not provide the table concept, the
middleware has to resolve this conflict when different
embedded database systems are integrated, and the extra
conversion needed to emulate tables causes overhead. As a
result, SQLite might have better performance when join
operations are required.
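To see why a join can be costly on a backend without tables, consider what the middleware might have to do itself once the rows of both tables have been read out of the key/value store. The sketch below shows a plain nested-loop equi-join over in-memory rows; it is an illustration under our own assumptions, not the join implementation measured above. The names struct row, match_fn, and nested_loop_join are hypothetical, and num1 and num2 are interpreted here as the column indexes to compare.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_COLS 4

/* Stand-in row: column values as strings, as a middleware might
   decode them from a key/value store. */
struct row {
    const char *col[MAX_COLS];
};

/* Callback invoked once for each matching pair of rows. */
typedef void (*match_fn)(const struct row *left, const struct row *right,
                         void *ctx);

/* Nested-loop equi-join of t1.col[num1] with t2.col[num2].
   Returns the number of matching pairs; on_match may be NULL. */
size_t nested_loop_join(const struct row *t1, size_t n1, int num1,
                        const struct row *t2, size_t n2, int num2,
                        match_fn on_match, void *ctx)
{
    size_t i, j, matches = 0;
    assert(num1 >= 0 && num1 < MAX_COLS);
    assert(num2 >= 0 && num2 < MAX_COLS);
    for (i = 0; i < n1; i++) {
        for (j = 0; j < n2; j++) {
            if (strcmp(t1[i].col[num1], t2[j].col[num2]) == 0) {
                matches++;
                if (on_match != NULL)
                    on_match(&t1[i], &t2[j], ctx);
            }
        }
    }
    return matches;
}
```

This loop performs n1 times n2 comparisons on top of the reads needed to fetch both tables, whereas an SQL engine evaluates the join internally and can use its indexes.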
According to Fig. 4, Fig. 5, and Fig. 6, the middleware's
overhead comes mainly from adding the table concept to
embedded database systems, especially Berkeley DB. The
middleware design therefore provides programmers with
flexibility but might cause overhead when integrating
different embedded database systems.
Regarding conversion, changing the embedded database
system causes extra overhead, so we advise programmers to
weigh this conversion overhead against the expected
performance. If programmers want to change from one embedded
database system to another, the middleware reads the data from
the original embedded database system and inserts them into
the new one. The advantage is that programmers never have to
handle the conversion process themselves.
VI. CONCLUSION
Since embedded systems and consumer electronic devices
are now popular, they have adopted flash-memory cards or
solid-state drives as their storage systems. Many embedded
database systems have also emerged to maintain data on these
storage systems. Because it is complicated for users to port
an application from one embedded database system to another,
we propose a middleware design to resolve this issue. The
contributions of this paper are as follows:
• The middleware provides a standard development API for
multiple embedded database systems and reduces
programming overhead.
• According to the development environment, the middleware
handles the change process between embedded database
systems, so programmers do not need to rewrite
applications.
• The middleware can provide related benchmarks for
programmers and determine which embedded database system
is suitable.
For future research, we should further explore different
application characteristics and different workloads in
embedded database systems. More research and tool designs for
optimizing the middleware for different embedded applications
might prove very rewarding.
VII. ACKNOWLEDGMENTS
This paper is supported in part by research grants from
the National Science Council under Grants 98-2221-E-011-091-
and 98-2221-E-011-103-.
REFERENCES
[1] SQLite3. http://www.sqlite.org/
[2] BerkeleyDB. http://www.oracle.com/technology/products/berkeley-db/index.html/
[3] GDBM. http://www.gnu.org/software/gdbm/
[4] GyeJeong Kim, SeungCheon Baek, HyunSook Lee, HanDeok Lee,
Moon Jeung oe, "LGeDBMS: a small DBMS for embedded system
with flash memory," 32nd International Conference on Very Large
Data Bases, 2006, pp. 1255-1258.
[5] Sang-Won Lee, Gap-Joo Na, Jae-Myung Kim, Joo-Hyung Oh,
Sang-Woo Kim, "Research issues in next generation DBMS for
mobile platforms," 9th Intl. Conf. on Human Computer Interaction
with Mobile Devices and Services, 2007, pp. 457-461.