Forewords



Selected Presenters’ Papers Related to Quantum and DNA computing


7. Vojislav Stojkovic and Hongwei Huo
DNA-Based Algorithms for Determining the Smallest/Largest Element of a Set of Unsigned Integer Binary Numbers

6. Vojislav Stojkovic and Hongwei Huo
A Permutations Generation DNA-Based Algorithm - An Example of a DNA Computing Killer Application
Journal of Scientific and Practical Computing, Vol 1, No. 1 (March 2007), pp. 16-29.
http://www.spclab.com/publisher/vol1no1.html


5. Hongwei Huo and Vojislav Stojkovic
Two-phase Quantum based Evolutionary Algorithm for Multiple Sequence Alignments
2006 International Conference on Computational Intelligence and Security (CIS'2006), Guangzhou, China, November 3-6, 2006. Volume 1, Pages: 374-379.
Digital Object Identifier 10.1109/ICCIAS.2006.294157
Reprinted in Y. Wang, Y. Choung, and H. Liu (Eds.): CIS 2006, LNAI 4456, pp. 11-21, 2007. Springer-Verlag Berlin Heidelberg 2007


4. Hongwei Huo, Vojislav Stojkovic, and Qiuli Lin
Maximum Weighted Path Approach to Multiple Alignments for DNA Sequences
The 2nd International Conference on Natural Computation (ICNC'06) and the 3rd International Conference on Fuzzy Systems and Knowledge Discovery (FSKD'06), Xidian University, Xi'an, China, September 24-28, 2006.
Reprinted in L. Jiao et al. (Eds): ICNC 2006. Part I, LNCS 4222, pp. 336-339, 2006. Springer-Verlag Berlin Heidelberg 2006
http://www.springerlink.com/content/y80016361665t171/


3. Hongwei Huo, Vojislav Stojkovic, and Qiuli Lin
Quantum Based Evolutionary Algorithm for Multiple Sequence Alignments (6-page poster)
Computational Systems Bioinformatics, Stanford University, Stanford, California, August 14-18, 2006.


2. Vojislav Stojkovic, Hongwei Huo, and Elizabeth Britto
DNA Based Addition and Subtraction of Two Unsigned Integer Numbers Inspired by Unrestricted Grammars Implemented in Prolog Language (6-page poster)
Computational Systems Bioinformatics, Stanford University, Stanford, California, August 14-18, 2006.


1. Clare Bates Congdon, Colby College, ccongdon@colby.edu
John Dougherty, Haverford College, jdougher@haverford.edu
David Evans, University of Virginia, evans@cs.virginia.edu
Mark LeBlanc, Wheaton College, mleblanc@wheatoncollege.edu
Joyce Currie Little, Towson University, jlittle@towson.edu
Jane Prey, National Science Foundation, jprey@nsf.gov
Vojislav Stojkovic, Morgan State University, stojkovi@morgan.edu
Paul Tymann, Rochester Institute of Technology, ptt@cs.rit.edu
Computer Science and Bioinformatics
Math & Bio 2010: Linking Undergraduate Disciplines
Lynn Arthur Steen, Editor
The Mathematical Association of America, 2005, ISBN 0-88385-818-5


DNA-Based Algorithms for Determining the Smallest/Largest Element of a Set of Unsigned Integer Binary Numbers

Vojislav Stojkovic (1) and Hongwei Huo (2)

(1) Morgan State University, Computer Science Department
Baltimore, MD 21251, USA
stojkovi@jewel.morgan.edu

(2) Xidian University, School of Computer Science and Technology
Xi’an 710071, P.R. China
hwhuo@mail.xidian.edu.cn



________________________________________________________________________________

Abstract: The paper presents DNA-Based algorithms for determining the smallest/largest element of a set of unsigned integer binary numbers of the given length. The given DNA-Based algorithms can be executed step by step in a DNA-lab or on a DNA-computer, or automatically on a super DNA-computer or on a von Neumann-based electronic-digital computer (for a small number of elements).

Keywords: DNA-computation, DNA-computer, minimum, maximum.

________________________________________________________________________________

1. Introduction


The minimum problem and the maximum problem, that is, determining the smallest element and the largest element of a set of numbers, are among the first nontrivial problems attacked by mathematicians and computer scientists.

The minimum/maximum problem can be specified formally as follows:

Input: A set of numbers. The input set is not ordered. The input set may contain repeated numbers.

Output: The smallest/largest number of the input set.

For simplicity, and because of the restricted length of the paper, we consider unsigned integer binary numbers instead of integer and floating point numbers.


2. DNA


DNA, deoxyribonucleic acid, is a molecule found in every living cell, which directs the formation, growth, and reproduction of cells. DNA consists of nucleotides. Nucleotides contain compounds called phosphate, deoxyribose, and base. Within all nucleotides, phosphate and deoxyribose are the same; however, the bases vary. The four distinct bases are: adenine (A), guanine (G), thymine (T), and cytosine (C). The exact amount of each nucleotide and the order in which they are arranged are unique for every kind of living organism. DNA represents information as a pattern of molecules on a DNA strand. A DNA strand is a string over the alphabet {A, C, G, T}. The length of a DNA strand is equal to the length of the string that represents the DNA strand.


3. DNA Computer

A DNA computer is a chemical instrument consisting of a system of connected test tubes and other auxiliary units. DNA computers use the chemical properties of DNA molecules by examining the patterns of combination or growth of the molecules or strings. DNA computers can do this through the manufacture of enzymes, which are biological catalysts that could be considered the ‘software’ used to execute the desired DNA computation. DNA computers represent information in terms of DNA. In DNA computers, deoxyribonucleic acids serve as the memory units that can take on four possible values (A, C, G, or T). DNA computers do not have the von Neumann architecture. DNA computers are massively parallel and are considered promising for complex problems that require multiple simultaneous computations. DNA computers perform computations by synthesizing particular sequences of DNA and allowing them to react in test tubes. The task of the DNA computer is to check each possible solution and remove those that are incorrect, using restriction enzymes. When the chemical reactions are complete, the DNA strands can then be analyzed to find the solution.

A super DNA computer is a programmable DNA computer.


4. DNA Computing


In 1961 Feynman [4] predicted, and in 1994 Adleman [1] realized, computation at the molecular level: computing with DNA. DNA computing began in 1994 when Adleman showed that DNA computing was possible by solving an instance of the directed Hamiltonian Path problem on a DNA computer. Adleman [2] used DNA polymerase and Watson-Crick complementary strands to do DNA computation. Since then, there has been a surge of research in the DNA computation field. DNA computation has emerged as an exciting new research field at the intersection of computer science, biology, mathematics, and engineering. DNA computation has been demonstrated to have the capability to solve problems considered to be computationally difficult for von Neumann machines. After the Hamiltonian Path problem was solved, several researchers (such as Lipton [6]) proposed solutions to a spectrum of NP-complete problems dealing with satisfiability, cryptography, as well as other search oriented problems.

Adleman’s [3] work has greatly influenced our work; however, our approach is different. Adleman’s approach was biochemical-oriented, while our approach is computer science-oriented: (program+DNA)-oriented (based on a super DNA computer and/or modeling and simulation of biochemical processes using the Easel or Prolog programming languages); see Stojkovic [8, 9, 10] and Steele [7].

We have assumed that DNA computations are error-free, i.e., they work perfectly without any errors. However, in reality DNA computations can be faulty because some DNA operations can introduce errors.

DNA operations are constrained by biological feasibility.

DNA operations may be:

(i) realized by the present biotechnology or
(ii) implemented by simulation on conventional von Neumann computers.


5. DNA Computation Model


As computer components become smaller and/or more compact, scientists and engineers dream of a chemical, multi-processor computer whose processors are individual molecules involved in chemical processes.

Following this thinking, we propose a DNA computation model that involves the following three levels of operations:

(i) Basic DNA operations (DNA molecular interactions);
(ii) Test tube operations (proposed in 1996 by Gibbons, Amos, and Hodgson [5]) such as remove, union, copy, select, etc.;
(iii) High level operations.

The high level operations are a selection of Easel/C-like programming language statements such as:

(i) begin-end or {-} (for grouping)
(ii) if-then-else (for selection)
(iii) for (for loop)


The basic DNA operations level is the level of chemical interactions between DNA molecules. It may be seen as machine programming and may be interpreted as the execution of machine code. The basic DNA operations can be implemented on DNA computers or simulated on von Neumann machines.

The test tube operations level is an interface level that serves as an interface between the von Neumann machine and the DNA machine. It may be seen as the hardware of a DNA computer. The test tube operations can be implemented on DNA computers or simulated on von Neumann machines.

The high level operations, that is, the programming language level, can be implemented using von Neumann machines with standard processors, operating systems, and programming language processors.

In the last twelve years DNA computation has emerged as an exciting, fascinating, and important new research field at the intersection of computer science, mathematics, biology, chemistry, bioinformatics, and engineering.


The main reasons for the interest in DNA-computations are:

(i) size and variety of available DNA molecular "tool boxes"
(ii) massive parallelism inherent in laboratory and chemical operations on DNA molecules
(iii) feasible and efficient models
(iv) physical realizations of the models
(v) performing computations in vivo.

Unfortunately, it is still not clear whether DNA computing can compete (or will be able to compete in the near future) with existing electronic-digital computing. We propose that in the near future it will be possible to join a von Neumann computer and a DNA computer in a functional super biocomputer. We are confident that in 10-20 years our desktop computers will have evolved into biocomputers. These machines will be able to perform in seconds calculations that take today’s PCs hours, and solve in hours problems that take today’s PCs years.



The computational substrate, that is, the substance that is acted upon by the implementation of the DNA computational model, is DNA. DNA molecules are represented by strings. The DNA computational model operates upon sets of strings. A DNA computation starts and ends with a single set of strings.

A DNA algorithm is composed of a sequence of operations upon one or more sets of strings. At the end of the DNA algorithm’s execution, a solution to the given problem is encoded as a string in the final set.

Characterization of DNA computations using traditional measures of complexity, such as time and space, is misleading due to the nature of the laboratory implementation of DNA computation. One way to quantify the time complexity of a DNA-based algorithm is to count the required number of “biological steps” to solve the problem. The biological steps include the creation of an initial library of strands, separation of subsets of strands, sorting strands by length, chopping and joining strands, etc.


6. Basic DNA Operations


An assignment is a finite sequence of unit assignments. A unit assignment is coded by a DNA strand. All unit assignments of an assignment have the same length.

The most important basic DNA operations are:

(i) Append (Concatenate, Rejoin) -- appends two DNA strands with ‘sticky ends’
(ii) Melt -- separates two DNA strands with complementary sequences (Anneal, or Renaturation, is the reverse operation that joins them)
(iii) Cut -- cuts a DNA strand with restriction enzymes.



7. Test Tube Operations


A test tube contains an assignment.

The most important test tube operations are:

(i) Union (Merge, Create) -- pours the contents of several tubes into one tube.
(ii) Copy (Duplicate, Amplify) -- makes copies of a tube.
(iii) Separate -- separates an assignment into a finite sequence of assignments sorted by the length of unit assignments.
(iv) Detect -- confirms presence or absence of a unit assignment in a tube.
(v) Select -- selects a unit assignment from an assignment uniformly at random.
(vi) Append (Concatenate, Rejoin) -- appends a unit assignment to each unit assignment of an assignment.
(vii) Melt (Anneal, Renaturation) -- melts each unit assignment of an assignment with a unit assignment.
(viii) Extract -- extracts the contents of one tube into two tubes using a pattern unit assignment.
(ix) Remove -- removes unit assignments that contain occurrence(s) of other unit assignments.
(x) Cut -- cuts each unit assignment of an assignment to the given length.
(xi) Discard -- empties the tube.
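
The test tube operations listed above can be simulated on a conventional computer. The following is a minimal sketch, assuming that a tube is modelled as a Python list of strings (so repeated strands are kept) and that a pattern matches when it occurs as a substring. The function names and signatures are illustrative choices, not part of the paper's model, and only a subset of the operations is shown.

```python
import random

# Assumption: a tube is a plain list of strings (a multiset of strands).

def union(*tubes):
    """Union: pour the contents of several tubes into one new tube."""
    return [strand for tube in tubes for strand in tube]

def copy(tube, n=2):
    """Copy: return n tubes, each holding a copy of the original contents."""
    return [list(tube) for _ in range(n)]

def separate(tube):
    """Separate: group strands by length, shortest group first."""
    by_length = {}
    for strand in tube:
        by_length.setdefault(len(strand), []).append(strand)
    return [by_length[n] for n in sorted(by_length)]

def detect(tube):
    """Detect: report whether the tube holds at least one strand."""
    return len(tube) > 0

def select(tube, rng=random):
    """Select: pick one strand uniformly at random."""
    return rng.choice(tube)

def extract(tube, pattern):
    """Extract: split a tube into (strands with pattern, strands without)."""
    with_pattern = [s for s in tube if pattern in s]
    without_pattern = [s for s in tube if pattern not in s]
    return with_pattern, without_pattern

def remove(tube, patterns):
    """Remove: drop every strand that contains any of the given patterns."""
    return [s for s in tube if not any(p in s for p in patterns)]
```

Each function returns new tubes instead of modifying its arguments, which keeps the simulated steps easy to inspect one at a time.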


8. DNA Representations

A DNA representation of a string c1 ... cm is a sequence c[1] ... c[m], where:

- c[i], where i = 1, ..., m, is the character at the position i.

Characters c[i] are uniquely encoded by DNA strands.

If an unsigned integer number is not used for numerical calculations, then the unsigned integer number may be represented as a string of digits of some base.

A DNA representation of an unsigned integer number d1...dm is a sequence d[1]...d[m], where:

- d[i], where i = 1, ..., m, is the digit at the position i, and
- 0 <= d[i] <= base-1.

The base may be any integer number >= 1.

Digits d[i] are uniquely encoded by DNA strands.

If an unsigned integer number is used for numerical calculations, then the given DNA representation of an unsigned integer number is not suitable because it does not handle carries, which complicates the implementation of arithmetic operations with unsigned integer binary numbers.


9. DNA-Based Algorithm for Creating a Set of Unsigned Integer Binary Numbers

The DNA-Based algorithm for creating a set of unsigned integer binary numbers may be specified in the following way:

procedure CreateInputSet(m, T)
// m is the input data;
// m is an unsigned integer number;
// T is the input data and the output data;
// T is the tube;
// T as the input data
// is the set of one or more empty strings.
// The tube T contains a finite sequence of unit assignments
// (DNA strands) that represents empty strings.
// T as the output data
// is the set of unsigned integer binary numbers of the length m;
// T has maximum 2^m elements;
// The tube T contains a finite sequence of unit assignments
// (DNA strands) that represents
// the set of unsigned integer binary numbers of the length m;
{
    base = 2;
    T = {ε}; // ε is the empty string
    for (i = 1; i <= m; i++)
    {
        Copy(T, { T[base-1] });
        Parallel for (j = 0; j <= base-1; j++)
        {
            k = rand(base-1);
            Append(T[j], k, T[j]);
        }
        Union({ T[base-1] }, T);
        Discard({ T[base-1] });
    }
}
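
As a rough check of the procedure, CreateInputSet can be simulated on a conventional computer. The sketch below rests on several assumptions that the pseudocode does not state explicitly: a tube is a Python list of strings, rand(base-1) yields a digit from 0 to base-1, Copy fills the tubes T[0] to T[base-1], and Append places the new digit in front of each strand. The ε marker is modelled as the empty string, and the name create_input_set is a hypothetical choice.

```python
import random

def create_input_set(m, base=2, rng=None):
    """Simulate CreateInputSet: build a tube of base-`base` strings of length m."""
    rng = rng or random.Random()
    tube = [""]                                   # T = {ε}
    for _ in range(m):
        # Copy: make one working tube T[j] per digit value.
        copies = [list(tube) for _ in range(base)]
        # Parallel for: each working tube gets one random digit on every strand.
        for j in range(base):
            k = rng.randrange(base)               # assumed meaning of rand(base-1)
            copies[j] = [str(k) + strand for strand in copies[j]]
        # Union the working tubes back into T; the copies are then discarded.
        tube = [strand for c in copies for strand in c]
    return tube

if __name__ == "__main__":
    print(create_input_set(4, rng=random.Random(0)))
```

Because the digits are random, the result holds base**m strands but not necessarily base**m distinct numbers, which matches the comment that T has at most 2^m elements.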


10. DNA-Based Algorithm for Determining the Smallest Element of a Set of Unsigned Integer Binary Numbers

The DNA-Based algorithm for determining the smallest element of a set of unsigned integer binary numbers of the given length may be specified in the following way:

function Smallest(m, T)
// The function Smallest returns
// the smallest element of the set T;
// It is the unit assignment (DNA strand).
// m is the input data;
// m is an unsigned integer number;
// T is the input data;
// T is the tube;
// T as the input data
// is the set of unsigned integer binary numbers of the length m;
// T has maximum 2^m elements;
// The tube T contains a finite sequence of unit assignments
// (DNA strands) that represents
// the set of unsigned integer binary numbers of the length m;
{
    for i = m to 1 do
    {
        // Parallel
        // {
        //     T -> T[0];
        //     T -> T[1];
        // }
        Copy(T, { T[1] });
        // It is very important to empty the tube T.
        Discard(T);
        Parallel
        {
            Remove(T[1], {i 0});
            Remove(T[0], {i 1});
        }
        if (Detect(T[0]))
        then
            Union(T[0], T);
        else
            Union(T[1], T);
    }
    return Select(T);
}



Explanations

i denotes a position, that is, an unsigned integer number from the range 1 .. m.

i j denotes the digit j at the position i, where i is from the range 1 .. m.

Remove(T[j], {i j}) removes from the tube T[j] all DNA strands which contain at least one occurrence of the DNA substrand i j.

Remove(T[j], {i j}) keeps in the tube T[j] only DNA strands which contain at the position i the value ¬j.

Parallel{A; B} means that the operations A and B may be executed in parallel.

Complexity

The complexity of the DNA-Based algorithm for determining the smallest element of a set of unsigned integer binary numbers of the length m is O(m) biological steps, because the loop body is executed once for each of the m positions.
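
The function Smallest can be simulated on a conventional computer for small inputs. The following is a minimal sketch, assuming that strands are plain strings of '0' and '1' of the length m without the ε marker, and that position 1 is the rightmost digit, so that position m is the most significant one. The helper digit_at and the list-based tubes are assumptions made for this illustration.

```python
def digit_at(strand, i):
    """Return the digit at position i, where position 1 is the rightmost digit."""
    return strand[len(strand) - i]

def smallest(m, tube):
    """Simulate Smallest(m, T) on a non-empty list of binary strings of length m."""
    tube = list(tube)
    for i in range(m, 0, -1):                 # for i = m to 1
        # Copy T into T[0] and T[1], then filter each copy:
        # Remove(T[0], {i 1}) keeps the strands with digit 0 at position i.
        t0 = [s for s in tube if digit_at(s, i) == "0"]
        # Remove(T[1], {i 0}) keeps the strands with digit 1 at position i.
        t1 = [s for s in tube if digit_at(s, i) == "1"]
        # if Detect(T[0]) then Union(T[0], T) else Union(T[1], T)
        tube = t0 if t0 else t1
    # After m rounds all surviving strands are equal, so Select is deterministic.
    return tube[0]

if __name__ == "__main__":
    print(smallest(4, ["1100", "0110", "0111"]))
```

The loop runs once per position regardless of how many strands the tube holds, which mirrors the O(m) count of biological steps; on a sequential machine each round additionally scans the whole tube.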


Program

The DNA-Based program for determining the smallest element of a set of unsigned integer binary numbers of the length m may be specified in the following way:

program S
{
    m = 4;
    CreateInputSet(m, T);
    smallest = Smallest(m, T);
}.

Program Execution

The program S may be executed:

(i) step by step in a DNA-lab or on a DNA-computer
(ii) automatically on a super DNA-computer or on an electronic-digital computer (for a small number of elements, less than 10).


Test Example


The purpose of the test example is to "visualize" the execution of the DNA-Based program S for determining the smallest element of a set of unsigned integer binary numbers of the length m.

m = 4;
CreateInputSet(4, T);

    base = 2;
    T = {ε}; // ε is the empty string

    T:    ε

    i = 1;
    Copy(T, { T[1] });

    T[0]: ε
    T[1]: ε

    Parallel
    {
        k = rand(1); // for example k = 0
        Append(T[0], k, T[0]);
        k = rand(1); // for example k = 1
        Append(T[1], k, T[1]);
    }

    T[0]: 0ε
    T[1]: 1ε

    Union({ T[1] }, T);

    T:    0ε 1ε

    Discard({ T[1] });

    T[0]: (empty)
    T[1]: (empty)

    i = 2;
    Copy(T, { T[1] });

    T[0]: 0ε 1ε
    T[1]: 0ε 1ε

    Parallel
    {
        k = rand(1); // for example k = 0
        Append(T[0], k, T[0]);
        k = rand(1); // for example k = 0
        Append(T[1], k, T[1]);
    }

    T[0]: 00ε 01ε
    T[1]: 00ε 01ε

    Union({ T[1] }, T);

    T:    00ε 01ε 00ε 01ε

    Discard({ T[1] });

    T[0]: (empty)
    T[1]: (empty)

    And so on for i = 3 and i = 4, with

    k = rand(1); // for example k = 1
    k = rand(1); // for example k = 0
    k = rand(1); // for example k = 0
    k = rand(1); // for example k = 1

    And finally

    T:    0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε
          1100ε 1101ε 1100ε 1101ε 1000ε 1001ε 1000ε 1001ε

smallest = Smallest(4, T);

    i = 4;
    Copy(T, { T[1] });

    T[0]: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε
          1100ε 1101ε 1100ε 1101ε 1000ε 1001ε 1000ε 1001ε
    T[1]: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε
          1100ε 1101ε 1100ε 1101ε 1000ε 1001ε 1000ε 1001ε

    Discard(T);

    T:    (empty)

    Parallel{Remove(T[1], {4 0}); Remove(T[0], {4 1});}

    T[0]: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε
    T[1]: 1100ε 1101ε 1100ε 1101ε 1000ε 1001ε 1000ε 1001ε

    if(Detect(T[0])) then Union(T[0], T);
    else Union(T[1], T);

    T:    0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε

    i = 3;
    Copy(T, { T[1] });

    T[0]: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε
    T[1]: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε

    Parallel{Remove(T[1], {3 0}); Remove(T[0], {3 1});}

    T[0]: 0000ε 0001ε 0000ε 0001ε
    T[1]: 0100ε 0101ε 0100ε 0101ε

    if(Detect(T[0])) then Union(T[0], T);
    else Union(T[1], T);

    T:    0000ε 0001ε 0000ε 0001ε

    i = 2;
    Copy(T, { T[1] });

    T[0]: 0000ε 0001ε 0000ε 0001ε
    T[1]: 0000ε 0001ε 0000ε 0001ε

    Parallel{Remove(T[1], {2 0}); Remove(T[0], {2 1});}

    T[0]: 0000ε 0001ε 0000ε 0001ε
    T[1]: (empty, because every strand has the digit 0 at the position 2)

    if(Detect(T[0])) then Union(T[0], T);
    else Union(T[1], T);

    T:    0000ε 0001ε 0000ε 0001ε

    i = 1;
    Copy(T, { T[1] });

    T[0]: 0000ε 0001ε 0000ε 0001ε
    T[1]: 0000ε 0001ε 0000ε 0001ε

    Parallel{Remove(T[1], {1 0}); Remove(T[0], {1 1});}

    T[0]: 0000ε 0000ε
    T[1]: 0001ε 0001ε

    if(Detect(T[0])) then Union(T[0], T);
    else Union(T[1], T);

    T:    0000ε 0000ε

    return Select(T);

smallest = 0000ε;


11. DNA-Based Algorithm for Determining the Largest Element of a Set of Unsigned Integer Binary Numbers

The DNA-Based algorithm for determining the largest element of a set of unsigned integer binary numbers of the given length follows directly. The only change is that the tube T[1] is tested first.

function Largest(m, T)
// The function Largest returns
// the largest element of the set T;
// It is the unit assignment (DNA strand).
{
    for i = m to 1 do
    {
        Copy(T, { T[1] });
        // It is very important to empty the tube T.
        Discard(T);
        Parallel
        {
            Remove(T[1], {i 0});
            Remove(T[0], {i 1});
        }
        if (Detect(T[1]))
        then
            Union(T[1], T);
        else
            Union(T[0], T);
    }
    return Select(T);
}



12. Conclusion

The DNA-Based program and functions for determining the smallest/largest element of a set of unsigned integer binary numbers are written in a C/Easel-like programming language. They represent a DNA computer and computing environment based on DNA operations. This type of framework enables, facilitates, and supports the work of bioinformatics scientists and researchers in the field. Tools such as DNA computers will allow:

(i) Bioscientists to better understand the fundamental processes involved in biological systems and perhaps aid in predicting likely behaviors;
(ii) Computer scientists to better understand parallelism and perhaps to arrive at new parallel-oriented ideas.

13. Future Research

Our future research will be focused on:

(i) Hunting/searching for new so-called “killer applications”, that is, applications of DNA computation that would establish its superiority within a certain domain. Our favorite domains are:

- computer security
- information assurance
- cryptography
- DNA-controlled devices and
- DNA-motors.

We believe that an assured future for DNA computation can only be established through the discovery of such and other applications of DNA-computations.

(ii) Introducing counting, recursion, and iteration (the fundamental concepts of classical computer science) into DNA-computation.


References


1. Adleman, L.M. Molecular Computation of Solutions of Combinatorial Problems. Science, 266, pp: 1021-1024, November 1994.

2. Adleman, L.M. On Constructing a Molecular Computer in R. Lipton and E. Baum, editors, DNA Based Computers, Discrete Mathematics and Theoretical Computer Science Series, Vol. 27, American Mathematical Society, pp: 1-21, 1995.

3. Adleman, L.M. Computing with DNA. Scientific American, 279(2), pp: 54-61, August 1998.

4. Feynman, R.P. There’s plenty of room at the bottom. Miniaturization, Reinhold, 1961.

5. Gibbons, A., Amos, M. and Hodgson, D. Models of DNA computation. Mathematical Foundations of Computer Science, Lecture Notes in Computer Science, Springer, 1996.

6. Lipton, R. J. DNA solution of hard computational problems. Science, 268, pp: 542-545, 1995.

7. Steele, G. and Stojkovic, V. Agent-Oriented Approach to DNA Computing (6 pages poster presentation) in Proceedings, Poster, Workshops, and Demo Abstracts of CSB2004 Conference, Stanford, CA, August 16-19, 2004.

8. Stojkovic, V. and Huo, H. A Permutation Generation DNA-Based Algorithm - An Example of a DNA Computing Killer Application, Journal of Scientific and Practical Computing, Vol. 1, No. 1, pp: 16-29, 2007.

9. Stojkovic, V., Huo, H., and Britto, E. DNA Based Addition and Subtraction of Two Unsigned Integer Numbers Inspired by Unrestricted Grammars Implemented in Prolog Language (6 pages poster presentation) in Proceedings, Poster, Workshops, and Demo Abstracts CD of CSB2006 Conference, Stanford, CA, August 14-18, 2006.

10. Stojkovic, V. and Huo, H. DNA Based Addition and Subtraction of Two Unsigned Integer Numbers Inspired by Unrestricted Grammars (1 page poster presentation) in Proceedings of DNA12 Conference, Seoul, South Korea, June 5-9, 2006.











DNA-based Addition and Subtraction of Two Unsigned Integer Numbers Inspired by Unrestricted Grammars Implemented in Prolog Language

Hongwei Huo (1) and Vojislav Stojkovic (2)

(1) Xidian University, School of Computer Science and Technology
Xi’an 710071, P.R. China
hwhuo@mail.xidian.edu.cn

(2) Morgan State University, Computer Science Department
Baltimore, MD 21251, USA
stojkovi@jewel.morgan.edu



DNA computers cannot be considered important computing machines until they are used to solve problems that involve complex numerical calculations. The challenge is to discover the right ways for DNA computers to use numbers and arithmetic operations. Guarnieri, Fliss, and Bancroft (GFB) were the first researchers who found a way for a DNA computer to add two binary unsigned integer numbers. Using GFB's work as a foundation, we expanded their results by connecting Unrestricted Grammars, Automata-Machines, and DNAs. Our approach has accomplished the following objectives: transform DNA computers into numerical computing machines and extend DNA computing to numerical computing. The paper presents Unrestricted (phrase structure, type 0, semi-Thue) Grammars to add/subtract two unsigned integer numbers, a DNA representation of digits of an unsigned integer number, a translation scheme for translating rules of an Unrestricted Grammar into DNA sequences, DNA based addition and subtraction of two unsigned integer numbers, and a DNA representation of unsigned integer numbers. The results are theoretically founded, general, shorter, efficient, more elegant than GFB's, and implemented in the Prolog language.


Introduction


DNA computing began in 1994 when Adleman [1-3] showed that DNA computing is possible by solving the directed Hamiltonian path problem on a DNA computer. Adleman’s work has greatly influenced the work of many researchers and scientists. A year later, Lipton [6] solved the NP-complete "satisfiability" problem. These and many other results have demonstrated the feasibility of using DNA computers to solve combinatorial problems.

DNA computers cannot be important until they become applicable to problems with a lot of complex numerical calculations. The challenge is to discover ways for DNA computers to use numbers and arithmetic operations. Guarnieri, Fliss, and Bancroft [5] were the first who found a way for a DNA computer to add two binary unsigned integer numbers. Their discovery is very important but unfortunately lacks a theoretical background. GFB's work has greatly influenced our work, and we have extended their work by connecting Unrestricted Grammars and DNAs. With this original approach we have reached a few goals:

- making DNA computers numerical computing machines;
- extending DNA computing to numerical computing;
- making Unrestricted Grammars not only theoretically but also practically very useful.

DNA Computers

A DNA computer is a nanocomputer that uses DNA sequences to store information and perform complex computations. A DNA computer uses deoxyribonucleic acids as the memory units that can take on four possible values, and recombinant DNA techniques to carry out fundamental operations.

Two DNA sequences may be rejoined if their terminal overhangs are complementary. Using these operations, DNA sequences may be inserted or deleted. The input data and output data are DNA sequences. A program on a DNA computer is executed as a series of biochemical processes, which have the effect of synthesizing, extracting, modifying, and cloning the DNA sequences. Restriction enzymes can be employed to cut a DNA sequence at a specific position. A DNA computer uses enzymes as the software to execute the desired DNA computation.

If the solution to the problem exists, then it is one of the newly generated DNA sequences and can be obtained by the elimination process.

Unsigned Integer Numbers

The set of unsigned integer numbers N is the union of the set of positive natural numbers {1, 2, 3, …} and the set {0} containing the number zero. N is a countably infinite set. N is closed under the operations of addition, multiplication, int-division, and int-mod. N is not closed under the operations of subtraction and division.

There are many ways to represent unsigned integer numbers. The most important representation of an unsigned integer number n is the positional one, written with subscripts as d_{k-1}, …, d_1 d_0 followed by the base b as a trailing subscript, or in linear form as

d(k-1), …, d(1)d(0)[b],

where d(i), 0 <= i < k, are digits.

The digits d(i) satisfy 0 <= d(i) < b for 0 <= i < k.

The number b is the base of the number system. The base 10, b = 10, is the default base and may be omitted:

d(k-1), …, d(1)d(0)[10] = d(k-1), …, d(1)d(0)

The value of an unsigned integer number d(k-1), …, d(1)d(0)[b] is

d(k-1)*b**(k-1) + … + d(1)*b**1 + d(0)*b**0 = d(k-1)*b**(k-1) + … + d(1)*b + d(0)
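
The value formula can be checked with a few lines of code. The sketch below assumes the digits are supplied as a Python list with the most significant digit d(k-1) first; the function name value is a hypothetical choice for this illustration.

```python
def value(digits, base=10):
    """Value of d(k-1), ..., d(1)d(0)[base], digits given most significant first."""
    if base < 2:
        raise ValueError("this sketch assumes base >= 2")
    if any(not 0 <= d < base for d in digits):
        raise ValueError("every digit must satisfy 0 <= d < base")
    k = len(digits)
    # digits[idx] is d(k-1-idx), so it is weighted by base**(k-1-idx).
    return sum(d * base ** (k - 1 - idx) for idx, d in enumerate(digits))

if __name__ == "__main__":
    print(value([1, 7, 4]))          # decimal digits of 174
    print(value([1, 0, 1, 1], 2))    # binary 1011
```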

Unrestricted Grammars

A grammar G is a four-tuple (N, Σ, Π, S) where:

- N is a set of nonterminal symbols
- Σ is a set of terminal symbols
- Π ⊆ (N ∪ Σ)* N (N ∪ Σ)* × (N ∪ Σ)* is a set of rules
- S is the starting symbol (S ∈ N)

Rules take the form α → β where:

- α ∈ (N ∪ Σ)* N (N ∪ Σ)*
- β ∈ (N ∪ Σ)*

An Unrestricted Grammar is a grammar without any further restrictions on the form of its rules.

Unrestricted Grammar to Add Two Decimal Unsigned Integer Numbers

A skeleton of an Unrestricted Grammar to add two decimal unsigned integer numbers (AddDec) of the same length Fa...Fa and Sb...Sb is

(1) Addition → FaSb...FaSbC
    where 0<=a<=9 and 0<=b<=9
(A) C → Carry0
(B) Sb Carryu → Carryv
    where 0<=b<=9, 0<=u<=1, v = b+u, 0<=v<=10
(C) Fa Carryw → Carryp q
    where 0<=a<=9, 0<=w<=10,
    p = div(a+w, 10), and q = mod(a+w, 10)
(2) Carry0 → ε
(3) Carry1 → 1

The AddDec skeleton can be extended directly to AddDecUnrestrictedGrammar:

(A) C → Carry0

(B00) S0 Carry0 → Carry0
(B01) S0 Carry1 → Carry1
(B10) S1 Carry0 → Carry1
(B11) S1 Carry1 → Carry2
...
(B90) S9 Carry0 → Carry9
(B91) S9 Carry1 → Carry10

(C00) F0 Carry0 → Carry0 0
(C01) F0 Carry1 → Carry0 1
...
(C010) F0 Carry10 → Carry1 0
(C10) F1 Carry0 → Carry0 1
(C11) F1 Carry1 → Carry0 2
...
(C110) F1 Carry10 → Carry1 1
...
(C90) F9 Carry0 → Carry0 9
(C91) F9 Carry1 → Carry1 0
...
(C910) F9 Carry10 → Carry1 9

The total number of rules is 134.


Example


Let us compute 98 + 76 (= 174).

The first decimal unsigned integer number is represented as F9 F8.

The second decimal unsigned integer number is represented as S7 S6.

The appropriate derivation is

Addition
=(1)=>   F9 S7 F8 S6 C
=(A)=>   F9 S7 F8 S6 Carry0
=(B60)=> F9 S7 F8 Carry6
=(C86)=> F9 S7 Carry1 4
=(B71)=> F9 Carry8 4
=(C98)=> Carry1 7 4
=(3)=>   1 7 4
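
The derivation above always rewrites the right end of the sentential form, so it can be replayed mechanically. The following is a minimal sketch, assuming both numbers are given as equal-length lists of digits with the most significant digit first; the function name add_by_rules and the list-based sentential form are choices made for this illustration and are not the authors' Prolog implementation.

```python
def add_by_rules(first, second, base=10):
    """Replay the AddDec/AddBin derivation and return the digits of the sum."""
    if len(first) != len(second):
        raise ValueError("the skeleton assumes numbers of the same length")
    # Rule (1): interleave the digits as Fa Sb ... Fa Sb C.
    form = []
    for a, b in zip(first, second):
        form += [("F", a), ("S", b)]
    carry = 0                      # Rule (A): C -> Carry0
    output = []
    while form:
        _, b = form.pop()          # Rule (B): Sb Carryu -> Carryv, v = b + u
        v = b + carry
        _, a = form.pop()          # Rule (C): Fa Carryw -> Carryp q
        carry, q = divmod(a + v, base)
        output.insert(0, q)        # q is emitted to the right of the carry symbol
    if carry == 1:                 # Rule (3): Carry1 -> 1
        output.insert(0, 1)
    return output                  # Rule (2): Carry0 -> ε leaves no extra digit

if __name__ == "__main__":
    print(add_by_rules([9, 8], [7, 6]))
```

Passing base=2 replays the same derivation for binary numbers.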

unrestricted grammar to add two binary unsigned integer numbers

On the direct way AddDec skeleton can be transformed into AddBin skeleton


(1) Addition → FaSb...FaSbC


where 0<=a<=1 and 0<=b<=1

(A) C → Carry0

(B) Sb Carryu → Carryv


where 0<=b<=1, 0<=u<=1, v = b+u, 0<=v<=2

(C) Fa Carryw → Carryp q


where 0<=a<=1, 0<=w<=2,


p = div(a+w, 2), and q = mod(a+w, 2)

16


(2) Car
ry0 → ε

(3) Carry1 → 1


On the direct way the AddBin skeleton can be extended to AddBinUnrestrictedGrammar

(1) Addition → FaSb...FaSbC


where 0<=a<=1 and 0<=b<=1


(A)

C → Carry0


(B00)

S0 Carry0 →
Carry0

(B01)

S0 Carry1 →
Carry1


(B10)

S1 Carry0 →
Car
ry1

(B11)

S1 Carry1 →
Carry2


(C00)

F0 Carry0 →
Carry0 0

(C01)

F0 Carry1 →
Carry0 1

(C02)

F0 Carry2 →
Carry1 0


(C10)

F1 Carry0 →
Carry0 1

(C11)

F1 Carry1 →
Carry1 0

(C12)

F1 Carry2 →
Carry1 1


(2) Carry0 → ε

(3) Carry1 → 1


Substituting Carry0 with
NoCarry and Carry1 with Carry the following AddBinUnrestrictedGrammar can be
obtained


(1) Addition → FaSb...FaSbC


where 0<=a<=1 and 0<=b<=1


(A) C → NoCarry

(B00) S0 NoCarry → NoCarry

(B01) S0 Carry → Carry

(B10) S1 NoCarry → Carry

(B11) S1 Carry → Carry2

(C00) F0 NoCarry → NoCarry 0

(C01) F0 Carry → NoCarry 1

(C02) F0 Carry2 → Carry 0


(C10) F1 NoCarry → NoCarry 1

(C11) F1 Carry → Carry 0

(C12) F1 Carry2 → Carry 1


(2) NoCarry → ε

(3) Carry → 1


The total number of rules is 14.

unrestricted grammar to subtract two decimal unsigned integer numbers

A skeleton of an Unrestricted Grammar to subtract two decimal unsigned integer numbers (SubDec) Fa...Fa and
Sb...Sb is




(1) Subtraction → FaSb...FaSbC


where 0<=a<=9 and 0<=b<=9

(A) C → Carry0

(B) Sb Carryu → Carryv


where 0<=b<=9, 0<=u<=1, v = b+u, 0<=v<=10

(C) Fa Carryw → Carryp q



where 0<=a<=9, 0<=w<=10,

p = 1 and q = 10+a-w if a<w,

p = 0 and q = a-w if a>=w,

0<=p<=1, 0<=q<=9

(2) Carry0 → ε

(3) Carry1 → 1


The SubDec skeleton can be extended directly to SubDecUnrestrictedGrammar.

unrestricted grammar to subtract two binary unsigned integer numbers

The SubDec skeleton can be transformed directly into the SubBin skeleton:



(1) Subtraction → FaSb...FaSbC


where 0<=a<=1 and 0<=b<=1

(A) C → Carry0

(B) Sb Carryu → Carryv


where 0<=b<=1, 0<=u<=1, v = b+u, 0<=v<=2

(C) Fa Carryw → Carryp q


where 0<=a<=1, 0<=w<=2,


p = 1 and q = 2+a-w if a<w,

p = 0 and q = a-w if a>=w,

0<=p<=1, 0<=q<=1

(2) Carry0 → ε

(3) Carry1 → 1


The SubBin skeleton can be extended directly to SubBinUnrestrictedGrammar.
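The subtraction skeletons can be exercised the same way as the addition ones. The sketch below is a hedged illustration, assuming that the first number is not smaller than the second and that the Carry symbol plays the role of a borrow; the name sub_by_rules is hypothetical.

```python
def sub_by_rules(first, second, base=10):
    """Subtract second from first digit by digit using rules (A), (B), and (C)."""
    width = max(len(first), len(second))
    f = [int(ch, base) for ch in first.zfill(width)]
    s = [int(ch, base) for ch in second.zfill(width)]
    borrow = 0                      # rule (A): C -> Carry0
    digits = []
    for a, b in zip(reversed(f), reversed(s)):
        w = b + borrow              # rule (B): Sb Carryu -> Carryv, v = b + u
        if a < w:                   # rule (C), case a < w: p = 1, q = base + a - w
            borrow, q = 1, base + a - w
        else:                       # rule (C), case a >= w: p = 0, q = a - w
            borrow, q = 0, a - w
        digits.insert(0, q)
    if borrow:
        # Assumption of this sketch: a final borrow means first < second.
        raise ValueError("first < second is outside the scope of this sketch")
    return "".join(str(d) for d in digits)


print(sub_by_rules("98", "76"))
print(sub_by_rules("100", "001", base=2))
```

The result keeps leading zeros, because each (C) rule emits exactly one digit per position.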

dna representation of digits of an unsigned integer number

The DNA representation of unsigned integer numbers is critical for defining and executing arithmetic operations on unsigned integer numbers. The DNA representation of an unsigned integer number must be unique. Conversely, each unsigned integer number must have a unique DNA representation.

One solution of the problem is to represent each digit of an unsigned integer number by a unique DNA sequence. Digits of an unsigned integer number are “naturally” connected as elements of a sequence that represents the number. DNA representations of digits are distributed (in a tube) and must be connected by “extra” DNA-pointers.

Using AddBinUnrestrictedGrammar, the following rules for the DNA representation of the binary digits of two binary unsigned integer numbers First and Second follow directly:
















(2) NoCarry: 5' 3'

(3) Carry: 5' 1 3'


The rules above are still not usable because they ignore the position of the binary digits. The problem may be solved by introducing a position index.









(2) NoCarry(n,n+1): 5' (n+1)0 NoCarry(n,n+1) 3'

(3) Carry(n,n+1): 5' (n+1)1 Carry(n,n+1) 3'


n = 0, 1, 2, ...

translation scheme

The new algorithm (the translation scheme) for translating the rules of an UnrestrictedGrammar into DNA sequences is simple and natural.


S → A ... S: 3' A(-1,0) 5'

S A → B ... S(n): 5' B(n) A(n-1,n) 3'

S A → B b ... S(n): 5' B(n,n+1) (n)b A(n) 3'

A → ε ... A(n,n+1): 5' ε A(n,n+1) 3'

A → b ... A(n,n+1): 5' (n+1) b A(n,n+1) 3'


n = 0, 1, 2, ...; S, A, B ∈ N; b ∈ Σ.


We also proved that any Unrestricted Grammar can be transformed into an equivalent Unrestricted Grammar whose rules are of the form above.

dna based addition and subtraction of two unsigned integer numbers in different bases

The DNA computation is directed by the DNA representation of data.

The basic DNA operation is the rejoin operation on two DNA sequences. Two DNA sequences may be rejoined if their terminal overhangs are complementary.


rejoin:

(A) C: 3' NoCarry 5'

(B00) S0: 5' NoCarry NoCarry 3'

(B01) S0: 5' Carry Carry 3'

(B10) S1: 5' Carry NoCarry 3'

(B11) S1: 5' Carry2 Carry 3'

(C00) F0: 5' NoCarry 0 NoCarry 3'

(C01) F0: 5' NoCarry 1 Carry 3'

(C02) F0: 5' Carry 0 Carry2 3'

(C10) F1: 5' NoCarry 1 NoCarry 3'

(C11) F1: 5' Carry 0 Carry 3'

(C12) F1: 5' Carry 1 Carry2 3'


(A) C: 3' NoCarry(-1,0) 5'

(B00) S(n)0: 5' NoCarry(n) NoCarry(n-1,n) 3'

(B01) S(n)0: 5' Carry(n) Carry(n-1,n) 3'

(B10) S(n)1: 5' Carry(n) NoCarry(n-1,n) 3'

(B11) S(n)1: 5' Carry2(n) Carry(n-1,n) 3'

(C00) F(n)0: 5' NoCarry(n,n+1) (n)0 NoCarry(n) 3'

(C01) F(n)0: 5' NoCarry(n,n+1) (n)1 Carry(n) 3'

(C02) F(n)0: 5' Carry(n,n+1) (n)0 Carry2(n) 3'

(C10) F(n)1: 5' NoCarry(n,n+1) (n)1 NoCarry(n) 3'

(C11) F(n)1: 5' Carry(n,n+1) (n)0 Carry(n) 3'

(C12) F(n)1: 5' Carry(n,n+1) (n)1 Carry2(n) 3'



5’ AB...C Terminal 3’


3’ Terminal EF...H 5’

------------------------------------

3’ AB...C Terminal EF...H 5’


Using two DNA sequences rejoin operations, DNA sequences may be inserted or deleted.


Example


Let us compute 11 + 01 (= 100).

The first binary unsigned integer number is represented as F(1)1 F(0)1.

The second binary unsigned integer number is represented as S(1)0 S(0)1.

The appropriate DNA computation is


Starting NoCarry: 3' NoCarry(-1,0) 5'

selects S(0)1: 5' Carry(0) NoCarry(-1,0) 3'

S(0)1 + Starting NoCarry

5' Carry(0) NoCarry(-1,0) 3'

3' NoCarry(-1,0) 5'

-------------------------------------

3' Carry(0) NoCarry(-1,0) 5'


Carry(0) selects F(0)1: 5' Carry(0,1) (0)0 Carry(0) 3'

F(0)1 + Temporary

5' Carry(0,1) (0)0 Carry(0) 3'

3' Carry(0) NoCarry(-1,0) 5'

-------------------------------------------------------

3' Carry(0,1) (0)0 Carry(0) NoCarry(-1,0) 5'


Carry(0,1) selects S(1)0: 5' Carry(1) Carry(0,1) 3'

S(1)0 + Temporary

5' Carry(1) Carry(0,1) 3'

3' Carry(0,1) (0)0 Carry(0) NoCarry(-1,0) 5'

------------------------------------------------------------------

3' Carry(1) Carry(0,1) (0)0 Carry(0) NoCarry(-1,0) 5'


Carry(1) selects F(1)1: 5' Carry(1,2) (1)0 Carry(1) 3'

F(1)1 + Temporary

5' Carry(1,2) (1)0 Carry(1) 3'

3' Carry(1) Carry(0,1) (0)0 Carry(0) NoCarry(-1,0) 5'

--------------------------------------------------------------------

3' Carry(1,2) (1)0 Carry(1) Carry(0,1) (0)0 Carry(0) NoCarry(-1,0) 5'


Carry(1,2) selects Carry(1,2): 5' (2)1 Carry(1,2) 3'

Carry(1,2) + Temporary

5' (2)1 Carry(1,2) 3'

3' Carry(1,2) (1)0 Carry(1) Carry(0,1) (0)0 Carry(0) NoCarry(-1,0) 5'

--------------------------------------------------------------------

3' (2)1 Carry(1,2) (1)0 Carry(1) Carry(0,1) (0)0 Carry(0) NoCarry(-1,0) 5'
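The selection-and-rejoin process of this example can be imitated symbolically. The following is a minimal sketch, assuming that a strand is modeled as a Python list of labels whose first element acts as the free sticky end and that a tile rejoins only when its right end equals that label; the names s_tile, f_tile, dna_add, and read_number are hypothetical and no biochemistry is modeled.

```python
NAMES = {0: "NoCarry", 1: "Carry", 2: "Carry2"}


def s_tile(n, b, u):
    """Tile S(n)b for carry-in u: (new left end, emitted digits, right end)."""
    return f"{NAMES[b + u]}({n})", [], f"{NAMES[u]}({n - 1},{n})"


def f_tile(n, a, w):
    """Tile F(n)a for intermediate value w: (new left end, emitted digits, right end)."""
    p, q = divmod(a + w, 2)
    return f"{NAMES[p]}({n},{n + 1})", [f"({n}){q}"], f"{NAMES[w]}({n})"


def dna_add(first, second):
    """Grow the result strand for first + second (binary strings)."""
    width = max(len(first), len(second))
    first, second = first.zfill(width), second.zfill(width)
    strand = ["NoCarry(-1,0)"]                    # starting strand, rule (A)
    for n in range(width):
        a = int(first[width - 1 - n])
        b = int(second[width - 1 - n])
        candidates = [s_tile(n, b, u) for u in (0, 1)]
        candidates += [f_tile(n, a, w) for w in (0, 1, 2)]
        for _ in range(2):                        # one S tile, then one F tile
            for left, body, right in candidates:
                if right == strand[0]:            # complementary overhang found
                    strand = [left] + body + strand
                    break
    final_bit = 1 if strand[0].startswith("Carry") else 0
    return [f"({width}){final_bit}"] + strand     # rules (2) and (3)


def read_number(strand):
    """Collect the digit labels (n)d from the strand, most significant first."""
    return "".join(label[-1] for label in strand if label.startswith("("))


strand = dna_add("11", "01")
print(" ".join(strand))
print(read_number(strand))
```

For 11 + 01 the sketch reproduces the final strand of the example and reads off 100.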



dna representation of unsigned integer numbers

The previous example yields the algorithm (the translation scheme) for the DNA representation of unsigned integer numbers.

The DNA representation of an unsigned integer number n, written in base b as d(k-1) ... d(1) d(0) [b], is

3'

(k-1)d(k-1) Carry(k-2,k-1)

(k-2)d(k-2) Carry(k-2) Carry(k-3,k-2)

...

(1)d(1) Carry(1) Carry(0,1)

(0)d(0) Carry(0) NoCarry(-1,0)

5'


The main drawback of the given DNA representation of unsigned integer numbers is that the DNA arithmetic operations are oriented to digits, not to the number as a whole.

Prolog Implementation

A DCG rule can be imagined as a rule that adds labeled arcs to a linear graph that starts as a linear chain representing the input string.

A Context-Free DCG rule adds edges to the linear graph. For example, the Context-Free DCG rule A --> B,C says that if there is an arc from node S0 to node S1 labeled by b and an arc from node S1 to node S2 labeled by c, then an arc from node S0 to node S2 labeled by a should be added.

The Context-Free DCG rule in the Prolog language is

a(S0,S2) :- b(S0,S1), c(S1,S2).
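The arc-adding reading of a context-free rule can also be illustrated outside Prolog. The sketch below is only an analogy, assuming binary rules and a naive fixed-point loop in place of tabled resolution; the function name close_arcs and the tuple encoding of arcs are hypothetical.

```python
def close_arcs(words, rules):
    """Add arcs (start, label, end) until no binary rule head <- left right applies."""
    # Base arcs: one arc per input symbol, like base_word(N, Sym, N1).
    arcs = {(i, w, i + 1) for i, w in enumerate(words)}
    changed = True
    while changed:
        changed = False
        for head, (left, right) in rules:
            for s0, label1, s1 in list(arcs):
                if label1 != left:
                    continue
                for t1, label2, s2 in list(arcs):
                    # Two adjacent arcs labeled left and right yield a head arc.
                    if label2 == right and t1 == s1 and (s0, head, s2) not in arcs:
                        arcs.add((s0, head, s2))
                        changed = True
    return arcs


arcs = close_arcs(["b", "c"], [("a", ("b", "c"))])
print(sorted(arcs))
```

With the input chain b c and the single rule a --> b, c, the loop adds one arc labeled a from node 0 to node 2.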

A context-sensitive rule with, for example, two symbols on the left-hand side can be imagined as a graph-generating rule that introduces a new node as well as new arcs.

For example, the Context-Sensitive DCG rule AB --> CD says that if there are two adjacent edges labeled C and D, a new node should be introduced and connected to the first node of the C-arc by an arc labeled A, and also connected to the final node of the D-arc by an arc labeled B.

The Context-Sensitive DCG rule in the Prolog language is


a(S0,p1(S0)) :- c(S0,S1), d(S1,S2).

b(p1(S0),S2) :- c(S0,S1), d(S1,S2).


We have to introduce a new name for the new node. We choose to identify the new nodes by using a functor symbol that uniquely determines the rule and the left-hand internal position, and pairing it with the name of the initial node in the base arc. p1 uniquely identifies the (only) internal position in the left-hand side of this rule. Other rules, and positions, would have different functors to identify them uniquely.

The context-sensitive grammar can be represented using the following XSB rules:

:- auto_table.

% base_word/3 is asserted and retracted at run time
:- dynamic base_word/3.

% define word/3 using base word
word(X,Y,Z) :- base_word(X,Y,Z).

% parse a string...
% assert words first, then call sentence symbol
parse(String) :-
    abolish_all_tables,
    retractall(base_word(_,_,_)),
    assertWordList(String,0,Len),
    s(0,Len).

% assert the list of words.
assertWordList([],N,N).
assertWordList([Sym|Syms],N,M) :-
    N1 is N+1,
    assert(base_word(N,Sym,N1)),
    assertWordList(Syms,N1,M).


The grammar can be run to parse input strings.

Conclusion

The main results of this paper are:

- Unrestricted Grammars to add/subtract two unsigned integer numbers

- DNA representation of digits of an unsigned integer number

- translation scheme for translating rules of an Unrestricted Grammar into DNA sequences

- DNA based addition and subtraction of two unsigned integer numbers

- DNA representation of unsigned integer numbers.

The results are theoretically founded, general, shorter, efficient, and more elegant than those provided by Adleman [3] and Guarnieri, Fliss, and Bancroft [6].

The presented algorithms are not technically demanding. They involve simple biochemical procedures that require approximately one day of work in a DNA computing lab.

future research

Our future short-term research will be focused on:

- Unrestricted Grammars to multiply, divide, and take the modulus of two unsigned integer numbers

- DNA based multiplication, division, and modulus of two unsigned integer numbers

- Unrestricted Grammars to add, subtract, multiply, and divide two unsigned real numbers

- DNA representation of unsigned real numbers

- DNA based addition, subtraction, multiplication, and division of two unsigned real numbers

- DNA computation of the standard mathematical functions and expressions

References

1. Adleman LM. Molecular Computation of Solutions of Combinatorial Problems. Science 1994; 266: 1021-1024.

2. Adleman LM. On Constructing a Molecular Computer. In: Lipton R, Baum E (editors), DNA Based Computers, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, American Mathematical Society, 1995; 27: 1-21.

3. Adleman LM. Computing with DNA. Scientific American 1998; 54-61.

4. Baum EB, Boneh D. Running dynamic programming algorithms on a DNA computer. In: Proceedings of the Second Annual Meeting on DNA Based Computers, Princeton University, June 10-12, 1996.

5. Guarnieri F, Fliss M, Bancroft C. Making DNA Add. Science 1996; 273: 220-223.

6. Lipton RJ. Using DNA to solve NP-complete problems. Science 1995; 268: 542-545.

































A PERMUTATION GENERATION DNA-BASED ALGORITHM - AN EXAMPLE OF A DNA COMPUTING KILLER APPLICATION

VOJISLAV STOJKOVIC 1, HONGWEI HUO 2

1 Morgan State University, Baltimore, MD 21251, USA

2 Xidian University, Xi'an 710071, China

________________________________________________________________________________
Abstract:

The paper presents a DNA-based algorithm for permutation generation. DNA-based permutation generation is an example of a so-called "killer application" of DNA computation because, for a large number of elements (for example, more than 20), the given DNA-based permutation generation algorithm is superior to any known sequential or parallel von Neumann-based permutation generation algorithm. The given algorithm can be executed step by step in a DNA-lab or on a DNA-computer, automatically on a super DNA-computer, or (for a small number of elements) on a von Neumann-based electronic-digital computer.

Keywords: Permutation generation; DNA-computation; DNA-computer; killer application.

________________________________________________________________________________

1. Introduction


The permutation generation problem is a motivating computational puzzle, an interesting example of an application of computer science in combinatorial mathematics, and one of the first nontrivial nonnumeric problems attacked by mathematicians and computer scientists. The permutation generation problem is to generate all possible ways of rearranging n, n ≥ 1, distinct items. The problem is simply stated, but not easily solved, and it has a long and distinguished history. Over one hundred Permutation Generation algorithms have been published during the past twenty years. The best-known surveys of the field are D. H. Lehmer [6] from 1960, R. J. Ord-Smith [8, 9] from 1970-1971, and R. Sedgewick [10] from 1977. In 1956, C. Tompkins [15] wrote a paper describing a number of practical areas where permutation generation was being used to solve problems.

Email address:
stojkovi@jewel.morgan.edu; hwhuo@mail.xidian.edu.cn




The study of existing methods and the development of new methods for permutation generation are still important today because they illustrate the relationship between counting, recursion, and iteration.

The permutation generation problem is inherently difficult because the number of permutations grows as n!. Without computers, the permutation generation problem is practically not solvable for n ≥ 10.

Table 1: APPROXIMATE TIME TO GENERATE PERMUTATIONS OF n-ELEMENTS

(1 µsec per permutation)

n     n!                  Time
1     1
2     2
3     6
4     24
...
9     362880
10    3628800             3 seconds
11    39916800            40 seconds
12    479001600           8 minutes
13    6227020800          2 hours
14    87178291200         1 day
15    1307674368000       2 weeks
16    20922789888000      8 months
17    355687428096000     10 years




Table 1 shows the values of
n! and the computation time. For n > 15 the computation time is too long.
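The order of magnitude of these times is easy to recompute. The sketch below assumes a rate of one microsecond per permutation, which is the rate the tabulated times are consistent with; the function name generation_time_seconds is a hypothetical helper.

```python
import math


def generation_time_seconds(n, seconds_per_permutation=1e-6):
    """Time to enumerate all n! permutations at a fixed (assumed) rate."""
    return math.factorial(n) * seconds_per_permutation


for n in (10, 12, 15, 17):
    seconds = generation_time_seconds(n)
    print(n, math.factorial(n), f"{seconds:.3g} s")
```

At this assumed rate, n = 10 takes a few seconds while n = 17 takes on the order of ten years, which is why sequential generation becomes impractical for n > 15.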

Table 2: APPROXIMATE DISTRIBUTION OF PERMUTATION GENERATION ALGORITHMS


90% sequential algorithms (targeting one-processor digital computers)


9% parallel algorithms (targeting multi-processor digital computers)


1% parallel algorithms (targeting DNA, quantum, ..., and other non-von Neumann computers)


2. Permutations


A permutation, also called an "arrangement number" or "order," is a rearrangement of the elements of an ordered list S into a one-to-one correspondence with S itself.

The number of permutations on a set of n elements is n!



Example:

There are 2!=2*1=2 permutations of {1, 2}, namely

{1, 2} and {2, 1}.

There are 3!=3*2*1=6 permutations of {1, 2, 3}, namely

{1, 2, 3}, {1, 3, 2}, {2, 1, 3}, {2, 3, 1}, {3, 1, 2}, and {3, 2, 1}.


3. DNA


DNA, deoxyribonucleic acid, is a molecule found in every living cell, which directs the formation, growth, and reproduction of cells. DNA consists of nucleotides. Nucleotides contain compounds called phosphate, deoxyribose, and base. Within all nucleotides, phosphate and deoxyribose are the same; however, the bases vary. The four distinct bases are: adenine (A), guanine (G), thymine (T), and cytosine (C). The exact amount of each nucleotide and the order in which they are arranged are unique for every kind of living organism. DNA represents information as a pattern of molecules on a DNA strand. A DNA strand is a string over the alphabet {A, C, G, T}. The length of a DNA strand is equal to the length of the string that represents the DNA strand.


4. DNA Computer


A DNA computer is a chemical instrument consisting of a system of connected test tubes and other auxiliary units. DNA computers use the chemical properties of DNA molecules by examining the patterns of combination or growth of the molecules or strings. DNA computers can do this through the manufacture of enzymes, which are biological catalysts that could be considered the ‘software’ used to execute the desired DNA computation. DNA computers represent information in terms of DNA. In DNA computers, deoxyribonucleic acids serve as the memory units that can take on four possible positions (A, C, G, or T). DNA computers do not have the von Neumann architecture. DNA computers are massively parallel and are considered promising for complex problems that require multiple simultaneous computations. DNA computers perform computations by synthesizing particular sequences of DNA and allowing them to react in test tubes. The task of the DNA computer is to check each possible solution and remove those that are incorrect, using restriction enzymes. When the chemical reactions are complete, the DNA strands can then be analyzed to find the solution.


A super DNA computer is a programmable DNA computer.


5. DNA Computing


In 1961 Feynman [4] predicted, and in 1994 Adleman [1] realized, computation at the molecular level: computing with DNA. DNA computing began in 1994 when Adleman showed that DNA computing was possible by solving an instance of the Hamiltonian Path problem on a DNA computer. Adleman [2] used DNA polymerase and Watson-Crick complementary strands to do DNA computation. Since then, there has been a surge of research in the DNA computation field. DNA computation has emerged as an exciting new research field at the intersection of computer science, biology, mathematics, and engineering. DNA computation has been demonstrated to have the capability to solve problems considered to be computationally difficult for von Neumann machines. After the Hamiltonian Path problem was solved, several researchers (such as Lipton [7]) proposed solutions to a spectrum of NP-complete problems dealing with satisfiability, cryptography, as well as other search-oriented problems.

Adleman’s [3] work has greatly influenced our work; however, our approach is different. Adleman’s approach was biochemical-oriented, while our approach is computer science-oriented, that is, (program+DNA)-oriented: it is based on a super DNA computer and/or modeling and simulation of biochemical processes using the Easel or Prolog programming languages (see Stojkovic [12, 13, 14] and Steele [11]).

DNA computing is a field that holds promise for ultra-dense systems that pack megabytes of information into devices the size of a silicon transistor. Each molecule of DNA is roughly equivalent to a computer chip. With DNA computing, in order to find a solution, DNA molecules are primed to generate different chemical states. These molecules can be examined to determine whether the molecules have combined to form DNA strands or whether there is a separation of DNA strands. Most of the possible solutions are incorrect; however, one or a few may be correct.

We have assumed that DNA computations are error-free, i.e., they work perfectly without any errors. However, in reality DNA computations can be faulty because some DNA operations can introduce errors.

DNA operations are constrained by biological feasibility.

DNA operations may be:


(i) realized by the present biotechnology or


(ii) implemented by simulation on conventional von Neumann computers.


6. DNA Computation Model


As computer components become smaller and/or more compact, scientists and engineers dream of a chemical, multi-processor computer whose processors are individual molecules involved in chemical processes.

Following this thinking, we propose a DNA computation model that involves the following three levels of operations:


(i) Basic DNA operations (DNA molecular interactions);


(ii) Test tube operations (proposed in 1996 by Gibbons, Amos, and Hodgson [5]) such as: remove, union, copy, select, etc.;


(iii) High level operations: a selection of Easel/C-like programming language statements such as:


(a) begin-end (for grouping)


(b) if-then-else (for selection)


(c) for (for loop)


The basic DNA operations level consists of the chemical interactions between DNA molecules. It may be seen as machine programming and may be interpreted as the execution of machine code. The basic DNA operations can be implemented on DNA computers or simulated on von Neumann machines.

The test tube operations level is an interface level between the von Neumann machine and the DNA machine. It may be seen as the hardware of a DNA computer. The test tube operations can be implemented on DNA computers or simulated on von Neumann machines.

The high level operations (the programming language level) can be implemented using von Neumann machines with standard processors, operating systems, and programming language processors.

In the last twelve years DNA computation has emerged as an exciting, fascinating, and important new research field at the intersection of computer science, mathematics, biology, chemistry, bioinformatics, and engineering.

The main reasons for the interest in DNA-computations are:


(i) size and variety of available DNA molecular "tool boxes"


(ii) massive parallelism inherent in laboratory and chemical operations on DNA molecules


(iii) feasible and efficient models


(iv) physical realizations of the models


(v) performing computations in vivo.


Unfortunately, it is still not clear whether DNA computing can compete (or will be able to compete in the near future) with existing electronic-digital computing. We expect that in the near future it will be possible to join a von Neumann computer and a DNA computer into a functional super biocomputer. We are confident that in 10-20 years our desktop computers will have evolved into biocomputers. These machines will be able to perform in seconds calculations that take today’s PCs hours, and solve in hours problems that take today’s PCs years.

The computational substrate (the substance that is acted upon by an implementation of the DNA computational model) is DNA. DNA molecules are represented by strings. The DNA computational model operates upon sets of strings. A DNA computation starts and ends with a single set of strings.

A DNA algorithm is composed of a sequence of operations upon one or more sets of strings. At the end of the DNA algorithm’s execution, a solution to the given problem is encoded as a string in the final set.

Characterizing DNA computations with traditional measures of complexity, such as time and space, is misleading because of the nature of the laboratory implementation of DNA computation. One way to quantify the time complexity of a DNA-based algorithm is to count the number of “biological steps” required to solve the problem. The biological steps include the creation of an initial library of strands, separation of subsets of strands, sorting strands by length, chopping and joining strands, etc.


7. Basic DNA Operations


An assignment is a finite sequence of unit assignments. A unit assignment is coded by a DNA strand. All unit assignments of an assignment have the same length.

The most important basic DNA operations are:


(i) Append (Concatenate, Rejoined) -- appends two DNA strands with ‘sticky ends’


(ii) Melt (Anneal, Renaturation) -- breaks two DNA strands with complementary sequences


(iii) Cut -- cuts a DNA strand with restriction enzymes.



Append Operation: append(alpha, beta, gama)

Input:

− the unit assignment alpha and

− the unit assignment beta.

Output:

− the unit assignment gama.

Append operation appends the unit assignment beta to the unit assignment alpha.

The unit assignment beta can be appended at the beginning or at the end of the unit assignment alpha:

gama = beta . alpha or gama = alpha . beta

The default is at the beginning.

Melt Operation: melt(alpha, beta, gama)

Input:

− the unit assignment alpha and

− the unit assignment beta.

Output:

− the unit assignment gama.

Melt operation melts the unit assignment alpha with the unit assignment beta.

Unit assignment alpha can be melted from the beginning or from the end.

Default is from the beginning.

Cut Operation: cut(alpha, i, beta)

Input:

− the unit assignment alpha and

− the non-negative integer i.

Output:

− the unit assignment beta.

Cut operation cuts i places from the unit assignment alpha.

Unit assignment alpha can be cut from the beginning or from the end.

Default is from the beginning.

If the cut length i is equal to 0, the cut operation has no effect.

If the cut length i is greater than the length of the unit assignment alpha, the result will be empty.
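The append and cut operations can be simulated on an electronic-digital computer by plain string operations. The sketch below assumes that a unit assignment is modeled as a Python string and that "cut" means dropping i symbols from the chosen end; the keyword arguments are hypothetical. Melt is left out because its exact result is not specified above.

```python
def append(alpha, beta, at_beginning=True):
    """Return gama = beta . alpha (default) or gama = alpha . beta."""
    return beta + alpha if at_beginning else alpha + beta


def cut(alpha, i, from_beginning=True):
    """Drop i symbols from the beginning (default) or from the end of alpha."""
    if i < 0:
        raise ValueError("cut length must be non-negative")
    if from_beginning:
        return alpha[i:]
    return alpha[:max(len(alpha) - i, 0)]


print(append("CG", "AT"))   # beta placed before alpha
print(cut("ATCG", 1))       # first symbol removed
print(repr(cut("ATCG", 9))) # cut length exceeds the strand length
```

A cut length of 0 returns the strand unchanged, and a cut length beyond the strand length returns the empty string, matching the two boundary cases stated above.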


8. Test Tube Operations


A test tube contains an assignment.

The most important test tube operations are:


(i) Union (Merge, Create) -- pours the contents of several tubes into one tube


(ii) Copy (Duplicate, Amplify) -- makes copies of a tube


(iii) Separate -- separates an assignment into a finite sequence of assignments sorted by the length of unit assignments


(iv) Detect -- confirms the presence or absence of a unit assignment in a tube


(v) Select -- selects a unit assignment from an assignment uniformly at random


(vi) Append (Concatenate, Rejoined) -- appends a unit assignment to each unit assignment of an assignment


(vii) Melt (Anneal, Renaturation) -- melts each unit assignment of an assignment with a unit assignment


(viii) Extract -- extracts the contents of one tube into two tubes using a pattern unit assignment


(ix) Remove -- removes unit assignments that contain occurrence(s) of other unit assignments


(x) Cut -- cuts each unit assignment of an assignment by the given length.



Union (Merge, Create) Operation:

− Union({T1, ..., Ti, ..., Tm}, T) or

− Union({Tm}, T)

Input: the finite sequence of tubes {T1, ..., Ti, ..., Tm}.

Output: the tube T that contains the content of tubes Ti, where i = 1, ..., m.


Copy (Duplicate, Amplify) Operation:

− Copy(T, {T1, ..., Ti, ..., Tm}) or

− Copy(T, {Tm})

Input: the tube T.

Output: the finite sequence of tubes {T1, ..., Ti, ..., Tm}. The tube Ti, where i = 1, ..., m, contains the content of the tube T.


Separate Operation:

− Separate(T, {T1, ..., Ti, ..., Tm}) or

− Separate(T, {Tm})

Input:

− the tube T.

Output:

− the finite sequence of tubes {T1, ..., Ti, ..., Tm}, where m ≥ max(length(DNA strand)). The tube Ti, where i = 1, ..., m, contains DNA strands of the length i, where i ≤ m.

Separate operation separates an assignment into a finite sequence of assignments sorted by the length of unit assignments.



Detect Operation: Detect(T)



Input:


− the tube T.


Output:


− true, if the tube T contains at least one unit assignment;


− false, if the tube T contains zero unit assignments.




Select Operation: Select(T, alpha_i)

Input:

− the tube T that contains the finite sequence of unit assignments {alpha_m}.

Output:

− the unit assignment alpha_i.

Select operation selects, uniformly at random, a unit assignment alpha_i from the tube T that contains the finite sequence of unit assignments {alpha_m}.

If the tube T is empty, then the empty unit assignment will be returned.


Append Operation: Append(S, beta_i, T)

Input:

− the tube S that contains the finite sequence of unit assignments {alpha_m} and

− the unit assignment beta_i.

Output:

− the tube T that contains all unit assignments alpha_m of the tube S concatenated with the unit assignment beta_i.

The unit assignment beta_i can be appended at the beginning or at the end of the unit assignments alpha_m:

tube T = {beta_i . alpha_m} or tube T = {alpha_m . beta_i}

The default is at the beginning.


Melt Operation: Melt(S, beta_n, T)

Input:

− the tube S that contains the assignment alpha, that is, the finite sequence of unit assignments {alpha_m}, and

− the unit assignment beta_n.

Output:

− the tube T that contains all unit assignments alpha_m from the tube S melted with the unit assignment beta_n.

The unit assignment alpha_m can be melted from the beginning or from the end.

The default is from the beginning.



Extract Operation: Extract(alpha, T, T1, T2)

Input:

− the unit assignment alpha and

− the tube T.

Output:

− the tube T1 consisting of the DNA strands from the tube T that contain the unit assignment alpha as a substrand and

− the tube T2 consisting of the DNA strands from the tube T that do not contain the unit assignment alpha as a substrand.

Extract operation uses the given pattern DNA strand alpha to split the tube T into the tube T1 and the tube T2.


Remove Operation: Remove(T1, T2, T3)

Input:

− the tube T1 and

− the tube T2.

Output:

− the tube T3.

The tube T3 contains the finite sequence of all unit assignments from the tube T1 that do not contain occurrences of unit assignments from the tube T2.


Cut Operation: Cut(T1, i, T2)

Input:

− the tube T1 that contains the finite sequence of unit assignments {alpha_m} and

− the cut length i, i ≥ 0.

Output:

− the tube T2 that contains all unit assignments of the tube T1 cut by the length i.

Cut operation cuts each unit assignment of an assignment from the beginning by the given length.

DNA strands can be cut with restriction enzymes.

The test tube operations allow us to solve problems: to code DNA-based algorithms and write the appropriate programs.

The Test Tube Programming Language was proposed by Lipton [7], developed by Adleman [3], and has since been discussed in many places.
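Several of the test tube operations translate directly into list operations when they are simulated on a von Neumann machine. The following is a minimal sketch, assuming a tube is a Python list of strings and that outputs are returned rather than written to an output tube argument; all function signatures are hypothetical simplifications of the operations defined above.

```python
import random


def union(tubes):
    """Pour the contents of several tubes into one tube."""
    return [strand for tube in tubes for strand in tube]


def copy(tube, m):
    """Make m copies of a tube."""
    return [list(tube) for _ in range(m)]


def separate(tube, m):
    """Sort strands into tubes T1..Tm by length (Ti holds strands of length i)."""
    return [[s for s in tube if len(s) == i] for i in range(1, m + 1)]


def detect(tube):
    """True if the tube contains at least one unit assignment."""
    return len(tube) > 0


def select(tube, rng=random):
    """Pick one strand uniformly at random; the empty strand if the tube is empty."""
    return rng.choice(tube) if tube else ""


def extract(alpha, tube):
    """Split a tube into strands that contain alpha and strands that do not."""
    with_alpha = [s for s in tube if alpha in s]
    without_alpha = [s for s in tube if alpha not in s]
    return with_alpha, without_alpha


def remove(t1, t2):
    """Keep the strands of t1 that contain no strand of t2 as a substrand."""
    return [s for s in t1 if not any(pattern in s for pattern in t2)]


print(extract("CG", ["ACGT", "TTTT"]))
print(remove(["ACGT", "TTTT", "GGGG"], ["CG", "GG"]))
```

In a laboratory these operations act on all strands of a tube at once; the list comprehensions here only imitate their input and output behavior sequentially.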


9. DNA Representations

The DNA representation of a string c1 ... cm is a sequence c[1] ... c[m], where c[i] is the character at the position i, where i = 1, ..., m. Characters are uniquely encoded by DNA strands.

If an unsigned integer number is not used for numerical calculations, then the unsigned integer number may be represented as a string of digits.

The DNA representation of an unsigned integer number d1...dm is a sequence d[1]...d[m], where d[i] is the digit at the position i, where i = 1, ..., m. Digits are uniquely encoded by DNA strands.

If an unsigned integer number is used for numerical calculations, then the given DNA representation of an unsigned integer number is not suitable because it does not account for carries, which complicates the implementation of arithmetic operations on unsigned integer numbers.

If 0 ≤ m ≤ 10, then a permutation of the integers {1, ..., m} may be represented by unsigned integer numbers.

If 10 < m, then the DNA representation of a permutation of the integers {1, ..., m} is p[1]v[1] ... p[i]v[i] ... p[m]v[m], where p[i] is the position i and v[i] is the value at the position i, where i = 1, ..., m. Positions and values must be uniquely encoded by DNA strands.


10. A Permutation Generation DNA-based Algorithm


The permutation generation algorithm generates the set of all permutations

{P1, ..., Pm! | Pi is the i-th permutation of the integers {1, ..., m} and 1 ≤ i ≤ m!}.

The input set T is (the tube T contains) a finite sequence of unit assignments (DNA strands) that represents candidates for permutations.

The output set T is (the tube T contains) a finite sequence of unit assignments (DNA strands) that represents permutations.

The input set T may be created using the following CreateInputSet(m, T) algorithm.

procedure CreateInputSet(m, T) // m is input; T is output

{

  // T = Ø; // {e} // empty

  T = {ε}; // ε is the empty string

  for (i = 1; i <= m; i++)

  {

    Copy(T, { T[m] });

    for (j = 1; j <= m; j++) { Append(T[j], j, T[j]); }

    Union({ T[m] }, T);

  }

}



The output set T may be created using the following PermutationGeneration(T) algorith
m.



PermutationGeneration(T) // T is input and output; m is the number of elements
{
    for (i = 1; i <= m-1; i++)
    {
        Copy(T, {T[1], ..., T[m]});
        for (j = 1; j <= m; j++) // may be executed in parallel
        {
            S[j] = {i ¬j};                                            // any value other than j at position i
            for (k = i+1; k <= m; k++) { S[j] = S[j] U {k j}; };      // the value j at any later position k > i
            Remove(T[j], S[j], T[j]);
        };
        Union({T[1], ..., T[m]}, T);
    };
}
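
A minimal Python simulation of this filtering loop is sketched below, again assuming that tubes are sets and strands are tuples. The names candidate_strands and permutation_generation are hypothetical, and the candidate set is built with itertools.product rather than by the CreateInputSet procedure.

```python
from itertools import product


def candidate_strands(m):
    """All m**m strands of length m over the values 1..m."""
    return set(product(range(1, m + 1), repeat=m))


def permutation_generation(tube, m):
    """Simulate PermutationGeneration(T) on a set of tuples."""
    for i in range(1, m):  # i = 1 .. m-1
        copies = [set(tube) for _ in range(m)]  # Copy
        kept = []
        for j in range(1, m + 1):  # conceptually parallel
            # Remove: keep strands with j at position i and no j at k > i
            kept.append({
                s for s in copies[j - 1]
                if s[i - 1] == j
                and all(s[k - 1] != j for k in range(i + 1, m + 1))
            })
        tube = set().union(*kept)  # Union
    return tube
```

Calling permutation_generation(candidate_strands(3), 3) leaves the six permutations of {1, 2, 3}. The simulation visits strands one by one, so it does not reproduce the parallelism of the test tube operations.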



The frame of the algorithm is sequential.

Test tube operations execute in parallel.

The whole algorithm is semi-parallel.

Explanations



j means the unsigned integer number j from the range 1 .. m.

¬j means all unsigned integer numbers from the range 1 .. m, not equal to j.

¬j = {1, ..., m} \ {j} = {1, ..., j-1, j+1, ..., m}.

i j means the unsigned integer number j from the range 1 .. m at the position i.

i ¬j means all unsigned integer numbers from the range 1 .. m, not equal to j, at the position i.



Remove(T[j], {i ¬j, k j}) removes from the tube T[j] all DNA strands which contain at least one occurrence of
the DNA substrands i ¬j and/or k j.

Remove(T[j], {i ¬j, k j}) keeps in the tube T[j] only the DNA strands that contain the value j at the position i
and do not contain the value j at any of the positions k = i+1, ..., m.
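
The Remove operation can be sketched for small m as follows. Strands are written as digit strings such as "123", which assumes m < 10. The names all_strands, forbidden_pairs and remove are hypothetical, and each forbidden substrand is modeled as a (position, value) pair.

```python
from itertools import product


def all_strands(m):
    """All m**m digit strings of length m over the values 1..m (m < 10)."""
    digits = "".join(str(v) for v in range(1, m + 1))
    return {"".join(p) for p in product(digits, repeat=m)}


def forbidden_pairs(i, j, m):
    """Expand {i ¬j, k j} into explicit (position, value) pairs."""
    pairs = {(i, v) for v in range(1, m + 1) if v != j}  # i ¬j
    pairs |= {(k, j) for k in range(i + 1, m + 1)}       # k j for k > i
    return pairs


def remove(tube, pairs):
    """Discard every strand that contains at least one forbidden pair."""
    return {s for s in tube
            if not any(s[p - 1] == str(v) for p, v in pairs)}
```

For m = 3, remove(all_strands(3), forbidden_pairs(1, 1, 3)) keeps only the strands 122, 123, 132 and 133.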

At the end of the computation each of the surviving strings will contain exactly one occurrence of each unsigned
integer number from the set {1, …, m} and so represents one of the possible permutations.

Complexity

The complexity of the Permutation Generation DNA-based algorithm is O(m) parallel time.
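
The O(m) bound counts sequential tube-level steps, not strands. The sketch below makes that distinction explicit under a stated assumption: Copy, the parallel round of Remove operations, and Union are each counted as one step per iteration of the outer loop. The function names are hypothetical, and the step count is an illustration rather than a figure taken from the text.

```python
from math import factorial


def tube_steps(m):
    """Sequential tube-level steps of the filtering loop.

    Assumption: Copy, the parallel Remove round and Union each count
    as one step, and the loop runs for i = 1 .. m-1.
    """
    return 3 * (m - 1)


def strand_counts(m):
    """Return (candidate strands, surviving permutations) = (m**m, m!)."""
    return m ** m, factorial(m)
```

Under this assumption the number of steps grows linearly in m, while the number of candidate strands that the input tube must hold grows as m^m.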

Program Execution

The Permutation Generation DNA-based program may be executed:

(i) step by step in a DNA-lab or on a DNA-computer

(ii) automatically on a super DNA-computer or on an electronic-digital computer

(iii) (for a small (less than 10) number of elements).

Test Example

The purpose of the test example is to "visualize" execution of the Permutation Generation DNA-based algorithm.

The number of elements is 3.

CreateInputSet(3, T);




T

111 112 113 121 122 123 131 132 133


211 212 213 221 222 223 231 232 233

311 312 313 321 322 323 331 332 333



Copy(T, {T[1], T[2], T[3]});



T[1]           T[2]           T[3]

111 112 113    111 112 113    111 112 113
121 122 123    121 122 123    121 122 123
131 132 133    131 132 133    131 132 133
211 212 213    211 212 213    211 212 213
221 222 223    221 222 223    221 222 223
231 232 233    231 232 233    231 232 233
311 312 313    311 312 313    311 312 313
321 322 323    321 322 323    321 322 323
331 332 333    331 332 333    331 332 333




Remove(T[1], {1 ¬1, 2 1, 3 1}, T[1]);

Remove(T[2], {1 ¬2, 2 2, 3 2}, T[2]);

Remove(T[3], {1 ¬3, 2 3, 3 3}, T[3]);





T[1]               T[2]               T[3]

122 123 132 133    211 213 231 233    311 312 321 322



Union({T[1], T[2], T[3]}, T);



T

122 123 132 133


211 213 231 233


311 312 321 322




Copy(T, {T[1], T[2], T[3]});



T[1]               T[2]               T[3]

122 123 132 133    122 123 132 133    122 123 132 133
211 213 231 233    211 213 231 233    211 213 231 233
311 312 321 322    311 312 321 322    311 312 321 322




Remove(T[1], {2 ¬1, 3 1}, T[1]);

Remove(T[2], {2 ¬2, 3 2}, T[2]);

Remove(T[3], {2 ¬3, 3 3}, T[3]);



T[1]       T[2]       T[3]

213 312    123 321    132 231




Union({T[1], T[2], T[3]}, T);



T

213 312


123 321


132 231




Set T is the output - the result