IEEE TRANSACTIONS ON NEURAL NETWORKS 1
Binary Higher Order Neural Networks for
Realizing Boolean Functions
Chao Zhang, Jie Yang, and Wei Wu

Abstract: In order to more efficiently realize Boolean functions by using neural networks, we propose a binary product-unit neural network (BPUNN) and a binary pi-sigma neural network (BPSNN). The network weights can be determined by one-step training. It is shown that the addition “Σ,” the multiplication “Π,” and two kinds of special weighting operations in BPUNN and BPSNN can implement the logical operators “∨,” “∧,” and “¬” on the Boolean algebra $\langle Z_2, \vee, \wedge, \neg, 0, 1 \rangle$ ($Z_2 = \{0, 1\}$), respectively. The proposed two neural networks enjoy the following advantages over the existing networks: 1) for a complete truth table of N variables with both truth and false assignments, the corresponding Boolean function can be realized by accordingly choosing a BPUNN or a BPSNN such that at most $2^{N-1}$ hidden nodes are needed, while $O(2^N)$, precisely $2^N$, or at most $2^N$ hidden nodes are needed by existing networks; 2) a new network BPUPS based on a collaboration of BPUNN and BPSNN can be defined to deal with incomplete truth tables, while the existing networks can only deal with complete truth tables; and 3) the values of the weights are all simply −1 or 1, while the weights of all the existing networks are real numbers. Supporting numerical experiments are provided as well. Finally, we present the risk bounds of BPUNN, BPSNN, and BPUPS, and then analyze their probably approximately correct learnability.

Index Terms: Binary pi-sigma neural network, binary product-unit neural network, Boolean function, principle conjunctive normal form, principle disjunctive normal form.
I. INTRODUCTION

NEURAL networks possess simple parallel structure, computational robustness, and ease of hardware implementation. These advantages make them good candidates for the effective realization of Boolean functions [1]–[3]. Various kinds of neural networks have been proposed and studied to this end, such as McCulloch–Pitts neurons [4], linear threshold neurons [5], [6], pi-sigma neural networks (PSNNs) [7], spiking neurons [8], radial basis function neurons [9], sigma-pi neurons [10], [11], and cellular neural networks [12]. Anthony [13] analyzes the performance of linear threshold neurons, spiking neurons, and sigma-pi neurons used as classifiers to compute Boolean functions. Xiong et al. [14] consider the approximation to Boolean functions by neural networks with summation units and show that a Boolean function of N variables can be approximated by a three-layer neural network with $2^N$ hidden nodes. Computation and approximation to Boolean functions are mentioned in [15]. By using an algebraic form, Chen [16] converted the logic-based input state dynamics of Boolean networks into an algebraic discrete-time dynamic system, and then investigated the structure of Boolean networks (also see [17]). Chen et al. [18] studied the realization of linearly separable Boolean functions and parity Boolean functions by using universal perceptrons with a deoxyribonucleic acid (DNA)-like learning algorithm. Subsequently, based on the fact that a linearly nonseparable Boolean function can be decomposed into logic XOR operations of a sequence of linearly separable Boolean functions, Chen et al. [19] proposed a DNA-like learning and decomposing algorithm for implementing linearly nonseparable Boolean functions.

Manuscript received December 23, 2008; revised February 1, 2011; accepted XXXX XX, XXXX. This work was supported in part by the National Natural Science Foundation of China under Grant 10871220.
C. Zhang was with the School of Mathematical Sciences, Dalian University of Technology, Dalian 116023, China. He is now with the School of Computer Engineering, Nanyang Technological University, Singapore (e-mail: zhangchao1015@gmail.com).
J. Yang and W. Wu are with the School of Mathematical Sciences, Dalian University of Technology, Dalian 116023, China (e-mail: yangjiee@dlut.edu.cn; wuweiw@dlut.edu.cn).
Digital Object Identifier 10.1109/TNN.2011.2114367
Any Boolean function of the Boolean algebra $\langle Z_2, \vee, \wedge, \neg, 0, 1 \rangle$ can be uniquely represented by either a principle disjunctive normal form (PDNF) or a principle conjunctive normal form (PCNF) [20]. Hence, a given kind of network can realize arbitrary Boolean functions if it can implement arbitrary PDNFs or PCNFs. In order to construct such networks, this paper concentrates on the implementation of the logical operators “∨,” “∧,” and “¬,” and of PDNFs and PCNFs.
In order to enhance the nonlinear mapping ability of the traditional feedforward neural networks with summation units, various kinds of higher order neural networks (HONNs) have been developed, such as sigma-pi neural networks (SPNNs) [21]–[23], PSNNs [24]–[26], and product-unit neural networks (PUNNs) [27]–[31]. There are also some important works on HONNs proposed in [32]–[36]. The main difference between HONNs and the traditional feedforward networks is that HONNs have the multiplication operation “Π,” which can provide more nonlinearity. It is proved in [10] and [11] that fully connected SPNNs can realize arbitrary Boolean functions (see [23] for the concept of “fully connected”), and two one-step methods are respectively proposed to compute the weights. In [7], it is shown that any conjunctive normal form can be realized by PSNN, and the elementary algebra operations “Σ” and “Π” are taken as the logical disjunction “∨” and the logical conjunction “∧,” respectively. The logical negation “¬” is implemented in [7] by a combination of inputs and threshold values. The implementation of the logical operators “∨,” “∧,” and “¬” by elementary algebra operations is also mentioned in [37], where a method for extracting rules from the trained networks is developed. In the domain $Z_2$, the logical conjunction “∧” is equal to the multiplication “Π.” Because of the existence of the multiplication “Π,” if the addition “Σ” and some weighting operation can respectively implement the logical operators “∨” and “¬” in some kinds of HONNs, Boolean functions can be realized by the HONNs.

1045–9227/$26.00 © 2011 IEEE
The concept of functional completeness gives a sufficient condition for a model to realize arbitrary Boolean functions (see [20]).

Definition 1: A set of logical operators $\{g_1, g_2, \ldots, g_k\}$ is said to be functionally complete if all possible Boolean functions can be implemented by using only the operators $g_1, g_2, \ldots, g_k$.
Our model will be based on the set of logical operators {∨, ∧, ¬}. The reasons for doing so are as follows.

1) Each of the sets {∨, ∧, ¬}, {∧, ¬}, and {∨, ¬} is functionally complete, while the set {∨, ∧} is not (see [20]).
2) The set {∧, ¬} or {∨, ¬} contains fewer operations. However, since $x \vee y = \neg(\neg x \wedge \neg y)$ and $x \wedge y = \neg(\neg x \vee \neg y)$ (see [20]), models based on {∧, ¬} or {∨, ¬} need more computations than models based on {∨, ∧, ¬}.
3) Models based on {∨, ∧, ¬} directly correspond to Boolean expressions on the Boolean algebra $\langle Z_2, \vee, \wedge, \neg, 0, 1 \rangle$.
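The De Morgan identities quoted in item 2) can be checked exhaustively on $Z_2$. The following is a minimal sketch; the helper names (`NOT`, `AND`, `OR`, `or_from_and_not`, `and_from_or_not`) are illustrative and do not come from the paper.

```python
from itertools import product

def NOT(x):
    return 1 - x

def AND(x, y):
    return x * y

def OR(x, y):
    return max(x, y)

def or_from_and_not(x, y):
    # x OR y written with {AND, NOT} only: three negations and one conjunction
    return NOT(AND(NOT(x), NOT(y)))

def and_from_or_not(x, y):
    # x AND y written with {OR, NOT} only: three negations and one disjunction
    return NOT(OR(NOT(x), NOT(y)))

# Exhaustive check over Z_2 x Z_2
for x, y in product((0, 1), repeat=2):
    assert or_from_and_not(x, y) == OR(x, y)
    assert and_from_or_not(x, y) == AND(x, y)
```

Each rewritten operator costs four elementary operations instead of one, which illustrates the extra computation mentioned in item 2).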
Our aim in this paper is to construct neural networks that are based on {∨, ∧, ¬} and can realize Boolean functions more efficiently than other existing networks. In particular, we propose a binary PUNN (BPUNN) and a binary PSNN (BPSNN) derived from PUNN and PSNN, respectively. We prove that BPUNN (resp. BPSNN) can implement arbitrary PDNFs (resp. PCNFs) and hence arbitrary Boolean functions with both truth and false assignments. In both BPUNN and BPSNN, as usual, the addition “Σ” and the multiplication “Π” stand for the logical disjunction “∨” and the logical conjunction “∧,” respectively. An important trick in our approach is to implement the unary logical operator “¬” by two special input-to-hidden weighting operations, one for each network. The two neural networks enjoy the following advantages compared to the existing networks.
1) At most $2^{N-1}$ hidden nodes are needed to realize a Boolean function with a complete truth table of N variables by suitably choosing a BPUNN or a BPSNN, with the exception that $2^N$ hidden nodes are needed to realize the two special Boolean functions with $2^N$ truth assignments or $2^N$ false assignments, respectively, while $O(2^N)$, precisely $2^N$, and at most $2^N$ hidden nodes are needed for the multilayer perceptron (MLP) in [13], SPNN in [10] and [11], and PSNN in [7], respectively.
2) A Boolean function with an incomplete truth table can also be realized by a new network BPUPS based on a collaboration of BPUNN and BPSNN, such that it gives the right response to an input that is in the truth table and recognizes an input that is not in the truth table, while this issue is not considered for the existing networks.
3) The values of the weights are simply −1 or 1, which is convenient for storage and hardware realization (see [1]), while the weights of the existing networks are real numbers.
4) The network weights can be determined by one-step training, while they are obtained through iterative training procedures for the DNA-like algorithm in [18] and [19], MLP in [13], and PSNN in [7]. The network weights of SPNN in [10] and [11] also enjoy this advantage.
To further investigate the properties of BPUNN and BPSNN, we consider a more general scenario in which the training samples may be evaluated with a certain probability. We give the Vapnik–Chervonenkis (VC) dimension of a function class composed of BPUNN (resp. BPSNN). Afterwards, we present the risk bound of BPUNN (resp. BPSNN) based on the VC dimension, and then analyze the learnability of the two networks by using the resulting risk bounds. Finally, we discuss the learnability of BPUPS.
The rest of this paper is arranged as follows. The next section gives a brief introduction to Boolean functions. BPUNN and BPSNN are respectively defined in Sections III and IV. The main theorems and their proofs are presented in Section V. Section VI gives a comparison with other kinds of neural networks and provides the corresponding numerical experiments. In Section VII, we propose a network to deal with incomplete truth tables. The learnability of BPUNN, BPSNN, and BPUPS is analyzed in Section VIII, and some conclusions are drawn in the last section. The relevant proofs are given in the Appendixes.
II. PRELIMINARIES ON BOOLEAN FUNCTIONS

In this section, we present some preliminaries on Boolean functions, and refer to [20] for further details.
A. PDNF

Definition 2: A minterm in the variables $x_1, x_2, \ldots, x_N$ is a Boolean expression of the form

$$\tilde{x}_1 \wedge \tilde{x}_2 \wedge \cdots \wedge \tilde{x}_N \tag{1}$$

where $\tilde{x}_n$ is either $x_n$ or $\neg x_n$.

For a minterm $\tilde{x}_1 \wedge \tilde{x}_2 \wedge \cdots \wedge \tilde{x}_N$, a vector $a = (a_1, a_2, \ldots, a_N) \in Z_2^N$ is generated as follows. For $1 \le n \le N$, we set

$$a_n = \begin{cases} 1, & \tilde{x}_n = x_n; \\ 0, & \tilde{x}_n = \neg x_n. \end{cases} \tag{2}$$

The resulting vector $a$ is the unique truth assignment for the minterm $\tilde{x}_1 \wedge \tilde{x}_2 \wedge \cdots \wedge \tilde{x}_N$. Thus, for $x = (x_1, x_2, \ldots, x_N)$, the minterm can be represented as

$$m_a(x) = \tilde{x}_1 \wedge \tilde{x}_2 \wedge \cdots \wedge \tilde{x}_N \tag{3}$$

and

$$m_a(x) = \begin{cases} 1, & x = a; \\ 0, & x \neq a. \end{cases} \tag{4}$$
The next theorem and its proof can be found in [20].

Theorem 1: If $f: Z_2^N \to Z_2$, then $f$ is a Boolean function. If $f$ is not identically zero, let $a_1, a_2, \ldots, a_J \in Z_2^N$ be the truth assignments for the Boolean function $f$. For each $a_j = (a_{j,1}, a_{j,2}, \ldots, a_{j,N})$ ($1 \le j \le J$), set

$$m_{a_j}(x) = \tilde{x}_1 \wedge \tilde{x}_2 \wedge \cdots \wedge \tilde{x}_N$$

where $x = (x_1, x_2, \ldots, x_N) \in Z_2^N$, and

$$\tilde{x}_n = \begin{cases} x_n, & a_{j,n} = 1; \\ \neg x_n, & a_{j,n} = 0. \end{cases}$$

Then

$$f(x) = m_{a_1}(x) \vee m_{a_2}(x) \vee \cdots \vee m_{a_J}(x). \tag{5}$$

Definition 3: The representation (5) of a Boolean function $f: Z_2^N \to Z_2$ is called the PDNF.
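The construction in Theorem 1 can be sketched in a few lines of code. This is only an illustration under our own naming (`minterm`, `pdnf`, `truth_assignments`, and the example function `majority` are not from the paper); it collects the truth assignments of a function and evaluates the disjunction of the corresponding minterms.

```python
from itertools import product

def minterm(a, x):
    """m_a(x): conjunction of literals, equal to 1 exactly when x == a."""
    out = 1
    for a_n, x_n in zip(a, x):
        literal = x_n if a_n == 1 else 1 - x_n  # x_n or its negation
        out &= literal
    return out

def truth_assignments(f, n):
    """All inputs in Z_2^n on which f evaluates to 1."""
    return [x for x in product((0, 1), repeat=n) if f(x) == 1]

def pdnf(assignments, x):
    """Disjunction of the minterms of the given truth assignments, as in (5)."""
    return int(any(minterm(a, x) for a in assignments))

# Example: the three-variable majority function
def majority(x):
    return int(sum(x) >= 2)

A = truth_assignments(majority, 3)
assert all(pdnf(A, x) == majority(x) for x in product((0, 1), repeat=3))
```

The majority function has four truth assignments, so its PDNF is a disjunction of four minterms.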
B. PCNF

Definition 4: A maxterm in the variables $x_1, x_2, \ldots, x_N$ is a Boolean expression of the form

$$\tilde{x}_1 \vee \tilde{x}_2 \vee \cdots \vee \tilde{x}_N \tag{6}$$

where $\tilde{x}_n$ is either $x_n$ or $\neg x_n$.

For a maxterm $\tilde{x}_1 \vee \tilde{x}_2 \vee \cdots \vee \tilde{x}_N$, we generate a vector $b = (b_1, b_2, \ldots, b_N) \in Z_2^N$ as follows. For $1 \le n \le N$, we set

$$b_n = \begin{cases} 1, & \tilde{x}_n = \neg x_n; \\ 0, & \tilde{x}_n = x_n. \end{cases} \tag{7}$$

The resulting vector $b$ is the unique false assignment for the maxterm $\tilde{x}_1 \vee \tilde{x}_2 \vee \cdots \vee \tilde{x}_N$. Thus, for $x = (x_1, x_2, \ldots, x_N)$, the maxterm can be represented as

$$M_b(x) = \tilde{x}_1 \vee \tilde{x}_2 \vee \cdots \vee \tilde{x}_N \tag{8}$$

and

$$M_b(x) = \begin{cases} 1, & x \neq b; \\ 0, & x = b. \end{cases} \tag{9}$$
Combining the duality principle and Theorem 1, we have the following result, which can be found in [20].

Theorem 2: If $f: Z_2^N \to Z_2$, then $f$ is a Boolean function. If $f$ is not identically one, let $b_1, b_2, \ldots, b_I \in Z_2^N$ be the false assignments for the Boolean function $f$. For each $b_i = (b_{i,1}, b_{i,2}, \ldots, b_{i,N})$ ($1 \le i \le I$), set

$$M_{b_i}(x) = \tilde{x}_1 \vee \tilde{x}_2 \vee \cdots \vee \tilde{x}_N, \quad 1 \le i \le I$$

where $x = (x_1, x_2, \ldots, x_N) \in Z_2^N$, and

$$\tilde{x}_n = \begin{cases} x_n, & b_{i,n} = 0; \\ \neg x_n, & b_{i,n} = 1. \end{cases}$$

Then

$$f(x) = M_{b_1}(x) \wedge M_{b_2}(x) \wedge \cdots \wedge M_{b_I}(x). \tag{10}$$

Definition 5: The representation (10) of a Boolean function $f: Z_2^N \to Z_2$ is called the PCNF.
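As a small worked illustration of Theorems 1 and 2 (our own example, not taken from the paper), consider the two-variable exclusive-or function $f(x_1, x_2) = x_1 \oplus x_2$. Its truth assignments are $(0,1)$ and $(1,0)$, and its false assignments are $(0,0)$ and $(1,1)$, so the PDNF and the PCNF read as follows.

```latex
\begin{align*}
\text{PDNF:}\quad f(x) &= m_{(0,1)}(x) \vee m_{(1,0)}(x)
  = (\neg x_1 \wedge x_2) \vee (x_1 \wedge \neg x_2), \\
\text{PCNF:}\quad f(x) &= M_{(0,0)}(x) \wedge M_{(1,1)}(x)
  = (x_1 \vee x_2) \wedge (\neg x_1 \vee \neg x_2).
\end{align*}
```

Both forms use two terms here because the function has as many truth assignments as false assignments.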
Fig. 1. Structure of BPU. [Diagram: each input $x_n$ is mapped by $h$, raised to the weight $w_n$, mapped back by $h^{-1}$, and the results are combined into the output $H(w, x)$.]
III. BPUNN

A. BPU

The form of PUs is $\prod_{n=1}^{N} x_n^{w_n}$. Compared to the form of traditional summation units $\sum_{n=1}^{N} x_n w_n$, PUs have more powerful nonlinearity due to their special exponential form. It is shown in [27] that the information capacity of a single PU (as measured by its capacity for learning random Boolean patterns) is approximately 3N, compared to 2N for a single summation unit [38], where N is the number of inputs to the units. Therefore, PUNNs with a hidden layer composed of PUs can handle many complicated cases [31]. For example, as demonstrated in [39], a PUNN with only one PU in its hidden layer is sufficient to solve difficult symmetry problems and parity problems.

Based on PU, we define a BPU with inputs and outputs in $Z_2$, as shown in Fig. 1.
Write $w = (w_1, \ldots, w_N) \in \{-1, 1\}^N$ as the weight vector. For an input vector $x = (x_1, \ldots, x_N) \in Z_2^N$, let $H(w, x)$ be the output of BPU

$$H(w, x) = \prod_{n=1}^{N} h^{-1}\big( (h(x_n))^{w_n} \big) \tag{11}$$

where $h(t): Z_2 \to \{1/2, 2\}$ is defined by

$$h(t) = \begin{cases} 2, & t = 1; \\ \tfrac{1}{2}, & t = 0 \end{cases} \tag{12}$$

and $h^{-1}(t)$ is the inverse function of $h(t)$

$$h^{-1}(t) = \begin{cases} 1, & t = 2; \\ 0, & t = \tfrac{1}{2}. \end{cases} \tag{13}$$

The operation of $h(t)$ is to switch the input domain $Z_2$ into $\{1/2, 2\}$. (Actually, we may use $\{1/a, a\}$ in place of $\{1/2, 2\}$ for any $a > 1$.) The switch operation will be used in the weighting operation of BPU to implement the logical negation “¬.”
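The BPU output (11) with the switch functions (12) and (13) can be written out directly. The sketch below is an illustration only: the function names are ours, and exact rational arithmetic via `fractions.Fraction` is an implementation choice made here so that the lookup in $h^{-1}$ is not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import product

def h(t):
    """Switch function (12): maps Z_2 = {0, 1} to {1/2, 2}."""
    return Fraction(2) if t == 1 else Fraction(1, 2)

def h_inv(t):
    """Inverse switch (13): maps {1/2, 2} back to {0, 1}."""
    return {Fraction(2): 1, Fraction(1, 2): 0}[t]

def bpu(w, x):
    """BPU output (11): product over n of h_inv(h(x_n) ** w_n), with w_n in {-1, 1}."""
    out = 1
    for w_n, x_n in zip(w, x):
        out *= h_inv(h(x_n) ** w_n)
    return out

# With w = (1, -1, 1) the unit computes x_1 AND (NOT x_2) AND x_3
w = (1, -1, 1)
fired = [x for x in product((0, 1), repeat=3) if bpu(w, x) == 1]
```

A weight of −1 inverts $h(x_n)$, which swaps 2 and 1/2 and therefore negates the input once $h^{-1}$ is applied; a weight of 1 leaves it unchanged.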
B. BPUNN

Fig. 2 shows a BPUNN with one output and J hidden nodes (BPUs). The weight vector between the inputs and the jth hidden node is denoted by $w_j = (w_{j,1}, w_{j,2}, \ldots, w_{j,N}) \in \{-1, 1\}^N$ ($1 \le j \le J$), and the hidden-to-output weights are fixed to 1. Thus, the final output of BPUNN is

$$y = g\Big( \sum_{j=1}^{J} H(w_j, x) \Big) = g\Big( \sum_{j=1}^{J} \prod_{n=1}^{N} h^{-1}\big( (h(x_n))^{w_{j,n}} \big) \Big) \tag{14}$$

where $H(w_j, x)$ stands for the output of the jth hidden node (11), and

$$g(t) = \begin{cases} 1, & t > 0; \\ 0, & t = 0. \end{cases} \tag{15}$$

Fig. 2. Structure of N-J-1 BPUNN. [Diagram: inputs $x_1, x_2, \ldots, x_N$ feed the hidden nodes $H(w_1, x), \ldots, H(w_J, x)$, whose outputs are summed with unit weights and passed through $g$ to give $y$.]

Fig. 3. Structure of a hidden node of BPSNN. [Diagram: each input $x_n$ is mapped by $k$, multiplied by the weight $v_n$, mapped back by $k^{-1}$, and the results are summed and passed through $g$ to give $K(v, x)$.]
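The forward pass (14) and (15) can be sketched as follows. The names `bpunn` and `W_xor` are illustrative, the weight vectors are hand-picked for a two-variable exclusive-or example of our own, and exact fractions are again an implementation choice rather than part of the paper.

```python
from fractions import Fraction

def h(t):
    return Fraction(2) if t == 1 else Fraction(1, 2)

def h_inv(t):
    return {Fraction(2): 1, Fraction(1, 2): 0}[t]

def g(t):
    """Output function (15): 1 for t > 0, 0 for t = 0."""
    return 1 if t > 0 else 0

def bpu(w, x):
    """Hidden node (11)."""
    out = 1
    for w_n, x_n in zip(w, x):
        out *= h_inv(h(x_n) ** w_n)
    return out

def bpunn(W, x):
    """Network output (14): g applied to the sum of the hidden-node outputs."""
    return g(sum(bpu(w, x) for w in W))

# Two hidden nodes: (NOT x_1 AND x_2) and (x_1 AND NOT x_2), i.e. exclusive-or
W_xor = [(-1, 1), (1, -1)]
outputs = {x: bpunn(W_xor, x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]}
```

Because every hidden node outputs 0 or 1, the sum is positive exactly when at least one node fires, so $g$ turns the sum into a disjunction.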
IV. BPSNN

The PSNN was proposed by Shin in 1991 [24]. By incorporating product neurons with polynomials of inputs, PSNN has a powerful capability of nonlinear mapping and avoids the dimensional explosion that might appear in SPNN [24]. Based on PSNN, we propose BPSNN with both the input and output domains being $Z_2$.
A. Hidden Node of BPSNN

The structure of a hidden node of BPSNN is shown in Fig. 3. Let $v = (v_1, \ldots, v_N) \in \{-1, 1\}^N$ be the weight vector. For a binary input vector $x = (x_1, \ldots, x_N) \in Z_2^N$, the output of the hidden node is

$$K(v, x) = g\Big( \sum_{n=1}^{N} k^{-1}\big( v_n k(x_n) \big) \Big) \tag{16}$$

where $g(t)$ is given in (15), $k(t): Z_2 \to \{-1, 1\}$ is defined by

$$k(t) = \begin{cases} 1, & t = 1; \\ -1, & t = 0 \end{cases} \tag{17}$$

and $k^{-1}(t)$ is the inverse function of $k(t)$

$$k^{-1}(t) = \begin{cases} 1, & t = 1; \\ 0, & t = -1. \end{cases} \tag{18}$$

Similar to $h(t)$, $k(t)$ switches the input domain $Z_2$ into $\{-1, 1\}$ and will be used in the weighting operation to implement the logical negation “¬.”

Fig. 4. Structure of N-I-1 BPSNN. [Diagram: inputs $x_1, x_2, \ldots, x_N$ feed the hidden nodes $K(v_1, x), \ldots, K(v_I, x)$, whose outputs are combined with unit weights to give $y$.]
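A hidden node of BPSNN, as given by (16) to (18) with $g$ from (15), can be sketched as below. The function names are ours and the weight vector in the example is chosen only for illustration.

```python
from itertools import product

def g(t):
    """Output function (15)."""
    return 1 if t > 0 else 0

def k(t):
    """Switch function (17): maps Z_2 = {0, 1} to {-1, 1}."""
    return 1 if t == 1 else -1

def k_inv(t):
    """Inverse switch (18): maps {-1, 1} back to {0, 1}."""
    return {1: 1, -1: 0}[t]

def bpsnn_node(v, x):
    """Hidden node (16): g of the sum over n of k_inv(v_n * k(x_n)), v_n in {-1, 1}."""
    return g(sum(k_inv(v_n * k(x_n)) for v_n, x_n in zip(v, x)))

# With v = (1, -1) the node computes x_1 OR (NOT x_2)
v = (1, -1)
zeros = [x for x in product((0, 1), repeat=2) if bpsnn_node(v, x) == 0]
```

Multiplying $k(x_n)$ by −1 swaps 1 and −1, which negates the input after $k^{-1}$; the node is 0 only on the single input where every literal is 0.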
B. BPSNN

A BPSNN with I hidden nodes and one output is illustrated in Fig. 4, where $x = (x_1, x_2, \ldots, x_N) \in Z_2^N$ stands for the input vector and $v_i = (v_{i,1}, v_{i,2}, \ldots, v_{i,N}) \in \{-1, 1\}^N$ ($1 \le i \le I$) is the weight vector between the input layer and the ith hidden node. The output of the ith hidden node is denoted by $K(v_i, x)$ [see (16)]. The hidden-to-output weights are fixed to 1. According to (16) and (17), the final output of the BPSNN is

$$y = \prod_{i=1}^{I} K(v_i, x) = \prod_{i=1}^{I} g\Big( \sum_{n=1}^{N} k^{-1}\big( v_{i,n} k(x_n) \big) \Big) \tag{19}$$

where $g(t)$, $k(t)$, and $k^{-1}(t)$ are defined by (15), (17), and (18), respectively.
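The network output (19) is the product of the hidden-node outputs. A minimal sketch, with our own names (`bpsnn`, `V_xor`) and a hand-picked two-variable exclusive-or example, could look as follows.

```python
from math import prod

def g(t):
    return 1 if t > 0 else 0

def k(t):
    return 1 if t == 1 else -1

def k_inv(t):
    return {1: 1, -1: 0}[t]

def bpsnn_node(v, x):
    """Hidden node (16)."""
    return g(sum(k_inv(v_n * k(x_n)) for v_n, x_n in zip(v, x)))

def bpsnn(V, x):
    """Network output (19): product of the hidden-node outputs."""
    return prod(bpsnn_node(v, x) for v in V)

# Two hidden nodes: (x_1 OR x_2) and (NOT x_1 OR NOT x_2), i.e. exclusive-or
V_xor = [(1, 1), (-1, -1)]
outputs = {x: bpsnn(V_xor, x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]}
```

Since every hidden node outputs 0 or 1, the product is 1 exactly when all nodes output 1, so the multiplication acts as a conjunction.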
V. MAIN THEOREMS AND PROOFS

A. Implementation of “∨,” “∧,” and “¬”

“∨,” “∧,” and “¬” are the basic logical operators on the Boolean algebra $\langle Z_2, \vee, \wedge, \neg, 0, 1 \rangle$. The following theorem states that the addition “Σ,” the multiplication “Π,” and two special weighting operations can implement the logical operators “∨,” “∧,” and “¬,” respectively.

Theorem 3: For $x = (x_1, x_2, \ldots, x_N) \in Z_2^N$, if $g(t)$, $h(t)$, $h^{-1}(t)$, $k(t)$, and $k^{-1}(t)$ are defined by (15), (12), (13), (17), and (18), respectively, then the following formulas are valid:

$$g\Big( \sum_{n=1}^{N} x_n \Big) = \bigvee_{n=1}^{N} x_n, \quad \text{and} \quad \prod_{n=1}^{N} x_n = \bigwedge_{n=1}^{N} x_n.$$

Furthermore, for $1 \le n \le N$, writing $x$, $w$, and $v$ for $x_n$, $w_n$, and $v_n$, we have

$$h^{-1}\big( (h(x))^{w} \big) = \begin{cases} x, & w = 1; \\ \neg x, & w = -1 \end{cases} \tag{20}$$

and

$$k^{-1}\big( v k(x) \big) = \begin{cases} x, & v = 1; \\ \neg x, & v = -1. \end{cases} \tag{21}$$

Proof: Using (12), (13), (15), (17), and (18), the above formulas can be directly obtained by the principles of the logical operators “∨,” “∧,” and “¬.”

In (20), $h(t)$ switches the input domain $Z_2$ into $\{1/2, 2\}$ so that the exponential weight of the input can implement the logical negation “¬.” $k(t)$ does a similar job in (21).
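Since all quantities in Theorem 3 range over finite sets, the formulas can be confirmed by brute force. The sketch below does this for inputs of up to four variables; the function names and the use of exact fractions are our own choices.

```python
from fractions import Fraction
from itertools import product
from math import prod

def g(t):
    return 1 if t > 0 else 0

def h(t):
    return Fraction(2) if t == 1 else Fraction(1, 2)

def h_inv(t):
    return {Fraction(2): 1, Fraction(1, 2): 0}[t]

def k(t):
    return 1 if t == 1 else -1

def k_inv(t):
    return {1: 1, -1: 0}[t]

# Addition followed by g is disjunction; multiplication is conjunction
for n in range(1, 5):
    for x in product((0, 1), repeat=n):
        assert g(sum(x)) == int(any(x))
        assert prod(x) == int(all(x))

# Weighting operations (20) and (21): weight 1 keeps x, weight -1 negates it
for x in (0, 1):
    assert h_inv(h(x) ** 1) == x
    assert h_inv(h(x) ** -1) == 1 - x
    assert k_inv(1 * k(x)) == x
    assert k_inv(-1 * k(x)) == 1 - x
```

All assertions pass silently, which is consistent with the direct argument given in the proof.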
B. Implementation of Boolean Functions by BPUNN

First, we consider the implementation of minterms. Let $x = (x_1, x_2, \ldots, x_N) \in Z_2^N$. A given minterm has the form (1)

$$m_a(x) = \tilde{x}_1 \wedge \tilde{x}_2 \wedge \cdots \wedge \tilde{x}_N$$

where $\tilde{x}_n$ is either $x_n$ or $\neg x_n$. Let the vector $a = (a_1, a_2, \ldots, a_N) \in Z_2^N$ be the unique truth assignment for the minterm [see (2)]. We define a function $\psi(a): Z_2 \to \{-1, 1\}$ as follows:

$$\psi(a) = \begin{cases} 1, & a = 1; \\ -1, & a = 0 \end{cases} \tag{22}$$

and then define a mapping $\Psi(a): Z_2^N \to \{-1, 1\}^N$ by

$$\Psi(a) = \big( \psi(a_1), \psi(a_2), \ldots, \psi(a_N) \big). \tag{23}$$

We define the weight vector $w = (w_1, w_2, \ldots, w_N) \in \{-1, 1\}^N$ as

$$w = \Psi(a). \tag{24}$$

By (3), (4), (11), (12), (20), and (24), the output of BPU satisfies

$$H(w, x) = m_a(x) = \begin{cases} 1, & x = a; \\ 0, & x \neq a. \end{cases} \tag{25}$$
The above discussion leads to the following theorem.

Theorem 4: For any given minterm of N variables shown in (1), if the weight vector of a BPU with N inputs (11) is determined by (24), then the BPU can implement the minterm.

Since any minterm has a unique truth assignment, for a given minterm of N variables, the corresponding BPU can be obtained by (24). Since arbitrary Boolean functions can be written as PDNFs, i.e., disjunctions of minterms, a kind of network that can implement PDNFs can realize arbitrary Boolean functions as well.
Let $f: Z_2^N \to Z_2$ be a given Boolean function with J truth assignments $a_1, a_2, \ldots, a_J \in Z_2^N$ ($1 \le J \le 2^N$). By (3) and (4), the vector $a_j = (a_{j,1}, a_{j,2}, \ldots, a_{j,N})$ is the truth assignment for the minterm $m_{a_j}$ ($1 \le j \le J$). Then, according to Theorem 1, we have

$$f(x) = m_{a_1}(x) \vee m_{a_2}(x) \vee \cdots \vee m_{a_J}(x) \tag{26}$$

where $x = (x_1, x_2, \ldots, x_N) \in Z_2^N$.
As shown in Fig. 2, we consider a BPUNN with J hidden nodes and let the jth hidden node correspond to the minterm $m_{a_j}(x)$ ($1 \le j \le J$). By (24) and Theorem 4, the output of the jth hidden node is

$$H(w_j, x) = m_{a_j}(x) = \begin{cases} 1, & x = a_j; \\ 0, & x \neq a_j \end{cases} \tag{27}$$

where $w_j = \Psi(a_j)$. Then, combining (14), (26), and (27), we have

$$g\Big( \sum_{j=1}^{J} H(w_j, x) \Big) = m_{a_1}(x) \vee m_{a_2}(x) \vee \cdots \vee m_{a_J}(x) = \begin{cases} 1, & x \in \{a_1, a_2, \ldots, a_J\}; \\ 0, & x \notin \{a_1, a_2, \ldots, a_J\} \end{cases} \tag{28}$$

where $g(t)$ is defined by (15). As a consequence, we have the following theorem.
Theorem 5: Let $f: Z_2^N \to Z_2$ be a given Boolean function with J ($1 \le J \le 2^N$) truth assignments. Let a BPUNN be given with N inputs, J hidden nodes, and one output as shown in Fig. 2, and with the hidden nodes being determined by (27). Then the Boolean function $f$ can be realized by the resulting BPUNN.

Note that if J = 0, the outputs of $f$ are all zeros and $f$ has no truth assignment. Therefore, according to Theorem 4, BPUNN cannot realize such a function $f$. However, we will show in the following discussion that BPSNN can.
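The one-step construction behind Theorem 5 can be sketched end to end: one weight vector per truth assignment, obtained from (22) to (24), followed by the forward pass (14). The helper names (`psi`, `train_bpunn`, `bpunn`) and the dictionary representation of a truth table are assumptions made for this illustration. The loop checks all three-variable Boolean functions with at least one truth assignment.

```python
from fractions import Fraction
from itertools import product

def h(t):
    return Fraction(2) if t == 1 else Fraction(1, 2)

def h_inv(t):
    return {Fraction(2): 1, Fraction(1, 2): 0}[t]

def g(t):
    return 1 if t > 0 else 0

def bpunn(W, x):
    """Forward pass (14) with hidden nodes (11)."""
    total = 0
    for w in W:
        node = 1
        for w_n, x_n in zip(w, x):
            node *= h_inv(h(x_n) ** w_n)
        total += node
    return g(total)

def psi(a_n):
    """Component map (22)."""
    return 1 if a_n == 1 else -1

def train_bpunn(table):
    """One-step training: one weight vector (24) per truth assignment."""
    return [tuple(psi(a_n) for a_n in a) for a, value in table.items() if value == 1]

N = 3
inputs = list(product((0, 1), repeat=N))
realized = 0
for values in product((0, 1), repeat=2 ** N):
    table = dict(zip(inputs, values))
    W = train_bpunn(table)
    if not W:
        continue  # J = 0 lies outside the range 1 <= J <= 2^N of Theorem 5
    assert all(bpunn(W, x) == table[x] for x in inputs)
    realized += 1
```

No iteration over training epochs is involved: the weights are read off the truth table, and the number of hidden nodes equals the number of truth assignments.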
C. Implementation of Boolean Functions by BPSNN

Let $x = (x_1, x_2, \ldots, x_N) \in Z_2^N$. A given maxterm has the form [see (6)]

$$M_b(x) = \tilde{x}_1 \vee \tilde{x}_2 \vee \cdots \vee \tilde{x}_N$$

where $\tilde{x}_n$ is either $x_n$ or $\neg x_n$. Let $b = (b_1, b_2, \ldots, b_N) \in Z_2^N$ be the unique false assignment for the maxterm defined in (7). We define a function $\phi(b): Z_2 \to \{-1, 1\}$ as follows:

$$\phi(b) = \begin{cases} 1, & b = 0; \\ -1, & b = 1 \end{cases} \tag{29}$$

and then define a mapping $\Phi(b): Z_2^N \to \{-1, 1\}^N$ by

$$\Phi(b) = \big( \phi(b_1), \phi(b_2), \ldots, \phi(b_N) \big). \tag{30}$$

We define the weight vector $v = (v_1, v_2, \ldots, v_N) \in \{-1, 1\}^N$ as

$$v = \Phi(b). \tag{31}$$

For each maxterm $M_b(x)$, we accordingly define a hidden node of BPSNN with the output

$$K(v, x) = M_b(x) = \begin{cases} 1, & x \neq b; \\ 0, & x = b. \end{cases} \tag{32}$$

Consequently, we have the following theorem.
Theorem 6: For any given maxterm of N variables shown in (6), if the weight vector of a hidden node of BPSNN with N inputs [see (16)] is determined by (31), then the hidden node can implement the maxterm.

Because any Boolean function can be written as a PCNF, i.e., a conjunction of maxterms, a network that can implement PCNFs can realize the Boolean function as well.
Let $f: Z_2^N \to Z_2$ be a given Boolean function with I false assignments $b_1, b_2, \ldots, b_I \in Z_2^N$ ($1 \le I \le 2^N$). By (8) and (9), the vector $b_i = (b_{i,1}, b_{i,2}, \ldots, b_{i,N})$ is the false assignment for the maxterm $M_{b_i}$ ($1 \le i \le I$). According to Theorem 2, we have

$$f(x) = M_{b_1}(x) \wedge M_{b_2}(x) \wedge \cdots \wedge M_{b_I}(x) \tag{33}$$

where $x = (x_1, x_2, \ldots, x_N) \in Z_2^N$.
We consider a BPSNN with I hidden nodes. Let the ith hidden node correspond to the maxterm $M_{b_i}(x)$ ($1 \le i \le I$), and define its output as

$$K(v_i, x) = M_{b_i}(x) = \begin{cases} 1, & x \neq b_i; \\ 0, & x = b_i \end{cases} \tag{34}$$

where $v_i = \Phi(b_i)$.

Then, combining (19), (33), and (34), we have

$$\prod_{i=1}^{I} K(v_i, x) = M_{b_1}(x) \wedge M_{b_2}(x) \wedge \cdots \wedge M_{b_I}(x) = \begin{cases} 1, & x \notin \{b_1, b_2, \ldots, b_I\}; \\ 0, & x \in \{b_1, b_2, \ldots, b_I\}. \end{cases} \tag{35}$$

Subsequently, the following theorem holds.
Theorem 7: Let $f: Z_2^N \to Z_2$ be a given Boolean function with I ($1 \le I \le 2^N$) false assignments. Let a BPSNN be given with N inputs, I hidden nodes, and one output as shown in Fig. 4, with the hidden nodes being determined by (34). Then the Boolean function $f$ can be realized by the resulting BPSNN.

Note that, if I = 0, the outputs of $f$ are all 1's and $f$ has no false assignment. Therefore, according to Theorem 6, BPSNN cannot realize such a function $f$, but BPUNN can (see Theorem 5).
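The dual one-step construction of Theorem 7 can be sketched in the same way: one weight vector per false assignment, obtained from (29) to (31), followed by the forward pass (19). The helper names (`phi`, `train_bpsnn`, `bpsnn`) and the dictionary representation of a truth table are assumptions made for this illustration.

```python
from itertools import product
from math import prod

def g(t):
    return 1 if t > 0 else 0

def k(t):
    return 1 if t == 1 else -1

def k_inv(t):
    return {1: 1, -1: 0}[t]

def bpsnn(V, x):
    """Forward pass (19) with hidden nodes (16)."""
    return prod(
        g(sum(k_inv(v_n * k(x_n)) for v_n, x_n in zip(v, x))) for v in V
    )

def phi(b_n):
    """Component map (29)."""
    return 1 if b_n == 0 else -1

def train_bpsnn(table):
    """One-step training: one weight vector (31) per false assignment."""
    return [tuple(phi(b_n) for b_n in b) for b, value in table.items() if value == 0]

N = 3
inputs = list(product((0, 1), repeat=N))
realized = 0
for values in product((0, 1), repeat=2 ** N):
    table = dict(zip(inputs, values))
    V = train_bpsnn(table)
    if not V:
        continue  # I = 0 lies outside the range 1 <= I <= 2^N of Theorem 7
    assert all(bpsnn(V, x) == table[x] for x in inputs)
    realized += 1
```

Here the number of hidden nodes equals the number of false assignments, so a function with few zeros in its truth table yields a small BPSNN.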
D. Choose BPUNN or BPSNN Such That at Most $2^{N-1}$ Hidden Nodes Are Needed

According to Theorems 5 and 7, when BPUNN (resp. BPSNN) is used to implement a Boolean function of N variables with both truth and false assignments, the number of hidden nodes is equal to the number of truth (resp. false) assignments for the Boolean function. For a given complete truth table, we can therefore choose the smaller of the two networks. For example, for a truth table of N variables with both truth and false assignments, if the number of truth assignments is less than that of false assignments, we should choose BPUNN to implement the Boolean function with at most $2^{N-1}$ hidden nodes, where every hidden node corresponds to a truth assignment. Conversely, BPSNN is the better choice when the number of false assignments is less than that of the truth ones, and then every hidden node corresponds to a false assignment. When the two numbers are equal, either network can be used with exactly $2^{N-1}$ hidden nodes. With this strategy, for arbitrary Boolean functions of N variables with both truth and false assignments, the resulting networks have at most $2^{N-1}$ hidden nodes.

For completeness, we remark that, as pointed out at the ends of Sections V-B and V-C, for the two special Boolean functions with no truth or no false assignment, respectively, $2^N$ rather than $2^{N-1}$ hidden nodes are needed.
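The selection strategy above amounts to comparing two counts. The following sketch uses a hypothetical helper `choose_network`, with a truth table represented as a dictionary; neither comes from the paper. It also confirms that, for N = 3, no function with both truth and false assignments needs more than $2^{N-1} = 4$ hidden nodes.

```python
from itertools import product

def choose_network(table):
    """Return (network name, number of hidden nodes) for a complete truth table."""
    J = sum(table.values())   # number of truth assignments
    I = len(table) - J        # number of false assignments
    if I == 0:
        return "BPUNN", J     # all-ones function: only BPUNN applies
    if J == 0:
        return "BPSNN", I     # all-zeros function: only BPSNN applies
    return ("BPUNN", J) if J <= I else ("BPSNN", I)

N = 3
inputs = list(product((0, 1), repeat=N))
worst = 0
for values in product((0, 1), repeat=2 ** N):
    if 0 < sum(values) < 2 ** N:  # both truth and false assignments present
        table = dict(zip(inputs, values))
        worst = max(worst, choose_network(table)[1])
```

Ties are resolved in favor of BPUNN here, which is an arbitrary choice since both networks then have the same size.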
VI. NUMERICAL EXPERIMENTS

This section compares BPUNN and BPSNN with MLP, SPNN, PSNN, and PUNN for realizing arbitrary Boolean functions. The comparison focuses on the following two aspects: 1) the number of hidden nodes a network needs to realize a Boolean function, and 2) the feasibility of training such a neural network.
The number of Boolean functions of N variables is $2^{2^N}$, which becomes very large as N increases. Therefore, to limit the computational cost, we only consider the case N = 3, for which there are $2^{2^3} = 256$ Boolean functions. All the networks have one hidden layer.
The BP algorithm is applied to train MLP, SPNN, PSNN, and PUNN for realizing arbitrary Boolean functions of three variables. We set the number of hidden nodes of MLP, PSNN, and PUNN from 1 to 10. Zhang et al. discussed the divisibility of SPNN in [23]. To compare with the existing works [10], [11], we adopt the same SPNN. The number of hidden nodes of the presented SPNN is set from 4 to 8. We are mainly concerned with how the order of SPNN influences its performance in realizing Boolean functions. Therefore, we remove the higher order hidden nodes from the fully connected SPNN one after another to form new SPNNs with 4–7 hidden nodes, respectively. In detail, the fully connected SPNN with three inputs has eight hidden nodes [23, Fig. 1]. First, we remove the third-order node to form an SPNN with seven hidden nodes, which has three second-order hidden nodes. We then remove a second-order node from the resulting SPNN to form an SPNN with six hidden nodes. Continuing in this manner, the resulting SPNN with four hidden nodes is actually equivalent to a perceptron network [23].
All initial weights are randomly selected from the interval [−0.5, 0.5]. For any Boolean function and any given number of hidden nodes, we run 30 training procedures to train MLP, SPNN, PSNN, and PUNN. A training procedure is stopped when the error is smaller than the error bound 0.05 (deemed a “success”), or when it reaches the maximum of 10000 training epochs (deemed a “failure”). We record the number of successes in the 30 training procedures and then compute the success rate.
According to Theorems 5 and 7, BPUNN and BPSNN can be trained by one-step methods for realizing arbitrary Boolean functions. In the experiment, the numbers of successfully realized Boolean functions are recorded for BPUNN and BPSNN. Moreover, the cooperation of BPUNN and BPSNN, denoted by “BPUNN&BPSNN,” is considered and the corresponding results are provided as well.
The experimental results are shown in Table I. For example, for MLP with two hidden nodes ($N_H = 2$), four Boolean functions ($N_B = 4$) fail to be realized ($R_S = 0\%$), five (resp. 41) Boolean functions are realized with a success rate between 40% and 83.3% (resp. between 86.7% and 96.7%) over the 30 training procedures, i.e., $R_S \in [40\%, 83.3\%]$ (resp. $R_S \in [86.7\%, 96.7\%]$), and 206 Boolean functions are realized successfully in all the 30 training procedures ($R_S = 100\%$).
Table I shows that MLP cannot successfully realize all the Boolean functions when the number of hidden nodes is less than 10. PSNN behaves better in that it can realize all Boolean functions with three and four hidden nodes by performing up to 30 training procedures in our experiment, but it is not guaranteed to be successful in each training procedure. In fact, this is a difficulty that iterative network training procedures often face. It is guaranteed that there exists a target somewhere, but it is not guaranteed that a particular training procedure finally reaches the target. We also observe another interesting phenomenon: more hidden nodes do not necessarily bring about better convergence for MLP and PSNN. This is related to another well-known fact: when approximating a given mapping by using neural networks, too many hidden nodes may give more trouble than help, just as when approximating a given function by using polynomials of too high an order.

TABLE I
RESULTS OF REALIZING THE BOOLEAN FUNCTIONS OF THREE VARIABLES
(each entry reads $N_B$ ($R_S$ in %); a dash means that no result is listed)

| $N_H$ | MLP | PSNN | PUNN | SPNN | BPUNN | BPSNN | BPUNN&BPSNN |
|---|---|---|---|---|---|---|---|
| 1 | 152 (0); 104 (100) | 152 (0); 104 (100) | 26 (0); 23 (70 ∼ 90); 25 (93.3 ∼ 96.7); 182 (100) | - | 8 (100) | 8 (100) | 16 (100) |
| 2 | 4 (0); 5 (40 ∼ 83.3); 41 (86.7 ∼ 96.7); 206 (100) | 2 (0); 254 (100) | 2 (0); 5 (76.7 ∼ 86.7); 19 (90 ∼ 96.7); 230 (100) | - | 36 (100) | 36 (100) | 72 (100) |
| 3 | 2 (0); 4 (53.3 ∼ 63.3); 14 (90 ∼ 96.7); 236 (100) | 9 (86.7 ∼ 96.7); 247 (100) | 2 (0); 9 (90 ∼ 96.7); 245 (100) | - | 92 (100) | 92 (100) | 184 (100) |
| 4 | 4 (0); 1 (30); 13 (76.7 ∼ 96.7); 238 (100) | 13 (40 ∼ 73.3); 40 (76.7 ∼ 83.3); 60 (86.7 ∼ 90); 143 (93.3 ∼ 100) | 2 (0); 3 (96.7); 251 (100) | 152 (0); 104 (100) | 162 (100) | 162 (100) | 254 (100) |
| 5 | 3 (0); 11 (53.3 ∼ 93.3); 7 (96.7); 235 (100) | 1 (0); 41 (3.3 ∼ 16.7); 105 (20 ∼ 33); 90 (36.7 ∼ 53.3); 19 (56.7 ∼ 73.3) | 2 (0); 1 (96.7); 253 (100) | 110 (0); 146 (100) | 218 (100) | 218 (100) | - |
| 6 | 5 (0); 7 (26.7 ∼ 30); 16 (93.3 ∼ 96.7); 228 (100) | 142 (0); 49 (3.3); 46 (6.7 ∼ 10); 19 (13.3 ∼ 26.7) | 2 (0); 254 (100) | 60 (0); 196 (100) | 246 (100) | 246 (100) | - |
| 7 | 8 (0); 8 (40 ∼ 76.7); 16 (86.7 ∼ 96.7); 224 (100) | 244 (0); 9 (3.3); 3 (6.7) | 2 (0); 254 (100) | 2 (0); 254 (100) | 254 (100) | 254 (100) | - |
| 8 | 9 (0); 4 (30 ∼ 50); 20 (80 ∼ 96.7); 223 (100) | 256 (0) | 2 (0); 254 (100) | 256 (100) | 255 (100) | 255 (100) | 256 (100) |
| 9 | 12 (0); 9 (10 ∼ 86.7); 20 (90 ∼ 96.7); 215 (100) | 256 (0) | 2 (0); 1 (96.7); 253 (100) | - | - | - | - |
| 10 | 13 (0); 11 (3.3 ∼ 83.3); 27 (90 ∼ 96.7); 205 (100) | 256 (0) | 2 (0); 254 (100) | - | - | - | - |

$N_H$: number of hidden nodes. $N_B$: number of Boolean functions. $R_S$: success rate. a ∼ b: between a and b.
From Table I, we can see that PUNN and SPNN perform better than MLP and PSNN. This is because PUNN and SPNN have nonlinearity strong enough to realize Boolean functions of three variables. Moreover, the structures of the presented PUNNs and SPNNs are not complicated, and thus the BP algorithm can train them with a high success rate. PUNN can use relatively few hidden nodes to realize most Boolean functions, and the success rate increases as the number of hidden nodes increases. However, it is actually very difficult to train a PUNN to realize high-dimensional Boolean functions because of its extremely mountainous error surface [31]. As shown in Table I, if a Boolean function can be realized by an SPNN, every training procedure is successful. When the number of hidden nodes is 4, the SPNN is equivalent to a perceptron network, which is also the MLP with one hidden node, and their experimental results are the same. If an SPNN has more higher order hidden nodes, more Boolean functions can be realized by the SPNN. In particular, the SPNN with eight hidden nodes can realize all 256 Boolean functions, which is in accordance with the theoretical results given in [10] and [11].

Fig. 5. Structure of a BPUPS. [Diagram: inputs $x_1, x_2, \ldots, x_N$ feed both the BPUNN hidden nodes $H(w_1, x), \ldots, H(w_J, x)$ and the BPSNN hidden nodes $K(v_1, x), \ldots, K(v_I, x)$, which produce the two outputs $y_1$ and $y_2$ through unit hidden-to-output weights.]
According to Theorems 5 and 7, for a given Boolean function with truth (resp. false) assignments, the hidden-node number of the corresponding BPUNN (resp. BPSNN) is equal to the number of truth (resp. false) assignments of the Boolean function. Therefore, as shown in Table I, the number of Boolean functions that can be realized by BPUNN (resp. BPSNN) increases to 255 as the hidden-node number increases to 8, the only exception being the Boolean function with no truth (resp. false) assignment. Since BPUNN (resp. BPSNN) is trained by the one-step method given in (24) [resp. (31)], every training is successful when the hidden-node number matches the number of truth (resp. false) assignments, and hence its success rate is 100%.
As predicted by our theoretical results in the last section, the last column of Table I shows that only four hidden nodes are needed to realize all the Boolean functions with both truth and false assignments. The two exceptional Boolean functions with no false or no truth assignment can only be realized by BPUNN or BPSNN, respectively, by using eight hidden nodes.
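The counts discussed above for three variables can be checked by plain enumeration. The following Python sketch relies only on the statement of Theorems 5 and 7 as summarized in the text (one BPUNN hidden node per truth assignment, one BPSNN hidden node per false assignment); the function names and the convention that a network with zero hidden nodes is not realizable are assumptions made for illustration.

```python
from itertools import product

N = 3
INPUTS = list(product((0, 1), repeat=N))  # the 2^N input vectors


def hidden_nodes_needed(truth_table):
    """Return (BPUNN nodes, BPSNN nodes) for one complete truth table.

    Assumption from the text: a BPUNN needs one hidden node per truth
    assignment and a BPSNN one per false assignment. None marks the case
    with no such assignment, which that network type cannot realize.
    """
    n_true = sum(truth_table)
    n_false = len(truth_table) - n_true
    return (n_true or None, n_false or None)


def count_realizable(max_hidden, kind):
    """Count Boolean functions of N variables realizable with <= max_hidden nodes."""
    count = 0
    for table in product((0, 1), repeat=len(INPUTS)):
        punn, psnn = hidden_nodes_needed(table)
        if kind == "BPUNN":
            options = [punn]
        elif kind == "BPSNN":
            options = [psnn]
        else:  # "either": choose whichever network needs fewer nodes
            options = [punn, psnn]
        if any(h is not None and h <= max_hidden for h in options):
            count += 1
    return count


bpunn_8 = count_realizable(8, "BPUNN")
either_4 = count_realizable(4, "either")
either_8 = count_realizable(8, "either")
print(bpunn_8, either_4, either_8)
```

Under these assumptions, choosing between the two networks caps the hidden-node count at $2^{N-1} = 4$ for every function that has both truth and false assignments, while the two constant functions still need all eight nodes.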
As a comparison, although SPNN can also realize arbitrary Boolean functions of N variables by one-step training, it always needs $2^N$ hidden nodes (see [10], [11]) to realize any Boolean function.
VII. NETWORK FOR IMPLEMENTING INCOMPLETE TRUTH TABLES
All the inputs and corresponding outputs of a given Boolean function form a complete truth table. Sometimes, however, some assignments are missing from a truth table, which is then called an incomplete truth table. An implementation of an incomplete truth table should give the right response to an input that is in the truth table and recognize an input that is not in it. In this section, we integrate BPUNN and BPSNN into a new network, BPUPS, which meets these requirements.

TABLE II
INCOMPLETE TRUTH TABLE AND THE OUTPUTS OF BPUPS

| $x = (x_1, x_2, x_3)$ | $f(x)$ | $(y_1, y_2)$ |
| $a_1 = (1,1,1)$ | 1 | (1,1) |
| $a_2 = (1,1,0)$ | 1 | (1,1) |
| $a_3 = (0,1,1)$ | 1 | (1,1) |
| $b_1 = (1,0,1)$ | 0 | (0,0) |
| $b_2 = (1,0,0)$ | 0 | (0,0) |
| $b_3 = (0,1,0)$ | 0 | (0,0) |
| $c_1 = (0,0,1)$ | − | (0,1) |
| $c_2 = (0,0,0)$ | − | (0,1) |
For a given incomplete truth table of N variables with J truth assignments and I false assignments ($I + J < 2^N$), we select all the truth (resp. false) assignments to form a set of input vectors, and accordingly construct a BPUNN (resp. BPSNN) as described in Section V. Furthermore, employing the resulting BPUNN and BPSNN, we construct a BPUPS with N inputs, I + J hidden nodes, and two outputs as shown in Fig. 5. The output vector of BPUPS is denoted by $(y_1, y_2)$, where $y_1$ and $y_2$ are, respectively, the outputs of BPUNN and BPSNN. For different kinds of inputs, BPUPS generates different output vectors as follows.
1) When a truth assignment is input into the network, the output vector $(y_1, y_2)$ is (1,1).
2) When a false assignment is input, the output vector $(y_1, y_2)$ is (0,0).
3) For a missing assignment, the output vector $(y_1, y_2)$ is (0,1).
Let us give a simple example to illustrate the details. An incomplete truth table is given in the first two columns of Table II with three truth assignments $\{a_1, a_2, a_3\}$, three false assignments $\{b_1, b_2, b_3\}$, and two missing assignments $\{c_1, c_2\}$.

As described in Section V, we use the truth assignments $\{a_1, a_2, a_3\}$ to construct a BPUNN with three inputs, three hidden nodes, and one output (Fig. 2). By (24), the input-to-hidden weight vectors are specified as follows:

$a_1 = (1,1,1) \mapsto w_1 = (1,1,1)$
$a_2 = (1,1,0) \mapsto w_2 = (1,1,-1)$
$a_3 = (0,1,1) \mapsto w_3 = (-1,1,1).$

Similarly, we use the false assignments $\{b_1, b_2, b_3\}$ to construct a BPSNN with three inputs, three hidden nodes, and one output (Fig. 4). According to (31), we compute the input-to-hidden weights as follows:

$b_1 = (1,0,1) \mapsto v_1 = (-1,1,-1)$
$b_2 = (1,0,0) \mapsto v_2 = (-1,1,1)$
$b_3 = (0,1,0) \mapsto v_3 = (1,-1,1).$

Thus, the BPUNN and the BPSNN are obtained.
Then, we integrate the resulting BPUNN and BPSNN into a BPUPS to implement the incomplete truth table (Fig. 5). For $\{a_1, a_2, a_3\}$, $\{b_1, b_2, b_3\}$, and $\{c_1, c_2\}$, the BPUPS gives different responses to different kinds of inputs, as shown in the last column of Table II. Hence, the BPUPS meets the requirements: it gives the right response to an input that is in the truth table, and it recognizes an input that is not in it.
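The example of Table II can be replayed in a few lines of Python. This is only a sketch under stated assumptions: the weight rules are read off the weight vectors listed in the example, and the weighting operation is assumed to pass an input unchanged for weight 1 and to negate it for weight −1, so that each BPUNN hidden node acts as a conjunction and each BPSNN hidden node as a disjunction of literals. The definitions (14), (19), (24), and (31) themselves are not reproduced here, and all function names are illustrative.

```python
def weights_from_truth(a):
    """BPUNN weights as in the example: component 1 -> 1, component 0 -> -1."""
    return tuple(1 if bit == 1 else -1 for bit in a)


def weights_from_false(b):
    """BPSNN weights as in the example: component 1 -> -1, component 0 -> 1."""
    return tuple(-1 if bit == 1 else 1 for bit in b)


def literal(weight, x):
    """Assumed weighting operation: weight 1 passes x, weight -1 negates it."""
    return x if weight == 1 else 1 - x


def bpunn(ws, x):
    """OR over hidden nodes; each hidden node is a product (AND) of literals."""
    return int(any(all(literal(w, xi) for w, xi in zip(wj, x)) for wj in ws))


def bpsnn(vs, x):
    """AND over hidden nodes; each hidden node is a sum (OR) of literals."""
    return int(all(any(literal(v, xi) for v, xi in zip(vi, x)) for vi in vs))


def bpups(ws, vs, x):
    """Output vector (y1, y2) of the combined network."""
    return (bpunn(ws, x), bpsnn(vs, x))


truth = [(1, 1, 1), (1, 1, 0), (0, 1, 1)]    # a1, a2, a3
false = [(1, 0, 1), (1, 0, 0), (0, 1, 0)]    # b1, b2, b3
missing = [(0, 0, 1), (0, 0, 0)]             # c1, c2

W = [weights_from_truth(a) for a in truth]
V = [weights_from_false(b) for b in false]
outputs = {x: bpups(W, V, x) for x in truth + false + missing}

for x, y in outputs.items():
    print(x, y)
```

With these assumed semantics, the BPUNN fires only on the listed truth assignments and the BPSNN is switched off only by the listed false assignments, which is why a missing assignment produces the mixed output vector.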
VIII. LEARNABILITY OF BINARY HONNS

The discussion in the former sections is based on a deterministic scenario, where the target Boolean function can be evaluated deterministically. This section is concerned with the behavior of our models from a probabilistic point of view, when the target Boolean function cannot be evaluated deterministically. In particular, the risk bounds and the learnability of BPUNN and BPSNN are investigated.

A. Risk Bounds of BPUNN and BPSNN
Let us reframe our models BPUNN and BPSNN in the learning theory framework. Define $Z := X \times Y$, where X is an input space and Y is its corresponding output space. Since this paper is restricted to binary issues, we choose $X \subseteq Z_2^M$ and $Y \subseteq Z_2$. Let the pair $z = (x, y) \in Z$ be a random variable distributed according to a certain probability distribution P(z). Given a function class $\Psi$, we wish to find a function $T \in \Psi : X \to Y$ that predicts, for any given input x, the corresponding output y. A natural criterion for choosing this function T is a low probability of error $\Pr(T(x) \neq y)$. Thus, we define the expected risk of a function $\psi \in \Psi$ as

$$E_P(\psi) := \int 1_{\psi(x) \neq y} \, dP(z) \qquad (36)$$

where

$$1_{T(x) \neq y} = \begin{cases} 1, & \text{if } T(x) \neq y; \\ 0, & \text{otherwise.} \end{cases} \qquad (37)$$

Then, the desired function T satisfies

$$E_P(T) = \min_{\psi \in \Psi} E_P(\psi). \qquad (38)$$
Theorem 5 (resp. Theorem 7) provides a one-step method to obtain a BPUNN (resp. BPSNN) based on a given truth table. The resulting networks minimize the expected risk (36) in the special deterministic case where each input x corresponds deterministically to an output value y.
Generally, if the distribution P(z) is unknown, the target function T cannot be obtained directly by minimizing (36). Instead, we can use empirical risk minimization to handle this issue. Given a function class $\Psi$ and an independently and identically distributed (i.i.d.) sample set $S_M = \{z_m\}_{m=1}^{M}$ drawn from Z with $z_m = (x_m, y_m)$, we define the empirical risk of $\psi \in \Psi$ by

$$E_M(\psi) := \frac{1}{M} \sum_{m=1}^{M} 1_{\psi(x_m) \neq y_m} \qquad (39)$$

which is an approximation to the expected risk (36). We then choose a $\psi_M$ that minimizes the empirical risk over $\Psi$ and regard $\psi_M$ as an estimate of T with respect to the sample set $S_M$.
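Empirical risk minimization as in (39) can be illustrated with a small Python sketch. The candidate functions and the noisy sample below are made up for illustration and are not taken from the paper's experiments; the helper names are likewise hypothetical.

```python
def empirical_risk(psi, samples):
    """Fraction of samples (x, y) with psi(x) != y, following (39)."""
    return sum(1 for x, y in samples if psi(x) != y) / len(samples)


def erm(candidates, samples):
    """Return a candidate with the smallest empirical risk (first one on ties)."""
    return min(candidates, key=lambda psi: empirical_risk(psi, samples))


def f_and(x):
    return x[0] & x[1]


def f_or(x):
    return x[0] | x[1]


def f_xor(x):
    return x[0] ^ x[1]


# Hypothetical noisy sample: mostly consistent with OR, one flipped label.
samples = [
    ((0, 0), 0),
    ((0, 1), 1),
    ((1, 0), 1),
    ((1, 1), 1),
    ((1, 1), 1),
    ((1, 1), 0),  # noisy observation
]

risks = {f.__name__: empirical_risk(f, samples) for f in (f_and, f_or, f_xor)}
best = erm([f_and, f_or, f_xor], samples)
print(risks, best.__name__)
```

Because one label is noisy, no candidate reaches zero empirical risk; the minimizer is simply the function that disagrees with the fewest samples.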
Therefore, we have to consider the asymptotic behavior of $E_P(\psi_M) - E_P(T)$ as the sample number M goes to infinity (see [40], [41]). Since $E_M(T) - E_M(\psi_M) \geq 0$, we have

$$\begin{aligned} E_P(\psi_M) &= E_P(\psi_M) - E_P(T) + E_P(T) \\ &\leq E_M(T) - E_M(\psi_M) + E_P(\psi_M) - E_P(T) + E_P(T) \\ &\leq 2 \sup_{\psi \in \Psi} \left| E_P(\psi) - E_M(\psi) \right| + E_P(T). \end{aligned}$$

Thus, we have

$$0 \leq E_P(\psi_M) - E_P(T) \leq 2 \sup_{\psi \in \Psi} \left| E_P(\psi) - E_M(\psi) \right|.$$

Therefore, the upper bound

$$\sup_{\psi \in \Psi} \left| E_P(\psi) - E_M(\psi) \right| \qquad (40)$$

becomes a major concern in statistical learning theory, and is called the risk bound.
The risk bound measures the probability that a function produced by an algorithm has a sufficiently small error and is usually used to analyze the learnability of function classes. It is well known that risk bounds can be obtained by incorporating complexity measures of function classes [40]. Since this paper concerns binary issues, we only adopt the complexity measures for function classes with the range {0,1}.
Let us define

$$\psi_{S_M} := \left( \psi(z_1), \psi(z_2), \ldots, \psi(z_M) \right) \qquad (41)$$

and

$$\Lambda(\Psi, S_M) := \left\{ \psi_{S_M} \mid \psi \in \Psi \right\}. \qquad (42)$$

Then, the growth function is defined by

$$G(\Psi, M) := \max_{S_M \in Z^M} \left| \Lambda(\Psi, S_M) \right| \qquad (43)$$

where $|S|$ stands for the cardinality of a set S. Since the range of $\Psi$ is {0,1}, we have $G(\Psi, M) \leq 2^M$. Therefore, the VC dimension of the function class $\Psi$ can be defined as follows [41].

Definition 6: Let $\Psi$ be a function class with the range {0,1}. Then, the VC dimension of $\Psi$ is defined as

$$\mathrm{VCdim}(\Psi) := \max \left\{ M > 0 \mid G(\Psi, M) = 2^M \right\}. \qquad (44)$$
In this paper, we consider the function class $\Psi_J$ ($J \geq 1$) composed of BPUNNs (resp. BPSNNs) with J hidden nodes defined in (14) [resp. (19)]. According to (42), (43), and Definition 6, we obtain the following theorem on the complexity of the function class $\Psi_J$. The proof of this theorem is presented in Appendix A.

Theorem 8: Let $\Psi_J$ ($J \geq 1$) be the function set of BPUNNs or BPSNNs with J hidden nodes. Then, we have

$$\left| \Lambda(\Psi_J, S_M) \right| \leq G(\Psi_J, M) \leq \sum_{j=0}^{J} C_M^j \qquad (45)$$

and

$$\mathrm{VCdim}(\Psi_J) \leq J \qquad (46)$$

where $C_M^j$ stands for a binomial coefficient.
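For very small sizes, the bound (45) can be compared with a brute-force count. The Python sketch below makes two simplifying assumptions that are not taken from the paper: each BPUNN hidden node is modeled as a conjunction of literals (weight 1 passes an input, weight −1 negates it) with the output being the disjunction of the hidden nodes, and the count is done over network outputs on samples of M distinct inputs rather than over general samples in $Z^M$. The function names are illustrative.

```python
from itertools import combinations, product
from math import comb

N, J = 3, 2  # three inputs, two hidden nodes

INPUTS = list(product((0, 1), repeat=N))
# Every assignment of weights in {-1, 1} to the J hidden nodes.
ALL_WEIGHTS = list(product(product((-1, 1), repeat=N), repeat=J))


def bpunn_output(weights, x):
    """Assumed BPUNN semantics: OR over hidden nodes of AND over literals."""
    return int(any(
        all((xi if w == 1 else 1 - xi) for w, xi in zip(wj, x))
        for wj in weights
    ))


def restrictions(sample):
    """Distinct output vectors produced on a sample by all weight choices."""
    return {tuple(bpunn_output(ws, x) for x in sample) for ws in ALL_WEIGHTS}


def growth(M):
    """Largest number of distinct output vectors over samples of M distinct inputs."""
    return max(len(restrictions(s)) for s in combinations(INPUTS, M))


def binomial_bound(M):
    """Right-hand side of (45)."""
    return sum(comb(M, j) for j in range(J + 1))


for M in (2, 3, 4):
    print(M, growth(M), binomial_bound(M), 2 ** M)
```

In this toy setting two points can be labeled in all four ways, but three points cannot be labeled in all eight ways, which agrees with the statement that the VC dimension is at most J.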
In the next theorem, the risk bounds of BPUNN and BPSNN are given in terms of the VC dimension. The proof of this theorem is postponed to Appendix B.

Theorem 9: Let $\Psi_J$ ($J \geq 1$) be a set composed of BPUNNs or BPSNNs with J hidden nodes, and $S_{2M} = \{z_m\}_{m=1}^{2M}$ be an i.i.d. sample set drawn from Z. Then, for any $\xi > 0$ such that $M\xi^2 \geq 2$, we have

$$\Pr\left\{ \sup_{\psi \in \Psi_J} \left| E_P(\psi) - E_M(\psi) \right| > \xi \right\} \leq 2 \exp\left( \ln\left( \sum_{j=0}^{J} C_{2M}^j \right) - \frac{M\xi^2}{8} \right). \qquad (47)$$
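To get a feel for the magnitude of the right-hand side of (47), it can be evaluated numerically. The Python sketch below simply computes the expression as stated; the function name is illustrative, and capping the value at 1 (a probability bound above 1 carries no information) is a choice made here rather than something taken from the paper.

```python
from math import comb, exp, log


def risk_bound_rhs(J, M, xi):
    """Right-hand side of (47), capped at 1.

    J: number of hidden nodes, M: sample number, xi: deviation threshold.
    The condition M * xi^2 >= 2 from Theorem 9 is checked explicitly.
    """
    if M * xi ** 2 < 2:
        raise ValueError("Theorem 9 requires M * xi^2 >= 2")
    # Logarithm of the binomial sum; math.log accepts large Python integers.
    log_growth = log(sum(comb(2 * M, j) for j in range(J + 1)))
    exponent = log(2.0) + log_growth - M * xi ** 2 / 8.0
    return 1.0 if exponent >= 0 else exp(exponent)


small = risk_bound_rhs(J=4, M=20_000, xi=0.1)
large = risk_bound_rhs(J=4, M=100_000, xi=0.1)
print(small, large)
```

For a fixed number of hidden nodes the logarithmic term grows only like $J \ln M$ while the negative term grows linearly in M, so the bound is uninformative for moderate sample numbers and then decays quickly.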
B. Learnability of BPUNN and BPSNN

We now study the learnability of BPUNN and BPSNN in the probably approximately correct (PAC) learning framework [42]. In this framework, under the assumption that the samples are drawn from an arbitrary distribution, the main concern is whether there exists a learning algorithm that, with high probability, yields an estimate of the target function related to the distribution. Anthony [13] applied this framework to neural networks and analyzed the PAC-learnability of some traditional neural networks.
Let $\Psi$ be a function class and $S_M = \{z_m\}_{m=1}^{M}$ be an i.i.d. sample set drawn from an arbitrary probability distribution on Z. Let us regard a learning algorithm as a function $L : S^* \to \Psi$, where $S^*$ is the set composed of all the possible sample sets, i.e., of all the subsets of the input space. Then, for instance, $L(S_M)$ denotes the learning result with respect to the sample set $S_M$. Define the smallest error over $\Psi$ as

$$E_P^*(\Psi) := \inf \{ E_P(\psi) : \psi \in \Psi \}.$$

Then, the PAC-learnability of the function class $\Psi$ can be formalized as follows.
Definition 7: Let $\Psi$ be a function class with the range {0,1}. We say that $\Psi$ is PAC-learnable if there is a learning algorithm L such that, for any $\delta, \xi > 0$, there exists a number $M(\delta, \xi)$ satisfying

$$\Pr\left\{ \sup_{S_M \in Z^M} \left| E_P(L(S_M)) - E_P^*(\Psi) \right| > \xi \right\} \leq \delta \qquad (48)$$

for any $M > M(\delta, \xi)$ and any probability distribution P on Z.

An equivalent expression of (48) is that, for any probability distribution P on Z,

$$\lim_{M \to +\infty} \Pr\left\{ \sup_{S_M \in Z^M} \left| E_P(L(S_M)) - E_P^*(\Psi) \right| > \xi \right\} = 0. \qquad (49)$$
The next theorem confirms the PAC-learnability of BPUNN and BPSNN. Its proof is given in Appendix C.

Theorem 10: Let $\Psi_J$ ($J \geq 1$) be a set composed of BPUNNs or BPSNNs with J hidden nodes. Then, $\Psi_J$ is PAC-learnable.
C. Learnability of BPUPS

At the end of this section, we discuss the learnability of BPUPS, which is proposed in Section VII to implement incomplete truth tables. For convenience, we reform the output $(y_1, y_2)$ of BPUPS (Fig. 5) as follows:

$$y = y_1 + y_2. \qquad (50)$$

It is clear that $y \in \{0, 1, 2\}$. By combining Fig. 5 and (50), for different kinds of inputs, the reformed BPUPS generates the following outputs.

1) If a truth assignment is input into the network, the output y equals 2.
2) If a false assignment is input, the output y equals 0.
3) For a missing assignment, the output y equals 1.

Therefore, the reformed BPUPS is equivalent to the original BPUPS from a functional perspective. Based on the reformed BPUPS, we study the risk bound of BPUPS and its learnability. In the rest of this section, the reformed BPUPS is called BPUPS if no confusion arises.
We consider a function class $\Psi_{I,J}$ ($I, J \geq 1$) composed of BPUPS with I + J hidden nodes shown in Fig. 5. Similar to Theorem 8, we have the following result on the complexity of $\Psi_{I,J}$.

Theorem 11: Let $\Psi_{I,J}$ ($I, J \geq 1$) be the function set of BPUPS with I + J hidden nodes shown in Fig. 5. Then, we have

$$\left| \Lambda(\Psi_{I,J}, S_M) \right| \leq G(\Psi_{I,J}, M) \leq \sum_{j=0}^{I+J} C_M^j. \qquad (51)$$

Subsequently, following the style of Theorem 9, we obtain the risk bound of BPUPS as follows.

Theorem 12: Assume that $\Psi_{I,J}$ ($I, J \geq 1$) is a function set of BPUPS with I + J hidden nodes shown in Fig. 5. Let $S_{2M} = \{z_m\}_{m=1}^{2M}$ be an i.i.d. sample set drawn from Z. Then, for any $\xi > 0$ such that $M\xi^2 \geq 8$, we have

$$\Pr\left\{ \sup_{\psi \in \Psi_{I,J}} \left| E_P(\psi) - E_M(\psi) \right| > \xi \right\} \leq 2 \exp\left( \ln\left( \sum_{j=0}^{I+J} C_{2M}^j \right) - \frac{M\xi^2}{32} \right). \qquad (52)$$

The above theorem can be proved in the same way as Theorem 9, and thus we omit the proof. According to Theorem 12, we have the following result on the PAC-learnability of BPUPS.

Theorem 13: Let $\Psi_{I,J}$ ($I, J \geq 1$) be a function set of BPUPS with I + J hidden nodes shown in Fig. 5. Then, $\Psi_{I,J}$ is PAC-learnable.

We omit the proof of Theorem 13 because it is similar to that of Theorem 10.
IX. CONCLUSION

In this paper, we proposed two binary HONNs: BPUNN and BPSNN. They both correspond to a functionally complete set {∨,∧,¬} that provides a sufficient condition for realizing arbitrary Boolean functions (see Definition 1). Based on this point, we theoretically proved that BPUNN and BPSNN can realize arbitrary Boolean functions. Numerical experiments were provided to show the excellent performance of BPUNN and BPSNN for realizing Boolean functions compared with other neural networks.

The structures of BPUNN and BPSNN were respectively derived from those of PUNN and PSNN. The original network structures were modified so as to make them correspond to the functionally complete set {∨,∧,¬}. In particular, the logical disjunction “∨” and the logical conjunction “∧” were implemented by the addition “σ” and the multiplication “π,” respectively. A key point of our approach was to implement the unary logical operator “¬” by using a special input-to-hidden weighting operation (Figs. 1 and 3). By using such structures, all weights of BPUNN and BPSNN can be chosen simply as either 1 or −1 by applying a one-step training.
Our approach could deal with both complete and incomplete truth tables. Given a complete truth table of N variables with both truth and false assignments, the corresponding Boolean function could be realized by accordingly choosing a BPUNN or a BPSNN such that at most $2^{N-1}$ hidden nodes were needed. The two exceptional Boolean functions with $2^N$ truth assignments but no false assignment, or with $2^N$ false assignments but no truth assignment, could be realized by BPUNN or BPSNN, respectively, with $2^N$ hidden nodes. On the other hand, a new network BPUPS based on a collaboration of BPUNN and BPSNN was proposed to deal with incomplete truth tables. Numerical experiments were also provided to support our theoretical results.

The above conclusion was based on a deterministic scenario, where the target Boolean function is deterministic. We also considered a more general scenario where the samples could be evaluated with certain probability. We showed that the VC dimension of the function class composed of BPUNN or BPSNN with J hidden nodes is at most J, and then obtained the risk bounds of BPUNN and BPSNN based on the VC dimension. The PAC-learnability of BPUNN and BPSNN was confirmed by using the risk bounds. Finally, we presented the risk bound of BPUPS and then studied the learnability of BPUPS. It should be pointed out that these results are preliminary and further discussions remain to be done for BPUPS in future works.
APPENDIX A
PROOF OF THEOREM 8

We only prove the theorem for BPUNN. The proof for BPSNN can be done analogously.

Proof of Theorem 8: According to (42) and (43), it is clear that $|\Lambda(\Psi_J, S_M)| \leq G(\Psi_J, M)$.

Next, assume that the function class $\Psi_J^*$ contains the BPUNNs with all possible weights. By Theorem 5, $\Psi_J^*$ is equivalent to the set of all Boolean functions with at most J truth assignments. We then separate $\Psi_J^*$ into J + 1 disjoint subsets $\Psi^{(j)}$ ($0 \leq j \leq J$) such that

$$\Psi_J^* = \bigcup_{j=0}^{J} \Psi^{(j)} \qquad (53)$$

where $\Psi^{(j)}$ is composed of the J-hidden-node BPUNNs corresponding to the Boolean functions with j truth assignments. According to (41)–(43) and (53), for any $0 \leq j \leq J$, we have

$$G(\Psi^{(j)}, M) = C_M^j \qquad (54)$$

and then

$$G(\Psi_J^*, M) = G\left( \bigcup_{j=0}^{J} \Psi^{(j)}, M \right) = \sum_{j=0}^{J} G(\Psi^{(j)}, M) = \sum_{j=0}^{J} C_M^j. \qquad (55)$$

Noting that $\Psi_J \subseteq \Psi_J^*$ implies $G(\Psi_J, M) \leq G(\Psi_J^*, M)$, we get (45).

Finally, (46) results from Definition 6 and (45). This completes the proof.
APPENDIX B
PROOF OF THEOREM 9

In order to prove the theorem, we need the following lemmas.

Lemma 1 [40]: Assume that $\Psi$ is a function class with the range {0,1}, and let $S_M$, $S'_M$ be two i.i.d. sample sets both drawn from Z, with empirical risks $E_M$ and $E'_M$, respectively. Then, for any $\xi > 0$ such that $M\xi^2 \geq 2$,

$$\Pr\left\{ \sup_{\psi \in \Psi} \left| E_P(\psi) - E_M(\psi) \right| > \xi \right\} \leq 2 \Pr\left\{ \sup_{\psi \in \Psi} \left| E'_M(\psi) - E_M(\psi) \right| > \frac{\xi}{2} \right\}. \qquad (56)$$
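Based on Lemma 1, we can obtain the following result.

Lemma 2: Let $\Psi$ be an indicator function class with the range {0,1} and $S_{2M} = \{z_m\}_{m=1}^{2M}$ be a set of 2M i.i.d. samples drawn from Z. Then, for any $\xi > 0$ such that $M\xi^2 \geq 2$, we have

$$\Pr\left\{ \sup_{\psi \in \Psi} \left| E_P(\psi) - E_M(\psi) \right| > \xi \right\} \leq 2\, \mathbb{E}\left| \Lambda(\Psi, S_{2M}) \right| \max_{\psi \in \Psi} \Pr\left\{ \left| E_P(\psi) - E_M(\psi) \right| > \frac{\xi}{4} \right\}. \qquad (57)$$

Proof: According to Lemma 1, for any $\xi > 0$ such that $M\xi^2 \geq 2$, we have

$$\begin{aligned} \Pr\left\{ \sup_{\psi \in \Psi} \left| E_P(\psi) - E_M(\psi) \right| > \xi \right\} &\leq 2 \Pr\left\{ \sup_{\psi \in \Psi} \left| E'_M(\psi) - E_M(\psi) \right| > \frac{\xi}{2} \right\} \\ &\leq 2\, \mathbb{E}\left| \Lambda(\Psi, S_{2M}) \right| \max_{\psi \in \Psi} \Pr\left\{ \left| E'_M(\psi) - E_M(\psi) \right| > \frac{\xi}{2} \right\} \\ &\leq 2\, \mathbb{E}\left| \Lambda(\Psi, S_{2M}) \right| \max_{\psi \in \Psi} \Pr\left\{ \left| E_P(\psi) - E'_M(\psi) \right| + \left| E_P(\psi) - E_M(\psi) \right| > \frac{\xi}{2} \right\}. \end{aligned} \qquad (58)$$

Since $S_M$ and $S'_M$ are both independently drawn from an identical distribution, according to (58), we have

$$\Pr\left\{ \sup_{\psi \in \Psi} \left| E_P(\psi) - E_M(\psi) \right| > \xi \right\} \leq 2\, \mathbb{E}\left| \Lambda(\Psi, S_{2M}) \right| \max_{\psi \in \Psi} \Pr\left\{ \left| E_P(\psi) - E_M(\psi) \right| > \frac{\xi}{4} \right\}. \qquad (59)$$

This completes the proof.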
Next, we introduce a concentration inequality for the i.i.d. learning process.

Lemma 3 [40]: Let $z_1, \ldots, z_M$ be M i.i.d. random variables with $\psi(z) \in [a, b]$. Then, for all $\xi > 0$, we have

$$\Pr\left\{ \left| E_M(\psi) - E_P(\psi) \right| > \xi \right\} \leq 2 \exp\left( -\frac{2M\xi^2}{(b - a)^2} \right). \qquad (60)$$

Now, we are ready to prove the theorem.

Proof of Theorem 9: According to Theorem 8 and Lemmas 2 and 3, we can obtain (47). This completes the proof.
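APPENDIX C
PROOF OF THEOREM 10

In order to prove the theorem, we need the following lemma.

Lemma 4: For any $M \geq J$,

$$\sum_{j=0}^{J} C_M^j \leq \left( \frac{Me}{J} \right)^J. \qquad (61)$$

Proof: Since $M \geq J$, we have

$$\begin{aligned} \sum_{j=0}^{J} C_M^j &\leq \left( \frac{M}{J} \right)^J \sum_{j=0}^{J} C_M^j \left( \frac{J}{M} \right)^j \\ &\leq \left( \frac{M}{J} \right)^J \sum_{j=0}^{M} C_M^j \left( \frac{J}{M} \right)^j \\ &\leq \left( \frac{M}{J} \right)^J \left( 1 + \frac{J}{M} \right)^M \\ &\leq \left( \frac{M}{J} \right)^J e^J. \end{aligned}$$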
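The inequality (61) is easy to spot-check numerically. The Python sketch below compares both sides over a small grid of values with $M \geq J$; the grid limits and function names are arbitrary choices made for illustration, and a finite check of this kind does not replace the proof.

```python
from math import comb, e


def binomial_sum(M, J):
    """Left-hand side of (61): sum of C(M, j) for j = 0..J."""
    return sum(comb(M, j) for j in range(J + 1))


def exponential_bound(M, J):
    """Right-hand side of (61): (M * e / J) ** J."""
    return (M * e / J) ** J


# Every pair (M, J) with 1 <= J <= 7 and J <= M < 60.
grid = [(M, J) for J in range(1, 8) for M in range(J, 60)]
all_hold = all(binomial_sum(M, J) <= exponential_bound(M, J) for M, J in grid)

# Ratio of the two sides for one fixed J, to see how loose the bound is.
ratios = [exponential_bound(M, 3) / binomial_sum(M, 3) for M in (3, 10, 50)]
print(all_hold, ratios)
```

The tightest case on this grid is $M = J$, where the left-hand side equals $2^J$ and the right-hand side equals $e^J$.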
Next, we begin to prove Theorem 10.

Proof of Theorem 10: By combining (47), Theorem 8, and Lemma 4, we obtain, if $M > J/2$,

$$\Pr\left\{ \sup_{\psi \in \Psi_J} \left| E_P(\psi) - E_M(\psi) \right| > \xi \right\} \leq 2 \exp\left\{ J \left[ \ln(2Me) - \ln J \right] - \frac{M\xi^2}{8} \right\}. \qquad (62)$$

Let $\psi^* \in \Psi_J$ be the function satisfying $E_P(\psi^*) = E_P^*(\Psi_J)$. For any i.i.d. sample set $S_M$ drawn from Z, we have $E_M(\psi^*) - E_M(L(S_M)) \geq 0$, and then

$$\begin{aligned} E_P(L(S_M)) &= E_P(L(S_M)) - E_P^*(\Psi_J) + E_P^*(\Psi_J) \\ &\leq E_M(\psi^*) - E_M(L(S_M)) + E_P(L(S_M)) - E_P(\psi^*) + E_P(\psi^*) \\ &\leq 2 \sup_{\psi \in \Psi_J} \left| E_M(\psi) - E_P(\psi) \right| + E_P(\psi^*). \end{aligned}$$

Namely, for any i.i.d. sample set $S_M$, there holds

$$\left| E_P(L(S_M)) - E_P(\psi^*) \right| \leq 2 \sup_{\psi \in \Psi_J} \left| E_M(\psi) - E_P(\psi) \right|. \qquad (63)$$

Therefore, according to (62) and (63), we have, if $M > J/2$,

$$\begin{aligned} \Pr\left\{ \sup_{S_M \in Z^M} \left| E_P(L(S_M)) - E_P^*(\Psi_J) \right| > \xi \right\} &\leq \Pr\left\{ 2 \sup_{\psi \in \Psi_J} \left| E_M(\psi) - E_P(\psi) \right| > \xi \right\} \\ &\leq 2 \exp\left\{ J \left[ \ln(2Me) - \ln J \right] - \frac{M\xi^2}{32} \right\} \\ &= 2 \exp\left\{ \left[ \frac{\ln(2e) + \ln(M/J)}{M/J} - \frac{\xi^2}{32} \right] M \right\}. \end{aligned} \qquad (64)$$

We see clearly that, for a fixed J, the right-hand side of the above inequality goes to 0 as $M \to \infty$. This completes the proof.
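The right-hand side of (64) also gives a concrete, if conservative, sample number for a target confidence. The Python sketch below searches for the smallest M beyond the point where the bound starts decreasing such that the bound falls below a chosen δ. The function names are illustrative, and the returned value is only sufficient with respect to the bound as stated; it is not claimed to be the smallest sample number that suffices for learning.

```python
from math import ceil, e, log


def log_pac_bound(J, M, xi):
    """Natural logarithm of the right-hand side of (64), valid for M > J/2."""
    return log(2.0) + J * (log(2.0 * M * e) - log(J)) - M * xi ** 2 / 32.0


def sufficient_sample_size(J, xi, delta):
    """Smallest M >= 32 J / xi^2 for which the bound (64) is at most delta.

    The exponent has derivative J / M - xi^2 / 32 in M, so the bound is
    decreasing from M = 32 J / xi^2 onward and bisection is applicable.
    """
    lo = max(J, ceil(32 * J / xi ** 2))
    target = log(delta)
    if log_pac_bound(J, lo, xi) <= target:
        return lo
    hi = 2 * lo
    while log_pac_bound(J, hi, xi) > target:   # grow until the bound holds
        hi *= 2
    while hi - lo > 1:                          # invariant: lo fails, hi holds
        mid = (lo + hi) // 2
        if log_pac_bound(J, mid, xi) <= target:
            hi = mid
        else:
            lo = mid
    return hi


M_needed = sufficient_sample_size(J=4, xi=0.1, delta=0.05)
print(M_needed)
```

Working with the logarithm of the bound avoids floating-point overflow when the exponent is large and positive, which happens for small M.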
ACKNOWLEDGMENT

The authors are grateful to the anonymous reviewers and the editors for their valuable comments and suggestions.
REFERENCES

[1] V. Deolalikar, “Mapping Boolean functions with neural networks having binary weights and zero thresholds,” IEEE Trans. Neural Netw., vol. 12, no. 3, pp. 639–642, May 2001.
[2] F. Chen and G. Chen, “Realization and bifurcation of Boolean functions via cellular neural networks,” Int. J. Bifur. Chaos, vol. 15, no. 7, pp. 2109–2129, 2005.
[3] I. K. Sethi and J. H. Yoo, “Symbolic mapping of neurons in feedforward networks,” Pattern Recognit. Lett., vol. 17, no. 10, pp. 1035–1046, Sep. 1996.
[4] W. S. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,” Bullet. Math. Biol., vol. 5, no. 4, pp. 115–133, Dec. 1943.
[5] S. E. Hampson and D. J. Volper, “Disjunctive models of Boolean category learning,” Biol. Cybern., vol. 56, nos. 2–3, pp. 121–137, May 1987.
[6] M. L. Johnson, “Perceptron-how this neural network model lets you evaluate Boolean functions,” IEEE Potent., vol. 12, no. 3, pp. 17–18, Oct. 1993.
[7] Y. Shin and J. Ghosh, “Realization of Boolean functions using binary Pi-sigma networks,” in Proc. Artif. Neural Netw. Eng., St. Louis, MO, Nov. 1991, pp. 1–6.
[8] M. Schmitt, “On computing Boolean functions by a spiking neuron,” Ann. Math. Artif. Intell., vol. 24, nos. 1–4, pp. 181–191, 1998.
[9] F. Friedrichs and M. Schmitt, “On the power of Boolean computations in generalized RBF neural networks,” Neurocomputing, vol. 63, pp. 483–498, Jan. 2005.
[10] G. Fahner, “A higher order unit that performs arbitrary Boolean functions,” in Proc. Int. Joint Conf. Neural Netw., vol. 3. San Diego, CA, Jun. 1990, pp. 193–197.
[11] J. Y. Li, T. W. S. Chow, and Y. L. Yu, “The estimation theory and optimization algorithm for the number of hidden units in the higher order feedforward neural network,” in Proc. IEEE Int. Conf. Neural Netw., vol. 3. Perth, Australia, Nov.–Dec. 1995, pp. 1229–1233.
[12] F. Y. Chen, G. L. He, X. B. Xu, and G. R. Chen, “Implementation of arbitrary Boolean functions via CNN,” in Proc. 10th Int. Workshop Cell. Neural Netw. Appl., 2006, pp. 1–6.
[13] M. Anthony, “Boolean functions and artificial neural networks,” Dept. Math., London Sch. Econ. Polit. Sci., London, U.K., Res. Rep. LSE-CDAM-2003-01, Jan. 2003.
[14] X. Shenshu, Z. Zhaoying, Z. Limin, and Z. Wendong, “Approximation to Boolean functions by neural networks with applications to thinning algorithms,” in Proc. 17th IEEE Conf. Instrum. Meas. Technol., vol. 2. Baltimore, MD, May 2000, pp. 1004–1008.
[15] P. Auer, H. Burgsteiner, and W. Maass, “A learning rule for very simple universal approximators consisting of a single layer of perceptrons,” Neural Netw., vol. 21, no. 5, pp. 786–795, 2008.
[16] D. Cheng, “Input-state approach to Boolean networks,” IEEE Trans. Neural Netw., vol. 20, no. 3, pp. 512–521, Mar. 2009.
[17] D. Cheng and H. Qi, “State–space analysis of Boolean networks,” IEEE Trans. Neural Netw., vol. 21, no. 4, pp. 584–594, Apr. 2010.
[18] F. Chen, G. Chen, G. He, X. Xu, and Q. He, “Universal perceptron and DNA-like learning algorithm for binary neural networks: Non-LSBF implementation,” IEEE Trans. Neural Netw., vol. 20, no. 8, pp. 1645–1658, Aug. 2009.
[19] F. Chen, G. Chen, Q. He, G. He, and X. Xu, “Universal perceptron and DNA-like learning algorithm for binary neural networks: Non-LSBF implementation,” IEEE Trans. Neural Netw., vol. 20, no. 8, pp. 1293–1301, Aug. 2009.
[20] R. Johnsonbaugh, Discrete Mathematics, 5th ed. London, U.K.: Pearson, 2000.
[21] D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing, Explorations in the Microstructure of Cognition. Cambridge, MA: MIT Press, 1986.
[22] B. Lenze, “Note on interpolation on the hypercube by means of sigma–pi neural networks,” Neurocomputing, vol. 61, pp. 471–478, Oct. 2004.
[23] C. Zhang, W. Wu, and Y. Xiong, “Convergence analysis of batch gradient algorithm for three classes of sigma-pi neural networks,” Neural Process. Lett., vol. 26, no. 3, pp. 177–189, Dec. 2007.
[24] Y. Shin and J. Ghosh, “The pi-sigma network: An efficient higher-order neural network for pattern classification and function approximation,” in Proc. Int. Joint Conf. Neural Netw., vol. 1. Seattle, WA, Jul. 1991, pp. 13–18.
[25] Y. Xiong, W. Wu, X. Kang, and C. Zhang, “Training pi-sigma network by online gradient algorithm with penalty for small weight update,” Neural Comput., vol. 19, no. 12, pp. 3356–3368, Dec. 2007.
[26] A. J. Hussain and P. Liatsis, “Recurrent pi-sigma networks for DPCM image coding,” Neurocomputing, vol. 55, nos. 1–2, pp. 363–382, Sep. 2003.
[27] R. Durbin and D. E. Rumelhart, “Product units: A computationally powerful and biologically plausible extension to backpropagation networks,” Neural Comput., vol. 1, no. 1, pp. 133–142, 1989.
[28] C. Zhang, W. Wu, X. H. Chen, and Y. Xiong, “Convergence of BP algorithm for product unit neural networks with exponential weights,” Neurocomputing, vol. 72, nos. 1–3, pp. 513–520, Dec. 2008.
[29] H. Murata, M. Koshino, M. Mitamura, and H. Kimura, “Inference of S-system models of genetic networks using product unit neural networks,” in Proc. IEEE Int. Conf. Syst., Man Cybern., Singapore, Oct. 2008, pp. 1390–1395.
[30] A. Ismail and A. P. Engelbrecht, “Global optimization algorithms for training product unit neural networks,” in Proc. IEEE-INNS-ENNS Int. Joint Conf. Neural Netw., vol. 1. Como, Italy, Jul. 2000, pp. 132–137.
[31] L. R. Leerink, C. L. Giles, B. G. Horne, and M. A. Jabri, “Learning with product units,” in Advances in Neural Information Processing Systems, vol. 7, G. Tesauro, D. Touretzky, and T. Leen, Eds. Cambridge, MA: MIT Press, 1995, pp. 537–544.
[32] O. T. Yildiz and E. Alpaydin, “Omnivariate decision trees,” IEEE Trans. Neural Netw., vol. 12, no. 6, pp. 1539–1546, Nov. 2001.
[33] G. L. Giles and T. Maxwell, “Learning, invariance, and generalization in high-order neural networks,” Appl. Opt., vol. 26, no. 23, pp. 4972–4978, 1987.
[34] L. Spirkovska and M. B. Reid, “Robust position, scale, and rotation invariant object recognition using higher-order neural networks,” Pattern Recognit., vol. 25, no. 9, pp. 975–985, Sep. 1992.
[35] G. Thimm and E. Fiesler, “High-order and multilayer perceptron initialization,” IEEE Trans. Neural Netw., vol. 8, no. 2, pp. 349–359, Mar. 1997.
[36] G. L. Foresti and T. Dolso, “An adaptive high-order neural tree for pattern recognition,” IEEE Trans. Syst., Man, Cybern., Part B: Cybern., vol. 34, no. 2, pp. 988–996, Apr. 2004.
[37] H. Tsukimoto, “Extracting rules from trained neural networks,” IEEE Trans. Neural Netw., vol. 11, no. 2, pp. 377–389, Mar. 2000.
[38] T. Cover, “Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition,” IEEE Trans. Electron. Comput., vol. 14, no. 3, pp. 326–334, Jun. 1965.
[39] J. H. Wang, Y. W. Yu, and J. H. Tsai, “On the internal representations of product units,” Neural Process. Lett., vol. 12, no. 3, pp. 247–254, 2000.
[40] O. Bousquet, S. Boucheron, and G. Lugosi, “Introduction to statistical learning theory,” in Advanced Lectures on Machine Learning, O. Bousquet, U. V. Luxburg, and G. Rätsch, Eds. New York: Springer-Verlag, 2004, pp. 169–207.
[41] V. N. Vapnik, “An overview of statistical learning theory,” IEEE Trans. Neural Netw., vol. 10, no. 5, pp. 988–999, Sep. 1999.
[42] D. Haussler, “Probably approximately correct learning,” in Proc. 8th Nat. Conf. Artif. Intell., 1990, pp. 1101–1108.
Chao Zhang was born in Dalian, China. He received the Bachelor's and Ph.D. degrees from Dalian University of Technology, Dalian, in 2004 and 2009, respectively.
He is currently a Research Fellow in the School of Computer Engineering, Nanyang Technological University, Singapore. His current research interests include neural networks, machine learning, and statistical learning theory.

Jie Yang received the B.S. degree in computational mathematics from Shanxi University, Taiyuan, China, in 2001, and the Ph.D. degree from the Department of Applied Mathematics, Dalian University of Technology, Dalian, China, in 2006.
She is currently a Lecturer at the School of Mathematical Sciences, Dalian University of Technology. Her current research interests include fuzzy sets and systems, fuzzy neural networks, and spiking neural networks.

Wei Wu received the Bachelor's and Master's degrees from Jilin University, Changchun, China, in 1974 and 1981, respectively, and the Ph.D. degree from Oxford University, Oxford, U.K., in 1987.
He is now with the School of Mathematical Sciences, Dalian University of Technology, Dalian, China. He has published 4 books and 90 research papers. His current research interests include learning methods of neural networks.