Hierarchical Method for Foreground Detection Using Codebook Model

Nov 6, 2013

Jing-Ming Guo, Member, IEEE, and Chih-Sheng Hsu
Department of Electrical Engineering
National Taiwan University of Science and Technology, Taipei, Taiwan
E-mail: jmguo@seed.net.tw, seraph1220@gmail.com


ABSTRACT

This paper presents a hierarchical scheme with block-based and pixel-based codebooks for foreground detection. The codebook is mainly used to compress information to achieve a high processing speed. In the block-based stage, 12 intensity values are employed to represent a block. The algorithm extends the concept of Block Truncation Coding (BTC), and thus it further improves processing efficiency by enjoying BTC's low-complexity advantage. In detail, the block-based stage can remove most of the noise without reducing the True Positive (TP) rate, yet it has low precision. To overcome this problem, the pixel-based stage is adopted to enhance the precision, which also reduces the False Positive (FP) rate. In addition to the basic algorithm, we incorporate short-term information to improve background updating so that the model adapts to the current environment. As documented in the experimental results, the proposed algorithm provides superior performance to that of the former approaches.
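The block-based codeword builds on BTC's two-level idea: a block is split by its mean into a low group and a high group, and only the group statistics are kept. Below is a minimal sketch of this classic BTC step (the function name is ours; the paper's exact 12-value block layout is defined in the full text):

```c
/* Classic Block Truncation Coding step: represent a block of n gray
 * values by two levels -- the mean of pixels below the block mean (*lo)
 * and the mean of pixels at or above it (*hi). */
static void btc_levels(const unsigned char *px, int n, double *lo, double *hi)
{
    double mean = 0.0;
    for (int i = 0; i < n; ++i) mean += px[i];
    mean /= n;

    double lo_sum = 0.0, hi_sum = 0.0;
    int lo_cnt = 0, hi_cnt = 0;
    for (int i = 0; i < n; ++i) {
        if (px[i] >= mean) { hi_sum += px[i]; ++hi_cnt; }
        else               { lo_sum += px[i]; ++lo_cnt; }
    }
    /* A uniform block degenerates to a single level equal to the mean. */
    *hi = hi_cnt ? hi_sum / hi_cnt : mean;
    *lo = lo_cnt ? lo_sum / lo_cnt : mean;
}
```

Because only a few scalars per block survive this step, matching an incoming block against a block codeword is far cheaper than matching every pixel, which is where the speed advantage of the block-based stage comes from.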

Experimental Results

For measuring the accuracy of the results, the criteria FP rate, TP rate, Precision, and Similarity [12] are employed, defined as follows:

FP rate = fp / (fp + tn),
TP rate = tp / (tp + fn),
Precision = tp / (tp + fp),
Similarity = tp / (tp + fp + fn),

where tp denotes the total number of true positives; tn denotes the number of true negatives; fp denotes the number of false positives; fn denotes the number of false negatives; (tp + fn) indicates the total number of pixels present in the foreground, and (fp + tn) indicates the total number of pixels present in the background.

The methods were implemented in the C programming language on an Intel Core 2 2.4 GHz CPU with 2 GB RAM, running the Windows XP SP2 operating system.
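The four criteria can be computed directly from per-frame pixel counts. The sketch below (struct and function names are ours, not from the paper) restates each definition in C, the language of the experiments:

```c
/* Per-frame pixel counts against the ground-truth foreground mask. */
typedef struct {
    double tp;  /* foreground pixels correctly detected   */
    double tn;  /* background pixels correctly detected   */
    double fp;  /* background pixels marked as foreground */
    double fn;  /* foreground pixels missed               */
} Counts;

static double fp_rate(Counts c)    { return c.fp / (c.fp + c.tn); }
static double tp_rate(Counts c)    { return c.tp / (c.tp + c.fn); }  /* recall */
static double precision(Counts c)  { return c.tp / (c.tp + c.fp); }
static double similarity(Counts c) { return c.tp / (c.tp + c.fp + c.fn); }
```

Similarity is the Jaccard overlap between the detected and ground-truth foreground masks, so it penalizes both false positives and false negatives at once.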



This section presents experimental results for foreground detection using the proposed method. We describe several different sequences and compare the results with the former MOG [7], Rita's method [4], CB [11], Chen's method [9], and Chiu's method [22] schemes. The experimental results are obtained without any post-processing or short-term information when measuring the accuracy of the results. All results for the different sequences can be downloaded from ftp://HMFD@140.118.7.72:222/


1. Sequences IR, Campus, Highway_I and Laboratory

Size: 320*240
Source: [19], file names: IR (row 1), Campus (row 2), Highway_I (row 3) and Laboratory (row 4)

To provide a better understanding of the detected results, three colors, red, green, and blue, are employed to represent shadow, highlight, and foreground, respectively.





Fig. 1. Classified results of sequences [19] for IR (row 1), Campus (row 2), Highway_I (row 3) and Laboratory (row 4) with shadow (red), highlight (green), and foreground (blue). (a) Original image, (b) block-based stage only with block of size 10x10, and (c) proposed method.


2. Sequence Waving Trees (WT)

Size: 160*120
Source: [21], file name: Waving Trees







Fig. 2. Foreground (white) classified results with sequence WT [21]. (a) Original image (frame 247), (b) ground truth, (c) MOG [7], (d) Rita's method [4], (e) CB [11], (f) Chen's method [9], (g) Chiu's method [22], (h)-(k) block-based only with block of size (h) 5x5, (i) 8x8, (j) 10x10, (k) 12x12, (l)-(o) proposed cascaded method with block of size (l) 5x5, (m) 8x8, (n) 10x10, and (o) 12x12.



Fig. 3. The accuracy values in each frame for sequence WT [21]. (a) FP rate, (b) TP rate, (c) Precision, and (d) Similarity.


TABLE 1. THE AVERAGE OF ACCURACY VALUES FOR SEQUENCE WT

                          FP      TP      Precision  Similarity  fps
MOG [7]                   0.0913  0.9307  0.6955     0.6729      40.13
Rita's method [4]         0.3041  0.9010  0.4628     0.4422      30.56
CB [11]                   0.0075  0.9434  0.9290     0.8913      102.43
Chen's method [9]         0.1164  0.8562  0.6450     0.5962      64.35
Chiu's method [22]        0.0603  0.5641  0.7037     0.4599      320.05
block-based stage 5x5     0.0208  0.9755  0.8413     0.8276      269.36
block-based stage 8x8     0.0164  0.9674  0.8511     0.8294      320.88
block-based stage 10x10   0.0158  0.9749  0.8379     0.8199      365.29
block-based stage 12x12   0.0167  0.9300  0.8077     0.7691      394.08
proposed method (5x5)     0.0027  0.9517  0.9700     0.9266      165.28
proposed method (8x8)     0.0020  0.9408  0.9767     0.9204      186.56
proposed method (10x10)   0.0018  0.9474  0.9795     0.9285      197.04
proposed method (12x12)   0.0018  0.9059  0.9718     0.8853      205.61


3. Sequence WATERSURFACE [20]

Size: 160*128
Source: [20], file name: WATERSURFACE







Fig. 4. Foreground (white) classified results with WATERSURFACE [20]. (a) Original image (frame 529), (b) ground truth, (c) MOG [7], (d) Rita's method [4], (e) CB [11], (f) Chen's method [9], (g) Chiu's method [22], (h)-(k) block-based only with block of size (h) 5x5, (i) 8x8, (j) 10x10, (k) 12x12, (l)-(o) proposed cascaded method with block of size (l) 5x5, (m) 8x8, (n) 10x10, and (o) 12x12.



Fig. 6. The accuracy values in each frame for sequence WATERSURFACE [20]. (a) FP rate, (b) TP rate, (c) Precision, and (d) Similarity.


TABLE 2. THE AVERAGE OF ACCURACY VALUES FOR SEQUENCE WATERSURFACE

                          FP      TP      Precision  Similarity  fps
MOG [7]                   0.0431  0.8969  0.5515     0.5183      46.26
Rita's method [4]         0.0265  0.8122  0.6370     0.5595      30.23
CB [11]                   0.0038  0.8118  0.9247     0.7639      101.01
Chen's method [9]         0.0228  0.8215  0.6680     0.5835      62.48
Chiu's method [22]        0.0012  0.7153  0.9539     0.6965      284.36
block-based stage 5x5     0.0399  0.9588  0.5835     0.5722      213.52
block-based stage 8x8     0.0549  0.9568  0.5144     0.5052      273.97
block-based stage 10x10   0.0580  0.9291  0.4893     0.4754      320.05
block-based stage 12x12   0.0723  0.9355  0.4417     0.4340      348.83
proposed method (5x5)     0.0049  0.9087  0.8983     0.8283      147.65
proposed method (8x8)     0.0043  0.9030  0.9098     0.8331      182.92
proposed method (10x10)   0.0051  0.8800  0.8947     0.8026      192.01
proposed method (12x12)   0.0051  0.8812  0.8923     0.8080      202.02


4. Sequence CAMPUS [20]

Size: 160*128
Source: [20], file name: CAMPUS









Fig. 7. Foreground (white) classified results with CAMPUS [20]. (a) Original image (frame 695), (b) ground truth, (c) MOG [7], (d) Rita's method [4], (e) CB [11], (f) Chen's method [9], (g) Chiu's method [22], (h)-(k) block-based only with block of size (h) 5x5, (i) 8x8, (j) 10x10, (k) 12x12, (l)-(o) proposed cascaded method with block of size (l) 5x5, (m) 8x8, (n) 10x10, and (o) 12x12.



Fig. 8. The accuracy values in each frame for sequence CAMPUS [20]. (a) FP rate, (b) TP rate, (c) Precision, and (d) Similarity.


TABLE 3. THE AVERAGE OF ACCURACY VALUES FOR SEQUENCE CAMPUS

                          FP      TP      Precision  Similarity  fps
MOG [7]                   0.1478  0.8811  0.2862     0.2725      53.26
Rita's method [4]         0.1781  0.7225  0.2310     0.2030      23.16
CB [11]                   0.0342  0.9219  0.5567     0.5280      85.87
Chen's method [9]         0.1614  0.7517  0.2562     0.2295      51.81
Chiu's method [22]        0.0604  0.4926  0.3533     0.2406      278.16
block-based stage 5x5     0.0447  0.9243  0.4971     0.4796      174.14
block-based stage 8x8     0.0383  0.9256  0.5042     0.4884      278.16
block-based stage 10x10   0.0358  0.9272  0.5176     0.5023      304.64
block-based stage 12x12   0.0433  0.8564  0.4455     0.4260      336.71
proposed method (5x5)     0.0125  0.9061  0.7195     0.6712      110.81
proposed method (8x8)     0.0095  0.9025  0.7672     0.7125      141.44
proposed method (10x10)   0.0093  0.9037  0.7708     0.7169      156.26
proposed method (12x12)   0.0084  0.8349  0.7820     0.6965      161.03



5. Sequence MO [21]

Size: 160*120
Source: [21], file name: moving object

Figure 9 shows the sequence MO [21] with a moving object, containing 1745 frames of size 160x120. The sequence MO is employed to test the adaptability of the background model. When the chair is moved at frame 888 in Fig. 9, after a period of time this chair becomes part of the background in the background model. We achieve this by applying short-term information in the background model to improve its adaptability, with T_add set to 100. In Fig. 9, frame 986 shows a clean result without any noise or false foreground regions.
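This excerpt does not spell out the promotion rule, but a common codebook-style scheme, sketched here under our own naming, counts how long a short-term (cache) codeword keeps matching the scene and absorbs it into the background once the count reaches T_add:

```c
#define T_ADD 100  /* frames a cache codeword must persist; the paper sets T_add = 100 */

/* A short-term codeword for one pixel, held outside the background model. */
typedef struct {
    int stale;      /* consecutive frames this cache entry kept matching */
    int background; /* set to 1 once promoted to the background model    */
} CacheWord;

/* Called once per frame for each pixel currently explained by a
 * short-term codeword rather than by the background model. */
static void update_cache(CacheWord *w, int matched)
{
    if (!matched) {            /* appearance changed again: restart the count */
        w->stale = 0;
        return;
    }
    if (++w->stale >= T_ADD)   /* stable long enough: absorb into background */
        w->background = 1;
}
```

Under this rule the moved chair in MO keeps matching its cache codeword from frame 888 onward and, roughly 100 frames later, stops being reported as foreground, which is consistent with the clean result at frame 986.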


Fig. 9. Foreground (blue) classified results with MO [21] for frames 600, 650, 700, 750, 800, 850, 888, 950, 980, 982, 984, and 986, processed with the proposed method with short-term information.


Conclusions

Table 4 organizes the average accuracy results from Tables 1-3 over the three test sequences. It is clear that the proposed algorithm provides the highest accuracy among the various compared methods. Moreover, the fps of the proposed method is also superior to that of the five former approaches.

In general, a larger block achieves a higher processing speed yet a lower TP rate, and vice versa, as indicated in Table 4. We recommend that a processing-speed-oriented application choose a larger block, while a smaller block is a promising choice for a TP-rate-oriented application.

A hierarchical method for foreground detection has been proposed using block-based and pixel-based codebooks. The block-based stage enjoys a high processing speed and detects most of the foreground without reducing the TP rate, while the pixel-based stage further improves the precision of the detected foreground objects and reduces the FP rate. Moreover, a color model and match function have also been introduced in this study that can classify a pixel into shadow, highlight, background, and foreground. As documented in the experimental results, the hierarchical method provides high efficiency for background subtraction and can be a good candidate for vision-based applications, such as human motion analysis or surveillance systems.
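The color model and match function themselves are given in the full paper; as an illustration only, a brightness-ratio test in the spirit of codebook and shadow-detection models (the function, and the thresholds alpha and beta, are our assumptions, not the paper's) can separate the four classes:

```c
typedef enum { BACKGROUND, SHADOW, HIGHLIGHT, FOREGROUND } PixelClass;

/* Sketch of a four-way decision: a pixel whose chromaticity matches the
 * background codeword but whose brightness drops below alpha times the
 * stored brightness is shadow; a rise above beta times is highlight;
 * a chromaticity mismatch is foreground; otherwise background. */
static PixelClass classify(double brightness, double bg_brightness,
                           int chroma_match, double alpha, double beta)
{
    double r = brightness / bg_brightness;
    if (!chroma_match) return FOREGROUND;      /* the color itself changed */
    if (r >= alpha && r <= beta) return BACKGROUND;
    return (r < alpha) ? SHADOW : HIGHLIGHT;
}
```

With, say, alpha = 0.8 and beta = 1.2, a darkened but same-colored pixel is labeled shadow (red in Fig. 1) and a brightened one highlight (green), matching the color coding used in the figures.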


TABLE 4. THE AVERAGE OF ACCURACY VALUES

                          FP      TP      Precision  Similarity  fps
MOG [7]                   0.0941  0.9029  0.5111     0.4879      64.22
Rita's method [4]         0.1696  0.8119  0.4436     0.4016      27.98
CB [11]                   0.0152  0.8924  0.8035     0.7278      96.44
Chen's method [9]         0.1002  0.8098  0.5231     0.4698      59.55
Chiu's method [22]        0.0406  0.5907  0.6703     0.4657      294.19
block-based stage 5x5     0.0351  0.9529  0.6407     0.6265      219.01
block-based stage 8x8     0.0365  0.9499  0.6233     0.6077      291.00
block-based stage 10x10   0.0366  0.9438  0.6149     0.5992      329.99
block-based stage 12x12   0.0441  0.9073  0.5650     0.5431      359.87
proposed method (5x5)     0.0067  0.9222  0.8626     0.8087      141.25
proposed method (8x8)     0.0053  0.9154  0.8846     0.8220      170.31
proposed method (10x10)   0.0054  0.9104  0.8817     0.8160      181.77
proposed method (12x12)   0.0051  0.8740  0.8821     0.7966      189.55



References

[1] K. Toyama, J. Krumm, B. Brumitt, and B. Meyers, "Wallflower: principles and practice of background maintenance," in Proc. IEEE Conf. Computer Vision, vol. 1, pp. 255-261, Sept. 1999.
[2] T. Horprasert, D. Harwood, and L. S. Davis, "A statistical approach for real-time robust background subtraction and shadow detection," IEEE ICCV Frame-Rate Applications Workshop, Kerkyra, Greece, Sept. 1999.
[3] R. Cucchiara, C. Grana, M. Piccardi, A. Prati, and S. Sirotti, "Improving shadow suppression in moving object detection with HSV color information," IEEE Conf. Intelligent Transportation Systems, pp. 334-339, Aug. 2001.
[4] R. Cucchiara, M. Piccardi, and A. Prati, "Detecting moving objects, ghosts, and shadows in video streams," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 10, Oct. 2003.
[5] M. Izadi and P. Saeedi, "Robust region-based background subtraction and shadow removing using color and gradient information," in Proc. 19th International Conference on Pattern Recognition, art. no. 4761133, Dec. 2008.
[6] M. Shoaib, R. Dragon, and J. Ostermann, "Shadow detection for moving humans using gradient-based background subtraction," IEEE Conf. Acoustics, Speech and Signal Processing, art. no. 4959698, pp. 773-776, April 2009.
[7] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," IEEE International Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 246-252, June 1999.
[8] C. Stauffer and W. E. L. Grimson, "Learning patterns of activity using real-time tracking," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, pp. 747-757, Aug. 2000.
[9] Y. T. Chen, C. S. Chen, C. R. Huang, and Y. P. Hung, "Efficient hierarchical method for background subtraction," Pattern Recognition, vol. 40, pp. 2706-2715, Oct. 2007.
[10] N. Martel-Brisson and A. Zaccarin, "Learning and removing cast shadows through a multidistribution approach," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 7, pp. 1133-1146, July 2007.
[11] K. Kim, T. H. Chalidabhongse, D. Harwood, and L. Davis, "Real-time foreground-background segmentation using codebook model," Real-Time Imaging, vol. 11, no. 3, pp. 172-185, June 2005.
[12] L. Maddalena and A. Petrosino, "A self-organizing approach to background subtraction for visual surveillance applications," IEEE Trans. Image Processing, vol. 17, no. 7, pp. 1168-1177, July 2008.
[13] L. Maddalena and A. Petrosino, "Multivalued background/foreground separation for moving object detection," Lecture Notes in Computer Science, vol. 5571, pp. 263-270, 2009.
[14] T. Kohonen, Self-Organization and Associative Memory, 2nd ed. Berlin, Germany: Springer-Verlag, 1988.
[15] K. A. Patwardhan, G. Sapiro, and V. Morellas, "Robust foreground detection in video using pixel layers," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 4, pp. 746-751, April 2008.
[16] M. Heikkila and M. Pietikainen, "A texture-based method for modeling the background and detecting moving objects," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 4, pp. 657-662, April 2006.
[17] E. J. Delp and O. R. Mitchell, "Image compression using block truncation coding," IEEE Trans. Communications, vol. COMM-27, no. 9, pp. 1335-1342, Sept. 1979.
[18] E. J. Carmona, J. Martinez-Cantos, and J. Mira, "A new video segmentation method of moving objects based on blob-level knowledge," Pattern Recognition Letters, vol. 29, issue 3, pp. 272-285, Feb. 2008.
[19] http://cvrr.ucsd.edu/aton/shadow/index.html
[20] http://perception.i2r.a-star.edu.sg/bk_model/bk_index.html
[21] http://research.microsoft.com/en-us/um/people/jckrumm/WallFlower/TestImages.htm
[22] C. C. Chiu, M. Y. Ku, and L. W. Liang, "A robust object segmentation system using a probability-based background extraction algorithm," IEEE Trans. Circuits and Systems for Video Technology, vol. 20, no. 4, April 2010.
[23] C. Benedek and T. Sziranyi, "Bayesian foreground and shadow detection in uncertain frame rate surveillance videos," IEEE Trans. Image Processing, vol. 17, no. 4, April 2008.
[24] W. Zhang, X. Z. Fang, X. K. Yang, and Q. M. J. Wu, "Moving cast shadows detection using ratio edge," IEEE Trans. Multimedia, vol. 9, no. 6, Oct. 2007.