A technical review of image processing and computer vision techniques to implement real-time video analytics



Maryam Majareh

mm12e10@ecs.soton.ac.uk

University of Southampton


ABSTRACT

The paper presents a technical review of various computer vision techniques used in real-time video processing. The domain focuses on the assessment of human behaviour in crowded scenes such as train stations, airports, parking lots, etc. The technology aims to transform a basic video camera feed into a live learning and detection tool in order to process video frames. The main objective is to detect activities such as abandoned objects, illegally parked cars, trespassing and even remote biometrics. There are a number of challenges faced by the research community in processing video frame sequences, including background subtraction, object (blob) segmentation, sequence feature extraction and AI modelling, that are actively being investigated at present.


Background subtraction involves training a specialised model to detect foreground objects against a static background captured by a static camera. Blob tracking involves the use of image processing techniques to isolate and bound foreground objects. Sequence feature extraction involves processing temporally distributed video frames to gain an understanding of the various foreground objects present within them. As the frames are time-based and context sensitive, the core information is extracted at two distinct stages: firstly by pre-processing the frames via suitable image processing techniques to efficiently extract regions-of-interest (ROIs), and secondly by utilising robust artificial intelligence (AI) routines such as Hidden Markov Models, Bayesian learning or neural networks to model and train detection classifiers.

Given the level of investigation ongoing in this area, this paper presents a review of the current state-of-the-art in the fields of image processing, video analytics and computer vision. In doing so, the review presents existing research and the ongoing challenges faced by the research community. The paper also presents future directions in each of the three core areas of background subtraction, object segmentation and frame-based video tracking.


Categories and Subject Descriptors

I.4.8 [Scene Analysis]: Image Processing and Computer Vision

Computing methodologies, artificial intelligence, computer vision, image and video acquisition, motion capture.

General Terms

Algorithms, Measurement, Performance, Experimentation,
Security, Human Factors, Standardization, Verification.

Keywords

Computer Vision, Image Processing, Video Analytics, Machine Learning.

1.

INTRODUCTION

The paper reviews a wide range of investigation domains involved in the real-time processing and modelling of images, from image acquisition and processing to device calibration, segmentation and artificial intelligence (AI)-based modelling. Computer vision is regarded as the domain that deals with the processing of image-based data utilised by a computer. The domain comprises a number of core phases including image acquisition, processing and classification [1]. Real-time video-feed processing is a practical example of image processing where image sequences from a video source such as a CCTV camera are extracted and manipulated in order to extract useful information. This information can range from the number plates of high-speed motor vehicles to the faces of pedestrians entering a building hallway.


Contrary to humans' outstanding calibration ability in modelling real-world video scenes, the processing and classification of computer-vision-based data on existing vision hardware is an extremely cumbersome task. An image in computing is regarded as a two-dimensional function f(x, y), where the amplitude at any pixel within the image is called the intensity or grey-scale level at that point. For a colour RGB image, this grey-scale level divides into three unique channels (Red, Green and Blue), each represented by an 8-bit binary value ranging from 0 to 255. Therefore a pixel containing an RGB representation of 255, 0 and 0 would appear red, because the remaining two channels contain zero values. The processing of these finite values by a digital computer is called digital image processing. Based on the immediate application, image processing generally comprises two main categories:

- Enhancing or optimising image quality for human viewing

- Preparing images for computer-vision-based feature extraction
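The pixel-level representation described above can be illustrated with a short sketch. This is purely illustrative; the weighted grey-scale conversion uses the common ITU-R BT.601 luma coefficients, which are an assumption here since the paper does not specify a particular conversion.

```python
# Illustrative sketch: an image as a 2D function f(x, y), with each 8-bit
# RGB pixel mapped to a single grey-scale intensity. The 0.299/0.587/0.114
# weights are the standard ITU-R BT.601 luma coefficients (an assumption,
# not taken from the paper).

def rgb_to_gray(r, g, b):
    """Convert one 8-bit RGB pixel (0-255 per channel) to a grey level."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

# A 2x2 RGB "image" as nested lists of (R, G, B) tuples.
image = [
    [(255, 0, 0), (0, 255, 0)],      # pure red, pure green
    [(0, 0, 255), (255, 255, 255)],  # pure blue, white
]

gray = [[rgb_to_gray(*px) for px in row] for row in image]
print(gray)  # white maps to 255; the pure-red pixel maps to 76
```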

The scope of this paper addresses the latter part of image processing, where ongoing research into geometrical composition, relevant measurements and image interpretation is analysed to critically discuss the current state-of-the-art of the domain.


This paper is primarily divided into three core sections: Section 2 discusses the domain of image processing and its current state of knowledge in the analysis of raw camera-based images. Section 3 discusses the challenges and limitations of extracting real-time video-based images from live camera feeds for real-world applications. Section 4 ultimately proposes and discusses a new frontier of temporal computer vision techniques, via a novel infrared depth-sensing domain, to be used for various applications. The paper concludes with a discussion of future extensions and applications of these techniques.


2.

IMAGE PROCESSING AND ANALYSIS

Image segmentation is regarded as the first pre-processing stage of image preparation, making an image legible enough for a computing system to extract important features from it. The core stages of an image analysis system can be divided into the following three sub-stages:


2.1

Image pre-processing

This phase is primarily used to improve the quality of an image by removing noise due to various factors such as uneven light intensity, dirt, poor device quality, etc. Digital images in particular are prone to different noise types, which result in pixel intensities that do not reflect the true intensities of the original scene. There are several ways in which noise can be introduced into a scene, as follows:



- Images scanned from photographic films generally contain noise due to the presence of grains on the film material. Images acquired with low-quality scanning equipment generally suffer from this.

- Scanners and damaged film may also lead to poor image quality with a low signal-to-noise ratio (SNR). Images acquired from old library records are one example that suffers the most from this kind of noise.

- If the image data is transmitted electronically, noise may be introduced by the built-in compression mechanisms. Devices that use JPEG compression, such as digital cameras, introduce noise due to the lossy compression of image data.

- Finally, if an image is acquired directly in digital format, the data-gathering mechanism itself may introduce noise.


Image enhancement is generally achieved by the following core methodologies [2]:

- Removal of additive noise and interference

- Elimination of multiplicative interference

- Regulation of image contrast

- Reduction of blurring

A number of methods are used for noise removal, including smoothing via low-pass filtering, sharpening via high-pass filtering, histogram equalisation and the usage of generic de-blurring algorithms.


The effects of various noise types are shown in Figure 1, where (a) shows an original lab image taken with a Samsung Galaxy S3 phone, (b) shows the image with added zero-mean Gaussian white noise of variance 0.01, (c) the image with added Poisson-distributed noise with a mean of 10, and (d) the image with added salt & pepper noise at a 0.05 pixel density. The images were created via the "imnoise" function provided by Matlab 2012a.
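For readers without Matlab, salt & pepper corruption of the kind imnoise simulates can be reproduced in a few lines of plain Python. This is a hedged sketch, not the imnoise algorithm itself: the density parameter mirrors imnoise's pixel-density argument, and images are represented as nested lists of 8-bit grey levels.

```python
import random

def salt_and_pepper(image, density, seed=0):
    """Corrupt a grey-scale image (nested lists, values 0-255) with salt &
    pepper noise: each pixel flips to 0 or 255 with probability `density`."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    noisy = [row[:] for row in image]
    for y, row in enumerate(noisy):
        for x, _ in enumerate(row):
            r = rng.random()
            if r < density / 2:
                noisy[y][x] = 0      # "pepper"
            elif r < density:
                noisy[y][x] = 255    # "salt"
    return noisy

clean = [[128] * 8 for _ in range(8)]
noisy = salt_and_pepper(clean, density=0.05)
corrupted = sum(px != 128 for row in noisy for px in row)
print(corrupted, "of 64 pixels corrupted")
```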



Figure 1: A simulated comparison of various noise types via Matlab Image Processing based noise induction: (a) original image, (b) Gaussian noise, (c) Poisson noise, (d) salt & pepper noise

2.2

Image noise removal

A wide number of image noise removal techniques are reported in the literature, as follows [3]:

Linear filtering

This technique eliminates certain types of noise via Gaussian or averaging filters. It removes noise by suppressing or enhancing certain spatial frequencies within an image [2].

Median filtering

Median filters are generally used to remove impulsive noise, due to their ability to preserve edge information and step-wise discontinuities in the signal.

Adaptive filtering

Adaptive linear filters work on the concept of extracting the desired information (the actual image) via an estimation operation. According to [3], an adaptive linear filter is generally used not only for noise removal but for channel filtering as well.
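As an illustration of why median filtering suits impulsive noise, the following minimal Python sketch (an assumption-laden toy, with the image stored as nested lists and borders left unfiltered for brevity) removes a single "salt" outlier while leaving the flat region untouched:

```python
def median_filter_3x3(image):
    """Apply a 3x3 median filter to a grey-scale image (nested lists).
    Border pixels are left unchanged for simplicity."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(image[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out

# A flat region with one impulsive ("salt") outlier at the centre.
img = [[10] * 5 for _ in range(5)]
img[2][2] = 255
filtered = median_filter_3x3(img)
print(filtered[2][2])  # the outlier is replaced by the local median: 10
```

Because the median ignores extreme values rather than averaging them in, the impulse vanishes without blurring nearby step edges, which is exactly the property the text attributes to median filters.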


2.3

Image segmentation

With a pre-processed image, the next stage in image processing is the segmentation of the region of interest (ROI). An ROI in an image can contain any type of element, ranging from humans [4] to a wide array of non-living objects such as luggage moving over a conveyor belt, or even vehicles for the purpose of license plate recognition [5].


Nonetheless, image segmentation of real-world objects suffers from a completely new array of challenges compared to noise removal. Images taken in open environments are never the same. A picture taken at a certain time of day generally differs from one taken under different conditions such as varying cloud cover, time-of-day, moving trees or other objects. These challenges generally divide the area of image segmentation into two distinct domains: static image segmentation with no background information available, and dynamic image segmentation based on a sequence of video images.

Based on the type of image segmentation case being addressed, the following section presents a number of techniques that are generally used to extract foreground pixels from the background data:

2.3.1

Edge detection kernels

The purpose of edge detection is to extract the outlines of different regions in an image [2]. This technique can be applied to both the static and dynamic segmentation cases. The objective is to divide an image into a set of ROIs based on brightness or colour similarities.

One of the simplest methods of segmentation is the application of histogram equalisation or a thresholding technique over an image. This is generally achieved by plotting or grouping pixels on the basis of their specific intensity values. Conceptually, an image histogram is a probability density function (PDF) of a grey-scale image.
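Histogram-based thresholding of this kind reduces to a few lines of code. The sketch below (a toy illustration with an invented two-mode "scene", not the paper's experiment) builds an intensity histogram and binarises the image at a threshold between the two peaks:

```python
def histogram(image, bins=256):
    """Intensity histogram of a grey-scale image (values 0..bins-1)."""
    counts = [0] * bins
    for row in image:
        for px in row:
            counts[px] += 1
    return counts

def threshold(image, t):
    """Binarise: pixels brighter than t become 1 (foreground), else 0."""
    return [[1 if px > t else 0 for px in row] for row in image]

# A toy scene: a dark left half (intensity 40) and a bright right half (200).
img = [[40, 40, 200, 200] for _ in range(4)]
hist = histogram(img)
print(hist[40], hist[200])  # 8 pixels in each of the two histogram modes

# Threshold between the two histogram peaks.
binary = threshold(img, t=120)
print(binary[0])  # [0, 0, 1, 1]
```

In a real system the threshold would be derived from the histogram itself (e.g. its median, as the paper does for Figure 3) rather than hard-coded.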


Figure 2: An intensity histogram (b) of the lab-view grey-scale image shown in (a)

It can be seen from the image shown in Figure 2 that the right-hand-side portion of the image in (a) contains a fairly high number of pixels (> 1200) that lie in the higher intensity range, whereas the left-hand-side portion mainly contains darker pixels due to the presence of the monitor, a lower-intensity wall portion and the bag. This very concept of "histogramming" has routinely been used in applications where certain objects within a complex background are to be extracted based upon an underlying intensity criterion. The concept is frequently used in applications such as character segmentation in the domain of optical character recognition [6]. The adaptively thresholded image created based on the histogram profile shown in Figure 2 is shown in Figure 3.

However, the domain becomes more challenging when a degree of dynamism is induced within the image because it is part of a sequence of frames gathered from a generic or CCTV camera. Images taken continuously change their pixel-level intensities, thereby causing hard-threshold-based histogram techniques such as those stated above to fail. As discussed before, these changes generally occur due to different times of day, variable cloud cover, occlusions and dynamic foreground pixels. Dynamic foreground pixels generally occur due to the presence of moving objects that are part of a video image sequence. These can be trees, waves or even sand particles during a dust storm.



Figure 3: A binary image created based on the intensity scale profile shown in Figure 2(a), thresholded adaptively at a median calculated via the histogram shown in Figure 2(b)


The most pressing challenge currently faced by the research community in the segmentation of such images therefore comes from these pixels, which act as part of the foreground but should in effect be eliminated as background pixels. The next section discusses various "background subtraction" methodologies that have recently been employed in the literature to solve the issue of foreground modelling in the presence of dynamic background pixels.

3.

IMAGE PROCESSING IN DYNAMIC VIDEO-BASED FRAMES

Predominantly termed "background subtraction", this technique is increasingly being used in real-time video-frame-based image segmentation to detect, subtract and segment critical ROIs such as moving vehicles and individuals. Due to the presence of moving background objects such as trees or other dynamic objects, the classification of various ROIs in images requires careful modelling to minimise false alarms. The situation is further complicated when the image contains sudden intensity variations such as shadows, occlusions and objects moving at variable speeds [7]. A variety of techniques, each with its own limitations and benefits, have been used in the recent literature to robustly locate foreground pixels, as discussed below.

3.1.1

Background modelling via Gaussian mixture models

A robust background methodology aims at the construction of a model that is capable of eliminating dynamic background objects while efficiently keeping track of genuine foreground objects over a temporal sequence of video frames. Gaussian mixture models (GMMs) are one of the oldest methods utilised to learn from time-based pixel variations. [8] utilised a probabilistic GMM architecture to train each pixel based on its intensity variations over time. The methodology was further extended by [9] to include statistical Bayesian-modelling-based artificial intelligence (AI). However, the two technologies predominantly suffered from two major setbacks. Firstly, the models could not incorporate object shadows as background pixels, and secondly, if the model were trained for slow intensity variations it would fail for abrupt intensity changes, and vice versa. Figure 4 shows a sample video sequence taken from the ChangeDetection repository where the standard GMM algorithm implemented in an OpenCV installation fails for the bus-station video [10, 11]. [12] did try to incorporate a time-adaptive system where the pixels were able to accommodate variable intensity rates. To further improve the technique, [13] adopted a hierarchical approach integrating colour and gradient information, so as to better differentiate between overlapping foreground and background pixels with matching intensity and colour profiles. Yet the issue of shadow incorporation remained largely unresolved with most of these GMM-based models.
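The per-pixel learning idea behind these models can be sketched compactly. The following is a deliberately simplified single-Gaussian special case of the GMM approach (one running Gaussian per pixel rather than a mixture, with invented parameter values), intended only to show the classify-then-update loop, not to reproduce Stauffer and Grimson's algorithm:

```python
class RunningGaussianBackground:
    """Simplified per-pixel background model: one running Gaussian per pixel
    (a one-component special case of the GMM approach). A pixel is flagged
    as foreground when it deviates from the learned mean by more than k
    standard deviations."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, init_var=100.0):
        self.alpha = alpha  # learning rate
        self.k = k          # deviation threshold (in standard deviations)
        self.mean = [[float(px) for px in row] for row in first_frame]
        self.var = [[init_var for _ in row] for row in first_frame]

    def apply(self, frame):
        """Classify each pixel, update the model, return a binary mask."""
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, px in enumerate(row):
                m, v = self.mean[y][x], self.var[y][x]
                d = px - m
                fg = d * d > (self.k ** 2) * v
                mask_row.append(1 if fg else 0)
                if not fg:  # only background pixels update the model
                    self.mean[y][x] = m + self.alpha * d
                    self.var[y][x] = (1 - self.alpha) * v + self.alpha * d * d
            mask.append(mask_row)
        return mask

# Train on a static 4x4 background (intensity 50), then present a frame
# containing a bright 2x2 "object".
bg = [[50] * 4 for _ in range(4)]
model = RunningGaussianBackground(bg)
for _ in range(20):
    model.apply(bg)  # the background variance tightens over time
frame = [row[:] for row in bg]
frame[1][1] = frame[1][2] = frame[2][1] = frame[2][2] = 250
mask = model.apply(frame)
print(mask[1])  # [0, 1, 1, 0]
```

The single-Gaussian simplification is exactly what fails on multi-modal backgrounds such as swaying trees, which is why the full mixture of [8] keeps several Gaussians per pixel.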



3.1.2

Code-book-based background subtraction

The issues with shadows and abrupt intensity variations were predominantly addressed by another genre of algorithms based on a pixel-level, time-based codebook methodology. The technique keeps a record of the intensity-variation behaviour of pixels in a time-based codebook. Perhaps the most groundbreaking implementation in this domain is by [7], who introduced a measure termed the maximum negative run-length (MNRL). The algorithm classifies a pixel's behaviour by learning its change rate over a set period of frames, and thereby keeps a codebook recording a number of its parameters as follows:

- The minimum and maximum brightness

- The frequency with which a codeword has occurred in the database

- The maximum negative run-length

- The first and last access times of the codeword


The technique has presented promising outcomes in the domain of background subtraction, particularly in modelling highly changing scenes such as traffic videos, pedestrian motion tracking and even gesture and gait recognition [4, 9, 14-18].
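The codeword parameters listed above can be made concrete with a small sketch. This is an illustrative simplification of the training phase of [7] for a single pixel: a codeword matches when a sample lies within a tolerance of its brightness range, and the MNRL records the longest gap between consecutive matches (the tolerance eps and the sample values are invented for the example):

```python
class Codeword:
    """One codebook entry for a pixel, after [7]: brightness bounds,
    occurrence frequency, maximum negative run-length (MNRL, the longest
    gap between consecutive matches) and first/last access times."""
    def __init__(self, intensity, t):
        self.lo = self.hi = intensity   # min/max brightness seen
        self.freq = 1                   # how often this codeword matched
        self.mnrl = t - 1               # longest run of non-matches so far
        self.first, self.last = t, t    # first and last access times

def train_pixel_codebook(samples, eps=10):
    """Build a codebook from one pixel's intensity sequence. A sample
    matches a codeword when it lies within `eps` of its brightness range."""
    book = []
    for t, val in enumerate(samples, start=1):
        for cw in book:
            if cw.lo - eps <= val <= cw.hi + eps:
                cw.lo, cw.hi = min(cw.lo, val), max(cw.hi, val)
                cw.freq += 1
                cw.mnrl = max(cw.mnrl, t - cw.last - 1)
                cw.last = t
                break
        else:
            book.append(Codeword(val, t))
    # Wrap-around MNRL: the gap from the last access back to the first.
    n = len(samples)
    for cw in book:
        cw.mnrl = max(cw.mnrl, n - cw.last + cw.first - 1)
    return book

# A pixel that is mostly sky (~100) with a tree leaf (~30) flickering in.
samples = [100, 101, 30, 99, 100, 31, 100, 102]
book = train_pixel_codebook(samples)
print(len(book))                       # two codewords: sky and leaf
print(sorted(cw.freq for cw in book))  # the sky codeword dominates
```

After training, [7] prune codewords with a large MNRL, so both the dominant "sky" codeword and the periodically recurring "leaf" codeword can be kept as background while transient foreground codewords are discarded.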

4.

ANALYSIS OF RECENT TECHNOLOGICAL ADVANCEMENTS IN IMAGE PROCESSING

Yet the biggest shortcoming of the majority of histogram, GMM and codebook-based algorithms lies in their ability to process only a 2D realisation of an image. With rapidly changing technologies, the advent of 3D scanners did introduce a sense of novelty and promise to the image and video processing domain; however, the overwhelmingly tedious process of calibration and the need for willing subjects severely limited their usage in real-time image processing. Moreover, scenes captured via moving cameras require the further overhead of using separate models for each camera position in order to efficiently differentiate foreground pixels. The current state-of-the-art substantially lacks moving-camera object recognition in the absence of a robust, supervised AI model.

With the latest induction of infrared sensing devices such as the Microsoft Kinect, the domain of background subtraction has taken on a new aspect, where pixels are realised not merely in a 2D intensity domain but in a 3D point-cloud space, in which distance can be measured and modelled with respect to an infrared camera present on the device itself. The technology has already revolutionised the XBOX gaming domain and, with the launch of the Windows-based Kinect version in February 2012 along with its SDK, it is now possible for conventional programmers to apply its depth-map and sensing APIs to a wide range of real-world applications including gesture recognition, motion sensing, film & animation and high-resolution 3D surface regeneration.
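Depth maps make one class of segmentation almost trivial: foreground objects can be isolated by their distance alone, with no intensity model at all. The sketch below is a hedged illustration (the millimetre values, band limits and the treatment of zero readings are invented for the example; they do not describe the Kinect SDK's own API):

```python
def segment_by_depth(depth_map, near_mm=400, far_mm=1200):
    """Segment a depth frame (millimetre values, as a Kinect-style sensor
    might report) into a binary mask of objects within a distance band.
    Zero values (no depth reading) are treated as background."""
    return [[1 if (d > 0 and near_mm <= d <= far_mm) else 0 for d in row]
            for row in depth_map]

# Toy 3x4 depth frame: a hand at ~600 mm in front of a wall at ~2000 mm;
# one pixel has no reading (0).
depth = [
    [2000, 2000, 2000, 2000],
    [2000,  600,  610,    0],
    [2000,  605, 2000, 2000],
]
mask = segment_by_depth(depth)
print(mask[1])  # [0, 1, 1, 0]
```

Because the mask depends only on geometry, it is unaffected by the shadows and illumination changes that defeat the intensity-based models of Section 3.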


Figure 4: An implementation of the MNRL-based codebook algorithm given in [7] via the OpenCV library, presenting the inherent weaknesses of GMM-based background segmentation, evaluated against a benchmarking video taken from [10, 11]

Work in the domain of point-cloud processing for graphical reconstruction, with the objective of 3D surface matching, has increasingly been used to compare and identify objects such as human faces, vehicles and aerial scans as 3D surface plots [1, 2]. The field is increasingly finding applications in forensics [3] and is very likely to be extended to real-world applications of multi-dimensional aerial scanning [4], beyond-visual-spectrum biometrics [5], fire detection in smoke [6], industrial condition monitoring [7] and, most importantly, the medical and surgical applications of tumour detection, advanced magnetic resonance imaging (MRI) and gait-analysis-based physical abnormality detection [8, 9].

Despite the promising nature of depth-sensing, infrared and thermographic devices in computer vision, the technology is still not used substantially in everyday real-world settings. However, as discussed earlier, with the advent of low-cost depth-sensing devices such as the Microsoft Kinect, the domain can now be explored for everyday touch-free applications. Figure 5 presents samples of (a) skeletal joint mapping, (b) a Delaunay triangulation used to capture a 3D face wireframe, (c) a grey-scale depth profile from the Kinect sensor for distance measurement and (d) a thermograph capturing temperature information from distant objects.


Figure 5: Diagrammatic representation of a Kinect depth map profile, with distant objects represented by lower grey intensities and closer objects, such as the hand, shown with intensity values closer to 255



Feature vectors from the streams shown in Figure 5 can be used in a wide range of real-world applications, including sign language recognition [10], gait identification [11], touch-free biometrics, 3D face recognition [12] and zero-visibility motion sensing (via infrared sensing) [13].

Moreover, as the device's uniqueness lies in its single-directional capability, it is possible to embed the technology in future handheld devices such as smart phones and tablets. Such an integration is likely to introduce opportunities in 3D photography, the animation and film industry, robotics, augmented reality, education and virtual reality. Ultimately, the only limitation of the current state-of-the-art lies with the computational capability of conventional handheld hardware, which is still maturing towards the high-quality rendering involved in multidimensional processing.

5.

CONCLUSION

The paper presents a detailed analysis of the core concepts of image processing and segmentation in real-world applications. Having discussed these, the review moves to dynamic, video-based image processing, where the majority of recent investigations are now concentrated. A detailed review of video acquisition and processing techniques, against the backdrop of the recent depth-processing and 3D point-cloud abilities of released hardware, presents a wide-ranging and promising set of applications. Most importantly, a 3D infrared depth-map is expected to provide a set of features that, if combined with the latest AI techniques, are likely to increase the overall detection and classification accuracies of existing systems.

Furthermore, as the camera itself does not require multiple viewpoints, it is envisaged that the future integration of this camera into mobile devices and smart phones is likely to revolutionise the way pictures are taken from handheld devices. Moreover, a further integration of infrared-based thermographs is foreseen to completely change the remote diagnosis and treatment of patients. The technology is very likely to enable a GP, or even artificial diagnosis software on a smart phone, to detect and identify body temperature changes, breathing problems, and heart and pulse rates merely by non-invasive, touch-free body scans. Moreover, in the industrial domain, real-time sparse point clouds can be compared to reference point clouds of a machine's motion to pre-emptively diagnose operational anomalies such as excess vibrations or abnormal noise patterns. To wrap up, the depth-scan and 3D sensing capabilities built into "single-directional" devices like the Kinect are widely expected to benefit a wide range of real-world domains.

6.

REFERENCES

[1] Szeliski, R., Computer Vision: Algorithms and Applications. Texts in Computer Science. 2011, London; New York: Springer.

[2] Petrou, M. and C. Petrou, Image Processing: The Fundamentals. 2nd ed. 2010, Chichester: Wiley.

[3] Vaseghi, S.V., Advanced Digital Signal Processing and Noise Reduction. 4th ed. 2008, Chichester: J. Wiley & Sons.

[4] Moeslund, T.B., Visual Analysis of Humans: Looking at People. 2011, London; New York: Springer-Verlag London Limited.

[5] Chang, S.L., et al., Automatic license plate recognition. IEEE Transactions on Intelligent Transportation Systems, 2004. 5(1).

[6] Rice, S.V., G. Nagy, and T.A. Nartker, Optical Character Recognition: An Illustrated Guide to the Frontier. 1999, Boston, Mass.; London: Kluwer Academic Publishers.

[7] Kim, K., et al., Real-time foreground-background segmentation using codebook model. Real-Time Imaging, 2005. 11(3).

[8] Stauffer, C. and W.E.L. Grimson, Learning patterns of activity using real-time tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000. 22(8): p. 747-757.

[9] Lee, D.S., et al., A Bayesian framework for Gaussian mixture background modeling. 2003 International Conference on Image Processing, Vol 3, Proceedings, 2003: p. 973-976.

[10] ChangeDetection. ChangeDetection Video Database. 2012 [cited 19th August, 2012]; Available from: http://www.changedetection.net/.

[11] Goyette, N., et al., changedetection.net: A new change detection benchmark dataset. In Proc. IEEE Workshop on Change Detection (CDW'12). 2012. Providence, RI.

[12] Harville, M., A framework for high-level feedback to adaptive, per-pixel, mixture-of-Gaussian background models. Computer Vision - ECCV 2002, Pt III, 2002. 2352: p. 543-560.

[13] Javed, O., K. Shafique, and M. Shah, A hierarchical approach to robust background subtraction using color and gradient information. IEEE Workshop on Motion and Video Computing (Motion 2002), Proceedings, 2002: p. 22-27.

[14] Buch, N., S.A. Velastin, and J. Orwell, A review of computer vision techniques for the analysis of urban traffic. IEEE Transactions on Intelligent Transportation Systems, 2011. 12(3).

[15] Cristani, M., M. Bicego, and V. Murino, Integrated region- and pixel-based approach to background modelling. IEEE Workshop on Motion and Video Computing (Motion 2002), Proceedings, 2002: p. 3-8.

[16] Ilyas, A., et al., Real time foreground-background segmentation using a modified codebook model. AVSS: 2009 6th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2009: p. 454-459.

[17] Diamantopoulos, G. and M. Spann, Event detection for intelligent car park video surveillance. Real-Time Imaging, 2005. 11(3): p. 233-243.

[18] Xiang, T. and S.G. Gong, Video behaviour profiling and abnormality detection without manual labelling. In 10th IEEE International Conference on Computer Vision (ICCV 2005). 2005. Beijing, China.


1. Pauly, M., R. Keiser, and M. Gross, Multi-scale feature extraction on point-sampled surfaces. Computer Graphics Forum, 2003. 22(3).

2. Schnabel, R., R. Wahl, and R. Klein, Efficient RANSAC for point-cloud shape detection. Computer Graphics Forum, 2007. 26(2).

3. Vanezis, P., et al., Facial reconstruction using 3-D computer graphics. Forensic Science International, 2000. 108(2).

4. Guo, L., et al., Relevance of airborne lidar and multispectral image data for urban scene classification using Random Forests. ISPRS Journal of Photogrammetry and Remote Sensing, 2011. 66(1).

5. Moreno-Moreno, M., J. Fierrez, and J. Ortega-Garcia, Biometrics beyond the visible spectrum: imaging technologies and applications. Biometric ID Management and Multimodal Communication, Proceedings, 2009. 5707.

6. Kolaric, D., K. Skala, and A. Dubravic, Integrated system for forest fire early detection and management. Periodicum Biologorum, 2008. 110(2).

7. Omar, M., K. Kuwana, and K. Saito, The use of infrared thermograph technique to investigate welding related industrial fires. Fire Technology, 2007. 43(4).

8. Lee, M.-Y. and C.-S. Yang, Entropy-based feature extraction and decision tree induction for breast cancer diagnosis with standardized thermograph images. Computer Methods and Programs in Biomedicine, 2010. 100(3).

9. Selvarasu, N., et al., Abnormality detection from medical thermographs in humans using Euclidean distance based color image segmentation. 2010 International Conference on Signal Acquisition and Processing: ICSAP 2010, Proceedings, 2010.

10. Keskin, C., et al., Real time hand pose estimation using depth sensors. 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2011.

11. Stone, E. and M. Skubic, Evaluation of an inexpensive depth camera for in-home gait assessment. Journal of Ambient Intelligence and Smart Environments, 2011. 3(4).

12. Mahoor, M.H. and M. Abdel-Mottaleb, A multimodal approach for face modeling and recognition. IEEE Transactions on Information Forensics and Security, 2008. 3(3).

13. Elangovan, V. and A. Shirkhodaie, Recognition of human activity characteristics based on state transitions modeling technique. In Conference on Signal Processing, Sensor Fusion, and Target Recognition XXI. 2012. Baltimore, MD.