INFORMATIONAL TECHNOLOGIES

AND COMPUTER ENGINEERING
Наукові праці ВНТУ, 2009, № 3
M. M. Bykov, Sc. (Eng.), Prof.; V. V. Kovtun, Sc. (Eng.), Assist. Prof.; N. F. Savinova
EVALUATION OF NOISE INFLUENCE ON THE RELIABILITY OF
INFORMATION-MEASURING VOICE RECOGNITION SYSTEM
OPERATION
The paper considers how noise in the speech signal affects the operational reliability of an
information-measuring voice recognition system. Analytical expressions for evaluating recognition
errors are obtained, and the calculated probability of misclassifying two classes of voices is given
for different noise levels in the signal.
Key words: voice recognition, information-measuring system, reliability, noise, mathematical model,
transmission channel, distribution hyperplane, normal distribution law, rule of minimum distance.
The transmission channel that carries the speech signals used for voice recognition is affected by
various kinds of noise [1], chiefly equipment noise and environmental noise. This influence reduces
the operational reliability of the information-measuring voice recognition system. Evaluating this
impact, taking into account the structural features of information-measuring systems of this type,
is therefore an urgent task.
Using the linear model [2, 3, 4], the speech signal can be regarded as a quasi-deterministic process
for patterns of the same class over a time interval of $\tau = 10 \div 30$ milliseconds. In the
presence of additive noise the speech signal takes the following form:
$$y^{*}(t,\tau) = y(t,\tau) + \xi(t,\tau). \tag{1}$$
In most cases the noise $\xi(t,\tau)$ has zero mean and is uncorrelated with the speech signal
$y(t,\tau)$. Consequently, the signal vector can be represented as:
$$X = X_y + X_\xi, \tag{2}$$
where $X_y = (x_{y1}, x_{y2}, \ldots, x_{yn})^T$ and $X_\xi = (x_{\xi 1}, x_{\xi 2}, \ldots, x_{\xi n})^T$
are the description vectors of the speech signal and of the noise, respectively; $T$ is the
transposition symbol; $x_i = x_{yi}^2 + x_{\xi i}^2$ is the spectral power in the $i$-th frequency band.
To remove the influence of speech signal volume on the recognition results, we use a normalized
description vector, with the norm
$$\|X\| = \sum_{i=1}^{n} x_i = \sigma_c^2, \tag{3}$$
where $\sigma_c^2$ is the variance at the input of the spectral analysis device.
Normalizing vector (2) according to (3), we obtain:
$$\tilde{X} = \frac{r^2}{r^2 + 1}\,\tilde{X}_y + \frac{1}{r^2 + 1}\,\tilde{X}_\xi,$$
where $\tilde{X}$, $\tilde{X}_y$, $\tilde{X}_\xi$ are the normalized vectors $X$, $X_y$, $X_\xi$,
respectively, and $r^2 = \sigma_y^2 / \sigma_\xi^2$ is the signal-to-noise ratio at the input of the
recognition system.
Taking into account that in practice $r^2 \gg 1$, the equation takes the form:
$$\tilde{X} = \tilde{X}_y + \frac{1}{r^2}\,\tilde{X}_\xi. \tag{4}$$
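To illustrate how close approximation (4) is to the exact normalization, the following minimal Python sketch compares the two on made-up band powers. It assumes that the norm in (3) is the sum of the band powers and that $r^2$ is the ratio of total speech power to total noise power; the vectors and function names are hypothetical and are not taken from the paper.

```python
# Sketch: exact normalization of X = X_y + X_xi versus approximation (4).
# All numbers are invented for illustration only.

def normalize(x):
    """Divide a vector of band powers by its sum (the norm assumed for (3))."""
    total = sum(x)
    return [v / total for v in x]

def mix_exact(x_y, x_xi):
    """Normalize the noisy vector X = X_y + X_xi directly."""
    return normalize([a + b for a, b in zip(x_y, x_xi)])

def mix_approx(x_y, x_xi):
    """Approximation (4): X~ = X~_y + X~_xi / r^2."""
    r2 = sum(x_y) / sum(x_xi)  # assumed definition of r^2
    n_y, n_xi = normalize(x_y), normalize(x_xi)
    return [a + b / r2 for a, b in zip(n_y, n_xi)]

x_y = [4.0, 3.0, 2.0, 1.0]       # hypothetical speech band powers
x_xi = [0.01, 0.02, 0.03, 0.04]  # hypothetical noise band powers, r^2 = 100

exact = mix_exact(x_y, x_xi)
approx = mix_approx(x_y, x_xi)
max_gap = max(abs(a - b) for a, b in zip(exact, approx))
```

With $r^2 = 100$ the two vectors differ by less than 0.005 in every band; the approximate vector sums to $1 + 1/r^2$ rather than to 1, which is the price of dropping the common factor $r^2/(r^2+1)$.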
The operational reliability of the classifier of the information-measuring voice recognition system
will be determined for the example of two classes of speakers, $\Omega_1$ and $\Omega_2$; it is then
not difficult to generalize to a larger number of classes. A typical classifier used in
information-measuring voice recognition systems is the minimum-distance classifier [5].
Hence, for this type of classifier the classification rule has the form:
$$d_e(\tilde{X}, \mu_i) = \min_{j} d_e(\tilde{X}, \mu_j) \;\Rightarrow\; \tilde{X} \in \Omega_i, \quad i, j = 1, 2, \tag{5}$$
where $\mu_i = E\{\tilde{X}^0 \mid \tilde{X} \in \Omega_i\}$ is the mean value of the vectors
$\tilde{X}$ that belong to class $\Omega_i$ ($\mu_i$ is the reference of the class);
$d_e(\tilde{X}, \mu_i) = (\tilde{X} - \mu_i)^T (\tilde{X} - \mu_i)$ is the (squared) Euclidean distance
from vector $\tilde{X}$ to vector $\mu_i$.
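Rule (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class references `mu_1` and `mu_2` and the test vectors are invented, and the distance is taken without a square root, as the product $(\tilde{X} - \mu_i)^T(\tilde{X} - \mu_i)$ suggests.

```python
# Sketch of a minimum-distance classifier in the sense of rule (5).

def squared_distance(x, mu):
    """d_e(X, mu) = (X - mu)^T (X - mu)."""
    return sum((a - b) ** 2 for a, b in zip(x, mu))

def classify(x, references):
    """Return the 1-based index of the class whose reference is nearest to x."""
    distances = [squared_distance(x, mu) for mu in references]
    return distances.index(min(distances)) + 1

mu_1 = [0.5, 0.3, 0.2]  # hypothetical reference of class 1
mu_2 = [0.2, 0.3, 0.5]  # hypothetical reference of class 2

label_a = classify([0.45, 0.35, 0.20], [mu_1, mu_2])  # close to mu_1
label_b = classify([0.25, 0.25, 0.50], [mu_1, mu_2])  # close to mu_2
```

The same function handles more than two classes by passing a longer list of references, which matches the remark that the two-class case generalizes easily.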
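In terms of the hyperplane that separates the speaker classes $\Omega_1$ and $\Omega_2$, rule (5)
for signals without noise takes the form:
$$H_{12}(\tilde{X}) = (\tilde{X}_y - \mu_1)^T(\tilde{X}_y - \mu_1) - (\tilde{X}_y - \mu_2)^T(\tilde{X}_y - \mu_2) = 0, \tag{6}$$
or
$$H_{12}(\tilde{X}) = 2\tilde{X}_y^T(\mu_2 - \mu_1) + \mu_1^T\mu_1 - \mu_2^T\mu_2 = 0. \tag{7}$$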
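The step from (6) to (7) is a plain expansion of the two quadratic forms, and it can be checked numerically. The sketch below assumes that $H_{12}$ is the squared distance to $\mu_1$ minus the squared distance to $\mu_2$; the vectors are hypothetical.

```python
# Sketch: the distance-difference form (6) and the linear form (7) agree.

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def h12_from_distances(x, mu_1, mu_2):
    """Form (6): (x - mu_1)^T (x - mu_1) - (x - mu_2)^T (x - mu_2)."""
    d1 = [a - b for a, b in zip(x, mu_1)]
    d2 = [a - b for a, b in zip(x, mu_2)]
    return dot(d1, d1) - dot(d2, d2)

def h12_linear(x, mu_1, mu_2):
    """Form (7): 2 x^T (mu_2 - mu_1) + mu_1^T mu_1 - mu_2^T mu_2."""
    diff = [b - a for a, b in zip(mu_1, mu_2)]
    return 2.0 * dot(x, diff) + dot(mu_1, mu_1) - dot(mu_2, mu_2)

mu_1 = [0.5, 0.3, 0.2]   # hypothetical class references
mu_2 = [0.2, 0.3, 0.5]
x = [0.45, 0.35, 0.20]   # hypothetical noise-free description vector

value_6 = h12_from_distances(x, mu_1, mu_2)
value_7 = h12_linear(x, mu_1, mu_2)
```

Both forms give the same value for any input, and with this sign convention the value is negative for a vector that lies closer to $\mu_1$.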
Substituting (4) into (7), we form the equation of the hyperplane in the presence of noise, assuming
that the classifier was trained on the signal without noise:
$$H_{12}^{r}(\tilde{X}) = 2\tilde{X}_y^T(\mu_2 - \mu_1) + \frac{2}{r^2}\tilde{X}_\xi^T(\mu_2 - \mu_1) + \mu_1^T\mu_1 - \mu_2^T\mu_2 = 0, \tag{8}$$
or
$$H_{12}^{r}(\tilde{X}) = H_{12}(\tilde{X}) + \frac{2}{r^2}\tilde{X}_\xi^T(\mu_2 - \mu_1) = 0.$$
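The additive noise term in (8) can be evaluated directly to see how it scales with the signal-to-noise ratio. The sketch below is illustrative only: the normalized noise vector, the class references and the two values of $r^2$ are assumptions.

```python
# Sketch: size of the hyperplane shift (2 / r^2) * X~_xi^T (mu_2 - mu_1) from (8).

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def noise_shift(x_xi_norm, mu_1, mu_2, r2):
    """Additive term that noise contributes to the decision function."""
    diff = [b - a for a, b in zip(mu_1, mu_2)]
    return 2.0 / r2 * dot(x_xi_norm, diff)

mu_1 = [0.5, 0.3, 0.2]       # hypothetical class references
mu_2 = [0.2, 0.3, 0.5]
x_xi_norm = [0.1, 0.2, 0.7]  # hypothetical normalized noise spectrum

shift_high_snr = noise_shift(x_xi_norm, mu_1, mu_2, r2=100.0)
shift_low_snr = noise_shift(x_xi_norm, mu_1, mu_2, r2=10.0)
```

In this example the noise energy sits mostly in the band where $\mu_2$ exceeds $\mu_1$, so the shift is positive, and it grows tenfold when $r^2$ drops from 100 to 10.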

From equation (8) it is seen that noise shifts the hyperplane towards one of the classes,
$\Omega_1$ or $\Omega_2$, depending on the sign of the vector $(\mu_2 - \mu_1)$. Since the power of
the instantaneous spectrum of the quasi-deterministic speech signal in the $i$-th frequency band is a
random function [6], the spectral description vector $\tilde{X}$ is a random vector characterized by
a multidimensional normal distribution. Then (7) and (8) are described by the density of a
one-dimensional normal distribution.
The mean value of the decision function (7) will be:
$$E\{H_{12}(\tilde{X})\} = 2E\{\tilde{X}_y^{0\,T}\}(\mu_2 - \mu_1) + \mu_1^T\mu_1 - \mu_2^T\mu_2. \tag{9}$$
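And for the decision function (8), assuming
$E\left\{\frac{2}{r^2}\tilde{X}_\xi^{0\,T}\right\} = \frac{2}{r^2}\tilde{X}_\xi^{T}$,
$$E\{H_{12}^{r}(\tilde{X})\} = 2E\{\tilde{X}_y^{0\,T}\}(\mu_2 - \mu_1) + \frac{2}{r^2}\tilde{X}_\xi^T(\mu_2 - \mu_1) + \mu_1^T\mu_1 - \mu_2^T\mu_2. \tag{10}$$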
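The variance of the decision function (7) will be:
$$\sigma_H^2 = E\left\{\left[2\tilde{X}_y^{0\,T}(\mu_2 - \mu_1) + \mu_1^T\mu_1 - \mu_2^T\mu_2 - \left[2E\{\tilde{X}_y^{0\,T}\}(\mu_2 - \mu_1) + \mu_1^T\mu_1 - \mu_2^T\mu_2\right]\right]^2\right\}. \tag{11}$$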
After simplification we obtain
$$\sigma_H^2 = 2(\mu_2 - \mu_1)^T \Sigma_i^0 (\mu_2 - \mu_1), \quad i = 1, 2, \tag{12}$$
where $\Sigma_i^0$ is the covariance matrix of the $i$-th class.
To simplify the expression we assume $\Sigma_1^0 = \Sigma_2^0 = \Sigma^0$.

Substituting (8) into (11), we obtain for the decision function $H_{12}^{r}(\tilde{X})$ an
expression analogous to equation (12).
Consequently, when $\tilde{X} \in \Omega_1$, the decision functions $H_{12}(\tilde{X})$ and
$H_{12}^{r}(\tilde{X})$ follow the normal laws $N_1(m_1^0, \sigma_H)$ and
$N_1\!\left(m_1^0 + \frac{2}{r^2}\tilde{X}_\xi^T(\mu_2 - \mu_1),\ \sigma_H\right)$, respectively.
The mean value $m_1^0$ is obtained by substituting $E\{\tilde{X}_y^0\} = \mu_1$ into (9):
$$m_1^0 = \mu_2^T(2\mu_1 - \mu_2) - \mu_1^T\mu_1. \tag{13}$$
Similarly, if $\tilde{X} \in \Omega_2$, the decision functions $H_{12}(\tilde{X})$ and
$H_{12}^{r}(\tilde{X})$ follow the normal laws $N_2(m_2^0, \sigma_H)$ and
$N_2\!\left(m_2^0 + \frac{2}{r^2}\tilde{X}_\xi^T(\mu_2 - \mu_1),\ \sigma_H\right)$, respectively.
The mean value $m_2^0$ is obtained by substituting $E\{\tilde{X}_y^0\} = \mu_2$ into (9):
$$m_2^0 = \mu_1^T\mu_1 - \mu_2^T(2\mu_1 - \mu_2). \tag{14}$$
Thus, the probability of errors of the first and second kind for the decision function (7) in the
absence of noise is calculated by the equation
$$P(e) = P(\Omega_1)\,P\big(H_{12}(\tilde{X}) < \theta \mid \tilde{X} \in \Omega_1\big) + P(\Omega_2)\,P\big(H_{12}(\tilde{X}) > \theta \mid \tilde{X} \in \Omega_2\big), \tag{15}$$
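where $\theta$ is the recognition threshold.
Taking into account (13) and (14), we obtain:
$$P\big(H_{12}(\tilde{X}) < \theta \mid \tilde{X} \in \Omega_1\big) = \frac{1}{\sqrt{2\pi}\,\sigma_H}\int_{-\infty}^{\theta} \exp\!\left(-\frac{[H_{12}(\tilde{X}) - m_1^0]^2}{2\sigma_H^2}\right) dH_{12}(\tilde{X}), \tag{16}$$
$$P\big(H_{12}(\tilde{X}) > \theta \mid \tilde{X} \in \Omega_2\big) = \frac{1}{\sqrt{2\pi}\,\sigma_H}\int_{\theta}^{\infty} \exp\!\left(-\frac{[H_{12}(\tilde{X}) - m_2^0]^2}{2\sigma_H^2}\right) dH_{12}(\tilde{X}). \tag{17}$$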
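Substituting (16), (17) into (15), we obtain:
$$P(e) = P(\Omega_1)\,\Phi^{*}\!\left(\frac{\theta - m_1^0}{\sigma_H}\right) + P(\Omega_2)\,\Phi^{*}\!\left(\frac{m_2^0 - \theta}{\sigma_H}\right), \tag{18}$$
where $\Phi^{*}(\cdot)$ is the Laplace function.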
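As a numerical illustration of (18), the sketch below evaluates the error probability for two values of $\sigma_H$. It assumes that $\Phi^{*}$ denotes the standard normal distribution function and that the arguments are $(\theta - m_1^0)/\sigma_H$ and $(m_2^0 - \theta)/\sigma_H$, as follows from integrating (16) and (17). The equal priors, the symmetric mean $m_2^0 = -m_1^0$ and the threshold $\theta = 0$ are assumptions made only for this example.

```python
import math

def phi(x):
    """Standard normal distribution function, used here as the Laplace function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def error_probability(p1, p2, m1, m2, sigma, theta):
    """P(e) = P(O1) * Phi((theta - m1) / sigma) + P(O2) * Phi((m2 - theta) / sigma)."""
    return p1 * phi((theta - m1) / sigma) + p2 * phi((m2 - theta) / sigma)

# Hypothetical parameters: equal priors, m1 = 1, m2 = -1, threshold at 0.
p_wide = error_probability(0.5, 0.5, 1.0, -1.0, sigma=1.0, theta=0.0)
p_narrow = error_probability(0.5, 0.5, 1.0, -1.0, sigma=0.5, theta=0.0)
```

Halving $\sigma_H$ at fixed means lowers the error from about 0.159 to about 0.023, which shows how strongly the result depends on the ratio $m_1^0 / \sigma_H$.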
Choosing a binary loss function (0 for correct recognition, 1 for an error), the threshold value
$\theta$ will be determined by the ratio:
$$\frac{P\big(H_{12}(\tilde{X}) = \theta \mid \tilde{X} \in \Omega_1\big)}{P\big(H_{12}(\tilde{X}) = \theta \mid \tilde{X} \in \Omega_2\big)} = \frac{P(\Omega_2)}{P(\Omega_1)} = \theta_0. \tag{19}$$
Substituting (16) and (17) into (19), we obtain:
$$\theta = \frac{\sigma_H^2}{m_1^0 - m_2^0}\,\ln\theta_0 + \frac{m_1^0 + m_2^0}{2}. \tag{20}$$
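The threshold in (20) can be checked against the ratio condition (19) numerically. The sketch below assumes the form $\theta = \sigma_H^2 \ln\theta_0 / (m_1^0 - m_2^0) + (m_1^0 + m_2^0)/2$, which is what equating the ratio of the two normal densities to $\theta_0$ gives; all parameter values are hypothetical.

```python
import math

def normal_pdf(h, m, sigma):
    """Density of a one-dimensional normal law N(m, sigma) at the point h."""
    z = (h - m) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def threshold(m1, m2, sigma, theta0):
    """Threshold at which the density ratio of class 1 to class 2 equals theta0."""
    return sigma ** 2 * math.log(theta0) / (m1 - m2) + (m1 + m2) / 2.0

# Hypothetical parameters: m1 = 1, m2 = -1, sigma = 0.5, prior ratio theta0 = 2.
theta = threshold(1.0, -1.0, 0.5, 2.0)
ratio = normal_pdf(theta, 1.0, 0.5) / normal_pdf(theta, -1.0, 0.5)
theta_equal_priors = threshold(1.0, -1.0, 0.5, 1.0)
```

For equal priors ($\theta_0 = 1$) the logarithm vanishes and the threshold falls midway between the two means.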
Replacing $m_1^0$, $m_2^0$ and $\sigma_H^2$ in (20) by their values, we obtain the threshold value
$$\theta = \Sigma^0 \ln\theta_0. \tag{21}$$
In the presence of noise the threshold value $\theta$ is written as:
$$\theta_\xi = \frac{4}{r^2}\tilde{X}_\xi^T(\mu_2 - \mu_1) + \Sigma^0\ln\theta_0 = \Delta_\xi + \theta. \tag{22}$$
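Taking into account (22), the probability of errors of the first and second kind for the decision
function (8), in the presence of noise, will be:
$$P_\xi(e) = P(\Omega_1)\,\Phi^{*}\!\left(\frac{\theta + \Delta_\xi - m_1^0}{\sigma_H}\right) + P(\Omega_2)\,\Phi^{*}\!\left(\frac{m_2^0 - \Delta_\xi - \theta}{\sigma_H}\right). \tag{23}$$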
Equation (23) determines the dependence of the recognition error on the presence of noise in the
speech signal. The advantage of the proposed method of accounting for the impact of noise in the
speech signal on the operational reliability of the information-measuring voice recognition system
is that there is no need to integrate the probability density of the informative features in
$n$-dimensional space.
Formulas (18) and (23) were used to study the dependence of the voice recognition error on the
noise present in the speech signal. The results are presented in Fig. 1.
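A study of this kind can be sketched as a sweep over the noise-induced threshold shift. The code below is not the authors' computation: it assumes that $\Phi^{*}$ is the standard normal distribution function, that (23) differs from (18) only by a shift $\Delta_\xi$ added to the threshold, and that the priors are equal with $m_2^0 = -m_1^0$ and $\theta = 0$. The values $m_1^0 = 1$ with $\sigma_H = 1$ and $\sigma_H = 0.5$ are the two parameter sets named in the caption of Fig. 1.

```python
import math

def phi(x):
    """Standard normal distribution function, used here as the Laplace function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def noisy_error(p1, p2, m1, m2, sigma, theta, delta):
    """Error probability with the threshold shifted by delta, in the spirit of (23)."""
    first = p1 * phi((theta + delta - m1) / sigma)
    second = p2 * phi((m2 - delta - theta) / sigma)
    return first + second

# Sweep the hypothetical shift delta from 0 to 1 in steps of 0.1.
curves = {}
for sigma in (1.0, 0.5):
    curves[sigma] = [
        noisy_error(0.5, 0.5, 1.0, -1.0, sigma, 0.0, step / 10.0)
        for step in range(11)
    ]
```

Under these assumptions both curves rise monotonically with the shift, and the curve for $\sigma_H = 0.5$ starts from a much lower error level than the curve for $\sigma_H = 1$.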
Fig. 1. Dependence of recognition errors on the magnitude of noise in the speech signal:
1: $m_1^0 = 1$, $\sigma_H = 1$; 2: $m_1^0 = 1$, $\sigma_H = 0.5$

CONCLUSIONS
As a result of the research, a mathematical model of the influence of noise on the reliability of
voice recognition has been developed; it eliminates the need to integrate over the region of
erroneous decisions in a multidimensional feature space. The analytical derivations and
experimental results showed that:
– reducing the influence of noise on the operational reliability of the information-measuring voice
recognition system by filtering the noise is efficient for informative features with high
separation of the speaker classes, that is, if $\sigma_H \le m_1^0 / 2$;
– filtering noise of large amplitude is more efficient for voice recognition than filtering noise
of small amplitude.
REFERENCES
1. Давенпорт В. Б., Рут В. Л. Введение в теорию случайных сигналов и шумов. – М.: ИЛ, 1970. – 498 с.
2. Рамишвили Г.С. Автоматическое опознавание говорящего по голосу. – М.: Радио и связь, 1981. – 224 с.
3. Фант Г. Акустическая теория речеобразования. – М.: Наука, 1964. – 284 c.
4. Рабинер Л. Р., Шафер Р. В. Цифровая обработка речевых сигналов: Пер. с англ. – М.: Радио и связь, 1981. – 496 с.
5. Ротштейн А. П. Интеллектуальные технологии идентификации: нечеткие множества, генетические
алгоритмы, нейронные сети. – Винница: УНИВЕРСУМ-Вінниця, 1999. – 320 с.
6. Харкевич А. А. Спектры и анализ. – М.: Физматгиз, 1962. – 320 с.
Bykov Mykolay – Cand.Sc.(Eng), Professor, Department of Computer Control Systems.
Kovtun Vyacheslav – Cand.Sc. Assist. Professor, Department of Computer Control Systems.
Savinova Natalia – Post-graduate, Department of Computer Control Systems.
Vinnytsia National Technical University.