Discriminating between Nasal and Mouth Breathing
Peng Yuan
B00541592
BSc (Hons) Computing Science
Supervisor: Dr. Kevin Curran
Interim Report, April 2010
Acknowledgements
I would like to extend sincere thanks and appreciation to my project supervisor, Dr. Kevin
Curran, for the initial idea of the project and his invaluable help and guidance throughout this
year.
I would also like to show my appreciation to all the staff in the library and labs for their kind
help and assistance during the last few months. Lastly, I would like to thank my parents for
their unending support during my studies abroad, and all my fellow classmates for helping
me in many ways throughout the project.
Abbreviations
3D      Three Dimension
AFS     Amplitude Frequency Spectrum
ANN     Artificial Neural Network
BPNN    Back-propagation Neural Network
CDSS    Clinical Decision Support System
CORSA   Computerized Respiratory Sound Analysis
DFT     Discrete Fourier Transform
DTFT    Discrete Time domain Fourier Transform
EPD     End-point Detection
FFT     Fast Fourier Transform
LPC     Linear Predictive Coding
LPCC    Linear Predictive Cepstrum Coefficients
MFCC    Mel Frequency Cepstrum Coefficients
UML     Unified Modeling Language
HOD     High-order Difference
HR      Heart Rate
RIP     Respiratory Inductive Plethysmography
RR      Respiratory Rate
STFT    Short Time Fourier Transform
ZCR     Zero Crossing Rate
Table of Contents
Acknowledgements
Abbreviations
Table of Contents
Declaration
Abstract
1. Introduction
    1.1 Aims and Objectives
    1.2 Outline of Thesis
2. Literature Review
    2.1 Introduction to the exploitation of biological sound
    2.2 The study of breathing pattern
    2.3 Respiratory rate monitoring
    2.4 Sound Analysis and Feature Extraction
        2.4.1 Filter Analysis of Sound Signal
        2.4.2 Feature Extraction
    2.5 Digital signal Characteristics in Time Domain
        2.5.1 Short-time Energy
        2.5.2 the Average Zero Crossing Rate of Short-time
    2.6 Digital signal Characteristics in Frequency Domain
        2.6.1 the Linear Predictive Coding (LPC) parameters
        2.6.2 the algorithm for LPC parameters
        2.6.3 the Linear Predictive Cepstrum Coefficients (LPCC)
        2.6.4 the Mel Frequency Cepstrum Coefficients (MFCC)
    2.7 Artificial Neural Network
3. Requirements Analysis and Functional Specification
    3.1 Functional requirements
    3.2 Non-functional requirements
        3.2.1 The frequency range to use
        3.2.2 Placement of the sensor
    3.3 Summary
4. Design
    4.1 MATLAB
        4.1.1 Introduction of the related Matlab function
            4.1.1.1 Short time spectrum analysis
    4.2 Equipment
        4.2.1 Acoustic sensor
        4.2.2 Sound Recorder
    4.3 System architecture
    4.4 Data modeling
    4.5 Analyzing methods
    4.6 Sound Signal Pre-processing
        4.6.1 Cutting off frequency band
        4.6.2 Filter Design
        4.6.3 End-point Detection of the Signal
            4.6.3.1 Short-time average amplitude method
            4.6.3.2 Short-time energy method
            4.6.3.3 Short-time average zero crossing rate method
    4.7 the principle of Back-propagation Neural Network
        4.7.1 Feed-forward Calculation
        4.7.2 the rules of weights adjustment in BP Neural Network
        4.7.3 The breath pattern classification flowchart
5. Implementation
    5.1 Pre-processing
        5.1.1 Digital Filter Applications
        5.1.2 Apply filter to the digital signal
    5.2 Principles of Spectrum Analysis and Display
        5.2.1 Short Time FFT Spectrum Analysis of Discrete Signal
        5.2.2 the dynamic spectrum display of the Pseudo-color coded Mapping
        5.2.3 Broad-band spectrum and Narrow-band spectrum
        5.2.4 Pseudo-color mapping and display of the spectrum
        5.2.5 Implementation within Matlab
            5.2.5.1 function specgram(FileName, Winsiz, Shift, Base, Coltype);
            5.2.5.2 display the pseudo-color mapping graph
    5.3 Feature Extraction
        5.3.1 Introduction to End-point Detection
        5.3.2 End-point Detection Error
        5.3.3 the Zero Crossing Rate (ZRO)
        5.3.4 High-order Difference
    5.4 Back-propagation Neural Network Algorithm and Implementation
        5.4.1 design of the artificial neural network
        5.4.2 Back-propagation neural network implementation
            5.4.2.1 initialization of the network
            5.4.2.2 training samples
            5.4.2.3 calculate the actual output of the network
            5.4.2.4 adjust the weights
6. Evaluation
    6.1 Interface and Controls
    6.2 End-point Detection Evaluation
    6.3 Breath Pattern Detection Evaluation
7. Conclusion
    7.1 Future Work
References
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Table of Figures
Figure 1 Requirements analysis is in the first stage (Wikipedia)
Figure 2 System Use Case Diagram
Figure 3 System Sequence Diagram
Figure 4 An example of MATLAB simulation
Figure 5 Acoustic Sensor
Figure 6 Recorder
Figure 7 Proposed System Architecture
Figure 8 Proposed System Collaboration Diagram
Figure 9 Data Flow Diagram
Figure 10 Audio Signal Analysis
Figure 11 Low Pass Filter at cutoff frequency 1000 (Hz)
Figure 12 Bandpass filter at frequency range from 110 to 800 (Hz)
Figure 13 Frequency Response of several low-pass filters
Figure 14 Classify the breathing pattern
Figure 15 the overall procedure flowchart
Figure 16 Pre-processing flowchart
Figure 17 the signal pass through a low-pass filter
Figure 18 the spectrum of the audio signal
Figure 19 the spectrum of the audio signal
Figure 20 Feature Extraction flowchart
Figure 21 Zero Crossing Rate
Figure 22 End-point Detection using the ZCR and HOD
Figure 23 design of the tow-layers artificial back-propagation neural network
Figure 24 BP Neural Network training algorithm flowchart
Figure 25 The main interface
Figure 26 The Control part of the interface
Figure 27 One sound file opened by the program.
Figure 28 Prompt that inf
Figure 29 Displaying spectrum
Figure 30 Result displaying text box
Figure 31 Detection result showing hint
Figure 32 Original signal wave display
Figure 33 Mouth sound End-point Detection
Figure 34 Nasal sound End-point Detection
Figure 35 Mixed breath pattern sound End-point Detection
Figure 36 Nasal breath only breathing pattern detection
Figure 37 Breath pattern detection for mixed breathing
Declaration
“I hereby declare that for a period of two years following the date on which the dissertation is
deposited in the Library of the University of Ulster, the dissertation shall remain confidential
with access or copying prohibited.
Following the expiry of this period I permit the Librarian of the University of Ulster to allow
the dissertation to be copied in whole or in part without reference to me on the understanding
that such authority applies to the provision of single copies made for study purposes or for
inclusion within the stock of another library. This restriction does not apply to the copying or
publication of the title and abstract of the dissertation. IT IS A CONDITION OF USE OF
THIS DISSERTATION THAT ANYONE WHO CONSULTS IT MUST RECOGNISE
THAT THE COPYRIGHT RESTS WITH THE AUTHOR AND THAT NO QUOTATION
FROM THE DISSERTATION AND NO INFORMATION DERIVED FROM IT MAY BE
PUBLISHED UNLESS THE SOURCE IS PROPERLY ACKNOWLEDGED.”
[Peng Yuan]
Abstract
Switching from mouth breathing to nasal breathing is widely recommended for both patients and
healthy people, and is recognised as having a positive impact on daily life. This project attempts
to discriminate between nasal and mouth breathing patterns in sound files pre-recorded with an
acoustic sensor, and further aims to detect and classify the two patterns using a back-propagation
artificial neural network.
Two participants took part in the experiment. Several breath sound recordings, each approximately
half a minute long, were made while the participant sat on a chair in a quiet room.
The first purpose of this project is to investigate the recognition rate achieved when classifying
the breathing patterns. If that rate is high enough to distinguish the two patterns, the second
purpose is to detect them and to integrate the result into an intelligent device with an alarm
system and motivational feedback that helps patients change from mouth to nasal breathing.
The results illustrate that the breathing pattern can be discriminated at certain positions on the
body, both by visual inspection of the spectrum and by the self-built BP neural network classifier.
Sound files recorded with the sensor placed on the hollow showed the most promising accuracy,
above 90%. However, the performance for the mixed breathing pattern is not as good as for a single
pattern (nasal only or mouth only), and the reasons for this are analysed theoretically.
1. Introduction
It is well known that changing the breathing pattern from the mouth to the nose is good for an
individual's health, and doctors recommend it for healthy people and patients alike. The
purpose of this project is firstly to investigate the principles of automated discrimination of
breathing patterns using an acoustic sensor. If the two breathing types can be classified
with high accuracy at certain sensor locations, the project secondly aims to implement the
discrimination and integrate it into a decision support system device, and to optimise the
algorithms so that the motivational feedback system becomes more intelligent and works in
various environments with improved classification accuracy.
1.1 Aims and Objectives
Two participants took part in the experiment. Several breath sound recordings, each
approximately half a minute long, were made while the participant sat on a chair in a quiet room.
The first purpose of this project is to investigate the recognition rate achieved when classifying
the breathing patterns. If that rate is high enough to distinguish the two patterns, the second
purpose is to detect them and to integrate the result into an intelligent device with an alarm
system and motivational feedback that helps patients change from mouth to nasal breathing.
1.2 Outline of Thesis
The final report contains seven chapters. An overview of the content of each chapter is given
below.
Chapter 2 presents the development of the use of biological sound as a reference for diagnosing
disease, along with attempts to monitor the status inside the body to assist treatment. It also
gives background information for this project, including relevant research on sound signal
analysis and processing methods and on artificial neural networks.
Chapter 3 performs the requirements analysis, covering the functional and non-functional
requirements as well as the use case diagrams and sequence diagrams based on the user
requirements. A summary of the specification of this project is also presented.
Chapter 4 gives a brief introduction to the programming language and analysis tools, as well
as the equipment, that will be used in this project. It then presents the initial overall system
architecture design and data flow diagram along with the proposed specification of the
implementation. The digital sound signal analysis methods and the BP neural network are also
given an initial design.
Chapter 5 details the implementation of each previously designed technique used within this
project, such as the band-pass filter, end-point detection, pseudo-colour display and the
back-propagation neural network.
Chapter 6 is the evaluation stage, which tests the performance of the application. The interface
is briefly introduced in this chapter, and the detection results are displayed and analysed with
several figures.
Chapter 7 is the final chapter of this report. It summarises the project and the work that remains
to be done in the future. The summary covers everything done so far and briefly discusses the
results obtained, as well as the problems that still exist and are good topics for future work.
2. Literature Review
2.1 Introduction to the exploitation of biological sound
It is widely known that normal lung sounds vary between subjects. Lung sounds also vary
within a single day and across several consecutive days (Mahagna and Gavriely, 1994). Given
these major variations, changes specific to nasal and mouth breathing will only become
apparent from a study of a larger number of subjects. The aim of analysing breathing patterns
with the help of a computer is to understand and store them objectively. Because human
hearing is less sensitive at low frequencies, especially below 300 Hz, a computer-aided device
that can record sound in that low-frequency range is an essential aid to sound recognition.
Since the equipment is non-invasive and the whole procedure is inexpensive, it has the potential
to be used by healthy people and even by pneumonia patients, who are regarded as a high-risk
group. Thus the analysis of lung sound spectrograms could be used for patients with incipient
pneumonia, to help avoid the appearance of radiological abnormalities or a further worsening
of the condition in daily life.
2.2 The study of breathing pattern
Forgacs et al. (1971) showed that the intensity of breath sounds at the mouth is related to the
forced expiratory volume in one second in patients with chronic bronchitis and asthma. With
the development of modern signal processing technology, there is considerable potential to
assess whether the state of the ventilatory system is reflected in the respiratory sound signal.
Much of this work focuses on variable wheezing sounds, whose frequency band distribution can
be seen clearly in the lung sound spectrogram. Malmberg et al. (1994) investigated the
connection between the ventilatory system and the frequency spectrum of the breathing sound
in healthy subjects and in patients with asthma. They found that in asthmatics the frequency
distribution of the breath sound, in terms of median frequency, reflected acute changes in
airway obstruction with high sensitivity and specificity. The patient emergency care system
is influenced by a large amount of vital-sign information (Brown, 1997). The respiratory
rate is a vital sign that accurately reflects potential illness and, if used correctly, is an
essential marker of metabolic dysfunction that supports decision making in a hospital
(Gravelyn and Weg, 1980; Krieger et al., 1986). Primary clinical courses show the importance
of changes in breathing rate, and the need for non-invasive, reliable respiratory monitoring
devices suitable for prolonged use has long been recognised.
2.3 Respiratory rate monitoring
The Respi-Check Respiratory Rate Monitor (Intersurgical, 2009) is suitable for adults and
older children. This electronic device attaches an infrared sensor to the Respi-Check breathing
mask as an indicator (Breakell and Townsend-Rose, 2001). The respiratory rate is then detected
continuously and the result is shown on a digital screen. The pre-installed audio and visual
alarm system is activated if no breathing is detected for 10 consecutive seconds. A separate
alarm warns of cable disconnection and of the need to change the battery. Researchers at the
University of Arkansas built and tested two similar, slightly different biosensors that can
detect important physiological signs. Smart vests and fabrics are typical applications of
organic semiconductors, which enable manufacturers to make light, flexible devices that can be
integrated easily with biomedical applications.
2.4 Sound Analysis and Feature Extraction
Based on research into the characteristics of the human voice and hearing, researchers have
developed many theoretical models for sound signal analysis, such as the Short Time Fourier
Transform, Filter Analysis, LPC Analysis and Wavelet Analysis (Jedruszek; Walker, 2003).
These theories have been widely used for sound coding, sound synthesis and sound feature
extraction. In Linear Predictive Coding (LPC), the features of the sound are extracted by
calculating the linear prediction coefficients at different orders. Filter Analysis first
separates the frequency content of the sound signal using a bandpass filter, then extracts
frequency features by simulating the function of the biological nerve cells of the hearing
system.
2.4.1 Filter Analysis of Sound Signal
The main part of Filter Analysis is a bandpass filter, which is used to separate and extract the
different useful frequency bands of the signal. A complete Filter Analysis model, however, is a
bandpass filter followed by non-linear processing, a low-pass filter, resampling at a lower
sampling rate, and compression of the signal's amplitude. Common functions for the non-linear
processing are the sine function and the triangular window function. To smooth abrupt changes
in the signal, it is passed through a low-pass filter after the non-linear processing. The
optional steps of resampling the signal at a lower rate and compressing its amplitude aim to
reduce the amount of calculation at later stages.
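The stages that follow the bandpass filter can be illustrated with a minimal Python sketch (Python is used here only for illustration, not because the project uses it). It assumes the input has already been band-limited, and it substitutes full-wave rectification for the non-linear stage, a moving average for the low-pass filter, and a logarithm for the amplitude compression. These substitutions and all function names are assumptions made for this example, not the method described in the cited work.

```python
import math


def rectify(samples):
    """Non-linear stage (assumed here: full-wave rectification)."""
    return [abs(s) for s in samples]


def moving_average(samples, width):
    """Crude low-pass filter: mean of the last `width` samples."""
    out = []
    running = 0.0
    for i, s in enumerate(samples):
        running += s
        if i >= width:
            running -= samples[i - width]  # drop the sample leaving the window
        out.append(running / min(i + 1, width))
    return out


def downsample(samples, factor):
    """Keep every `factor`-th sample to reduce later computation."""
    return samples[::factor]


def log_compress(samples):
    """Amplitude compression; inputs must be non-negative."""
    return [math.log1p(s) for s in samples]


def channel_envelope(band_limited, width, factor):
    """Chain the stages for one already band-limited channel."""
    smoothed = moving_average(rectify(band_limited), width)
    return log_compress(downsample(smoothed, factor))
```

Calling `channel_envelope(samples, width=2, factor=2)` on one band-limited channel returns a shorter, smoothed and compressed envelope. A real system would use properly designed filters in place of the moving average.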
2.4.2 Feature Extraction
The fundamental problem of sound signal recognition lies in which features and characteristics
of the signal to choose and how to choose them. A sound signal is a typical time-varying
signal. However, when it is observed at the millisecond level, it shows periods in which it is
approximately stable, and features are extracted from these stable segments to represent the
original signal.
In general, the characteristic parameters of a sound signal fall into two types: features in the
time domain, and features in the frequency domain obtained after transforming the signal.
Time-domain characteristic parameters are usually formed directly from the sample values in one
frame, such as the short-time average amplitude or the short-time average zero crossing rate.
Features of the other type are obtained by transforming the original signal into the frequency
domain, for example with the Fast Fourier Transform, to obtain the LPCC or MFCC features of the
signal. The former type has the advantage of simple calculation, but its feature parameters have
a large dimension and are not suitable for representing amplitude spectrum features. The latter
type, by contrast, requires a more complex transform calculation but can characterise the
amplitude spectrum of the signal from several different angles.
2.5 Digital Signal Characteristics in Time Domain
2.5.1 Short-time Energy
The short-time energy of the sound signal reflects how its amplitude changes over time. For frame m of length N, with samples $x_m(n)$, the mathematical description is:
$$E(m) = \sum_{n=0}^{N-1} x_m^2(n)$$
2.5.2 Short-time Average Zero Crossing Rate
In a discrete time-domain signal, a zero crossing occurs when adjacent sample values have different algebraic signs, such as 3 followed by -1, or -2 followed by 2 (Mocree and Barnwell; Molla, and Keikichi, 2004). As the sound signal is a broad-band signal, the zero crossing rate is computed over short frames of the original signal so that the feature is extracted precisely. The short-time zero crossing rate is defined as:
$$Z_n = \sum_{m=-\infty}^{\infty} \left| \operatorname{sgn}[x(m)] - \operatorname{sgn}[x(m-1)] \right| \, w(n-m)$$
where
$$\operatorname{sgn}[x(n)] = \begin{cases} 1, & x(n) \ge 0 \\ -1, & x(n) < 0 \end{cases}$$
and
$$w(n) = \begin{cases} \dfrac{1}{2N}, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases}$$
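The two time-domain features above can be sketched in a few lines. This is a minimal illustration in Python rather than the Matlab used in the project; the function names, the non-overlapping framing and the rectangular weighting of 1/(2N) are assumptions made for the example.

```python
def frames(signal, frame_len):
    """Split a list of samples into consecutive, non-overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def short_time_energy(frame):
    """Sum of the squared samples in one frame."""
    return sum(x * x for x in frame)


def sgn(x):
    """Sign convention assumed here: +1 for x >= 0, -1 otherwise."""
    return 1 if x >= 0 else -1


def zero_crossing_rate(frame):
    """Sum of |sgn(x(n)) - sgn(x(n-1))| weighted by 1/(2N), N = frame length."""
    n = len(frame)
    changes = sum(abs(sgn(frame[i]) - sgn(frame[i - 1])) for i in range(1, n))
    return changes / (2 * n)


# The first frame contains the sign changes used as examples in the text
# (3 followed by -1, and -2 followed by 2); the second frame has none.
signal = [3, -1, -2, 2, 1, 1, 1, 1]
for frame in frames(signal, 4):
    print(short_time_energy(frame), zero_crossing_rate(frame))
```

A frame with many sign changes gets a high rate and a frame with none gets zero, which is what makes the measure usable as a rough indicator of frequency content.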
2.6 Digital Signal Characteristics in Frequency Domain
2.6.1 The Linear Predictive Coding (LPC) parameters
LPC analysis is based on the theory that the current sample of the signal can be approximated by a linear combination of several previous samples. The LPC parameters are obtained by minimising the mean squared error between the actual sample values and the linearly predicted sample values.
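2.6.2 The algorithm for LPC parameters
For a linear predictive model, the value of the n-th sample s(n) is expressed by a linear combination of the p sample points before n:
$$s(n) \approx a_1 s(n-1) + a_2 s(n-2) + \dots + a_p s(n-p)$$
where a1, a2, ... ap are constants. The above equation can then be written as:
$$s(n) = \sum_{i=1}^{p} a_i s(n-i) + G u(n)$$
where Gu(n) is the product of the normalised impulse excitation u(n) and the gain coefficient G. The approximate system output is then defined as:
$$\bar{s}(n) = \sum_{i=1}^{p} a_i s(n-i)$$
so the approximation error of the system is:
$$e(n) = s(n) - \bar{s}(n) = s(n) - \sum_{i=1}^{p} a_i s(n-i)$$
The linear prediction error is equal to the product of the impulse and the gain coefficient, that is:
$$e(n) = G u(n)$$
Define the short-time sound signal and error as below:
$$s_n(m) = s(n+m), \qquad e_n(m) = e(n+m)$$
Then the sum of the squared errors is:
$$E_n = \sum_{m} e_n^2(m) = \sum_{m} \left[ s_n(m) - \sum_{i=1}^{p} a_i s_n(m-i) \right]^2$$
Taking the partial derivative of $E_n$ with respect to each LPC parameter $a_j$ (j = 1, 2, ... , p) and setting each result to zero gives the following equations, whose solution is written $\bar{a}_i$:
$$\sum_{m} s_n(m-j)\, s_n(m) = \sum_{i=1}^{p} \bar{a}_i \sum_{m} s_n(m-j)\, s_n(m-i), \qquad 1 \le j \le p$$
Based on the correlation function:
$$\varphi_n(j, i) = \sum_{m} s_n(m-j)\, s_n(m-i)$$
then:
$$\varphi_n(j, 0) = \sum_{i=1}^{p} \bar{a}_i\, \varphi_n(j, i), \qquad j = 1, 2, \dots, p$$
The above expression is a set of p equations in p unknowns, and the LPC parameters are obtained by solving these equations.
The minimum squared error of the system is expressed as:
$$\bar{E}_n = \sum_{m} s_n^2(m) - \sum_{i=1}^{p} \bar{a}_i \sum_{m} s_n(m)\, s_n(m-i) = \varphi_n(0, 0) - \sum_{i=1}^{p} \bar{a}_i\, \varphi_n(0, i)$$
There are many ways to solve the above equations, such as the autocorrelation method (using Durbin's recursion) and the covariance method. In the autocorrelation method, $\varphi_n(j, i)$ depends only on |j - i| and is written R(|j - i|). The recurrence formula of Durbin’s method is:
$$E^{(0)} = R(0)$$
$$k_i = \frac{R(i) - \sum_{j=1}^{i-1} a_j^{(i-1)} R(i-j)}{E^{(i-1)}}$$
$$a_i^{(i)} = k_i$$
$$a_j^{(i)} = a_j^{(i-1)} - k_i\, a_{i-j}^{(i-1)}, \qquad 1 \le j \le i-1$$
$$E^{(i)} = (1 - k_i^2)\, E^{(i-1)}$$
where the superscript (i) denotes the i-th iteration. Only a1, a2, ... , ai are calculated and updated in each iteration, until i = p.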
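Durbin's recursion can be written out directly from the steps it consists of: a reflection coefficient is computed at each order, the lower-order coefficients are updated, and the prediction error shrinks. The following Python sketch is illustrative only (the project itself uses Matlab); the function names are hypothetical and the predictor convention s(n) ≈ Σ a_i s(n-i) is assumed.

```python
def autocorrelation(frame, max_lag):
    """R(k) = sum over m of s(m) * s(m + k), for k = 0 .. max_lag."""
    n = len(frame)
    return [sum(frame[m] * frame[m + k] for m in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations by Durbin's recursion.

    r holds R(0) .. R(order). Returns (a, err): a[i - 1] is the coefficient
    a_i and err is the prediction error after the final iteration.
    """
    a = [0.0] * (order + 1)        # a[0] is unused; a[i] holds a_i
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err              # reflection coefficient k_i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


# Autocorrelation values of an idealised first-order process, R(k) = 0.5 ** k.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], 2)
print(coeffs, err)
```

For this input a first-order predictor is already sufficient, so the second coefficient comes out as zero and the error does not decrease further at order 2.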
2.6.3 The Linear Predictive Cepstrum Coefficients (LPCC)
In sound recognition systems, the LPC parameters are seldom used directly. Instead, the Linear Predictive Cepstrum Coefficients (LPCC) derived from the LPC are used, because the cepstrum increases the stability of the coefficients. The recurrence relation between LPC and LPCC is as below:
$$c_n = a_n + \sum_{k=1}^{n-1} \frac{k}{n}\, c_k\, a_{n-k}, \qquad 1 \le n \le p$$
$$c_n = \sum_{k=n-p}^{n-1} \frac{k}{n}\, c_k\, a_{n-k}, \qquad n > p$$
where C0 is the DC component.
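The LPC-to-cepstrum recurrence is short enough to sketch in code. The version below is a Python illustration under the assumption that the predictor is written s(n) ≈ Σ a_i s(n-i); the function name is hypothetical and the DC term C0 is not computed.

```python
def lpc_to_lpcc(a, n_ceps):
    """Convert LPC coefficients a_1 .. a_p into cepstral coefficients c_1 .. c_n_ceps."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)           # c[0] (the DC term) is left at zero here
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Terms with n - k > p vanish, so k starts at max(1, n - p).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


print(lpc_to_lpcc([0.5], 3))
```

For a single coefficient a1 the cepstrum of the all-pole model is known in closed form, c_n = a1^n / n, which gives a simple way to check the recursion.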
2.6.4 The Mel Frequency Cepstrum Coefficients (MFCC)
The LPC parameters are acoustic features derived from research on the human voice production mechanism, whereas the Mel Frequency Cepstrum Coefficients (MFCC) are derived from research on the human hearing system. The basic theory is that when two tones of similar frequency appear at the same moment, only one tone is heard. The critical bandwidth is the bandwidth boundary at which the perceived sound changes suddenly: when the frequency difference between two tones is less than this boundary, people usually hear the two tones as one. This is called the masking (shielding) effect.
The Mel scale is one of the methods to measure the critical bandwidth, and the calculation of the MFCC is based on the Mel frequency. Its relationship to the linear frequency f (in Hz) is:
$$\mathrm{Mel}(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right)$$
MFCC is calculated per frame: first the power spectrum S(n) of the frame is obtained by the Fast Fourier Transform, and then it is transformed to the power spectrum on the Mel frequency scale. Before doing that, the signal is passed through several bandpass filters:
$$H_m(n), \qquad m = 0, 1, \dots, M-1; \quad n = 0, 1, \dots, N/2 - 1$$
where M is the number of filters and N is the frame length.
The process of MFCC calculation is (Mocree, 1995):
a. Choose the frame length (N), pre-emphasise each frame s(n) and apply the discrete FFT to it; the power spectrum S(n) is obtained by squaring the modulus of the result.
b. Get the power of S(n) passing through each of the M filters Hm(n), that is, calculate the sum of the products of S(n) and Hm(n) over the discrete frequencies. The result is the parameter Pm, m = 0, 1, ... , M-1.
c. Calculate the natural logarithm of Pm to get Lm, m = 0, 1, ... , M-1, then apply the discrete cosine transform to Lm to get Dm, m = 0, 1, ... , M-1.
d. Leave out D0, which represents the DC component; D1, D2, ... Dk are then taken as the MFCC.
However, the standard MFCC only shows the static features of the sound signal. To get the dynamic features, to which human hearing is more sensitive, the differential cepstrum is used, as shown below:
$$d(n) = \frac{1}{\sqrt{\sum_{i=-k}^{k} i^2}} \sum_{i=-k}^{k} i \cdot c(n+i)$$
where c and d are the parameters of one frame and k is a constant. With k = 2, the differential coefficient of the current frame is a linear combination of the two preceding frames and the two following frames. The equation above is called the first-order difference of the MFCC. Applying the same equation to the first-order difference gives the second-order difference, and so on. In practice, the MFCC and its differences of various orders are merged into one vector that serves as the MFCC of one frame of the signal.
In describing the acoustic features, lower-order coefficients cannot represent the sound signal precisely, while higher orders lead to more complicated calculations, so it is essential to choose an appropriate order. Most sound recognition systems use an order in the range from 10 to 15 for the LPC, LPCC or MFCC.
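Two of the MFCC building blocks, the Mel-scale conversion and the first-order difference across frames, can be sketched as follows. This is a Python illustration, not the project's Matlab code. It assumes the commonly used Mel formula 2595·log10(1 + f/700), and it assumes that frames beyond the start or end of the sequence are replaced by the nearest edge frame; the function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Commonly used Mel-scale formula: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def delta(ceps, k=2):
    """First-order difference of a list of per-frame coefficient vectors.

    d(n) = sum_{i=-k..k} i * c(n + i) / sqrt(sum_{i=-k..k} i^2).
    Frames outside the sequence are replaced by the nearest edge frame,
    which is an assumption made for this sketch.
    """
    norm = math.sqrt(sum(i * i for i in range(-k, k + 1)))
    last = len(ceps) - 1
    out = []
    for n in range(len(ceps)):
        vec = []
        for q in range(len(ceps[n])):
            acc = sum(i * ceps[min(max(n + i, 0), last)][q]
                      for i in range(-k, k + 1))
            vec.append(acc / norm)
        out.append(vec)
    return out


print(hz_to_mel(1000.0))
ramp = [[float(n)] for n in range(7)]   # one coefficient rising by 1 per frame
print(delta(ramp)[3])
```

A coefficient that stays constant across frames has a zero difference, while a steadily rising one gives a constant positive value away from the edges.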
2.7 Artificial Neural Network
The Artificial Neural Network (ANN), which is composed of a large number of small processing units, is an interconnected, nonlinear and self-adaptive information processing system. Inspired by the achievements of modern neuroscience research, the ANN tries to simulate the way a biological neural network processes and stores a large amount of information simultaneously.
In an artificial neural network, each processing unit represents a different object such as a feature, a letter, a sentence or a meaningful abstract pattern. The whole network is built from three main components: the input layer, the hidden layer and the output layer. The input layer receives signals and data from the outside world, and the output layer gives the result of the processing done by the network. The hidden layer is placed between the input and output layers and cannot be observed directly from outside the system. The number of hidden layers is flexible: the more hidden layers a network has, the more complex its computation becomes, and the more intricate the information it can deal with.
The weights between the neurons reflect the connection strength between the processing units, and the pattern of connections represents how the input information is expressed and processed. In general, an artificial neural network is a non-procedural, self-adaptive information processing system in the style of a biological brain. In essence, it gains the ability of parallel and distributed information processing by adjusting the connections and weights between its units. It is an interdisciplinary field that involves neural science, thinking science, artificial intelligence, computer science, etc.
3. Requirements Analysis and Functional Specification
User requirements analysis means determining the product-specific performance and documenting the customer needs that generate the functional and non-functional requirements. An initial analysis tries to define the project clearly and specify the proposed solution approach.
Figure 1 Requirements analysis is in the first stage (Wikipedia)
Requirements analysis is the first stage in the systems development process, followed by functional specification, software architecture, software design, implementation, software testing, software deployment and software maintenance.
3.1 Functional requirements
Functional requirements define the functions of the software based on the user requirements analysis, which specifies the inputs, outputs, external interfaces and other special management information needs. The basic functional requirements are listed below:
a. Collect recorded breath sounds, with a single nasal pattern, a single mouth pattern or mixed patterns, at different places on the body using the acoustic sensor.
b. Collect recorded breath sounds, with a single nasal pattern, a single mouth pattern or mixed patterns, from two participants.
c. Find the place that gives the best performance with the sensor and the highest accuracy in pattern detection.
d. Find out whether the time at which the sound file is recorded, for example in the morning or in the late afternoon, has an effect on the detection result.
e. Discriminate nasal and mouth breathing in short recordings with mixed patterns, to identify the shortest time needed for the analysis.
f. Improve the algorithm to detect the nasal and mouth breathing patterns with better performance in different situations.
Here is the system Use Case diagram:
Figure 2 System Use Case Diagram
Here is the System Sequence Diagram:
Figure 3 System Sequence Diagram
3.2 Non-functional requirements
Non-functional requirements do not relate to the functionality of the system but to how it performs its task, concerning attributes such as reliability, efficiency, usability, maintainability and portability. For this project the non-functional requirements are shown below:
a. The pre-recorded sound should be discriminated no matter when it was recorded and who it was recorded from.
b. The discrimination should be performed within a specific time; for example, when breathing with a certain pattern lasts for 3 seconds, it can then be detected.
c. The device should be easy to use for end users, non-invasive to the human body and unobtrusive in daily life.
d. The program should be easy to transfer from one device to another.
e. The system should be easy to maintain in follow-up usage.
3.2.1 The frequency range to use
For a healthy person, the frequency band of vesicular breathing sounds ranges from 0 all the way to 1000 Hz, while the power spectrum shows that the main energy lies between 60 and 600 Hz (Pasterkamp et al., 1997). Gross et al. (2000) also show that some other sounds, such as wheezing, are carried by frequencies over 2000 Hz. General lung sound detection uses low, middle and high frequency bands, which are 100 to 300, 300 to 600 and 600 to 1200 Hz respectively (Gross et al., 2000; Sanchez and Pasterkamp, 199; Soufflet et al., 1990; Shykoff et al., 1988). This project therefore focuses on the frequency band from 1 to 1200 Hz.
3.2.2 Placement of the sensor
This project focuses on frequencies below 1200 Hz. In order to explore the different sounds at different locations, the sensor was placed on five areas of the body: the chest, the chin, the hollow, the right shoulder and the left shoulder. The performance at each place should be assessed before feature extraction and before the data are passed through the BP neural network. As a very sensitive sensor is used, noise can seriously mislead the detection, so removing the noise is the first consideration before the pre-processing.
3.3 Summary
The first step is to find the best place for the sensor and to build the recording system, which records the breath sound inside the body as a digital file for analysis. Before the sound analysis there is a pre-processing step that removes the noise to facilitate detection at a later stage. Some places give good performance for the analysis system and others do not; this project also assesses recordings from different people and at different times of the day. Different frequency bands are used in different steps of the whole project, and working out which band to use at each step is the major problem to be solved. Discriminating the breath pattern in daily life outside the laboratory is the final aim of the report but, as far as is known (Baird and Neuman, 1992), a device for such everyday use has not yet reached product level.
4. Design
4.1 MATLAB
MATLAB is a numerical computing environment and fourth generation programming language (Wikipedia, MATLAB). It is a software package for engineering analysis that covers a wide range of engineering tasks. MATLAB has strong graphics capabilities that make it simple for users to draw the plots they require, and a powerful simulation capability: its analysis toolboxes contain hundreds of simulators with which the engineer can simulate a program to see how well it performs, and then modify and improve it.
Figure 4 An example of MATLAB simulation
4.1.1 Introduction of the related Matlab functions
This section introduces several Matlab functions related to spectrum analysis and display, as well as the equation derivation and the way to choose appropriate parameters.
4.1.1.1 Short time spectrum analysis
A. Frame and windowing function (Eva and Anders, 1999)
The windowing functions provided by Matlab include hamming(N), hanning(N), blackman(N) and bartlett(N), where N is the window length (frame length). Every windowing function has its own characteristics and is suited to different situations; in this project the Hamming window is applied to the original audio signal. Normally a power of 2, such as 512 or 1024, is chosen as the frame length (N) to facilitate the calculation of the Fast Fourier Transform (FFT), though any positive integer could be used.
B. Fast Fourier Transform (FFT) function
The function fft(S) is provided by Matlab, where the parameter ‘S’ is one frame of the windowed signal. Notice that the magnitude of the frequency-domain values of a real-valued signal after the FFT is symmetrical about the mid-point (that is, half the sampling frequency), so only the first half of the result of fft(S) is useful.
C. Get the conjugate of a complex number
Matlab provides the function conj(Z) to get the conjugate of the complex number ‘Z’. Here the parameter ‘Z’ is the result of fft(S). This function can also be used to calculate the amplitude |X(m, k)| of the complex value X(m, k), since X(m, k) multiplied by conj(X(m, k)) equals |X(m, k)|^2.
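The three steps above (window one frame, transform it, use the conjugate to get the power) can be tried out on a synthetic tone. The sketch below uses Python with a direct DFT in place of Matlab's fft, hamming and conj; the sampling rate, frame length and test tone are assumptions chosen for the example.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (symmetric form)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft(frame):
    """Direct O(N^2) DFT; an FFT returns the same values faster."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


fs = 8000                      # assumed sampling rate in Hz
N = 64                         # frame length, a power of two
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(N)]
windowed = [s * w for s, w in zip(tone, hamming(N))]

X = dft(windowed)
power = [(z * z.conjugate()).real for z in X]   # |X(k)|^2 = X(k) * conj(X(k))
half = power[:N // 2 + 1]                       # bins from 0 up to fs/2
peak_bin = max(range(len(half)), key=half.__getitem__)
print(peak_bin * fs / N)                        # frequency of the strongest bin
```

Because the input is real-valued, the power at bin k equals the power at bin N - k, which is why only the first half of the result needs to be kept.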
4.2 Equipment
The equipment needed to build the system is provided by Axelis Ltd. Here is a brief introduction to the equipment.
4.2.1 Acoustic sensor
The acoustic sensor used was supplied by Axelis Ltd. and is covered by United States Patent No: US 6,937,736, filed by Measurement Specialties Inc in 2005.
Figure 5 Acoustic Sensor
4.2.2 Sound Recorder
Along with the acoustic sensor there is a recorder connected to it, which records the sound sensed by the acoustic sensor directly onto a plugged-in flash drive.
Figure 6 Recorder
The recorder is designed to be easy to use, with a flash drive plugged into its USB port and a rechargeable battery that enables the user to record wherever appropriate. When recording begins, it automatically saves the sound to the flash drive as a high-quality Windows audio file, ready for analysis.
4.3 System architecture
System architecture is the overall hardware/software configuration and database design, including the subsystem components supporting a particular application. There is usually a mapping of functionality onto the hardware as well as the software components.
Creating the system architecture breaks the system down into small structural elements and subsystems, which simplifies the problem by dividing the whole system into reasonably independent pieces that can be solved separately.
Figure 7 Proposed System Architecture
Figure 7 shows how the proposed system works. First the acoustic sensor is worn by the person concerned. The sensor senses the sound inside the body of the user and passes the sensed data to a sound recorder. After the sound has been processed into a suitable format, the sound analyzer analyzes the sound file. If an inappropriate breath pattern is detected, the analyzer informs the alarm system, which gives proper recommendations.
Figure 8 Proposed System Collaboration Diagram
4.4 Data modeling
Data modeling creates a data model that describes the data flow in a piece of software. It defines and analyzes the data requirements needed to support the software, and it also presents the associated data and defines the relationships between the data components and structures. Here is the data flow diagram:
Figure 9 Data Flow Diagram
Figure 9 presents how the data flows through the whole process. After the sound data has been sensed by the acoustic sensor, it is stored in the sound recorder ready for processing; the processed data is then passed to the analyzer for analysis.
4.5 Analyzing methods
In the last few years, MATLAB has become a main tool for processing data and mathematical models. It is widely used in university research as well as in commercial products, and its power in dealing with mathematics is well recognised (Brandt, 2005).
One great advantage of using MATLAB for the analysis of audio signals is that the user is forced to understand the processing results more thoroughly than when using menu-based software.
Figure 10 Audio Signal Analysis
This also means that once the user has passed this initial threshold, he will become specialized in a particular field of analysis with MATLAB. In the meantime, the MATLAB processing path is automatically traceable, which means the user can see clearly what happened at any point in the middle of the processing; this is especially important for some analysis requirements.
4.6 Sound Signal Pre-processing
4.6.1 Cutting off frequency band
Research shows that the frequency of lung sounds lies mainly below 1000 Hz, and this project focuses on frequencies under 1200 Hz. Different frequency bands are used for different functions in this experiment: the very low band under 100 Hz is separated out to extract acoustic features, the band above 1200 Hz carries a lot of noise and is filtered out at the first stage, and the band in between is used for end-point detection. The original signal is therefore passed through several bandpass filters to cut out the specified frequency bands.
However, unlike a speech signal radiated from the lips, which has an attenuation of 6 dB/oct, the pre-recorded sound signal does not have to be pre-emphasized, as it comes from the acoustic sensor attached to the hollow.
4.6.2 Filter Design
Based on the theory above, a Butterworth filter can be designed with the Matlab function ‘butter’, which returns the numerator vector ‘v’ and the denominator vector ‘u’ of the filter:
[v, u] = butter(order, Wn, function)
where the parameter ‘order’ is the order of the filter. A larger order gives a better filter effect but also requires a larger quantity of calculation. The length (L) of the parameter vectors ‘u’ and ‘v’ is related to the parameter ‘order’ by:
$$L_{u,v} = \text{order} + 1$$
The parameter ‘Wn’ is the normalized value of the frequency to be filtered. When the sampling frequency is fs, the highest frequency that can be processed is fs/2, so if the frequency (f) to be filtered out is 2000 Hz then:
$$Wn = \frac{f}{f_s / 2} = \frac{2000}{f_s / 2}$$
The parameter ‘function’ is a string that indicates the type of the filter. For example, function = ‘low’ means a Low Pass Filter and function = ‘high’ means a High Pass Filter.
Figure 11 Low Pass Filter at cutoff frequency 1000 (Hz)
As the frequency response in the above figure shows, when the original signal passes through the filter, each frequency component is scaled by a gain between 1 and 0. It can be seen clearly that this is a Low Pass Filter with a cutoff frequency of 1000 Hz.
Figure 12 Bandpass filter at frequency range from 110 to 800 (Hz)
Based on the formula Lu,v = order + 1, the higher the order of the filter, the more effective the filter is, as the parameter vectors ‘u’ and ‘v’ have a larger length, but it also requires more calculation. On the contrary, decreasing the order of the filter means shorter ‘u’ and ‘v’ vectors and less calculation, which leads to a worse filter effect.
Figure 13 Frequency Response of several low-pass filters
It is evident from the above figure that the filter becomes increasingly effective as the order is increased from 1 to 8.
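The effect of the filter order can be illustrated numerically. The sketch below, in Python, uses the textbook magnitude response of an ideal analog Butterworth low-pass filter, |H(f)| = 1 / sqrt(1 + (f/fc)^(2·order)); this is an assumption made for illustration and is not the digital design that Matlab's butter function produces. The helper names are hypothetical.

```python
import math


def normalized_cutoff(f_cut, fs):
    """Wn as described in the text: the cutoff divided by half the sampling frequency."""
    return f_cut / (fs / 2.0)


def butterworth_gain(f, f_cut, order):
    """Magnitude response of an ideal analog Butterworth low-pass filter."""
    return 1.0 / math.sqrt(1.0 + (f / f_cut) ** (2 * order))


f_cut = 1000.0
# Gain at 2000 Hz, one octave above the cutoff, for orders 1 to 8.
for order in range(1, 9):
    print(order, round(butterworth_gain(2000.0, f_cut, order), 6))
```

The gain at the cutoff frequency is the same for every order, while the gain above the cutoff falls faster as the order grows, which is the behaviour the frequency response figure describes.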
4.6.3 End-point Detection of the Signal
The features of the sound signal affect the performance of the whole recognition system, and end-point detection, that is, detecting the beginning and the end of the meaningful part of the sound signal, is the prerequisite of feature extraction. Many methods can be used for end-point detection in the time domain; typical ones are short-time average amplitude, short-time average zero crossing rate and short-time energy.
4.6.3.1 Short-time average amplitude method
When the meaningful part of the signal begins, the short-time average amplitude changes markedly, and the end points can be detected from this change. For frame m of length N, with samples $x_m(n)$, the equation for calculating the short-time average amplitude is:
$$M(m) = \sum_{n=0}^{N-1} \left| x_m(n) \right|$$
4.6.3.2 Short-time energy method
In most practical experiments, the short-time average amplitude is replaced by the short-time energy to describe the amplitude features of the sound signal. There are several ways to calculate the energy of frame m, with samples $x_m(n)$, n = 0, 1, ... , N-1:
$$E(m) = \sum_{n=0}^{N-1} \left| x_m(n) \right|$$
which is called the absolute energy, and
$$E(m) = \sum_{n=0}^{N-1} x_m^2(n)$$
which is called the square energy, or
$$E(m) = \sum_{n=0}^{N-1} \log x_m^2(n)$$
which is called the logarithm energy.
The short-time energy increases sharply when the useful part of the signal begins and stays high until that part ends, after which it decreases gradually. The short-time energy is therefore also a good way to detect the end points.
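End-point detection by short-time energy can be sketched as a simple threshold test on the energy of each frame. The Python example below is a minimal illustration and not the project's Matlab implementation; the fixed threshold, the non-overlapping frames and the synthetic signal are all assumptions.

```python
def frame_energies(signal, frame_len):
    """Square energy of consecutive, non-overlapping frames."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def detect_endpoints(energies, threshold):
    """Return (start_frame, end_frame) pairs where the energy stays above the threshold."""
    segments, start = [], None
    for i, e in enumerate(energies):
        if e > threshold and start is None:
            start = i                       # a meaningful segment begins
        elif e <= threshold and start is not None:
            segments.append((start, i - 1)) # the segment ended on the previous frame
            start = None
    if start is not None:
        segments.append((start, len(energies) - 1))
    return segments


quiet, loud = [0.01, -0.01] * 4, [0.5, -0.5] * 4
signal = quiet + loud + loud + quiet + loud
energies = frame_energies(signal, 8)
print(detect_endpoints(energies, threshold=0.1))
```

In practice the threshold would have to be tuned to the recording, for example relative to the energy of a known silent stretch, because the sensor noise level differs between recordings.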
4.6.3.3 Short-time average zero crossing rate method
In a general acoustic signal most of the energy lies in the higher frequency band, and a high frequency means a higher zero crossing rate, so the energy has some relationship with the zero crossing rate. The breathing sound, however, is very unlike a normal speech signal. Firstly, a large part of the recorded file is noise, as the equipment used is a very sensitive sensor that records the breathing sound inside the body as well as the noise from the skin and from airflow over the skin; sometimes the noise is much larger than the useful breathing sound. Secondly, most of the energy lies in the frequency band below 100 Hz, which human beings can barely hear; this band is also the most useful one for feature extraction. Unusually, therefore, the very low frequency band shows a higher zero crossing rate, as the rate of change is larger in that particular band.
4.7 The principle of Back-propagation Neural Network
The back-propagation learning algorithm is also called the BP algorithm, and the related artificial neural network is known as a BP network. BP learning is a supervised learning algorithm for multi-layered feedforward neural networks.
In a single-layer ANN without a hidden layer, the δ learning algorithm can be applied to the input and output sample data to train the network. The multi-layered feedforward perceptron, however, introduces one or several hidden layers in between, whose target outputs are unknown to the network. The output error of a hidden layer therefore cannot be calculated directly, and the supervised learning algorithm used to train the single-layer perceptron does not work either.
It is vitally important to note that back propagation refers to the output errors: the outputs themselves are not fed back to the hidden layers or to the input layer. The network has no feedback connections; it only propagates the output errors backwards to adjust the connection weights of the hidden layers and the output layer. A BP network therefore cannot be regarded as a nonlinear dynamic system, but as a nonlinear mapping system.
4.7.1 Feed-forward Calculation
Consider a two-layer neural network that introduces one hidden layer between the input and output layers. The direction of the arrows indicates the way the information flows through the network. The node an arrow points to is called the lower layer of the arrow, and the node at the arrow's tail is the upper layer of the arrow. The input of node j for a given training sample is then expressed as:
$$net_j = \sum_{i} w_{ij}\, o_i$$
where oi is the output of node i in the upper layer and wij is the connection weight between node i in the upper layer and node j in the current layer. For the input layer, the output of any node is always equal to its input.
The output (oj) of node j is the transformation of its input by the expression given below:
$$o_j = f(net_j) = \frac{1}{1 + e^{-net_j}}$$
where the output (oj) is taken as the input of the nodes in the lower layer.
Abstracting the above expression gives the function:
$$f(x) = \frac{1}{1 + e^{-x}}$$
The derivative of the output (oj) is then given as:
$$f'(net_j) = f(net_j)\,\bigl(1 - f(net_j)\bigr) = o_j\,(1 - o_j)$$
4.7.2 The rules of weights adjustment in BP Neural Network
If the target output of node j in the output layer is tj, the output error is tj - oj. This error value is propagated back from the output layer to the hidden layers, and the weights are adjusted continually, according to the correction rule, to decrease the error. The error function for the network is:
$$e = \frac{1}{2} \sum_{j} (t_j - o_j)^2$$
For the error (e) to decrease, the correction of the weights should follow the gradient descent of the error function, that is:
$$\Delta w_{ij} = -\eta\, \frac{\partial e}{\partial w_{ij}}$$
where η is a gain coefficient greater than zero.
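The feed-forward pass and the gradient-descent weight correction can be put together in a very small network. The Python sketch below is an illustration under several assumptions: a sigmoid activation, one hidden layer, no bias terms, fixed starting weights and a single training sample. None of these values come from the project.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Feed-forward pass: net_j = sum_i w_ij * o_i, then o_j = sigmoid(net_j)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    out = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, out


def train_step(x, target, w_hidden, w_out, eta=0.5):
    """One gradient-descent update of all weights; returns the error before it."""
    hidden, out = forward(x, w_hidden, w_out)
    error = 0.5 * sum((t - o) ** 2 for t, o in zip(target, out))
    # Output layer: delta_j = (t_j - o_j) * o_j * (1 - o_j)
    d_out = [(t - o) * o * (1 - o) for t, o in zip(target, out)]
    # Hidden layer: the output deltas are propagated back through w_out
    d_hid = [h * (1 - h) * sum(d_out[j] * w_out[j][i] for j in range(len(w_out)))
             for i, h in enumerate(hidden)]
    # Weight correction: delta_w = -eta * de/dw = eta * delta_j * o_i
    for j, row in enumerate(w_out):
        for i in range(len(row)):
            row[i] += eta * d_out[j] * hidden[i]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += eta * d_hid[j] * x[i]
    return error


w_hidden = [[0.1, -0.2], [0.3, 0.4]]   # 2 inputs -> 2 hidden nodes (assumed values)
w_out = [[0.2, -0.1]]                  # 2 hidden nodes -> 1 output node
errors = [train_step([1.0, 0.0], [1.0], w_hidden, w_out) for _ in range(200)]
print(errors[0], errors[-1])
```

Only the errors travel backwards through the network; the outputs themselves are never fed back, which matches the description of the BP network as a mapping rather than a dynamic system.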
4.7.3 The breath pattern classification flowchart
Figure 14 Classify the breathing pattern
5. Implementation
The whole procedure of this project involves five stages. In the user stage, the audio files that contain the breathing sounds are recorded from two people with the acoustic sensor. There are three types of sound file: mouth breathing only, nose breathing only, and mixed breathing patterns with both mouth and nose. Both of us recorded the files in a quiet room, breathing smoothly while sitting on a chair, for about half a minute.
The second stage is pre-processing: before feature extraction, certain frequency bands are filtered out of the pre-recorded sound file, and the signal is windowed to smooth it and transformed by the Fast Fourier Transform.
Characteristic extraction is the third stage, which involves processes such as end-point detection and passing the signal through band-pass filters.
The Back-propagation Neural Network is established independently after the feature extraction. About fifty training samples, including mouth breathing and nose breathing, are used to train the neural network and adjust the weights.
In the recognition stage, the testing data are input to the neural network for detection using the weights obtained in the training process, and some expert experience is added to the detection result manually. The whole process is shown in the flowchart below:
Figure 15 The overall procedure flowchart
5.1 Pre-processing
The pre-processing procedure has several steps. After the data is loaded from the audio file, the first step is to filter out the noise in the higher frequencies above 1200 Hz. The signal is then cut into smaller frames and each frame is windowed. Next, the Fast Fourier Transform is applied to each windowed frame, and finally the spectrum map is obtained by pseudo-color mapping. The steps are illustrated by the graph:
Figure 16 Pre-processing flowchart
5.1.1 Digital Filter Applications
Theoretically, a filter is constituted by two vectors ‘u’ and ‘v’, with lengths ‘m’ and ‘n’ respectively, expressed as below:
$$u = [u_1, u_2, \dots, u_m], \qquad v = [v_1, v_2, \dots, v_n], \qquad u_1 = 1$$
When the digital filter with the parameter vectors ‘u’ and ‘v’ is applied to a discrete audio signal s(t), the filtered signal S(t) satisfies:
$$u_1 S(t) + u_2 S(t-1) + \dots + u_m S(t-m+1) = v_1 s(t) + v_2 s(t-1) + \dots + v_n s(t-n+1)$$
After rearranging the polynomial, S(t) is expressed like this:
$$S(t) = \sum_{k=1}^{n} v_k\, s(t-k+1) - \sum_{k=2}^{m} u_k\, S(t-k+1)$$
For instance, with the particular vector values u = [1] and v = [1/4, 1/4, 1/4, 1/4], the output of the filter is:
$$S(t) = \frac{1}{4}\bigl(s(t) + s(t-1) + s(t-2) + s(t-3)\bigr)$$
This customized filter takes the average value of the latest four points and thus has a low-pass effect. In other words, it removes the high-frequency band of the original signal by averaging it out, while leaving the low-frequency band relatively unchanged. Such a filter is called a Low Pass Filter.
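A filter defined by two coefficient vectors can be applied sample by sample with a difference equation. The Python sketch below is a minimal illustration with a hypothetical function name, playing the role that Matlab's filter function would play in the project; samples before the start of the signal are assumed to be zero, and the example uses a four-point moving average with weights of 1/4.

```python
def apply_filter(v, u, s):
    """Filter the samples s with numerator v and denominator u (u[0] is normally 1).

    Implements u[0] * S(t) = sum_k v[k] * s(t - k) - sum_{k >= 1} u[k] * S(t - k),
    treating samples before the start of the signal as zero.
    """
    out = []
    for t in range(len(s)):
        acc = sum(v[k] * s[t - k] for k in range(len(v)) if t - k >= 0)
        acc -= sum(u[k] * out[t - k] for k in range(1, len(u)) if t - k >= 0)
        out.append(acc / u[0])
    return out


# Four-point moving average: the steady part passes, the rapid alternation is removed.
samples = [4.0, 4.0, 4.0, 4.0, -4.0, 4.0, -4.0, 4.0]
print(apply_filter([0.25] * 4, [1.0], samples))
```

The constant part of the input settles at its full value, while the part that alternates in sign from sample to sample is averaged towards zero, which is the low-pass behaviour described above.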
5.1.2 Apply filter to the digital signal
In order to filter out the noise in the sound signal, the signal is passed through a low-pass filter, so that frequencies below 1200 Hz pass while frequencies above 1200 Hz are filtered out. The original signal and the filtered signal are shown below:
Figure 17 The signal passed through a low-pass filter
5.2 Principles of Spectrum Analysis and Display
5.2.1 Short Time FFT Spectrum Analysis of Discrete Signal
The spectrum analysis of the signal is based on Short Time Fourier Transform (STFT) analysis in the discrete time domain. A discrete time-domain sampled signal can be expressed as x(n), where n = 0, 1, … , N-1 is the sample index and N is the signal length. In digital signal processing the signal is usually divided into frames by applying a window to it. x(n) is then expressed as xm(n), where n = 0, 1, … , N-1, 'm' is the frame number, 'n' is the sample index within the frame, and N is the number of samples in one frame, known as the frame length. The Discrete Time domain Fourier Transform (DTFT) of the windowed signal xm(n) is:
$$X(m, e^{j\omega}) = \sum_{n=0}^{N-1} w_m(n)\, x_m(n)\, e^{-j\omega n}$$
To simplify the discrete calculation, the Discrete Fourier Transform (DFT) of wm(n) * xm(n) is usually used:
$$X(m, k) = \sum_{n=0}^{N-1} w_m(n)\, x_m(n)\, e^{-j 2\pi n k / N}, \qquad k = 0, 1, \dots, N-1$$
X(m, k) then gives the estimated short-time amplitude spectrum of the frame xm(n). Taking m as the time variable and k as the frequency variable, X(m, k) is the dynamic spectrum of the signal x(n). Since the value in decibels (dB) can be calculated as:
$$DB(m, k) = 10 \log_{10}\bigl(|X(m, k)|^2\bigr) = 20 \log_{10}\bigl(|X(m, k)|\bigr)$$
we can get the dynamic spectrum of the signal displayed in dB. The calculation of X(m, k) is again simplified by using the Fast Fourier Transform (FFT). (Cooley and Tukey, )
5.2.2
the dynamic spectrum display of the Pseudo

color coded Mapping
Taking 'm' as the abscissa, 'k' as the ordinate, and the value of X(m, k) as a pseudo-color
on the two-dimensional plane, we get the dynamic spectrum of the signal x(n). Mapping the
value of X(m, k) to a pseudo-color gives better resolution and visual effect for the dynamic
spectrum and improves the readability of the diagram. The method first maps the minimum value
(Xmin) of X(m, k) to the normalized 0, the maximum value (Xmax) of X(m, k) to the normalized
1, and the remaining values linearly to Ci between 0 and 1. Second, each Ci is displayed on
the monitor in its mapped color. In order to make full use of the dynamic range of the color
space, an appropriate base spectrum value should be chosen. Values less than the base are
limited to the base, and values greater than the base are normalized linearly. With the color
value matrix expressed as C = {c(m, k)}, the mapping from X(m, k) to c(m, k) is illustrated
mathematically as below:
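$$c(m, k) = \frac{B(m, k) - X_{base}}{X_{max} - X_{base}}$$
where:
$$B(m, k) = \begin{cases} X(m, k), & X(m, k) > X_{base} \\ X_{base}, & X(m, k) \le X_{base} \end{cases}$$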
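The clip-then-normalize rule described in the text (values below the base are held at the base, the rest are scaled linearly up to the maximum) can be sketched as follows. The function name, the -60 dB base and the sample matrix are hypothetical and chosen only for illustration.

```python
def normalize_spectrum(X, base):
    """Map spectrum values to [0, 1]: values <= base go to 0, the maximum goes to 1."""
    x_max = max(max(row) for row in X)
    if x_max <= base:
        # Degenerate case: nothing exceeds the base, so every cell maps to 0.
        return [[0.0 for _ in row] for row in X]
    span = x_max - base
    # Clip each value at the base, then scale linearly by the remaining range.
    return [[(max(v, base) - base) / span for v in row] for row in X]


# Example with dB values and an assumed base of -60 dB.
X = [[-90.0, -60.0, -30.0],
     [-45.0, 0.0, -75.0]]
C = normalize_spectrum(X, base=-60.0)
print(C)
```

Choosing the base higher discards more of the low-level background and spends the color range on the stronger components; choosing it too high hides weak but meaningful detail.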
5.2.3 Broad-band spectrum and Narrow-band spectrum
According to the Discrete Fourier Transform (DFT) analysis principle, the frequency
resolution of the spectrum refers to the interval between the discrete frequencies, that is,
the frequency interval (f0) represented by the variable 'k' in the expression X(m, k). Its
value depends on the frame length N and the sampling frequency of the signal fs. Based on the
Nyquist sampling theorem, f0, fs and N fall into the relationship as below:
$$f_0 = \frac{f_s}{N}$$
As the formula suggests, the frequency interval (f0) has nothing to do with the frequencies
that the signal contains. As long as the sampling frequency is constant, increasing the frame
length (N) results in a higher resolution of the spectrum, that is, a smaller bandwidth
represented by each 'k' in the expression X(m, k). In that case the spectrum tends to be a
Narrow-band one; otherwise it is a Broad-band spectrum.
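A short numerical illustration may help here. It assumes the standard DFT relation f0 = fs / N and a hypothetical sampling frequency of 8000 Hz; the function names are made up for this sketch.

```python
def frequency_interval(fs, N):
    """Frequency spacing f0 (Hz) between adjacent DFT bins."""
    return fs / N


def frame_duration(fs, N):
    """Time span (seconds) covered by one frame of N samples."""
    return N / fs


fs = 8000  # assumed sampling frequency in Hz
for N in (64, 256, 1024):
    # Longer frames give finer frequency spacing but cover a longer time span.
    print(N, frequency_interval(fs, N), frame_duration(fs, N))
```

With these assumed values, moving from N = 64 to N = 1024 narrows the bin spacing from 125 Hz to about 7.8 Hz, while each frame grows from 8 ms to 128 ms.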
Increasing the resolution in the frequency domain by using a larger value of N results in a
lower resolution of the spectrum in the time domain. The way to resolve this contradiction is
to introduce overlapping sub-frames by means of a frame shift (N1, N1 < N) while choosing a
larger but appropriate frame length (N). In this way a spectrum with balanced resolution in
the frequency domain and the time domain is obtained. The sub-frame shift can be illustrated
as below:
$$x_m(n) = x(n + m N_1), \quad n = 0, 1, \ldots, N-1$$
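The frame shift can be sketched as a simple slicing operation. This sketch assumes that the frame index m starts at 0 and that frame m begins at sample m * N1; the function name and the toy signal are hypothetical.

```python
def split_frames(x, N, N1):
    """Return overlapping frames x_m(n) = x(m*N1 + n) for n = 0..N-1.

    N is the frame length and N1 the frame shift. N1 == N is also accepted
    and simply yields non-overlapping frames.
    """
    if not 0 < N1 <= N:
        raise ValueError("frame shift N1 must satisfy 0 < N1 <= N")
    if len(x) < N:
        return []
    count = (len(x) - N) // N1 + 1  # number of complete frames
    return [x[m * N1: m * N1 + N] for m in range(count)]


# Example: 10 samples, frame length 4, frame shift 2 (50% overlap).
x = list(range(10))
frames = split_frames(x, N=4, N1=2)
print(frames)
```

Each frame still holds N samples, so the frequency interval is unchanged, while a new spectrum column is produced every N1 samples instead of every N.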
5.2.4 Pseudo-color mapping and display of the spectrum
The pseudo-color mapping function colormap(MAP) is built into Matlab. The parameter 'MAP' is
the matrix used for pseudo-color mapping, with 64 rows and 3 columns; the columns represent
the intensity of the colors red, green and blue respectively. For instance, the expression
MAP = [0 0 0] means the pure black mapping, MAP = [1 1 1] means the pure white mapping and
MAP = [1 0 0] means the pure red mapping. Meanwhile