
UNIVERSIDAD POLITÉCNICA DE MADRID
ESCUELA TÉCNICA SUPERIOR DE INGENIEROS DE TELECOMUNICACIÓN

PROYECTO FIN DE MASTER

IMPLEMENTATION OF AN AFFECTIVE CONVERSATIONAL AGENT FOR CONTROLLING A HI-FI SYSTEM

AUTHOR: JUSTO JAVIER SAAVEDRA GUADA
TUTOR: DR. JUAN MANUEL MONTERO MARTÍNEZ
SUPERVISOR: SYAHEERAH LUTFI

MADRID, 2011


CONTENTS

1. INTRODUCTION
1.1. OBJECTIVES
2. SYSTEM ARCHITECTURE
2.1. GALAXY-II: A REFERENCE ARCHITECTURE FOR CONVERSATIONAL SYSTEM DEVELOPMENT
2.2. NEMO ARCHITECTURE: OUR ARCHITECTURE
2.2.1. PREVIOUS DIALOGUE SYSTEM ARCHITECTURE
2.2.2. NEMO ARCHITECTURE: NEW ARCHITECTURE
3. SYSTEM COMMUNICATIONS
3.1. CONCEPTS
3.1.1. PEER-TO-PEER
3.1.2. CLIENT/SERVER MODEL
3.1.3. ADVANTAGES/DISADVANTAGES
3.2. COMMUNICATION METHODS & TOOLS
3.2.1. SOCKETS
3.2.2. SERVICE ORIENTED ARCHITECTURE (SOA)
3.2.3. WEB SERVICES AND SOA
3.2.3.1. SIMPLE OBJECT ACCESS PROTOCOL (SOAP)
3.3. FINAL IMPLEMENTATION
4. EMOTIONAL SYSTEM
4.1. EMOTIONAL THEORIES
4.1.1. TWO FACTOR THEORY OF EMOTION
4.1.2. ORTONY, CLORE, AND COLLINS
4.1.3. THEORY OF ROSEMAN
4.1.4. THEORY OF FRIJDA
4.1.5. THEORY OF OATLEY AND JOHNSON-LAIRD
4.2. THEORIES BEHIND OUR EMOTIONAL SYSTEM
4.2.1. MASLOW'S HIERARCHY OF NEEDS
4.2.2. APPRAISAL THEORY
4.3. NEMO: EMOTIONAL SYSTEM
4.3.1. NEEDS
4.3.2. APPRAISALS
4.3.3. EMOTIONS
4.3.3.1. CONSTANT WEIGHT F(W)
4.3.3.2. EMOTION TIMING
4.3.3.3. NEUTRAL EMOTION
4.3.4. MAPPING APPRAISALS
5. OBJECT ORIENTED ANALYSIS
5.1. OBJECTS
5.2. CLASSES
5.2.1. METHODS OF A CLASS
5.2.2. ENCAPSULATION & ACCESSIBILITY
5.2.3. INHERITANCE
5.2.4. POLYMORPHISM
5.3. CLASSES IN THE NEMO: EMOTIONAL SYSTEM
5.3.1. NEED CLASSES
5.3.2. EMOTIONS CLASSES
6. HI-FI SYSTEM
6.1. INTRODUCTION TO THE HI-FI DIALOG SYSTEM
6.2. HI-FI APPLICATION [36]
6.3. MENU
6.4. SYSTEM MESSAGE
6.5. CONFIGURATION
6.5.1. SETTINGS THAT AFFECT THE KNOWLEDGE MANAGEMENT MODULE
6.6. STATE OF THE SYSTEM
6.7. RECOGNITION
6.7.1. VOICE ACTIVITY DETECTOR SETUP DEDICATED CONTROLS
6.7.2. DEDICATED CONTROLS FOR RECORDING AND PLAYBACK
6.7.3. OSCILLOSCOPE SETUP DEDICATED CONTROLS
6.7.4. RECOGNITION DEDICATED CONTROLS
6.8. LANGUAGE UNDERSTANDING MODULE
6.9. DIALOGUE MANAGER
6.9.1. DIALOGUES GOALS DEDICATED CONTROLS
6.9.2. DIALOGUE CONCEPTS DEDICATED CONTROLS
6.9.3. THRESHOLD CONFIGURATION CONTROLS
6.9.4. DIALOGUE MEMORY CONTROLS
6.10. EXECUTION MODULE
6.11. RESPONSE GENERATION MODULE
6.12. SYNTHESIS
6.13. DIALOGUE FEATURES USED IN THE EMOTIONAL SYSTEM
7. SUPERVISOR
7.1. HOW IT WORKS
7.2. INTERFACE
7.3. USER CASE MODEL
CONCLUSIONS
FUTURE WORK
A. COMPUTER VISION
A.1 FACIAL EXPRESSIONS
A.2 TECHNIQUES
A.2.1 HAAR CLASSIFIER: VIOLA-JONES METHOD
A.2.2 MOTION FLOW
A.3 OUR SMILE DETECTOR
A.3.1 TRAINING
A.3.2 PICTURE TEST
A.3.4 REAL TIME TEST
B. HOW TO RUN THE SYSTEM
C. RESPONSE GENERATION TEMPLATES EXAMPLES
BIBLIOGRAPHY


FIGURES

Figure 1: Galaxy-II Architecture [2]
Figure 2: Dialogue System basic architecture
Figure 3: Architecture
Figure 4: Architecture data flow
Figure 5: Classification of Computer Systems [5]
Figure 6: Peer-to-Peer Architecture
Figure 7: Client-Server Architecture
Figure 8: SOAP request example
Figure 9: SOAP response example
Figure 10: Example rule.xml
Figure 11: sendemotionstohifi Jabon-xml message
Figure 12: Two Factor Theory [17]
Figure 13: Structure of emotion types in the theory of Ortony, Clore and Collins [18]
Figure 14: Frijda Emotional System [24]
Figure 15: Maslow Hierarchy of Needs
Figure 16: Nemo Architecture
Figure 17: Nemo UML diagram
Figure 18: LevelTimeHistory class
Figure 19: CIOProcess class
Figure 20: CNeedProcess class
Figure 21: TNIFV class
Figure 22: CSurvival class
Figure 23: CSafety class
Figure 24: CSocial class
Figure 25: CSuccess class
Figure 26: CEthics class
Figure 27: CEmotion class
Figure 28: CSurprise class
Figure 29: CSad class
Figure 30: CHappy class
Figure 31: CFear class
Figure 32: CAngry class
Figure 33: CShame class
Figure 34: Dialogue System Architecture
Figure 35: Hi-Fi Main Windows
Figure 36: Modules submenu
Figure 37: Ver submenu
Figure 38: Configuration dialog
Figure 39: Overview of the dialog box "HiFi State"
Figure 40: Overview of the dialog box "HiFi Recognition"
Figure 41: Representation of the energy of an audio signal, along with levels of the detector and the frames marked as the start and end
Figure 42: Dedicated controls for recording and playback of audio files, located in the dialog box "HiFi Recognition"
Figure 43: Dedicated oscilloscope control settings, located in the dialog box "HiFi Recognition"
Figure 44: Recognition dedicated controls, located in the dialog box "HiFi Recognition"
Figure 45: Overview of the dialog box "Comprension Hi-Fi"
Figure 46: Overview of the dialog box "HiFi Dialogue Manager"
Figure 47: Dialogue Objectives
Figure 48: Controls dedicated to present the classified dialogue concepts
Figure 49: Threshold controls of the dialogue manager
Figure 50: Dialogue Memory Controls
Figure 51: Execution Module
Figure 52: Response Generation Box
Figure 53: Synthesizer
Figure 54: Supervisor in the system architecture
Figure 55: Supervisor Interface
Figure 56: User Case Model
Figure 57: Haar features
Figure 58: Cascade of classifiers
Figure 59: Flowchart to filter the images
Figure 60: Comparison of the methods (Left: Picture1, Right: Picture2)
Figure 61: OpenCV application
Figure 62: IR USB Driver Task Bar
Figure 63: Test Command IR Driver
Figure 64: HI-FI System Loading
Figure 65: HI-FI System Ready
Figure 66: IR Activation Box
Figure 67: HI-FI System Ready and IR Activated
Figure 68: Supervisor not ready
Figure 69: Supervisor (Agent Face) Neutral
Figure 70: Emotional System Running
Figure 71: Supervisor System Ready
Figure 72: Run the Recognition Module
Figure 73: Recognition Dialogue Box
Figure 74: Recognition Dialogue Box Running



TABLES

Table 1: Advantages/Disadvantages P2P vs. Client-Server
Table 2: Ortony, Clore, and Collins variables
Table 3: Plan junctures [27]
Table 4: Mapping Appraisal Weights into Emotions
Table 5: System message
Table 6: Combinations for links between modules Knowledge Manager, Dialogue Manager and Performance



1. INTRODUCTION


The purpose of this work is the design, development, and adaptation of an emotional model to be used in a dialogue system that provides several functions through speech recognition. The main goal is to improve the existing dialogue system by adding new features related to the emotional model, which is aimed at creating a robotic servant with basic human-like emotions.



In the near future, emotional systems will play an important role in the development of intelligent/affective systems. Both recognition of the user's emotions and emotional responses from the system are highly desirable. To create a system/robot with such features, several aspects have to be considered:

- Emotional model
- Dialogue system:
  - Speech recognition
  - Language understanding module
  - Dialogue manager
  - Emotional response generator
  - Emotional speech synthesizer
- User emotion recognition by the use of a webcam

We can create a good emotional model, but without a good dialogue system adapted to that model, users will not notice, for example, the emotions "felt" by the system if we do not have an emotional speech synthesizer. Another important feature is the capacity to sense user emotions, interactions, and movements in order to supply our emotional system with more information so that it can generate better responses.

Our experimental work is based on a hi-fi audio system for which the dialogue system was adapted; it provides functionality to control the CD player, two tape decks, the AM/FM radio, and other functions. In the past, this hi-fi audio system was tested and evaluated by users rating its performance without the use of the emotional model. It is part of this work to improve the dialogue system, adapt the new emotional features, and set the basis for the future re-evaluation of the Hi-Fi dialogue system.




1.1. OBJECTIVES

- Adaptation of a human affective and cognitive model to the servant robotic system.
- Improvement and adaptation of the existing dialogue system that controls a Hi-Fi system.
- Design and development of the architectural communication system to be used.
- Emotion recognition by webcam in order to feed the emotional model.




2. SYSTEM ARCHITECTURE

In this chapter, the reference model Galaxy-II is presented first. Galaxy-II is used for conversational system development, and it guided us in the creation of our system architecture. Secondly, the basic dialogue system (used as a base for our work) is presented, and finally our final architecture, which combines the new dialogue system and the emotional model.


2.1. GALAXY-II: A REFERENCE ARCHITECTURE FOR CONVERSATIONAL SYSTEM DEVELOPMENT

This architecture was developed by the Spoken Language Systems Group at the Massachusetts Institute of Technology. Through their experience in designing spoken dialogue systems, they realized that an essential element in being able to rapidly configure new systems is to allow as many aspects of the system design as possible to be specifiable without modifying source code [1]. By doing so, they have been able to configure multi-modal, multi-domain, multi-user, and multilingual systems with much less effort than previously. As new and increasingly complex spoken dialogue systems are built, the task of evaluating and expanding these systems becomes both more important and more difficult. This is difficult for several reasons. First, a spoken dialogue system typically comprises multiple modules, each of which performs its task within an overall framework, sometimes completely independently but most often with input from other modules. Secondly, once a mechanism is in place for running data through an off-line system, a simple re-processing of data with a new version of any component can lead to an incoherent interaction, as only one side of a two-sided conversation has changed. Finally, there are the questions of what to evaluate (e.g., individual components vs. overall system behaviour) and how (error rates vs. some measure of usability).


The GALAXY-II architecture consists of a central hub that controls the flow of information among a suite of servers, which may be running on the same machine or at remote locations [1]. Figure 1 shows a typical hub configuration for a generic spoken dialogue system. The hub's interaction with the servers is controlled via a scripting language. A hub program includes a list of the active servers, specifying the host, port, and set of operations each server supports, as well as a set of one or more programs. Each program consists of a set of rules, where each rule specifies an operation, a set of conditions under which that rule should "fire", and a list of INPUT and OUTPUT variables for the rule, as well as optional STORE/RETRIEVE variables into/from the discourse history. When a rule is fired, the input variables are packaged into a token and sent to the server that handles the operation. The hub expects the server to return a token containing the output variables at a later time. The variables are all recorded in a hub-internal master token. The conditions consist of simple logical and/or arithmetic tests on the values of the typed variables in the master token. The hub communicates with the various servers via a standardized frame-based protocol.
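
To make this rule mechanism concrete, the following C++ sketch models the pieces just described: a rule bound to an operation, a firing condition tested against the master token, and INPUT variables packaged into a token for the server. This is only an illustrative model under our own naming; the real GALAXY-II hub is configured through its own scripting language, and none of the identifiers below come from it.

    #include <functional>
    #include <map>
    #include <string>
    #include <vector>

    // A token: a set of named (here simply string-valued) variables,
    // playing the role of the hub-internal master token.
    using Token = std::map<std::string, std::string>;

    // Hypothetical model of one hub rule.
    struct Rule {
        std::string operation;                        // server operation to invoke
        std::function<bool(const Token&)> condition;  // test that makes the rule "fire"
        std::vector<std::string> inputs;              // INPUT variables sent to the server
        std::vector<std::string> outputs;             // OUTPUT variables merged on return
    };

    // When a rule fires, only its INPUT variables are packaged into the
    // token that travels to the server handling the operation.
    Token packageInputs(const Rule& rule, const Token& master) {
        Token t;
        for (const std::string& var : rule.inputs) {
            auto it = master.find(var);
            if (it != master.end()) t[var] = it->second;
        }
        return t;
    }

    int main() {
        // Example: a rule that fires once a recognition hypothesis exists.
        Rule toParser{"parse",
                      [](const Token& m) { return m.count(":hypothesis") > 0; },
                      {":hypothesis"},
                      {":frame"}};
        Token master{{":hypothesis", "turn on the radio"}};
        if (toParser.condition(master)) {
            Token outgoing = packageInputs(toParser, master);  // would be sent to the server
            (void)outgoing;
        }
        return 0;
    }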



Figure 1: Galaxy-II Architecture [2]


A simple communication protocol has been adopted and standardized for all hub/server interactions. Upon initiation, the hub first handshakes with all of the specified servers, confirming that they are up and running and sending them a "welcome" token that may contain some initialization information, as specified in the hub script [3]. The hub then launches a wait loop in which the servers are continuously polled for any "return" tokens. Each token is named according to its corresponding program in the hub script, and may also contain a rule index to locate its place in program execution, and a "token id" to associate it with the appropriate master token in the hub's internal memory. The rule is consulted to determine which "OUTPUT" variables to update in the master, and which variables, if any, to store in the discourse history. Following this, the master token is evaluated against the complete set of rules subsequent to the rule index, and any rules that pass their test conditions are then executed. In the current implementation, the usual case is that only one rule is fired, although simultaneous rule executions can be used to implement parallelism, a feature that is used. Servers other than those that implement user interface functions are typically stateless; any history they may need is sent back to the hub for safekeeping, where it is associated with the current utterance. Common state can thus be shared among multiple servers. To execute a given rule, a new token is created from the master token, containing only the subset of variables specified in the "INPUT" variables for the rule in question. This token is then sent to the server assigned to the execution of the operation specified by the rule. If it is determined that the designated server is busy (it has not yet replied to a preceding rule, either within this dialogue or in a competing dialogue), then the token is queued up for later transmission. Thus the hub is, in theory, never stalled waiting for a server to receive a token. The hub then checks whether the server that sent the token has any tokens in its input queue. If so, it will pop the queue before returning to the wait loop. For example, the recognizer sends each hypothesis to the hub as a separate token, signalling completion with a special final token. The selected token is processed through discourse inheritance via the hub script and sent on to the dialogue manager. The dialogue manager usually initiates a subdialogue in order to retrieve information from the database. The retrieved database entries are returned to the dialogue manager for interpretation. These activities are controlled by a separate program in the hub script, which we refer to as a module-to-module subdialogue. Finally, the dialogue manager sends a reply frame to the hub, which is passed along to generation and synthesis. After the synthesized speech has been transmitted to the user, the audio server is freed up to begin listening for the next user utterance [3].


One of the challenges that is well addressed by the Galaxy-II architecture is managing multimodal interactions. In a multimodal interaction, separate control threads need to manage the various input/output modalities. These threads need to be coordinated and synchronized. In this architecture, the execution model of having a set of active tokens, for which rules are fired as their conditions are matched, has proven effective in supporting the multiple threads. Different tokens correspond to activity in different threads. Tying all the threads for a given user session together is a session identifier carried in every token.

All of these multimodal interactions are handled in a straightforward manner within the Galaxy-II architecture. The parallel-execution, multiple-token programming model supports these interactions in a simpler manner than would be possible in a traditional programming language such as C.


The GALAXY-II architecture has proven to be a powerful tool for evaluation. It has made possible a wide range of system configurations specifically designed for monitoring system performance, resulting in a suite of hub programs concerned with evaluation [2]. In some cases, it can be used to evaluate only a particular aspect of system performance, such as recognition or understanding. In other cases, it can be used to evaluate the performance of the entire system, perhaps comparing a new version with the version that existed at the time a log file was first created. At other times, it can be useful to look at ways of measuring system performance as it relates to user satisfaction, along measurable dimensions.


2.2. NEMO ARCHITECTURE: OUR ARCHITECTURE

Based on the study of the Galaxy-II architecture presented previously, we adapted our work to perform in a similar way using tools available in the Speech Technology Group. The main objectives were to reuse as much as possible the previous work on the dialogue system architectures, and to adapt it to an architecture capable of managing multiple servers interacting with each other. The servers in our dialogue system (similarly to Galaxy-II) can be on the same machine or on different machines, giving broad flexibility to operate and distribute computational processing.

Main objectives:



- Multimodal interface: The system must be capable of interacting with the user in several ways. This is also helpful for testing and evaluating the system.

- Scalable architecture: It must make it easy to add/delete new modules and functionalities.

- Emotional behaviour: The system needs to address emotional behaviour independently of the task. The emotional state of the system is calculated by giving it certain information related to the performance of the tasks involved in the dialogue system, and information from the new modules. It follows a human psychological theory adapted to an affective agent.

- Multi-server interaction: The system must provide an architecture and protocol capable of managing the different server interactions.


2.2.1. PREVIOUS DIALOGUE SYSTEM ARCHITECTURE

The previous dialogue system architecture is presented in Figure 2. It contains six basic blocks that were designed, developed, and adapted in previous works to control the Hi-Fi system by voice commands [4]:



- Speech recognition: It translates the input voice from a microphone into a set of recognized words derived from a previously trained vocabulary.

- Language Understanding Module: It extracts semantic concepts from the recognized words. This module is constituted by a set of context-dependent rules, handcrafted by an expert in the system application domain, and the concept dictionary, a list of the relevant semantic concepts that each word can be related to. The output of the language understanding module is a list of attribute-value pairs [4] (see the sketch after this list).

- Dialogue Manager: The list of semantic concepts is the input to the dialogue manager. From that list, the dialogue manager fills an execution frame with the required information in order to execute the different actions present in the query utterance. It establishes an explicit confirmation mechanism, giving some feedback to the user about the action that is going to be executed [4].

- Execution Module: It is in charge of sending the infrared commands to the system and keeping track of the actions executed by the mini hi-fi. It should be mentioned that the commercial system is not able to give us feedback about the executed commands, which can produce a loss of synchronization [4].

- Generation Module: The generation module creates different sentences for the same dialogue goal, to give some feedback to the user about the actions carried out and to ask the user for any information needed to achieve the dialogue goal or to perform the requested action. In order to make the dialogue more natural, it uses different sentences each time. It is based on specific templates for each possible dialogue goal [4].

- Synthesiser: The text-to-speech module synthesizes the speech from the sentence proposed by the generation module. As we are developing an emotional system, the synthesizer used is capable of expressing emotions.
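
As a small illustration of the data these blocks exchange (the concept names below are hypothetical, not taken from the actual Hi-Fi concept dictionary), the attribute-value output of the language understanding module can be pictured in C++ as:

    #include <map>
    #include <string>

    // Hypothetical attribute-value pairs for an utterance such as
    // "turn on the radio"; the dialogue manager fills its execution
    // frame from pairs of this kind.
    const std::map<std::string, std::string> concepts = {
        {"action", "switch_on"},
        {"device", "radio"},
    };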










Figure 2: Dialogue System basic architecture



2.2.2. NEMO ARCHITECTURE: NEW ARCHITECTURE

Our work consisted of adapting and expanding the functionality of the dialogue system by adding the emotional model. Specifically, it consisted of the following tasks:



- Analyze the different blocks of the basic dialogue system in order to detect what information could be useful as input to the emotional model.

- Modify the basic blocks in order to supply the information necessary to compute the emotions.

- Expand the functionality of the generation module by adding more concepts, such as relationship (Friend, Known, Unknown, etc.) and emotions (sadness, happiness, fear, surprise, anger, disgust). Furthermore, new concepts could be added, such as cultural background and others, in order to adapt phrases depending on the different concepts.

- Research and develop the emotional model to be used.

- Adapt a computer vision application based on OpenCV (open source computer vision library) that uses the webcam to recognize user movements, smiles, and presence in order to improve our system.

- Adapt a new synthesizer, developed by our Speech Technology Group, that has the capacity of expressing emotions given a phrase and the emotion.

The final architecture is presented in Figure 3. The idea consists of a central hub similar to the one implemented in Galaxy-II; this hub is implemented using a SOAP (Simple Object Access Protocol) toolkit called "Jabon", developed by the Intelligent Control Research Group of the Universidad Politécnica de Madrid. It provides message communication between nodes. These messages must have an envelope that contains a compulsory body and an optional header field. This toolkit provides an interface to generate C++ web services, which we used to communicate our different servers and clients. The implementation consists of defining, in the main loop of the principal program, the different rules and calls to each server. In addition, each server declares the services it provides and the input and output variables required. To call a service, the input/output variables, the name of the service, and the host and port must be specified.
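
As a rough sketch of what such a call involves (the real Jabon interface is not reproduced here, so every identifier below is an illustrative assumption), a client essentially wraps the input variable of a service into a SOAP envelope and sends it to the host and port of the server:

    #include <iostream>
    #include <sstream>
    #include <string>

    // Hypothetical helper: build the SOAP envelope for one service call.
    // The service name and the input variable are explicit here; in a real
    // call the envelope would then be posted over HTTP to the server's
    // host and port.
    std::string buildSoapCall(const std::string& service,
                              const std::string& inputVar,
                              const std::string& value) {
        std::ostringstream msg;
        msg << "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
            << "<soap:Body><" << service << ">"
            << "<" << inputVar << ">" << value << "</" << inputVar << ">"
            << "</" << service << "></soap:Body></soap:Envelope>";
        return msg.str();
    }

    int main() {
        // e.g. report a webcam event to the emotional system server
        std::cout << buildSoapCall("ProcessEvent", "event", "smile") << "\n";
        return 0;
    }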


Considering the previous architecture, in which all the components worked on the same machine and the calls to each module were made through function calls, we analyzed which of those components needed to be changed in order to set them up as servers providing services to other components of the dialogue system. As can be seen in Galaxy-II, all the components acted as servers, but that does not mean it is necessary to change all the previous modules in the dialogue system if doing so does not carry a substantial improvement.

The new/changed components of the emotional dialogue system are (see Figure 3):

- OpenCV webcam client: This program, based on the OpenCV libraries, processes real-time video input from a webcam in order to detect movements, faces, and smiles. As soon as it detects any of these events, this client communicates with the OpenCV webcam server running in the main program. This application represents the beginning of an unexploited potential in the field of computer vision. In the future, it could recognize users, recognize all the emotions, user movements (hand gestures, hand movements, head movements), etc.

- OpenCV webcam server: This server runs in the main program, and its functionality is to receive events from the OpenCV client. Once the events are received, the principal program processes them.

- New emotional system server: It is in charge of receiving the events used to compute and update the emotional state, and also of returning the emotional state in order to generate an adequate response through the speech synthesiser.

- Synthesiser server: This module is in charge of synthesizing the voice. Its input parameters are the phrase and the emotion. Because a different synthesizer is now used, it was necessary to separate this module from the previous dialogue system.

- Synthesiser client: It runs in the main program, and is in charge of sending the service request to the synthesiser server.

- Response Generator: Even though this module is maintained in the main program, it was changed to add more concepts, such as the relationship to the user and the emotions. It looks for a given set of concepts and finds the appropriate phrases depending on the context. Setting more concepts allows us to create more adequate and realistic responses.


In Figure 3, the modules in blue are located in the main application program, while the modules in green are outside the main program and are accessed through a server/service by means of SOAP. The application program acts as the hub of the architecture, managing the different rules.


The new emotional system needs the performance results of the different modules of the dialogue system in order to compute the emotions. As presented later, the idea behind the emotional model is to make it independent of the task, in order to adapt it to other systems in the future. The different blocks/modules, such as the recognizer, the language understanding module, the dialogue module, and the newly introduced OpenCV video application, supply the emotional model with information.


Figure 3: Architecture

In Figure 4, the data flow of the new architecture is shown. First, the speech recognizer translates the voice command into text, and also sends performance information (confidence of detection, phrases) to the emotional system. Second, the language understanding module extracts the semantic concepts from the text and passes this information to the dialogue manager, but also sends information to the emotional model. Then, the dialogue manager works on the list of semantic concepts from the previous and current interactions, and sends information to the execution module that will operate the hi-fi system. While all these steps occur, the OpenCV webcam server in the main program is passing information to the emotional model about events detected by the webcam. The emotional system computes this information in parallel, and then, as the dialogue manager sends the concepts to the response generator module, the emotional system updates the required information so that the response generator module will look up the appropriate response depending on the concepts. Finally, through the synthesiser client, the response with the emotion is passed to the synthesiser server, which will reproduce the response.
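
The following C++ sketch condenses one interaction turn of this data flow. All function names are hypothetical stand-ins; in the real system these stages are separate modules and SOAP servers/clients rather than local calls.

    #include <iostream>
    #include <string>

    // Hypothetical stand-ins for the real modules.
    std::string recognize()                         { return "turn on the radio"; }
    std::string understand(const std::string& text) { return "action=switch_on device=radio"; }
    std::string decideAction(const std::string& c)  { return "IR:RADIO_ON"; }
    std::string currentEmotion()                    { return "happy"; }  // from the emotional server

    int main() {
        std::string text     = recognize();             // speech recognizer (also reports confidence)
        std::string concepts = understand(text);        // language understanding module
        std::string command  = decideAction(concepts);  // dialogue manager -> execution module
        std::cout << "execute " << command << "\n";     // IR command operates the hi-fi
        // The response generator picks a phrase for the concepts and the current
        // emotional state; the synthesiser server then speaks it with that emotion.
        std::cout << "say (with emotion " << currentEmotion()
                  << "): The radio is on\n";
        return 0;
    }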



Figure 4: Architecture data flow


3. SYSTEM COMMUNICATIONS

In this chapter, the research done on the different communication topologies is presented, along with the different tools available and the considerations made in choosing the implementation used in the final architecture. The communication systems allow the user to implement the chosen architecture efficiently, and provide communication between all the modules needed to operate the whole system.

3.1. CONCEPTS

All computer systems can be classified into two categories: centralized and distributed (see Figure 5). Distributed systems can be divided into the Client-Server model and the Peer-to-Peer model. The Client-Server model can be flat, where all clients communicate with a single server, or it can be hierarchical for improved scalability. In a hierarchical model, the servers of one level act as clients to higher-level servers [5].

The Peer-to-Peer architecture is split into pure and hybrid architectures. The pure architecture works without a central server, whereas the hybrid architecture first contacts a server to obtain meta-information, such as the identity of the peer on which some information is stored, or to verify security credentials [6].


Figure 5: Classification of Computer Systems [5]





3.1.1. PEER-TO-PEER

A peer-to-peer network, commonly abbreviated to P2P, is any distributed network architecture composed of participants that make a portion of their resources (such as processing power, disk storage, or network bandwidth) directly available to other network participants, without the need for central coordination instances (such as servers or stable hosts) [7]. Peers are both suppliers and consumers of resources, in contrast to the traditional client-server model, where only servers supply and clients consume.

A peer is a network node that can act as a client or a server, with or without centralized control, and with or without continuous connectivity. The term "peer" can be applied to a wide range of device types, from small handheld devices to powerful, closely managed server-class machines [7].


Figure 6: Peer-to-Peer Architecture

3.1.2. CLIENT/SERVER MODEL

The client-server model of computing is a distributed application structure that partitions tasks or workloads between the providers of a resource or service, called servers, and service requesters, called clients [8]. Often clients and servers communicate over a computer network on separate hardware, but both client and server may reside in the same system. A server machine is a host that is running one or more server programs which share their resources with clients. A client does not share any of its resources, but requests a server's content or service function. Clients therefore initiate communication sessions with servers, which await (listen for) incoming requests.





- Server: A server is a provider of services; it must compute requests and return the results with an appropriate protocol. A server, as a provider of services, can run on the same device as the client, or on a different device reachable over the network. The decision to outsource a service from an application in the form of a server can have different reasons.

- Client: A client is typically a device or a process which uses the services of one or more servers. Since clients are often the interface between server information and people, clients are designed for information input and for the visualization of information. Although clients had only few resources and little functionality in the past, today most clients are PCs with more performance regarding resources and functionality [5].


Figure 7: Client-Server Architecture


3.1.3. ADVANTAGES/DISADVANTAGES

The table below shows the advantages and disadvantages of each model.


ADVANTAGES

Peer-to-Peer:
- In a pure Peer-to-Peer architecture there is no single point of failure: if one peer breaks down, the rest of the peers are still able to communicate.
- Peer-to-Peer provides the opportunity to take advantage of unused resources, such as processing power for computations and storage capacity. In Client-Server architectures, the centralized system bears the majority of the cost of the system.
- Peer-to-Peer prevents bottlenecks such as the traffic overload of a central-server architecture, because it can distribute data and balance requests across the net without using a central server.

Client-Server:
- Data management is much easier because the files are in one location. This allows fast backups and efficient error management. There are multiple levels of permissions, which can prevent users from doing damage to files.
- The server hardware is designed to serve requests from clients quickly. All the data are processed on the server, and only the results are returned to the client. This reduces the amount of network traffic between the server and the client machine, improving network performance.

DISADVANTAGES

Peer-to-Peer:
- Today many applications need a high security standard, which is not satisfied by current Peer-to-Peer solutions.
- The connections between the peers are normally not designed for high throughput rates, even if the coverage of ADSL and cable modem connections is increasing.
- A centralized system or a Client-Server system will work as long as the service provider keeps it up and running; if peers start to abandon a Peer-to-Peer system, services will not be available to anyone.

Client-Server:
- Client-Server systems are very expensive and need a lot of maintenance.
- The server constitutes a single point of failure. If failures occur on the server, the system may suffer heavy delays or break down completely, which can potentially block hundreds of clients from working with their data or applications. Within companies, high costs could accumulate due to server downtime.

Table 1: Advantages/Disadvantages P2P vs. Client-Server


3.2. COMMUNICATION METHODS & TOOLS

There exist several communication methods to pass data from one process/application to another. Since the architecture is based on a client-server model (as can be seen in Chapter 2), we need to review some of the mechanisms used for client-server communication.

3.2.1. SOCKETS

Internet sockets constitute a mechanism for delivering incoming data packets to the appropriate application process or thread, based on a combination of local and remote IP addresses and port numbers. Each socket is mapped by the operating system to a communicating application process or thread.

To use plain sockets, a formatted message of some type is required. Sockets programming demands a lot of time; comparing the programming benefits against the time spent on the parsing, error handling, and related tasks of a custom message infrastructure, versus a SOAP-based approach, we found it preferable to go for a SOAP-based standard mechanism, which is more flexible and hides all the complexity present in socket programming.
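
For comparison, here is a minimal sketch of a raw TCP client using POSIX sockets (error handling mostly omitted). Note that the application has to invent its own message format, "VOLUME_UP\n" in this hypothetical example, and parse replies by hand; this is exactly the overhead that a SOAP-based mechanism hides.

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstring>

    int main() {
        // Create a TCP socket and connect to a (hypothetical) local server.
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_port = htons(5000);                    // example port
        inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
        if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) != 0) {
            perror("connect");
            return 1;
        }
        // Hand-rolled message format: both sides must agree on it, and all
        // parsing and error handling is the application's own problem.
        const char msg[] = "VOLUME_UP\n";
        send(fd, msg, strlen(msg), 0);
        char reply[256];
        ssize_t n = recv(fd, reply, sizeof reply - 1, 0);
        if (n > 0) { reply[n] = '\0'; printf("server replied: %s", reply); }
        close(fd);
        return 0;
    }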





3.2.2. SERVICE ORIENTED ARCHITECTURE (SOA)

A service-oriented architecture (SOA) is a flexible set of design principles used during the phases of systems development and integration. A deployed SOA-based architecture will provide a loosely integrated suite of services that can be used within multiple business domains. Service-orientation requires loose coupling of services with operating systems and other technologies that underlie applications. SOA separates functions into distinct units, or services [9], which developers make accessible over a network in order to allow users to combine and reuse them in the production of applications. These services and their corresponding consumers communicate with each other by passing data in a well-defined, shared format, or by coordinating an activity between two or more services [10].

SOA also generally provides a way for consumers of services, such as web-based applications, to be aware of available SOA-based services. For example, several disparate departments within a company may develop and deploy SOA services in different implementation languages; their respective clients will benefit from a well-understood, well-defined interface to access them. XML is commonly used for interfacing with SOA services, though this is not required.

Web services can implement a service-oriented architecture. Web services make functional building blocks accessible over standard Internet protocols, independent of platforms and programming languages. These services can represent either new applications or just wrappers around existing legacy systems to make them network-enabled.

Using SOA in the web service approach, each module can play one or both of the roles:



Service Provider - The service provider creates a web service and possibly publishes its interface and access information to the service registry. Each provider must decide which services to expose, how to make trade-offs between security and easy availability, how to price the services, or (if no charges apply) how/whether to exploit them for other value. The provider also has to decide what category the service should be listed in for a given broker service and what sort of trading partner agreements are required to use the service. The broker registers what services are available within it, and lists all the potential service recipients. The implementer of the broker then decides the scope of the broker. Public brokers are available through the Internet, while private brokers are only accessible to a limited audience, for example, users of a company intranet. Furthermore, the amount of the offered information has to be decided. Some brokers specialize in many listings. Others offer high levels of trust in the listed services. Some cover a broad landscape of services and others focus within an industry. Some brokers catalogue other brokers. Depending on the business model, brokers can attempt to maximize look-up requests, number of listings or accuracy of the listings. The Universal Description Discovery and Integration (UDDI) specification defines a way to publish and discover information about Web services. Other service broker technologies include (for example) ebXML (Electronic Business using eXtensible Markup Language) and those based on the ISO/IEC 11179 Metadata Registry (MDR) standard [11].

Service consumer - The service consumer, or web service client, locates entries in the broker registry using various find operations and then binds to the service provider in order to invoke one of its web services. Whichever service the consumer needs, it has to look the service up in the broker registry, bind to the respective service, and then use it. A consumer can access multiple services if the provider offers multiple services [11].
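As a minimal illustration of these two roles, and of the broker that mediates between them, the following Python sketch models the registry as an in-process dictionary. This is an assumption made for the example only: a real broker would be a networked UDDI or ebXML registry, and the service name and endpoint below are invented.

# Toy broker registry: maps service names to their access information.
registry = {}

def publish(name, endpoint, description):
    # Provider role: expose a service by registering its interface/location.
    registry[name] = {"endpoint": endpoint, "description": description}

def find(name):
    # Consumer role: locate a service entry using a find operation.
    return registry.get(name)

# The provider publishes its service to the broker.
publish("GetStockPrice",
        endpoint="http://www.example.org/stock",
        description="Returns the price for a given StockName")

# The consumer finds the entry, then binds to the provider's endpoint.
entry = find("GetStockPrice")
if entry is not None:
    print("Binding to", entry["endpoint"])  # the actual SOAP call goes here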


3.2.3. WEB SERVICES AND SOA

Web services technology is a collection of standards (or emerging standards) that can be used to implement an SOA. Web services technology is vendor- and platform-neutral, interoperable, and supported by many vendors today.

Web services are self-contained, modular applications that can be described, published, located, and invoked over networks. Web services encapsulate business functions, ranging from a simple request-reply to full business process interactions. The services can be new or wrap around existing applications [11].

The W3C defines a "Web service" as a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards [12].

The Web Services Description Language (WSDL) is an XML-based language that provides a model for describing Web services. WSDL is often used in combination with SOAP and an XML Schema to provide web services over the Internet. A client program connecting to a web service can read the WSDL to determine what operations are available on the server. Any special data types used are embedded in the WSDL file in the form of XML Schema. The client can then use SOAP to actually call one of the operations listed in the WSDL, as sketched below.
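The following Python sketch illustrates that workflow using the third-party zeep library; this is one possible SOAP client, not one used in this thesis, and the WSDL URL reuses the example namespace from the stock-price example, so it is assumed rather than real.

from zeep import Client

# The client downloads and parses the WSDL, discovering the available
# operations and the XML Schema types of their parameters.
client = Client("http://www.example.org/stock?wsdl")  # assumed URL

# Each declared operation becomes callable; zeep builds the SOAP
# envelope, sends it over HTTP, and decodes the typed response.
price = client.service.GetStockPrice(StockName="IBM")
print(price)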

3.2.3.1. SIMPLE OBJECT ACCESS PROTOCOL (SOAP)

SOAP is a lightweight protocol for exchange of information in a decentralized, distributed environment. It is an XML-based protocol that consists of three parts:

1. The format of a SOAP message is an envelope containing zero or more headers and exactly one body. The envelope is the top element of the XML document, providing a container for control information, the addressee of a message, and the message itself. Headers contain control information such as quality-of-service attributes. The body contains the message identification and its parameters.

2. Encoding rules are used for expressing instances of application-defined data types. SOAP defines a programming-language-independent data-type schema based on an XML Schema Descriptor (XSD), plus encoding rules for all data types defined in this model.

3. RPC representation is the convention for representing remote procedure calls (RPC) and responses. [11]

SOAP tries to pick up where XML-RPC left off by implementing user-defined data types, the ability to specify the recipient, message-specific processing control, and other features.

SOAP's greatest feature is its ability to step past XML-RPC's limitations and customize every portion of the message. This ability to customize allows developers to describe exactly what they want within their message. The downside of this is that the more you customize a message, the more work it will take to make a foreign system do anything beyond simply parsing it.



In the example below, a GetStockPrice request is sent to a server. The request has a StockName parameter, and a Price parameter that will be returned in the response. The namespace for the function is defined in "http://www.example.org/stock" [13].









<?xml version="1.0"?>
<soap:Envelope
 xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
 soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">

 <soap:Body xmlns:m="http://www.example.org/stock">
  <m:GetStockPrice>
   <m:StockName>IBM</m:StockName>
  </m:GetStockPrice>
 </soap:Body>

</soap:Envelope>

Figure 8: SOAP request example

<?xml version="1.0"?>
<soap:Envelope
 xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
 soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">

 <soap:Body xmlns:m="http://www.example.org/stock">
  <m:GetStockPriceResponse>
   <m:Price>34.5</m:Price>
  </m:GetStockPriceResponse>
 </soap:Body>

</soap:Envelope>

Figure 9: SOAP response example
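To make the "typically conveyed using HTTP" part of the W3C definition above concrete, the following Python sketch posts the request envelope of Figure 8 using the third-party requests library; the endpoint URL is an assumption, and a real service may also require a specific SOAPAction header.

import requests

envelope = """<?xml version="1.0"?>
<soap:Envelope
 xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
 soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
 <soap:Body xmlns:m="http://www.example.org/stock">
  <m:GetStockPrice>
   <m:StockName>IBM</m:StockName>
  </m:GetStockPrice>
 </soap:Body>
</soap:Envelope>"""

# The envelope travels as the body of an ordinary HTTP POST.
response = requests.post(
    "http://www.example.org/stock",  # assumed endpoint
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8", "SOAPAction": ""},
)
print(response.text)  # on success, the Figure 9 response envelope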



3.3. FINAL IMPLEMENTATION

Due to its advantages and features, and to the GTH's previous working experience with SOAP, it was the protocol chosen to create the web services. The modules in the architecture can run on the same machine or on different machines over the network, interacting with each other.

Implementing SOAP brings a number of benefits, including modularization, transparency, and distribution of the computational load. By having specific web services, we can replace a module in our architecture in order to evaluate and compare its performance.

The toolkit called "Jabon", developed by the Intelligent Control Research Group of the Universidad Politecnica de Madrid and based on the SOAP protocol, was used in the development. The SOAP messages can be implemented in two ways:

1. Using a central server: A central server connects clients and services. This hub contains the rules, locations, and input/output data parameters of each service. Clients make their requests to the central server, and it associates a specific request with a specific service. The central server contains a set of rules (rules.xml) and agents (agents.xml) where all this information is stored. Figure 10 shows an example rules file.














<EsponjaRules name="" xmlns:ns1="urn:UrbanoEvents"
 xmlns:ns2="urn:UrbanoServices"
 xmlns:ns3="urn:NemoEvents"
 xmlns:ns4="urn:NemoServices">

 <rule>
  <input name="ns1:initUrbano"/>
  <output name="ns2:init"/>
 </rule>

 <rule>
  <input name="ns1:urbanoFinished"/>
  <output name="ns2:urbanoFinished"/>
 </rule>

 <rule>
  <input name="ns3:initNemo"/>
  <output name="ns4:init"/>
 </rule>

</EsponjaRules>

Figure 10: Example rules.xml



2. Clients directly call a specific service in a known location: Clients can request services from the service provider directly. Clients know the location and the service input/output parameters.



In the final application, the second approach was used. The different modules communicate with the main program, which acts as a server, without the use of rules, and the server communicates with specific services in specific locations when needed. The central hub provided by the Jabon tools was not used because the main program already acts as the server that distributes the information; therefore, another hub was not needed.

Next, the XML implementation of a SOAP message that goes from the emotional module to the main application is presented. The message is called sendemotionlevelstoHifi and consists of several float variables that store the emotional information, such as the happy, sad, and anger levels; these variables are sent to the Hi-Fi (main program) to be processed and used, for example to synthesize speech and generate an adequate response to a certain petition.


















<?xml version="1.0" ?>
<definitions name="nemohifi" targetNamespace="urn:nemohifi"
 xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">

 <message name="sendemotionlevelstoHifi">
  <part name="neutralLevel" type="xsd:float"/>
  <part name="happyLevel" type="xsd:float"/>
  <part name="sadLevel" type="xsd:float"/>
  <part name="angryLevel" type="xsd:float"/>
  <part name="surpriseLevel" type="xsd:float"/>
  <part name="fearLevel" type="xsd:float"/>
  <part name="shameLevel" type="xsd:float"/>
 </message>

 <portType name="nemohifi">
  <operation name="sendemotionlevelstoHifi">
   <input message="tns:sendemotionlevelstoHifi"/>
  </operation>
 </portType>

 <binding name="nemohifiBinding" type="tns:nemohifi">
  <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
  <operation name="sendemotionlevelstoHifi">
   <soap:operation soapAction=""/>
   <input>
    <soap:body use="literal" namespace="urn:nemohifi"/>
   </input>
  </operation>
 </binding>

 <service name="nemohifiService">
  <port name="nemohifi" binding="tns:nemohifiBinding">
   <soap:address location="http://127.0.0.1:12000"/>
  </port>
 </service>

</definitions>

Figure 11: sendemotionlevelstoHifi Jabon XML message
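For illustration, a hypothetical client invocation of this operation could look like the Python sketch below. It uses the third-party zeep library rather than the GTH's Jabon toolkit, whose client API is not shown in this document, and the emotion values are invented for the example.

from zeep import Client

# Load the WSDL of Figure 11 (here assumed to be saved as a local file).
client = Client("nemohifi.wsdl")

# One float per emotion, as declared in the message parts above.
client.service.sendemotionlevelstoHifi(
    neutralLevel=0.10,
    happyLevel=0.70,    # dominant emotion in this example
    sadLevel=0.00,
    angryLevel=0.00,
    surpriseLevel=0.10,
    fearLevel=0.05,
    shameLevel=0.05,
)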


4. EMOTIONAL SYSTEM

Emotion is fundamental to human experience, influencing cognition, perception, and everyday tasks such as learning, communication, and even rational decision-making. However, technologists have largely ignored emotions and created an often frustrating experience for people, in part because affect has been misunderstood and is hard to measure.

Nowadays, the study and development of systems and devices that can recognize, interpret, process, and simulate human emotions is known as affective computing. It is an interdisciplinary field spanning computer science, psychology, and cognitive science [14] [15].

This modern branch of computer science originated with Rosalind Picard's 1995 paper on affective computing [16]. The main motivation behind affective computing is the ability to simulate empathy: in order to do that, a system should be able to interpret the emotional states of its users and adapt its behaviour to them.

4.1. EMOTIONAL THEORIES

There are several theories that try to explain emotional behaviour in human beings. Applying these theories, and adapting them to a robotic agent, represents the first step towards creating an emotional agent. The emotional system is based on the Appraisal Theory, a psychological theory that establishes that our emotions are based on the judgments made about the events happening around us. Therefore, the judgments/evaluations of a certain situation determine our emotional response.

The other foundation of the emotional system is the Need Theory, based on Maslow's Hierarchy of Needs, which was adapted to the emotional agent in order to feed the appraisal model created.

This section provides the theoretical foundations of the emotional theories. Many of these theories complement each other, and their studies, results, and conclusions have given important guidelines for applying the theories to a robotic agent.

It is interesting to note that the majority of the computer models of emotions, if they refer expressly to psychological theories, are based on the so-called appraisal theories. These approaches can be converted into program code by adapting the models to the robotic agent, as the toy sketch below illustrates.
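The following Python sketch is a toy illustration of this conversion, not the model developed in this thesis: the emotion update is driven by the agent's judgment of an event's desirability, not by the event itself, which is the core claim of the appraisal theories.

def appraise(event_desirability, emotions):
    # event_desirability in [-1, 1]: the agent's judgment of the event
    # with respect to its goals; positive judgments raise happiness,
    # negative ones raise sadness (levels clamped to [0, 1]).
    if event_desirability >= 0:
        emotions["happy"] = min(1.0, emotions["happy"] + event_desirability)
    else:
        emotions["sad"] = min(1.0, emotions["sad"] - event_desirability)
    return emotions

state = {"happy": 0.2, "sad": 0.1}
# The same event, judged differently, yields a different emotional response.
print(appraise(+0.6, dict(state)))  # appraised as desirable: happier
print(appraise(-0.6, dict(state)))  # appraised as undesirable: sadder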



4.1.1. TWO FACTOR THEORY OF EMOTION

The Two Factor Theory, or the Schachter-Singer Theory of Emotion, is a social psychological theory of