Affective Computing

A Seminar Presentation by

Karthik Raman, 06005003
Adith Swaminathan, 06005005
Omkar Wagh, 06005006
Samhita Kasula, 06D05014


There can be no knowledge without emotion. We may be aware of
a truth, yet until we have felt its force, it is not ours. To the
cognition of the brain must be added the experience of the soul.



Arnold Bennett (British novelist, playwright, critic, and essayist, 1867-1931)

Abstract

Affective Computing is a field of research in AI dealing with emotions and machines. We

- address the impact of emotion on intellectual processes,
- propose a basic theory for recognizing emotions,
- survey a few existing techniques applied in affective computing, and
- motivate the reason for controlled integration of these techniques in AI.

Motivation

- AI (and Cognition) is very limited in scope if we limit it to rational thought.
- Can you quantify fear? Can you tell whether I am afraid?
- If I had a computer that could read your facial expressions and the tone of your voice, and "barked" accordingly, would you accept it as having a puppy-like "intelligence"?
- How often have you used emoticons in chat messages? Did you feel hampered without them?
- If we pursued this to the end, could we have AI-based Nazi propaganda?

Understanding Emotion: Hints from Psychology

- Psychology focuses on three broad divisions: Affect, Behaviour and Cognition (ABC).
- Affect is the ability to feel.
- Some contrasting theories of emotion:
  - James-Lange theory: we act, therefore we feel.
  - Neurological theory: emotion is a mental state due to the influence of certain neurochemicals (think hormones) on the limbic brain.
    - The limbic part of the brain is theorised to control emotion, behaviour, long-term memory and smell.
    - Recent findings show that the limbic system is not central to emotion.

Theories of Emotion

- Cognitive theories: emotions are a heuristic to process information in the cognitive domain.
  - Two-factor theory: appraisal of the situation and the physiological state of the body together create the emotional response. Emotion, hence, has two factors.

What's the take-away from all this? No one has a definitive theory of emotion!

Emotion vs Emotion Display

Such widely differing theories of emotion need not handicap our studies, since all of them agree on the various observable properties of emotions: the Emotion Display (or Affect Display).

Typical human affect display occurs through:

- Voice
- Face
- Gestures

Role of Emotion in Intellect

Three major areas of intelligent activity are influenced by emotions:

- Learning
- Long-term Memory
- Reasoning

Popular (exaggerated) examples of highly intelligent but emotionally challenged characters were shown here (images omitted).

Modelling Learning

- Learning by Example
  - The nearest analogy in AI is PAC learnability.
  - A parrot repeating English words; an infant learning language.
- Learning by Guidance
  - The nearest analogy in AI would be A* search (the heuristic is a guide).
  - Our educational system is based on this method.
- Learning by Feedback
  - The nearest analogy is a neural network / Expectation Maximisation, where the output is used to tweak the parameters of the system (see the sketch after this list).
  - A dog learning new commands; typical carrot-and-stick scenarios.
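A minimal sketch of the learning-by-feedback analogy, assuming nothing from the slides beyond the idea that the output error is fed back to tweak the system's parameters (the delta rule of simple neural networks); the model and data here are toys:

```python
import random

# Illustrative "learning by feedback": the output error is fed back to
# tweak the parameters of a 1-D linear model y = w*x + b, as in the
# delta rule used to train simple neural networks.

def train_by_feedback(samples, lr=0.01, epochs=200):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            y = w * x + b        # current output
            error = target - y   # the feedback signal
            w += lr * error * x  # nudge parameters to reduce the error
            b += lr * error
    return w, b

# Toy usage: recover y = 2x + 1 from noisy examples.
data = [(x, 2 * x + 1 + random.uniform(-0.1, 0.1)) for x in range(10)]
w, b = train_by_feedback(data)   # w converges near 2, b near 1
```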

Emotion and Intelligence

- Somatic Marker Hypothesis
  - Real-life decision-making situations may have many complex and conflicting alternatives: the cognitive processes alone would be unable to provide an informed option.
  - Emotion (by way of somatic markers) aids us; it can be visualised as a heuristic.
  - A reinforcing stimulus induces a physiological state, and this association gets stored (and later biases cognitive processing); a sketch of this idea follows.
- Iowa Gambling Experiment
  - Designed to demonstrate emotion-based learning.
  - People with a damaged prefrontal cortex (where the somatic markers are stored) did poorly.
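A hedged toy sketch of the somatic-marker idea (not from the slides): each option accumulates a stored "marker" from past outcomes, and that marker biases later choices before any deliberate evaluation. The deck names and payoffs below are invented, loosely modelled on the Iowa Gambling Task's risky-vs-steady decks:

```python
import random

def iowa_gambling(trials=200, lr=0.2):
    # Deck A pays more per draw but occasionally loses big (negative
    # expectation); deck B pays less but loses little (positive expectation).
    decks = {
        "A": lambda: 100 - (1250 if random.random() < 0.1 else 0),
        "B": lambda: 50 - (250 if random.random() < 0.1 else 0),
    }
    marker = {name: 0.0 for name in decks}  # stored "somatic markers"
    total = 0
    for _ in range(trials):
        # The markers bias the choice "without thinking"; a little noise
        # stands in for exploration.
        choice = max(marker, key=lambda d: marker[d] + random.uniform(0, 1))
        payoff = decks[choice]()
        total += payoff
        # Store the association: move the marker toward the outcome.
        marker[choice] += lr * (payoff - marker[choice])
    return total, marker
```

After a big loss on deck A, its marker turns sharply negative and steers the agent away, which is exactly the biasing role the hypothesis assigns to emotion.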


Emotion in Reasoning

- Minsky's ideas: an intelligent system should be able to describe the same situation in multiple ways (resourcefulness).
  - Such a meta-description is a "Panalogy".
- We now need meta-knowledge to decide which description is "fruitful" for our current situation and reasoning.
- Emotion is the tool in people that switches these descriptions "without thinking".
  - A machine equipped with such meta-knowledge will be more versatile when faced with a new situation.

Emotional Computers

[xkcd], a webcomic: www.xkcd.com (comic omitted)

Use of emotional computers

- A musical tutor for piano lessons:
  - Is it maintaining interest?
  - Is the student making mistakes?
  - Is the lesson tough, or is a piano key stuck?
  - Should it just make the user happy?
- Human teachers use affective cues.
- Imagine an emotionless tutor.

So how do we go about it?

- Answer: an Affective Theory of Computation.
- What are emotions? We don't really know!
- Avenues:
  - Express emotions
  - Influence emotions
  - Act on emotions
  - Perceive emotions


Express Emotions

- Display emotions
  - Computer voices with natural intonation
  - Computer faces
  - "How" to show I'm happy
  - Example: animation
- Model emotions
  - React to events
  - Internal representation of emotion
  - Example: Kismet

KISMET

- Recognises stimuli
- Intelligently displays emotion
- Efficient model for emotions (more on this later)
- Realistic (don't you get that puppy-dog feeling?)

[A,V,S] Emotion Model

- [Arousal, Valence, Stance]: a 3-tuple models an "emotion".
  - Arousal: surprise at high arousal, fatigue at low arousal.
  - Valence: contentment at high valence, unhappiness at low valence.
  - Stance: stern at a closed stance, accepting at an open stance.
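To make the 3-tuple concrete, here is a hedged toy sketch of how an [A,V,S] reading could be quantised and mapped to the labels above; the thresholds and lookup table are invented for illustration and are not Kismet's actual model:

```python
# Quantise each axis of the [A,V,S] tuple to low/neutral/high, then
# look up a coarse emotion label. All values here are illustrative.

def quantise(x, lo=-0.3, hi=0.3):
    return "low" if x < lo else "high" if x > hi else "neutral"

# (arousal, valence, stance) -> label, matching the slide's axis descriptions.
AVS_TABLE = {
    ("high",    "neutral", "neutral"): "surprise",   # high arousal
    ("low",     "neutral", "neutral"): "fatigue",    # low arousal
    ("neutral", "high",    "neutral"): "content",    # high valence
    ("neutral", "low",     "neutral"): "unhappy",    # low valence
    ("neutral", "neutral", "low"):     "stern",      # closed stance
    ("neutral", "neutral", "high"):    "accepting",  # open stance
}

def label_emotion(arousal, valence, stance):
    key = (quantise(arousal), quantise(valence), quantise(stance))
    return AVS_TABLE.get(key, "neutral")

# e.g. label_emotion(0.8, 0.0, 0.0) -> "surprise"
```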

Kismet's Emotive Response Table (image omitted; the table is reproduced under "Response Contd." below)

Influence Emotions

- Computers (in fact, all media) already do this!
  - E.g., a computer game makes one happy.
- Targeted marketing:
  - Frequency and types of ads
  - User profiling

Emotional Actions

- Which action suits which emotion?
  - A decision must be made.
  - There are too many or too few parameters to evaluate rationally.
  - Intimately related to the human psyche (e.g., choosing a gift for a loved one).
- Humans' ability:
  - Represent the same thing in many ways.
  - The representation depends on the current emotion.

Perceive Emotions

- Observe a human and infer his/her emotion.
- Approaches:
  - Speech tone recognition
  - Facial expression recognition
  - Galvanic Skin Response (GSR), electromyograms (EMG), etc.
- We'll talk about the first two (speech and facial expression).

Facial Expression Recognition: Learning by Feedback

- A classical example of learning by feedback.
- Young children look at their parents and "learn" from their facial expressions what is right and what is not.

Expressions & Emotions

- Although human beings can voluntarily adopt a facial expression, most of our expressions are involuntary in nature.
  - This is especially true for our immediate/reflex emotions; in such cases it is almost impossible to curtail our expression.
- The close link between the two sometimes leads to the reverse too, where assuming an expression leads to the emotion.

Significance of Facial Expressions

- The expression on a face is the most basic form of non-verbal communication.
- Our impression of other people is highly dependent on their expression.

Classes of Expressions

- Broadly classified into happy, sad, disgust, fear, anger, surprise and neutral.
- The goal is to classify an unknown expression into one of these classes.

AI and Facial Expression Recognition

- A foundation of affective computing is the recognition of human expression.
- The purpose is to introduce natural ways of communication into person-to-machine interaction.
- As with children, a robot can learn better when it looks for feedback from a "non-expert", in the form of facial expressions.
- More natural to us than "pushing buttons".

General Machine Vision

- The first step in the process is "vision".
- After the image is acquired, some preprocessing is done to reduce noise and improve contrast.
- Next, features are extracted and areas of interest are "detected".
- Finally, some high-level processing occurs.

Optical Flow

- Used to capture the motion of objects due to relative motion between object and observer.
- Also used to derive the "structure" of objects.
- Looks at the intensity of "voxels" and tries to solve a set of differential equations.
  - Voxels = volume pixels; think pixels in 3D.
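The slides do not spell out the differential equations; for concreteness, the standard brightness-constancy derivation (added here, not taken from the deck) gives the usual optical flow constraint, one equation per voxel, which is under-determined and hence supplemented by smoothness assumptions in actual solvers:

```latex
% Brightness constancy: a moving point keeps its intensity.
I(x+\Delta x,\; y+\Delta y,\; t+\Delta t) = I(x,y,t)
% A first-order Taylor expansion, divided by \Delta t, yields the
% optical flow constraint, with (u,v) the flow velocity:
I_x u + I_y v + I_t = 0
```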

Methods of Facial Recognition

- Early methods used optical flow to capture the movement of features (such as facial muscles).
- Broadly, methods are Model-Based, Feature-Based or Holistic Spatial Based.
- Model- and Feature-Based methods have a set of predefined features which are then used.
  - Though this is simple and reduces complexity, there is a loss of information.

Holistic Spatial Analysis

- The whole image is taken, not just specific features.
- There are no pre-defined features; rather, we try to discover intrinsic structural information, which is then used to recognise the class of expression.
- Further divided into unsupervised (examples: PCA, ICA) and supervised (example: FDA) methods. In the supervised case, training is done on class-specified samples (a sketch of the unsupervised variant follows).
- The math behind this is quite complex, based on feature subspaces.
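As a concrete, hedged sketch of the unsupervised variant: PCA over flattened grayscale face images ("eigenfaces" style). The array shapes and names are illustrative and not taken from [5]:

```python
import numpy as np

def pca_subspace(images, k=20):
    """images: (n_samples, n_pixels) array of flattened grayscale faces.
    Returns the mean face and the top-k principal directions."""
    mean = images.mean(axis=0)
    centred = images - mean
    # SVD of the centred data; rows of vt are the principal components.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return mean, vt[:k]            # (n_pixels,), (k, n_pixels)

def project(image, mean, basis):
    """Holistic representation: the face's coordinates in the subspace."""
    return basis @ (image - mean)  # (k,)

# An unknown expression is then classified (e.g., by nearest class
# mean) in this k-dimensional subspace instead of raw pixel space.
```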

Feature Selection

- Selecting some features assists in reducing the complexity of the process.
- We would want to select features that can "identify" the class.
- Hence the difference in the value of a feature between samples of the same class should be small compared to that across classes.
- This identifies the classification ability of a feature.

Weighted Saliency Maps

- A simple example of such a method; it uses the pixel intensities of grayscale images.
- For each pixel location k, it calculates the ratio of the variance between classes to the variance within a class:

  sigma_k = Var_B / Var_W,  k = 1, ..., n

  where Var_B = sum over all classes of (ClassMean - OverallMean)^2 and Var_W = sum over all samples f of (f - MeanOfClassOf(f))^2. Here n is the number of sample points.
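A small sketch of how this ratio could be computed per pixel; this is one reading of the formula above, not the exact procedure of [5] or [6]:

```python
import numpy as np

def saliency_map(images, labels):
    """images: (n_samples, n_pixels) array; labels: (n_samples,) array.
    Returns sigma_k (between-/within-class variance ratio) per pixel."""
    overall_mean = images.mean(axis=0)
    var_b = np.zeros(images.shape[1])
    var_w = np.zeros(images.shape[1])
    for c in np.unique(labels):
        cls = images[labels == c]
        class_mean = cls.mean(axis=0)
        var_b += (class_mean - overall_mean) ** 2          # between classes
        var_w += ((cls - class_mean) ** 2).sum(axis=0)     # within the class
    return var_b / (var_w + 1e-12)   # epsilon guards division by zero

# Pixels are then sorted by sigma in descending order and the top few
# hundred are kept as the most class-discriminative features.
```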

Weighted Saliency Maps (Contd.)

- These ratios are then sorted in descending order.
- The (omitted) figure showed the top 500 features of each class for a particular sample. Courtesy [6].

Speech Tone Recognition

- Why have humanoid robots?
  - Enjoyable interaction
  - Doesn't require training on the human's part
  - Easier to teach the bot new tasks
- Acoustic patterns contain:
  - Who the speaker is
  - What the speaker said
  - How it was said
- The third piece of information is a strong indicator of the underlying intent.


Abstraction of the problem

- Classify a given sentence as conveying one of:
  - Approval: "Good boy!"
  - Prohibition: "Don't do that."
  - Attention bidding: "Hey Kismet, look here."
  - Soothing: "It's okay, don't worry."
  - Neutral: "This is a book."
- Fernald's prosodic contours (figure omitted). Courtesy [7].

Robot specifications

- Aesthetics: appearance should affect the nature of human communication with it.
- Real-time performance: long delays are not acceptable.
- Voice: humans should be able to use their natural voice for training. It should be able to recognize a vocalization as having affective content when the intent of the sentence is to approve/prohibit, etc.

Specifications, Contd.

- Unacceptable vs acceptable misclassification: it shouldn't judge prohibition to be approval, but judging it as neutral is an acceptable error.
- Expressive feedback: respond to emotion to let the person know it has understood.
- Speaker dependence vs independence: the former for personalized bots, the latter for those that need to interact with many people.

Algorithm: Classify emotional content in speech

- Preprocessing: tag the sample with pitch, energy and percentage periodicity.
- Filter out noise: very high pitches (non-uniform) and very low pitches.
- Calculate features (mean and variance of pitch and energy, pitch range); a sketch follows.
- Pass to a classifier for the result.

Courtesy [7]
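A minimal sketch of this feature pipeline, assuming the librosa library for pitch (pyin) and frame energy (RMS); the sampling rate, pitch bounds and feature set are illustrative, not the exact setup of [7]:

```python
import numpy as np
import librosa

def affect_features(wav_path):
    y, sr = librosa.load(wav_path, sr=16000)
    # Pitch track; pyin marks unvoiced frames as NaN, which we drop
    # (this doubles as the "filter out noise" step for non-pitched frames).
    f0, _, _ = librosa.pyin(y, fmin=80, fmax=500, sr=sr)
    f0 = f0[~np.isnan(f0)]
    energy = librosa.feature.rms(y=y)[0]   # frame-level energy
    return np.array([
        f0.mean(), f0.var(), f0.max() - f0.min(),  # pitch stats + range
        energy.mean(), energy.var(),               # energy stats
    ])

# These feature vectors, labelled approval/prohibition/attention/
# soothing/neutral, can then be fed to any standard classifier.
```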

5-way classification in KISMET

- Stage 1: energy parameters are used to differentiate (soothing and low-intensity neutral utterances have low mean energy).
- Stage 2: using Fernald's prosodic contours, soothing shows a smooth contour with a frequency downsweep; neutral is coarser and flatter.

Courtesy [7]

Classification: Contd.

- Approval and attention show high mean pitch and high pitch and energy variance; prohibition has low mean pitch but high energy variation; neutral shows low energy and pitch variation.
- Stage 3: approval vs attention. Both have high energy and high pitch variation, but in approval there is an exaggerated rise-fall pitch contour. Yet this differentiation is difficult, and often the content is required to disambiguate (a staged sketch follows).
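A hedged sketch of the staged cascade just described; every threshold below is an invented placeholder, whereas [7] learns the actual boundaries from labelled data:

```python
def classify_intent(f):
    """f: dict with pitch_mean, pitch_var, energy_mean, energy_var,
    and rise_fall (a made-up score for the rise-fall contour shape)."""
    # Stage 1: low-energy utterances are soothing or low-intensity neutral.
    if f["energy_mean"] < 0.02:
        # Stage 2: soothing has the smoother, down-swept contour.
        return "soothing" if f["rise_fall"] < 0.2 else "neutral"
    # Prohibition: low mean pitch but high energy variation.
    if f["pitch_mean"] < 180 and f["energy_var"] > 0.01:
        return "prohibition"
    # Stage 3: approval vs attention, both high pitch/energy variance;
    # approval shows the exaggerated rise-fall contour (content may
    # still be needed to disambiguate).
    if f["pitch_var"] > 900:
        return "approval" if f["rise_fall"] > 0.6 else "attention"
    return "neutral"
```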

KISMET's response to emotion

- Has a synthetic nervous system (SNS) to help it react to external stimuli.
- A 'somatic marker' process tags incoming information with affective content:
  - Arousal: the level of emotional response
  - Valence: is the stimulus +ve or -ve?
  - Stance: how approachable is the percept?
- This information is passed to the 'emotion elicitor'.
- Emotion elicitor: each [A,V,S] input contributes to some emotion processes. E.g., a large -ve valence might contribute to the sad, anger, fear and distress emotions.

Response Contd.

- The winning emotion process affects the response if its value is above some threshold.
- There are two thresholds: one for the behavioural response, the other for the response through expression (the latter is lower). This indicates that expression leads the behavioural response; a sketch follows.
- On praise, first comes interest, and then physical alignment.
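A toy sketch of the elicitor-plus-two-thresholds logic; the contribution weights and threshold values are invented placeholders standing in for Kismet's actual tuning:

```python
EXPRESSION_THRESHOLD = 0.4   # lower: the expression comes first
BEHAVIOUR_THRESHOLD = 0.7    # higher: overt behaviour follows

def elicit(avs_inputs, weights):
    """avs_inputs: list of (arousal, valence, stance) tagged percepts.
    weights: {emotion: (wa, wv, ws)} contribution of each axis."""
    activation = {e: 0.0 for e in weights}
    for a, v, s in avs_inputs:
        for emotion, (wa, wv, ws) in weights.items():
            activation[emotion] += wa * a + wv * v + ws * s
    winner = max(activation, key=activation.get)
    level = activation[winner]
    return (
        winner,
        level >= EXPRESSION_THRESHOLD,  # show it on the face first...
        level >= BEHAVIOUR_THRESHOLD,   # ...then act on it
    )
```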




Kismet's Emotive Response Table:

            | Arousal   | Valence    | Stance   | Expression
------------+-----------+------------+----------+-----------
Approval    | Med. high | High +ve   | Approach | Pleased
Prohibition | Low       | High -ve   | Withdraw | Sad
Comfort     | Low       | Medium +ve | Neutral  | Content
Attention   | High      | Neutral    | Approach | Interest
Neutral     | Neutral   | Neutral    | Neutral  | Calm

Do we want 'Emotional' Machines?

- A Nazi propaganda machine?
  - A computer that knows how to influence emotions
  - The perfect politician
- Computers with the ability to kill
  - Not a distant dream; civilian aircraft are an example.
- Choosing a sub-optimal (emotional) path
  - Will an angry/insulted computer behave dangerously?
- Popular examples: M5 of Star Trek, HAL 9000 of "2001: A Space Odyssey"
- The example: Marvin of "The Hitch-Hiker's Guide"

Main Dilemma

- Computers without emotions are not creative or intelligent.
- Computers acting on emotions may someday wipe out their creators.
- A possible solution: give computers the ability to perceive, express and heuristically act on emotions, but ensure that the emotions are always visible.

Conclusion

- Affective Computing is a young field of research.
- For interactive systems, something far better than the current crop of "intelligent" systems is needed.
- Affective Computing has applications in improving the quality of life of impaired people (successfully demonstrated for autism).
- Ethical compromises will need to be made to incorporate affective computers.
- This field can really benefit from research into the human brain/mind.

References

1. R.W. Picard (1995), "Affective Computing", MIT Media Lab.
2. R.W. Picard (1998), "Towards Agents that Recognize Emotions", Actes Proceedings, IMAGINA.
3. http://www.ai.mit.edu/projects/humanoid-robotics-group/kismet/kismet.html
4. A. Damasio, "Descartes' Error: Emotion, Reason and the Human Brain" (1994 edition).
5. Ma and Wang (2005), "Automatic Facial Expression Recognition using Linear and Non-Linear Holistic Spatial Analysis", Lecture Notes in CS.
6. Joost Broekens (2007), "Emotion and Reinforcement: Affective Facial Expressions Facilitate Robot Learning", Lecture Notes in CS.
7. Breazeal and Aryananda, "Recognition of Affective Communicative Intent in Robot-Directed Speech", MIT Media Lab.
8. en.wikipedia.org: Emotion, Somatic Marker Hypothesis, Vision, Optic Flow.