Evaluation of perception and interaction of human-friendly robots

Nov. 29, 2007

Rainer Stiefelhagen
Research group "Computer Vision for Human-Computer Interaction"
Interactive Systems Labs
Universität Karlsruhe (TH)
Outline

Our background
– Human-robot interaction
– Smart Rooms

Benchmarking for HRI
– Why, experiences, comments
– The CLEAR evaluation workshop
– Evaluating components vs. human-robot interaction

Conclusion
Human-robot interaction

Robots need to perceive and understand humans, their activities and interactions
– To cooperate
– To understand user needs and intentions
– To behave and respond appropriately
– To learn

Perception and interpretation of all communication cues is necessary
– Speech, gestures, gaze, focus of attention, affect, …
– The whole context needs to be perceived: who, what, where, to whom, how?


Humanoid Robot project in Karlsruhe
Context Aware Computing

Need recognition and understanding of multimodal cues
• Visual
– Location
– Identity
– Gestures
– Body language
– Gaze, head pose
– Focus of attention
– Facial expressions
• Verbal
– Speech
– Language
– Topic
– Handwriting
– Emotion

Example: Why did Joe get angry at Bob about the budget?

We need to understand the full context: Who, what, where, when, why and how?
Perception for the Karlsruhe Humanoid

Components for the Karlsruhe Humanoid
– 3D multi-person tracking (AV)
– Active camera control / attention (AV)
– Person identification
– Estimation of head pose and focus of attention
– 3D pointing gesture recognition (see the sketch below)
– Speech recognition
– Noise classification
– Multimodal dialog
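To give a feel for how 3D pointing gestures can be used to select objects (as in the kitchen tasks later on), here is a minimal sketch: it picks the known object that lies closest to the ray from the user's head through the pointing hand. The positions, object list and angular threshold are hypothetical placeholders, not the actual interfaces of the Karlsruhe system.

```python
import numpy as np

def select_pointed_object(head, hand, objects, max_angle_deg=15.0):
    """Pick the object closest to the head->hand pointing ray.

    head, hand: 3D positions (e.g. from stereo tracking).
    objects:    dict mapping object name -> 3D position in the room.
    Returns the best-matching object name, or None if nothing lies
    within max_angle_deg of the pointing direction.
    """
    head, hand = np.asarray(head, float), np.asarray(hand, float)
    direction = hand - head
    direction /= np.linalg.norm(direction)

    best, best_angle = None, np.radians(max_angle_deg)
    for name, pos in objects.items():
        to_obj = np.asarray(pos, float) - head
        to_obj /= np.linalg.norm(to_obj)
        angle = np.arccos(np.clip(np.dot(direction, to_obj), -1.0, 1.0))
        if angle < best_angle:
            best, best_angle = name, angle
    return best

# Hypothetical example: the user points roughly towards the cup.
objects = {"cup": [1.0, 0.5, 0.9], "fridge": [3.0, -1.0, 1.2]}
print(select_pointed_object(head=[0.0, 0.0, 1.7],
                            hand=[0.3, 0.15, 1.4], objects=objects))
```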
Perception Components
(Example screenshots: tracking & face recognition, face recognition, head pose)
System overview
Tasks

Interaction in a kitchen scenario
– Tracking, gestures, dialogue
– Selection of objects etc.

Interactive recognition of people
– Tracking, identification, dialogue
– Database continuously extended

Interactive object learning
– Object detection, dialogue
Smart Environments

Smart rooms are similar to human-friendly robots

Perception is necessary
– Tracking, identification, gestures, head pose / attention, actions, situations

Similar tasks
– Room should understand what is going on
– Room should provide services
– Dialogue and proactive behaviour needed

EU project "Computers in the Human Interaction Loop" (CHIL)
– Integrated project in FP6
Benchmarking!

Benchmarking is most important!
– Comparison of results and approaches is only possible on common benchmarks
– Understanding of good approaches leads to faster research progress

Common metrics need to be defined
– Often not yet agreed upon in the community (a sketch of one such metric follows below)

Multimodal data is needed!
– Both for development and benchmarking
– Data should fit the scenarios in mind and needs to be realistic
– Data collection and annotation is very expensive
– International joint efforts help to share the cost!
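One concrete example of such a metric is the multiple object tracking accuracy (MOTA) used in the CLEAR tracking evaluations, which folds misses, false positives and identity mismatches into a single score. The sketch below is a minimal illustration of that idea with made-up per-frame counts, not the official scoring tool.

```python
def mota(frame_counts):
    """Multiple Object Tracking Accuracy (MOTA) over a sequence.

    frame_counts: list of per-frame tuples
        (misses, false_positives, id_mismatches, ground_truth_objects).
    MOTA = 1 - (sum of all errors) / (sum of ground-truth objects).
    """
    errors = sum(m + fp + mm for m, fp, mm, _ in frame_counts)
    gt_total = sum(gt for _, _, _, gt in frame_counts)
    return 1.0 - errors / gt_total if gt_total else 0.0

# Made-up counts for a 3-frame sequence with 3 people in each frame:
print(mota([(0, 1, 0, 3), (1, 0, 0, 3), (0, 0, 1, 3)]))  # 1 - 3/9 ≈ 0.667
```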
Perceptual Technologies Evaluations in CHIL

Data collected in five smart rooms
– UKA, IBM, UPC, AIT, ITC

Common sensor setup
– 4 fixed cameras, fish-eye camera (ceiling)
– 5 microphone arrays, plus close-talking microphones
– All synchronized

Manual annotations of
– Speech / acoustic events
– Head centroid, bounding box, 3D location of eyes and nose bridge, ID

Evaluated technologies
• Vision technologies
– Face & head tracking
– Visual person tracking
– Visual person identification
– Head pose estimation
– Pointing gesture recognition
• Audio technologies
– Speech recognition (CTM / far-field)
– Acoustic person tracking (in space)
– Acoustic speaker identification
– Acoustic event detection
– Emotion recognition
• Multimodal technologies
– Multimodal person identification
– Multimodal person tracking
• Content processing
The CLEAR evaluation workshop

Classification of Events, Activities and Relationships (CLEAR)
– First held in 2006
– Focus is on technologies for human activity and interaction analysis
– Combining vision and other modalities

Goals
– Establish a common international evaluation forum for these technologies
– Create a venue for exchanging data and ideas and for sharing the evaluation burden across programs (leverage resources)
– Standardize evaluation tasks, metrics, and formats

Current tasks
– Face tracking, 3D person tracking, identification, 3D head pose estimation
– Current scenarios: smart rooms (lectures / meetings), surveillance

CLEAR is jointly organized with the NIST Rich Transcription (RT) evaluation
– Focus is on technologies for language transcription
– Speech-to-text, speaker diarization, …
– Current domain: meetings (& lectures)
– Organized by NIST
CLEAR Workshops

2006: 1st CLEAR evaluation workshop
– Jointly organized by Univ. of Karlsruhe (TH) and NIST
– Supported by the European project CHIL and the US VACE project
– Workshop in April 2006, Southampton, UK
– 15 participating labs (US and Europe)
– Jointly organized with the RT workshop
– Results published with Springer (www.clear-evaluation.org)

2007: 2nd CLEAR evaluation workshop
– Jointly organized by Univ. of Karlsruhe (TH) and NIST
– Supported by the European projects CHIL, AMI and the US VACE project
– Workshop in May 2007, Baltimore, US
– 17 labs (US, Europe, Asia)
– Collocated with the RT workshop
Benchmarking of Interactive Tasks

Quite different from component evaluation
– User in the loop
– Cannot just use a common data set

Interactive tasks need to be defined
– Kitchen
– Receptionist
– …?

Possible measures of success (see the sketch after this list):
– % of successfully completed tasks
– Time needed for completion
  • Seconds, # of dialogue turns, …
– User satisfaction
  • Ease of use, fun, etc.
  • Questionnaires
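These measures can be aggregated over logged user sessions. The following minimal sketch (with made-up session records and field names) shows one way to compute task success rate, completion time, dialogue turns and a mean questionnaire rating.

```python
from statistics import mean

# Hypothetical interaction logs: one dict per user session.
sessions = [
    {"completed": True,  "seconds": 42.0, "turns": 5,  "satisfaction": 4},
    {"completed": False, "seconds": 90.0, "turns": 11, "satisfaction": 2},
    {"completed": True,  "seconds": 35.5, "turns": 4,  "satisfaction": 5},
]

# Task success rate over all sessions.
success_rate = sum(s["completed"] for s in sessions) / len(sessions)

# Time and dialogue effort, counted only for successful sessions.
done = [s for s in sessions if s["completed"]]
avg_seconds = mean(s["seconds"] for s in done)
avg_turns = mean(s["turns"] for s in done)

# Subjective rating, e.g. from a 1-5 questionnaire item.
avg_satisfaction = mean(s["satisfaction"] for s in sessions)

print(f"success rate: {success_rate:.0%}, "
      f"avg time: {avg_seconds:.1f}s, avg turns: {avg_turns:.1f}, "
      f"satisfaction: {avg_satisfaction:.1f}/5")
```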
Conclusions

We need benchmarks!
– Comparability → faster progress
– Joint production of data resources and tools

First, we need to find scenarios that are interesting to many of us
– Homes, kitchen? Smart homes …
– Sensor setup
– Then: definition of metrics (can be very difficult / controversial)

Benchmarks must be aligned with our research projects
– Need to be challenging, but not completely impossible
– Make them more difficult as we progress

Evaluation of components vs. interactive tasks
– Scenarios should support both types

Evaluations should be international and open
– Maximize impact on and input from the community