Download - AusTalk

Roland Goecke

Trent Lewis

Michael Wagner


Big ASC Meeting 15
16 April 2010

What is Calibration?

Calibration is not so much a data collection process,
although that will happen to some extent as well

Rather it is about ensuring that the hardware components
and black box setup are all correct

Or at least that the settings have been recorded for
subsequent analysis

Occurs before the actual data recording takes place.



What is included?

Equipment checking

Is the audio and video capturing software running?

Are the lights set up correctly?

“Recording” of environmental settings

What is the light level?

What is the acoustic background noise level?

What are the distances between camera(s) and microphone(s)?

“Recording” of subject calibration sequences

Face turning

Lip movements



Why is this Data Important?

The calibration data is potentially fundamental to everyone who will
use the corpus.

To name just a few research areas that will particularly pay attention to
the calibration data:

A and AV speech recognition

A and AV speaker recognition

Biometrics (face recognition, face-voice recognition)

Speech Perception/Psycho-Acoustics researchers

Speech and Hearing researchers



Hardware and Software Requirements

Normal recording equipment and software

An additional light meter would be useful to measure the
‘global’ level of light in the recording environment

Do we need to do something similar for measuring the
acoustic background noise?

Swivel chair to place subject in

Assists the capturing of the face/head from different angles

We want the subjects to turn with the chair, not just turn their head

This is more accurate

Masking tape to mark chair position, angles, etc.

Metronome (AV



Collection Process - Step 1

2-step process

Step 1 - Record environment without subject

At the beginning of each session or, in case of sessions over longer
periods of time, once every hour in case the environmental
conditions have changed

Audio and video recording of the recording environment without a
subject present (30s)

Audio and video recording of the metronome in the scene (30s)

Measurement of location of light sources and distance to
camera(s)(manual measurement)

Check camera output is being recorded

Check microphone output is being recorded

Time: 5min
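
The timing rule for Step 1 (record the empty environment at the start of each session, then once every hour in longer sessions) can be sketched as a small check. This is only an illustration: the function name `environment_recording_due` and its parameters are hypothetical and not part of any recording software mentioned in these slides.

```python
from datetime import datetime, timedelta

# "Once every hour" is taken from the slide; adjust if the protocol changes.
RECALIBRATION_INTERVAL = timedelta(hours=1)


def environment_recording_due(session_start, last_env_recording, now):
    """Return True if a new no-subject environment recording should be made.

    session_start      -- datetime at which the current session began
    last_env_recording -- datetime of the most recent environment recording,
                          or None if none has been made yet
    now                -- current datetime
    """
    # Nothing recorded yet in this session: record at the beginning.
    if last_env_recording is None or last_env_recording < session_start:
        return True
    # Otherwise repeat once the interval has elapsed.
    return now - last_env_recording >= RECALIBRATION_INTERVAL
```

An operator script could call this before each subject and prompt for the two 30s recordings (empty scene, then metronome) whenever it returns True.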



Collection Process - Step 2

Step 2 - Person-specific calibration

At the beginning of each recording session with a subject

Sit subject on swivel chair. Measure the distances from camera(s) to subject
and from microphone(s) to subject (manual measurement)

Turn subject to 90° left. It is important that the subject turns their
entire body on the swivel chair such that the face (nose?) points in
the required direction.

We will need markers both on the floor and on the walls at 15°
intervals to facilitate the correct turning of the subjects.

Turn subjects to every 15° position, starting from -90° (left profile) to +90°
(right profile); take 2s at each position
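
As a rough sketch of the face-turning sequence, the positions and hold times can be listed programmatically, for example to print a prompt sheet for the operator. The function name `turn_schedule` is hypothetical, and the sketch assumes the time needed to rotate the chair between positions is ignored, so the real sequence will take longer than the total shown.

```python
def turn_schedule(start_deg=-90, end_deg=90, step_deg=15, hold_s=2.0):
    """Return a list of (angle_deg, hold_start_s, hold_end_s) tuples.

    Negative angles are to the subject's left and positive angles to the
    right, following the slide's -90 (left profile) to +90 (right profile).
    Time spent turning between positions is NOT included (assumption).
    """
    angles = range(start_deg, end_deg + 1, step_deg)
    return [(a, i * hold_s, (i + 1) * hold_s) for i, a in enumerate(angles)]


schedule = turn_schedule()
for angle, t0, t1 in schedule:
    print(f"{angle:+4d} deg: hold from {t0:4.1f}s to {t1:4.1f}s")
```

With the default values this gives 13 positions and 26s of hold time in total.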



Collection Process - Step 2

Let subject face camera frontally.

Participants are to say the following two lip movement
calibration sequences for 5s each:

e o e o e o …

(testing lip rounding)




(testing vertical mouth opening)

This is similar to what was done in the AVOZES corpus and
turned out to be quite useful for understanding the range
of lip movements of a subject
Other sequences are possible

Time: 5min



Coding and Annotation

No coding or annotation required as such

Want to take note of the environmental conditions in
which the recordings take place

Light level

Can the acoustic base level (i.e. the level when no one is talking) be
determined sufficiently from the audio stream of the recordings made without
a subject? If so, no extra measurements are required here.

Distance of camera(s) to subject(s)

Distance of microphone(s) to subject(s), e.g. to chin or mouth

Location of light sources and distance to camera(s) or subject(s) (we may
need a sketch of the recording environment for each location)
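
On the question of the acoustic base level: one possible approach, sketched here under assumptions, is to compute the RMS level of the 30s no-subject recording. The function name `rms_dbfs` is hypothetical, and the samples are assumed to be already decoded to floats in the range -1.0 to 1.0. The result is in dB relative to digital full scale (dBFS), not an absolute sound pressure level, so it only allows comparison between sessions recorded with the same microphone and gain settings; an absolute figure would still need a calibrated sound level meter.

```python
import math


def rms_dbfs(samples, full_scale=1.0):
    """Return the RMS level of `samples` in dB relative to full scale.

    samples    -- sequence of audio sample values (assumed floats in
                  [-full_scale, +full_scale]) from the no-subject recording
    full_scale -- magnitude that corresponds to 0 dBFS
    """
    if not samples:
        raise ValueError("no samples given")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    # 10*log10 of a power ratio equals 20*log10 of the amplitude ratio.
    return 10.0 * math.log10(mean_square / (full_scale ** 2))
```

Storing this value alongside the light level and the measured distances would keep the environmental conditions of each session in one record.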
