ViSiCAST Milestone: Final Report

Project Number: IST-1999-10500

Project Title: ViSiCAST (Virtual Signing: Capture, Animation, Storage and Transmission)

Deliverable Type: Milestone Report

Deliverable Number: M8-3

Contractual Date of Delivery: December 2002

Actual Date of Delivery: December 2002

Title of Deliverable: Final Report

Work-Package contributing to the Deliverable: Workpackage 8 (Exploitation and Dissemination)

Author(s): Michele Wakefield

Abstract: This final report addresses the objectives of the project ViSiCAST and provides a comprehensive view of the results obtained.










ViSiCAST IST-1999-10500, 10 December 2002

INFORMATION SOCIETIES TECHNOLOGY (IST) PROGRAMME

Contract for: Shared-cost RTD

Project acronym: ViSiCAST
Project full title: Virtual Signing: Capture, Animation, Storage and Transmission
Contract no.:
Related to other Contract no.:
Date of preparation of Final Report: 10 December 2002
Proposal number: IST-1999-10500
Operative commencement date of contract: 01 January 2000
Duration: 3 years

Partners: Independent Television Commission (ITC), Televirtual (TV), Deaf Institute of the Netherlands (IvD) (to be known as Viataal in the future), Royal National Institute for Deaf People (RNID), University of East Anglia (UEA), The Post Office (PO), University of Hamburg (UH), Institut für Rundfunktechnik (IRT), Institut National des Télécommunications (INT).

Final Report





Contents

1. PROJECT OVERVIEW
2. PROJECT OBJECTIVES
3. APPROACH
4. PROJECT RESULTS AND ACHIEVEMENTS
4.1 Scientific / Technological Quality and Innovation
4.1.1 Television & Broadcast Transmission
4.1.2 Customer Services
4.1.3 Signed Weather Forecast on the Internet
4.2 Community Added Value and Contribution to EU Policies
4.3 Contribution to Community Social Objectives
4.4 Economic Development & Scientific & Technological Prospects
5. DELIVERABLES AND OTHER OUTPUTS
5.1 Workpackage 1: Television and Broadcast Transmission
5.2 Workpackage 2: Multimedia and WWW Applications
5.3 Workpackage 3: Face-to-Face Transactions
5.4 Workpackage 4: Animation and Modelling
5.5 Workpackage 5: Language and Notation
5.6 Workpackage 6: Trials and Evaluation
5.7 Workpackage 7: Project Management, External Communications & Publicity
5.8 Workpackage 8: Exploitation and Dissemination
5.9 Participation in Exhibitions, Articles, Conference Presentations
6. PROJECT MANAGEMENT AND CO-ORDINATION
6.1 Consortium and Workpackage Meetings
7. OUTLOOK
8. CONCLUSIONS




1. Project Overview


This final report describes virtual human signing systems developed by the European collaborative project named Virtual Signing: Capture, Animation, Storage and Transmission (ViSiCAST) (www.visicast.org).


ViSiCAST seeks to improve access to information, entertainment, education and public services for Europe's deaf citizens. ViSiCAST has developed enabling technologies to provide signing from annotated text, from the captured motions of a skilled human signer, and from speech. Recent advances in multimedia technology have created an opportunity to use a 'virtual human' sign language interpreter, in the form of an animated avatar. Led by the ITC under its technology research programme, this project has successfully demonstrated that virtual humans (VH) can achieve acceptable signing for television, point-of-sale and Internet applications.



Television & Broadcast Transmission

The signing possibilities using the ViSiCAST capture and broadcast technology were shown publicly at the International Broadcasting Convention in September 2002. Worldwide broadcasters and deaf participants registered their interest in IRT and INT's demonstrations.

Figure 1: Broadcast example of "open signing"
Figure 2: Example of "closed signing" approach


BBC Research and Development, collaborators with ViSiCAST, developed its first demonstrator for broadcast closed signing based on motion capture. The closed signing approach promises to open up many more programmes (eventually all those which have been subtitled) for sign language access, by use of automated translation from subtitles into sign language gestures and movements.


Customer Services

In co-operation with the UK Post Office (PO), ViSiCAST has also been exploring the possibilities of increasing access to customer services nationwide through signing. The ViSiCAST virtual human used in this application (named "TESSA") won the BCS 2000 IT Award and Gold Medal and was successfully exhibited at the Science Museum, London, in Summer 2001 (and at ACM One, the major conference of the Association for Computing Machinery, San Jose).



Figure 3: TESSA at the PO
Figure 4: TESSA, the first virtual signer at the Science Museum


Evaluations of this Post Office system by members of the Deaf community (in conjunction with the RNID) have shown that TESSA's signing is easily understood by BSL users, who are enthusiastic about how useful TESSA may be in the future. Encouraging reports have been broadcast on national television: TESSA was featured very favourably on the BBC's See Hear programme and the children's TV programme "Blue Peter".


Signed Weather Forecast on the Internet

In collaboration with the Dovenschap (Dutch deaf society), the ViSiCAST project has launched an internet weather forecast application. Interest shown by deaf users has been very encouraging. Signs have been captured for Sign Language of the Netherlands, German Sign Language and British Sign Language, each with a native signer of that language.


Figure 5: Web page with a weather forecast in sign language, performed by the avatar 'Visia'


The ViSiCAST project has also developed a multimedia package to assist in the learning of sign language, where the user can build a phrase to be signed interactively by the avatar.



Natural Language Translation

The natural language processing to animation route is now complete for simple sentences. The University of East Anglia, Televirtual and the University of Hamburg developed the automated translation system for ViSiCAST. This translates English text to signs represented in the Signing Gesture Markup Language (SiGML) and synthetically animates the virtual human in three European sign languages.


Figure 6: Translation of simple sentences to animation via SiGML sign descriptions


2. Project Objectives


The Project objectives can be found in Annex 1 of the Contract:

ViSiCAST develops, evaluates and applies realistic Virtual Humans (avatars), generating European deaf sign languages. By building applications for the signing system in television, multimedia, internet and face-to-face transactions, ViSiCAST seeks to improve the position of Europe's deaf citizens, their access to public services and entertainments, and to enable them to develop and consume their own multimedia content for communication, leisure and learning through:

- systems for the generation, storage and transmission of Virtual Signing Systems;
- user-friendly methods to capture and generate signs;
- a machine-readable system to describe sign-language gestures (hand, face and body), either to retrieve stored gestures or to build complete gestures from low-level gesture components;
- use of the descriptive language to develop translation tools from speech and text to sign.


3. Approach


The starting point for the project was a working prototype automatic signing system. The participants then identified three main application areas and two areas requiring new basic research. The applications, closely inter-related to the project's ultimate exploitation aims, were in television, the signing of stored contents, and in face-to-face transactions. Work on the applications started immediately using the existing signing system. The application systems were refined through an iterative evaluation process. The user evaluation was assigned its own workpackage to reflect the crucial dependence of the project on the involvement of the deaf community.

In general, the project methodology componentised each task. The sub-tasks were conceptualised in the simplest possible terms. The experience and knowledge gained from this process was then used as the basis for research into a task that more accurately reflected the real-world situation.

In practice, the results from the research workpackages were input to workpackages that address applications, which were then evaluated. Results from the evaluations were then used both to revise the research objectives and to refine the applications.

The work plan was arranged so that some effective deliverables were available relatively early. These were based on a large degree of manual intervention in the process of translating to sign language, or on the use of unmediated signing recorded as discrete performances. Some were based on interim stages towards full sign languages. Feedback from the deaf community refined the focus for later stages.

Teams working on a group of related workpackages were able to progress between milestones/deliverables with relatively little dependence on other groups, so that changes in the pace of progress within the project would not have serious repercussions.


4. Project Results and Achievements


The ViSiCAST project addresses the needs of the deaf community across Europe. The project aims to improve communication between the deaf and the speaking community, and to allow deaf people to access information sources. Poor access to television excludes deaf people from the major source of news and information, entertainment, education, and (modern) culture available to the speaking world. With the convergence of digital broadcasting and broadband multimedia on the internet, it is increasingly important that the needs of deaf people are addressed through signing, as recognised by legislation in the UK.




In addition to broadcast captioning, systems have been innovated by ViSiCAST for recognising a limited range of signs that allow deaf people to participate with greater ease in many transactions in social contexts such as Post Offices. A system by which sign language content can be added to web-pages and multimedia has made these media more accessible and useful for deaf people.

4.1 Scientific / Technological Quality and Innovation


ViSiCAST uses new multimedia technologies to radically improve the quality of life of deaf people by providing more effective and cheaper ways to communicate. Video and CD-ROMs are already improving communications for deaf people by providing a medium for recording and transmitting sign language. They do not, however, address all situations, for example in television or in a face-to-face interaction with a non-signer, or situations with limited bandwidth such as the internet accessed through a regular modem. Despite recent advances in video compression, the bandwidth and storage demands of video still make video an expensive medium. ViSiCAST addresses these problems by providing low-bandwidth and hence low-cost solutions.

4.1.1 Television & Broadcast Transmission

A final goal for the broadcast application of the virtual human signing system is the automated translation from text subtitles (which accompany a very high proportion of television programmes) into sign language, providing a wider choice of programmes for those who rely on this form of access. Techniques to translate in real time from English into natural forms of sign language (such as BSL) are, however, not yet fully mature.


In view of this, the ViSiCAST project has also innovated a simplified system which captures the movements and gestures of a human sign language interpreter and then codes these for low-bandwidth transmission and subsequent reconstruction, to be performed by a high-quality avatar in the receiver. A block diagram of the complete ViSiCAST approach, including the simplified system, is shown in Figure 7. The data rate needed for transmission of the signing information is approximately 50 kbps for the motion capture approach and 20 kbps for the notation-based approach. Both of these use significantly less bandwidth than even bandwidth-reduced video.
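The scale of the saving can be shown with rough arithmetic. In the sketch below, the two signing rates are the figures quoted above, while the video-inset rate is purely an assumed figure for illustration, not a project measurement:

```python
# Rough, illustrative comparison of transmission costs. The 50/20 kbps
# signing rates come from the report; the video rate is an assumption.
SIGNING_MOCAP_KBPS = 50      # motion-capture approach
SIGNING_NOTATION_KBPS = 20   # notation-based approach
VIDEO_INSET_KBPS = 500       # assumed bandwidth-reduced video signer inset

def ratio(video_kbps: float, signing_kbps: float) -> float:
    """How many times less capacity the avatar stream needs than video."""
    return video_kbps / signing_kbps

print(ratio(VIDEO_INSET_KBPS, SIGNING_MOCAP_KBPS))     # 10.0
print(ratio(VIDEO_INSET_KBPS, SIGNING_NOTATION_KBPS))  # 25.0
```

Even against this modest assumed video rate, the avatar streams are one to two orders of magnitude cheaper to transmit.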





Figure 7: Overview of ViSiCAST system


The simplified approach

For broadcasting, ViSiCAST innovates in the field of human motion capture itself and has created tools to produce a motion capture system capable of use by non-specialist operators. A human sign language interpreter produces motion-captured sign sequences to accompany the broadcast TV programme, as in the process shown at the top half of Figure 7. This involves a few simple steps:

- To provide the data needed to animate the virtual human in the receiver, the gesture movements of the human sign language interpreter are recorded in the form of motion capture.

- Data is captured using individual sensors for the hands, body and face (Figure 8). This is because natural sign languages, such as BSL, communicate efficiently with hand shape, position and movements, facial expression and body posture.

- Data-gloves, which have sensors to record finger and thumb positions, are used to record hand-shapes. Magnetic sensors also record the wrist, upper arm, head and upper torso positions in three-dimensional space relative to a magnetic field source. A video face-tracker, consisting of a helmet-mounted camera with infrared filters, surrounded by infrared light-emitting diodes, records facial expression. Reflectors are positioned at regions of interest such as the mouth and eyebrows. The various sensors are sampled at between 30 and 60 Hz.





Figure 8: Sign motion capture
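Because the channels described above are sampled at different rates (30 to 60 Hz), they must be merged onto a single animation clock before driving the avatar. One minimal way to do this is linear interpolation onto a shared timeline; the sketch below is an illustrative assumption about that step, not the project's actual capture software:

```python
from bisect import bisect_left

def resample(timestamps, values, target_times):
    """Linearly interpolate one 1-D sensor channel onto a common clock.

    timestamps   : sorted sample times (seconds) from one sensor
    values       : readings taken at those times
    target_times : the shared animation clock to resample onto
    """
    out = []
    for t in target_times:
        i = bisect_left(timestamps, t)
        if i == 0:
            out.append(values[0])            # before first sample: hold first
        elif i >= len(timestamps):
            out.append(values[-1])           # after last sample: hold last
        else:
            t0, t1 = timestamps[i - 1], timestamps[i]
            w = (t - t0) / (t1 - t0)         # interpolation weight
            out.append(values[i - 1] * (1 - w) + values[i] * w)
    return out

# e.g. a 30 Hz channel resampled onto a 60 Hz clock
ts = [0.0, 1 / 30, 2 / 30]
vals = [0.0, 1.0, 2.0]
clock = [0.0, 1 / 60, 2 / 60]
print(resample(ts, vals, clock))  # [0.0, 0.5, 1.0]
```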




The sign language interpreter could in principle perform long sequences of signing for live broadcast. In practice the data of the moves is usually recorded and edited, as a soundtrack would be. The real-time animation software allows edited sections to be sequenced without creating "jump cuts".
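One common way to join edited motion sections without a jump cut is to crossfade joint values over a short blend window at the junction. The sketch below illustrates that general technique under stated assumptions (flat lists of joint values per frame); it is not the ViSiCAST animation software itself:

```python
def blend_clips(clip_a, clip_b, blend_frames):
    """Concatenate two pose sequences, crossfading the junction.

    clip_a, clip_b : lists of poses; each pose is a list of joint values
    blend_frames   : frames over which to crossfade; the last
                     `blend_frames` poses of clip_a are mixed with the
                     first `blend_frames` poses of clip_b
    """
    head = clip_a[:-blend_frames]
    tail = clip_b[blend_frames:]
    blended = []
    for i in range(blend_frames):
        w = (i + 1) / (blend_frames + 1)  # weight ramps toward clip_b
        pa = clip_a[len(clip_a) - blend_frames + i]
        pb = clip_b[i]
        blended.append([a * (1 - w) + b * w for a, b in zip(pa, pb)])
    return head + blended + tail

# Two one-joint clips that would otherwise jump from 1.0 straight to 5.0
a = [[0.0], [0.5], [1.0]]
b = [[5.0], [5.5], [6.0]]
print(blend_clips(a, b, 1))  # [[0.0], [0.5], [3.0], [5.5], [6.0]]
```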


The automated translation approach

In parallel with this motion capture system, the ViSiCAST project has also innovated a more flexible sign generation system for the automatic translation of natural language into animation in what may be regarded as the true, or complex, forms of sign language, such as German Sign Language (DGS), Dutch Sign Language (NGT), or British Sign Language (BSL). This part of the system cannot yet reliably operate in the broadcast environment, but it is already able to translate simple sentences successfully.


Sign languages have different lexicons and grammar. Different parts of the body are used to generate gestures for different signs in parallel; modifiers, such as facial expressions, are used to provide context which may radically change (or even reverse) the meaning of a signed sequence. Position or direction of signing may relate it to somebody present or previously mentioned, or may indicate a temporal variation (past, present or future). Totally different signs may be used to indicate certain objects depending on context. For instance, a book is usually indicated by an iconic gesture: palms together, outstretched, opening to simulate the opening of a book and its pages. But in a sequence describing the giving of a book to somebody else, the gesture for the book may be in the form of the hand shape used for taking a book down from a shelf.


A system to address the complex sign languages requires many new state-of-the-art features. The project has utilised existing English natural language processing resources and extensions to these, with a world-leading first attempt to characterise grammatical features of natural sign languages within computer models. Within this approach, the ability to construct hand shapes and movements dynamically is essential. This has required development of an executable interpretation of, and extensions to, a machine-readable notation to describe gestures (including the various elements of hand, body and face). Prior to ViSiCAST, no such executable interpretation of a detailed notation existed.


The task of translating textual content into sign language is decomposed into the following sequence of stages (as illustrated in Figure 7):

1. The English text (perhaps from subtitles) is parsed and from this a semantic representation is derived. For example, "I invited four friends" is converted to a more complicated form: "I(Y), invite(Y, X), friend(X), number(X)=4".



Figure 9: Natural Language to Animation Process


2. For each target sign language, a variation of a common HPSG-based grammar, and lexical entries within this framework, have been developed. These grammars represent the first serious attempts to characterise grammatical and phonological features of natural sign languages in sufficient detail that synthetic sign sequences can be generated. For example, the sign sequence (referred to by means of English glosses) FOUR FRIEND INVITE I is generated, associating "friends" with a specific position in signing space and ensuring that the start position of "invite" is located at that position and ends at the signer. In addition, this synthesis stage generates the full inflected forms of nouns, verbs etc. from their base forms in the sign notation.



3. Sign sequences in the sign notation are then interpreted by software which drives the virtual human (avatar) animation. The software first translates the sign gesture notation into an XML-based intermediate language (SiGML). It then combines this avatar-independent notation with a description of the avatar's geometric properties to generate a stream of animation data.
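The three stages above can be sketched end to end. Everything in this sketch is an illustrative invention: the shape of the semantic tuples, the hard-coded gloss ordering standing in for the HPSG grammar, and the XML element names, which are merely SiGML-like and do not reproduce the real SiGML schema:

```python
import xml.etree.ElementTree as ET

# Stage 1 output (assumed shape): semantic predicates for
# "I invited four friends".
semantics = [("I", "Y"), ("invite", "Y", "X"), ("friend", "X"), ("number", "X", 4)]

# Stage 2, stood in for by a fixed result: a real HPSG-based grammar would
# derive the gloss order and inflections; here it is simply hard-coded.
def to_glosses(sem):
    return ["FOUR", "FRIEND", "INVITE", "I"]

# Stage 3: emit an avatar-independent XML stream (element and attribute
# names invented for illustration).
def to_sigml(glosses):
    root = ET.Element("sigml")
    for g in glosses:
        ET.SubElement(root, "sign").set("gloss", g)
    return ET.tostring(root, encoding="unicode")

print(to_sigml(to_glosses(semantics)))  # prints the SiGML-like XML string
```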


Transmission of Signing

Standard ways of representing the virtual human signing data and efficiently transferring this across different platforms were needed. ViSiCAST has innovated two formats: the "baf" ["Bones animation Format"] for the simplified system based on motion capture, and the Signing Gesture Markup Language (SiGML) for the transmission of synthetically derived animation based on notation.

The data streams drive the virtual human, the latest of which we have named Visia (Figure 10).



Figure 10: Visia as a 3D model


This combination of capture and synthetic generation is itself an innovative real-time animation technique and has proven capable of translation of simple sentences.

The signing preparation and virtual signing system has been used to prepare and present at least 4 sample television programmes with virtual signing, and the feasibility of broadcast transmission of virtual signing within MPEG-2 multiplexes has been established through transmissions in both the UK and Europe.


4.1.2 Customer Services




Another important area that ViSiCAST has addressed is face-to-face communication for the deaf. This innovative system integrates understanding of the Post Office assistant's speech and the deaf customer's signs, to control the dialogue in an intelligent fashion. The naturally restricted nature of the transactions at a Post Office helps to make this novel approach successful.


Figure 11: The VH signer communicating at the Post Office


The system has used newly developed image- and pattern-processing techniques to incorporate recognition of a limited number of signs made by the deaf customer. At the start of the project, several experimental systems incorporated speech understanding to provide a service, but they were mostly based around remote (telephone) access and upon understanding the speech of the caller alone.

4.1.3 Signed Weather Forecast on the Internet

A weather-forecaster innovation for the internet has been able to make use of signs captured for Sign Language of the Netherlands, German Sign Language and British Sign Language, each with a native signer of that language.

Figure 12: VH signing on the WWW




The ViSiCAST weather forecast application allows sections of signing to be built up, including variables such as temperatures, weather types and wind directions. In this application, the avatar is truly three-dimensional, i.e. the user can enlarge, reduce and turn it. This is done with simple mouse movements, and can be done both when the avatar is moving and when she is standing still. This is an important advantage because forward movements in sign language are sometimes difficult to perceive when presented in two dimensions.

The ViSiCAST project has also developed a multimedia package to assist in the learning of sign language, where the user can build a phrase to be signed interactively by the avatar.



Figure 13: VH signing in a learning environment


4.2 Community added value and contribution to EU policies


European legislation (to be implemented within the statutes of the Member States) requires that access to all services is made equally available to all citizens. This means that companies offering very diverse services have to address the issue of communication with sensory-impaired people. At present, such communication relies heavily upon human sign language interpreters, but there can never be enough of these skilled individuals to be present at every face-to-face interaction, or even to sign a large proportion of broadcast television.


The ViSiCAST project produces adaptable communication tools allowing sign language communication where only speech and text are available at present. The ViSiCAST development tools are based on advanced technology for the synthetic generation, transmission, and storage of sign language. These tools improve the integration of deaf individuals in society by allowing them access to widely available communications tools. This gives increased access to public services, commercial transactions, entertainment, educational and leisure opportunities, including broadcast television and the WWW.


The ViSiCAST project operates at European rather than national level because:

(i) Of the need for access to varied expertise and resources distributed around Europe.

(ii) It has required detailed understanding and analysis of a number of European languages, both verbal and signed, which is only really available from individuals and organisations experienced in those languages as native speakers.

(iii) Products from the research element have needed subjective and technical evaluation. Cross-national testing has ensured that methodologies have not been adopted which are language- or local-culture-specific.

(iv) By addressing the particular requirements of the DVB/MPEG standards, ViSiCAST helps to strengthen the international position and standing of European Commission-supported research. For example, IRT have claimed three new data broadcast identifiers at the DVB Project Office, such that future digital TV receivers will be able to identify ViSiCAST by the unique IDs.


There is a relatively high proportion of UK participants within ViSiCAST. This is because the initial first-generation work on virtual signing was carried out as part of the ITC's private out-sourced research programme. ViSiCAST represents the 'export' of this work to the wider European consortium and also brings in additional European expertise.


ViSiCAST complies with and contributes to the European Commission 5th Framework programmes by confirming the international stature of European Community research in the field of computational linguistics and real-time virtual human animation. Key companies in the project are SMEs. This is again in line with 5th Framework horizontal programme objectives.


4.3 Contribution to Community Social Objectives


With regard to social policy, ViSiCAST complies with the EU's objectives by improving quality of life and of living resources, for deaf people and for the hearing people with whom they need to communicate. ViSiCAST also helps to create a user-friendly information society: its signing virtual human tries to be a natural, intuitive human-machine interface. This aspect is enhanced by linking it, within the project, to speech-to-text systems. These measures help a larger section of society to use and interact with computer-based information systems.

While many deaf people are keen to enjoy the rapidly developing benefits of the digital age, they do not want to do so at the expense of their identity. It has been important throughout ViSiCAST that deaf people can use the new systems and find them of use. Deaf people are the best advocates of our product.


ViSiCAST has involved and empowered deaf people both at the stages of evaluating and testing of systems and, equally important, in the defining of legitimate goals. The University of Hamburg (working on the development of sign language notation and translation) is a teaching institution with the active involvement of deaf individuals. The Institute for the Deaf (IvD) in the Netherlands (working on the development of web applications and multimedia learning tools) is closely involved with its own deaf community. The Royal National Institute for Deaf People (RNID) in the UK is run by deaf and hearing people working together. It provides services for the deaf community and manages teams of sign language interpreters.


ViSiCAST contributes to the social objectives of the community as in Key Action 1 of the IST work programme. For people who are born deaf or have become deaf before learning a spoken language, it is often very difficult to learn to speak and to read and write. Sign language provides the only viable alternative; everything that can be expressed in spoken language can be expressed in sign language. Through the production of digital systems which communicate in sign language, ViSiCAST addresses this problem.


Figure 14: VHs at the Post Office


ViSiCAST improves access for members of the deaf community to information services over the internet and to face-to-face transactions in Post Offices. It helps them to integrate more fully into society and to enjoy services and sources of information taken for granted by most people. The ability to conduct transactions with counter clerks through the medium of sign enhances their independence.


It will soon be possible to produce signed-on-demand news services. An edit tool has been developed for inputting and editing SiGML to create animated sign sequences. The aim of this application is to enable sign language users to create their own animated sign language content. The first version of this tool will be made available to the Dovenschap (the national deaf society) in the Netherlands for the creation of a football news service in animated sign language.


The ViSiCAST project draws on a number of human language technologies, including machine translation, speech to text, and facial and gesture animation. In particular, it addresses multi-linguality in digital content and services by creating tools which allow services based predominantly on verbal language (speech and text) to offer a signed equivalent. Moreover, the system has been developed to translate between English and the sign languages of Great Britain, the Netherlands and Germany. The signing component of that system should be seen not as a unique service for the deaf, but as an extension of services being developed for society at large, to include deaf people.

Figure 15: Visia, as she might appear signing a news programme

4.4 Economic Development & Scientific & Technological Prospects


The size of the market for products employing deaf signing is not large, amounting to less than 0.1% of the population. In the UK, legislation under the 1996 Broadcasting Act required broadcasters of digital terrestrial television services to begin providing signing for 1% of their output from 2000. A simple look at the economics reveals that the costs of signing using real humans and the associated studio equipment run into many millions of Euros and will increase year upon year. The number of highly qualified signing interpreters is small, and the capacity of the digital terrestrial transmission multiplex would be strained by having to carry conventional MPEG-2-encoded pictures of real humans signing.


Promoting access to television programmes for deaf or hard of hearing people is an important objective of the Independent Television Commission. There are approximately 70,000 severely and profoundly deaf people in the UK who rely on sign language as the primary means of communication. Many deaf people, often those born deaf, find signing is the only language they can follow to keep abreast of programme content.


In recognition of the needs of these people, the UK Government has set a 10-year target of 5 per cent of programmes on digital terrestrial television (DTT) services to include sign language presentation or interpretation.


At present, these services use an “open signing” approach, where a sign language interpreter forms an integral part of the programme picture (Figure 1). The disadvantage of this approach is that viewers without hearing loss can find the interpreter distracting, and because of this broadcasters are often reluctant to transmit signing at peak viewing times.

The ITC has set interim UK sign language targets working up to the 5 per cent requirement. As these targets rise there is growing interest in also introducing a “closed signing” approach, where the image of the sign language interpreter can be turned on and off by the viewer (Figure 2). A disadvantage of this approach is that it requires the transmission of two programme feeds (one for the actual programme and a second for the signed commentary), requiring extra transmission capacity.


The ViSiCAST project created an opportunity to use a 'virtual human' sign language interpreter, in the form of an animated avatar. The advantage of this approach is that


only the positioning information needed to activate the avatar in the receiver (face, body, hands) needs to be transmitted, reducing the required bandwidth by up to a factor of ten compared with a video approach. More significantly, such an approach promises to open up many more programmes (eventually all those which have been subtitled) for sign language access by use of automated translation from subtitles into sign language gestures and movements.


If virtual human signing can eventually operate from the subtitled text, not only would many of the resource problems be solved, but the deaf community could have access to between 50 and 80% of programmes through signing. Solving all the linguistic problems necessary to produce British Sign Language automatically is beyond the term of the project, but substantial progress in this direction has been made. The transmission system has been developed and proven together with the virtual human. The VH is capable of running in the processor of any set-top box or domestic PC.

There remains great interest in the potential of ViSiCAST’s work: as yet there is no legislation for UK satellite services, nor closed signing of any kind in other countries worldwide which have adopted the standards set by DVB.


Web and multimedia access for those who are not fluent in text-based languages is a major untapped area of product development which ViSiCAST has addressed. The Web is a worldwide communication medium as well as an information medium, and ensuring that everyone has access to it for education and trading makes powerful commercial sense. ViSiCAST gives away a version of the browser to seed as large a market of users as possible, and then plans to charge companies for the authoring tools which will allow them to present their Web sites as accessible to deaf people.


There are spin-off applications; one such area is the virtual human itself for face-to-face transactions. The VH signs for the benefit of pre-lingually deaf people in the Post Office but might equally speak Spanish or Punjabi for those who wish to be addressed this way. ViSiCAST’s virtual human has more detailed and natural hand and face gestures than any other available, and this alone will enhance its para-social interaction with users.


Figure 16: A frame example of Visia signing



5. Deliverables and other Outputs




A key result in year 1 was a live signing system for TV using a virtual human according to a new standard transmission format, together with trials of face-to-face communication with deaf subjects and initial web-based tools.

Year 2 delivered an ambitious prototype text-to-signing tool and an avatar driven from HamNoSys.

Key year 3 results included a face-to-face dialogue system and a semi-automatic translator from text to signing.


The following sections outline the main developments in each workpackage:

5.1 Workpackage 1: Television and Broadcast Transmission


WP1 is concerned with the deployment of virtual human synthetic signing in broadcast television. The workpackage has two related aspects: the development and integration of the appropriate technology, and the monitoring and establishment of appropriate standards.


A complete synopsis of the transmission link in ViSiCAST is shown in Figure 17, which details the three potential scenarios for delivering ViSiCAST animation, namely MPEG-4 Video, MPEG-4 FBA (both developed by INT-ARTEMIS) and BAF-based (developed by UEA and Televirtual).


Figure 17: Synopsis of the ViSiCAST project transmission link


The transport layer of television broadcasting in DVB is used. In the DVB environment the MPEG-2 Transport Stream, the MPEG-4 / MPEG-7 data streams and the standard transport protocols were analysed in order to determine the mechanism for broadcasting ViSiCAST. Synchronisation of ViSiCAST data content with MPEG


video/audio has to be assured and restrictions identified. DVB-inherent trigger mechanisms have to be checked and modifications need to be formulated. At an intermediate level, it has to be ensured that the ViSiCAST broadcast model is compatible with the emerging broadcast standards.



D1-1 Direct Sign Transmission Demonstrator

One of the first aims of ViSiCAST was to produce a system for television transmission of (unmediated, live or recorded) signed performance, based on a signing avatar “driven” by motion capture data. An extract of pre-recorded television together with an associated ViSiCAST data-file was transmitted and successfully reconstructed in a PC-based receiver. This can be considered as a set-top box in the home.


For WP1 to achieve this it was necessary for WP4 first to supply an animation system capable of integration within a broadcast environment. The Consortium agreed that this system should be based on the Televirtual Mask-VR system. A working prototype was demonstrated to a committee of UK senior broadcast engineers drawn from all sectors of the industry (TDN Committee) on 31 August 2000.


Since bandwidth limitations apply, the motion format used internally for animating a Televirtual avatar cannot be sent directly, but must be compressed before transmission. Additional work was done on the compression of this broadcast format, achieving bandwidths directly comparable to those claimed for MPEG-4 systems.


Deliverable D1-1 documented the real-time CODECs which have been developed for this purpose. Close liaison between WP4 members (especially Televirtual and UEA) and WP1 created updated client-server versions of the avatar player for the broadcast application, including the development of a highly compressed transport layer. This permits broadcast-style transmission (i.e. no return path) at under approximately 30 kbit/s.
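As a rough illustration of why compression of the motion format is necessary, the following back-of-envelope sketch compares an uncompressed motion stream against the broadcast budget quoted above. The figures (25 fps, 50 animated bones, a 4-value rotation per bone, 16 bits per value) are assumptions made for this sketch, not the actual internal format of the Televirtual avatar.

```python
# Back-of-envelope estimate of avatar motion-stream bandwidth.
# All per-frame figures below are illustrative assumptions.
FPS = 25
BONES = 50
VALUES_PER_BONE = 4        # e.g. a quaternion per bone
BITS_PER_VALUE = 16        # fixed-point quantisation

def raw_kbit_per_s() -> float:
    """Uncompressed motion-data rate in kbit/s."""
    return FPS * BONES * VALUES_PER_BONE * BITS_PER_VALUE / 1000.0

def compression_factor(budget_kbit_s: float = 30.0) -> float:
    """How much the raw stream must shrink to fit the broadcast budget."""
    return raw_kbit_per_s() / budget_kbit_s

print(raw_kbit_per_s())       # 80.0 kbit/s raw
print(compression_factor())   # roughly 2.7x to get under 30 kbit/s
```

Even under these modest assumptions the raw stream overshoots the budget, which is why a dedicated compressed transport layer was developed.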


In parallel, IRT’s development of the broadcast transmission system continued, sending ViSiCAST data from a server PC to a visualiser PC using an RF-modulated broadcast link in December 2000 and April 2001. Successful work has demonstrated high compression of the data stream for test sequences.

The UEA/Televirtual proprietary transmission link is detailed below:









































Figure 18: ViSiCAST’s broadcast system (showing where the transmission layer has been inserted in the Mask-VR pipeline)


In tight cooperation with IRT, INT-ARTEMIS contributed to the demonstration of real-time transmission of a multimedia stream, associating MPEG-2 audio-visual data and MPEG-4 animation data, using the MPEG-2 transport layer. The advantages of using MPEG-2 as a transport mechanism in a broadcasting framework were found to be:




- Design simplicity: MPEG-2 takes care of all system requirements (including packetising, multiplexing and synchronisation). Animation data are combined with audio-visual data at the decoding stage, which only requires inserting an MPEG-4 SNHC coder/decoder before/after the MPEG-2 multiplexer/demultiplexer.

[Figure 18 component labels: Motion Data + Calibration, IHostCOM, Mesh, Mesh Attachment Description, Bone Set, Renderer, Avatar Codec, DSP, Compressed Motion Stream, MPEG-2 Broadcast Stream, Scene]





- Maximum studio compliance: maintaining full MPEG-2 compliance makes it possible to take advantage of existing hardware/software equipment.

- Standardisation: MPEG-4 transport over MPEG-2 streams is now standardised via the FDAM-7 directive. Moreover, this design minimises divergence from the DVB/MHP architecture.
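The packetising that MPEG-2 Systems handles can be pictured with a short sketch. This is a deliberately simplified Python rendering of Transport Stream packetisation; real DVB streams additionally involve PES framing, PSI tables and adaptation-field stuffing, all of which are omitted here.

```python
# Simplified sketch of MPEG-2 Transport Stream packetisation: a payload
# (here, an animation data stream) is cut into fixed 188-byte packets,
# each starting with a sync byte, PID and continuity counter.
TS_PACKET_SIZE = 188
HEADER_SIZE = 4
SYNC_BYTE = 0x47

def packetise(payload: bytes, pid: int) -> list:
    """Split a payload into 188-byte TS packets carrying the given PID."""
    chunk_size = TS_PACKET_SIZE - HEADER_SIZE
    packets = []
    for i, off in enumerate(range(0, len(payload), chunk_size)):
        chunk = payload[off:off + chunk_size]
        pusi = 0x40 if off == 0 else 0x00   # payload_unit_start_indicator
        header = bytes([
            SYNC_BYTE,
            pusi | ((pid >> 8) & 0x1F),     # flags + 5 high PID bits
            pid & 0xFF,                     # 8 low PID bits
            0x10 | (i & 0x0F),              # payload-only + continuity counter
        ])
        # Real streams pad via an adaptation field; 0xFF filler is a
        # simplification for this sketch.
        packets.append(header + chunk.ljust(chunk_size, b"\xff"))
    return packets

pkts = packetise(b"animation frame data" * 20, pid=0x101)
print(len(pkts), len(pkts[0]))
```

The fixed packet grid is what lets the multiplexer interleave the animation stream with the MPEG-2 audio-visual streams on separate PIDs.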



Dealing with the animation data pathway, the experimental set-up comprised: on the server side, the MPEG-4 SNHC coder and integrated UDP server; on the client side, the MPEG-4 coder/player equipped with a UDP client and running the MPEG-4 Visia avatar model. Despite the lossiness of the UDP protocol, media synchronisation and animation fluidity were found satisfactory. At the network level, a more adequate framework for properly dealing with synchronisation issues would obviously involve protocols incorporating error resilience mechanisms, such as RTP.


D1-2 Advanced Sign Transmission Demonstrator

Prototype implementation of the broadcast TV system for low-bandwidth transmission, able to handle various signing formats accurately and precisely. The signing was represented in different ViSiCAST formats: SiGML, BAF, FAP and MPEG-4 video.


In order to compare the representation and transmission solutions for the avatar animation, an analysis in terms of bandwidth and terminal complexity was performed to assess several possible scenarios for implementing the representation and transmission of avatar animation data.

This demonstrator was presented to the broadcast trade audience during the International Broadcast Convention 2002 in Amsterdam.


D1-3 Broadcast Specifications

Report on specifications and standards activity, covering inter alia:

i. relation of ViSiCAST transmission technology to MPEG-2 transport layer standards;

ii. compression and transmission of MPEG-4 compliant animation parameters over the MPEG-2 transport layer; and

iii. integration of the ViSiCAST-SiGML signing notation: the definition and description of the interface between the SiGML driver and the broadcast driver.


D1-4 MPEG-4 Video for Representing Avatar Animation

MPEG-4 Video encoding of synthetic scenes has been released by INT. INT's activities have focused on developing this end-to-end MPEG-4 solution for delivering synthetic signing within a broadcast context, including:

- an MPEG-4 Video encoder, and a packaged MPEG-4 Video decoder and player;





- defining an MPEG-compliant transport architecture, versatile enough to host a variety of data streams, including MPEG-2/4 audio-visual streams, MPEG-4 animation streams and such proprietary streams as BAF and SiGML data;

- specifying and implementing multimedia synchronisation and composition layers.


According to this framework, an MPEG-4 Audio-Visual suite was developed as follows:

- an MPEG-4 Video coder with video object-based coding capabilities, including (i) separate (spatial and motion) encoding of individual moving video objects, and (ii) coding adaptation via the automatic partitioning into frame groups with globally high/low motion activity for codec-dedicated subsequent processing;

- an MPEG-4 AAC (Advanced Audio Coding) coder;

- an integrated MPEG-4 Audio-Video decoder-player, packaged as a standalone application running on Windows NT / 98 / 2000 platforms. Based on the same high-performance thread-based architecture as the one used for the MPEG-4 SNHC decoder-player, it comprises:
  - a decoding engine, incorporating an optimised MPEG-4 Video decoder (50 fps video rate for a standard (540x388) frame size on Pentium III 500 MHz / 128 MB RAM platforms) and an MPEG-4 AAC decoder;
  - a rendering engine, based on the Microsoft DirectShow API.


5.2 Workpackage 2: Multimedia and WWW applications


This workpackage develops applications of virtual signing aimed at the WWW, multimedia and third-party software. An internet browser plug-in has been developed allowing the viewing of signs. A version is provided free of charge to deaf users.


D2-1 Internet Browser Plug-in

This provides high-quality signing for pages annotated with SiGML. To achieve the first objective, a tool was constructed early in the project using the first-generation avatar, which takes SiGML (D5-2), using a standard XML parser, and animates a limited number of predefined SiGML sequences. The tool was further improved as later versions of SiGML (WP5) and the avatar (WP4) became available.
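The plug-in's use of a standard XML parser can be pictured with a small sketch. The element and attribute names below are invented for illustration; they are not the actual SiGML schema.

```python
# Illustrative sketch: reading a sign playlist with a standard XML
# parser, as the browser plug-in does for SiGML-annotated pages.
# <playlist>/<sign file="..."> is a hypothetical structure.
import xml.etree.ElementTree as ET

PLAYLIST = """\
<playlist>
  <sign file="weather_sunny.sigml"/>
  <sign file="weather_rain.sigml"/>
</playlist>
"""

def sign_files(xml_text: str) -> list:
    """Return the sequence of sign files referenced by a playlist."""
    root = ET.fromstring(xml_text)
    return [el.get("file") for el in root.iter("sign")]

print(sign_files(PLAYLIST))
```

A real player would hand each referenced file to the avatar engine in order, rather than just listing it.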


D2-2 Web Pages with Signing

This demonstrates the value of the browser plug-in by developing content for WWW applications that are of interest to deaf signers. The material consists of weather forecasts which make use of signs created through motion capture, and which can subsequently be combined in a flexible manner to show any number of weather forecasts in three European sign languages. Special software called the Weather Forecast Creator was developed that makes it possible to create a weather forecast by


simply filling in a form. The Weather Forecast Creator converts the completed forms into grammatically correct sequences of signs, and the output is in the form of a SiGML playlist of captured signs. In D2-2, the weather forecast application has been further developed with the aim of introducing it as a service on the internet.

The weather forecast application was successfully launched for demonstration and evaluation in a real-life field trial on the internet. The launch was carried out in collaboration with the Dovenschap (national deaf society of the Netherlands). With the aid of the browser plug-in (D2-1), daily forecasts of Dutch weather can now be accessed at www.dovenschap.org/weerbericht. The aim is to continue the service on a permanent basis after the end of the project.
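The form-to-signs idea behind the Weather Forecast Creator can be sketched as follows. The field names, sign identifiers and the fixed ordering rule are hypothetical stand-ins; the actual tool encodes real sign-language grammar for three languages.

```python
# Hypothetical sketch of the Weather Forecast Creator idea: a filled-in
# form is mapped onto an ordered playlist of motion-captured signs.
SIGN_LEXICON = {
    ("sky", "sunny"): "SUN.sigml",
    ("sky", "rain"): "RAIN.sigml",
    ("temp", "cold"): "COLD.sigml",
    ("temp", "warm"): "WARM.sigml",
}

# A fixed field order stands in for the grammar of the target language.
FIELD_ORDER = ["sky", "temp"]

def forecast_playlist(form: dict) -> list:
    """Convert form fields into an ordered sign playlist."""
    return [SIGN_LEXICON[(field, form[field])]
            for field in FIELD_ORDER if field in form]

print(forecast_playlist({"sky": "rain", "temp": "cold"}))
```

The resulting playlist is exactly the kind of SiGML playlist the browser plug-in (D2-1) can animate.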


D2-3 Signing Tutor

This enables deaf people to create their own sign language content for web pages. First, a multimedia package (D2-3) has been produced to assist in the learning of sign language. It is possible to use the avatar interactively; i.e. the user can build a phrase and have the avatar sign it. In that respect, the deliverable goes beyond what is possible with conventional (i.e. digital video-based) learning environments for sign language.


Finally, an edit tool (M2-3) has been developed for inputting and editing SiGML to create animated sign sequences. The aim of this application is to enable sign language users to create their own animated sign language content. The first version of this tool will be made available to the Dovenschap (the national deaf society) in the Netherlands for the creation of a football news service in animated sign language.


5.3 Workpackage 3: Face-to-Face Transactions

This workpackage develops applications of the virtual human signing system to be used in face-to-face transactions, such as post offices, health centres and hospitals, advice services, and shops. The scenario for these transactions is a Post Office. The system developed allows the counter clerk serving the deaf customer to speak into a microphone and have his or her speech translated into on-screen virtual human signing. To improve the efficiency of the transactional system, it incorporates available technologies to “read” limited signs from the deaf customer and translate these into text (or speech) that the counter clerk can understand.


D3-1 Constrained PO System

This translates only a constrained set of spoken phrases from the clerk, which may contain a number of variable quantities, such as prices, weights, countries etc. Only high-volume transactions have been included, on the understanding that these popular transactions are used by the majority of the Post Office's customers. This system recognises 100 different phrases spoken by the clerk and enables the clerk to


accomplish 80% of all transactions that take place in a typical UK Post Office using automatic signing. Counter clerks and a panel of deaf users from the RNID evaluated and accomplished a number of transactions using the system.


D3-2 Unconstrained PO System

The new TESSA has been improved to incorporate extra Post Office transactions and was successfully deployed for trial at the London Science Museum in summer 2001. The Science Museum Post Office system acted as a normal Post Office offering the full range of services. This was a great success and generated a lot of media interest in newspapers and on television. During this period a member of the BBC's Blue Peter television programme saw TESSA on display and requested that we demonstrate the system on television.


This system extends the initial system to accept much less constrained speech from the clerk: for instance, instead of being required to say “Do you want first or second class postage?” the system would accept “First or second?”. This new prototype now uses a less constrained recognition system, IBM’s ViaVoice, with free-form speech input (still within the Post Office domain) being mapped to one of the pre-recorded phrases, the phrase then being translated into sign language for signing to the customer. The database of motion-captured data recorded enables the avatar to sign the recognised phrases.
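The idea of mapping free-form speech onto a stored phrase can be sketched as below. The report does not describe the actual matching method used with the recogniser, so plain word overlap and a hypothetical three-phrase store are used purely for illustration.

```python
# Minimal sketch: pick the pre-recorded phrase that best matches a
# free-form utterance, by counting shared words. The phrase store and
# the overlap heuristic are both invented for this illustration.
PHRASES = [
    "do you want first or second class postage",
    "how many stamps would you like",
    "that will be two pounds fifty",
]

def best_phrase(utterance: str) -> str:
    """Pick the stored phrase sharing the most words with the input."""
    words = set(utterance.lower().strip("?!. ").split())
    return max(PHRASES, key=lambda p: len(words & set(p.split())))

print(best_phrase("First or second?"))
```

The selected phrase would then be signed from its motion-captured recording, exactly as the full sentence would be.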


Following the Blue Peter programme we conducted an evaluation of TESSA, the face-to-face transactions system. During this trial, ten profoundly deaf people whose first language is British Sign Language (BSL) and five Post Office clerks undertook numerous Post Office transactions. During the evaluations it became apparent that a transaction took considerably longer to conduct using TESSA. As a result, an improved counter clerk interface screen was developed by the UEA to reduce transaction times.


D3-3 Dialogue PO System

The language processing aims to provide a high degree of reliability with minimal need for manual intervention to clarify ambiguity or correct badly signed phrases.

This system incorporates a limited dialogue between the clerk and the deaf customer. It was not the intention to attempt a comprehensive translation system for the whole of sign language into text, but rather to recognise a very limited number of signs. This helped to develop an appreciation of how sophisticated such a system needs to become before it can begin to be of practical use for members of both hearing and deaf communities.


The signs were recorded by five different signers in order to make the system independent of a particular signer’s characteristics. Each sign was repeated fifteen times, ten repetitions being used for training the system and the other five for testing. The system initially recognised five signs: ‘Hello’, ‘yes’, ‘no’, ‘could you repeat that’ and ‘thank


you’. Many of the phrases which TESSA can sign require yes/no answers, and therefore being able to recognise these two signs allows the system to interpret many of the customers' responses. Recognising the sign requesting that the phrase be repeated allows the customer some control over the timing of the delivery of the signing. This is important since in previous evaluations customers often missed the beginning of the signs because they were watching the clerk rather than TESSA.
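The train/test protocol described above (ten repetitions for training, five held out for testing, per sign) can be mimicked with a toy recogniser. The synthetic feature vectors and the nearest-centroid classifier below are invented stand-ins; the report does not specify the project's actual recognition algorithm.

```python
# Toy illustration of the 10-train / 5-test protocol for the five signs.
import random

SIGNS = ["hello", "yes", "no", "could-you-repeat-that", "thank-you"]

def make_repetition(sign_id: int) -> list:
    """Fake 4-dim feature vector clustered around a per-sign centre."""
    return [sign_id + random.gauss(0, 0.1) for _ in range(4)]

def centroid(vectors):
    return [sum(col) / len(col) for col in zip(*vectors)]

def classify(vec, centroids):
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda s: dist(vec, centroids[s]))

random.seed(0)
data = {s: [make_repetition(i) for _ in range(15)]
        for i, s in enumerate(SIGNS)}
centroids = {s: centroid(reps[:10]) for s, reps in data.items()}   # train on 10
tests = [(s, v) for s, reps in data.items() for v in reps[10:]]    # test on 5
accuracy = sum(classify(v, centroids) == s for s, v in tests) / len(tests)
print(accuracy)
```

With well-separated synthetic clusters the toy recogniser is trivially accurate; the real difficulty, of course, lies in extracting robust features from video of different signers.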


The Post Office is conducting a trial of the TESSA system at five Post Offices around the UK. Each Post Office was selected following advice from the RNID because they are situated in areas where there is a large deaf community. The trial has generated a great deal of media interest and has been filmed for a national Channel Four programme plus a regional programme in Bristol. Press articles have also been written for local papers in Derby and Liverpool.


5.4 Workpackage 4: Animation and Modelling


This is the first of the technology workpackages underlying the planned applications described above. WP4 develops the high-fidelity avatars or virtual humans used within ViSiCAST, along with the animation, motion capture, recording, transmission and replay software tools which underpin the applications being developed in WPs 1, 2 and 3. Tools are developed and refined using the proprietary standards and techniques used by one of the partners (Televirtual) and, in parallel, addressing the emerging MPEG-4 standard.

These tools integrate scanned-from-life human models for the virtual signer. Run-time software dynamically combines individual elements of signing gesture (individual hand and body movements, facial expression etc.) to create the complex, non-sequential elements of the true European sign languages. The individual components of shape and movement are acquired from life (motion capture) or constructed in 3D graphic space using physical modelling tools. Tools based on each system are developed in parallel and compared for functionality and subjective realism (cf. Evaluation WP). The final system combines both techniques.


Thus ViSiCAST signing avatars are capable of being driven directly, using real-time motion capture (live or recorded) from the performance of a human signer, or interfaced to a form of notation which describes sign language. The ‘notation avatar’ translates data from the latter, expressed in SiGML (see workpackage 5), into realistic, smooth, continuous motion. The synthetic motion data can be output using a number of formats: .baf (ViSiCAST virtual human animation files), .bvh (motion data format) and VRML H-Anim (virtual human).


For use with the direct recording of signed sequences, the project develops a refined suite of advanced hardware and software motion capture tools. These form a single,


coherent capture, recording and replay system, using robust techniques and equipment capable of use in TV studios and other industrial, non-laboratory settings. Similar systems have been developed for physical modelling of shape and motion.


At the beginning of the project, there were a number of rudimentary avatars or virtual humans available to the project. A “standardised” avatar, existing in two versions (“Visia” for generic ViSiCAST applications and “TESSA” for ViSiCAST applications developed for the UK Post Office), was developed during the first year. It (she?) has been incorporated in the run-time applications (e.g. broadcast viewer, weather forecast creator, internet browser plug-in) used elsewhere in the project.


D4-1 Prototype Animation System for Direct TV Transmission

The latest versions of the ViSiCAST avatar, Visia, were improved. It was clear that facial expressions are very important in practice, and to achieve acceptable signing in, for example, broadcasting, the facial expressions must be further developed. To this end Televirtual explored methods of morphing between meaningful expressions and UEA explored the use of statistical active appearance models. Both had the potential to run in real time and to be transmitted over low-capacity (bandwidth) data channels.


The visual quality of the avatar was improved to make it more ‘photo-real’. For example, Figure 19 shows an avatar head developed in this way from two photographic images of the person who was the original model for the Visia character.

Figure 19: Photographically derived avatar head (surface texture partly removed to reveal the underlying mesh)





D4-2 SiGML Notation-Avatar Software Driver

This is the first prototype synthetic animation engine which, given a signing sequence expressed in the SiGML notation (developed in WP5), generates a corresponding sequence of animation frame data, represented in the project’s internal BAF (“Bones Animation Format”). This is the format of the data used to drive the Visia avatar. The latest high-performance animation engine is now able to generate animations at 25 fps (frames per second) in a fraction of real time.





Figure 20: A frame example of Visia signing


In conjunction with the other architectural developments, the synthetic animation technology is capable of delivering animation in real time from a low-bandwidth stream of SiGML/HamNoSys signing data.

- The engine can be used as a scriptable component in WWW pages and other application environments.

- A BAF Player module provides the basic Visia avatar control with a front-end consisting of an animation data cache together with appropriate control facilities.

- Demonstration applications, each of which generates an on-screen animation given an input file of SiGML (or HamNoSys) data defining a signing sequence. This is achieved by a two-stage process: first the SiGML data is run through the animation engine to generate an intermediate file of animation (BAF) data, and then this BAF data is fed to the BAF Player module for on-screen animation.
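The two-stage structure (SiGML through the animation engine to a BAF file, then BAF into the player) can be sketched as follows. The function names and the shape of the intermediate data are hypothetical; only the two-stage pipeline itself comes from the report.

```python
# Sketch of the two-stage SiGML-to-screen pipeline.
def sigml_to_baf(sigml_text: str) -> list:
    """Stage 1: a stand-in 'animation engine' turning SiGML into BAF
    frames. Here each sign token simply becomes one placeholder frame."""
    signs = [s for s in sigml_text.split() if s]
    return [{"sign": s, "bones": {}} for s in signs]

def play_baf(frames: list) -> int:
    """Stage 2: a stand-in 'BAF Player' consuming frames one by one;
    returns the number of frames rendered."""
    rendered = 0
    for _frame in frames:
        rendered += 1   # a real player would pose and draw the avatar
    return rendered

frames = sigml_to_baf("HELLO THANK-YOU")
print(play_baf(frames))
```

Keeping the stages separate is what allows the same BAF Player to serve both synthetic animation and recorded motion-capture data.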


D4-3 Final ViSiCAST Avatar

The systems are now capable of being driven directly, using real-time motion capture, from the performance of a human signer, or interfaced to the ViSiCAST Gesture Mark-up Language, written in XML and based on extensive further refinement of the gesture notation scheme being perfected in WP5.


This deliverable had two primary elements: the creation of a high-polygon, high-fidelity project avatar, capable of capitalising on the advanced capabilities of modern, low-cost computer graphics cards; and the further development of the Mask animation environment into Mask 2, enabling it to host the avatar in updated versions of the project applications.



Figure 21: Visia 3, the advanced, high-fidelity, high-polygon ViSiCAST signing avatar








Mask 2 also allows the animation of the avatar to be “driven” by animation of a 3D skeleton hierarchy of bones, while at the same time permitting either bones or ‘morph targets’ to be used for the animation of facial features. This dual system has the benefit of suiting the animation to control by motion capture systems and/or synthetic animation parameters. This flexibility has the potential to allow the creation of both the wide range and the highly detailed facial expressions needed for sign languages.
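Morph-target blending, the standard technique behind the facial animation option described here, stores each expression as a displaced copy of the base mesh and mixes the displacements with weights at run time. The two-vertex "mesh" below is made up purely for illustration.

```python
# Standard morph-target blending:
#   result = base + sum_i( weight_i * (target_i - base) )
def blend(base, targets, weights):
    """base: list of (x, y, z); targets: name -> list of (x, y, z)."""
    out = [list(v) for v in base]
    for name, w in weights.items():
        for i, (tv, bv) in enumerate(zip(targets[name], base)):
            for axis in range(3):
                out[i][axis] += w * (tv[axis] - bv[axis])
    return [tuple(v) for v in out]

BASE = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
TARGETS = {"smile": [(0.0, 0.2, 0.0), (1.0, 0.2, 0.0)]}

# Weight 0.5 moves each vertex half of the "smile" displacement.
print(blend(BASE, TARGETS, {"smile": 0.5}))
```

Because weights vary continuously, the same mechanism serves both broad expressions and the fine-grained facial detail sign languages require.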


ViSiCAST is pursuing two approaches that address the problem, raised by Deaf commentators, of how good facial expression should be generated.

The milestone M4-3 (Advanced System for Physical Motion Modelling) investigated vision-based motion capture solutions as alternatives to hardware sensors.
Specifically, a vision-based markerless facial motion capture subsystem has been developed by INT for automatically grabbing, creating and playing MPEG-4 compliant facial motion content. This subsystem provides an offline, cheap and easy-to-set-up solution for accurate and robust facial motion capture, which significantly reduces the invasiveness and calibration issues encountered by the currently used optical marker-based, camera head-mounted real-time system. It integrates itself seamlessly into the network- and terminal-interoperable standardised environment for distributed virtual character animation developed in the framework of deliverable D4-4.

Figure 22: Vision-based motion capture of facial expressions using MPEG-4 compliant templates

D4-4 Advanced MPEG-4 Animation System

This deliverable is aimed at developing a standardised system for distributed virtual character animation able to support virtual signing within a variety of multimedia frameworks, including digital broadcasting, internet and mobile communications. To this end, the SNHC (Synthetic and Natural Hybrid Coding) part of the MPEG-4 multimedia standard has been retained. The developments, carried out by INT, have concerned:


1. a supervised mesh propagation algorithm for partitioning a virtual character into anatomical segments;

2. a modelling authoring tool for generating MPEG-4 SNHC-compliant human body models;

3. an animation authoring tool for editing MPEG-4 SNHC animation parameters;



4. a server-client communication tool for transmitting the MPEG-4 SNHC animation parameters;

5. an MPEG-4 SNHC player, integrating an MPEG-4 SNHC decoder and an optimised animation engine.


Together with the MPEG-4 SNHC encoder developed within the scope of workpackage 1, these components provide an end-to-end solution for designing, exchanging and animating MPEG-4 compliant virtual signer models, with the benefit of natively including such features as network/terminal scalability mechanisms, synchronisation with other media and multimedia compositing.













Figure 23: The MPEG-4 Avatar Animation Interface, showing some tools (inverse kinematics, animation parameter editing and motion interpolation)



5.5 Workpackage 5: Language and Notation


The main objective of WP5 is to develop language technology that integrates with virtual human animation technology to provide signing as a result of semi-automatic translation from English or from intermediate formats (semantic representation).

The interface to animation technology is based on sign language notation. It is therefore another objective of this workpackage to extend the Hamburg Notation System (HamNoSys) to cover all aspects of signing in the target sign languages (BSL: British Sign Language; DGS: German Sign Language; and NGT: Dutch Sign Language) and to implement that system as an XML application.


The enabling animation technology here is a sign language notation developed by the University of Hamburg, HamNoSys, which was further developed to give an XML-compatible notation, ViSiCAST-SiGML (Sign Gesture Mark-up Language). The lexicons for the target sign languages (BSL, DGS and NGT) were developed using this notation to describe phonological aspects of the sign languages.


D5-1 Interface definitions

This defined the computational and communication interfaces between the various system components, and involved:

(i) refinement of the HamNoSys notation for sign language so that it can control a human avatar;

(ii) extension of the HamNoSys notation so that it expresses aspects of posture and facial expression;

(iii) definition of a semantic representation, based on Discourse Representation Structures (DRSs), as the interface between English text processing and the sign language synthesis of the formulator;

(iv) definition of a common lexicon structure based upon the refined HamNoSys notation (as the basis for three lexicons, one for each sign language).

Minor modifications to these were incorporated as subsequent work progressed; however, in the main they remained robust throughout the project.
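The Discourse Representation Structures used as the semantic interface can be pictured, in much-simplified form, as a box containing discourse referents plus conditions over them. The class and example below are purely illustrative, not the project's actual representation:

```python
from dataclasses import dataclass, field

@dataclass
class DRS:
    """Simplified Discourse Representation Structure: a box of
    discourse referents and conditions over them (illustrative only)."""
    referents: list = field(default_factory=list)
    conditions: list = field(default_factory=list)

    def add_referent(self, name):
        # Each discourse referent is introduced at most once.
        if name not in self.referents:
            self.referents.append(name)
        return name

# "A woman buys a stamp" as a toy DRS:
drs = DRS()
x = drs.add_referent("x")
y = drs.add_referent("y")
drs.conditions += [("woman", x), ("stamp", y), ("buy", x, y)]

assert drs.referents == ["x", "y"]
assert ("buy", "x", "y") in drs.conditions
```

The attraction of such a representation as an interface is that it is neutral between English syntax and sign language syntax: the formulator consumes the referents and conditions, not the English word order.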


D5-2 SiGML definition

Signing Gesture Markup Language (SiGML) is at the heart of the ViSiCAST project, providing the link between content creation and animation. Tools have been developed to generate signing sequences in SiGML and to animate them within ViSiCAST applications.

Milestone M5-10 (Initial SiGML Definition) provided an encoding of HamNoSys 3 to enable initial work on synthetic animation. For D2-1 (Internet Browser Plug-in) an extended version was provided which enabled a playlist of sign files to be specified. Support is thus provided for applications based on motion-captured signs such as the Weather Forecast Application.

D5-2 (SiGML Definition) specified SiGML 1.0, which includes support for HamNoSys 4 and non-manual gestures as specified in D5-1 (Interface Definitions). The representation was reorganised to address the needs of synthetic animation, while retaining the structure of HamNoSys as far as possible. The notation is animated in D4-2 (SiGML-Driven Avatar), which is used to display the results of D5-3 (Prototype English to SiGML Translator) in M5-11 (Proto Text To Sign Animation).

The original plan envisaged a final definition of SiGML that would be offered for adoption by W3C. In practice, SiGML has not been used outside the ViSiCAST project, so it is unrealistic to assume that the definition is ready to meet a wider set of needs. M5-8 (SiGML Standardisation) identifies routes to standardisation through W3C, MPEG, and other European and international bodies.
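As an XML application, a SiGML signing sequence can be processed with standard XML tooling. The fragment below is only indicative of the general shape (a sequence of signs, each with manual and non-manual components); the element and attribute names are assumptions for illustration, not the actual SiGML 1.0 schema:

```python
import xml.etree.ElementTree as ET

# Illustrative SiGML-like document: names are assumptions,
# not the real SiGML 1.0 definition.
doc = """
<sigml>
  <sign gloss="WEATHER">
    <manual handshape="flat" location="chest"/>
    <nonmanual mouthing="weather"/>
  </sign>
  <sign gloss="RAIN">
    <manual handshape="claw" movement="down"/>
  </sign>
</sigml>
"""

root = ET.fromstring(doc)
glosses = [sign.get("gloss") for sign in root.findall("sign")]
assert glosses == ["WEATHER", "RAIN"]

# An animation player would walk each sign's components in order:
for sign in root.findall("sign"):
    components = [child.tag for child in sign]
```

Being ordinary XML is exactly what made a browser plug-in and playlist support straightforward: any SiGML consumer can be built on a standard XML parser.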


D5-3 Proto-text-to-sign notation

This implements a provisional text (English)-to-sign-notation translator. The result of this activity demonstrated that:

(i) the sign lexicon and sign formation rules permit synthesis of enough signs to cover a prototype domain;

(ii) standard English constructions of active, passive, interrogative and imperative sentences involving relative clauses and adjectives are converted to appropriate semantic representations.

With the completion of deliverable D5-3 (August 2001), we demonstrated the provisional route from English text to sign language notation, specifically demonstrating the feasibility of:

- conversion of isolated English sentences to a suitable intermediate representation (based on Discourse Representation Structures, DRSs);

- provisional sign language synthesis for a suitable lexicon, incorporating grammar rules which are representative of the issues which need to be addressed in generating natural sign language structures. The lexicon and the grammar for each of the three target languages (BSL: British Sign Language, DGS: German Sign Language, and NGT: Dutch Sign Language) are modelled in a Head-Driven Phrase Structure Grammar (HPSG).

The output of the generation system builds on the extensions of the Hamburg Notation System (HamNoSys) as described in deliverable D5-1 (February 2001) and their implementation as an XML application called SiGML (Signing Gesture Markup Language), described in deliverable D5-2 (May 2001). Milestone M5-12 demonstrated the completion of this route to visualisation within the avatar.
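The route just described (English text to a DRS, HPSG-based sign language generation, then SiGML for the avatar) can be summarised as a staged pipeline. The functions below are deliberately trivial stand-ins for the real components; only the staging itself is taken from the report:

```python
# Staged translation pipeline: each function is a stand-in for a real
# ViSiCAST component (parser, HPSG generator, notation emitter).
def english_to_drs(text):
    # Stand-in: a real system parses the sentence into a DRS.
    return {"referents": ["x"], "conditions": [("forecast", "x")]}

def drs_to_sign_sequence(drs, language="BSL"):
    # Stand-in for HPSG-based generation for BSL, DGS or NGT.
    return [cond[0].upper() for cond in drs["conditions"]]

def signs_to_sigml(signs):
    # Stand-in: emit a SiGML-like document for the avatar.
    return "<sigml>" + "".join(f'<sign gloss="{g}"/>' for g in signs) + "</sigml>"

def translate(text, language="BSL"):
    # The whole route is a composition of the three stages.
    return signs_to_sigml(drs_to_sign_sequence(english_to_drs(text), language))
```

The value of this decomposition is that the DRS stage is shared across all three target languages, while the HPSG generation stage carries the language-specific grammar and lexicon.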


D5-4 Integrated sign translation environment

This demonstrates that all components in this workpackage have been implemented. Deliverable D5-4 extends the approach taken in the D5-3 prototype, providing a more robust, less restricted demonstrator.

Work consisted of:

- Definition of a less restricted (but nevertheless still small) domain and collection of data from native signers on this domain.

- Extension of the grammar rules in order to handle more sign-language-specific phenomena. As many topics have not been satisfactorily answered in the literature, this implementation work often required empirical linguistic research to be undertaken in order to provide a solid foundation for the modelling task. Topics covered include:

  o Noun and verb pluralisation
  o Modal verbs
  o Verb aspect
  o Question pronouns and question word order
  o Temporal adverbs
  o Quantifiers
  o Connectives
  o Fingerspelling
  o Allocation of references in signing space

- Extension of the lexicon to cover this domain as well as to lay the grounds for certain grammar phenomena.

In addition, the system implementation allows multiple sentences to be treated as a discourse unit, resolving pronominal references and thus ensuring that such references are associated with the same location in signing space.

The system architecture allows monitoring of the progress of the translation by a linguistically sophisticated person, currently restricting human intervention to modification of the input text, indication of collective versus distributive plurals, and modification of the gesture notation prior to visualisation within the avatar.
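The allocation of references in signing space, the mechanism that lets repeated pronominal references resolve to the same location, can be sketched as a small allocator: each new discourse referent receives the next free locus, and every later mention of the same referent returns the same locus. This is a deliberate simplification of the actual implementation:

```python
class SigningSpace:
    """Assign each discourse referent a fixed locus in signing space,
    so that pronominal references reuse the referent's location
    (a deliberately simplified model)."""
    LOCI = ["left", "right", "upper-left", "upper-right"]

    def __init__(self):
        self._assigned = {}

    def locus_for(self, referent):
        if referent not in self._assigned:
            if len(self._assigned) >= len(self.LOCI):
                raise RuntimeError("no free locus in this simple model")
            # Hand out loci in a fixed order as referents appear.
            self._assigned[referent] = self.LOCI[len(self._assigned)]
        return self._assigned[referent]

space = SigningSpace()
assert space.locus_for("the clerk") == "left"
assert space.locus_for("the customer") == "right"
# A later pronoun referring back to the clerk points at the same locus:
assert space.locus_for("the clerk") == "left"
```

A real discourse model must also handle locus reuse, contrastive placement and agreement verbs that move between loci; the sketch shows only the consistency property the report describes.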


5.6 Workpackage 6: Trials and Evaluation

Informal feedback and evaluation is an inherent aspect of the entire project methodology. Furthermore, more formal evaluation of the work by deaf users is seen as of such importance that it has been given its own workpackage. Throughout the project, members of the deaf community in several countries evaluate the quality of ViSiCAST virtual humans, and the signing they generate.

Focus groups with profoundly deaf people whose preferred means of communication is sign language evaluate each virtual signing application. For the following styles of communication, the respective signing systems are evaluated for effectiveness, perceived usefulness and acceptability:

(i) WWW access
(ii) face-to-face transactions
(iii) TV


D6-1 Evaluation Report 1

Evaluation Report 1 describes and compares evaluations of the constrained Post Office system, the web browser plug-in and a demonstration of virtual signing on television. The evaluations provided a measure of the success of the applications developed in the project and showed that the three systems were perceived as useful by the majority of the sign language users who took part. The ratings given by sign language users for the clarity and acceptability of the signing and the avatars, and the scores for phrase intelligibility in the applications, were reasonable for all of the systems but showed scope for improvement. The greatest need for improvement was indicated for the demonstration of a television broadcast application. Feedback from the deaf participants and analyses of the data gave insight into how improvements could be achieved. It was concluded that improvements to the facial expressions and lip patterns of the avatars in all of the systems would improve intelligibility and subjective ratings of acceptability and clarity of signing.


D6-2 Evaluation Report 2

Evaluation Report 2 describes the evaluation of the unconstrained PO system (TESSA version 3). This evaluation included a comparison between identification of virtually-signed phrases and identification of video-recorded phrases signed by a human interpreter. Outcomes of this evaluation were then compared to those from evaluation of an earlier version of the system (version 1) and from an informal evaluation of version 3 in focus groups.

From the perspective of both deaf people and clerks, the only measurable improvement in TESSA version 3 compared to TESSA version 1 was a slightly higher rating given by deaf people for clarity of signing. On all other measures, including accuracy of identification of signed phrases and subjective ratings, no significant improvements were identified. The higher levels of accuracy and subjective ratings achieved for identification of video-recorded phrases from a human signer provide target levels to be achieved for virtually-signed phrases. Areas identified as needing improvement include an improved speech-recognition component with reduced delay between spoken and signed phrases, better facial expressions, more realistic lip patterns and clearer handshapes.

D6-3 Evaluation Report 3

A second round of focus group evaluations took place in October 2002. These followed a similar format to those completed in July 2001 but took place throughout the UK, rather than in London, to sample the views of more deaf people with a wider range of signing styles. Systems evaluated included a modified demonstration of virtual signing on television, the weather forecast system and a demonstration of virtual signing generated from text. The signing tutor is under evaluation in Germany, and the evaluation of the live internet site giving the Dutch weather forecast is nearing completion in the Netherlands. It is hoped to complete a comparison of virtual signing generated from motion capture and from text in early December in the UK.

In the Netherlands the weather forecast application is being evaluated in a field trial carried out by IvD in collaboration with Dovenschap. With the aid of the browser plug-in (D2-1) developed earlier in the project, a daily forecast of Dutch weather can be accessed during the evaluation period at URL: www.dovenschap.org/weerbericht.


5.7 Workpackage 7: Project Management, External Communications & Publicity

This workpackage is concerned with the overall management of the project, including external communications. The aims are to monitor and co-ordinate the activities of participants to ensure that the work plan is followed, that any adjustments are made in a properly controlled manner, and that the project is effective in producing output which will meet the needs of EU deaf citizens.

D7-1 ViSiCAST public and members administration website http://www.visicast.org

The consortium website was developed to provide public up-to-date information about the project and its partners, and as a means of exchanging and archiving project information.

D7-2 Project co-ordination and periodic reporting

This task covers the monitoring and reporting of the progress made by all the participants and the subcontractors in executing the project. This includes the following tasks:

- preparation of monthly progress reports

- identification of unforeseen problems that may impact on deliverables and hence require a change to the project work plan

- monitoring of activities of each participant organisation to ensure that manpower and materials are professionally and efficiently managed in accordance with the work plan

- co-ordination of participants' efforts to ensure maximum efficiency and minimum duplication of effort

- co-ordination of production of contract deliverables with the participant responsible for each, to ensure that contributions to them are produced on time

- co-ordination of project meetings and recording of minutes

- financial management and control of the project.

5.8 Workpackage 8: Exploitation and Dissemination

This workpackage is intended to ensure that the outputs of the project are disseminated and exploited as effectively as possible.

ViSiCAST is progressing in a number of dimensions detailed in this report:

- in community services

- for broadcast transmission

- for the WWW.

D8-1 & D8-2 ViSiCAST Marketing and Exploitation Plan

The Dissemination and Exploitation Report gives details of the exploitation and dissemination activities which have taken place.

D8-3 Technical implementation plan

A 'Technical Implementation Plan', compiled in the final stages of the project, will establish a procedure for the introduction and application of the developed technologies after the life-span of the project itself.


5.9 Participation in exhibitions, articles, conference presentations

Quarter  Date      Details

Q1       Dec 99    50th MPEG Meeting, Maui, Hawaii
         Feb 00    RFIA 2000 conference, Paris, France
         Feb 00    Journal paper: FIFF-Kommunikation 2/2000, Rolf Schulmeister (UH), "Übersetzung in und Generierung von virtueller Gebärdensprache im Fernsehen und Internet" (Translation into and generation of virtual sign language on television and the Internet), pp 44-47
         Mar 00    51st MPEG Meeting, Noordwijkerhout, Netherlands
         Mar 00    4th International Conference on Automatic Face and Gesture Recognition (FG 2000), Grenoble, France
Q2       Apr 00    IST/French Ministry for Education, Research and Technology joint workshop on Information Technologies for Health Care, Paris, France
         Apr 00    6th Conference on Content-Based Multimedia Information Access (RIAO 2000), Paris, France

         Apr 00    Bangham, J. A., Cox, S. J., Elliott, R., Glauert, J. R. W., Marshall, I., Rankov, S. and Wells, M. "Virtual Signing: Capture, Animation, Storage and Transmission - an overview of the ViSiCAST Project"; Bangham, J. A., Cox, S. J., Lincoln, M., Marshall, I. and Tutt, M. "Signing for the Deaf using Virtual Humans". IEE Colloquium on Speech and Language Processing for the Disabled and Elderly, 6/1-6/7, April
         Apr 00    Delivering Independent Access, Nuneaton, UK. Post Office event held for other businesses to promote disability awareness
         Apr 00    ViSiCAST public/private interactive website published. Provides details of the project demos, interactive facilities, progress reports and a members' virtual work/admin area
         May 00    52nd MPEG Meeting, Geneva, Switzerland
         May 00    Delivering Independent Access, Eastbourne, UK. Post Office event held for other businesses to promote disability awareness
         May 00    Delivering Independent Access, Ipswich, UK. Post Office event held for other businesses to promote disability awareness
         June 00   U Magdeburg Colloquium on Animation
         June 00   PAF Annual Conference, Mount Pleasant, London. Royal Mail event
         June 00   BDA conference, Belfast
Q3       July 00   Extensive article about ViSiCAST on See Hear, the premier programme for Deaf people on BBC television
         July 00   BBC2 See Hear features programme for the Deaf: introduction of the project to the UK deaf community. Seen also in Germany, France and the Netherlands
         July 00   7th International Conference on Theoretical Issues in Sign Language Research, Amsterdam. "Virtual Signing: First Steps on the Way to Machine Translation into Sign Language"
         July 00   53rd MPEG Meeting, Beijing, China. Participation in the AHG (ad-hoc group) on 3D profiles
         July 00   SPIE Conference on Mathematical Modeling, Estimation and Imaging, San Diego, USA. M. Malciu, F. Preteux, "Tracking facial features in video sequences using a deformable model-based approach"
         Aug 00    Article for British Deaf News planned for August
         Aug 00    "Tracking facial features in video sequences using a deformable model-based approach", M. Malciu, F. Preteux, presented at SPIE, San Diego


         Aug 00    Presentation and demonstration to UK DTN Committee. Presentation at the HQ of Channel Four Television to a committee of senior broadcast engineers from the BBC and commercial broadcasting organisations, with a view to the possible adoption of the ViSiCAST system for televised Virtual Human signing
         Sept 00   Independent Living Exhibition 2000, Wembley, UK
         Sept 00   EU workshop "Preparing a European Deaf Network for Information and Communication", Klagenfurt, Austria. Presentation about ViSiCAST with the aim of gaining feedback on (i) how deaf people feel about a signing avatar and (ii) how a signing avatar can be used as a communication medium in a European Deaf network. About half the audience was deaf. Their reactions showed that most are open-minded about a signing avatar and value it as potentially useful for communication between deaf and hearing people, as well as for the translation of written/spoken language. Suggested uses of avatars included: teaching deaf children to use the computer; a helpdesk for deaf people who have problems with their computer or with surfing the Internet; a signed dictionary that can be accessed via the Internet
         Sept 00   Deafway 2000 Exhibition, Hove/Brighton, UK
         Sept 00   Televirtual corporate WWW site re-designed and re-launched with new sections dealing with EU research projects, including ViSiCAST
Q4       Oct 00    54th MPEG Meeting, La Baule, France
         Oct 00    RNID Hear for All. Demos and conference presentation, TV
         Oct 00    Deaf Awareness Week Event, Caerphilly, UK
         Oct 00    ViSiCAST Exhibition, Islington Business Design Centre
         Oct 00    National Federation of Sub Postmasters Exhibition
         Oct 00    Islington Council Deaf Awareness Event
         Nov 00    Throughout early November, the project achieved major press and trade coverage following its triumph in the BCS top national awards competition, including a major two-page feature in "The Times"
         Nov 00    British Computer Society IT Awards
         Nov 00    IST 2000 Conference, Nice
         Nov 00    Post Office Disability conference
         Nov 00    Elliott, R., Glauert, J. R. W., Kennaway, J. R. and Marshall, I. "Development of Language Processing Support for the ViSiCAST Project". ASSETS 2000, 4th International ACM SIGCAPH Conference on Assistive Technologies, Washington DC, USA, November. 1 58113 3148


         Dec 00    Interview held by Deutschlandfunk in preparation for a feature report on ViSiCAST for the radio news magazine "Forschung Aktuell" (Research Today), broadcast in Jan 01
Q5       3 Jan 01  Radio interview broadcast by Deutschlandfunk. Interview with Rolf Schulmeister (UH): "Pantomime aus dem Rechner - Virtuelle Dolmetscher übersetzen Sprache in Gebärdensprache" (Pantomime from the computer - virtual interpreters translate speech into sign language), within the programme "Forschung aktuell"
         Jan 01    55th MPEG Meeting, Pisa, Italy. Participation in MPEG-4 SNHC and MPEG-7 groups
         9 Feb 01  Tages-Anzeiger (Zurich newspaper), p38. Andreas Hirstein: "Gebärdensprache, die aus dem Computer kommt" (Sign language that comes out of the computer) concentrates on the value of the ViSiCAST project for the Deaf community. Online version available at http://www.tages-anzeiger.ch (go to Archive, search for articles dated 2001-02-09 by author Hirstein)
         24 Feb 01 DER SPIEGEL (one of the largest weekly news magazines in Germany), p186. Katja Timm: "Dolmetscher im Datenanzug" (Interpreter in a data suit). The article features linguistic aspects of the ViSiCAST project. Text-only version available at http://www.spiegel.de/spiegel/0,1518,120631,00.html
         Feb 01    Nvidia Developers conference
         Feb 01    UK Royal Society - BCS Award Winners Presentation, Televirtual. One-person exhibition and presentation to an invited audience of politicians and heads of the IT industry
         Mar 01    56th MPEG Meeting, Singapore. Participation in MPEG-4 SNHC and MPEG-7 groups. Input document: "Generic articulated model: definition and animation" (restricted access)
         Mar 01    Demonstrator D1-1: Direct Sign Transmission
         Mar 01    ACM1, San Jose, California. ViSiCAST and TESSA demonstrated on the UEA stand at this major Association for Computing Machinery exhibition and conference
         Mar 01    Manifestation "Drempels Weg", Utrecht, Netherlands. Demonstration of Visia
Q6       7 Apr 01  WDR (West German TV broadcaster): 5-minute feature on ViSiCAST technology in the programme "Computer Club"
         Apr 01    Preparation and supply of video rushes of the motion capture shoot and other material to Poseidon TV for broadcast on Channel 4 TV
         Apr 01    Annual conference for Teachers of the Deaf in the Netherlands. Demonstration of Visia


         Apr 01    4th International Workshop on Gesture and Sign Language based Human-Computer Interaction, London, UK. A signing avatar on the WWW. UH presented a paper: "HamNoSys: Phonetic transcription as a basis for sign language generation". Kennaway, J. R., "Synthetic animation of deaf signing gestures", Lecture Notes in Artificial Intelligence 2298, 146-157
         May 01    Magazine article on ViSiCAST in Markant, a monthly magazine for people providing facilities and care for the handicapped in the Netherlands
         30 May-1 June 01  International Conference on Augmented, Virtual Environments and 3D Imaging, Mykonos, Greece. M. Preda, F. Preteux, "Advanced virtual humanoid animation framework based on the MPEG-4 SNHC standard"; M. Malciu, F. Preteux, "MPEG-4 compliant tracking of facial features in video sequences"
         26 June 01  Eins Live (regional radio programme in West Germany). Julia Forster: "Access: rules of the game for living in the 21st century" speculates on the impact of ViSiCAST technology on the everyday life of the deaf
         19 June 01  Article in London Evening Standard. About 200 words, an account of the TESSA system
         June 01   57th MPEG Meeting, Sydney. Participation in MPEG-4 SNHC and MPEG-7 groups
         June 01   Launch of TESSA at the Science Museum, London, UK
         19 June 01  TV interview: John Low interviewed by BBC television at the launch of the Post Office demonstration at the Science Museum. Item on Network South East and Look East TV on TESSA in the Science Museum, with interviews with John Low, RNID ViSiCAST manager, and a deaf user (coverage: the whole of the UK south-east and east region). Article in the 'Sun' newspaper: about 200 words, a short description of TESSA
Q7       5-10 Aug 01  HCI International. "The ViSiCAST Project: Translation into Sign Language and Generation of Sign Language by Virtual Humans (Avatars) in Television, WWW and Face-to-Face Transactions" (Rolf Schulmeister, UH)
         Aug/Sept 01  Magazine feature about virtual signing in RNID's 'One in Seven' magazine, titled 'RNID welcomes subtitling target'. Item on British Satellite News (coverage: worldwide except UK), with an interview with Stephen Cox, UEA


         4-7 Sept 01  Marshall, I. and Safar, E. "Extraction of Semantic Representations from Syntactic CMU Link Grammar linkages". Recent Advances in Natural Language Processing (RANLP), G. Angelova (ed.), Tzigov Chark, Bulgaria, pp 154-159. Safar, E. and Marshall, I. "The Architecture of an English-Text-to-Sign-Languages Translation System", same volume, pp 223-228
         10-13 Sept 01  B. Theobald, S. Kruse, G. Cawley and J. A. Bangham, "Towards a low bandwidth talking head using appearance models", in Proceedings of the British Machine Vision Conference (BMVC), pp 583-592, Manchester, UK, 2001
         13-19 Sept 01  IBC. Paper for the New Technology Forum on Virtual Humans, featuring Visia/TESSA VH signing; M. Wells, Televirtual
         28 Sept 01  London: interview with the BBC for the See Hear programme. Also supplied additional video material and signing sequences. Item broadcast in November
         2001      Constantine Stephanidis (ed.), Universal Access in HCI - Towards an Information Society for All, Vol 3, Mahwah NJ: Lawrence Erlbaum, pp 431-5. Rolf Schulmeister: "The ViSiCAST Project: Translation into Sign Language and Generation of Sign Language by Virtual Humans (Avatars) in Television, WWW and Face-to-Face Transactions"
         Sept 01   Recent Advances in Natural Language Processing (RANLP), Bulgaria. Refer 4.2
         Sept 01   EuroSign
         Sept 01   British Machine Vision Conference (BMVC), Manchester. Refer 4.2
         Sept 01   'Sign Language in the Workplace: a European Perspective' conference
         Sept 01   British Deaf Association Digital Congress. Prototype Internet application demonstrated on the RNID stand
         Sept 01   Symposium "Taal op 't spoor", Sint-Michielsgestel, NL. Demonstration of Visia
Q8                 WP3: paper submitted to the International Journal of Human Computer Interaction. Title: "The Development and Evaluation of a Speech to Sign Translation System to Assist Transactions"
                   WP2: paper submitted to the ASSETS 2002 conference. Title: "A Framework for WWW Applications for Deaf Users in the ViSiCAST Project"
                   WP3: paper submitted to the ASSETS 2002 conference. Title: "TESSA: a system to aid communication with deaf people"


         Oct 01    RNID's Breaking the Sound Barrier annual event: (a) Technology Zone, (b) Technology Seminar. Virtual signing display; advances in avatars
         Oct 01    RNID Breaking the Sound Barrier: presentation in plenary session by Professor Andrew Bangham
         Oct 01    RNID Exhibition/Conference: demos of ViSiCAST work
         Oct 01    SNHC ad-hoc meeting, Rennes, France. Participation in the MPEG-4 SNHC group
         3 Nov 01  10-minute piece about ViSiCAST and its developments so far on the television programme "See Hear", including interviews with project partners recorded at the 8th Consortium meeting in London in September
         5-7 Nov 01  Conference paper: Margriet Verlinden, Corrie Tijsseling and Han Frowein, 'Sign Language on the WWW'. In: Knut Nordby (ed.), Proceedings of the 18th International Symposium on Human Factors in Telecommunication, Bergen, Norway. The paper was awarded the John Karlin award for best paper at the conference
         Nov 01    18th International Symposium on Human Factors in Telecommunication, Bergen, Norway. Han Frowein presented a paper describing the WFC application and its evaluation in the Netherlands
         Nov 01    HAMP2001, Tutzing, DE. Virtual Signing: first step to converting speech to virtual signing
         Nov 01    EVA2001, Berlin, DE. Hands-on presentation of ViSiCAST sign language generation and animation technology
         Dec 01    58th MPEG Meeting, Pattaya, Thailand. Participation in MPEG-4 SNHC and MPEG-7 groups. Journal article: M. Preda, F. Preteux, "Insight into avatar animation and MPEG-4 standardisation", to appear in Image Communication Journal
         3 Dec 01  Scientific Journals Science Media Briefing. Safar, E. and Marshall, I. "Translation of English Text to a DRS-based, Sign Language Oriented Semantic Representation". Conference sur le Traitement Automatique des Langues Naturelles (TALN), 2: pp 297-306. B. Theobald, J. A. Bangham, S. Kruse, I. Matthews and G. Cawley, "Towards Videorealistic Synthetic Visual Speech", in Proceedings of the Workshop on the Management of Uncertainty in Geometric Computations, 2001 (accepted). B. Theobald, J. A. Bangham, I. Matthews and G. Cawley, "Visual speech synthesis using statistical models of shape and appearance", in Proceedings of Auditory-Visual Speech Processing (AVSP), pp 78-83, Aalborg, Denmark, 2001

Q9       8-10 Jan 02  13th Congrès Reconnaissance des Formes et Intelligence Artificielle (RFIA '02), Angers. M. Malciu, F. Preteux, "Recalage des éléments faciaux dans les séquences vidéos: une approche compatible MPEG-4" (Registration of facial features in video sequences: an MPEG-4 compatible approach) (submitted)
         Jan 02    Paper accepted: 'Using dynamic statistical models to texture the mesh of a talking head'. Barry J. Theobald, Silko M. Kruse, Gavin Cawley and J. Andrew Bangham
         Jan 02    Paper accepted for the ASSETS 2002 conference at Edinburgh. Title: "TESSA: a system to aid communication with deaf people"
         Feb 02    Safar, E. and Marshall, I. "Sign Language Translation via DRT and HPSG". Third International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), Mexico City, Mexico, Springer-Verlag, pp 58-68, Feb 2002, ISBN 3-540-43219-1, Lecture Notes in Computer Science (LNCS) 2276
         Feb/Mar 02  "Approaches to English to Sign Translation", S. Cox, E. Safar and I. Marshall. The Linguist, February/March 2002, Vol. 41 No 1, pages 6-10
         Mar 02    Marshall, I. and Safar, E. "Sign Language Synthesis using HPSG". Ninth International Conference on Theoretical and Methodological Issues in Machine Translation (TMI), Keihanna, Japan, March 2002
         Mar 02    Paper submitted to the International Conference on Spoken Language Processing, Denver, September 2002. Title: "Speech and Language Processing for a Constrained Translation System"
         Mar 02    "RTM" journal article "ViSiCAST, ein neuer programmbegleitender Dienst im digitalen Fernsehen?" (ViSiCAST, a new programme-accompanying service in digital television?) by W. Brucker and S. Kruse
         Mar 02    59th MPEG Meeting, Jeju, S. Korea. Participation in MPEG-4 SNHC and MPEG-7 groups
Q10      Apr 02    Paper presented at ICAT 2002: "ViSiCAST: Sign Language using Virtual Humans"
         Apr 02    East of England eGovernment conference. Presentation of TESSA and the future eSIGN project


         Apr 02    Glauert, J. R. W. "ViSiCAST: Sign Language using Virtual Humans". International Conference on Assistive Technology (ICAT 2002), Derby, BCS, pp 21-33, April 2002
         15 May 02  Eastern Daily Press, Eastern Evening News: TESSA
         May 02    60th MPEG Meeting, Fairfax, USA. INT participation in MPEG-4 SNHC and MPEG-7 groups
         May 02    UEA Court. Presentation on TESSA
         May 02    East of England Multimedia Alliance, UEA Sportspark. Presentation on TESSA
         21 June 02  Presentation of ViSiCAST to visiting German science journalists
         June 02   International conference tutorial (INT): M. Preda, "Virtual Character animation within MPEG-4", Proceedings of the Third Workshop and Exhibition on MPEG-4, San Jose, CA. Book chapter: Hanke, Thomas, "HamNoSys in a sign language generation context". In: Schulmeister, Rolf / Reinitzer, Heimo (eds), Progress in sign language research. In honor of Siegmund Prillwitz / Fortschritte in der Gebärdensprachforschung. Festschrift für Siegmund Prillwitz (International Studies on Sign Language and Communication of the Deaf; 40), Hamburg: Signum (2002), pp 249-264
         June 02   Third Workshop and Exhibition on MPEG-4, San Jose, USA. INT tutorial speaker on the MPEG-4 animation framework extension
                   Articles advertising TESSA in the local Post Office: 18 June 02 Derby Evening Telegraph; 19 June 02 Liverpool Echo; 22 June 02 Bristol Evening Post; 25 June 02 Wolverhampton Chronicle; 17 June 02 Wolverhampton Express & Star
Q11      July 02   "TESSA, a system to aid communication with deaf people", S. J. Cox, M. Lincoln, J. Tryggvason, M. Nakisa, M. Wells, M. Tutt and S. Abbott. Proc. ASSETS 2002, Fifth International ACM SIGCAPH Conference on Assistive Technologies, pages 205-212, July 2002, Edinburgh, Scotland


July 02

61st MPEG Meeting, Klagenfurt, Austria. Participation of
MPEG
-
4 SNHC and MPEG
-
7
groups



Book chapter

T. Hanke : Auf dem Wege zu Language Resources
für Gebärdensprachen. Forschungsberichte des Instituts für
Phonetik und Sprachliche Kommunikation der Universität
München (FIPKM) 37 (2001), pp. 191
-
203.



B. Theobald, J.A. Bangham, I. Matthews and G. Cawley, "Towards Videorealistic Synthetic Visual Speech", in Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Orlando, Florida, USA, 2002, pp. 3892-3895



Paper written on the constrained language system and submitted to ICASSP03


29 Aug 02: www.bbcworld.co.uk description of TESSA and ViSiCAST, with quotes from John Low


Aug 02: Safar, E. and Marshall, I.: "An Intermediate Semantic Representation Extracted from English Text for Sign Language Generation". The Seventh Symposium on Logic and Language, Pécs, Hungary, August 26-29, 2002


Aug 02: CVHI 2002, Granada, Spain. H. Popescu: "At a better integration of Deaf people into information society"


1 Aug 02: Articles advertising TESSA in the local Post Office: Liverpool Echo

1 Sept 02: Disability Times


Sept 02 issue: Glauert, J. R. W., "Future forecast: virtual human signing on the web", Ability Magazine, BCS Disability Group, Issue 45, pp. 20-21, September 2002


Sept 02: IBC Amsterdam, Netherlands

Sept 02: International Conference on Image Processing (ICIP 2002), Rochester, NY, USA. Critical review of MPEG-4 Face and Body Animation Proceedings

Sept 02: World Telecommunication Congress 2002 (WTC 2002), Paris, France. Demonstration of INT MPEG-4 compositing technology for ViSiCAST


Sept 02: In "Interpres", magazine for Dutch Sign Language interpreters: "Internet toegankelijker voor gebarentaalgebruikers" [Internet more accessible for sign language users], C. Tijsseling & H. Frowein

Sept 02: "Speech and Language Processing for a Constrained Speech Translation System". S.J. Cox. Proc. Int. Conf. on Spoken Language Processing, Denver, September 2002



Q12

Nov 02: NWO symposium "Tussen Brein en Bewustzijn" [Between Brain and Consciousness], Amsterdam, NL. Presentation of ViSiCAST: "Geanimeerde gebaren" [Animated signs], I. Zwitserlood and C. Tijsseling


Dec 02: "The Development and Evaluation of a Speech to Sign Translation System to Assist Transactions". S.J. Cox, M. Lincoln, M.J. Nakisa, M. Wells, M. Tutt and S. Abbott. Int. Journal of Human Computer Interaction (accepted for publication)


Dec 02: "A Comparison of Language Processing Techniques for a Constrained Speech Translation System". M. Lincoln and S.J. Cox. IEEE Conf. on Acoustics, Speech and Signal Processing, Hong Kong, 2003 (submitted)


Dec 02: B. Theobald, S. Kruse, G. Cawley and J.A. Bangham, "Towards a low bandwidth talking head using appearance models", Journal of Image and Vision Computing (IVC) (submitted)



Dec 02: "Dual Systems Processing and Translation at the Post Office: Reading the Signs". A. Wray, S.J. Cox, M. Lincoln and J. Tryggvason. Applied Linguistics (submitted)


6. Project Management and Co-ordination


Success in ViSiCAST has been achieved through the use of loose and dynamic project structures, coupled with the encouragement of strong social cohesion between individual project members. This flexible approach is underpinned by a single central Management Group, which performs both 'steering' and 'work co-ordination' tasks and meets at regular 3-month intervals for 2 days.


ViSiCAST has 9 partners and is organised into 8 project workpackages. Each workpackage is led by a different partner, the exception being the ITC, which as coordinator leads both management and exploitation. The BBC has collaborated with the ViSiCAST members to research and develop technologies and techniques that use virtual characters to convey deaf sign language accompanying television broadcasts.


To achieve these objectives, the project is structured around three application-orientated workpackages, each focusing on the technical issues of delivery in its specific application area, and two enabling-technology workpackages, focusing on virtual signing, sign language representation, and sign language synthesis from conventional textual sources. A further evaluation workpackage is concerned with eliciting feedback from deaf people at various stages within the development of the system.




It has been beneficial that organisations for the deaf are closely involved in the project, ensuring a clear focus on the key needs of the community and promoting the outcome of the project to the widest audience.


Contributing to this success is the structure of ViSiCAST: the workpackages are grouped into a number of layers (technology, application of the technology, and user trials/dissemination). The workplan is arranged into a number of phases: technology transfer and familiarisation; prototype applications; and advanced applications.


Each workpackage involves a number of partners, so cooperation is essential. Workpackage leaders are responsible for negotiating a detailed plan for achieving the workpackage milestones and deliverables by breaking the work into subtasks, each the responsibility of a single partner.


Technology development partners have often found it difficult to deliver to schedule, whether due to staff skill shortages or technology issues. The evaluators and developers have worked well together to avoid further impacts: the developers have kept the evaluators informed, and the user evaluators have arranged preliminary sight of the deliverables.



6.1 Consortium and Workpackage Meetings

Each partner has a project manager who belongs to the ViSiCAST management team, which meets in Consortium Meetings on a quarterly basis. Two days are set aside for each consortium meeting. Management business usually takes less than one day, to facilitate co-working; the rest of the time can then be given over to technical work in workpackages to help resolve any issues. Workpackages arrange additional meetings as and when required.


One example is the HPSG modelling task within WP5, where very close cooperation between all persons involved was necessary to ensure that the models for the three target languages came out as close as possible. This required direct communication between individual researchers from different teams. To initiate such close communication, two small workshops (2.5 days) were organised, attended by all researchers involved in HPSG modelling. Feedback from participants was very positive: they felt motivated both in their work and in putting more effort into communication. It was suggested that the work continue with at least one more workshop.


Decisions concerning the direction of the work have been made jointly by all participants, usually at the quarterly Management Group meetings. Where significant decisions must be made at other times, the action is co-ordinated by the Project Manager, usually via e-mail. All participants and subcontractors make their best endeavours to reach decisions acceptable to all. In the event of a disagreement which is not resolvable by negotiation, a decision is
reached by a simple majority vote of participants, where each has one vote irrespective of its level of commitment to the project. In the event that there is no overriding majority, the decision of the co-ordinating participant will be accepted.


Extensive use is made of electronic mail for regular communication and exchange of documents. Microsoft Office tools are used by default, and Microsoft Project is used in a simple way to support project management. A number of mailing lists are used to contact the project management team, or even all staff involved in the project.


The consortium website at http://www.visicast.org has been updated to provide public, up-to-date information about the project and its partners. The site provides demos of ViSiCAST deliverables; a means of archiving and exchanging project information; and facilities to assist project management by tracking resource usage and progress against the plan.


7. Outlook


In the UK and Europe, the ViSiCAST project is expected to benefit the deaf community and their service providers as follows:

- The weather forecast in sign language on the Internet is planned to be implemented on a permanent basis by the Dutch deaf society Dovenschap in the Netherlands and by RNID in the UK.



- Dovenschap and IvD are planning a project in which the ViSiCAST technology and applications are made available for general use by deaf people who want to use sign language in their emails and on their websites. In this project more synthetic signs are to be created, and structural support (course, helpdesk, etc.) will be set up in order to enable many deaf people and information providers to create their own content.



- In conjunction with other partners, Televirtual plans to proceed further with the development of broadcast avatar systems, for both sign language and general entertainment use.

- The Post Office is strategically planning a nationwide rollout of the TESSA system.



- New web-based services in animated sign language will be created using the ViSiCAST results in the follow-up European Commission project eSIGN, which will take virtual signing onto eGovernment portals. These services are meant to help Deaf people improve their employment situation and gain better access to social services and facilities offered by government bodies.


Involvement in ViSiCAST has enabled Televirtual to stay abreast of developments in the field of motion capture and animation. The stringent requirements of real-time motion capture for sign language recording - the ability to record and monitor face,
body and hand gesture in real time - exceed the demands of normal motion capture for animation, entertainment or scientific purposes. Thus the hardware/software suite of tools developed by Televirtual can be seen as representing the current state of the art in this area. There is further work to be done, however, to achieve a system capable of regular use in a studio rather than a laboratory context, and Televirtual plans to continue this work with other partners.


The Televirtual animation environment - the company's Mask software - has been extensively revised, in part as part of the ViSiCAST project. It has been optimised to give state-of-the-art animation, rendering and visualisation capabilities on standard PCs fitted with high-performance (but relatively low-cost) video cards. Mask 2, the second generation of the software, is designed to allow easy application building and has been the cornerstone of other developments within the ViSiCAST project.


In Europe, and in the follow-up European Commission project eSIGN, new web-based services in animated sign language will be created using the ViSiCAST results.



Televirtual will provide the Mask 2 animation environment as host software for the follow-on eContent project, eSign, which will take virtual signing onto eGovernment portals. It is, and will continue to be, developed and used as the host environment for the company's commercial and research activities, at national and European level. It is already being used to develop systems for broadcast entertainment use as far afield as Malaysia and Australia.



8. Conclusions


The work described in this final report has demonstrated that virtual humans can achieve acceptable signing for television, point-of-sale and internet applications. These achievements provide a firm foundation for further evolutionary improvements to the quality of translation and virtual signing.