Proposed Research: Implementation (VXML)

illinoiseggo, Software & Software Engineering

28 Oct 2013


The above diagram can be used to explain the project's VXML application. For such an application we will need several components:

A telephone network (Public Switched Telephone Network, PSTN).

A VXML platform. The VXML engine runs on this platform and as a VI
(Voice Interface) to the caller, the voice server translates VXML

An application server. This is typically a Web server that holds
applications and databases.

The network protocol. The VXML documents will be transferred across
the internet using HTTP over TCP/IP. This allows dynamic VXML to be
implemented using server-side technologies (ASP/JSP/CGI).
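
As an illustration of the application-server role, the sketch below answers an HTTP request with a small VXML document. It is only a minimal sketch: the proposal names ASP, JSP and CGI, whereas this example assumes a Python WSGI application, and the document content and the name `application` are hypothetical. The media type `application/voicexml+xml` is the one registered for VoiceXML in RFC 4267.

```python
# Minimal WSGI sketch of an application server that returns a VXML document.
# Assumption: Python stands in for the ASP/JSP/CGI options named in the text.

VXML_MEDIA_TYPE = "application/voicexml+xml"  # registered in RFC 4267

# Hypothetical root document, used only for illustration.
ROOT_DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="welcome">
    <block>Welcome to the service.</block>
  </form>
</vxml>
"""


def application(environ, start_response):
    """Return the root VXML document for any incoming request."""
    body = ROOT_DOCUMENT.encode("utf-8")
    headers = [
        ("Content-Type", VXML_MEDIA_TYPE),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

For local experiments, the standard-library function `wsgiref.simple_server.make_server` can serve `application` on a port of your choice; a real deployment would sit behind whatever Web server the VXML platform is configured to fetch from.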


Users can connect to a VXML application by calling the specific phone
number for this application. The VXML interpreter in the VXML platform then
receives and answers the call by executing the root VXML document. Based
on the project requirements, the system will require dynamic VXML in order to
access a DB (database) and perform tasks involving it. Dynamic VXML allows
the creation of robust applications by separating presentation from dynamic
content. To develop a dynamic VXML application, server-side scripting
technologies such as ASP.NET, JSP (Java Server Pages) and CGI will be used.
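
The separation of presentation from dynamic content could look roughly like the following sketch, in which a fixed VXML template is filled with values read from a database. Everything specific here is an assumption made for illustration: Python and an in-memory SQLite database stand in for the project's actual server-side language and DB, and the `customers` table, its columns and the helper names are hypothetical.

```python
import sqlite3
from xml.sax.saxutils import escape

# Presentation: a static VXML template with two placeholders.
VXML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="balance">
    <block>
      <prompt>Hello {name}. Your balance is {balance} euros.</prompt>
    </block>
  </form>
</vxml>
"""


def open_demo_db():
    """Create an in-memory database with one hypothetical customer row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (pin TEXT PRIMARY KEY, name TEXT, balance INTEGER)"
    )
    conn.execute("INSERT INTO customers VALUES ('1234', 'Maria', 250)")
    return conn


def render_balance(conn, pin):
    """Dynamic content: look up a caller by PIN and fill the template."""
    row = conn.execute(
        "SELECT name, balance FROM customers WHERE pin = ?", (pin,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown PIN: {pin}")
    name, balance = row
    # Escape database text so characters such as & or < cannot break the XML.
    return VXML_TEMPLATE.format(name=escape(name), balance=balance)
```

Calling `render_balance(open_demo_db(), "1234")` returns a complete VXML document that the interpreter could fetch and speak; the template never changes, only the values inserted into it.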

Dialog Implementation:

VXML provides two ways of creating dialogs: forms and menus. The top-level
element is <vxml>, which is a container for dialogs. Forms present
information and gather input, whereas menus offer users a choice of what to
do next. A form is defined with the <form> element and a menu with the
<menu> element. Within a form, the <field> element collects user input, and
<grammar> elements act as templates that specify which user input is legal.
Speech output through TTS (text-to-speech) is produced mainly with the
<prompt> element. The <block> element is also important, since it is a dialog
item that contains executable content.
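
To show how these elements fit together, the sketch below renders one document containing a menu, a form with a field and an inline grammar, prompts, and blocks. It is a hedged illustration rather than the project's actual dialog: the dialog names, the prompt wording and the helper `build_dialog` are hypothetical, the list of cities is generated in Python only to keep with the server-side approach, and platforms differ in which grammar formats they accept.

```python
from xml.sax.saxutils import escape

DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- A menu offers the caller a choice of what to do next. -->
  <menu id="main">
    <prompt>Say weather or goodbye.</prompt>
    <choice next="#weather">weather</choice>
    <choice next="#bye">goodbye</choice>
  </menu>

  <!-- A form gathers input: the field collects it, the grammar constrains it. -->
  <form id="weather">
    <field name="city">
      <prompt>Which city?</prompt>
      <grammar type="application/srgs+xml" version="1.0" root="city" mode="voice">
        <rule id="city">
          <one-of>
{items}
          </one-of>
        </rule>
      </grammar>
    </field>
    <!-- A block holds executable content, visited after the field is filled. -->
    <block>
      <prompt>You chose <value expr="city"/>.</prompt>
    </block>
  </form>

  <form id="bye">
    <block>Goodbye.</block>
  </form>
</vxml>
"""


def build_dialog(cities):
    """Insert one grammar <item> per allowed city into the document."""
    items = "\n".join(
        f"            <item>{escape(city)}</item>" for city in cities
    )
    return DOCUMENT.format(items=items)
```

For example, `build_dialog(["athens", "london"])` yields a document whose grammar accepts exactly those two city names as legal input for the `city` field.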

VXML also provides a number of ways of navigating between dialogs. Two of
the most important are the <goto> and <submit> elements, both of which can
be used in executable content such as <block> elements. The <goto> element
causes a transition to another item, which may be in a different dialog or on
a different page. In contrast, <submit> sends variable values to the server,
obtains a new page in response and transfers control to a dialog on that
page. Another element to consider is <link>, which specifies a link that is
active throughout its enclosing scope. It should contain at least one
grammar; when that grammar matches the user input, either a hyperlink is
followed or an event is thrown.
Regarding dialog design, VXML offers several elements that help make a
dialog more convenient for the caller. They guide the TTS engine in the VXML
platform on how to say and pronounce the dialog text, giving the speech a
more natural feel. Such elements include <sentence> (or <s>) and
<paragraph> (or <p>), which clearly delimit sentences and paragraphs. Two
other important elements are <emphasis> and <prosody>. The first requests a
specific level of emphasis (weight or stress) for the synthesized speech
generated from the enclosed text. The second controls the prosody of the
enclosed text: volume (loudness), rate (fast, medium or slow), pitch (high,
medium or low) and pitch range (variation in pitch). The synthesizer uses
these tags as rules on how to inflect the generated speech. Moreover, all of
the structural tags (e.g. <prompt>) support an optional attribute for
language specification. For example:

<prompt xml:lang="en-GB"> or <prompt xml:lang="el">

The language codes follow ISO 639 (a standardized nomenclature used to
classify all known languages). The first example, "en-GB", is the English
language followed by an optional sub-specifier, the ISO 3166 country code.
The second example, "el", refers to the Greek language without specifying
any country code. Another powerful feature of VXML is that it lets the
developer choose a voice according to common preferences. The <voice> tag is
responsible for this, through its attributes gender, category and age.
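
A prompt combining these speech-markup elements might be generated as in the sketch below. It is a minimal example under assumptions: the helper `styled_prompt` and its default values are hypothetical, only the gender attribute of <voice> is shown, and the audible effect of each element depends on the TTS engine of the chosen platform.

```python
from xml.sax.saxutils import escape, quoteattr


def styled_prompt(text, stressed, lang="en-GB", rate="slow", gender="female"):
    """Wrap text in a prompt carrying language, voice and prosody hints.

    `stressed` is the final word or phrase to be spoken with strong emphasis.
    """
    return (
        f"<prompt xml:lang={quoteattr(lang)}>"
        f"<voice gender={quoteattr(gender)}>"
        f"<p><s><prosody rate={quoteattr(rate)}>{escape(text)} "
        f'<emphasis level="strong">{escape(stressed)}</emphasis>'
        f"</prosody></s></p>"
        f"</voice></prompt>"
    )
```

For instance, `styled_prompt("Your train leaves at", "nine", lang="el")` returns a prompt marked as Greek, spoken slowly by a female voice, with the word "nine" stressed.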

Finally, VXML offers a technique called tapered prompting, driven by the
count attribute (for example <prompt count="1">), which lets a prompt vary
each time it is visited. Every form item has a hidden "prompt counter", and
event handlers keep an analogous event counter. The prompt counter is reset
to one each time the form is entered and is incremented every time the item
is visited again. In most cases, tapered prompting is used together with
elements such as <noinput> and <nomatch>. For example, we may specify
different responses when there is no input or when a user's response does
not match the grammar. This makes the dialog sound more natural and more
efficient.
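
Tapered prompting could be generated as in the following sketch, which numbers a list of prompts and a list of no-input messages with increasing count values. The helper `tapered_field`, the field name and the message texts are hypothetical, and the built-in boolean field type is assumed to be available on the target platform.

```python
from xml.sax.saxutils import escape, quoteattr


def tapered_field(name, prompts, noinput_messages):
    """Render a yes/no field whose prompts and noinput handlers are tapered.

    The first entry of each list is used on the first visit or event,
    the second entry on the second, and so on.
    """
    lines = [f'<field name={quoteattr(name)} type="boolean">']
    for count, text in enumerate(prompts, start=1):
        lines.append(f'  <prompt count="{count}">{escape(text)}</prompt>')
    for count, text in enumerate(noinput_messages, start=1):
        # <reprompt/> asks the interpreter to play the field prompt again.
        lines.append(
            f'  <noinput count="{count}">{escape(text)} <reprompt/></noinput>'
        )
    lines.append("</field>")
    return "\n".join(lines)
```

With `prompts=["Do you want to continue?", "Please say yes or no."]`, the caller hears the short question first and the more explicit instruction only if the field has to be visited again.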


1) Bob Edgar (2001), The VoiceXML Handbook, CMP Books, New

2) M. Oshry, R. J. Auburn, P. Baggia, L. M. Bodell, D. Burke, D. C. Burnett,
E. Candell, J. Carter, S. McGlashan, A. Lee, B. Porter, K. Rehor (2007),
Voice Extensible Markup Language (VoiceXML) 2.1

3) François Mairesse (2008), An Introduction to VoiceXML: ART on Dialogue
Models and Dialogue Systems, University of Sheffield, UK

4) ISO 639-1: Codes for the Representation of Names of Languages

5) Rick Beasley, K. M. Farley, J. O'Reilly, L. Squire, Voice Application
Development with VoiceXML, Sams Publishing