BIOVI Text To Speech (TTS) project


Oct 19, 2013





Nordisk sprogmøde (Nordic Language Meeting)

30 August 2013

Kristinn Halldór Einarsson

Project manager and chairman of the Icelandic Organization of the Visually Impaired (BIOVI)


Life quality taken for granted.

Visually impaired people and Text to Speech systems.

BIOVI Text to Speech project.

Listening examples and tools presentation.

Quality of life

How would it affect us if we lost our ability to read?

This is something that will most likely happen to some of us in our retirement years.

What can be done to limit the huge negative impact on the quality of life of those who are going to lose their ability to read in a conventional manner?

Can it be accepted that an increasing part of our population could lose the ability to enjoy reading independently?

Who are they?

5% of people aged 70 and older are affected by late-stage age-related macular degeneration (AMD). No effective treatments are available today.

AMD mainly affects central vision (reading vision).

Around 800 individuals in Iceland are visually impaired as a result of late-stage AMD. By 2030 the number is expected to double, to 1600. The total population of visually impaired people in Iceland is 1600.

The organization of people with dyslexia in Iceland claims that up to 25% of adults are dealing with dyslexia.

... a bit of history

The first known tales of effort to build a talking machine.

1968 The first computer speech synthesizer is built.

1988 The Universities of Iceland and Stockholm start cooperation.

1990 The Swedish company Infovox releases Sturla, the first Icelandic TTS voice.

2000 Snorri, an updated and improved version of Sturla, is released.

2006 Ragga, a new Icelandic TTS voice, is released by Nuance.

2012 Dóra and Karl, new male and female voices, are released by Ivona.

Text To Speech (TTS) technology?

TTS systems are linguistic tools that transform text in a digital format into speech.

Modern TTS systems need to be able to operate on different operating systems and devices, such as computers, tablets, smartphones, AMDs, mp3 players and other computing tools.

TTS voices are built for each language and need to be available in different sizes and qualities.

The quality of TTS voices is measured by their listening quality and closeness to natural speech.


ICT, accessibility & quality of life

ICT (information and communication technology) can tremendously increase the independence and quality of life of visually impaired people, as it opens up whole new educational, leisure and employment possibilities.

A key element is a well-designed TTS system in the mother tongue of those who are to benefit. The mother tongue is an essential part of every nation's identity and legal rights.

A TTS system benefits not only visually impaired people but also the much larger population with learning disabilities.

TTS voices are marketing commodities

Producers of TTS voices expect a return on their investment.

Languages spoken by many people represent a market with big demand that can generate big supply and attractive business opportunities.

Languages spoken by few people represent a market with little demand and little or no supply, offering little or no business opportunity.

What is the situation for languages spoken by few people when it comes to having modern ICT linguistic tools, which are becoming more and more important in modern


Mother tongue

If you talk to a man in a language he understands, that goes to his head.

If you talk to him in his language, that goes to his heart.

Nelson Mandela

Speech project

The project was based on two pillars:

Improved life quality


Cultivation of the Icelandic language

main definitions

Multiple usage options

Very good listening qualities.

License fee arrangement.

Open to further development.

Some control over future development.

Sustainable business model.

Selecting TTS producer

After exploring and taking stock of different TTS producers, the Polish company Ivona was selected to build the new Icelandic TTS voices.

The Royal National Institute of Blind People (RNIB) in the UK has enjoyed very good cooperation with Ivona. Ivona was finishing building Welsh TTS voices.

The Ivona voices have received many awards for their accuracy and listening quality.







a new age for Text

BrightVoice technology guarantees smooth, natural speech

New language models provide intelligent text interpretation

Up to 10 times faster speech generation

Crystal clear sound due to noise and distortion reduction


Rapid Voice Development

fast building of IVONA Voices

RVD technology (Rapid Voice Development) makes the process of building IVONA Voices fast and relatively cheap.

It uses a set of tools that model linguistic issues such as subvocalization, accentuation and intonation.

It also makes it possible to determine the speech signal in the original speech recordings efficiently, quickly and accurately.

The Ivona technology

Development in number of Ivona voices


Operating systems and the Ivona voices

The Ivona voices are capable of operating on

Windows XP/Vista/7/8



iOS (Apple iPhone & iPad)


Windows mobile

The project in steps

December 2010

March 2011: Ivona visited, agreement drawn up and signed.

Summer 2011: 10,000 sentences selected from the Icelandic corpus in Leipzig, voice talents selected, sentences recorded. Voices named Dóra and Karl.

February 2012: Ca. 900 sentences released. Evaluation and feedback by a team of linguists and users. Beta 1.

April 2012: Evaluation and feedback on Beta 2 is concluded.

June 2012: Beta version 3 is released and distribution starts.

October 2012: 10,000 additional pronunciation examples added to the corpus.

June 2013: Final version of Dóra and Karl released.

Cost and plans

The total cost was 500,000 euros (85 million IKR).

The project was close to fully financed when the agreement was signed.

Cost and delivery times were according to plan and turned out to be


Financial contributors

Contributor                                           Million IKR  Share
Blindrafelagid (inheritance from Dora Stefánsdottir)         25.0    29%
Lions, national collection The Red Feather                   19.3    23%
Foundation for disability-related projects                   15.0    17%
Ministries of welfare and education                          11.3    13%
The disability organization of Iceland                       10.0    12%
Blindravinafélagið (Friends of the blind)                     5.0     6%
Total                                                        85.6   100%
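
The shares in the contributor list can be rechecked with a few lines of Python. This is only a minimal sketch: the amounts are copied from the slide and assumed to be in millions of IKR (consistent with the stated total of about 85 million IKR), and the names `contributions` and `shares` are illustrative. Note that 15.0 out of 85.6 is 17.5%, so rounding each row to a whole percent independently gives 18% for that row and a column sum of 101%, which may be why the slide lists 17%.

```python
# Amounts as listed on the slide, assumed to be in millions of IKR.
contributions = {
    "Blindrafelagid (inheritance from Dora Stefánsdottir)": 25.0,
    "Lions, national collection The Red Feather": 19.3,
    "Foundation for disability-related projects": 15.0,
    "Ministries of welfare and education": 11.3,
    "The disability organization of Iceland": 10.0,
    "Blindravinafélagið (Friends of the blind)": 5.0,
}


def shares(amounts):
    """Return each contributor's share of the total, in percent."""
    total = sum(amounts.values())
    return {name: 100 * value / total for name, value in amounts.items()}


total = round(sum(contributions.values()), 1)
pct = shares(contributions)

# Print one row per contributor with the share to one decimal place.
for name, value in contributions.items():
    print(f"{name:<54}{value:6.1f}{pct[name]:7.1f}%")
print(f"{'Total':<54}{total:6.1f}{sum(pct.values()):7.1f}%")

# Whole-percent rounding, row by row, does not have to add up to 100.
rounded_sum = sum(round(p) for p in pct.values())
```

Printing the shares with one decimal place avoids the rounding mismatch; a whole-percent table needs one row adjusted by hand to sum to 100%.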

Valuable contributors

Among the valuable advisers, contributors and co-workers were:

Eiríkur Rögnvaldsson, professor of Icelandic at the University of Iceland, and his


Sigrún Helgadóttir at Árnastofnun.

The people behind the Icelandic corpus at the University of Leipzig.

Mrs Vigdís Finnbogadóttir, former president of Iceland, who acted as the

project’s patron.

Sustainable business model

The Icelandic Ivona voices, along with Ireader, are given free of charge to all Icelanders who are visually impaired or have a reading impairment. Others can buy the Ireader and the voices for around 50 euros.

BIOVI handles all sales of the Icelandic Ivona voices and of tools such as the text reader, the recording studio and the webreader. Customers are individuals, schools, institutions and businesses. Additional voices in other languages can easily be bought from Ivona and added to one's voice portfolio.

Profits from the sales of the Icelandic voices are meant to finance further development and any additions that might be needed.

Linguistic challenges


South or north pronunciation?

Emphasis in pronunciation: Compound words are difficult to handle because the rules for stress placement in Icelandic compounds are unclear.

Difficult because of the many declension forms.

Read them or interpret them?

Foreign words: Solved with an additional dictionary.
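
The idea of handling foreign words with an additional dictionary can be illustrated roughly as follows. This is a hypothetical sketch, not the mechanism actually used in the Ivona voices: it assumes the dictionary maps a foreign word to a respelling that the Icelandic voice would pronounce acceptably, and both the entries in `FOREIGN_WORDS` and the function name `apply_exception_dictionary` are made up for illustration.

```python
import re

# Hypothetical exception entries: foreign word -> respelling for the voice.
# The respellings are illustrative only, not taken from the project.
FOREIGN_WORDS = {
    "google": "gúgúl",
    "facebook": "feisbúkk",
}


def apply_exception_dictionary(text, exceptions=FOREIGN_WORDS):
    """Replace words found in the exception dictionary before synthesis.

    Words that are not listed are passed through unchanged, so the
    regular pronunciation rules of the voice still apply to them.
    """
    def substitute(match):
        word = match.group(0)
        # Lookup is case-insensitive; unknown words are returned as-is.
        return exceptions.get(word.lower(), word)

    return re.sub(r"\w+", substitute, text)


prepared = apply_exception_dictionary("Ég nota Google daglega")
```

A lookup of this kind keeps foreign words out of the general pronunciation rules: new words are handled by adding entries rather than by changing the voice itself.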

Main tools

SAPI 5 voices for Windows and reader and mini reader.

Webreader that reads from the cloud.

Android voices for smart phones and tablets.

Recording studio.

Ivona SDK (software development kit) and voices for telephone answering, AMD and other computing tools.

Ivona, an Amazon company

On 24 January 2013 Amazon announced that it had acquired the leading text-to-speech technology company IVONA.

This acquisition strengthens and protects Ivona's position in a market where there are some much bigger players than Ivona.

Amazon's acquisition of Ivona is in a way confirmation that others have seen the same potential in Ivona's TTS products as we did.

Listening examples and tools presentation






Thank you