Speech (TTS) project
30. August 2013
Kristinn Halldór Einarsson
project manager and chairman for
Icelandic organization of the visually
Life quality taken for granted.
Visually impaired people and Text to Speech systems.
BIOVI Text to Speech project.
Listening examples and tools presentation.
Quality of life
How would it affect us if we would lose our ability to read?
This is something that will most likely happen to some of us in our
What can be done to limit the huge negative impact on the life quality of
those who are going to lose their ability to read in a conventional manner?
Can it be accepted that an increasing part of our population could lose their
ability to enjoy reading in an independent manner?
Who are they?
5% of people 70 years and older are affected by late-stage age-related
macular degeneration (AMD). No effective treatments are available today.
AMD affects mainly the central vision (reading vision).
There are around 800 visually impaired individuals in Iceland as a
result of late-stage AMD. By 2030 the number is expected to double, to
1600. The total population of visually impaired people in Iceland is 1600.
The organization of people with dyslexia in Iceland claims that up to 25%
of adults are dealing with dyslexia.
... a bit of history
The first known accounts of efforts to build a talking machine.
1968 The first computer speech synthesizer is built.
1988 The Universities of Iceland and Stockholm start cooperation.
1990 The Swedish company Infovox releases Sturla, the first Icelandic TTS voice.
2000 Snorri, an updated and improved version of Sturla is released.
2006 Ragga, a new Icelandic TTS voice is released by Nuance.
2012 Dóra and Karl, new male and female voices, are released by Ivona.
Text To Speech (TTS) technology?
TTS systems are linguistic tools that transform text in a digital format to speech.
Modern TTS systems need to be able to operate on different operating systems
and tools such as: computers, tablets, smart phones, AMDs, mp3 players and
other computing tools.
TTS voices are built for each language and need to be available in different
sizes and qualities.
Quality of TTS voices is measured with respect to listening qualities & closeness to natural
ICT, accessibility & quality of life
ICT (Information and Communication Technology) can increase the independence and
life quality of visually impaired people tremendously, as it opens up whole new
educational, leisure and employment possibilities.
A key element is a well-designed TTS system in the mother tongue of those who are
to benefit. The mother tongue is an essential part of every nation's identity, and
a TTS system is beneficial not only to visually impaired people but also to the much
larger learning disability population.
TTS voices are marketing commodities
Producers of TTS voices expect a return on investment.
Languages spoken by many people represent a market with a big demand that can
generate big supply and attractive business opportunities.
Languages spoken by few people represent a market with little demand and little
or no supply, which offers little or no business opportunities.
What is the situation with languages spoken by few people, in terms of having
modern ICT linguistic tools that are becoming more and more important in modern
If you talk to a man in a language he
understands, that goes to his head.
If you talk to him in his language, that goes to his
The project was based on two pillars:
Improved life quality
Cultivation of the Icelandic language
Multiple usage options
Very good listening qualities.
License fee arrangement.
Open to further development.
Some control over future development.
Sustainable business model.
Selecting TTS producer
After exploring and taking stock of different TTS producers, the Polish
company Ivona was selected to build the new Icelandic TTS voices.
The Royal National Institute of Blind People (RNIB) in the UK has enjoyed very
good cooperation with Ivona. Ivona was finishing building Welsh TTS voices.
The Ivona voices have received many awards for their accuracy and listening
quality.
a new age for Text
BrightVoice technology guarantees a smooth natural speech
New language models
provide intelligent text interpretation
10 times faster
Crystal clear sound
due to noise and distortions reduction
Rapid Voice Development
fast building of IVONA Voices
RVD technology (Rapid Voice Development) makes the process of building
IVONA Voices fast and relatively cheap.
It uses a set of tools that model linguistic issues such as subvocalization,
It also makes it possible to efficiently, quickly and accurately determine the speech signal
in the original speech recordings.
The Ivona technology
Development in number of Ivona voices
Operating systems and the Ivona voices
The Ivona voices are capable of operating on
iOS (Apple iPhone & iPad)
The project in steps
Ivona visited, agreement drawn up and signed.
10.000 sentences selected from the Icelandic corpus in Leipzig,
voice talents selected, recording of sentences. Voices named Dóra & Karl.
Ca. 900 sentences released. Evaluation and feedback by a team
of linguists and users. Beta 1.
Evaluation and feedback on Beta 2 is concluded.
Beta version 3 is released and distribution starts.
10.000 additional pronunciation examples added to the corpus.
Final version of Dóra and Karl released.
Cost and plans
Total cost was 500.000 Euros (85 million IKR).
The project was close to fully financed when agreement was signed.
Cost and delivery times were according to plans and turned out to be
Source                                                  Amount        Share
Blindrafelagid (inheritance from Dora Stefánsdottir)    25,0 m.kr.    29%
Lions, national collection The Red Feather              19,3 m.kr.    23%
Foundation for disability related projects              15,0 m.kr.    17%
Ministries of welfare and education                     11,3 m.kr.    13%
The disability organization of Iceland                  10,0 m.kr.    12%
Blindravinafélagið (Friends of the blind)                5,0 m.kr.     6%
Total                                                   85,6 m.kr.   100%
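The funding figures listed above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original presentation; the variable names are illustrative, and the amounts are copied from the list above in millions of Icelandic krónur (m.kr.). Note that plain rounding gives 18% for the 15,0 m.kr. contribution (15,0 / 85,6 is about 17,5%), whereas the slide lists 17%, presumably so that the shares add up to exactly 100%.

```python
# Funding contributions in millions of ISK (m.kr.), copied from the list above.
# The names of the variables are illustrative assumptions, not from the slides.
contributions = [
    ("Blindrafelagid (inheritance)", 25.0),
    ("Lions, The Red Feather collection", 19.3),
    ("Foundation for disability related projects", 15.0),
    ("Ministries of welfare and education", 11.3),
    ("The disability organization of Iceland", 10.0),
    ("Blindravinafelagid (Friends of the blind)", 5.0),
]

# Total funding raised.
total = sum(amount for _, amount in contributions)

# Share of each contribution as a percentage of the total (unrounded).
shares = [(name, amount / total * 100) for name, amount in contributions]

for name, share in shares:
    print(f"{name:45s} {share:5.1f}%")
print(f"{'Total':45s} {total:5.1f} m.kr.")
```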
Among valuable advisers, contributors and co
Eiríkur Rögnvaldsson, Icelandic professor at the University of Iceland and his
Sigrún Helgadóttir at Árnastofnun.
The people behind the Icelandic corpus at the University of Leipzig.
Mrs Vigdís Finnbogadóttir, former president of Iceland, who acted as the
Sustainable business model
The Icelandic Ivona voices, along with Ireader, are given free of charge to
all Icelanders who are visually impaired or have a reading
impairment. Others can buy the Ireader and the voices for around 50 Euros.
BIOVI handles all sales of the Icelandic Ivona voices and different tools like
the text reader, recording studio and the webreader. Customers are
individuals, schools, institutions and businesses. Additional voices in other
languages can easily be bought from Ivona and added to one’s voice portfolio.
Profits from the sales of the Icelandic voices are meant to finance further
development and extra additions that might be needed.
South or north pronunciation?
Emphasis in pronunciation:
Difficult to deal with compound words as the
rules for stress placement in Icelandic compounds are unclear.
Difficult because of so many declension forms.
Read them or interpret them?
Solved with an additional dictionary
SAPI 5 voices for Windows and reader and mini reader.
Webreader that reads from the cloud.
Android voices for smart phones and tablets.
Ivona SDK (Software development kit) and voices for
telephone answering, AMD and other computing tools.
Ivona, an Amazon company
On the 24th of January 2013 Amazon announced that it had acquired the
speech technology company IVONA.
This acquisition strengthens and protects the position of Ivona in a market
where there are some much bigger players than Ivona.
Amazon acquiring Ivona is in a way confirmation that others have seen the
same thing as we did when it comes to the potential of Ivona TTS products.
Listening examples and tools presentation