Online Handwritten Arabic OCR file 1 - ALTEC


Oct 19, 2013 (3 years and 9 months ago)


Pre-SWOT Report.

Online Handwritten Arabic OCR

(Online Handwritten Recognition: OHR)

Dr. Ashraf Al-Marakby
Eng. Hesham Osman
Eng. Randa Al-Anwar
Dr. Mohamed Waleed Fakhr
Dr. Mohsen Rashwan
Eng. Eman Mostafa


1- Introduction and challenges


The widespread use of pen-based handheld
devices such as PDAs, smartphones, and tablet
PCs increases the demand for high-performance
online handwriting recognition systems.


These systems recognize text while the user is
writing with an online writing device, capturing
the temporal or dynamic information of the
writing. This information includes the number,
duration, and order of the strokes (a stroke is the
writing from pen down to pen up).
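A minimal sketch of how this temporal information might be held in memory. The (x, y, t) sample layout and the names Stroke and describe are illustrative assumptions, not taken from any particular device API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# One pen sample: x, y in device units and t in seconds (assumed layout).
Sample = Tuple[float, float, float]


@dataclass
class Stroke:
    """The writing from one pen-down to the following pen-up."""
    samples: List[Sample]

    @property
    def duration(self) -> float:
        # Time elapsed between the first and last sample of the stroke.
        return self.samples[-1][2] - self.samples[0][2]


def describe(strokes: List[Stroke]) -> Dict[str, object]:
    """Collect the number, durations, and writing order of the strokes."""
    # Writing order = stroke indices sorted by pen-down time.
    order = sorted(range(len(strokes)), key=lambda i: strokes[i].samples[0][2])
    return {
        "number": len(strokes),
        "order": order,
        "durations": [strokes[i].duration for i in order],
    }
```

An offline (scanned) image keeps only the final ink; the order and duration fields above are exactly what is lost there.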

Main Challenges in Arabic OHR


Unconstrained writing problem


Dotting problem


Delayed strokes problem: associating letters
with their diacritical marks.


Overlapping problem
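To make the delayed strokes problem concrete, here is a rough heuristic sketch. It assumes that small strokes are dots or diacritical marks and attaches each one to the body stroke whose horizontal centre is nearest. The size threshold and the function names are hypothetical; a real system would need something more robust, particularly for overlapping writing.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def x_center(stroke: Stroke) -> float:
    xs = [p[0] for p in stroke]
    return (min(xs) + max(xs)) / 2


def extent(stroke: Stroke) -> float:
    # Larger side of the stroke's bounding box.
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    return max(max(xs) - min(xs), max(ys) - min(ys))


def attach_delayed_strokes(strokes: List[Stroke], small: float = 0.2) -> Dict[int, List[int]]:
    """Map each body-stroke index to the indices of the marks attached to it.

    `small` is an assumed threshold: strokes whose extent is at most this
    fraction of the largest stroke are treated as delayed marks.
    """
    if not strokes:
        return {}
    largest = max(extent(s) for s in strokes)
    if largest == 0:
        return {}
    bodies = [i for i, s in enumerate(strokes) if extent(s) > small * largest]
    groups: Dict[int, List[int]] = {b: [] for b in bodies}
    for i, s in enumerate(strokes):
        if i in groups:
            continue
        # Attach the mark to the horizontally nearest body stroke.
        nearest = min(bodies, key=lambda b: abs(x_center(strokes[b]) - x_center(s)))
        groups[nearest].append(i)
    return groups
```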


2- Applications


Education domain: Huge number of attractive
applications for students and teachers, over
tablet PCs and other devices (smart pens,
smart boards, etc.)


Online mapping of notes to text in online data
collection (questionnaires, forms, etc.)


Huge number of Mobile applications for
business people and others.

3- State of the art in products (Latin script)


OHR is a highly mature technology for Latin
script with excellent performance.


Microsoft, RitePen, VisionObjects, and QuickScript
are a few very successful OHR solution
providers, supporting more than 20 languages.


Most require no training and allow for a
user-defined dictionary and user adaptation.


Performance is claimed to be excellent for
unconstrained, continuous writing.

4- State of the art in products (Arabic script)


Sakhr and ImagiNet both offer OHR products
for most MS-based devices (HTC, Pocket PC,
PDA).


Also, VisionObjects and QuickScript have
Arabic OHR support and claim good performance.


A comparison between these products on a
standard benchmark is needed to find out the
strengths and weaknesses of each.

5- State of the art in research for Arabic OHR


Focus mainly on producing true unconstrained
continuous cursive writing.


Focus on developing algorithms that can run
in real time and on limited resources.


Significant recent efforts: most recent
research employs Recurrent Neural Networks,
HMMs, and fusion of other pattern recognition
techniques. Some work also makes use of the offline
image as an extra source of information.
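The offline image mentioned above can be derived from the online data by drawing the pen trajectory onto a bitmap. The following is a minimal sketch under stated assumptions: strokes are lists of (x, y) points, the grid size is arbitrary, and the name render_offline is illustrative. It only sets pixels along each segment and ignores pen thickness.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def render_offline(strokes: List[List[Point]], width: int = 32, height: int = 32) -> List[List[int]]:
    """Rasterize online strokes into a binary image (rows of 0/1)."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    x0, y0 = min(xs), min(ys)
    # Uniform scale so the ink fits the grid while keeping its aspect ratio.
    span = max(max(xs) - x0, max(ys) - y0) or 1.0
    scale = (min(width, height) - 1) / span
    image = [[0] * width for _ in range(height)]
    for stroke in strokes:
        pts = [((x - x0) * scale, (y - y0) * scale) for x, y in stroke]
        # A single-point stroke (a dot) becomes a zero-length segment.
        segments = list(zip(pts, pts[1:])) or [(pts[0], pts[0])]
        for (xa, ya), (xb, yb) in segments:
            # Sample densely enough that consecutive pixels are adjacent.
            steps = int(max(abs(xb - xa), abs(yb - ya))) + 1
            for k in range(steps + 1):
                t = k / steps
                col = round(xa + t * (xb - xa))
                row = round(ya + t * (yb - ya))
                image[row][col] = 1
    return image
```

Row 0 corresponds to the smallest y value; whether that is the top or the bottom of the writing depends on the device's coordinate convention.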

Competition ICDAR 2009


The database consists of 23,251 Arabic words
handwritten by more than 130 different writers
(ADAB database: Tunisian city names).


For testing, 2,400 words are used, written by
24 writers different from the ones in training.


Best performance obtained by the VisionObjects
team: 99%. The system uses neural networks with
other PR techniques.


Second best is MDLSTM by Alex Graves: 96%,
using a hierarchy of multidimensional recurrent
neural networks.


6- Required Modules


Pre-processing tools: delayed strokes, smoothing,
resampling, etc.


PAW (piece of Arabic word) or letter segmentation and extraction tool


Language models


Feature extraction tools.


Statistical training tools: HTK, SRI, Matlab, and
many neural network tools.


Error analysis tools: Need to be implemented.
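Two of the pre-processing steps listed above, resampling and smoothing, can be sketched as follows. This is a minimal illustration, assuming a stroke is a list of (x, y) points; the function names, the equal-arc-length resampling, and the 3-point moving average are choices made for this sketch, not the tools the report has in mind.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample(stroke: List[Point], spacing: float) -> List[Point]:
    """Return points spaced `spacing` apart along the pen path."""
    out = [stroke[0]]
    carried = 0.0  # path length travelled since the last emitted point
    for (xa, ya), (xb, yb) in zip(stroke, stroke[1:]):
        seg = math.hypot(xb - xa, yb - ya)
        if seg == 0:
            continue
        pos = spacing - carried  # where the next point falls on this segment
        while pos <= seg:
            t = pos / seg
            out.append((xa + t * (xb - xa), ya + t * (yb - ya)))
            pos += spacing
        carried = seg - (pos - spacing)
    return out


def smooth(stroke: List[Point]) -> List[Point]:
    """3-point moving average; the two end points are kept unchanged."""
    if len(stroke) < 3:
        return list(stroke)
    inner = [
        ((xa + xb + xc) / 3, (ya + yb + yc) / 3)
        for (xa, ya), (xb, yb), (xc, yc) in zip(stroke, stroke[1:], stroke[2:])
    ]
    return [stroke[0]] + inner + [stroke[-1]]
```

Resampling by arc length removes the dependence on writing speed, which is useful when the same features must work across devices with different sampling rates.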

7- Required Resources


Major question: How many PAWs are there, and how
many of them are most frequently used? (An
estimate of 500 is given.)


Word-annotated corpus (estimated 2000 pages by
2000 writers).


Character/PAW-annotated corpus for initial
models, covering 10 instances of each PAW.


Dictionaries with PAW transcriptions
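The PAW inventory question above can be explored on any word list by splitting each word at the letters that do not join to the following letter and counting the resulting pieces. The sketch below is a simplification under stated assumptions: the non-joining set covers common Arabic letters only, short-vowel marks are skipped, and the names split_paws and paw_frequencies are illustrative.

```python
from collections import Counter
from typing import Iterable, List

# Letters that do not connect to the letter that follows them (assumed set,
# covering the common cases only).
NON_JOINING = set("اأإآدذرزوؤةى")
HAMZA = "ء"  # stand-alone hamza joins on neither side


def split_paws(word: str) -> List[str]:
    """Split a word into its pieces of Arabic word (PAWs)."""
    paws: List[str] = []
    current = ""
    for ch in word:
        if "\u064b" <= ch <= "\u0652":
            continue  # skip short-vowel and related marks
        if ch == HAMZA:
            # Hamza closes the running PAW and forms one on its own.
            if current:
                paws.append(current)
            paws.append(ch)
            current = ""
            continue
        current += ch
        if ch in NON_JOINING:
            paws.append(current)
            current = ""
    if current:
        paws.append(current)
    return paws


def paw_frequencies(words: Iterable[str]) -> Counter:
    """Count how often each PAW occurs in a word list."""
    return Counter(p for w in words for p in split_paws(w))
```

Calling paw_frequencies(...).most_common(500) on a large corpus would give one way to check the estimate of 500 frequently used PAWs; the same split could feed the dictionaries with PAW transcriptions.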


8- Available Resources and Gaps


ADAB database is the only large one available
(limited domain, limited number of writers).


Annotation and segmentation tools required.


More data required.


9- LR proposed by ALTEC


For the training data, we suggest 10,000 writers, one
page per person.


In the first phase, we will start with 2000 writers, each
writing two pages (average of 50 words per page),
which gives about 200,000 words. We could retain
150,000 words for training and 50,000 for
benchmarking.


The vocabulary issue must be addressed. Also, we need
to ensure the fair coverage of the PAWs.


Cairo University has annotation tools to assist manual
segmentation of the online data, and Dr. Sherif Abdou
will kindly make them available to ALTEC.
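The first-phase figures above follow from simple arithmetic: 2000 writers x 2 pages x 50 words = 200,000 words. The sketch below checks that and shows one possible way to make the 150,000 / 50,000 split. Keeping each writer's pages entirely in one set is an assumption of this sketch (it mirrors the ICDAR 2009 protocol, where test writers differ from training writers); the proposal itself does not specify how the split is made, and the function names are illustrative.

```python
import random
from typing import Iterable, List, Tuple


def phase_one_words(writers: int = 2000, pages_per_writer: int = 2, words_per_page: int = 50) -> int:
    """Expected corpus size for the first collection phase."""
    return writers * pages_per_writer * words_per_page


def split_writers(writer_ids: Iterable[int], benchmark_fraction: float = 0.25, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split writers (not words) into training and benchmarking sets."""
    ids = sorted(writer_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_bench = round(len(ids) * benchmark_fraction)
    return ids[n_bench:], ids[:n_bench]
```

With 2000 writers and a 25% benchmark fraction this gives 1500 training writers and 500 benchmark writers, that is, about 150,000 and 50,000 words at 100 words per writer.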


10- Preliminary SWOT analysis


Strengths:

1. The expertise in DSP, pattern recognition, image processing, NLP, and stochastic
methods.
2. Potential to have huge amounts of annotated data.



Weaknesses:

1. No comprehensive benchmarking available for Arabic OHR.
2. No standard training database available to the research community for Arabic OHR.


Opportunities:

1. Large market for such a technology: over 300 million native speakers, plus
numerous other interested parties, across a wide range of platforms (tablet PCs,
smartphones, etc.)







Threats:

1. Other R&D groups all over the world (esp. in the US) are working hard and racing for
more reliable products and for more applications.
2. Microsoft could make its Arabic OHR product open source when it is done.


11- Survey


Specify the application that OHR will be used for.


What is the data used/intended to train the system?


What is the benchmark to test your system on?


Would you be interested in contributing to the data collection? In
what capacity?


Would you be interested in buying Arabic OHR annotated data?


Would you be interested in participating in a competition?


How many persons are working in this area in your team? What are
their qualifications?


What are the platforms supported/targeted in your application?


What is the market share anticipated in your application?


Would your application support any other languages? Explain.


List of Survey Targets


Sakhr


ImagiNet


RDI


Orange-Cairo


IBM-Cairo


Cairo University


Ain Shams University


Arab academy (AAST)


AUC


GUC


Nile University


Azhar University


Helwan University


Assiut University


Other research Centers from outside Egypt


Other companies that are users of the technology

12- Key Figures in this Field


Dr. Alex Graves (TU Munich, Germany).


Dr. Stephan Knerr (CEO, VisionObjects)


Dr. Hazem AbdelAzeem (Egypt)