Towards Automation of Capturing Outlines of Arabic Fonts

cavalcadehorehoundMechanics

Nov 5, 2013 (3 years and 1 month ago)

51 views

Page
1

of
12


Towards Automation of Capturing

Outlines of Arabic Fonts

M. Sarfraz

ID 1234567

Department of Information & Computer Science

King Fahd University of Petroleum & Minerals

KFUPM # 1510, Dhahran 31261, Saudi Aabia.

Email: sarfraz@kfupm.edu.sa


Abstract

This

paper presents a method for capturing outlines of fonts, particularly suitable for non
-
Roman languages like
Arabic. In most of desktop publishing systems, the shapes of the characters are stored in the computer memory in
terms of their outlines, and outli
nes are expressed as cubic Bezier curves. The outline fonts are produce from a
gray
-
level image obtained by scanning the original characters drawn on paper. Significant points are obtained from
the gray
-
level image. However user has to specify the many sig
nificant points manually. In the above process of font
design, it is evident that human interaction makes it very slow. If the human interaction is minimized, it will greatly
improve the speed. This paper proposed an approach to minimize the human interact
ion in obtaining the outline of
original character.


Keywords:

Font, Significant points, Contour, Gray
-
level image, Outline, Spline


1

Introduction


There are two fundamental techniques for storing fonts in computer [19
-
20]. One is bitmap and
other is outlin
e. Outline representation of fonts has many advantages over bitmap representation
[1, 7
-
10, 19
-
24]. Multiple sizes may be derived from a single stored representation by suitable
scaling. Different type faces can also be derived e.g. italics may be derived

by shearing the
original outline. Outlines may be arbitrarily translated, rotated, scaled, and clipped. Therefore
most of the cotemporary desktop publishing systems are based on outline fonts.


Arabic characters are different from other characters such as

English, Latin etc., in that they are
written cursively from right to left. Each character has two to four different forms, depending on
its position in the word, see Figure 1. The Arabic script is very rich in different font formats and
its cursively nat
ure requires much more attention. This paper proposes an approach to minimize
the human interaction in obtaining the outline of original character.

In the traditional approaches [1], initially, a hand drawn character is scanned from paper to
obtain a gra
y level image. From this gray
-
level image, contour of the character is obtained. Then
corner points of the character are determined from contour. These corner points can be obtained
by some interactive method or by some automated process. Optimal curve fit
ting is done by
Page
2

of
12


segmenting the contour outline at the corner points. Normally, the curve fitting methods are
based on conics or Bezier cubics.




















Figure
1
:
Different shapes of Arabic Characters depending on their po
sition with the word.


The methodology, in this paper, mainly differs to the traditional approaches in various ways.
Since, some times corners are not detected precisely and some times only corner points are not
sufficient to fit the curve which represent
the original character. In addition to corner points,
some more points are needed to achieve a best fit. This paper, in addition to corner points,
identifies
significant points

too. These
significant points

play important role in the overall shape
of the
final character. The outline capturing technique is based upon a rational cubic which has
attracting features to control the outline segments locally as well as globally at every

Page
3

of
12


characteristic point. There is freedom, in the form of shape parameters, in t
he description of
rational cubics. These parameters are optimized to produce the optimal contour outline.


The organization of the paper is made as follows. The Section 2 explains briefly the approach
towards the construction of the body of the paper. The
discussion of scanning the image and
filtering the noise in it is made in Section 3. This Section is also meant for the boundary
detection algorithm to the scanned image. The issue of detecting the characteristic points (corner
points and significant point
s) is discussed in Section 4. The details of boundary drawing method
are given in Section 5. The Section 6 discusses the issue of best possible boundary fit if the
initial fit is not optimal to the outline of the original scanned image. The Section 7 summa
rizes
the whole discussion in the form of an algorithm and the Section 8 concludes the paper.


2

Font Design Approach


Our proposed method of font design consists of following steps.


(i)

The hand
-
drawn image is scanned and filtered to remove any noise that mig
ht be
present in the scanned image.

(ii)

Boundary detection algorithm is applied to the scanned image to estimate the contour.

(iii)

Significant points (including corners points) are determined from the contour of
character

(iv)

Outline of the character is generated by
fitting spline to the significant points.

(v)

If the boundary thus obtained is not as accurate as desired, the parameters in the
description boundary drawing method are adjusted so that the drawn outline is
optimal to the scanned outline.


In the following sec
tions, we explain the above steps.



















Figure 2:
Gray
-
level image of a character.

Page
4

of
12



3

Image Outline Extraction


The extraction of the contour points, from the gray
-
level image, is the first step of the whole
process of Font Outline Generation.
During the digitization process (converting the gray
-
level
image to a bilevel image), some noise may arise. This adds irregularities to the outer boundary of
the image and, hence, may have some undesired effects. The algorithm of Avrahami and Pratt [6]
is
a reasonable solution to convert the gray
-
level image to a bilevel image. Although, this
algorithm minimizes the error during the conversion process but it requires some modifications.
The outline in Figure 2 is the demonstration of the original image in F
igure 1 after the
modification of Avrahami and Pratt algorithm [6].
























Figure 3:
The contour of the image in Figure 2.


4

Detection of Characteristic Points


After finding out boundary points, next step in preprocessing is detection of
c
haracteristic
points
. We can categorize them into two classifications:
corner points

and the
significant points
.
The corner points are those points which partition the outline into various segments. There has
been a good amount of work done for the detecti
on of corner points given the boundary of an
image. A number of approaches have been proposed by researchers [2
-
5]. These include
Curvature analysis with numerical techniques and some signal processing methods. In [3] some
of the possible ways to detect co
rners in an image are presented. The curvature can be analyzed
using some numerical approaches. The algorithm, in [1], has used the numerical approach. But it

Page
5

of
12


presents a problem of scaling. The detection of corner actually depends on the actual resolution
of the image and processing width to calculate the curvature.


For the detection of corners, in this paper, we adopted the technique used by Quddus [5].
Quddus’s algorithm sometimes gives duplicate corners. The algorithm has been slightly modified
so tha
t the duplication is removed.

























Figure 4:
The contour of the image in Figure 1, identifying the corner points.


Since corner points are not always enough to produce the outline of the character, see Figure 7.
Therefore to find signi
ficant points, in addition to corner points, we identify some more points,
called significant points. In the first step, these points are searched on the basis of computation of
high curvatures in each segment of contour outline. If these points do not ha
ppen to provide
optimal outline, the second step is adopted to get further significant points. We calculate the
length of segments between each two consecutive character points so far calculated. If some
calculated length is greater than the defined thresh
old segment length, we break the segment into
two or more segments and take intermediate points as significant points. This process is repeated
until each segment length is less than or equal to threshold segment length.


5

Outline Generation


Pratt [7] has
shown that attractive fonts can also be produced using conic splines. In addition to
showing how conic splines could be used to approximate some of the properties of cubic curves,

Page
6

of
12


Pratt [7] has developed a mechanism for generating conic splines using an al
l
-
integer version of
Pitteway’s algorithm [8]. Hussain [9] has developed an elegant algorithm for converting outlines
of fonts described by cubic splines to an equivalent conic form. The approach is an improved
version of Hussain’s algorithm [10]. This met
hod uses an integral approach that yields a direct
conic conversion for Bezier arcs, instead of through a number of specified curve points. Further
more, the algorithm always returns the best conic
-
fit (in a least
-
squares residue sense) for a font
outline

described in terms of Bezier cubic curve segments.


We have used rational cubic spline to generate the outline of character from significant points.
This spline formulation is interpolatory and has ideal shape and geometric properties similar to
those in
[11
-
18]. We conclude that the rational cubic spline provides a well controlled and
relatively smoother shape of the character as compare to existing outline schemes of curves.


Mathematically, splines are piecewise polynomial parametric curves, generated b
y varying a
parameter over a specified range. If
x
(
t
) and
y
(
t
) are two functions that supply points (
x
(
t
),
y
(
t
))
along a curve as
t

is varied then mathematically we can write





















.
.......
,
.......
2
2
2
1
0
2
2
2
1
0
n
n
t
b
t
b
t
b
a
t
y
t
a
t
a
t
a
a
t
x





(1)


where
i
a
’s and
i
b
’s are coefficients and
n

is the order of polynomial.



To approximate a particular curve, it is broken into segments that meet at their end points. The
meeting points are called
joints.

A pair of formulas defines each segment in the spline
. The
coefficients are not chosen arbitrarily, but rather to achieve smoothness at the joints. In general,
splines of order
n

have continuity in the (
n
-
1) derivative at each joint. Therefore conic spline has
continuous first derivative (slope), and cubic h
as continuous first and second derivatives at the
joints.


For our purposes, we have utilized splines which have cubic polynomials in the numerator and
denominator. Hence, these are economical for computation point of view. Moreover, there are
shape parame
ters in its description, which are used to control the font outline in a desired way.
Some mathematical details of these splines follow in the following section.


5.1

The Rational Cubic Interpolant


Let
F
i



R
m
,

i

= 0,1,….
n
, be a given set of points at the dis
tinct knots
t
i



R
, with unit interval
spacing. Also let
D
i



R
m
, denote derivative values defined at the knots. Then a parametric
1
C
-
piecewise rational cubic Hermite function
P : R


R
m

is defined by:












,
t
,
t
t
,
t
M
t
N
t
P
t
P
i
i
i
i
i
1






(2)

where


Page
7

of
12














,
2
1
2
1
1
1
3
1
2
2
3
i
i
i
i
i
i
i
i
i
X
r
W
r
V
r
F
r
t
N






























,
2
1
2
1
1
3
1
1
2
2
3

















i
i
i
i
i
r
r
r
r
t
M


,
,
/
)
(
1
i
i
i
i
i
t
t
h
h
t
t








The choice of parameters
i
r
i


,
0
, ensures a strictly positive denominator in the rational cubic
(2). Thus from Bern
stein Bezier theory, the curve lies in the convex hull of the control points {
F
i
,
V
i
,
W
i
,
F
i
+1

}and is variation diminishing.


When two pieces of the rational cubics need stitching together with such a smoothness that the
tangent continuity holds, we need
to have the following:





























1
1
1
1
1
2
2
i
i
i
i
i
i
i
i
i
i
i
i
D
r
r
F
W
D
r
r
F
V
F
X


(3)


This can be achieved after some simple manipulations by imposing
C
0

and
C
1

constraints:






















.
,
,
1
1
1
1
i
t
P
t
P
t
P
t
P
i
i
i
i
i
i
i
i


(4)


simultaneously at the joint points. The distance based tangent information can be had using the
approximations given in the following subsection.


5.2

Estimation of Tangent Vectors (Derivatives)


The distance based tangent approximations
D
i

at
F
i

are
defined as follows:


For open curve:








































1
,
...
,
1
,
1
,
2
2
,
2
2
1
1
2
1
0
2
0
1
n
i
F
F
a
F
F
a
D
F
F
F
F
D
F
F
F
F
D
i
i
i
i
i
i
i
n
n
n
n
n
o

(5)



Page
8

of
12


For close curve:


























n
i
F
F
a
F
F
a
D
F
F
F
F
i
i
i
i
i
i
i
n
n
,
...
,
0
,
1
,
,
1
1
1
1
1
1

(6)


where

.
,
....
,
0
,
1
1
1
n
i
F
F
F
F
F
F
a
i
i
i
i
i
i
i











(7)


The experiments have shown that these approximations provide visually nice results.
Demonstration of this distance
-
based approximated derivative is shown in the Figure 5(a).


5.3

Shape Control Properties


The parameters
0

i
r

may be used to control the shape of the curve. For the practical
implementation, we choose them as non
-
negative. The following
shape control

properties of the
rational Hermite form are apparent from Figure 5.


5.3.1

Interval Shape Control


It is interesti
ng to note from (3) that when
0

i
r
, we have
i
i
i
F
V
W


,
1
. That is, the curve is
pulled towards the control point
i
F

with very low value of the parameter
i
r
. This behavior can
be seen i
n Figure 5(b) at the third point.


5.3.2

Interval Shape Control


It is interesting to note the following interval shape control property of the interpolant (2).






1
,
1
1







i
i
r
r
F
F
t
P
lim
i
i



(7)


That is, the curve

converges to the interval connecting the two consecutive points in the
i
th
interval of the curve. This behavior can be seen in Figure 5(c) in the third interval


5.3.3

Global Shape Control


Applying the interval property (or point tension property) above succes
sively, the design curve
converges to the control polygon as the derivatives, being distance
-
based, are bounded. Figure
5(d) demonstrates this behavior for the values of
.
.....,
,
1
,
0
,
01
.
0
n
i
r
i





6

Curve Fitting Approach to the Image Outline


Page
9

of
12


We will divide t
he whole set of contour points into segments. Each segment will be an ordered
subset of the universal set of contour points. A piecewise treatment is made for the curve fitting
approach. All the pieces (segments) are stitched finally at the corner points.
The whole curve
fitting process is explained as follows:


Step 1:

The spline method, in Section 5, is implemented to the corner points in such a way
that very low values are allocated to the shape parameters. These values will recover
the actual corner shapes as

in the original contour


Step 2:

If the desired fit in Step 1 is not optimal, the points of high curvature will be searched
in each segment. These points will be called
significant points
. The procedure, for the
search of significant points, is explained in Figur
e 6. Now, Step 1 is repeated together
with the inclusion of significant points in the spline scheme.
























Figure 6:

Curvature computation algorithm.


Step 3:

After passing through Step 2, the error may yet not be acceptable. In such a case, the
set of significant points is enhanced by halving the longer contour segments into two
equal smaller ones. Step 2 is repeated again with the new set of significant points.


Step 4:

If the output is yet to be corrected in Step 3, the Step 3 is repeated as long as t
he
desired output is achieved.


Approximate the curvature C
k
(i) at each contour point
P
i
=(x
i
,y
i
) as follows:


C
k
(i)= a
ik
.b
ik
/ |a
ik|
||

b
ik
|


where

a
ik
= (x
i



x
i+k

, y
i



y
i+k
)

b
ik
= (x
i



x
i
-
k

, y
i



y
i
-
k
)


A threshold va
lue T for C
k
(i) is set. A point P
i

is a
significant point if:


(i)

C
k
(i) takes local maxima.

(ii)

C
k
(i) > T


The value of k depends on several factors, such as the
resolution of the original digital image. Without
threshold value, the algorithm is too sensitive to

small
variations of

C
k
(i).

Page
10

of
12


It has been observed, during various experimentations, that the process of achieving an optimal
outline is terminated after Step 2. The Step 3 is needed only in the worst cases. The Step 4 has
not yet been seen by the algori
thm in any of the examples tested so far. But it has been included
as a part of the process and may be used by the machine, when required.


The above mentioned procedure is quite interesting and will help to automate the whole theory in
an efficient way.
The error manipulation of the computed outline with the contour is based on
visual display and hence is not fully automatic. Fully automation process of error manipulation,
from the authors point of view, can be searched and economized. This issue may be d
iscussed in
a subsequent paper.



7

Algorithm


The summary of the algorithm, designed for the generation of hand
-
drawn shapes, is as follows:



Step 1.

The hand
-
drawn image is scanned and filtered to remove any noise that might be
present in the scanned image.


Step 2.

B
oundary detection algorithm is applied to the image as the scanned image is simply
the bitmap representation of the hand drawn shapes.


Step 3.

Once the boundary is detected, important features of the outlines are extracted from the
boundary. This is achieved by
detecting the corner and significant points from the
detected boundary.


Step 4.

The boundary drawing method, in Section 6, is applied on the data acquired from
boundary and corner detection.


Step 5.

If the boundary thus obtained is not as accurate as desired, the para
meters in the
description of boundary drawing method are adjusted so that the drawn outline is
optimal to the scanned outline.


The Figure 9 explains the complete flow of the algorithm.


8

Concluding Remarks


A method for font designing has been presented w
hich is particularly suitable for non
-
Roman
languages like Arabic. However, it can be used for Roman languages too. In addition to the
detection of corner points, a strategy to detect a set of significant points is also explained to
optimize the outline. A

rational cubic spline, with shape parameters, has been utilized to capture
the outline of the fonts through the characteristic points. The proposed approach minimizes the
human interaction in obtaining the outline of original character. The authors feel t
hat the
proposed approach, still, has the potential to be enhanced and make more automated and robust
treatment. Therefore, such a work is still in progress and the authors are expecting some more
elegant results.

Page
11

of
12


References


[1]

Itoh, Koichi, (1993), A curve
fitting algorithm for character fonts, Electronic Publishing,
Vol. 6(3), 195
-
205.


[2]

Beus, H. L., (1987), An improved Corner Detection Algorithm based on chain coded plane
curves, Pattern Recognition, Vol. 20(3), 291
-
296.


[3]

Pei, S., et al, (1994), Corner Dete
ction Using Nest Moving Average, Pattern
Recognition, Vol. 27(11), 1533
-
1537.


[4]

Quddus, A., and Fahmy, M. M., (1995), Wavelet Transformation for Grey
-
Level Corner
Detection, Pattern Recognition, Vol.28(6), 853
-
861.


[5]

Quddus, A., and Fahmy, M. M.,
(1999)A fast wavelet
-
based corner detection technique,
Electronics Letters, Vol. 35(4), 287
-
288.


[6]

Avrahami, G., and Pratt, V., (1991), Sub
-
pixel edge detection in character digitization,
in Raster Imaging and Digital Typography II, eds. Morris
, R. and Andre, J., 54
-
64,
Cambridge University Press.


[7]

Pratt, V. (1985), Techniques for Conic Splines, Computer Graphics (ACM
SIGGRAPH), 19, 151
-
159.


[8]

Pitteway, M. L. V. (1985), Algorithms of Conic Generation, Springer
-
Verlag NATO
ASI serie
s F, 17, 2119
-
237.


[9]

Hussain, F. (1994), Quadratic Representation of Bezier Cubic Outlines, Private
Communication.


[10]

Hussain, F. (1989), Conic Rescue of Bezier Founts, New Advances in Computer
Graphics, Proceeding of CG International’89, Springer
-
Ver
lag, Tokyo, 97
-
120.


[11]

Sarfraz, M. (1995), Curves and surfaces for CAD using C2 rational cubic splines,
Engineering with Computers, Vol.11(2), 94
-
102.


[12]

Sarfraz, M. (1994), Generalized Geometric Interpolation for Rational Cubic Splines,
Computers and Graphics
, Vol. 18(1), 61
-
72.


[13]

Gregory, J. A., Sarfraz, M., and Yuen, P. K. (1994), Interactive Curve Design using C
2

Rational Splines, Computers and Graphics, Vol. 18(2), 153
-
159.


[14]

Sarfraz, M. (1994), A C
2

Rational Cubic Spline which has Linear Denominator and Sha
pe
Control, Annales Univ. Sci. Budapest, Vol. 37, 53
-
62.


Page
12

of
12


[15]

Sarfraz, M. (1991), Interpolatory Rational Bicubic Spline Surface with Shape Control, J.
Sc. Res., Vol. 20(1 & 2), 43
-
64.


[16]

Gregory, J.A. and Sarfraz, M. (1990), A Rational Spline with Tension, Compu
ter Aided
Geometric Design, Vol. 7, 1
-
13.


[17]

Raheem, A. and Sarfraz, M., (1999), Designing using a Rational Cubic Spline with Point
Shape Control, Proc. International Conference on Imaging Science, Systems, and
Technology (CISST'99), Las Vegas, Nevada, USA,
CSREA Press, USA, 565
-
571.


[18]

Sarfraz, M., Al
-
Mulhem, M., Al
-
Ghamdi, J., and Hussain, A., (1998), Quadratic
Representation to a C
1

Rational Cubic Spline with Interval Shape Control, Proc
International Conference on Imaging Science, Systems, and Technology (C
ISST'98), Las
Vegas, Nevada, USA, CSREA Press, USA, 322
-
329.


[19]

Karow, P., (1994), Font Technology (Methods and Tools), Springer
-
Verlag, Berlin.


[20]

Karow, P., (1994), Digital Typefaces: Description and Formats, Springer
-
Verlag.


[21]

Hersch, R. D., (1993), Font R
asterization, Visual and Technical Aspects of Type, Ed: R D
Hersch, Cambridge University Press, pp 78
-
109.


[22]

Herz, J. & Hersch, R. D., (1994), Towards a Universal Auto
-
hinting System for
Typographic Shapes, Electronic Publishing, Vol 7, No 4, pp 251
-
260.


[23]

Zalik, B., and Hussain, F., (1998), Constraint
-
based Interactive System for Font Outlines
Design, Proceedings: International Conference on Imaging Science, Systems, and
Technology, CISST'98, Las Vegas, Nevada, USA, pp 274
-
281, ISBN: 1
-
892512
-
03
-
3, Jul
199
8.


[24]

Hussain, F., Zalik, B., and Kolmanic, S., (2000), Intelligent Digitisation of Arabic
Characters, accepted for publication in IEEE proceedings on Information Visualisation
conference, London, Jul 2000.