sv-lncs - Bio-Grid - University of Connecticut

perchorangeΛογισμικό & κατασκευή λογ/κού

1 Δεκ 2013 (πριν από 3 χρόνια και 9 μήνες)

92 εμφανίσεις

Parallel Computation of the Minimum Separation
Distance of B
é
zier Curves and Surfaces

Lauren Bissett
1
, Nicholas Woodfield
1


1

University of Connecticut

Department of Computer Science and Engineering

Storrs CT, 06269


{lauren.bissett, nicholas.woodfield}@uc
onn.edu

Abstract.

The detection of the minimum separation distance between any two
points on a geometric surface is a tedious and time
-
consuming process.
Algorithms exist to determine this distance, but they are slow and not practical
for use. However, the
se algorithms can be easily parallelized to improve
performance. By using modern parallel programming architectures, the
speedups for the detection of this distance make it fast enough for practical use.

Keywords:
B
é
zier surfaces, B
é
zier curves, temporal
aliasing, minimum
separation distance, double normals, CUDA

1 Introduction

The underlying principle behind computer animation is the continuous movement
between image frames to give the illusion of motion. When an unexpected change
occurs between frames,

it disrupts the fluidity of motion for the viewer. This problem
is known as temporal aliasing.

One method of preventing temporal aliasing is the detection of imminent or
existing self
-
intersections in geometric surfaces. This can be accomplished by
calcul
ating the minimum separation distance, or the smallest length of separation of
any two points. This distance, if sufficiently small, will indicate the presence of a self
-
intersection.

1.1 Minimum Separation Distance

Although the minimum separation dista
nce is roughly defined as the smallest segment
between any two points on a geometric surface, there are some subtleties. A pair of
points can only be considered a potential segment if they are
doubly normal
.
A
doubly normal segment is such that both its en
dpoints are normal to the geometric
object. Therefore, the minimum separation distance is defined as the the minimum of
all lengths of all double normal segments.

1.2 Calculation of the Minimum Separation Distance

The general algorithm to find the minimu
m separation distance consists of three main
parts:




Generate candidate doubly normal segments



Refine results using Newton's method



Find the segment of minimum length



The geometric object is first searched for potential double normal segments, by
exhaus
tively testing pairs of points. This results in a number of potential estimates for
Newton's Method. Once the candidates are iterated through Newton's Method, their
lengths are compared, and the smallest is considered the minimum separation
distance.

1.3

Geometry

The geometry chosen for this research were B
é
zier curves and B
é
zier surfaces. The
algorithms can also handle composite curves and surfaces.

1.4 Parallelization

This algorithm is ideal for parallelization; the exhaustive checking of large sample

sizes of points in finding candidate doubly normal segments is time
-
consuming in
procedural implementations. Parallelization of this step would decrease run times
exponentially.

2 B
é
zier Curves

2.1 Generation of Candidate Seed Pairs

The generation of s
eed pairs on a given curve is an important step, as good initial
estimates for Newton's method are ideal for finding the correct result.

A pair of endpoints will be considered doubly normal if they satisfy the following
equations:


[
c
s

c
t
]

c'
s
=
0
[
c
s

c
t
]

c'
t
=
0


(
1
)


Where s,t
Є

[0,1]

Initial Estimate Algorithm for Newton's Method.

To produce initial estimates for
Newton's method,
it is first necessary to define a sample size (
n
). This sample size
determines the number of sampl
e points to take along the curve.


The following algorithm generates the initial estimates; the doubly normal test
referenced in the code is detailed below.


Initial Estimate Algorithm For Newton's Method [2]

Input: n, and sub
-
curves ci , where i = 1, .
. . , m


1. For k = 1, . . . , n − 1


s = 1/n + (k
-

1)/n.


2. For = 1, . . . , n − 1


t = 1/n + (
-

1)/n.


3. For i = 1, . . . , m


4. For j = 1, . . . , m


5. If i = j and s = t, cull versus doubly normal test.


6.

If i != j, cull versus doubly normal test.

Output: all segments not culled.

Doubly Normal Test.
The initial estimate algorithm generates pair values, but these
points must then be tested to determine if they are doubly normal, and if they are a
double nor
mal segment.


A key component of this double normal test is the user
-
defined tolerance values, e
1

and e
2
. These tolerances are used to determine if the segment is sufficiently close to
being normal to the curve. It is therefore important to chose values
that will not result
in too many or too few candidates found.


The test, given the two initial points p
1

and p
2
, checks if the endpoints are
sufficiently normal if:



p
1,
p
2


c'
s
<e
1

p
1,
p
2


c'
t
<e
2


(
2
)



If these two inequal
ities are both true, the segment is considered doubly normal and
will be used as an initial estimate for Newton's method.

2.2 Newton's Method

The initial estimates (samples s and t) are then run through Newton's method for
refinement. Newton's method is ef
ficient and produces decent results.


The following equations compute Newton's method for B
é
zier curves:


[
s
n+
1
t
n+
1
]
=
[
s
n
t
n
]

J

1
s
n
,t
n
[
f
s
n
,t
n
g
s
n
,t
n
]
n=
0,1,2,
.
.
.


(
3
)


until the inverse Jacobian matrix (J
-
1
)

is less than some user
-
defined threshold va
lue
(e), Where f and g are the left
-
hand side of the functions in equation 2.1. The
following is the calculation of the Jacobian matrix:


J
s
n
,t
n
=
[

f

s

f

t

g

s

g

t
]


(
4
)



J

1
=
1

f

s


g

t


f

t


g

s
[

g

t

f

t


g

s

f

s
]


(
5
)



And:



f

s
=
[
c
s

c
t
]

c''
s
+c'
s

c'
s

g

s
=c'
s

c'
t

f

t
=

c'
t

c

g

t
=
[
c
s

c
t
]

c''
t
+c'
t


c'
t


(
6
)


After Newton's method has been run on all the seed pairs, the segment of minimum
length is found (through a simple search) and this value is considered the minimum
sepa
ration distance.

2.3 Results

With B
é
zier curves, the CUDA implementation had an approximately 200% speedup
compared to the procedural C code.


Table 1.

Results for B
é
zier curves


n

e1

e2

C code

CUDA code

512

0.4

0.4

28s

154.76ms

512

0.2

0.2

26s

154ms

2
00

0.4

0.4

4.3s

46.6ms

10

0.4

0.4

15ms

1.81ms

3 B
é
zier Surfaces

The algorithm used for B
é
zier curves naturally extends to B
é
zier surfaces, merely
introducing two new variables: u and v.

A pair of endpoints will be considered doubly normal if they sati
sfy the following
equations:


[
S
u
1,
v
1

S
u
2,
v
2
]


S

u
1
=
0
[
S
u
1,
v
1

S
u
2,
v
2
]


S

v
1
=
0
[
S
u
1,
v
1

S
u
2,
v
2
]


S

u
2
=
0
[
S
u
1,
v
1

S
u
2,
v
2
]


S

v
2
=
0


(
7
)


3.1 Generation of Candidate Seed Pairs

Initial Estimate Algorithm for Newton's Method.
To produce initial estimates for
Newton's method with surfaces, again def
ining a sample size (
n
) is necessary,
determining the number of sample points to take along the surface.


Initial Estimate Algorithm for Newton's Method

Input:
n, and sub
-
meshes ci , where i = 1, . . . , m

1. for i = 1 to n


1


u
1

= i / n

2. for j = 1

to n


1


v
1

= j / n


3. for k = 1 to n


1


u
2

= k / n


4. for l = 1 to n


1


v
2

= l / n


5. For i = 1, . . . , m


6. For j = 1, . . . , m


7. If i = j and u
1

= u
2

&& v
1

= v
2
, cull v
ersus doubly normal test.


8. If i != j, cull versus doubly normal test.

Output: All segments not culled


Double Normal Test.

The double normal test is similar to the curve case, in which
each u, v endpoint is determined if sufficiently dou
ble normal, within a certain
threshold.


[
S
u
1,
v
1

S
u
2,
v
2
]


S

u
1
<e
1
[
S
u
1,
v
1

S
u
2,
v
2
]


S

v
1
<e
2
[
S
u
1,
v
1

S
u
2,
v
2
]


S

u
2
<e
3
[
S
u
1,
v
1

S
u
2,
v
2
]


S

v
2
<e
4


(
8
)


3.2

Newton's Method

The initial estimate results (samples u
1
, u
2
, v
1
, v
2
) are then run through the bivariate
Newton's method. The following equations co
mpute Newton's method for B
ézier
surfaces:

[
u
1
n+
1
v
1
n+
1
u
2
n+
1
v
2
n+
1
]
=
[
u
1
n
v
1
n
u
2
n
v
2
n
]

J

1
u
1
n
,v
1
n
,u
2
n
,v
2
n
F
u
1
n
,v
1
n
,u
2
n
,v
2
n



(
9
)

Where:

F=
[
F
1
F
2
G
1
G
2
]


(
10
)


And:


F
1
=
[
S
u
1,
v
1

S
u
2,
v
2
]


S

u
1
=
0
F
2
=
[
S
u
1,
v
1

S
u
2,
v
2
]


S

v
1
=
0
G
1
=
[
S
u
1,
v
1

S
u
2,
v
2
]


S

u
2
=
0
G
2
=
[
S
u
1,
v
1

S
u
2,
v
2
]


S

v
2
=
0


(
11
)


And where:


J
u
1,
v
1,
u
2,
v
2
=
[

F
1

u
1

F
1

v
1

F
1

u
2

F
1

v
2

F
2

u
1

F
2

v
1

F
2

u
2

F
2

v
2

G
1

u
1

G
1

v
1

G
1

u
2

G
1

v
2

G
2

u
1

G
2

v
1

G
2

u
2

G
2

v
2
]


(
12
)


The sixteen partial derivatives are listed in the appendix.

4 Parallelization with CUDA

One of the main goals of this research was to parallelize the algorithms used to find
the minimum separation distance. Luckily, the algorithms are easily adapted to make
use of parallel programming. The parallel architecture chosen was NVIDIA's CUDA
programming language, which allows parallel programming on the graphical
processing unit (GP
U).

4.1 CUDA Background

NVIDIA developed CUDA as a parallel programming language that allows easy
development on the GPU for the programmer, allowing “supercomputing for the
masses” [7]. Normally GPU programming is very difficult, but CUDA extends
elegantl
y as part of C. CUDA
-
enabled GPU's support vastly multi
-
threaded
applications, allowing the launch of thousands of simultaneous threads of execution
[3]. It also is a cost
-
effective method of fast, parallel computing without investing
enormous sums of mone
y in supercomputers. The only downside to CUDA is that is
hardware
-
specific, only running on NVIDIA, CUDA
-
enabled GPU's.


The CUDA architecture has three levels of organization: threads, blocks and the
grid. Threads are the basic component of execution,
and have their own register and
local memory. Threads are then organized into blocks;
the number of threads per
block is dependent on the architecture
. All blocks of threads have access to global,
constant and texture memory. The grid contains all the bloc
ks, and can be 1 or 2
-
dimensional. Register and shared memory are fastest, whereas local, constant, global
and texture memory are slower.

4.2 CUDA Specs

The machine used for the results presented in this paper used a QuadroFX 5800 GPU
on a machine running
Windows Vista.

4.3 Parallelization of B
é
zier Curves

The algorithm for the generation of candidate seed pairs (described in section 2.1)
uses four loops, or O(n
4
) complexity, making run times extremely slow for large
sample sizes.

To reduce the run time fo
r this step, each thread handles one sample point on the
curve, and compares it to every other point on all other curves; if it passes the double
normal test, Newton's method is performed.


The Main Kernel.

The main kernel containing the CUDA code perform
s two steps:
generation of candidate pairs, and Newton's method. Each thread in the kernel
receives the same input:




The entire input composite curve, represented as an array of control points



A two
-
dimensional output array to hold the double normal segmen
ts



The sample size



The number of curves



Tolerance values (e, e
1
, e
2
)


As stated, each thread handles one sample point on one curve, and checks for double
normals along its curve as well as all the others in the composite. Therefore a 1D grid
of threads is
used, arranged into blocks. If the sample size is smaller than the allowed
number of threads per block, each curve is represented as one block of threads.
However, if the sample size if larger, the curve must be split up over multiple blocks.


If a threa
d finds double normal segment(s), it then runs Newton's method on that
pair. If more than one double normal was found in that thread, it compares the newest
one found to the previously found segment. If smaller, it is saved and the old one
thrown out. At
the end of execution, the thread saves the shortest segment in the two
dimensional array, whose size is the number of blocks multiplied by the number of
threads. Once all threads have finished executing, the output array will contain the
shortest double no
rmal segment for each thread.


Secondary Steps.

Once the main kernel has finished finding all double normals, a
second kernel is launched. The purpose of this kernel is to sort through the returned
double normal segments and find the one of minimum length.

It works by first finding
the minimum length from each block. It returns these values, and then they are
iterated over to find the global minimum.

4.4 Parallelization of B
é
zier Surfaces

Modifying the existing CUDA code from B
é
zier curves to surfaces invol
ves not just
modifying the algorithm and equations used, but also changing the layout of the code.
Instead of a 1D array of threads, now a 2D one is used.


Main Kernel
. The main kernel behaves similarly to the one used for Bezier curves.
Instead of handlin
g one parametric s, t value, each thread in this kernel handles a u, v
pair. Each thread receives as input:




Composite mesh input as an array of control points



Output array that will contain found double normal segments.



Sample size.



Tolerance values.



This setup eliminates the need of the first two loops in the pseudocode listed in
section 3.1, already reducing the time complexity, in addition to CUDA's fast
runtimes.


So, similar to the curve kernel, each thread handles one u,v pair, and checks again
st
every other sub
-
mesh for double normal segments. If one is found, it is run through
Newton's method. If more than one double normal was found by the thread, it
compares the current segment to the previously saved segment, and throws away the
shorter.



If the sample size is less than the number of allowed threads per block, each sub
-
mesh is represented by one block. Otherwise, the meshes are split over multiple
blocks. Once all threads have finished execution, the kernel returns an array
containing the
shortest segment found by each thread.


Secondary Steps
. Once the kernel returns the results, a second kernel is launched to
search through the results, and return a smaller array, containing the shortest segment
from each block. From there, the array is s
earched to find the shortest segment. This
is the minimum separation distance.

Future Work

Although the method of parallelization for Bezier surfaces was developed, it was not
implemented. Therefore future work would include fully implementing the CUDA
cod
e for Bezier surfaces, and performing timing tests to determine if sufficiently fast
for practical use.

Acknowledgments

The authors were partially supported by grant NSF Grant #0755373, and supported by
University of Connecticut Biogrid REU program. Both a
uthors thank Dr. Bing Wang
and Dr. Tom Peters for their support and guidance.

References

1. Antonov, E., Bisceglio, J., Borovikov, I., Noble, N., Peters, T.J.: Convergence of Geometric
Algorithms for Graphics & Animation 8

14

2. Moore, L.F.: Computational
Topology of Spline Curves for Geometric and Molecular
Approximations 66

82

3. NVIDIA CUDA Zone,
http://
www.NVIDIA.com/cuda

4. Mathews, John H.
Numerical Methods For Computer Science, Engineering, And
Mathematics
. 1987 Prentice
-
Hall Inc: 1987. 380
-
382

5.
Miller L., Moore E.L.F., Peters T.J., Russell A.C.: Topological Neighborhoods for Spline
Curves: Practice and Theory

6. AK Peters ,

Real
-
Time Rendering 2
nd

Edition
, LTD July 2002 Thomas Akenine
-
Moller et al.

7. Dr. Dobbs CUDA, Supercomputing for the Masses

Part 1

7, http://www.ddj.com/hpc
-
high
-
performance
-
computing/207200659

8. K.E. Jordan et al.: Modeling time and topology for animation and visualization.

9. E. L. F. Moore, T. J. Peters, and J. A. Roulier. Preserving computatational topology by
subdivision



10. Patrikalakis, Nicholas M. et al.: Shape interrogation for computer aided design and
manufacturing.

Appendix

Partial Derivatives for Bivariate Newton's Method:



F
1

u
1
=

S

u
1


S

u
1
[
S
u
1,
v
1

S
u
2,
v
2
]


2
S

u
1
2


(
13
)



F
1

v
1
=

S

v
1


S

u
1


(
14
)



F
1

u
2
=


S

u
2


S

u
1


(
15
)



F
1

v
2
=


S

v
2


S

u
1


(
16
)



F
2

u
1
=

S

u
1


S

v
1


(
17
)



F
2

v
1
=

S

v
1


S

v
1
[
S
u
1,
v
1

S
u
2,
v
2
]


2
S

v
1
2


(
18
)



F
2

u
2
=


S

u
2


S

v
1


(
19
)



F
2

v
2
=


S

v
2


S

v
1


(
20
)



G
1

u
1
=

S

u
1


S

u
2


(
21
)




G
1

v
1
=

S

v
1


S

u
2

(
22
)



G
1

u
2
=


S

u
2


S

u
2
[
S
u
1,
v
1

S
u
2,
v
2
]


2
S

u
2
2


(
23
)




G
1

v
2
=


S

v
2


S

u
2

(
24
)



G
2

u
1
=

S

u
1


S

v
2


(
25
)



G
2

v
1
=

S

v
1


S

v
2


(
26
)



G
2

u
2
=


S

u
2


S

v
2


(
27
)



G
2

v
2
=


S

v
2


S

v
2
[
S
u
1,
v
1

S
u
2,
v
2
]


2
S

v
2
2


(
28
)