PARALLEL IMAGE PROCESSING
David Oldroyd
OUTLINE:
Images
Basic Rendering:
Ray Casting/Ray Tracing
Rasterization
Manipulation:
Shaders
Reflection
Lighting and shadows
Antialiasing and Filtering
Graphics Pipeline
Graphics Hardware
IMAGES
2d pixel array
Raster format
Different styles:
Single byte
RGB(A) 24 or 32 bit
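As a minimal sketch of the formats listed above, the following stores a raster image as a row-major 2D array of packed 32-bit RGBA pixels. The helper names (`pack_rgba`, `unpack_rgba`, `make_image`) and the 0xRRGGBBAA byte order are assumptions chosen for illustration; real formats differ in channel order.

```python
def pack_rgba(r, g, b, a=255):
    """Pack four 8-bit channels into one 32-bit pixel (0xRRGGBBAA, an assumed layout)."""
    for channel in (r, g, b, a):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must fit in one byte")
    return (r << 24) | (g << 16) | (b << 8) | a


def unpack_rgba(pixel):
    """Recover the (r, g, b, a) bytes from a packed 32-bit pixel."""
    return ((pixel >> 24) & 0xFF, (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF)


def make_image(width, height, fill=0):
    """A raster image as a row-major 2D array: image[y][x] is one pixel."""
    return [[fill] * width for _ in range(height)]


image = make_image(4, 3, fill=pack_rgba(0, 0, 0))
image[1][2] = pack_rgba(255, 128, 0)  # set the pixel at x=2, y=1 to opaque orange
```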
RENDERING
Program Data -> Image Data
Two general methods to choose from
Rasterization
Ray Casting/Ray Tracing
RASTERIZATION
Turn polygons into pixels
3D approach:
Foreach polygon p in the world:
  Generate triangular faces for each surface of p
  Foreach triangle t:
    Transform each vertex into 2D
    Fill each pixel of the triangle with a texture
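The innermost step of the loop above (filling the pixels of one triangle that has already been transformed to 2D) can be sketched with edge functions. This is one common way to do it, not necessarily the method the slides had in mind; `edge` and `rasterize_triangle` are hypothetical names, and the sketch returns the covered pixel coordinates rather than applying a texture.

```python
import math


def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p); the sign says which side of a->b p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize_triangle(v0, v1, v2, width, height):
    """Return the set of (x, y) pixels whose centers fall inside a 2D triangle."""
    area = edge(v0, v1, v2)
    if area == 0:
        return set()  # a degenerate triangle covers no pixels
    xs, ys = (v0[0], v1[0], v2[0]), (v0[1], v1[1], v2[1])
    # Only visit the triangle's bounding box, clamped to the image.
    x_lo, x_hi = max(math.floor(min(xs)), 0), min(math.ceil(max(xs)), width - 1)
    y_lo, y_hi = max(math.floor(min(ys)), 0), min(math.ceil(max(ys)), height - 1)
    covered = set()
    for y in range(y_lo, y_hi + 1):
        for x in range(x_lo, x_hi + 1):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            w0, w1, w2 = edge(v1, v2, p), edge(v2, v0, p), edge(v0, v1, p)
            # Inside when all three edge tests agree with the triangle's winding.
            if (area > 0 and min(w0, w1, w2) >= 0) or (area < 0 and max(w0, w1, w2) <= 0):
                covered.add((x, y))
    return covered


pixels = rasterize_triangle((0, 0), (4, 0), (0, 4), width=4, height=4)
```

Every pixel test is independent of the others, which is why this step maps well onto SIMD hardware.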
RASTERIZATION: SERIAL PERFORMANCE
Naive serial performance = $O(N_{poly})$
Performance boosters:
Backface culling
Cell and portal culling
Clipping partially seen triangles
Lower polygon count models for far out objects
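Backface culling, the first booster above, can be sketched as a sign test on a projected triangle. The sketch assumes front faces are wound counter-clockwise in screen space with the y axis pointing up; that convention and the function names are illustrative assumptions.

```python
def is_back_facing(v0, v1, v2):
    """True when a projected triangle is wound clockwise (or is degenerate)."""
    signed_area = (v1[0] - v0[0]) * (v2[1] - v0[1]) - (v1[1] - v0[1]) * (v2[0] - v0[0])
    return signed_area <= 0


def cull_backfaces(triangles):
    """Keep only the triangles that face the camera; the rest are never rasterized."""
    return [t for t in triangles if not is_back_facing(*t)]


front = ((0, 0), (1, 0), (0, 1))   # counter-clockwise: kept
back = ((0, 0), (0, 1), (1, 0))    # clockwise: culled
kept = cull_backfaces([front, back])
```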
RASTERIZATION: PARALLEL PERFORMANCE
Triangle interpolation is easily parallelizable
Some models in front of others
Great for SIMD
Requires very high memory bandwidth
Time complexity approximately $O(N_{poly}/P)$
RAY CASTING
Given a ray, find the objects which it intersects
Foreach pixel in image:
  Generate a ray from the camera through the pixel
  Trace the ray through the world
  Foreach object it intersects:
    Find the closest to the camera
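The loop above can be sketched for a world made only of spheres. The camera placement (at the origin, looking down the negative z axis), the image-plane mapping, and the names `ray_sphere`, `cast`, and `render` are assumptions for illustration; the sketch records which object each pixel sees instead of a color.

```python
import math


def ray_sphere(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest hit on a sphere, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0:
        return None  # the ray misses the sphere
    root = math.sqrt(disc)
    t = (-b - root) / 2.0
    if t <= 0:
        t = (-b + root) / 2.0  # origin is inside the sphere
    return t if t > 0 else None


def cast(origin, direction, spheres):
    """Return (index, distance) of the closest sphere the ray hits, or None."""
    closest = None
    for index, (center, radius) in enumerate(spheres):
        t = ray_sphere(origin, direction, center, radius)
        if t is not None and (closest is None or t < closest[1]):
            closest = (index, t)
    return closest


def render(width, height, spheres):
    """Cast one ray per pixel; each entry is the index of the visible sphere or -1."""
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            u = 2.0 * (x + 0.5) / width - 1.0
            v = 1.0 - 2.0 * (y + 0.5) / height
            norm = math.sqrt(u * u + v * v + 1.0)
            hit = cast((0.0, 0.0, 0.0), (u / norm, v / norm, -1.0 / norm), spheres)
            row.append(-1 if hit is None else hit[0])
        image.append(row)
    return image


spheres = [((0.0, 0.0, -5.0), 1.0), ((0.0, 0.0, -3.0), 1.0)]
```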
RAY CASTING: SERIAL PERFORMANCE
Single Ray:
Finding the target object assumed to be sublinear, often $O(\log N)$
Bounded by the number of objects in the world
$E(\mathrm{cost}(r)) = T_0 + p_{tot} T_s + p_{avg} O_{tot} T_h$
Whole space:
Break down space into grid of cells
Find which cells the ray travels through
Check the objects in those cells for a collision
$\mathrm{totalcost} = T_p + N_r \, E(\mathrm{cost}(r))$
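The "find which cells the ray travels through" step can be sketched in 2D with an incremental grid walk. This is a simplified illustration under assumptions not in the slides: the grid is square with its corner at the origin, the ray starts inside it, and `cells_along_ray` is a hypothetical name. Only the objects stored in the returned cells would then need an intersection test.

```python
import math


def cells_along_ray(origin, direction, grid_size, cell_size=1.0):
    """List the (ix, iy) cells a 2D ray visits, in order, until it leaves a square grid."""
    (x, y), (dx, dy) = origin, direction
    if dx == 0 and dy == 0:
        raise ValueError("direction must be non-zero")
    ix, iy = math.floor(x / cell_size), math.floor(y / cell_size)
    step_x = 1 if dx > 0 else -1
    step_y = 1 if dy > 0 else -1
    # Ray parameter t at which the ray next crosses a vertical or horizontal cell wall.
    wall_x = (ix + (1 if dx > 0 else 0)) * cell_size
    wall_y = (iy + (1 if dy > 0 else 0)) * cell_size
    t_next_x = (wall_x - x) / dx if dx != 0 else math.inf
    t_next_y = (wall_y - y) / dy if dy != 0 else math.inf
    # Increase in t needed to cross one whole cell along each axis.
    t_step_x = cell_size / abs(dx) if dx != 0 else math.inf
    t_step_y = cell_size / abs(dy) if dy != 0 else math.inf
    cells = []
    while 0 <= ix < grid_size and 0 <= iy < grid_size:
        cells.append((ix, iy))
        if t_next_x < t_next_y:
            ix += step_x
            t_next_x += t_step_x
        else:
            iy += step_y
            t_next_y += t_step_y
    return cells
```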
RAY CASTING: PARALLEL PERFORMANCE
Good News!
Embarrassingly parallel in shared memory systems
Algorithm:
Split image into regions
Each pixel is independent
Performance:
Minimal overhead, theoretical speedup of P
2008: 4 quad-core Xeons can run Quake Wars at 15-20 fps
2006: ATI X1900, 1024x1024 resolution at 12-18 fps
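The region-splitting algorithm above can be sketched by dividing the image into horizontal bands and rendering each band as a separate task. `shade_pixel` is a hypothetical stand-in for tracing one ray. A thread pool is used here only to keep the sketch short; in CPython, CPU-bound work would likely need a process pool or native code to approach a speedup of P.

```python
from concurrent.futures import ThreadPoolExecutor


def shade_pixel(x, y):
    """Placeholder for tracing one ray; depends only on its own pixel coordinates."""
    return (x * 31 + y * 17) % 256


def render_rows(y_start, y_end, width):
    """Render one horizontal band of the image."""
    return [[shade_pixel(x, y) for x in range(width)] for y in range(y_start, y_end)]


def render_parallel(width, height, workers=4):
    """Split the image into bands, render them concurrently, and stitch them in order."""
    band = max(1, (height + workers - 1) // workers)
    bounds = [(y, min(y + band, height)) for y in range(0, height, band)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: render_rows(b[0], b[1], width), bounds)
        return [row for part in parts for row in part]
```

Because no band reads another band's pixels, the only coordination needed is the final concatenation.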
RAY TRACING
Extension of Ray Casting
Don’t stop at the first object you find
Optical effects:
Reflection
Refraction
Scattering
Dispersion
RAY TRACING
Can be much more lifelike than rasterization
Poor performance
Often used in animated films
One frame can take over 15 hours to compute
IMAGE MANIPULATION
Reflection
Transparency/Translucency
Shaders
Lighting and Shadows
Antialiasing and Filtering
REFLECTION
Ray Tracing:
Create a new ray from the point of reflection
Doubles the computation time of reflected rays
New rays depend on the original rays, so they cannot be further parallelized
Rasterization:
True reflection:
Render the entire world where the mirror should be
If P equals the number of mirrors, $T_{seq} = T_{par}$
Fake reflection:
Use Lighting to imitate reflection
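For the ray tracing case, creating the new ray from the point of reflection can be sketched with the standard mirror formula r = d - 2(d.n)n. The function names and the small `epsilon` offset (used so the new ray does not immediately hit the surface it left) are assumptions for illustration.

```python
def reflect(direction, normal):
    """Mirror a direction about a unit surface normal: r = d - 2 (d . n) n."""
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2.0 * d_dot_n * n for d, n in zip(direction, normal))


def reflected_ray(hit_point, direction, normal, epsilon=1e-4):
    """Build the secondary ray; it can only be created once the original ray has hit."""
    origin = tuple(p + epsilon * n for p, n in zip(hit_point, normal))
    return origin, reflect(direction, normal)


origin, direction = reflected_ray((0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (0.0, 1.0, 0.0))
```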
TRANSPARENCY AND TRANSLUCENCY
Ray Tracing:
Find next closest object
Optionally use information from the first to modify the result
Minimal performance hit
Rasterization:
Interpolate far objects first
Overlay translucent objects on top
Adds more dependencies between near and far objects
Makes culling more difficult
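The rasterization approach (far objects first, translucent objects overlaid on top) can be sketched for a single pixel with the usual "over" blend. The fragment layout `(depth, rgb, alpha)` and the function names are assumptions for illustration.

```python
def over(src_rgb, src_alpha, dst_rgb):
    """Blend a translucent source color over an already drawn destination color."""
    return tuple(src_alpha * s + (1.0 - src_alpha) * d for s, d in zip(src_rgb, dst_rgb))


def composite(fragments, background):
    """Draw fragments from far to near; larger depth means farther from the camera."""
    color = background
    for _, rgb, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        color = over(rgb, alpha, color)
    return color


fragments = [
    (1.0, (1.0, 0.0, 0.0), 0.5),  # near, half-transparent red
    (5.0, (0.0, 0.0, 1.0), 1.0),  # far, opaque blue
]
pixel = composite(fragments, background=(0.0, 0.0, 0.0))
```

The sort is the dependency the slide mentions: a near fragment cannot be blended until everything behind it has been drawn.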
SHADERS
Refer to both a hardware concept and a software concept
Hardware (SIMD computation unit):
Designed for rasterization
Can be used to trace rays
Software:
Type of filter to apply during the rendering process
SHADERS
Post processing effects
Can operate on:
Pixels:
Lighting and shadows
Color variation
Antialiasing
Vertices:
Translation
Color
Textures
Geometry:
Create new objects
Tessellation
LIGHTING AND SHADOWS
Angular Lighting:
Determine angle to light source(s)
Detect occluding objects
Determine net lighting value
Radiosity:
Divide world into patches
Determine view factors between patches to get light bounce
System can be solved iteratively
More computationally intensive
Limited to diffuse reflection
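The three angular lighting steps above (angle to each light, occlusion check, net value) can be sketched with a Lambert cosine term. The light representation `(position, strength)`, the pluggable `occluded` callback, and the clamp to 1.0 are assumptions for illustration; a real renderer would implement the occlusion test with a shadow ray or a shadow map.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def diffuse_intensity(point, normal, lights, occluded=lambda point, light_pos: False):
    """Sum the cosine-weighted contribution of every light that is not blocked."""
    n = normalize(normal)
    total = 0.0
    for position, strength in lights:
        if occluded(point, position):
            continue  # this light is in shadow at the point
        to_light = normalize(tuple(l - p for l, p in zip(position, point)))
        cos_angle = sum(a * b for a, b in zip(n, to_light))
        total += strength * max(0.0, cos_angle)  # lights behind the surface add nothing
    return min(total, 1.0)


overhead = [((0.0, 5.0, 0.0), 1.0)]
angled = [((3.0, 4.0, 0.0), 1.0)]
```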
LIGHTING AND SHADOWS
Either method can be used with either rendering paradigm
Angular Lighting:
Ray tracing: $O(N_{pixels})$
Rasterization: $O(N_{poly})$
Always pipelined with other shaders
Radiosity:
$O(N_{patches})$, but naïve $O(N_{objects}^4)$
POLYGON LIGHTING
Additional shader to apply different light values over the triangle’s surface
Several different methods
Gouraud:
Foreach vertex:
  Average each adjoining face surface normal
  Calculate the vertex intensity from the estimated surface normal
Foreach pixel:
  Interpolate the pixel intensity linearly from the vertex intensities
Phong:
Foreach vertex:
  Average each adjoining face surface normal
Foreach pixel:
  Interpolate and normalize the surface normal
  Calculate the surface intensity based on the surface normal
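The difference between the two methods can be sketched along a single edge between two vertices. The sketch assumes a simple diffuse (cosine) intensity and a parameter `t` in [0, 1] for the position between the vertices; the function names are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def lambert(normal, light_dir):
    """Diffuse intensity for a unit normal and a unit direction toward the light."""
    return max(0.0, sum(a * b for a, b in zip(normal, light_dir)))


def gouraud(n0, n1, light_dir, t):
    """Light the two vertices first, then interpolate the intensities."""
    i0, i1 = lambert(n0, light_dir), lambert(n1, light_dir)
    return (1.0 - t) * i0 + t * i1


def phong(n0, n1, light_dir, t):
    """Interpolate and renormalize the normal first, then light the pixel."""
    n = normalize(tuple((1.0 - t) * a + t * b for a, b in zip(n0, n1)))
    return lambert(n, light_dir)


light = normalize((1.0, 1.0, 0.0))
n0, n1 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
mid_gouraud = gouraud(n0, n1, light, 0.5)
mid_phong = phong(n0, n1, light, 0.5)
```

With the light aimed at the middle of the edge, the per-pixel normal catches the brightest point while the interpolated vertex intensities do not, at the cost of one normalization and one lighting evaluation per pixel.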
POLYGON LIGHTING
[Figure panels: Flat Shading, Gouraud Shading, Phong Shading]
ANTIALIASING
Family of methods to reduce aliasing
Signal processing:
2D FFT over final image: $O(N^2 \log N)$
Only allow frequencies below a certain limit: $O(N^2)$
2D inverse FFT: $O(N^2 \log N)$
FFT can be done in parallel at $O(N^2 \log N / P)$
Object-based antialiasing:
Only antialias pixels that fall on a vertex
Minimal performance gain with high polygon count
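The signal-processing approach (transform, drop high frequencies, transform back) can be sketched on a single scanline. To stay dependency-free this uses a naive O(N^2) discrete Fourier transform in one dimension rather than the 2D FFT named above, so it only illustrates the filtering idea, not the stated complexity. The names `dft` and `low_pass` and the `cutoff` parameter are assumptions.

```python
import cmath


def dft(signal, inverse=False):
    """Naive discrete Fourier transform of a 1D sequence (or its inverse)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def low_pass(scanline, cutoff):
    """Zero every frequency above the cutoff, then transform back to pixel values."""
    spectrum = dft(scanline)
    n = len(spectrum)
    # Bins k and n - k hold the same frequency, so both are tested against the cutoff.
    kept = [c if min(k, n - k) <= cutoff else 0 for k, c in enumerate(spectrum)]
    return [value.real for value in dft(kept, inverse=True)]


jagged = [2, 0, 2, 0, 2, 0, 2, 0]  # average level 1 plus the highest possible frequency
smooth = low_pass(jagged, cutoff=2)
```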
ANTIALIASING
More pixels = same parallel speedup
Subpixel rendering:
3x more pixels must be rendered
Supersampling:
Render the image at a higher resolution (often 2x)
Downsample to the desired resolution
Time complexity increases by the resolution ratio squared
Multisampling:
Only supersample specific aspects of each pixel
OpenGL only supersamples depth and stencil values
Better performance than full supersampling for similar quality
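The downsampling half of supersampling can be sketched with a box filter that averages each block of high-resolution samples into one output pixel. The grayscale 2D-list image and the name `downsample` are assumptions for illustration; other filters are possible.

```python
def downsample(image, factor=2):
    """Average each factor x factor block of a grayscale image into one output pixel."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height - height % factor, factor):
        row = []
        for x in range(0, width - width % factor, factor):
            block = [image[y + j][x + i] for j in range(factor) for i in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Rendered at 2x the target resolution in each dimension: 4x the pixels to compute.
high_res = [
    [0, 0, 4, 4],
    [0, 0, 4, 4],
    [1, 3, 8, 8],
    [5, 7, 8, 8],
]
final = downsample(high_res, factor=2)
```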
GRAPHICS PIPELINE
Step 1: Per-vertex lighting and shading
O(V) vertex shaders
Each vertex handled separately, speedup = P
Step 2: Clipping
$O(N_{poly})$ remove vertices of a polygon outside the viewing area
Each polygon handled separately, speedup = P
Step 3: Projection transformation
O(V) each vertex must be transformed to 2D
Each vertex handled separately, speedup = P
Step 4: Rasterization
$O(N_{pixels})$ convert to raster format, determine pixel values
Most pixels handled separately, speedup approaches P
Step 5: Texturing and pixel shaders
$O(N_{pixels})$ apply all the other pixel shaders to the final image
Shaders are usually pixel independent, speedup approaches P
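The per-vertex stages can be sketched with a perspective projection followed by a viewport mapping. The sketch assumes a camera at the origin looking along +z and a viewing area of [-1, 1] in both axes, and for brevity it tests visibility after projecting rather than clipping polygons beforehand; all function names are hypothetical.

```python
def project(vertex, focal_length=1.0):
    """Perspective-project a camera-space vertex (x, y, z) onto the plane z = focal_length."""
    x, y, z = vertex
    if z <= 0:
        raise ValueError("vertex is at or behind the camera")
    return (focal_length * x / z, focal_length * y / z)


def in_view(point):
    """Simplified clipping test: is the projected point inside the viewing area?"""
    return -1.0 <= point[0] <= 1.0 and -1.0 <= point[1] <= 1.0


def to_pixel(point, width, height):
    """Map a point in [-1, 1] x [-1, 1] to pixel coordinates (y grows downward)."""
    x, y = point
    return ((x + 1.0) * 0.5 * width, (1.0 - y) * 0.5 * height)


vertices = [(0.0, 0.0, 2.0), (1.0, 1.0, 2.0), (6.0, 0.0, 2.0)]
projected = [project(v) for v in vertices]   # each vertex is independent of the others
visible = [p for p in projected if in_view(p)]
pixels = [to_pixel(p, 640, 480) for p in visible]
```

Each list comprehension touches one vertex at a time, which is the independence behind the "speedup = P" entries above.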
GRAPHICS HARDWARE
Top end GPUs can contain up to:
2048 Unified Shaders
128 Texture Mapping Units
32 Render Output Units
Multi GPU rendering
P can be several thousand
Speedup is almost always close to P
SOURCES
http://graphics.stanford.edu/papers/i3dkdtree/gpu-kd-i3d.pdf
https://sites.google.com/site/raytracingasaparallelproblem/
http://www.graphics.cornell.edu/~bjw/mca.pdf
http://www.codinghorror.com/blog/2008/03/real-time-raytracing.html
http://en.wikipedia.org/wiki/Rasterization
http://www.dtic.mil/dtic/tr/fulltext/u2/a236590.pdf
http://www.cambridgeincolour.com/tutorials/image-interpolation.htm
http://en.wikipedia.org/wiki/Rendering_pipeline
http://graphics.stanford.edu/~kayvonf/papers/fatahalianCACM.pdf
http://graphics.stanford.edu/papers/mprast/rast_hpg09.pdf
http://menohack.acm.jhu.edu/CUDAWriteup.pdf