Procedural Graphics with DirectX 11
James P. Reid
Department of Software Engineering
University of Wisconsin
Platteville, Wisconsin 53818
Abstract

Over the past decade, computer graphics have become more demanding both in detail and in the size of the area rendered. These demands are expanding faster than modern graphics hardware can keep up. In this paper I describe a popular technique for solving this problem, one that can cut memory usage by up to 90%. This technique has been around since the early 80s, but with modern hardware and the power of DirectX 11, worlds of near-infinite size can be created that maintain all the detail expected by the end user.
Introduction

As a brief introduction to this paper and the concepts it covers, I would first like to go over a few basic graphics concepts to make some of the ideas mentioned later easier to understand.
At the most basic level, computer-generated graphics are created from meshes built out of arrays of vertices. A vertex is simply a point in three-dimensional space where two or more edges meet. By combining hundreds or even thousands of vertices, objects can be represented in a way that a computer will understand. For example, a simple box can be represented by the combination of eight different vertices. The computer can then draw lines between these vertices in a triangle pattern (two triangles per side of the box) and lay colors or even textures on that grid of vertices. This grid of vertices representing the object is called a mesh.
The other main concept needed is a height map. This is usually a square grayscale image where each pixel represents a different height on the mesh it is being mapped to. Pixel values range from 0 to 255, giving the mesh 256 distinct height values. The level of detail of a height map can be increased if the map uses all three color channels and the alpha channel. A color channel is a set of 8 bits in a 32-bit integer in the RGBA scheme. This means the first 8 bits of the 32-bit integer are the red channel, the next 8 bits are the green channel, and so on. This allows a 32-bit height for each pixel, giving over four billion distinct height values on a mesh.
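As a concrete illustration of this packing, the sketch below splits a 32-bit height into four 8-bit channels and recombines them. The byte order shown (red in the low byte) follows the scheme described above; real engines and image formats vary in channel order, so treat the layout as an assumption.

```python
def pack_height(height):
    """Split a 32-bit unsigned height value into (r, g, b, a) bytes."""
    if not 0 <= height <= 0xFFFFFFFF:
        raise ValueError("height must fit in 32 bits")
    r = height & 0xFF           # first 8 bits: red channel
    g = (height >> 8) & 0xFF    # next 8 bits: green channel
    b = (height >> 16) & 0xFF   # blue channel
    a = (height >> 24) & 0xFF   # alpha channel
    return r, g, b, a

def unpack_height(r, g, b, a):
    """Recombine the four channels into one 32-bit height value."""
    return r | (g << 8) | (b << 16) | (a << 24)
```

Round-tripping any value between 0 and 2^32 - 1 through these two helpers returns it unchanged, which is exactly the "over four billion distinct height values" the text mentions.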
Procedural graphics were initially implemented in games because of memory constraints in the early computers capable of supporting games. By forcing the game to generate its content at run time, designers were able to create massive worlds that occupied only small amounts of hard disk space and RAM. One of the first games to use this technique was a 1982 game whose world was generated completely procedurally. The original design for this game called for an enormous number of galaxies, with 256 solar systems in each galaxy. This meant there would be trillions of items to explore in the game, and it was all built into a single algorithm.
Procedural generation didn't stop when computers became faster and able to handle more vertices per frame; instead, the technique was used to create even more believable worlds in high definition, while the game designers could focus on the story line instead of how to fit all these graphics into a set amount of space. This led to a type of artificial intelligence, allowing the game to become dynamic rather than static: content was generated based on where the player was, and what they were doing there.
Procedural Generation & Why Use It
According to Wikipedia, procedural generation is content that is generated algorithmically rather than manually. This means the content seen in the graphics world is generated at run time (on the fly), as opposed to having the data stored and loaded from files. Most games today use height maps and meshes stored in files and load them when the application calls for them. This means the computer spends a lot of time accessing the hard drive and parsing out new data to display to the world. This is what causes a loading screen in most modern games and 3D graphics applications such as virtual worlds used to represent a location (e.g., Google Earth).
The primary use for procedural generation is to reduce the total hard disk space used by the application. Accessing data on a hard drive is very expensive for the system, especially when the application has to load new data, or needs to write data that is currently stored in memory back to the hard drive to make space in RAM for the new data.
By generating everything at run time algorithmically, the application no longer needs to be constantly communicating with the hard drive. This is very good, considering the slowest part of a modern computer is usually the hard drive. The only downside to using a procedural approach is the need for a fast Central Processing Unit (CPU) and Graphics Processing Unit (GPU) in order to generate the data required by the system at a decent frame rate. Considering today's modern hardware, there really isn't much reason to worry about having the necessary speed to generate all the data needed, when CPUs generally run in the 3 gigahertz range and most modern GPUs have over one thousand stream processors. A stream processor is basically a single core running around 800 megahertz that works in parallel with all the other stream processors on the GPU.
Noise

In order to generate something procedurally, an algorithm must be used. This is where noise becomes very important for generating objects in the virtual world that aren't being stored directly on the hard drive. Noise comes in many different flavors, the two most popular being Perlin noise and Voronoi noise. These are the two most widely used algorithms when attempting to generate a virtual world procedurally. There is another popular type of noise known as fractal noise, but this is simply a variation on Perlin noise.
After the noise algorithm has been used, the values it generates are assigned to each height value in a grid of vertices, or mesh. This deforms the grid, creating a random and realistic-looking terrain involving low areas for bodies of water, and high areas that can represent mountains. This is just one simple application of noise, but it is probably the most notable.
Perlin Noise

Perlin noise is the most widely used algorithm when creating a procedural graphics engine. It was invented by Ken Perlin. Since the invention of his noise algorithm, it has become the standard for computer graphics and movement.
The algorithm itself is actually quite simple. It consists of taking pseudorandom numbers over different amplitudes and frequencies and summing all the iterations into one final curve. In one dimension, this would be a simple line that looks more jagged as the frequency increases and the amplitude decreases over increasing iterations. In two dimensions, this would start as several large blots on an image.
The noise function is dependent on five main values. The first is a random seed to be used with a pseudorandom number generator that returns values between 0 and 1. The second is the frequency of the plot over time. The third is the amplitude of the plot; this determines the maximum and minimum values that will be generated by the function. The fourth is the number of octaves in the function, or how many times to iterate through. The fifth is the persistence. This number changes the amplitude over time by multiplying the current amplitude of each iteration by the persistence; a higher persistence results in a larger amplitude for each octave in the sum. Once the function finishes, it adds up all the octaves into one final curve.
To do this, a smoothing function is used, which interpolates each point dependent on the points around it. This becomes far more complex when moving into two and even three dimensions, but modern hardware can be exploited to speed through these calculations and return an answer quickly. The basic smoothing equation for this interpolation has recently changed from 3t^2 - 2t^3 to Perlin's improved fade curve, 6t^5 - 15t^4 + 10t^3.
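The five values above can be sketched as a small one-dimensional octave-summing noise function. This is an illustrative value-noise approximation, not Perlin's gradient-noise algorithm itself: the `lattice_value` helper, the integer seed mixing, and the doubling of frequency each octave are all assumptions made for the sketch.

```python
import random

def fade(t):
    # Perlin's improved smoothing curve: 6t^5 - 15t^4 + 10t^3
    return t * t * t * (t * (t * 6 - 15) + 10)

def lattice_value(seed, octave, i):
    # deterministic pseudorandom value in [0, 1) at integer lattice point i
    return random.Random(seed * 1000003 + octave * 9973 + i).random()

def octave_noise(x, seed, frequency, amplitude, octaves, persistence):
    """Sum `octaves` layers of smoothed 1-D value noise at position x."""
    total = 0.0
    for octave in range(octaves):
        fx = x * frequency
        x0 = int(fx)                   # lattice points straddling fx
        t = fade(fx - x0)              # smoothed interpolation weight
        left = lattice_value(seed, octave, x0)
        right = lattice_value(seed, octave, x0 + 1)
        total += amplitude * ((1 - t) * left + t * right)
        frequency *= 2.0               # each octave: higher frequency,
        amplitude *= persistence       # amplitude scaled by persistence
    return total
```

With amplitude 1 and persistence 0.5 over four octaves, the result is bounded by 1 + 0.5 + 0.25 + 0.125 = 1.875, which is how the persistence controls each octave's contribution to the final curve.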
Voronoi Noise

Voronoi noise comes from a mathematical decomposition known as a Voronoi diagram. This type of math was first studied by the mathematician Georgy Voronoi, and has continued to evolve since. When using this method for computer graphics, Lloyd's algorithm is used. This algorithm starts with an initial distribution of samples and then computes the noise image. The image is computed by creating the Voronoi diagram of all the points, integrating each cell and computing the centroid of that cell, and finally moving all points within the cell to its centroid. This gives the image a cellular look, with all cells fairly random in terms of their size and distribution.
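The Lloyd relaxation just described can be sketched with a brute-force pixel-grid Voronoi assignment. The grid size, seed count, and three relaxation iterations are arbitrary choices for illustration; production implementations compute the diagram geometrically rather than per pixel.

```python
import random

def lloyd_relax(points, width, height, iterations=3):
    """Discrete Lloyd relaxation: assign every pixel to its nearest seed
    (a brute-force Voronoi diagram), compute each cell's centroid, and
    move the seed there. Repeating evens out cell sizes."""
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0] for _ in points]   # x-sum, y-sum, pixel count
        for y in range(height):
            for x in range(width):
                nearest = min(range(len(points)),
                              key=lambda i: (points[i][0] - x) ** 2 +
                                            (points[i][1] - y) ** 2)
                sums[nearest][0] += x
                sums[nearest][1] += y
                sums[nearest][2] += 1
        # move each seed to the centroid of its Voronoi cell
        points = [(sx / n, sy / n) if n else p
                  for p, (sx, sy, n) in zip(points, sums)]
    return points

rng = random.Random(7)
seeds = [(rng.uniform(0, 32), rng.uniform(0, 32)) for _ in range(5)]
relaxed = lloyd_relax(seeds, 32, 32)
```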
The Quadtree Structure
Up until this point, procedural graphics may seem quite simple to create and manage. The terrain, clouds, etc. made by this method are very convincing and completely remove the need to store a mesh on the hard drive. However, there is a major downfall when generating a world in this manner: its scalability. When creating small worlds, the methods mentioned thus far work very well, but if the world were to start getting large, the computer would eventually spend so much time trying to process all of the vertices that it couldn't maintain sixty frames per second anymore. When the application drops below sixty frames per second, it is noticeable to the human eye. The solution to this problem is to use a quadtree structure for storing the vertex data in the virtual world. This structure is used in the solutions for each different technique discussed today.
A quadtree is basically a binary tree that has four children per node instead of two. This means we can take a mesh, split it into four nodes, split each of those nodes into four more, and so on down the tree. A word of caution: try not to store too much data in each node of the tree, since the memory requirement quadruples for each level of depth traveled down the tree.
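A minimal quadtree node might look like the following sketch. The field names and the `subdivide`/`count_nodes` helpers are illustrative, not taken from any of the cited implementations.

```python
class QuadNode:
    """A quadtree node covering a square region of the terrain. Each
    split produces four children, so per-node data multiplies by four
    at every level, hence the caution about per-node payload size."""

    def __init__(self, x, y, size, depth=0):
        self.x, self.y = x, y      # top-left corner of this quad
        self.size = size           # side length of the square region
        self.depth = depth
        self.children = None       # None until subdivided

    def subdivide(self):
        half = self.size / 2
        self.children = [
            QuadNode(self.x,        self.y,        half, self.depth + 1),
            QuadNode(self.x + half, self.y,        half, self.depth + 1),
            QuadNode(self.x,        self.y + half, half, self.depth + 1),
            QuadNode(self.x + half, self.y + half, half, self.depth + 1),
        ]
        return self.children

    def count_nodes(self):
        if self.children is None:
            return 1
        return 1 + sum(c.count_nodes() for c in self.children)
```

Two levels of subdivision already give 1 + 4 + 16 = 21 nodes, which makes the quadrupling growth concrete.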
There are many uses for the quadtree, and all of them drastically increase the performance of the application. The main uses are view frustum culling, level of detail, vertex reduction, grid location, and collision detection. Only some of these apply to procedural graphics: frustum culling, level of detail, and vertex reduction. The others are mentioned as proof of how powerful a quadtree really is, and these are not even all the reasons for using one.
By using quadtrees, vertices can be removed or added based on the camera's current position, where it is looking, and the level of detail desired at different locations within the virtual world. Efficiency in computer graphics is all about quantity: the more vertices currently being rendered, the more work the computer has to do. Adding effects, by contrast, is surprisingly fast work for a modern GPU, since it can process over one thousand vertices in parallel at speeds around 1 gigahertz. The main point is that we want quality from our world and each vertex, not quantity. The following sections will go over some popular methods to remove and add vertices based on the current state of the application. This makes the virtual world very dynamic, rather than a single static mesh that consumes processing power. The University of Massachusetts work presents several different applications of the quadtree, including the quadtree mesh, which is what will be used for the rest of the methods discussed in this paper.
View Frustum Culling
One of the many tasks a quadtree allows is view frustum culling. When viewing a virtual world, only certain areas of the world can be seen at any given snapshot in time. The view frustum refers to the area currently viewable by the camera. This includes a near and a far plane of viewing, leaving anything too close or too far away unseen by the camera. Culling refers to the removal of unwanted vertices in the meshes currently being rendered. By dividing the mesh up into quads, the application is able to traverse down the tree at run time and determine which quads are currently visible to the camera's view. Figure 1 is a very good example of the result of using this technique. As we can see, the entire upper-left quadrant of this terrain mesh has been removed from the viewing area and not rendered by the graphics device. Only the gray areas in Figure 1 will be sent to the graphics device for rendering.

Figure 1: View frustum culling with a quadtree.
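The traversal just described can be sketched in two dimensions, with an axis-aligned rectangle standing in for the camera's frustum (a real frustum test uses the six planes of a truncated pyramid). All names here are hypothetical.

```python
class Quad:
    """Minimal quadtree node: a square with optional four children."""
    def __init__(self, x, y, size, children=None):
        self.x, self.y, self.size = x, y, size
        self.children = children

def intersects(node, view):
    # axis-aligned overlap test; view = (x0, y0, x1, y1)
    vx0, vy0, vx1, vy1 = view
    return not (node.x + node.size < vx0 or node.x > vx1 or
                node.y + node.size < vy0 or node.y > vy1)

def visible_leaves(node, view):
    """Collect leaf quads inside the view, skipping culled subtrees whole."""
    if not intersects(node, view):
        return []              # culled: nothing below can be visible
    if node.children is None:
        return [node]
    leaves = []
    for child in node.children:
        leaves += visible_leaves(child, view)
    return leaves
```

The payoff is that one failed test on a parent quad discards its entire subtree, so most of the mesh is rejected without ever being examined vertex by vertex.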
Level of Detail

Level of detail (LOD) is the process of adding and removing quadtree nodes from the virtual world based on the current state of the application. At the very top level of a quadtree, there will be a single parent node that can be traversed; however, we may not need to traverse all the way down this quadtree, because only certain parts of it are important to the current view frustum of the camera.
To solve this area selection issue, as the world is being generated for the current frame using noise algorithms and other basic constants such as the size and height of the grid, additional quadtree nodes may be added to the mesh in areas much closer to the camera. This results in having many more vertices near the camera and far fewer in the distance. When examined, the mesh will appear to have higher-resolution squares that reduce in resolution the farther from the camera they are. Several of the works cited here use a similar approach to create different levels of detail in a newer and more efficient manner, once again allowing much larger worlds to be rendered in even higher detail while keeping system usage at a minimum.
This concept is where DirectX 11, more importantly its sub-API Direct3D 11, makes a huge difference in performance and rendering capabilities. Starting from the beginning, the world may start out as a flat plain off in the distance, which can be represented by only four vertices forming two triangles for the square. As the camera moves closer, additional nodes will be generated by Direct3D 11's geometry shader. This part of the graphics pipeline allows additional vertices to be added to a set of vertices already on the GPU. Using this concept, additional vertices will be added depending on the distance from the camera to the current node. Vertices farther from the camera will only traverse down the quadtree one or two times, if at all. A set of vertices right next to the camera will traverse several levels down the quadtree, greatly increasing the detail seen by the camera.
Now that the GPU is adding vertices based on the camera's position and viewing direction in the virtual world, we can start with a massive mesh containing only a few vertices, and add additional vertices using a noise algorithm based on each vertex's distance from the camera. When completed, all the areas viewable to the camera will have a different concentration of vertices based on how far away from the camera they are. Areas nearby will have many vertices, while areas off in the distance have few, since the viewer would not be able to see the detail anyway. Now, as the camera moves around, vertices will be added as seen fit by the LOD and noise algorithms implemented on the GPU, removing a large portion of the work from the CPU so it can spend its time calculating other things such as shadows or artificial intelligence for the application.
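A simple way to sketch this distance-dependent traversal depth is to allow one quadtree level fewer for each doubling of camera distance. The `base` distance and `max_depth` values are invented parameters for illustration.

```python
def lod_depth(distance, max_depth=8, base=100.0):
    """Choose how many quadtree levels to descend for a node at the
    given camera distance: full depth up close, shallow far away."""
    depth = max_depth
    threshold = base
    while distance > threshold and depth > 0:
        depth -= 1           # one level shallower...
        threshold *= 2.0     # ...for each doubling of distance
    return depth
```

A node 50 units away gets the full 8 levels, while one a billion units away gets none, matching the "one or two levels, if any" behavior described for distant vertices.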
Vertex Interpolation Error
There is one side effect of using this technique, known as the vertex interpolation error. This occurs when the level of detail changes in the quadtree mesh. When the camera moves closer to a specific point in the virtual world, the GPU will add more vertices as the camera approaches, to increase the LOD seen by the user. As those vertices are being added, some visual popping may occur, because the exact location of the new vertices may be slightly different from that of the vertices they just replaced. This problem has many solutions, but the easiest is to simply take the midpoint of the new vertex and the previous vertex it replaced, and place the new vertex at that midpoint. Interpolation is then used to determine where to place the additional vertices that increase the resolution of the mesh at this location, since those vertices didn't replace anything.
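The midpoint fix can be sketched as two one-line helpers (the function names are hypothetical): one for a refined vertex that replaces an existing coarse vertex, and one for a brand-new vertex that replaced nothing.

```python
def place_new_vertex(old_height, target_height):
    """Blend a refined vertex halfway between the coarse vertex it
    replaces and its true target height, softening the visual pop."""
    return (old_height + target_height) / 2.0

def fill_between(left_height, right_height):
    """A brand-new vertex that replaced nothing is instead interpolated
    from its new neighbours."""
    return (left_height + right_height) / 2.0
```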
Another issue that may occur when changing the level of detail for a mesh is that the textures will not line up with each other, since they are being placed next to each other on vertices of different resolutions. On one side of this 'line', which can be seen in Figure 2, the vertices will be at one level of detail, and on the other side there will be twice as many vertices, so the texture will be repeated twice as much with its tiling. To counter this issue, we first have to change the texture sample rate based on the depth in the quadtree. Then a method known as texture morphing can be used to smooth out the edges of the two different sides, to make them appear as though they smoothly transition from one LOD to the next without being noticed by the end user.
Figure 2: Textures laid on two different LODs in a terrain mesh, showing the line created by disagreeing texture coordinates at the two different LODs.

Figure 3: The same snapshot of a terrain as in Figure 2, but with texture morphing used to hide the line.
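The two countermeasures just described, a depth-based texture sample rate and a morphing blend across the seam, can be sketched as follows. The doubling-per-level scaling and the blend parameterization are assumptions for illustration.

```python
def sample_rate(depth, base_rate=1.0):
    """Tile the texture twice as densely for each extra quadtree level,
    keeping texel density consistent across an LOD boundary."""
    return base_rate * (2 ** depth)

def morph_texcoord(coarse_uv, fine_uv, blend):
    """Texture morphing: linearly blend the coarse- and fine-LOD texture
    coordinates; `blend` runs from 0 to 1 across the seam region."""
    return tuple((1.0 - blend) * c + blend * f
                 for c, f in zip(coarse_uv, fine_uv))
```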
Vertex Reduction

As stated previously, the total number of vertices rendered per frame is what slows down a system. Having few vertices at high quality will result in a good-looking virtual world that won't stress the system to the point where it cannot render what it is being sent.
To accomplish this, the application should remove redundant vertices where they are not needed. A great example of this is a large flat area in a terrain. If the application renders all vertices in the mesh after LOD is applied, there may be extra data that does not improve the visual beauty of the world, but rather stresses the system more. A large flat area really only needs four vertices to make the square needed to draw this part of the terrain, so the application removes all the other vertices before sending them to be rendered.
The basic idea behind this is to traverse down the quadtree and compare the slopes of each vertex vector to the others around them. If the application determines the slope of a certain area is uniform enough, it will alter the quadtree structure by removing any nodes that are within that specific area of the mesh. This must be generalized to work in all areas of the terrain, since a steep slope can be flat, but the application may not see it that way. The entire idea here is to remove redundant data without sacrificing any visual quality. When done properly, up to 90% of the original vertices can be removed, which will greatly improve the performance of the system, allowing it to work on other tasks such as shadows, artificial intelligence, etc.
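One way to sketch the flatness test is to compare every height sample in a quad against the bilinear surface through the quad's four corners. This also treats a uniformly steep slope as reducible, which is exactly the generalization the text asks for. The grid representation and tolerance are assumptions.

```python
def quad_is_reducible(heights, tolerance=1e-6):
    """heights is an n x n grid of samples for one quad. The quad can
    collapse to its four corner vertices if every sample lies on the
    bilinear surface the corners define; a uniformly tilted plane
    passes, so 'flat' here really means 'no extra detail'."""
    n = len(heights)
    c00, c01 = heights[0][0], heights[0][-1]
    c10, c11 = heights[-1][0], heights[-1][-1]
    for i in range(n):
        u = i / (n - 1)
        for j in range(n):
            v = j / (n - 1)
            plane = ((1 - u) * ((1 - v) * c00 + v * c01) +
                     u * ((1 - v) * c10 + v * c11))
            if abs(heights[i][j] - plane) > tolerance:
                return False   # real detail here: keep the vertices
    return True                # redundant: four corners suffice
```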
Tessellation

Now with Direct3D 11, the programmer is able to alter how the graphics pipeline will tessellate a mesh before sending it to the output device. Simply put, tessellation is adding more vertices to an existing mesh to increase its visual detail. The easiest way to accomplish this task is to add new vertices at the midpoint of each existing triangle in the mesh. This will effectively double the detail of the area selected for tessellation: since there are now twice as many vertices, the mesh will allow for bumpier, more detailed surfaces. There are many other methods to do tessellation, and some of them are quite elegant, since they increase the detail seen by the end user but do not reduce system performance to a noticeable level.
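The per-triangle midpoint scheme described above can be sketched as follows. Note that inserting one vertex at each triangle's centroid splits that triangle into three, so the triangle count triples per pass; the exact vertex and triangle counts depend on the splitting scheme chosen, and all names here are hypothetical.

```python
def tessellate(vertices, triangles):
    """One pass of simple tessellation: add a vertex at the centroid of
    every triangle and fan it out into three smaller triangles."""
    new_vertices = list(vertices)
    new_triangles = []
    for a, b, c in triangles:
        # centroid of the three corner positions
        centroid = tuple((va + vb + vc) / 3.0
                         for va, vb, vc in zip(vertices[a],
                                               vertices[b],
                                               vertices[c]))
        m = len(new_vertices)          # index of the new centroid vertex
        new_vertices.append(centroid)
        new_triangles += [(a, b, m), (b, c, m), (c, a, m)]
    return new_vertices, new_triangles
```

Running one pass over a two-triangle quad (four vertices) yields six vertices and six triangles; the new vertices would then be displaced by the noise function to add surface detail.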
Tessellation is actually one of the first tasks to take place when rendering a set of vertices. The graphics device first receives all the data in the vertex shader stage. After this, it sends the vertices to the hull shader and domain shader. These two shaders are where tessellation takes place, if the programmer decides to utilize this powerful feature of Direct3D 11. After that, the new sets of vertices are sent to the geometry shader to add or remove any parts of the quadtree dependent on the current state of the application. Finally, the data is rasterized and sent to the pixel shader for coloring each individual pixel on the screen. Rasterization is the process of taking a three-dimensional representation and placing it on a two-dimensional screen. The pixel shader stage is where further effects take place, such as lighting and reflections. Performing tessellation so early in the graphics pipeline allows the geometry shader to add to what the graphics device can already do very quickly if needed, or to remove vertices when a flat area is being rendered and only a few vertices are needed to view the scene.
Conclusion

Procedural graphics may not be what designers are looking for when creating a game or some other sort of graphical representation, since it can be very difficult to create an algorithm for premade art such as cities. However, learning to use procedural methods for very large worlds will allow future game designers to render these virtual worlds with extreme detail, without needing to run them on some sort of supercomputer with multiple graphics devices.
By running through the methods stated in this paper, a large-scale world can be rendered with relative ease on a lesser system. First, view frustum culling is used on the CPU to determine which nodes of the mesh are even sent to the graphics device for rendering, and at what level of detail each of those nodes should be rendered. This removes large amounts of data right from the start. Next, a noise algorithm is used in the graphics device's shaders to generate new vertices or alter the heights of existing vertices, producing a realistic terrain. After the generation of the terrain, tessellation is applied to increase detail in areas desired by the application. Now that a large mesh exists, the graphics device can remove any unwanted vertices that are not adding to what is currently being viewed. Finally, textures can be morphed to cover up any level-of-detail breaks, preventing the end user from ever noticing that the distance has very little detail.
Using these methods has been proven to result in very efficient and highly detailed pieces of art. Games programmers today have been utilizing procedural generation more and more, since they wish to keep their games on single DVDs. This means compressing data, or finding some other way to store that same piece of data. Storing algorithms takes up almost no space, and they are able to generate the same visually stunning worlds at half the graphics device power that would otherwise be needed.
References

Schneider, J., & Westermann, R. (2006). GPU-friendly high-quality terrain rendering. Computer Graphics and Visualization Group, Technische Universität München, Munich, Germany.

Yusov, E., & Turlapov, V. (2007). GPU-optimized efficient quadtree-based progressive multiresolution model for interactive large-scale terrain rendering. Computational Mathematics and Cybernetics, Nizhny Novgorod State University, Nizhny Novgorod, Russia.

Perlin, K. (2001). Media Research Laboratory, Dept. of Computer Science, New York University, New York, NY.

Verts, W. T., & Hill, Jr., F. S. (1989). COINS Department and ECE Department, University of Massachusetts, Amherst, MA.

Olsen, J. (2004). Realtime procedural terrain generation. Department of Mathematics and Computer Science (IMADA), University of Southern Denmark.

Bernhardt, A., Maximo, A., Velho, L., Hnaidi, H., & Cani, M. (2011). Real-time terrain modeling using CPU-GPU coupled computation. INRIA, Grenoble Univ., Univ. Lyon 1, IMPA.