2 Dec 2013

Making a game with Molehill: Zombie Tycoon

Jean-Philippe Auclair, Lead R&D Software Architect

Luc Beaulieu, CTO

Frima Studio

Session Overview


State of Flash


Molehill’s API presentation


Digging deeper into Molehill

State of Flash


Is Flash Dead?


FB: Top 10 = 250M MAU


Desktops: Flash 10 installed on 99%+


Smartphones: Flash/AIR 200M+, 100 devices


Streaming: 120 petabytes per month



Advances in Flash for 3D games


AS3


10.1, 10.2 …


Molehill

Molehill’s API Presentation


Pros:


GPU Accelerated API


Relies on DirectX 9 and OpenGL ES 2.0


Native software fallback



Cons:


No support for point sprites, branching, MRT, or depth buffer


No CPU threading support


Native software fallback


This Page Intentionally Left Green

Digging deeper into Molehill


Assuming a basic knowledge of 3D development terminology



Display Layers


Model/Animation File Format


Character Animation: Matrix vs Quaternion


Texturing


Optimizing the Particle System


Fast Lights & Shadows


CPU Post-Processing effects


Profiling & Debugging tools



Bonus!


The math explaining all the numbers I’m going to talk about


Cheat sheets


Display Layers

Frima 3D File Format


Many 3D engines for Flash try to support multiple input formats


…Or support only a generic format such as Collada XML



Using a format optimized for 3D games made in Flash


Small File Size


Small Memory footprint


No processing required




[Chart: Model & Animation file processing on a low-end computer. Time to process (ms): Collada XML 5250, Frima Binary Format 15]
Frima 3D File Format

Collada XML

3DS Max Scene

Max Script Exporter

Build Tool

Export pipeline

Frima 3D File Format

Model / Animation

Build Tool

Game Object

Serialize (AMF)

Compress

Game File

Export pipeline

Add To Scene

Frima 3D File Format

Game Object

Uncompress

Unserialize

Game File

In-Game usage
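
The two pipelines above (serialize and compress at build time, uncompress and unserialize in game) can be sketched as a round trip. This is only an illustration in Python, not the Frima build tool: json stands in for AMF serialization because Python's standard library has no AMF codec, zlib stands in for the compression step, and the game_object fields are hypothetical.

```python
import json
import zlib


def export_game_file(game_object: dict) -> bytes:
    """Build-tool side: serialize the game object, then compress it."""
    # json is a stand-in for AMF here (assumption made for this sketch).
    serialized = json.dumps(game_object, separators=(",", ":")).encode("utf-8")
    return zlib.compress(serialized, 9)


def load_game_file(game_file: bytes) -> dict:
    """In-game side: uncompress, then unserialize back into a game object."""
    return json.loads(zlib.decompress(game_file).decode("utf-8"))


# Hypothetical model data; the field names are made up for this example.
game_object = {
    "name": "zombie",
    "bones": 24,
    "vertices": [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0] * 50,
}

game_file = export_game_file(game_object)
restored = load_game_file(game_file)
```

Because the build tool does the expensive conversion once, the game only pays for the uncompress and unserialize steps at load time.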

Zombie Re-Animation


Techniques


Matrix linear blending


DualQuaternion linear blending


Molehill Constraint


Vertex shader constants limit: 128 Float4


Zombie: 24 bones

Animation techniques


Matrix linear blending can cause loss of volume when joints are twisted or extremely bent


When using matrices, each bone takes 3 constants


Maximum number of bones is 40


When using DualQuats, each bone takes only 2 constants


Maximum number of bones is 60


Matrix (left) / Dual Quaternion (right)
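
To make the "2 constants per bone" point concrete, here is a minimal CPU-side sketch of dual quaternion linear blending in Python. It is not the Zombie Tycoon shader: the function names are hypothetical, quaternions are stored as (w, x, y, z), and the blend is the usual weighted sum followed by normalization. A dual quaternion is two quaternions (8 floats, so 2 Float4), while a 3x4 bone matrix is 12 floats (3 Float4).

```python
import math


def q_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def q_conj(q):
    """Conjugate of a quaternion (inverse rotation for unit quaternions)."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def make_dual_quat(rotation, translation):
    """Pack a unit rotation quaternion and a translation into (real, dual)."""
    tx, ty, tz = translation
    dual = tuple(0.5 * c for c in q_mul((0.0, tx, ty, tz), rotation))
    return (rotation, dual)


def blend_dual_quats(dual_quats, weights):
    """Linear blending: weighted sum of the bones, then normalization."""
    pivot = dual_quats[0][0]
    real = [0.0] * 4
    dual = [0.0] * 4
    for (r, d), w in zip(dual_quats, weights):
        # q and -q encode the same rotation; keep all bones in one hemisphere.
        if sum(p * c for p, c in zip(pivot, r)) < 0.0:
            w = -w
        for i in range(4):
            real[i] += w * r[i]
            dual[i] += w * d[i]
    norm = math.sqrt(sum(c * c for c in real))
    return (tuple(c / norm for c in real), tuple(c / norm for c in dual))


def transform_point(dq, point):
    """Rotate the point by the real part, then add the encoded translation."""
    real, dual = dq
    rotated = q_mul(q_mul(real, (0.0, *point)), q_conj(real))
    t = q_mul(dual, q_conj(real))
    return tuple(rotated[i] + 2.0 * t[i] for i in (1, 2, 3))


# A vertex weighted 50/50 between a bone moved 2 units on X and a fixed bone.
identity = (1.0, 0.0, 0.0, 0.0)
bone_a = make_dual_quat(identity, (2.0, 0.0, 0.0))
bone_b = make_dual_quat(identity, (0.0, 0.0, 0.0))
skinned = transform_point(blend_dual_quats([bone_a, bone_b], [0.5, 0.5]), (0.0, 0.0, 0.0))
```

In this example the vertex ends up halfway between the two bones, at x = 1.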

Transitions & interpolation




[Chart: VertexShader constants required for animating a character (24 bones), out of 128 available: matrix 72, DualQuaternion 48]

[Chart: constants for two animations at once (Anim1 + Anim2), out of 128 available: matrix 72 + 72 (too much), DualQuaternion 48 + 48]


Animation transitions require two sets of bones


Idle blending to walk


Same thing for frame interpolation (e.g. bullet-time animation)



File size? Performance?




[Chart: Animation file size (k): matrix 50, DualQuaternion 60]

[Chart: Vertex Shader processing time: matrix 100%, DualQuaternion 130%]

[Chart: VertexShader assembler instructions for animation processing: matrix 54, DualQuaternion 136]

Texturing in Molehill


The first version of the engine used only PNGs


Adobe Texture Format (ATF)


Textures are kept compressed in video memory


Native support for multi-device publishing


One file containing 3 encodings: DXT1, ETC1 and PVRTC


1.3x bigger than the original PNG


Contains the mipmaps of the texture


Does not support transparency

Texturing in Molehill


Transparency


Use PNGs with indexed color


Sample an “alpha mask texture” in the pixel shader


ATF: Avatar = opaque

PNG: Fence = transparent

Texturing in Molehill


Many effects can use ATF with the right blend modes


No need for transparency

Splatter = Multiply

Fire = Additive
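
A small sketch of why these two blend modes need no alpha channel. It is written in Python on RGB tuples in the 0 to 1 range and uses the generic blend equation (source * source factor + destination * destination factor); the function names and sample colors are made up for the example. With multiply, white texels leave the background untouched; with additive, black texels do.

```python
def blend_pixel(src, dst, src_factor, dst_factor):
    """Generic blend: src * src_factor + dst * dst_factor, clamped to [0, 1]."""
    return tuple(
        min(1.0, max(0.0, s * sf + d * df))
        for s, d, sf, df in zip(src, dst, src_factor, dst_factor)
    )


def multiply(src, dst):
    """Multiply blend: the source factor is the destination color."""
    return blend_pixel(src, dst, src_factor=dst, dst_factor=(0.0, 0.0, 0.0))


def additive(src, dst):
    """Additive blend: both factors are one, so the colors are summed."""
    return blend_pixel(src, dst, src_factor=(1.0, 1.0, 1.0), dst_factor=(1.0, 1.0, 1.0))


wall = (0.8, 0.8, 0.8)

# Splatter texture: white means "nothing here", dark red is the splatter.
untouched_by_splatter = multiply((1.0, 1.0, 1.0), wall)
splattered = multiply((0.5, 0.0, 0.0), wall)

# Fire texture: black means "nothing here", orange is the flame.
untouched_by_fire = additive((0.0, 0.0, 0.0), wall)
burning = additive((1.0, 0.5, 0.0), wall)
```

So an opaque ATF texture with a white (multiply) or black (additive) background behaves as if it had transparent edges.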

Particle System


Using a divided workload (CPU/GPU) for better performance


Each particle property update is computed on the CPU at each frame


Alpha, Color, Direction, Rotation, frame (if SpriteSheet), etc.


On the GPU


Applying these properties


Expanding billboard vertices to face the screen
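
The billboard expansion step can be illustrated with the arithmetic a vertex shader would apply to each corner. This is a Python sketch under assumptions, not the game's shader code: each particle is a quad, every corner carries a (-1 or +1) sign pair, and the camera's right and up vectors are available; all names are hypothetical.

```python
def expand_billboard(center, size, cam_right, cam_up):
    """Return the 4 corners of a quad centered on `center` that faces the camera."""
    half = size * 0.5
    corners = []
    # Corner signs in counter-clockwise order: bottom-left, bottom-right,
    # top-right, top-left.
    for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        corners.append(tuple(
            c + sx * half * r + sy * half * u
            for c, r, u in zip(center, cam_right, cam_up)
        ))
    return corners


# Camera axes for a camera looking straight down the Z axis.
quad = expand_billboard(
    center=(0.0, 0.0, 5.0),
    size=2.0,
    cam_right=(1.0, 0.0, 0.0),
    cam_up=(0.0, 1.0, 0.0),
)
```

Because the offsets follow the camera axes, the quad stays screen-facing wherever the camera looks, and the CPU only has to update the per-particle properties.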

Particle System: Optimization


How many particles?


Due to the VertexBuffer and IndexBuffer limits, in Zombie Tycoon we were limited to around 16383 particles per draw call


Using Fast ByteArray (also known as Alchemy memory or DomainMemory)


Using Azoth, property updates were 10 times faster


Batching draw calls using the same texture


Using a 100% GPU particle system


It’s expensive on the GPU


Supports only linear transformations


Zero CPU required

Particle System

Lights & shadows


Techniques


ShadowMap & LightMap


Dynamic lighting


Fake volumetric lights


Fake projected shadows

Lights & shadows


ShadowMap & LightMap


We used two textures, a “multiplied” ShadowMap and an “additive” LightMap



Diffuse * ShadowMap + LightMap = Composite
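
The composite formula above, applied to a single pixel. This is a minimal Python sketch on RGB tuples in the 0 to 1 range; the clamp to 1.0 and the sample values are assumptions made for the example.

```python
def composite(diffuse, shadow, light):
    """Diffuse * ShadowMap + LightMap, clamped to the displayable range."""
    return tuple(min(1.0, d * s + l) for d, s, l in zip(diffuse, shadow, light))


diffuse = (0.5, 0.5, 0.5)

# White shadow texel and black light texel: the diffuse color is unchanged.
unlit = composite(diffuse, (1.0, 1.0, 1.0), (0.0, 0.0, 0.0))

# Mid-gray shadow texel darkens the pixel.
shadowed = composite(diffuse, (0.5, 0.5, 0.5), (0.0, 0.0, 0.0))

# A bright light texel pushes the pixel to the clamp.
lit = composite(diffuse, (1.0, 1.0, 1.0), (0.9, 0.9, 0.9))
```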

Lights & shadows


Dynamic lighting


Lighting requires an expensive pixel shader, currently limited to 256 instructions


Zombie Tycoon supports up to 7-9 lights (spot or point) per object.

Lights & shadows


Pixel shader assembly code


Per light, without normal/specular mapping.

Lights & shadows


Fake volumetric lights


Using a few billboard particles, it’s easy to fake nice, lightweight volumetric lighting


All objects sample the shadow and light maps, and since the light particles are “additive”, an object behind the lights will look brighter

Lights & shadows


Fake projected shadows


We created a particle of a gradient black spot aligned to the ground


Orientation and scale of the particle depend on light position and intensity

CPU Post-Processing


Possibility of reading the BackBuffer


Strongly recommended not to use readback


Fast pipeline for data from system memory to video memory


VERY slow pipeline from video memory to system memory



Effects: Bloom, Blur, Depth of Field, etc.

Motion Blur

CPU Post-Processing

Bloom post-processing

Normal



Profiling and Debugging tools (CPU)


FlashDevelop (O.S.S.)


Most of the production uses FlashDevelop


Now that it has a profiler and a debugger, it’s very easy to work with


Profiling and Debugging tools (CPU)


Adobe Flash Builder Profiler


Profile Function calls


Profile Memory allocation


Profiling and Debugging tools (CPU)


FlashPreloadProfiler (O.S.S.)


Profile Function calls


Profile Memory allocation


Profile Loaders status


Can be used in Debug/Release & browser/Projector


Profiling and Debugging tools (GPU)


PIX for Windows


List of API calls


Shader assembly code


Pixel debugger


Texture viewer

Profiling and Debugging tools (GPU)


Intel® Graphics Performance Analyzers (GPA)


Render in wireframe


Profile vertex and pixel shader performance


Visualize overdraw and draw call sequence


Save a frame, and make real-time experiments


Identification of bottlenecks


Sources & References


Geometric Skinning with Approximate Dual Quaternion Blending

http://isg.cs.tcd.ie/kavanl/papers/sdq-tog08.pdf


Intel® Graphics Performance Analyzers (GPA)

http://software.intel.com/en-us/articles/intel-gpa/


PIX for Windows

http://msdn.microsoft.com/en-us/library/ee417072(v=VS.85).aspx

Contact


Luc Beaulieu

luc@frimastudio.com


Jean-Philippe Auclair

jpauclair@frimastudio.com

@jpauclair

jpauclair.net







TD-Matt blog

http://td-matt.blogspot.com/


FlashPreloadProfiler

http://jpauclair.net/flashpreloadprofiler/


Azoth

http://www.buraks.com/azoth/


Flash in Facebook

AppData.com


Flash Stats

http://adobe.ly/rwXU

http://adobe.ly/gnlUEH



Bonus Slide: The maths!


Character animation:


Matrix linear blending:


128 Float4 VertexConstant - WorldMatrix - ViewProj matrix = 120 Float4


120 Float4 / 3 Float4 per bone = 40 bones in the constants


Bullet time and transitions require two sets of bones: 40/2 = 20 bones per character max


DualQuaternion linear blending:


128 Float4 VertexConstant - WorldMatrix - ViewProj matrix = 120 Float4


120 Float4 / 2 Float4 per bone = 60 bones in the constants


Bullet time and transitions require two sets of bones: 60/2 = 30 bones per character max
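
The constant budget above can be checked with a few lines of Python. The arithmetic follows the slide; the function and parameter names are hypothetical, and the 4 Float4 per reserved matrix is assumed from the 128 - 120 = 8 Float4 taken by the two matrices.

```python
TOTAL_VERTEX_CONSTANTS = 128  # Float4 registers available to the vertex shader
FLOAT4_PER_MATRIX = 4         # assumed: a 4x4 matrix occupies 4 Float4


def max_bones(float4_per_bone, simultaneous_animations=1, reserved_matrices=2):
    """Bones that fit once the WorldMatrix and ViewProj matrix are reserved."""
    available = TOTAL_VERTEX_CONSTANTS - reserved_matrices * FLOAT4_PER_MATRIX
    return available // float4_per_bone // simultaneous_animations


ZOMBIE_BONES = 24

matrix_single = max_bones(3)                                # 40
matrix_blended = max_bones(3, simultaneous_animations=2)    # 20
dualquat_single = max_bones(2)                              # 60
dualquat_blended = max_bones(2, simultaneous_animations=2)  # 30

# The 24-bone zombie only fits two simultaneous bone sets with dual quaternions.
zombie_fits_with_matrices = ZOMBIE_BONES <= matrix_blended
zombie_fits_with_dualquats = ZOMBIE_BONES <= dualquat_blended
```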


Max Particle Count


The VertexBuffer is limited to 65536 vertices, the IndexBuffer is limited to 983040 indices of type SHORT


In theory, you could have up to 327680 triangles in one draw call


In practice, with no vertex re-use between particles and using quads (4 vertices each), the vertex limit is the bottleneck: 65536 / 4 gives around 16383 particles max per draw call
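
The particle limit can be worked out from the two buffer limits. This Python sketch assumes each particle is an independent quad using 4 vertices and 6 indices (two triangles); the function name is hypothetical. With exactly 65536 vertices the division gives 16384, one more than the slide's 16383, which is the result when at most 65535 vertices are usable.

```python
VERTICES_PER_QUAD = 4
INDICES_PER_QUAD = 6  # two triangles, three indices each


def max_quads_per_draw_call(max_vertices, max_indices):
    """Quads that fit in one draw call, limited by whichever buffer fills first."""
    by_vertices = max_vertices // VERTICES_PER_QUAD
    by_indices = max_indices // INDICES_PER_QUAD
    return min(by_vertices, by_indices)


max_triangles = 983040 // 3                             # theoretical triangle count
quads_65536 = max_quads_per_draw_call(65536, 983040)    # limits as stated on the slide
quads_65535 = max_quads_per_draw_call(65535, 983040)    # if only 65535 vertices are usable
index_bound = 983040 // INDICES_PER_QUAD                # far above the vertex bound
```

Either way the vertex buffer, not the index buffer, is what caps the number of particles per draw call.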


Lighting


With the PixelShader limit of 256 instructions, we were able to fit around 7 to 9 dynamic lights per object (point or spot light)


Achievement: Geek


Cheat Sheet

Achievement: Super Geek!






Thank You!

Questions?