Richard Thomson DAZ 3D www.daz3d.com

2 Dec 2013

Direct3D 11


CTP in November 2008 DirectX SDK



Vista (and beyond) only, not on XP



Evolution of Direct3D 10



Compatible with D3D 10 cards

Evolution of Direct3D


Direct3D 9


Stable, been around for a while


Last version to be deployed on Win XP


Direct3D 10


First Vista-only version


Big change from D3D 9


Direct3D 10.1


Incremental tweak to D3D 10

Direct3D 10/10.1/11 vs. 9


Enumeration factored out to DXGI


Same DXGI used for 10, 10.1 and 11


Divide render/texture states into chunks


Chunks of state are immutable objects


“Device state” consists of the set of assigned state chunks


Introduces new shader stages beyond vertex and pixel shaders


Tighter API specification => no CAPS
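
The idea of immutable state chunks can be modelled on the CPU. The following is a minimal C++ sketch, not the Direct3D API: the struct names, `createState`, and `setRasterizer` are hypothetical stand-ins chosen for illustration. It shows state being created once and the "device state" being nothing more than the set of chunks currently assigned.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for state chunks; not the real Direct3D types.
struct BlendState        { bool blendEnable; };
struct RasterizerState   { bool wireframe; bool cullBack; };
struct DepthStencilState { bool depthTest; bool depthWrite; };

// A chunk is built once from a description and is read-only afterwards.
template <typename Desc>
std::shared_ptr<const Desc> createState(const Desc& desc) {
    return std::make_shared<Desc>(desc);
}

// "Device state" is just the set of chunks currently assigned.
struct DeviceState {
    std::shared_ptr<const BlendState>        blend;
    std::shared_ptr<const RasterizerState>   rasterizer;
    std::shared_ptr<const DepthStencilState> depthStencil;
};

// Changing state means swapping which chunk is assigned, never editing one.
void setRasterizer(DeviceState& device, std::shared_ptr<const RasterizerState> rs) {
    assert(rs && "a state chunk must exist before it can be assigned");
    device.rasterizer = std::move(rs);
}
```

Because chunks never change after creation, any validation can happen once at creation time, and switching state at draw time reduces to a pointer swap.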

Direct3D 11 Focus


Scalability and performance



Improving the development experience



Extending the reach of the GPU

Direct3D 11 New Features


Tessellation


Compute Shader


Multithreading


Shader Subroutines


Improved Texture Compression


Other Features

Tessellation

Direct3D 10 pipeline plus three new stages for tessellation

Input Assembler → Vertex Shader → Hull Shader → Tessellator → Domain Shader → Geometry Shader (→ Stream Output) → Rasterizer → Pixel Shader → Output Merger

Hull Shader

Hull Shader → Tessellator → Domain Shader

HS input: patch control points

HS output: patch control points after basis conversion

HS output: TessFactors (how much to tessellate), fixed tessellator mode declarations

One Hull Shader invocation per patch

Hull Shader Syntax

    [patchsize(12)]
    [patchconstantfunc(MyPatchConstantFunc)]
    MyOutPoint main( uint Id : SV_ControlPointID,
                     InputPatch<MyInPoint, 12> InPts )
    {
        MyOutPoint result;

        result = TransformControlPoint( InPts[Id] );

        return result;
    }

Tessellator

Hull Shader → Tessellator → Domain Shader

TS input: TessFactors (how much to tessellate), fixed tessellator mode declarations

TS output: U V {W} domain points

TS output: topology (to primitive assembly)

Note: Tessellator does not see control points

Tessellator operates per patch
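
To make the tessellator's outputs concrete, here is a simplified C++ sketch. It assumes a quad domain and a single integer tessellation factor for the whole patch; the fixed-function stage is more general than this, and the names `DomainPoint`, `Triangle`, and `tessellateQuad` are hypothetical. As on the slide, the function sees only the factor, never the control points.

```cpp
#include <cassert>
#include <vector>

struct DomainPoint { float u, v; };        // a point in the unit-square domain
struct Triangle    { unsigned a, b, c; };  // indices into the point list

struct TessellatedQuad {
    std::vector<DomainPoint> points;       // "U V domain points"
    std::vector<Triangle>    triangles;    // "topology"
};

// Uniformly subdivide the unit square into factor x factor cells.
TessellatedQuad tessellateQuad(unsigned factor) {
    assert(factor >= 1);
    TessellatedQuad out;
    const unsigned pointsPerRow = factor + 1;

    // Emit a regular grid of (u, v) domain points.
    for (unsigned j = 0; j < pointsPerRow; ++j)
        for (unsigned i = 0; i < pointsPerRow; ++i)
            out.points.push_back({ float(i) / float(factor),
                                   float(j) / float(factor) });

    // Emit two triangles per grid cell.
    for (unsigned j = 0; j < factor; ++j)
        for (unsigned i = 0; i < factor; ++i) {
            const unsigned p0 = j * pointsPerRow + i;
            const unsigned p1 = p0 + 1;
            const unsigned p2 = p0 + pointsPerRow;
            const unsigned p3 = p2 + 1;
            out.triangles.push_back({ p0, p1, p2 });
            out.triangles.push_back({ p1, p3, p2 });
        }
    return out;
}
```

A factor of 4 yields 25 domain points and 32 triangles; each point would then be handed to one domain shader invocation.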

Domain Shader

Hull Shader → Tessellator → Domain Shader

DS input: U V {W} domain points

DS input: control points, TessFactors

DS output: one vertex

One Domain Shader invocation per point from Tessellator

Domain Shader Syntax

    void main( out MyDSOutput result,
               float2 myInputUV : SV_DomainPoint,
               MyDSInput DSInputs,
               OutputPatch<MyOutPoint, 12> ControlPts,
               MyTessFactors tessFactors )
    {
        result.Position = EvaluateSurfaceUV( ControlPts, myInputUV );
    }
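
The slide leaves `EvaluateSurfaceUV` undefined. One plausible reading, sketched here in C++ on the CPU, is evaluation of a bicubic Bezier patch at a domain point. Note the assumptions: this sketch uses a 16-point (4×4) patch, whereas the slide's example declares 12 control points, and the names `Vec3`, `bernstein3`, and `evaluateBicubicBezier` are hypothetical.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

// Cubic Bernstein basis functions evaluated at t in [0, 1].
std::array<float, 4> bernstein3(float t) {
    const float s = 1.0f - t;
    return { s * s * s, 3.0f * s * s * t, 3.0f * s * t * t, t * t * t };
}

// Evaluate a 4x4 Bezier patch (row-major control points) at (u, v).
Vec3 evaluateBicubicBezier(const std::array<Vec3, 16>& controlPts,
                           float u, float v) {
    assert(u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f);
    const std::array<float, 4> bu = bernstein3(u);
    const std::array<float, 4> bv = bernstein3(v);

    Vec3 p{ 0.0f, 0.0f, 0.0f };
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col) {
            const float w = bv[row] * bu[col];       // tensor-product weight
            const Vec3& c = controlPts[row * 4 + col];
            p.x += w * c.x;
            p.y += w * c.y;
            p.z += w * c.z;
        }
    return p;
}
```

A domain shader doing this work would run once per domain point, which is why the per-point cost of the evaluation matters more than the per-patch setup done in the hull shader.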


Single Pass Example

vertex shader: animate/skin control points
    in: patch control points (Sub-D patch)
    out: transformed control points

hull shader: transform basis, determine how much to tessellate
    out: control points in Bezier patch, Tess Factors

tessellator: Tessellate!
    out: U V {W} domain points

domain shader: evaluate surface including displacement
    in: control points in Bezier patch, U V {W} domain points, displacement map

Current Authoring Pipeline

(Rocket Frog taken from Loop & Schaefer, “Approximating Catmull-Clark Subdivision Surfaces with Bicubic Patches”)

Sub-D Modeling

Animation

Displacement Map

Polygon Mesh

Generate LODs

New Authoring Pipeline

(Rocket Frog taken from Loop & Schaefer, “Approximating Catmull-Clark Subdivision Surfaces with Bicubic Patches”)

Sub-D Modeling

Animation

Displacement Map

Optimally Tessellated Mesh

GPU

Tessellation Summary


Helps us get closer to eliminating “pointy heads”


Scales visual quality across PC hardware configurations

Supports performance increases

Coarse model = compression, faster I/O to GPU

Rendering tailored to each end user’s hardware

Better cross-platform (Windows + Xbox 360) development experience

Xbox 360 has a subset of D3D11’s tessellation

Parity = ease of cross-platform development


Extra features = innovation for Windows gaming


Render content as the artist created it!


More on Tessellation


GameFest 2008 Slides and Audio

“Direct3D 11 Tessellation”

Kev Gee, Microsoft

“Advanced Topics in GPU Tessellation”

Natasha Tatarchuk, AMD/ATI

“Water-Tight, Textured, Displaced Subdivision Surface Tessellation Using Direct3D 11”

Ignacio Castano, NVIDIA

General Purpose GPU


Data Parallel Computing


GPU performance continues to grow


Many applications scale well to massive parallelism without tricky code changes

Direct3D is the API for talking to GPU

How do we expand Direct3D to GPGPU?

Compute Shader

Direct3D 10 pipeline plus three new stages for tessellation plus Compute Shader

Input Assembler → Vertex Shader → Hull Shader → Tessellator → Domain Shader → Geometry Shader (→ Stream Output) → Rasterizer → Pixel Shader → Output Merger

Compute Shader

Data Structure

Integrated with Direct3D


Fully supports all Direct3D resources


Targets graphics/media data types


Evolution of DirectX HLSL


Graphics pipeline updated to emit general data structures…

…which can then be manipulated by compute shader

And then rendered by Direct3D again


Target Applications


Image/Post processing:


Image Reduction



Image Histogram


Image Convolution


Image FFT


A-Buffer/OIT

Ray-tracing, radiosity, etc.


Physics


AI

Computing a Histogram

    Histogram()
    {
        shared int Histograms[16][256];               // array of 16 histograms

        float3 vPixel     = load( sampler, sv_ThreadID );
        float  fLuminance = dot( vPixel, LUM_VECTOR );
        int    iBin       = fLuminance * 255.0f;      // compute bin to increment
        int    iHist      = sv_ThreadIDInGroup % 16;  // use thread index
        Histograms[iHist][iBin] += 1;                 // update bin

        // enable all threads in group to complete
        SynchronizeThreadGroup();

Computing a Histogram 2

        // Write register histograms out to memory:
        iBin = sv_ThreadIDInGroup.x;
        if (iBin < 256)   // one thread of the group per bin
        {
            for (iHist = 0; iHist < 16; iHist++)
            {
                int2 destAddr = int2( iHist, iBin );
                OutputResource.add( destAddr,
                                    Histograms[iHist][iBin] );  // atomic
            }
        }
    }
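
The two phases of the histogram pseudocode can be checked on the CPU. The following C++ sketch is a sequential model of a single thread group, not a GPU implementation: each loop iteration plays the role of one thread, and the names `histogramOneGroup`, `kSubHistograms`, and `kBins` are hypothetical. Unlike the pseudocode, which writes the 16 sub-histograms to separate output addresses, this sketch assumes the final step sums them per bin. Because it runs sequentially it needs no barrier or atomics; on a GPU, threads that share a sub-histogram would.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kSubHistograms = 16;   // matches Histograms[16][256]
constexpr int kBins          = 256;
using Histogram = std::array<std::uint32_t, kBins>;

// luminance holds one value in [0, 1] per simulated thread of the group.
Histogram histogramOneGroup(const std::vector<float>& luminance) {
    // Stand-in for the group-shared storage, zero-initialised.
    std::array<Histogram, kSubHistograms> shared{};

    // Phase 1: each "thread" bins its pixel into the sub-histogram
    // selected by its index within the group.
    for (std::size_t tid = 0; tid < luminance.size(); ++tid) {
        assert(luminance[tid] >= 0.0f && luminance[tid] <= 1.0f);
        const int bin  = static_cast<int>(luminance[tid] * 255.0f);
        const int hist = static_cast<int>(tid % kSubHistograms);
        shared[hist][bin] += 1;
    }

    // (On a GPU, a group-wide barrier would go here.)

    // Phase 2: one "thread" per bin folds the 16 sub-histograms together.
    Histogram out{};
    for (int bin = 0; bin < kBins; ++bin)
        for (int hist = 0; hist < kSubHistograms; ++hist)
            out[bin] += shared[hist][bin];
    return out;
}
```

Splitting the counts across 16 sub-histograms is what reduces contention in the parallel version: fewer threads compete for any single counter during phase 1.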

Compute Shader Summary

Enables much more general algorithms

Transparent parallel processing model

Full cross-vendor support

Broadest possible installed base

GameFest 2008:

“Direct3D 11 Compute Shader: More Generality for Advanced Techniques”

Chas Boyd, Microsoft

Multithreading


Enables distribution across threads of


Application code


Runtime


Driver


Device: free threaded resource creation


Immediate Context: your single primary device for state & draws

Deferred Contexts: your per-thread devices for state & draws

Display Lists: Recorded sequence of graphics commands


Requires a driver update
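
The relationship between deferred contexts, display lists, and the immediate context can be sketched as a record-then-replay model. This C++ example is an illustration only: the classes and methods are hypothetical and much simpler than the Direct3D interfaces (in the released Direct3D 11 API the corresponding calls are `CreateDeferredContext`, `FinishCommandList`, and `ExecuteCommandList`). Commands here only append to a log so the replay order is visible.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A recorded command; here it only appends to a submission log.
using Command     = std::function<void(std::vector<std::string>&)>;
using CommandList = std::vector<Command>;   // the "display list"

// One per worker thread: records state changes and draws, executes nothing.
class DeferredContext {
public:
    void setState(const std::string& name) { record("state:" + name); }
    void draw(const std::string& what)     { record("draw:" + what); }

    // Close the recording and hand back the finished list.
    CommandList finish() {
        CommandList done = std::move(pending_);
        pending_.clear();
        return done;
    }

private:
    void record(std::string entry) {
        pending_.push_back([entry](std::vector<std::string>& log) {
            log.push_back(entry);
        });
    }
    CommandList pending_;
};

// The single primary context: the only place commands are actually issued.
class ImmediateContext {
public:
    void execute(const CommandList& list) {
        for (const Command& cmd : list) {
            assert(cmd && "recorded command must be callable");
            cmd(submitted_);
        }
    }
    const std::vector<std::string>& submitted() const { return submitted_; }

private:
    std::vector<std::string> submitted_;
};
```

In a threaded application each worker would own its own `DeferredContext` and record in parallel; only the order of the `execute` calls on the immediate context determines the final submission order.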

Shader Subroutines


Details


Calls must be fast


Binding applies to all primitives in a Draw call


Binding operation must be fast


Need parameter passing mechanism


Need access to textures, samplers, etc.


Advantages


Reduce register usage in Über-shaders

Register allocation is not the worst case over all if statements


Allows specialization of subroutines
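
A CPU analogy may help with the binding model. In this C++ sketch, which is an analogy rather than shader code, an interface implementation is chosen once per draw and then applies to every primitive, instead of each primitive walking a chain of `if` statements. The `Light` interface, its two implementations, and `draw` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// The "subroutine" slot: an interface with interchangeable implementations.
struct Light {
    virtual ~Light() = default;
    virtual float shade(float nDotL) const = 0;
};

// Specialised implementation: clamped diffuse term.
struct DiffuseLight : Light {
    float shade(float nDotL) const override {
        return nDotL > 0.0f ? nDotL : 0.0f;
    }
};

// Specialised implementation: constant ambient term.
struct AmbientLight : Light {
    float shade(float) const override { return 0.2f; }
};

// The binding is made once per draw and applies to all of its primitives.
std::vector<float> draw(const Light& bound,
                        const std::vector<float>& nDotLPerPrimitive) {
    assert(!nDotLPerPrimitive.empty());
    std::vector<float> shaded;
    shaded.reserve(nDotLPerPrimitive.size());
    for (float nDotL : nDotLPerPrimitive)
        shaded.push_back(bound.shade(nDotL));   // no per-primitive branching
    return shaded;
}
```

Each implementation only needs the resources for its own code path, which is the intuition behind the register-usage advantage listed above.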


Improved Texture Compression


Why?



Existing block palette interpolations too simple


Results often rife with blocking artifacts


No high dynamic range (HDR) support

New Texture Formats


BC6 (aka BC6H)


High dynamic range


6:1 compression (16 bpc RGB)


Targeting high (not lossless) visual quality



BC7


LDR with alpha


3:1 compression for RGB or 4:1 for RGBA


High visual quality
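
The quoted ratios follow from the block layout. BC6H and BC7 both store each 4×4 texel block in 16 bytes, that is, 8 bits per texel. The C++ sketch below works through the arithmetic; the function and constant names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr int kBlockBytes  = 16;   // one BC6H or BC7 block
constexpr int kBlockTexels = 16;   // 4 x 4 texels per block

// 16 bytes * 8 bits / 16 texels = 8 bits per texel.
constexpr int compressedBitsPerTexel() {
    return kBlockBytes * 8 / kBlockTexels;
}

// Ratio of the uncompressed texel size to the compressed texel size.
constexpr int compressionRatio(int channels, int bitsPerChannel) {
    return channels * bitsPerChannel / compressedBitsPerTexel();
}

static_assert(compressionRatio(3, 16) == 6, "BC6H: 16 bpc RGB gives 6:1");
static_assert(compressionRatio(3, 8)  == 3, "BC7: 8 bpc RGB gives 3:1");
static_assert(compressionRatio(4, 8)  == 4, "BC7: 8 bpc RGBA gives 4:1");

// Size of one mip level; dimensions are rounded up to whole blocks.
std::size_t compressedSizeBytes(unsigned width, unsigned height) {
    assert(width > 0 && height > 0);
    const std::size_t blocksWide = (width + 3) / 4;
    const std::size_t blocksHigh = (height + 3) / 4;
    return blocksWide * blocksHigh * kBlockBytes;
}
```

For example, a 1024×1024 level occupies 1 MiB in either format, against 6 MiB for 16 bpc RGB or 4 MiB for 8 bpc RGBA.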

Compression of New Formats


Block compression (unchanged)


Each block independent


Fixed compression ratio



Multiple block types (new)


Tailored to different types of content


Smooth gradients vs. noisy normal maps


Varied alpha vs. constant alpha



Decompression results must be bit-accurate with spec


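Comparison Results 1

[Figure: original vs. BC3 and original vs. BC7, each with absolute error]

Comparison Results 2

[Figure: original vs. BC3 and original vs. BC7, each with absolute error]

Comparison Results 3

[Figure: HDR original at given exposure vs. BC6 at given exposure, with absolute error]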

Other Features


Addressable Stream Out


Draw Indirect


Pull-model attribute eval

Improved Gather4

Min-LOD texture clamps

16K texture limits

Required 8-bit subtexel, submip filtering precision




Conservative oDepth


2 GB Resources


Geometry shader instance programming model

Optional double support

Read-only depth or stencil views

Thanks

Allison Klein

Senior Lead Program Manager

Direct3D

Microsoft


Chas. Boyd

Architect

Windows Desktop & Gaming Technology

Microsoft

Thank you to our Sponsors!