Computer Graphics Hardware: An Overview

skillfulwolverine, Software & Software Engineering

2 Dec 2013

Graphics System
Monitor
Input devices
CPU/Memory
GPU
Raster Graphics System

Raster: An array of picture elements

Based on raster-scan TV technology

The screen (and a picture) consists of discrete pixels, and
each pixel has a small display area
[Diagram: video controller, frame buffer addressed by (x, y), DAC]
Frame Buffer

Frame buffer: the memory to hold the pixel
properties (color, alpha, depth, stencil mask, etc)

Properties of a frame buffer that affect the graphics
performance:
Size: screen resolution

Depth: color level
1 bit/pixel: black and white
8 bits/pixel: 256 levels of gray or color palette index
24 bits/pixel: 16 million colors

Speed: refresh speed
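
The size, depth, and refresh properties above combine into simple arithmetic. The sketch below is only an illustration: the 1920x1080 resolution and 60 Hz refresh rate are assumed values, not figures from the slides, and the function names are hypothetical.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Memory needed to hold one width x height buffer, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8


def scanout_bytes_per_second(width, height, bits_per_pixel, refresh_hz):
    """Bytes that must be read every second just to refresh the screen."""
    return framebuffer_bytes(width, height, bits_per_pixel) * refresh_hz


if __name__ == "__main__":
    # Assumed display: 1920x1080 at 60 Hz (not from the slides).
    for bpp in (1, 8, 24):
        size = framebuffer_bytes(1920, 1080, bpp)
        rate = scanout_bytes_per_second(1920, 1080, bpp, 60)
        print(f"{bpp:2d} bits/pixel: {size} bytes per frame, {rate} bytes/s to refresh")
```

Only the color buffer is counted here; depth, stencil, and alpha planes would add to the total.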
A (way too) simple graphics system
Frame buffer can be part of the main memory
[Diagram: CPU, Main Memory (holding the frame buffer), and Scan Controller attached to the System bus]
Problem?
Dedicated memory
Video memory: on-board frame buffer, much faster to access
[Diagram: CPU and Main Memory on the System bus; Scan Controller with its own Frame buffer]
Graphics Accelerator
A dedicated processor for graphics processing
[Diagram: Graphics Processor with Graphics Memory/Frame buffer and Scan Controller; CPU and Main Memory on the System bus]
Graphics Bus Interface
PCI based technology
[Diagram: Graphics Processor with Graphics Memory/Frame buffer and Scan Controller, connected over PCIe (8 GB/s) to the System Bus alongside the CPU, Main Memory, and other peripherals]
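
To get a feel for what the 8 GB/s PCIe figure means, the sketch below estimates how long a transfer takes and how much data fits in one frame. The 256 MB payload and the 60 frames per second target are assumed values for illustration, and the function names are hypothetical.

```python
PCIE_BYTES_PER_S = 8e9  # PCIe bandwidth quoted on the slide (8 GB/s)


def transfer_seconds(num_bytes, bandwidth=PCIE_BYTES_PER_S):
    """Ideal time to move num_bytes across the bus, ignoring latency and overhead."""
    return num_bytes / bandwidth


def bytes_per_frame(frames_per_second, bandwidth=PCIE_BYTES_PER_S):
    """Upper bound on the data that can cross the bus during one frame."""
    return bandwidth / frames_per_second


if __name__ == "__main__":
    # Assumed workload: 256 MB of textures, 60 frames per second.
    print(f"256 MB upload: {transfer_seconds(256e6) * 1e3:.1f} ms")
    print(f"Budget per frame at 60 fps: {bytes_per_frame(60) / 1e6:.1f} MB")
```

These are best-case numbers; real transfers reach less than the peak rate.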
Graphics Accelerators
What do GPUs do?

Graphics processing units (GPUs) are massively parallel
processors

Process geometry/pixels and produce images to be displayed on
the screen

Can also be used to perform general purpose computation (via
CUDA/OpenGL)

Evolved from simple video scan controllers, to special
purpose processors that implement a simple pipeline with
fixed graphics functionality, to complex many-core
architectures that contain several deep parallel pipelines

The latest GPU (Kepler GK110) contains 15 x 192 = 2880 cores and
7.1 billion transistors

A graphics card can easily have more than 2GB of video
memory
CPUs vs. GPUs

2005

nVidia G80 GPU (2006)

128 streaming floating point processors @ 1.5 GHz

1.5 GB shared RAM with 86 GB/s bandwidth

500 Gflops on one chip (single precision)
nVidia G80 GPU
[Block diagram: Data Assembler; Setup / Rstr / ZCull; Vtx, Prim, and Frag Thread Issue; Thread Processor; arrays of SP and TF units with L1 caches; L2 caches and FB partitions]
Pipeline stages mapped onto the chip: Application, Vertex assembly, Vertex operations, Primitive assembly, Primitive operations, Rasterization, Fragment operations, Framebuffer
nVidia Fermi GPU (2009)
nVidia Kepler GK110 (2012)
CPU/GPU Performance Gap
Why are GPUs so fast?

Entertainment industry has driven
the economy of these chips

Males age 15-35 buy
$10B in video games / year

Moore’s Law ++

Simplified design (stream
processing)

Single-chip designs

Modern GPUs have more ALUs
A Specialized Processor

Very Efficient For

Fast Parallel Floating Point Processing

Single Instruction Multiple Data Operations

High Computation per Memory Access

Not As Efficient For

Double Precision

Logical Operations on Integer Data

Branching-Intensive Operations

Random Access, Memory-Intensive Operations
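
"High Computation per Memory Access" can be made concrete with a roofline-style estimate, a standard model that is not part of these slides. The sketch takes the G80 figures quoted earlier (about 500 Gflops and 86 GB/s) as assumed inputs, and uses SAXPY (y = a*x + y on 4-byte floats) as an assumed example of a memory-intensive kernel.

```python
PEAK_FLOPS = 500e9     # assumed single-precision peak, flops/s
MEM_BANDWIDTH = 86e9   # assumed memory bandwidth, bytes/s


def arithmetic_intensity(flops, bytes_moved):
    """Floating point operations performed per byte of memory traffic."""
    return flops / bytes_moved


def attainable_flops(intensity, peak=PEAK_FLOPS, bandwidth=MEM_BANDWIDTH):
    """Roofline bound: limited by compute or by memory, whichever is lower."""
    return min(peak, intensity * bandwidth)


# SAXPY per element: 2 flops (multiply, add); 12 bytes (read x, read y, write y).
saxpy_intensity = arithmetic_intensity(flops=2, bytes_moved=12)

if __name__ == "__main__":
    balance = PEAK_FLOPS / MEM_BANDWIDTH
    print(f"Flops per byte needed to reach peak: {balance:.1f}")
    print(f"SAXPY bound: {attainable_flops(saxpy_intensity) / 1e9:.1f} Gflops")
```

Under these assumptions a kernel needs several flops per byte fetched before the ALUs, rather than memory, become the limit, which is why memory-intensive operations appear in the "Not As Efficient For" list.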
The Rendering Pipeline

The process to generate two-dimensional images from given
virtual cameras and 3D objects

The pipeline stages implement various core graphics rendering
algorithms

Why should you know the pipeline?

Necessary for programming GPUs

Understand various graphics algorithms

Analyze performance bottlenecks

[Pipeline: host interface → vertex processing → triangle setup → pixel processing → memory interface]
The Rendering Pipeline

The basic construction: three conceptual stages

Each stage is a pipeline
and runs in parallel

Graphics performance is
determined by the slowest
stage

Modern graphics systems:

Software

Hardware

[Diagram: Application → Geometry → Rasterizer → Image]
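
The point that performance is determined by the slowest stage can be sketched in a few lines. The per-stage rates below are hypothetical numbers chosen for illustration, not measurements.

```python
def pipeline_throughput(stage_rates):
    """Steady-state rate of stages running in parallel: the slowest stage's rate."""
    return min(stage_rates.values())


def bottleneck(stage_rates):
    """Name of the stage that limits the whole pipeline."""
    return min(stage_rates, key=stage_rates.get)


# Hypothetical frames per second each stage could sustain on its own.
rates = {"application": 120.0, "geometry": 300.0, "rasterizer": 75.0}

if __name__ == "__main__":
    print(f"Overall: {pipeline_throughput(rates)} frames/s, limited by {bottleneck(rates)}")
```

Speeding up any stage other than the bottleneck leaves the overall rate unchanged.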
Host Interface

The host interface is the communication bridge
between the CPU and the GPU

It receives commands from the CPU and also pulls
geometry information from system memory

It outputs a stream of vertices in object space with
all their associated information (normals, texture
coordinates, per vertex color etc)
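
One way to picture the vertex stream is as a sequence of records, each carrying a position plus its attributes. The `Vertex` type and `vertex_stream` function below are hypothetical names for illustration; real hardware formats differ.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Vertex:
    position: Vec3  # object-space position
    normal: Vec3
    texcoord: Vec2
    color: Vec3     # per-vertex color


def vertex_stream(positions: Sequence[Vec3], normals: Sequence[Vec3],
                  texcoords: Sequence[Vec2], colors: Sequence[Vec3]) -> Iterator[Vertex]:
    """Yield one Vertex per index, pulling each attribute from its own array."""
    for p, n, t, c in zip(positions, normals, texcoords, colors):
        yield Vertex(p, n, t, c)


# A single triangle facing +z, with one color per vertex.
triangle = list(vertex_stream(
    positions=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    normals=[(0.0, 0.0, 1.0)] * 3,
    texcoords=[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
))
```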

[Pipeline: host interface → vertex processing → triangle setup → pixel processing → memory interface]
Vertex Processing

The vertex processing stage receives vertices from
the host interface in object space and outputs them
in screen space

This may be a simple linear transformation, or a
complex operation involving morphing effects

Normals, texcoords etc are also transformed

No new vertices are created in this stage, and no
vertices are discarded (input/output has 1:1
mapping)
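
The simple linear case can be sketched as a 4x4 matrix multiply followed by a perspective divide and a viewport mapping. The combined matrix `mvp`, the row-major layout, and the top-left screen origin are assumptions of this sketch, and clipping (needed when w is zero or negative) is left out.

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def to_screen(position, mvp, width, height):
    """Transform one object-space position to screen space (x, y in pixels, z as depth)."""
    x, y, z, w = mat_vec(mvp, [position[0], position[1], position[2], 1.0])
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w   # perspective divide
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height     # assumed origin: top-left corner
    return (screen_x, screen_y, ndc_z)


def process_vertices(positions, mvp, width, height):
    """One output per input: no vertices are created or discarded."""
    return [to_screen(p, mvp, width, height) for p in positions]


IDENTITY = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
```

With the identity matrix, the point (0, 0, 0) lands at the center of the viewport, which is a quick sanity check for the viewport mapping.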
Triangle setup

In this stage geometry information becomes raster
information (screen space geometry is the input,
pixels are the output)

Prior to rasterization, triangles that are backfacing or
are located outside the viewing frustum are rejected

Some GPUs also do some hidden surface removal at
this stage
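
The two rejection tests can be sketched with a signed-area check and a bounding-box check. This is a simplification under stated assumptions: counter-clockwise triangles (positive signed area in the coordinates used) count as front-facing, and the frustum test only compares the 2D bounding box with the viewport, ignoring the near and far planes.

```python
def signed_area(a, b, c):
    """Twice the signed area of the 2D triangle a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def is_backfacing(a, b, c):
    """Assumed convention: non-positive area means back-facing or degenerate."""
    return signed_area(a, b, c) <= 0


def outside_viewport(tri, width, height):
    """Trivial reject: the triangle's bounding box misses the screen entirely."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return max(xs) < 0 or min(xs) >= width or max(ys) < 0 or min(ys) >= height


def cull(triangles, width, height):
    """Keep only triangles that are front-facing and at least partly on screen."""
    return [t for t in triangles
            if not is_backfacing(*t) and not outside_viewport(t, width, height)]
```

Rejecting triangles here saves all the per-fragment work they would otherwise cause in the later stages.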
Triangle Setup (cont)

A fragment is generated if and only if
its center is inside the triangle

Every fragment generated has its
attributes computed to be the
perspective correct interpolation of
the three vertices that make up the
triangle
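
Both rules can be sketched with edge functions, one common way to rasterize (the slides do not say which method the hardware uses). Each vertex is assumed to be (x, y, w) with screen-space x, y and clip-space w. Centers lying exactly on an edge are counted as inside here; real rasterizers apply a fill rule instead.

```python
def edge(a, b, p):
    """Positive when p lies to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(v0, v1, v2, attrs, width, height):
    """Yield (x, y, value) for each pixel whose center is inside the triangle.

    attrs holds one scalar attribute per vertex; value is its
    perspective-correct interpolation at the pixel center.
    """
    area = edge(v0, v1, v2)
    if area <= 0:                       # back-facing or degenerate
        return
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)      # pixel center
            w0, w1, w2 = edge(v1, v2, p), edge(v2, v0, p), edge(v0, v1, p)
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue                # center outside: no fragment
            b0, b1, b2 = w0 / area, w1 / area, w2 / area
            # Interpolate attr/w and 1/w linearly in screen space, then divide.
            inv_w = b0 / v0[2] + b1 / v1[2] + b2 / v2[2]
            num = b0 * attrs[0] / v0[2] + b1 * attrs[1] / v1[2] + b2 * attrs[2] / v2[2]
            yield x, y, num / inv_w
```

When all three w values are equal the result reduces to plain linear interpolation; with differing w the values shift toward the vertices nearer the camera. The full-screen loop is for clarity only; a practical version would scan just the triangle's bounding box.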
Fragment Processing

Each fragment provided by triangle setup is fed into
fragment processing as a set of attributes (position,
normal, texcoord etc), which are used to compute
the final color for this pixel

The computations taking place here include texture
mapping and math operations

Typically the bottleneck in modern applications
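
A minimal sketch of the per-fragment work: one texture lookup followed by some math. The nearest-neighbour sampling and the Lambert diffuse term are assumptions chosen for brevity, not the shading model of any particular GPU, and the normal and light direction are assumed to be unit vectors.

```python
def sample_nearest(texture, u, v):
    """Nearest-neighbour lookup; texture is a list of rows of (r, g, b), u and v in [0, 1]."""
    height, width = len(texture), len(texture[0])
    x = min(max(int(u * width), 0), width - 1)
    y = min(max(int(v * height), 0), height - 1)
    return texture[y][x]


def shade_fragment(normal, texcoord, texture, light_dir):
    """Texture color scaled by a diffuse term max(0, N . L)."""
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    texel = sample_nearest(texture, texcoord[0], texcoord[1])
    return tuple(channel * n_dot_l for channel in texel)


# A 2x2 texture used only for illustration.
CHECKER = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
           [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]]
```

This function runs once per generated fragment, which is why this stage tends to dominate the cost at high resolutions.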
Memory Interface

Fragment colors provided by the previous stage are
written to the framebuffer

Used to be the biggest bottleneck before fragment
processing took over

Before the final write occurs, some fragments are
rejected by the zbuffer, stencil and alpha tests

On modern GPUs, z and color are compressed to
reduce framebuffer bandwidth (but not size)
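
The rejection step before the final write can be sketched as follows. This covers only the alpha and z-buffer tests (the stencil test is omitted), and it assumes that smaller z means closer to the viewer; the function names and the threshold parameter are hypothetical.

```python
def make_buffers(width, height, clear_color=(0.0, 0.0, 0.0), far=1.0):
    """Create a color buffer and a depth buffer cleared to the far plane."""
    color = [[clear_color] * width for _ in range(height)]
    depth = [[far] * width for _ in range(height)]
    return color, depth


def write_fragment(color, depth, x, y, z, rgba, alpha_threshold=0.0):
    """Write the fragment only if it passes both tests; return True if written."""
    if rgba[3] <= alpha_threshold:   # alpha test
        return False
    if z >= depth[y][x]:             # z-buffer test: something closer is already stored
        return False
    depth[y][x] = z
    color[y][x] = rgba[:3]
    return True
```

Fragments that fail a test cost no framebuffer write, which reduces the memory traffic this stage has to handle.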
Programmability in the GPU

Vertex and fragment processing, and now triangle
set-up, are programmable

The programmer can write programs that are
executed for every vertex as well as for every
fragment

This allows fully customizable geometry and shading
effects that go well beyond the generic look and feel
of older 3D applications
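
The idea of user-supplied programs running per vertex and per fragment can be sketched with plain functions. Everything here is hypothetical: `run_pipeline`, the dictionary-based vertex format, and the stand-in rasterizer that emits one fragment per vertex are illustrations only, not how shaders are written for real GPUs.

```python
def point_rasterizer(vertices):
    """Fixed-function stand-in: emit one fragment per vertex at its integer position."""
    for v in vertices:
        yield int(v["pos"][0]), int(v["pos"][1]), v


def run_pipeline(vertices, vertex_program, fragment_program, rasterizer=point_rasterizer):
    """Run the vertex program on every vertex and the fragment program on every fragment."""
    transformed = [vertex_program(v) for v in vertices]   # 1:1 vertex mapping
    return {(x, y): fragment_program(frag) for x, y, frag in rasterizer(transformed)}


# Two small user programs.
def shift_right(vertex):
    x, y = vertex["pos"]
    return {**vertex, "pos": (x + 1, y)}


def invert_color(fragment):
    return tuple(1.0 - c for c in fragment["color"])
```

Swapping in different `vertex_program` or `fragment_program` functions changes the geometry or the shading without touching the rest of the pipeline.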
The Graphics Pipeline
Diagram of a modern GPU
[Diagram: Input from CPU → Host interface → Vertex processing → Triangle setup → Pixel processing → Memory Interface, with four 64-bit channels to memory]
(courtesy: nvidia)
The Quest for Realism
(courtesy: nvidia)