Poster - High Speed Digital Systems Lab

Dec 2, 2013

High Speed Digital Systems Laboratory

VLSI Laboratory


DEPARTMENT OF ELECTRICAL ENGINEERING

TECHNION - ISRAEL INSTITUTE OF TECHNOLOGY

Graphics on Key

Eyal Sarfati and Eran Gilat

Supervisors: Dr. Shmuel Wimer, Amnon Stanislavsky and Mike Sumszyk



Project Motivation

A GPU is the key to the high-performance graphics needed in games, flight simulations, virtual worlds, etc.

Mobile systems lack a suitable GPU.

An external GPU with a commonly used interface can open the door to a large range of applications for such systems.

Project Goal

Develop a low-cost prototype which performs 3D animation and displays it on a 2D RGB screen.

Project Requirements

Input:

A stream of data generated by a software program and fed to the GoK. The data is sent in two stages:

1. At initialization, a list of triangles is sent to the GoK.
2. During streaming, a transformation for each triangle is sent to the GoK every 40 msec.

Output:

Real-time object animation at:

1. 640 x 480 pixel resolution.
2. 4000 polygons per object.
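The stated input and output figures imply a per-second budget. The short calculation below is only a sketch of that arithmetic; the variable names are illustrative and it assumes one full set of per-triangle transformations arrives in every 40 msec period.

```python
# Frame budget implied by the stated requirements (illustrative arithmetic only).
FRAME_PERIOD_MS = 40          # a transformation for each triangle every 40 msec
POLYGONS_PER_OBJECT = 4000    # polygons per object
WIDTH, HEIGHT = 640, 480      # output resolution

frames_per_second = 1000 // FRAME_PERIOD_MS
triangles_per_second = POLYGONS_PER_OBJECT * frames_per_second
pixels_per_second = WIDTH * HEIGHT * frames_per_second

print(frames_per_second)      # 25
print(triangles_per_second)   # 100000
print(pixels_per_second)      # 7680000
```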

Triangulation Data Structure

Rendering Algorithm

1. Apply the transformation to each triangle.
2. Project the triangle onto the viewing plane.
3. Determine triangle visibility.
4. Determine projected triangle visibility (rasterization).
5. Set the color of visible points.
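The five steps can be sketched in software as follows. This is a minimal illustration, not the hardware design: it assumes an orthographic projection (x and y are kept, z is used as depth), treats a positive signed area of the projected triangle as "visible", uses one flat color per triangle instead of three RGB vectors, and resolves per-pixel visibility with a z-buffer. The names `edge` and `render` are hypothetical.

```python
import math

def edge(a, b, p):
    """Signed area term: positive when p lies on the inner side of edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def render(triangles, transform, width, height):
    """triangles: list of (three (x, y, z) vertices, color). Returns a color buffer."""
    zbuf = [[float("inf")] * width for _ in range(height)]
    rgb = [[(0, 0, 0)] * width for _ in range(height)]
    for verts, color in triangles:
        v0, v1, v2 = (transform(v) for v in verts)       # step 1: transform
        # step 2: orthographic projection (assumption): use x, y; keep z as depth
        area = edge(v0, v1, v2)
        if area <= 0:                                     # step 3: triangle visibility
            continue
        xs, ys = (v0[0], v1[0], v2[0]), (v0[1], v1[1], v2[1])
        x_lo, x_hi = max(0, math.floor(min(xs))), min(width - 1, math.ceil(max(xs)))
        y_lo, y_hi = max(0, math.floor(min(ys))), min(height - 1, math.ceil(max(ys)))
        for y in range(y_lo, y_hi + 1):                   # step 4: rasterization
            for x in range(x_lo, x_hi + 1):
                p = (x, y)
                w0, w1, w2 = edge(v1, v2, p), edge(v2, v0, p), edge(v0, v1, p)
                if w0 < 0 or w1 < 0 or w2 < 0:
                    continue                              # pixel outside the triangle
                z = (w0 * v0[2] + w1 * v1[2] + w2 * v2[2]) / area
                if z < zbuf[y][x]:                        # nearer than what is stored
                    zbuf[y][x] = z
                    rgb[y][x] = color                     # step 5: set color
    return rgb
```

With two overlapping triangles at different depths, the nearer one wins at every shared pixel regardless of the order in which they are submitted.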

System Overview - SoPC

GPU Architecture

A multicore architecture with a shared interface raises several issues:

1. Scheduling multiple requests to the common interface
2. Synchronizing 32-bit memory accesses with the 256-bit bus
3. Data coherency
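Issues 1 and 2 can be illustrated with a small sketch. It assumes byte addressing and a round-robin grant policy; the poster states neither, so both are hypothetical, as are the function names. The only figures taken from the poster are the 32-bit access width and the 256-bit bus width, which give 8 words per bus line.

```python
WORD_BITS = 32
LINE_BITS = 256
WORDS_PER_LINE = LINE_BITS // WORD_BITS   # 8 words share one 256-bit line
BYTES_PER_WORD = WORD_BITS // 8           # 4

def split_address(byte_addr):
    """Map a byte address to (256-bit line index, 32-bit word slot in that line)."""
    word_index = byte_addr // BYTES_PER_WORD
    return word_index // WORDS_PER_LINE, word_index % WORDS_PER_LINE

def round_robin_grant(requests, last_granted):
    """Grant the first active requester after last_granted (assumed policy)."""
    n = len(requests)
    for step in range(1, n + 1):
        candidate = (last_granted + step) % n
        if requests[candidate]:
            return candidate
    return None  # nobody is requesting

print(split_address(36))                            # (1, 1)
print(round_robin_grant([True, False, True], 0))    # 2
```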


Arbiter Snooping Cache

Arbiter:

Efficiently schedules all memory requests.

Snoopy Cache:

Single line (256 bit), write back, write allocate, x 11.

Each line has 2 control bits: valid and dirty.

Uses a snooping mechanism for data coherency.

Data is read from or written to the cache only if the cache line is exclusively owned by the requesting rasterization unit; otherwise the unit waits until the cache line is available.
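The behavior described for the cache can be modeled roughly as below. This is a simplified software sketch under stated assumptions: the snooping bus is replaced by a shared ownership table, a blocked access returns `None` or `False` instead of stalling, and the class and method names are hypothetical.

```python
class Memory:
    """Backing store plus a table recording which cache owns each line."""
    def __init__(self):
        self.lines = {}   # line index -> list of 8 words
        self.owner = {}   # line index -> id of the owning cache

    def read_line(self, idx):
        return list(self.lines.get(idx, [0] * 8))

    def write_line(self, idx, data):
        self.lines[idx] = list(data)

class SingleLineCache:
    """One 256-bit line, write back, write allocate, valid and dirty bits."""
    def __init__(self, unit_id, memory):
        self.unit_id, self.mem = unit_id, memory
        self.valid = self.dirty = False
        self.tag, self.data = None, [0] * 8

    def evict(self):
        if self.valid:
            if self.dirty:
                self.mem.write_line(self.tag, self.data)   # write back
            del self.mem.owner[self.tag]                   # release ownership
        self.valid = self.dirty = False
        self.tag = None

    def _acquire(self, idx):
        if self.valid and self.tag == idx:
            return True                                    # hit
        if self.mem.owner.get(idx) not in (None, self.unit_id):
            return False                                   # owned elsewhere: wait
        self.evict()
        self.data = self.mem.read_line(idx)                # allocate on any miss
        self.tag, self.valid, self.dirty = idx, True, False
        self.mem.owner[idx] = self.unit_id
        return True

    def read(self, idx, word):
        return self.data[word] if self._acquire(idx) else None

    def write(self, idx, word, value):
        if not self._acquire(idx):
            return False
        self.data[word], self.dirty = value, True
        return True
```

In this model a second unit that touches a line held by the first is refused until the first unit evicts the line, at which point the dirty data reaches memory and becomes visible.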

Transformation + Pre-processor

The screen is described by pixels on a 2D discrete grid. Therefore 2 transformations are needed:

1. Transformation from 3D to 2D (projection)
2. Transformation from points on a real grid to points on a discrete grid (the screen grid).
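The two transformations can be sketched as two small functions. The poster does not give the camera model, so a perspective projection onto the plane z = focal is assumed here, and the visible region is assumed to span [-half_extent, half_extent] on both axes; `project` and `to_screen` are hypothetical names.

```python
def project(point, focal=1.0):
    """3D to 2D: perspective projection onto the plane z = focal (assumed model)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the viewer (z > 0)")
    return (focal * x / z, focal * y / z)

def to_screen(p, width=640, height=480, half_extent=1.0):
    """Real grid to discrete grid: map real coordinates to integer pixel indices."""
    u, v = p
    col = round((u + half_extent) / (2 * half_extent) * (width - 1))
    row = round((half_extent - v) / (2 * half_extent) * (height - 1))  # row 0 is the top
    return col, row

print(project((1.0, -1.0, 2.0)))    # (0.5, -0.5)
print(to_screen((-1.0, 1.0)))       # (0, 0)
print(to_screen((1.0, -1.0)))       # (639, 479)
```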

Summary and Achievements

Scheduler FSM

Project triangle on viewing plane

[Figure: the GoK with its USB and VGA connections]

Each triangle is described by the following:

3 vertexes, each denoted by (x, y, z)

3 RGB vectors, each denoted by (R, G, B)

1 normal vector denoted by n = (a, b, c)
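As a rough software analogue, the triangle record could be written as below. The field names and the helper `face_normal` are illustrative assumptions; the poster only specifies that a triangle carries three vertexes, three RGB vectors and one normal.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]
RGB = Tuple[int, int, int]

@dataclass(frozen=True)
class Triangle:
    vertices: Tuple[Vec3, Vec3, Vec3]   # three (x, y, z) points
    colors: Tuple[RGB, RGB, RGB]        # three (R, G, B) vectors
    normal: Vec3                        # n = (a, b, c)

def face_normal(v0, v1, v2):
    """Unnormalized normal (a, b, c): cross product of edges v0->v1 and v0->v2."""
    ux, uy, uz = (v1[i] - v0[i] for i in range(3))
    wx, wy, wz = (v2[i] - v0[i] for i in range(3))
    return (uy * wz - uz * wy, uz * wx - ux * wz, ux * wy - uy * wx)

verts = ((0, 0, 0), (1, 0, 0), (0, 1, 0))
tri = Triangle(verts, ((255, 0, 0),) * 3, face_normal(*verts))
print(tri.normal)   # (0, 0, 1)
```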

[Figure: System overview (SoPC). Blocks: USB, DDRII, Nios System Controller, GPU, Communication Bus, USB Controller, DDRII Controller, VGA Controller.]

[Figure: GPU architecture. Blocks: Prefetch & Visibility Detection Unit, 3D Transformation Unit, Triangle pre-processor, FIFO task queue, Rasterization units 0 to 10, Scheduler Unit, Z-Buffer Arbiter with Snooping Cache, RGB Arbiter with Snooping Cache.]

[Figure: Scheduler FSM. States: Reset All, IDLE, and one state per rasterization unit i = 1, 2, ..., 11 with outputs StartScan i = 1, FIFORead = 1, ResetScan i = 1. Transition labels: FIFO Empty = 0 with Finish i = 1 (one per unit), FIFO Empty = 1, and Reset.]
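One possible reading of the scheduler FSM is sketched below: while the FIFO is not empty, a task is handed to a rasterization unit that has signalled Finish. The fixed lowest-index-first priority and the function name are assumptions, and units are numbered 0 to 10 here rather than 1 to 11.

```python
from collections import deque

NUM_UNITS = 11

def schedule_step(fifo, finished):
    """One scheduler decision.

    fifo: deque of pending triangle tasks.
    finished: list of NUM_UNITS booleans, True when a unit is free (Finish = 1).
    Returns (unit, task) when a task is dispatched, or None to stay in IDLE.
    """
    if not fifo:                        # FIFO Empty = 1
        return None
    for unit, done in enumerate(finished):
        if done:                        # Finish for this unit = 1
            task = fifo.popleft()       # FIFORead = 1
            finished[unit] = False      # unit starts scanning and is busy again
            return unit, task
    return None                         # all units busy

fifo = deque(["t0", "t1"])
finished = [False] + [True] * (NUM_UNITS - 1)
print(schedule_step(fifo, finished))    # (1, 't0')
print(schedule_step(fifo, finished))    # (2, 't1')
print(schedule_step(fifo, finished))    # None
```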

[Figure: coordinate axes x̂, ŷ, ẑ]


Successful implementation of a “Graphics on Key” device which significantly enhances the graphics performance of low-power, low-cost gadgets.

The device receives object data and user animation commands, performs the required computations and displays the animation on a 640 x 480 monitor at 25 frames/sec.

Required specification: 100,000 triangles/sec @ 160 x 120 resolution.

Achieved performance: 2,500,000 triangles/sec @ 640 x 480 resolution.
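The gap between the required and achieved figures can be made explicit with a short calculation. This is only arithmetic on the numbers quoted above; the variable names are illustrative.

```python
required_tps, required_res = 100_000, (160, 120)
achieved_tps, achieved_res = 2_500_000, (640, 480)
FRAMES_PER_SECOND = 25

throughput_gain = achieved_tps // required_tps
pixel_gain = (achieved_res[0] * achieved_res[1]) // (required_res[0] * required_res[1])
triangles_per_frame = achieved_tps // FRAMES_PER_SECOND

print(throughput_gain)       # 25
print(pixel_gain)            # 16
print(triangles_per_frame)   # 100000
```

So the achieved triangle rate is 25 times the required rate, at a resolution with 16 times as many pixels.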

[Figure: 3D Transformation Unit and Triangle pre-processor. Blocks: Vertex / Normal Transform, D calculation, RGB Color Set, Screen Convert, Sort coordinates according to y axis, Triangle slopes calculation, Reciprocal unit, Create 2 data structures, FIFO. Signals: normal (a, b, c), transformed normal (a, b, c)^T, (L1, L2, L3), RGB.]
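Several of the pre-processor blocks named in the figure can be sketched in software. The interpretation is an assumption: "D calculation" is taken to be the offset of the triangle's plane a*x + b*y + c*z + D = 0, slopes are taken as dx/dy along each edge, and the reciprocal unit is modeled as a multiplication by 1/dy. The "Create 2 data structures" block is not modeled, and `preprocess` is a hypothetical name.

```python
def preprocess(v0, v1, v2, normal):
    """Triangle setup: sort by y, compute edge slopes and the plane offset D."""
    top, mid, bot = sorted((v0, v1, v2), key=lambda v: v[1])   # sort along the y axis

    def slope(p, q):
        dy = q[1] - p[1]
        if dy == 0:
            return None                        # horizontal edge has no dx/dy slope
        return (q[0] - p[0]) * (1.0 / dy)      # multiply by the reciprocal of dy

    slopes = (slope(top, bot), slope(top, mid), slope(mid, bot))
    a, b, c = normal
    d = -(a * top[0] + b * top[1] + c * top[2])   # plane: a*x + b*y + c*z + D = 0
    return (top, mid, bot), slopes, d

order, slopes, d = preprocess((0, 4, 1), (2, 0, 1), (4, 2, 1), (0, 0, 1))
print(order)    # ((2, 0, 1), (4, 2, 1), (0, 4, 1))
print(slopes)   # (-0.5, 1.0, -2.0)
print(d)        # -1
```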