Accelerating Belief Propagation

16 Oct 2013

ISTC-EC @ Cornell

Accelerating Belief Propagation in Hardware

Skand Hurkat and José Martínez
Computer Systems Laboratory
Cornell University
http://www.csl.cornell.edu/



The Cornell Team

- Prof. José Martínez (PI), Prof. Rajit Manohar @ Computer Systems Lab
- Prof. Tsuhan Chen @ Advanced Multimedia Processing Lab
- MS/Ph.D. students
  - Yuan Tian, MS '13
  - Skand Hurkat
  - Xiaodong Wang


The Cornell Graph


The Cornell Project

- Provide hardware accelerators for belief propagation algorithms on embedded SoCs (retail/car/home/mobile)
  - High speed
  - Very low power
  - Self-optimizing
  - Highly programmable

[Figure: BP Accelerator within SoC. Labels: Graph, Inference Algorithm, Result]


What is belief propagation?

Belief propagation is a message passing algorithm for performing inference on graphical models, such as Bayesian networks or Markov Random Fields.
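
As a minimal sketch of the message-passing idea, the following runs sum-product belief propagation on a small chain-structured Markov random field and checks the beliefs against brute-force marginals. The function names, the three-node chain, and the potential values are illustrative assumptions and are not taken from the Cornell library.

```python
from itertools import product


def normalize(v):
    """Scale a list of non-negative numbers so that it sums to 1."""
    s = sum(v)
    return [x / s for x in v]


def chain_sum_product(unary, pair):
    """Sum-product BP on a chain x0 - x1 - ... with one shared pairwise table.

    unary[i][a] is the node potential, pair[a][b] the edge potential.
    Returns the normalized belief (marginal) of every node.
    """
    n, k = len(unary), len(unary[0])
    # fwd[i]: message arriving at node i from node i-1 (uniform at the left end).
    fwd = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        fwd[i] = normalize([
            sum(unary[i - 1][a] * fwd[i - 1][a] * pair[a][b] for a in range(k))
            for b in range(k)])
    # bwd[i]: message arriving at node i from node i+1 (uniform at the right end).
    bwd = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        bwd[i] = normalize([
            sum(unary[i + 1][b] * bwd[i + 1][b] * pair[a][b] for b in range(k))
            for a in range(k)])
    # Belief = local potential times all incoming messages.
    return [normalize([unary[i][x] * fwd[i][x] * bwd[i][x] for x in range(k)])
            for i in range(n)]


def brute_force_marginals(unary, pair):
    """Reference marginals obtained by enumerating every joint labelling."""
    n, k = len(unary), len(unary[0])
    marg = [[0.0] * k for _ in range(n)]
    for x in product(range(k), repeat=n):
        p = 1.0
        for i in range(n):
            p *= unary[i][x[i]]
        for i in range(n - 1):
            p *= pair[x[i]][x[i + 1]]
        for i in range(n):
            marg[i][x[i]] += p
    return [normalize(m) for m in marg]


if __name__ == "__main__":
    unary = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]  # hypothetical node potentials
    pair = [[0.9, 0.1], [0.1, 0.9]]               # hypothetical edge potential
    print(chain_sum_product(unary, pair))
    print(brute_force_marginals(unary, pair))
```

On a chain (a tree), the two printed lists agree up to rounding, which is the "exact on trees" property of belief propagation.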


What is belief propagation?

- Labelling problem
  - Energy as a measure of convergence
  - Minimize energy (MAP label estimation)
- Exact results for trees
  - Converges in exactly two passes (leaves to root, then root to leaves)
- Approximate results for graphs with loops
  - Yields “good” results in practice
  - Minimum over large neighbourhoods
  - Close to optimal solution
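
The tree case can be sketched as follows: min-sum belief propagation with one leaves-to-root pass and one root-to-leaves pass, after which each node picks the label with the lowest min-marginal. The energy form (unary costs plus a pairwise cost per edge), the five-node tree, the Potts penalty, and all function names are assumptions chosen for illustration.

```python
from itertools import product


def energy(labels, unary, edges, pairwise):
    """E(x) = sum_i D_i(x_i) + sum over edges (i, j) of V(x_i, x_j)."""
    e = sum(unary[i][l] for i, l in enumerate(labels))
    return e + sum(pairwise(labels[i], labels[j]) for i, j in edges)


def min_sum_tree(unary, edges, pairwise, root=0):
    """MAP labelling of a tree-structured MRF using two message passes."""
    n, k = len(unary), len(unary[0])
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    # Depth-first order from the root: every parent precedes its children.
    order, parent, stack = [], {root: None}, [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in nbrs[u]:
            if v != parent[u]:
                parent[v] = u
                stack.append(v)
    msg = {}  # msg[(u, v)][lv] is the message from u to v about label lv

    def send(u, v):
        msg[(u, v)] = [
            min(unary[u][lu] + pairwise(lu, lv)
                + sum(msg[(w, u)][lu] for w in nbrs[u] if w != v)
                for lu in range(k))
            for lv in range(k)]

    for u in reversed(order):      # pass 1: leaves to root
        if parent[u] is not None:
            send(u, parent[u])
    for u in order:                # pass 2: root to leaves
        if parent[u] is not None:
            send(parent[u], u)
    labels = []
    for i in range(n):
        # Min-marginal of node i: local cost plus all incoming messages.
        belief = [unary[i][l] + sum(msg[(w, i)][l] for w in nbrs[i])
                  for l in range(k)]
        labels.append(belief.index(min(belief)))
    return labels


def potts(a, b):
    """Hypothetical smoothness cost: 0 for equal labels, 1 otherwise."""
    return 0.0 if a == b else 1.0


if __name__ == "__main__":
    unary = [[0.0, 2.1, 3.3], [1.7, 0.2, 2.9], [2.4, 1.1, 0.3],
             [0.6, 0.5, 2.2], [3.1, 0.9, 1.0]]
    edges = [(0, 1), (0, 2), (1, 3), (1, 4)]
    labels = min_sum_tree(unary, edges, potts)
    print(labels, energy(labels, unary, edges, potts))
```

For this toy problem the labelling found is [0, 1, 2, 1, 1], which matches an exhaustive search over all 243 labellings. Per-node argmin decoding assumes the optimum is unique; with ties, a backtracking step would be needed.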


Not all “that” alien to embedded

[Figure: trellis over states s_0, s_11 to s_13, s_21 to s_23, s_31 to s_33, s_41 to s_43, s_5, and a chain s_0, s_1, s_2, s_3, s_4, s_5]

Remember the Viterbi algorithm?

- Used extensively in digital communications
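
For readers who do not remember it, here is a minimal sketch of the Viterbi algorithm on a toy hidden Markov model. It is max-product message passing along a chain followed by backtracking. The two states, three observation symbols, and all probabilities are made-up illustration values, not a communications channel model from the talk.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for obs and its probability."""
    # best[t][s]: probability of the most likely path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda r: best[t - 1][r] * trans_p[r][s])
            best[t][s] = best[t - 1][prev] * trans_p[prev][s] * emit_p[s][obs[t]]
            back[t][s] = prev
    # Backtrack from the most likely final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1], best[-1][last]


if __name__ == "__main__":
    states = ("A", "B")                      # hypothetical hidden states
    start_p = {"A": 0.6, "B": 0.4}
    trans_p = {"A": {"A": 0.7, "B": 0.3},
               "B": {"A": 0.4, "B": 0.6}}
    emit_p = {"A": {"x": 0.5, "y": 0.4, "z": 0.1},
              "B": {"x": 0.1, "y": 0.3, "z": 0.6}}
    print(viterbi(("x", "y", "z"), states, start_p, trans_p, emit_p))
```

With these values the decoded path is A, A, B. A hardware decoder would typically work with additive path metrics (log domain) rather than products, which is the same min-sum form used in the energy formulation.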


What does this mean?

- Every mobile device uses Viterbi decoders
  - Error correction codes (e.g. turbo codes)
  - Mitigating inter-symbol interference (ISI)
- Increasing number of mobile applications involve belief propagation
- More general belief propagation accelerators can greatly improve user experience with mobile devices


Target markets

Retail/Car/Home/Mobile

- Image processing
  - De-noising
  - Segmentation
  - Object detection
  - Gesture recognition
- Handwriting recognition
  - Improved recognition through context identification
- Speech recognition
  - Hidden Markov models are key to speech recognition

Servers

- Data mining tasks
  - Part-of-speech tagging
  - Information retrieval
  - “Knowledge graph”-like applications
- Machine learning based tasks
  - Constructive machine learning
  - Recommendation systems
- Scientific computing
  - Protein structure inference



Hardware accelerator for BP

[Figure: BP Accelerator within SoC. Labels: Graph, Inference Algorithm, Result]


Work done so far

Software

- General purpose MRF inference library
  - Support for arbitrary graphs
  - Floating point math
  - Parallel techniques for faster inference
- Library optimized for grid graphs
  - Optimized data structures
  - Template can use any data type
  - Multiple inference techniques optimized for early vision
  - Stereo matching in 200 ms

Hardware

- High level synthesis of message update unit
  - Vivado HLS (C-to-gates) tool used to synthesize message update unit on ZedBoard
  - 2x improvement in inference speed on CPU+FPGA compared to CPU-only inference
  - Fixed point math
- GraphGen collaboration
  - On-going work
  - Stereo matching task mapped to multiple platforms
  - 10x speedup on GPU w.r.t. CPU-only implementation
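
To make "message update unit" and "fixed point math" concrete, here is a rough software sketch of a single min-sum message update in integer (fixed-point) arithmetic. The truncated-linear smoothness cost, the 4 fractional bits, and the function names are assumptions made for illustration; the slides do not specify the cost model or word lengths used in the synthesized unit.

```python
def to_fixed(x, frac_bits=4):
    """Quantize a real-valued cost to an integer with frac_bits fractional bits."""
    return int(round(x * (1 << frac_bits)))


def message_update(data_cost, incoming, smooth_weight, trunc):
    """One min-sum message from a node to one neighbour, integers only.

    data_cost[l]  : fixed-point data cost of label l at the sending node
    incoming      : messages from the sender's other neighbours
    smooth_weight : slope of the (assumed) truncated-linear smoothness cost
    trunc         : cap on the smoothness cost
    """
    k = len(data_cost)
    # Aggregate the local cost and all incoming messages per label.
    h = [data_cost[l] + sum(m[l] for m in incoming) for l in range(k)]
    out = [min(h[lp] + min(smooth_weight * abs(lp - lq), trunc)
               for lp in range(k))
           for lq in range(k)]
    # Subtract the minimum so values stay small, which bounds the word length.
    base = min(out)
    return [v - base for v in out]


if __name__ == "__main__":
    data = [to_fixed(c) for c in (0.5, 0.0, 1.25, 2.0)]   # [8, 0, 20, 32]
    zero = [0, 0, 0, 0]
    print(message_update(data, [zero, zero, zero], to_fixed(0.5), to_fixed(1.0)))
```

On a 4-connected grid each pixel produces four such messages per iteration, which is why this small kernel is a natural target for synthesis and parallel replication.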


Hierarchical belief propagation
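
The slide gives no detail on the hierarchical scheme. Assuming it resembles the well-known coarse-to-fine approach for grid graphs (run BP on a coarser grid first, then use its messages to initialize the finer grid), the two data-movement steps could look like the sketch below. The function names and the 2x2 block size are assumptions.

```python
def coarsen_data_costs(cost):
    """Build the next-coarser level by summing data costs over 2x2 blocks.

    cost[y][x][l] is the data cost of label l at pixel (y, x); the grid
    height and width are assumed to be even.
    """
    h, w, k = len(cost), len(cost[0]), len(cost[0][0])
    return [[[sum(cost[2 * y + dy][2 * x + dx][l]
                  for dy in (0, 1) for dx in (0, 1))
              for l in range(k)]
             for x in range(w // 2)]
            for y in range(h // 2)]


def upsample_messages(coarse_msg):
    """Initialize fine-level messages by copying each coarse node's message
    to the four fine nodes it covers."""
    h, w = len(coarse_msg), len(coarse_msg[0])
    return [[list(coarse_msg[y // 2][x // 2]) for x in range(2 * w)]
            for y in range(2 * h)]


if __name__ == "__main__":
    fine = [[[1, 2], [3, 4]],
            [[5, 6], [7, 8]]]          # 2x2 grid, 2 labels
    print(coarsen_data_costs(fine))    # one coarse node
    print(upsample_messages([[[1, 2]]]))
```

Starting the fine level from upsampled messages usually means far fewer iterations are needed there than when starting from zero messages.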


Results

- Stereo Matching

[Figure: Comparing inference algorithms on “Tsukuba” benchmark. Axes: Updates (0 to 14,000,000) and Energy (440,000 to 480,000)]

GraphGen synthesis of BP-M

- BP-M update (log-space messages) implemented using GraphGen (Intel/CMU/UW)
- GPU implementation 10x faster than CPU based implementation
- On-going work on FPGA based implementation and on implementing hierarchical update
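
Assuming BP-M denotes the max-product variant of belief propagation (the slide does not define the name), the benefit of log-space messages can be seen by taking the negative logarithm of the max-product update: products become sums and the maximum becomes a minimum. The symbols below (potentials \phi and \psi, costs D and V, neighbour set N(p)) are notation introduced here for illustration.

```latex
% Max-product message from node p to node q (probability domain):
m_{p \to q}(x_q) = \max_{x_p} \; \phi_p(x_p)\, \psi_{pq}(x_p, x_q)
    \prod_{r \in N(p) \setminus \{q\}} m_{r \to p}(x_p)

% Define D_p = -\log \phi_p, \; V_{pq} = -\log \psi_{pq}, \; M = -\log m.
% The same update in log space (min-sum form):
M_{p \to q}(x_q) = \min_{x_p} \Big[ D_p(x_p) + V_{pq}(x_p, x_q)
    + \sum_{r \in N(p) \setminus \{q\}} M_{r \to p}(x_p) \Big]
```

The log-space form needs only additions and comparisons, which suits both fixed-point hardware and GPU kernels.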


Cornell Publications (2013 only)

- 3x Comp. Vision & Pattern Recognition (CVPR)
- 3x Asynchronous VLSI (ASYNC)
- 2x Intl. Symp. Computer Architecture (ISCA)
- 1x Intl. Conf. Image Processing (ICIP)
- 1x ASPLOS (w/ GraphGen folks, under review)


Year 3 Plans

- GraphGen extensions for BP applications
  - Multiple inference techniques
- Extraction of “BP ISA”
  - Ops on arbitrary graphs
  - Efficient representation
- Amplification work on UAV ensembles
  - Self-optimizing, collaborative SoCs
- One-day “graph” workshop with GraphGen+UIUC
