PaCo: Probability-Based Path Confidence Prediction
Kshitiz Malik, Mayank Agarwal, Vikram Dhar, Matthew Frank
Implicitly Parallel Architectures Group
University of Illinois at Urbana-Champaign
Summary
Path Confidence: likelihood of correct path
Pipeline Gating, SMT Fetch
Conventional: use count of low-conf branches
Inaccurate
PaCo: directly estimates goodpath probability
Highly accurate, modest hardware
Improves performance on gating, SMT Fetch
HPCA-14, February 18, 2008
Outline
Overview
Motivation
Design
Evaluation
Applications
Branch Confidence Prediction
Branch Confidence: Single Branches
Low Conf / High Conf
Applications: Checkpointing, Multipath, etc.
Path Confidence Prediction
Path Confidence: likelihood of being on the correct path
Contributions from multiple branches
Conventionally: count of (unexecuted) low-confidence branches.
Path Confidence: Applications
Path Confidence: Multiple Branches
Count of low-confidence branches.
Applications: Pipeline Gating, SMT Prioritization
[Figure: pipeline gating example with Gate-Count = 5; fetch is gated when conf >= 5]
Issues with Conventional Approach
Path Confidence: Multiple Branches
Count of low-confidence branches.
Applications: Pipeline Gating, SMT Prioritization
Problem: implicit assumptions that
High-conf branches never mispredict
All low-conf branches have the same misprediction probability
Path Confidence using PaCo
Directly estimates goodpath probability
Highly Accurate: RMS error 3.8%
Modest Hardware: 60 bytes of counters
Excellent Performance in applications
Gating (change in badpath instrs. / performance):
Conv: badpath instrs. ↓ 7%, performance ↓ 0.1%
PaCo: badpath instrs. ↓ 32%, performance ↓ 0.01%
SMT Fetch Prioritization: performance ↑ up to 23% (5.5% avg.)
Branch Confidence Prediction
Classify branches as low conf. or high conf.
JRS predictor:
Count consecutive correct predictions
Below threshold: Low Confidence
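Table of Miss-Distance Counters (MDCs)
[Figure: JRS confidence predictor. Branch PC and Global Hist are combined to index the MDC Table (4-bit counters). On branch execution: Mispredict? selects between resetting the counter to 0 and adding 1. Output test: MDC Value < Threshold?]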
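The counter behaviour described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `MDCTable`, the table size (`index_bits=10`), and the XOR hash of branch PC with global history are assumptions; the 4-bit counter width and the threshold of 3 come from the slides.

```python
class MDCTable:
    """Sketch of a table of resetting miss-distance counters (MDCs).

    Assumptions (not from the slides): the table has 2**index_bits entries
    and is indexed by XOR-ing the branch PC with the global history.
    """

    def __init__(self, index_bits=10, counter_bits=4, threshold=3):
        self.size = 1 << index_bits
        self.max_value = (1 << counter_bits) - 1  # 15 for 4-bit counters
        self.threshold = threshold
        self.counters = [0] * self.size

    def _index(self, pc, global_hist):
        # Hypothetical hash; the slide only shows PC and history combined.
        return (pc ^ global_hist) % self.size

    def value(self, pc, global_hist):
        return self.counters[self._index(pc, global_hist)]

    def is_low_confidence(self, pc, global_hist):
        # Below threshold: low confidence.
        return self.value(pc, global_hist) < self.threshold

    def update(self, pc, global_hist, mispredicted):
        # On branch execution: reset on a mispredict, otherwise count up
        # (saturating at the maximum counter value).
        i = self._index(pc, global_hist)
        if mispredicted:
            self.counters[i] = 0
        else:
            self.counters[i] = min(self.counters[i] + 1, self.max_value)
```

Each counter therefore holds the number of consecutive correct predictions since the last mispredict, saturating at 15.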
Conventional Path Confidence Prediction
Count of low-confidence branches = measure of path confidence
Threshold-and-Count Approach
Inaccurate
Coarseness
No relation to goodpath probability
[Figure: Branch → MDC Table → Miss Distance Counter Value (4 bits) → Threshold Function → Low Conf / High Conf (1 bit) → Sum → Path Confidence]
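As a rough sketch of the threshold-and-count scheme, the following Python reduces each pending branch's MDC value to one bit and sums the bits. The function names are hypothetical; the threshold of 3 and the gate-count of 5 are the example values used elsewhere in this talk.

```python
def count_path_confidence(pending_mdc_values, threshold=3):
    """Conventional path confidence: number of pending low-confidence branches.

    Each MDC value is reduced to a single low/high bit by the threshold
    function, then the low-confidence bits are summed.
    """
    return sum(1 for mdc in pending_mdc_values if mdc < threshold)


def should_gate_fetch(pending_mdc_values, gate_count=5, threshold=3):
    """Gate fetch once the low-confidence count reaches the gate-count."""
    return count_path_confidence(pending_mdc_values, threshold) >= gate_count
```

Note that a pending branch with MDC value 3 and one with MDC value 15 contribute the same amount (nothing), and that MDC values 0, 1 and 2 are indistinguishable after thresholding.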
Coarseness
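All low-conf branches are treated as equal
E.g., SMT prioritization:
gcc: one branch of type a pending, goodpath prob. 0.57
vortex: two branches of type b pending, goodpath prob. = 0.88 × 0.88 ≈ 0.77
Yet the count-based policy fetches from gcc!
[Figure: Misprediction Rate (pct) vs. MDC Value for twolf, vortex, gcc, gzip; JRS Threshold = 3 separates Low Conf from High Conf; points a and b are marked. Inset: Miss Distance Counter Value (4 bits) → Threshold Function → Low Conf / High Conf (1 bit)]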
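The gcc/vortex example can be checked with a short calculation. In this sketch the per-branch correct-prediction probabilities (0.57 and 0.88) are taken from the slide; the dictionary layout and the two selection policies are illustrative assumptions, and branch outcomes are assumed independent.

```python
import math

# Correct-prediction probability of each pending low-confidence branch.
threads = {
    "gcc": [0.57],           # one pending branch at point a
    "vortex": [0.88, 0.88],  # two pending branches at point b
}

# Count-based policy: prefer the thread with the fewest low-conf branches.
count_pick = min(threads, key=lambda name: len(threads[name]))

# Probability-based policy: prefer the thread most likely to be on the goodpath.
prob_pick = max(threads, key=lambda name: math.prod(threads[name]))

print(count_pick, prob_pick)
```

The count-based policy picks gcc (1 pending branch versus 2), although vortex has the higher goodpath probability (0.88 × 0.88 is about 0.77, against 0.57).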
Coarseness
[Figure: Misprediction Rate (pct) vs. MDC Value for twolf, vortex, gcc, gzip; JRS Threshold = 3 separates Low Conf from High Conf]
All low-conf branches are treated as equal
High-conf branches are assumed not to mispredict
twolf, vortex: branches with MDC = 3 don't affect conf.
[Figure: Miss Distance Counter Value (4 bits) → Threshold Function → Low Conf / High Conf (1 bit)]
Sum ≠ Probability
[Figure: Goodpath Likelihood when 5 low-confidence branches are pending; axis: Goodpath Prob.]
[Figure: Low Conf / High Conf (1 bit) → Sum → Path Confidence]
Gating at count = 5: too aggressive for gzip, not useful for route
SMT Fetch: gzip and route, conf = 5. Equal bandwidth?
Sum ≠ Probability
Hard to choose optimal gate-count
Different gate-counts for different benchmarks
Different gate-counts for different phases
Hard to compare path confidence of different apps
SMT Fetch Prioritization sub-optimal
[Figure: Low Conf / High Conf (1 bit) → Sum → Path Confidence]
Design
Finding ‘correct prediction probability’ for a branch
MDC table is a good differentiator of misprediction rates
Find misprediction rate for each MDC value
No thresholding!
Other, more h/w intensive approaches possible
Calculate goodpath probability directly
Threshold-and-count vs. PaCo
HPCA Practice Talk, February 13, 2008
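[Figure, conventional: Branch → MDC Table → Miss Distance Counter Value (4 bits) → Threshold Function → Low Conf / High Conf (1 bit) → Sum → Path Confidence]
[Figure, PaCo: Branch → MDC Table → Miss Distance Counter Value (4 bits) → Mispredict Rate Calculator (MRT) → Mispredict Probability → Product → Path Confidence]
[Figure: PaCo Mispred Rate Table; one Misprediction Rate entry per MDC Value 0, 1, 2, ..., 13, 14, 15]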
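The PaCo data path in the figure (per-MDC misprediction rate, then a product over pending branches) can be sketched as follows. This is a floating-point illustration under stated assumptions, not the hardware design: the class and function names are hypothetical, branch outcomes are treated as independent, and an MDC value with no recorded history is assumed to predict correctly.

```python
class MispredictRateTable:
    """Sketch of a mispredict rate table: one rate per MDC value (0..15)."""

    def __init__(self, num_mdc_values=16):
        self.correct = [0] * num_mdc_values
        self.mispred = [0] * num_mdc_values

    def record(self, mdc_value, mispredicted):
        # Update with the resolved outcome of a branch that had this MDC value.
        if mispredicted:
            self.mispred[mdc_value] += 1
        else:
            self.correct[mdc_value] += 1

    def correct_probability(self, mdc_value):
        total = self.correct[mdc_value] + self.mispred[mdc_value]
        if total == 0:
            return 1.0  # assumption: no history yet, treat as always correct
        return self.correct[mdc_value] / total


def paco_path_confidence(mrt, pending_mdc_values):
    """Goodpath probability: product of per-branch correct probabilities."""
    probability = 1.0
    for mdc in pending_mdc_values:
        probability *= mrt.correct_probability(mdc)
    return probability
```

Unlike the threshold-and-count sum, the result is directly comparable across benchmarks and threads because it is a probability between 0 and 1.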
Hardware Complexity
Floating point MUL and DIV required
Use logarithms, remove mul/div
Remove floating point: scale to integer values
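One way to write down the logarithm and scaling steps is shown below. The symbols are assumptions introduced for illustration: m_i is the misprediction probability of the i-th pending branch, S is an integer scale factor, and base 2 is chosen arbitrarily.

```latex
\begin{align*}
P_{\text{goodpath}} &= \prod_{i=1}^{n} (1 - m_i)
  && \text{(product of per-branch correct probabilities)} \\
\log_2 P_{\text{goodpath}} &= \sum_{i=1}^{n} \log_2 (1 - m_i)
  && \text{(multiplication becomes addition)} \\
e_i &= \operatorname{round}\bigl(-S \,\log_2 (1 - m_i)\bigr)
  && \text{(non-negative integer encoding)} \\
C &= \sum_{i=1}^{n} e_i, \qquad P_{\text{goodpath}} \approx 2^{-C/S}
  && \text{(integer path confidence)}
\end{align*}
```

With this encoding a larger C means a lower goodpath probability, and comparing C against a fixed integer is equivalent to comparing the probability against a fixed threshold.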
PaCo Hardware
60 bytes of counters, 10-bit shift register
[Figure: Path Confidence Predictor. Branch PC and Global Hist are combined to index the MDC Table. Feedback from Backend updates Correct Preds / Mispreds counters for MDC 0, MDC 1, ..., MDC 15. The Mispredict Rate Calculator and Log Circuit produce an Encoded Probability (12 bits) for each of MDC 0, MDC 1, ..., MDC 15. The encoded probabilities are added (+) to form the Path Confidence. Other width labels in the figure: 6 bits, 10 bits.]
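A software sketch of the integer, log-domain data path might look like the following. It is an illustration under assumptions, not the hardware in the figure: `SCALE`, the function and class names, and the add-on-fetch / subtract-on-resolve bookkeeping are hypothetical, the 12-bit cap follows the "Encoded Probability (12 bits)" label, and `math.log2` stands in for the log circuit.

```python
import math

SCALE = 256                 # hypothetical fixed-point scale factor
MAX_ENCODED = (1 << 12) - 1  # cap matching a 12-bit encoded probability


def encode_probability(correct, mispred):
    """Encode a correct-prediction probability as round(-SCALE * log2(p))."""
    total = correct + mispred
    if total == 0:
        return 0  # assumption: no history, treat as always correct
    if correct == 0:
        return MAX_ENCODED  # log2(0) is undefined; saturate instead
    encoded = round(-SCALE * math.log2(correct / total))
    return min(encoded, MAX_ENCODED)


class LogPathConfidence:
    """Running sum of encoded probabilities of pending branches."""

    def __init__(self):
        self.total = 0

    def on_fetch(self, encoded):
        # A newly fetched branch lowers the goodpath probability.
        self.total += encoded

    def on_resolve(self, encoded):
        # A correctly resolved branch no longer contributes.
        self.total -= encoded

    def probability(self):
        # Decode only for inspection; hardware could compare totals directly.
        return 2 ** (-self.total / SCALE)
```

Only integer additions and subtractions happen per branch; the floating-point decode in `probability()` is there to make the sketch easy to check.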
Evaluation: Prediction Accuracy
Benchmark   RMS Error
bzip2       0.055
crafty      0.053
gap         0.087
gcc         0.083
gzip        0.064
mcf         0.045
parser      0.042
perl        0.061
twolf       0.018
vortex      0.033
place       0.024
route       0.032
Mean        0.038
RMS Error = 0.038.
Example: predicted 60% goodpath likelihood
Should be within (60 ± 3.8)% = 56.2% to 63.8%
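For readers who want to reproduce this kind of number, an RMS error between predicted and observed goodpath likelihood could be computed as sketched below. The talk does not spell out the exact definition, so the sample-weighted form, the function name `rms_error`, and the bin layout are assumptions, and the numbers in the usage lines are made up for illustration.

```python
import math


def rms_error(bins):
    """Sample-weighted RMS difference between predicted and observed values.

    bins: iterable of (predicted_prob, observed_prob, num_samples) tuples,
    for example one tuple per predicted-confidence bucket.
    """
    bins = list(bins)
    total = sum(n for _, _, n in bins)
    if total == 0:
        raise ValueError("no samples")
    squared = sum(n * (pred - obs) ** 2 for pred, obs, n in bins)
    return math.sqrt(squared / total)


# Illustrative (made-up) buckets: predictions of 60% that were observed
# at 63.8% and 56.2% in equally sized groups.
example = [(0.60, 0.638, 5), (0.60, 0.562, 5)]
print(round(rms_error(example), 3))
```

An RMS error of 0.038 on a 0-to-1 probability scale corresponds to the 3.8 percentage points quoted above.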
Accuracy: Reliability Diagram
[Figure: reliability diagram; axes: Predicted Path Confidence in Percent (x) and Observed Path Confidence in Percent (f)]
Applications: Pipeline Gating
[Figure: axes Redn in Badpath Instructions Exec (pct) and Performance Loss (pct); curves JRS 3, JRS 7, JRS 11, JRS 15; marked points GateCount = 1, GateCount = 2, GateCount = 10]
Applications: Pipeline Gating
[Figure: axes Redn in Badpath Instructions Exec (pct) and Performance Loss (pct); curves PaCo, JRS 3, JRS 7, JRS 11, JRS 15; annotation: 0.1% perf loss, 32% redn. in badpath instructions]
Applications: SMT Fetch Prioritization
[Figure: IPC (harmonic mean) for jrs3, jrs7, jrs11, jrs15, paco]
Conclusions
Threshold-and-count predictors are inaccurate
PaCo: directly produces goodpath probability
Uses modest h/w by using logarithms
Highly accurate: low RMS error (3.8%)
PaCo does very well in Pipeline Gating and SMT Fetch Prioritization
Questions?
Backup 1: Comparison with WPUP
WPUP: perfect fetch gating improves average performance by 2.3% (excl. mcf and parser)
Badpath instructions
Good: prefetching (useful: small ROBs, wide machines)
Bad: BTB/cache pollution (problem: smaller BTBs/caches)
Prefetching much less useful with a 512-entry ROB
Parameter        WPUP              PaCo
Machine Width    8                 4
ROB Size         128               256
Mem. Latency     300 cycles        100 cycles
BTB Size         4K entries        2K entries
Caches           64K L1, 1MB L2    32K L1, 512KB L2
Pipeline Parameters: Fetch Gating
Pipeline Parameters: SMT