SableSpMT: A Software Framework for Analysing Speculative Multithreading in Java

Christopher J. F. Pickett and Clark Verbrugge
School of Computer Science, McGill University
Montreal, Quebec, Canada H3A 2A7
{cpicke,clump}@sable.mcgill.ca
September 6th, 2005
Outline
1. Introduction
2. Framework
3. Experimental Analysis
4. Conclusions & Future Work
Motivation
Speculative Multithreading (SpMT) is a dynamic parallelisation technique that shows good potential speedup.
Current status: SpMT hardware does not exist, and software SpMT has focused on loops in numeric programs.
  How do we know what features to incorporate?
  Can generic SpMT be done entirely in software?
  Is it really worth building this hardware?
Many different studies, with many variables:
  Source language, thread partitioning scheme, compiler framework, hardware simulator, simulation parameters, software architecture.
Difficult to analyse and compare proposals.
Contributions
SableSpMT: software SpMT implementation in a JVM
  Runs on real multiprocessors
  Suitable as an analysis framework
  First complete work of this kind; handles SPECjvm98 at size 100 (S100)
Provide several debugging and analysis features.
Demonstrate exploitation of static and dynamic info.
Runtime evaluation:
  Overhead costs
  Two parallelism metrics
  Performance
Speculative Method Level Parallelism (SMLP)
SableSpMT Execution Environment
SpMT Execution Components
Numerous software SpMT components needed:
  Dependence buffer
  Stack buffer
  Return value predictors
  Helper threads
  Priority queue
  Modified bytecodes
Interaction with existing VM services:
  Class loading
  Object allocation
  Garbage collection
  Exception handling
  Native methods
  Synchronization
  Java memory model
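The slides name the dependence buffer but do not show how it works. The following is a minimal sketch of one common design, under stated assumptions: the heap is modelled as a map from location names to ints, a speculative child records the value it first read from each location, and at join time the reads are re-checked before the buffered writes are committed. The class name DependenceBuffer and all of its methods are hypothetical and are not taken from SableSpMT.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy dependence buffer isolating one speculative child's heap accesses (hypothetical API). */
class DependenceBuffer {
    private final Map<String, Integer> heap;                      // shared, non-speculative state
    private final Map<String, Integer> reads = new HashMap<>();   // first value seen per location
    private final Map<String, Integer> writes = new HashMap<>();  // buffered speculative stores

    DependenceBuffer(Map<String, Integer> heap) {
        this.heap = heap;
    }

    /** Speculative load: prefer our own buffered store, else the value we already read, else the heap. */
    int load(String location) {
        Integer own = writes.get(location);
        if (own != null) {
            return own;
        }
        Integer seen = reads.get(location);
        if (seen != null) {
            return seen;
        }
        int value = heap.getOrDefault(location, 0);
        reads.put(location, value);
        return value;
    }

    /** Speculative store: never touches the shared heap directly. */
    void store(String location, int value) {
        writes.put(location, value);
    }

    /** True if every value the child read is still what the heap holds. */
    boolean validate() {
        for (Map.Entry<String, Integer> e : reads.entrySet()) {
            int current = heap.getOrDefault(e.getKey(), 0);
            if (current != e.getValue()) {
                return false;
            }
        }
        return true;
    }

    /** Join: commit buffered writes if validation passes, otherwise discard them (abort). */
    boolean join() {
        boolean ok = validate();
        if (ok) {
            heap.putAll(writes);
        }
        reads.clear();
        writes.clear();
        return ok;
    }
}
```

In this sketch a child that read a location the parent later overwrote fails validation and leaves the heap untouched, which is the property any such buffer has to provide.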
Multithreaded Mode
Single-threaded Simulation Mode
Example Component Analysis: RVP
Framework components:
  Analyse individually and in detail
  Instrument and extend to accommodate new analyses
Return value prediction (RVP) is critical for SMLP.
We implemented software versions of many hardware predictors.
  Existing stride, context predictors in hybrid: 72% accuracy
  New memoization predictor added to hybrid: 81% accuracy
Many RVP configuration properties can be varied: e.g. per-callsite (min, max) hashtable sizes, load factors, enabled predictors.
Easy to introduce new predictors.
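To make the idea of a hybrid predictor concrete, here is a small sketch of a per-callsite hybrid that combines a stride predictor with a memoization predictor. It is an illustration only: the interface ReturnValuePredictor, the three classes, and the selection policy (trust the sub-predictor with the most correct predictions so far) are assumptions, not SableSpMT's actual design, and the context predictor is omitted for brevity.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Common interface for the toy predictors (hypothetical, not SableSpMT's API). */
interface ReturnValuePredictor {
    int predict(List<Integer> args);
    void update(List<Integer> args, int actual);
}

/** Stride predictor: last return value plus the difference between the last two. */
class StridePredictor implements ReturnValuePredictor {
    private int last;
    private int stride;

    public int predict(List<Integer> args) {
        return last + stride;
    }

    public void update(List<Integer> args, int actual) {
        stride = actual - last;
        last = actual;
    }
}

/** Memoization predictor: table lookup keyed on the argument values (keys must not be mutated). */
class MemoizationPredictor implements ReturnValuePredictor {
    private final Map<List<Integer>, Integer> table = new HashMap<>();

    public int predict(List<Integer> args) {
        return table.getOrDefault(args, 0);
    }

    public void update(List<Integer> args, int actual) {
        table.put(args, actual);
    }
}

/** Hybrid: uses the sub-predictor that has been correct most often so far (assumed policy). */
class HybridPredictor implements ReturnValuePredictor {
    private final ReturnValuePredictor[] subs;
    private final int[] hits;

    HybridPredictor(ReturnValuePredictor... subs) {
        this.subs = subs;
        this.hits = new int[subs.length];
    }

    public int predict(List<Integer> args) {
        int best = 0;
        for (int i = 1; i < subs.length; i++) {
            if (hits[i] > hits[best]) {
                best = i;
            }
        }
        return subs[best].predict(args);
    }

    public void update(List<Integer> args, int actual) {
        for (int i = 0; i < subs.length; i++) {
            // Score each sub-predictor on what it would have said, then train it.
            if (subs[i].predict(args) == actual) {
                hits[i]++;
            }
            subs[i].update(args, actual);
        }
    }
}
```

Adding a new predictor in this sketch means implementing the two-method interface and passing an instance to the HybridPredictor constructor, which mirrors the "easy to introduce new predictors" point in spirit.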
Example Component Analysis: RVP
Two neat analysis results:
1. Context and memoization predictors behave quite differently, but the hybrid allows them to complement each other.
2. Memory requirements of table-based predictors:
     Large context table: callsite produces highly variable data
     Large memoization table: callsite consumes highly variable data
Finally, runtime profiling is used to improve accuracy and reduce memory requirements.
C. J. F. Pickett and C. Verbrugge. Return value prediction in a Java virtual machine. Second Value-Prediction and Value-Based Optimization Workshop (VPW2) at ASPLOS XI, Boston, MA, Oct. 2004.
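The memory observation above can be illustrated with a toy context predictor. The sketch assumes an order-k design in which the table is keyed on the last k return values seen at a callsite, so the table grows with the number of distinct return-value histories the callsite produces. A memoization table keyed on arguments would grow in the same way with the variety of values the callsite consumes. The class name ContextPredictor and its methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy order-k context predictor for a single callsite (hypothetical, for illustration). */
class ContextPredictor {
    private final int order;
    private final Deque<Integer> history = new ArrayDeque<>();           // last `order` return values
    private final Map<List<Integer>, Integer> table = new HashMap<>();  // history -> value that followed

    ContextPredictor(int order) {
        this.order = order;
    }

    /** Predicts the value that followed the current history last time, or 0 if unseen. */
    int predict() {
        return table.getOrDefault(new ArrayList<>(history), 0);
    }

    /** Records the actual return value under the current history, then slides the window. */
    void update(int actual) {
        if (history.size() == order) {
            table.put(new ArrayList<>(history), actual);
            history.removeFirst();
        }
        history.addLast(actual);
    }

    /** Number of table entries, a rough proxy for memory use. */
    int tableSize() {
        return table.size();
    }
}
```

A callsite whose return values cycle through a short pattern keeps this table small, while one that returns a fresh value on every call adds an entry per call.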
Static Analysis Integration
Return Value Use (RVU):
             unconsumed   inaccurate
  static         10%          21%
  dynamic         3%          14%
  predictor accuracy: gain up to 7%
  predictor memory: save 3%
Parameter Dependence (PD):
             zero dependences   partial dependences
  static           25%                 23%
  dynamic           7%                  3%
  memoization accuracy: gain up to 13%
  predictor memory: save 2%
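One plausible way a parameter dependence result could feed the memoization predictor is to build the table key only from the parameters the return value depends on. The sketch below assumes the analysis yields a per-callsite boolean mask; the class MaskedMemoTable, the mask representation, and the method names are hypothetical and are not described in the slides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Memoization table whose key ignores parameters flagged as irrelevant (hypothetical design). */
class MaskedMemoTable {
    private final boolean[] dependsOn;  // dependsOn[i] is true if parameter i can affect the return value
    private final Map<List<Integer>, Integer> table = new HashMap<>();

    MaskedMemoTable(boolean... dependsOn) {
        this.dependsOn = dependsOn.clone();
    }

    /** Builds the lookup key from the relevant arguments only. */
    private List<Integer> key(int... args) {
        List<Integer> k = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (dependsOn[i]) {
                k.add(args[i]);
            }
        }
        return k;
    }

    /** Returns the remembered result for these arguments, or null if none is stored. */
    Integer lookup(int... args) {
        return table.get(key(args));
    }

    /** Remembers the result observed for these arguments. */
    void record(int result, int... args) {
        table.put(key(args), result);
    }

    int size() {
        return table.size();
    }
}
```

For a method whose result depends on its first parameter only, a mask of (true, false) lets ten calls that differ only in the second argument share one entry, and a call with an unseen second argument still hits. That is the kind of accuracy gain and memory saving the PD figures point to.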
Overall System Behaviour
Speculation overhead:
Non-speculative Thread Overhead
parent execution        comp    db  jack javac  jess  mpeg  mtrt    rt
USEFUL WORK              39%   24%   29%   30%   21%   59%   49%   58%
  initialize child        2%    5%    3%    4%    4%    2%    1%    2%
  enqueue child           4%   10%   10%    9%    7%    3%    2%    2%
TOTAL FORK                6%   15%   13%   13%   11%    5%    3%    4%
  update predictor        7%   13%   12%   11%   12%    6%    7%    7%
  delete child            5%    5%    5%    4%    5%    2%    2%    2%
  signal and wait        15%   14%   11%   11%   19%    8%   26%   11%
  validate prediction     4%    4%    4%    5%    7%    3%    2%    3%
  validate buffer         4%    6%    6%    5%    5%    3%    1%    2%
  commit child            5%    5%    7%    6%    6%    3%    2%    3%
  abort child            <1%   <1%   <1%   <1%   <1%   <1%   <1%   <1%
  cleanup child          <1%   <1%   <1%   <1%   <1%   <1%   <1%   <1%
  profiling              11%   10%   10%   12%   11%    7%    5%    6%
TOTAL JOIN               53%   59%   57%   56%   67%   34%   47%   36%
PROFILING                 2%    2%    1%    1%    1%    2%    1%    2%
Speculative Thread Overhead
helper execution        comp    db  jack javac  jess  mpeg  mtrt    rt
IDLE                     86%   82%   78%   78%   78%   55%   53%   71%
INITIALIZE CHILD          3%    4%    4%    4%    4%    2%    5%    4%
  startup                <1%   <1%   <1%   <1%   <1%   <1%    1%   <1%
  query predictor         3%    5%    4%    4%    6%    5%   15%    8%
  useful work             5%    6%   10%   10%   10%   34%   20%   13%
  shutdown               <1%   <1%   <1%   <1%   <1%   <1%   <1%   <1%
  profiling              <1%   <1%   <1%   <1%   <1%    1%    2%    1%
EXECUTE CHILD             9%   12%   16%   16%   17%   41%   40%   24%
CLEANUP CHILD            <1%   <1%   <1%   <1%   <1%   <1%   <1%   <1%
PROFILING                 1%    1%    1%    1%   <1%    1%    1%   <1%
Parallelism Metrics
Speculative thread lengths:
  In hardware simulations, max 40 machine instructions is great
  In software, we can get 100s of bytecode instructions
                      <10 bytecodes   >90 bytecodes
                        committed       committed
  ST mode children         30%             15%
  MT mode children         80%              2%
Speculative coverage:
  Percentage of entire program executed successfully in parallel.
  4 processors, MT mode, -RVP: 19%
  4 processors, MT mode, +RVP: 33%
Execution Times and Relative Speedup
experiment           comp     db   jack  javac   jess   mpeg   mtrt     rt   mean
SpMT must fail      1297s   931s   293s   641s   665s   669s  1017s  1530s   722s
SpMT may pass       1224s   733s   211s   468s   405s   662s   559s   736s   539s
relative speedup    1.06x  1.27x  1.39x  1.37x  1.64x  1.01x  1.82x  2.08x  1.34x
vanilla SableVM      368s   144s    43s   108s    77s   347s    55s    67s   120s
actual slowdown     3.33x  5.09x  4.91x  4.33x  5.26x  1.91x 10.16x 10.99x  4.49x
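The two ratio rows are consistent with a simple reading of the table: relative speedup is the "SpMT must fail" time divided by the "SpMT may pass" time, and actual slowdown is the "SpMT may pass" time divided by the vanilla SableVM time. The sketch below recomputes them from the per-benchmark times. The class name SpeedupRatios is made up for this example, and the mean column is left out because the slides do not say which kind of mean was used.

```java
import java.util.Locale;

/** Recomputes the ratio rows of the execution time table from its time rows. */
class SpeedupRatios {
    static final String[] BENCHMARKS = {"comp", "db", "jack", "javac", "jess", "mpeg", "mtrt", "rt"};
    static final int[] MUST_FAIL = {1297, 931, 293, 641, 665, 669, 1017, 1530};  // seconds
    static final int[] MAY_PASS  = {1224, 733, 211, 468, 405, 662, 559, 736};    // seconds
    static final int[] VANILLA   = {368, 144, 43, 108, 77, 347, 55, 67};         // seconds

    /** Ratio of a slower time to a faster time, with two decimals and an "x" suffix. */
    static String ratio(double slowerSeconds, double fasterSeconds) {
        return String.format(Locale.ROOT, "%.2fx", slowerSeconds / fasterSeconds);
    }

    public static void main(String[] args) {
        for (int i = 0; i < BENCHMARKS.length; i++) {
            System.out.printf("%-6s relative speedup %s, actual slowdown %s%n",
                    BENCHMARKS[i],
                    ratio(MUST_FAIL[i], MAY_PASS[i]),
                    ratio(MAY_PASS[i], VANILLA[i]));
        }
    }
}
```

Read this way, the relative speedup isolates the benefit of successful speculation from the framework's overhead, while the actual slowdown shows that the overhead still dominates compared with the unmodified VM.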
Conclusions & Future Work
Conclusions:
  New software SpMT framework for Java
  Facilitates experimental analysis, profiling, and development of new techniques
Future Work:
  Different speculation modes: loop, lock, basic block
  Move speculation components into language-independent library
  Performance improvements, actual speedup
  IBM Testarossa JIT and J9 VM integration