Recent PostgreSQL Optimizer Improvements (in 7.4)

arizonahoopleΔιαχείριση Δεδομένων

28 Νοε 2012 (πριν από 4 χρόνια και 10 μήνες)

240 εμφανίσεις

RecentPostgreSQLOptimizerImprovements1
RecentPostgreSQLOptimizerImprovements
TomLane
PostgreSQL-RedHatEditionGroup
RedHat,Inc.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements2
Outline
Iplantodividethistalkintotworoughlyequalparts:
1.What’saqueryoptimizer,anyway?

Therearemultiplepossiblequeryplans

Weneedtodeviseagoodplan,asquicklyaspossible
2.What’snewinPostgreSQL7.4?

Severalimportantplannerimprovementsarecoming

Hashingisusedmuchmorethanbefore

WHEREfooIN(SELECT...)
ismuchsmarterthanbefore
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements3
Stepsofplanneroperation
QuerytransformationTrytotransformthequeryintoanequivalent,
butmore-efficient-to-execute,query.
QueryanalysisDeriveinformationaboutthequery.
PlangenerationSystematicallyconsideralternativequeryplans,
estimateexecutioncostsofeachone,choosetheonewithlowest
estimatedcost.
Thereissomeoverlapbetweenquerytransformationandqueryanalysis,
butit’susefultothinkofthemasseparateoperations.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements4
Querytransformation:flatteningsub-SELECTs
Thefirstmajorstepinquerytransformationistopullupsub-SELECTsin
theFROMclauseintothemainquery.Thesub-SELECTscouldhavebeen
writtenbytheuser,orcouldhavebeenexpandedfromviewreferences.
Forexample,supposewehavetheoriginalquery
SELECT...FROMaview,mytable
WHEREaview.key=mytable.ref;
and
aview
isaviewdefinedlikethis:
CREATEVIEWaviewAS
SELECTbasetable.key,...
FROMbasetable,anothertable
WHEREbasetable.key=anothertable.aref;
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements5
Querytransformation:flatteningsub-SELECTs
Viewexpansionwouldtransformthequerytothis:
SELECT...FROM
(
SELECT...FROMbasetable,anothertable
WHEREbasetable.key=anothertable.aref
)ASaview,
mytable
WHEREaview.key=mytable.ref;
Nowtheplannercanflattenthisinto:
SELECT...FROM
basetable,anothertable,mytable
WHEREbasetable.key=anothertable.arefAND
basetable.key=mytable.ref;
Nowwehavetheabilitytoconsiderdifferentjoinorders,whichthe
sub-SELECTformwouldnotletusdo.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements6
Querytransformation:expandfunctionsin-line
SimpleSQLfunctions(thosethatjustcomputeasingleexpression’s
result)areexpandedin-line,likemacros.
Forexample,thebuilt-infunction
log(numeric)
isdefinedas
CREATEFUNCTIONlog(numeric)RETURNSnumericAS
’SELECTlog(10,$1)’LANGUAGESQL;
so
SELECTlog(numfld)FROMfoo;
becomes
SELECTlog(10,numfld)FROMfoo;
thusavoidingoneleveloffunctioncallatruntime.
Thisexpansionisnewin7.4.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements7
Querytransformation:simplifyconstantexpressions
Forexample,
2+2
isreducedto
4.
Thisismoreusefulthanitmightlook.Althoughperhapsauserwouldn’t
writeaquerycontainingobviousconstantsubexpressions,theprevious
transformationsoftenexposeopportunitiesforexpressionsimplification.
Evaluatingasubexpressiononceduringplanningbeatsdoingitoncefor
eachrowofthequery.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements8
Queryanalysis
Inqueryanalysisoperations,webegintodisassemblethequeryintothe
piecesthatthemainplanningstepswilldealwith.
Oneinterestingstepexaminestheequalityconstraintsenforcedbythe
query’sWHEREclause.Thisisusefulprimarilybecauseitletstheplanner
makeinferencesaboutsortorder:ifwehave
WHEREa=b,thenjoin
outputthatissortedby
a
isalsosortedby
b.Knowingthismightletus
saveasortstep.
Avaluablebyproductofdeterminingtheequalitypropertiesisthatwecan
deduceimpliedequalities.Considerourpreviousexample:
SELECT...FROM
basetable,anothertable,mytable
WHEREbasetable.key=anothertable.arefAND
basetable.key=mytable.ref;
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements9
Queryanalysis:deducingimpliedequality
Wecandeducethat
aref
and
ref
mustbeequal,sincethey’rebothequal
to
key.Sothequeryiseffectivelytransformedto
SELECT...FROM
basetable,anothertable,mytable
WHEREbasetable.key=anothertable.arefAND
basetable.key=mytable.ref
AND
anothertable.aref=mytable.ref
;
Thisopensthepossibilityofjoining
anothertable
to
mytable
first,
whichcouldbeawinifthey’rebothsmallwhile
basetable
islarge.
Althoughthisappearstoaddaredundantcomparisontothequery,
whichevercomparisonendsupbeingthethirdtobecheckedwillbe
discarded;sothere’snoruntimepenalty.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements10
Queryanalysis:evenmoreimpliedequality
In7.3andearlier,equalitydeductiononlyhappenedforequalityclauses
relatingsimplevariables,butin7.4ithappensforequalityclausesrelating
anyexpressions.
1
Forexample,consider
SELECT*FROMaviewWHEREkey=42;
Afterviewexpansionandflattening,thisbecomes
SELECT...FROMbasetable,anothertable
WHEREbasetable.key=anothertable.arefAND
basetable.key=42;
1
Exceptthosecontainingvolatilefunctions.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements11
Queryanalysis:evenmoreimpliedequality
Impliedequalitydeductionwill(in7.4)expandthisto:
SELECT...FROMbasetable,anothertable
WHEREbasetable.key=anothertable.arefAND
basetable.key=42
AND
anothertable.aref=42
;
andnowthe
key=aref
comparisonisfoundtoberedundant,leaving:
SELECT...FROMbasetable,anothertable
WHEREbasetable.key=42AND
anothertable.aref=42;
Inthisformthequeryistrivialtobuildanefficientplanfor.Notethatthe
usercouldnothaveexpressedthequeryinthiswaytobeginwith;
aview
mightnothaveexposed
anothertable.aref
tohimatall.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements12
Queryanalysis:distributionofWHEREclauses
NextweattachWHEREclausestothelowestlevelatwhichtheycanbe
evaluated.Aclausementioningjustasingletable(a.x=42)isattached
tothattable.Itwillgetusedwhenwepreparescanplansforthattable.
Joinclauses(a.x=b.y)areattachedtodatastructuresrepresentingthe
setsoftablestheymention;they’llbeusedwhenweformajoinincluding
allthetablesmentioned.
IfwehaveaWHEREclausethatmentionsonlyoutputsofasub-SELECT
(onethat’snotbeenflattened),wetrytopushitdowninsidethe
sub-SELECT.Therearesemanticrestrictionsonwhetherwecandothis
withoutchangingtheresults,though.7.4analyzesthesituationmore
carefullythanpriorreleasesandisabletopushdownsuchclausesin
morecases.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements13
Themainplanningprocess
Afterallthepreparatoryworkisdone,wegeneratepossibleplansand
estimatecosts.
First,generatepossiblescanplansforeachindividualtable:simple
sequentialscanisalwayspossible,aswellasindexscansusingeachof
theindexesthatarerelevanttothequery.Estimateexecutioncostfor
each.Ifthere’sonlyonetable,we’redone—justtakethecheapestplan.
Ifthere’smorethanonetable,generatejoinplansforeachpairoftables.
Wecandosimplenestloop,nestloopwithinnerindexscan,mergejoin,or
hashjoin.Mergejoinneedssortedinputs,sotherearemultiple
possibilitiesdependingonwhetherweperformexplicitsortstepsoruse
presortedinputs(forinstance,theresultofanindexscanisalready
sorted,ifit’sabtreeindex).
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements14
Typesofjoinplans
NestedLoop
aag
Table 2
aagaaraai
aayaagaasaaraaa
aay
aai
Table 1
NestedloopwithinnerindexscanusesanindexonTable2tofetchjust
therowsmatchingthecurrentTable1row.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements15
Typesofjoinplans
MergeJoin
Sorted
Sorted
Table 2
aabaacaad
aaaaabaabaafaaf
aacaae
Table 1
aaa
HashJoin
Hashed
Table 1
aar
aak
aayaaraamaaoaaw
Table 2
aayaag
aas
aak
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements16
Themainplanningprocess,continued
Next(ifthere’satleastthreetables)lookforwaystojoinpairsoftables
toindividualtablestomakethree-wayjoinedrelations,usingthecheapest
pairwisejoinswefoundbefore.
Forexample,given
SELECT...FROMtab1,tab2,tab3
WHEREtab1.col1=tab2.col2ANDtab1.col3=tab3.col4;
thefirstpassgeneratessequentialandindexscanplansforeachof
tab1,
tab2,
tab3
separately.Thesecondpassgeneratesplansthatjoin
tab1
and
tab2,aswellasplansthatjoin
tab1
and
tab3.(Wecouldalso
considerplansthatjoin
tab2
and
tab3,butthisavenueisrejected
becausethereisnoWHEREclausethatconstrainsthatjoin,sowe’dhave
togeneratetheentirecross-productofthosetwotables.)Finally,thethird
passgeneratesplansthatcombine
tab3
withthejoinof
tab1
and
tab2,
andalsoplansthatcombine
tab2
withthejoinof
tab1
and
tab3.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements17
Themainplanningprocess,continued
Anyofthesesequencesmightbethecheapestwaytocomputetheresult,
dependingonfactorssuchasthesizesofthetables,theconstraints
specifiedinthequery’sWHEREclause,andtheavailableindexes.
Ifwehavemoretables,wenextlookforwaystomakefour-wayjoins,
five-wayjoins,etc,untilwehavejoinedallthetablesinthequery.Each
passusestheresultsofpriorpasses.
Oncewehavethebestplansforthecompletejoinset,wearealmost
done.Wejustaddonanyadditionalstepsneededforgrouping,
aggregation,orsorting(ifspecifiedbythequery),andwe’redone.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements18
Whichplansdoweneedtokeepateachlevel?
Itturnsoutthatit’snotsufficienttorememberjustthecheapestplanfor
eachsetofjoinedtables.Becausemergejoinneedssortedinput,itmaybe
bettertouseamore-expensivesubplanthatdeliversappropriatelysorted
resultsthantousethecheapestsubplanandthendoanexplicitsort.
Forexample,ifweneedtojoinonaclauselike
WHEREtab1.col1=
tab2.col2,andthereisanindexon
tab1.col1,thenthecheapestplan
forscanning
tab1
isasequentialscan.Afullindexscanusingtheindex
on
col1
iscertainlygoingtobeslower.Buttheindexscancouldbeused
directlyasaninputformergejoin,soitmightbebettertousethatthan
todoasequentialscanandsort.
Weendupkeepingthecheapestplanforeachinterestingsortorder
(where“interesting”means“isrelatedtoacandidatemergejoinclause
oran
ORDERBY
or
GROUPBY
key”).Sousuallytherewillbeseveralplans
foragiventableorjoinsetthatsurviveforconsiderationatthenextlevel.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements19
BetterimplementationofWHEREfooIN(SELECT...)
Weusedtoimplement
IN(SELECT...)
inthecrudestwaypossible:
eachtimethevalueoftheconditionisneeded(possiblyonceforeveryrow
oftheouterquery),runthesub-SELECTtoseeifitproducesarow
matchingtheleft-handside.Thisisessentiallyanestloopjoin,andit’s
reallyslowifeithertheouterqueryorthesub-SELECThasalotofrows.
Somepeoplehand-optimize
IN
into
EXISTS:
SELECT...WHEREcolIN(SELECTsubcolFROMsubtab);
becomes
SELECT...WHERE
EXISTS(SELECT*FROMsubtabWHEREsubcol=col);
Thisstillisanestloop—butif
subtab.subcol
isindexed,youcanat
leastmakeitbeanestloopwithinnerindexscan.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements20
BetterimplementationofWHEREfooIN(SELECT...)
7.4hasseveralotherwaystodo
IN:
Special-purposehashtablecodeLoadtherowsofthesub-SELECT
intoanin-memoryhashtable,andthenprobethehashtableforeach
outerrow.
Semi-joinConvertthe
IN
toasemi-joinandapplythestandardplanning
processtodiscoverthebestjoinplan.(“Semi-join”isajointhatemits
onlythoseleft-handrowsthatjointoatleastoneright-handrow,and
onlyonceeach.)
DISTINCTfollowedbyregularjoinRuntheoutputofthesub-SELECT
throughaDISTINCTfilter(toeliminateduplicaterows)andthentreat
the
IN
likeasimpleequalityjoin.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements21
BetterIN(SELECT...):special-purposehashtable
Thenastythingabout
IN
isthattheSQLspecdictatesthatcertaincases
involvingNULLsinthesub-SELECTresultshouldcausethe
IN
toyield
“unknown”(NULL)ratherthan“false”whenit’sunabletofindmatching
rows.Anordinaryhashjoincan’thandlethis,especiallynotfora
multi-column
IN
whererowsmaybeonlypartlyNULL.
BymakingspecialentriesforrowsthatarewhollyorpartlyNULL,
wewereabletomakethespecial-purposehashtablecodeproducefully
spec-compliantresultsfor
IN.Therefore,wecanalsousethiscodefor
NOTIN
andforcaseswherethe
IN
operatorisnestedinsideother
booleanoperations.
Alimitationisthatthehashtablewillperformbadly,orevenfailoutright,
ifthesub-SELECTproducesmorerowsthanwillfitinmemory.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements22
BetterIN(SELECT...):semi-join
Inthesemi-joinapproach,weplanthequeryasifthe
IN
wereasimple
equalityjoin;thatis,
SELECT...FROMtab
WHEREfooIN(SELECTbarFROMsubtab)
isalmostthesameas
SELECT...FROMtab,subtab
WHEREfoo=bar;
Themaindifferenceisthatweonlywantoneoutputrowforeach
tab
that
hasoneormorematchesin
subtab.Wehandlethisbyapplyingaspecial
“joinrule”inthejoinplanstep.Thejoinrulerequiresthat
tab
betheouter
sideofthejoin,anditjuststopslookingforinnermatchesassoonasit
findsone.(Thisissortofthereverseofwhathappensforaleftouterjoin.)
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements23
BetterIN(SELECT...):semi-join
Whenweimplement
IN
thisway,wecannottellthedifferencebetween
caseswhere
IN
shoulddeliveraFALSEresultandcaseswhereitshould
deliveraNULLresult.Thereforethesemi-joinapproachcanonlybeused
when
IN
appearsatthetoplevelofWHERE(anditcan’tbe
NOTIN).
InthatcontextthedifferencebetweenFALSEandNULLdoesn’tmatter.
Consideringagain
SELECT...WHEREcolIN(SELECTsubcolFROMsubtab);
where
subtab.subcol
isindexed,oneofthesemi-joinpossibilitiesisto
useanestloopwithinnerindexscanon
subtab.Thus,thesystemcan
nowautomaticallydiscoverthealternativethatyouusedtohavetouse
EXISTS
togetittotakenoticeof.Notonlythat,butitcanalsoconsider
mergeorhashjoins,whichmightbeconsiderablysuperiordependingon
thenumbersofrowsinvolved.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements24
BetterIN(SELECT...):DISTINCTfollowedbyjoin
Ifweeliminateduplicaterowsfromtheresultofthesub-SELECT,thenwe
canjustjoinittotheouterquerywitharegularequalityjoin—we’ve
ensuredtherecan’tbemultipleoutputrowsforasingleouter-queryrow.
Again,thisonlyworksatthetoplevelofWHERE.Itsadvantageoverthe
semi-joinwayisthatthedistinct-ifiedsub-SELECTcanbeeithertheinner
sideortheoutersideoftheresultingjoinplan.
Consider
SELECT*FROMbig
table
WHEREkeyIN(SELECTidFROMsmall
table);
If
big
table.key
isindexed,thenit’sprobablygoingtobebesttochange
thesub-SELECTtoSELECTDISTINCTandusetheresultastheouter
sideofanestloopwithinnerindexscan.Thisistheonlyimplementation
thatcanavoidreadingalltherowsof
big
table.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements25
BetterIN(SELECT...):whichwayshouldweuse?
When
IN
appearsatthetoplevelofWHERE,the7.4plannerconvertsthe
IN
toeitherasemi-joinorDISTINCT-and-joinplan.Itusesitsnormal
approachofestimatingcoststochoosewhichplantouse.
Whenwehave
NOTIN
attoplevel,oreither
IN
or
NOTIN
belowthetop
levelofWHERE,weusethespecializedhashtablecodeifweestimate
thetablewillfitinto
SORT
MEM
kilobytesofmemory.Otherwisewefallback
totheoldnested-loopimplementation.
Thereisn’tanyneedtoconsiderthespecializedhashcodeasan
alternativeforatop-level
IN,sinceasemi-joinusingahashjoinplandoes
approximatelythesamething.Wedon’tneedtoexplicitlyconsiderfalling
backtotheoldnestloop-likeimplementationeither,sinceasemi-joinusing
anestloopjoinplanisequivalenttothat.Thesepossibilitieswillbe
automaticallyconsideredalongwiththeotheralternatives.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements26
Hash-basedgroupingandgroupedaggregation
In7.3andbefore,ifyouwrite
GROUPBYfoo,theimplementationalways
involvesscanningtheinputinorderby
foo
(viaeitheranexplicitsortor
anindexscan).Thengroupboundariesaredetectedbycomparingthe
foo
columnorcolumnsinsuccessiverows.Ifweareaggregating,werunthe
aggregateaccumulationfunctionsoneachrowasitarrives,thenemit
resultswhentheendofagroupisdetected.
In7.4,thereisanotheralternative,whichistoloadtheincomingdatainto
anin-memoryhashtableindexedbythegroupingcolumns.Afterwehave
readalltheinputdata,wetraversethehashtableandemitoneresultrow
foreachhashentry;thatis,oneresultforeachgroup.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements27
Hash-basedgroupingandgroupedaggregation
Ifweareaggregating,westillruntheaccumulationfunctionsoneachrow
asitarrives.Thehashtableentryforeachgrouphastoholdthetransient
stateforeachaggregate.Sothiswayismorememory-intensivethanthe
oldway,butaslongastherearenottoomanygroups,it’snotaproblem.
Thebigadvantageofthehash-basedapproachisthatwedon’thaveto
pre-sorttheinput.Adisadvantageisthattheoutputwillappearinno
particularorder.Soifyouaskedfor
GROUPBYfooORDERBYfoo,
wehavetodoanexplicitsortoftheoutputinstead.Thiscanstillbeabig
winthough,sincetherearelikelytobemanyfeweroutputrowsthaninput
rows.
Theplannerchoosesbetweenthesetwoapproachesinitsusualway:
estimatethecostsofbothandpickthecheaperone.Italsochecks
whetherthehashtableisexpectedtofitin
SORT
MEM
kilobytes—ifnot,
itwon’tusehashing.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements28
Mergeandhashjoinsonexpressions
Themergeandhashjoinmethodswereformerlyconsideredonlyforjoin
clausesthatequatedtwosimplevariables(a.x=b.y).Nowtheyare
consideredforjoinclausesthatequateanyexpressionsoverdifferentsets
oftables.
Forexample,
SELECT*FROMa,bWHEREa.foo=b.bar+1;
isnowacandidateformergeorhashjoining,whereasbefore7.4itwould
alwayshavebeendoneasabrute-forcenestloopjoin.
Also,ahashjoincanusemultiplehashkeysratherthanonlyone.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements29
ExplicitJOINsyntaxdecoupledfromexecutionplans
PostgressupportstheSQL92-specsyntaxforjoins:
SELECT...FROMtab1LEFTJOINtab2ON
condition
...
SELECT...FROMtab1[INNER]JOINtab2ON
condition
...
WhentherearenestedJOINconstructs,the7.3plannerdoesnotsearch
forthebestjoinorder—itonlyconsidersthejoinorderthatcorresponds
totheJOINsyntaxstructure.Thisissemanticallynecessaryformany
situationsinvolvingouterjoins,butifthejoinsareallINNERthenJOINis
reallyjustanotherwaytowriteFROMtab1,tab2...WHERE...
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements30
ExplicitJOINsyntaxdecoupledfromexecutionplans
Ontheplusside,thisbehaviorcancutplanningtimedramaticallyfor
many-tablequeries.
Ontheminusside,youmaygetaverybadplaniftheforcedjoinorderisn’t
agoodchoice.
Asof7.4,thedefaultbehaviorwillbetoconsideronlyouterJOINsas
constrainingthejoinorder.WhenyouwritenestedinnerJOINsyou’llget
asearchforthebestjoinorder.
Ifyoualreadyhaveagoodjoinorderandyouneedtocutplanningtime,
youcangettheoldbehaviorbysetting
JOIN
COLLAPSE
LIMIT
to1.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements31
Asuccessstory
Afewmonthsago,theArsDigitaguys(whonowworkatRedHat)
cametomeforhelpwithportingtheirACSsystemtoPostgreSQL.They
hadproblemswithfourqueriesthatwereusedindisplayingwebpagesin
ACS.ThesequerieseachtookseveralsecondstoexecuteinPostgreSQL,
makingtheperformanceunacceptable.Thesamequeriesranfinein
Oracle.
But...theyweretestingonPostgreSQL7.2.
Iwasabletotellthemthatalltheirproblemswerealreadysolved.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements32
Asuccessstory
Measuredruntimesfortheproblemqueries(samedata,samemachine):
PG7.2PG7.3Asof14-Mar-03
Query14560.62msec4461.84msec213.14msec
Winfrom:searchingforgoodjoinorderinsteadoffollowingJOINstructure
Query26612.94msec1.27msec1.05msec
Winfrom:pushingdownWHEREclausesintoUNION(alreadydonein7.3)
Query310375.64msec3276.02msec3.23msec
7.3winfrom:evaluatingINlastamongclausesattachedtosameplannode
7.4winfrom:convertingINtospecial-purposehashtable
Query43314.57msec2729.78msec0.54msec
Winfrom:searchingforgoodjoinorder
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements33
Asuccessstory
Mindyou,theseareprettyuglyqueries—here’sthefirstone:
select*
from(select*
fromvc
objectsjoinacs
objects
on(vc
objects.object
id=acs
objects.object
id)
joincms
items
on(vc
objects.object
id=cms
items.item
id))results
where(cms
items
ancestors<=’19056/’)
and(cms
items
ancestors=substr(’19056/’,1,
length(cms
items
ancestors)))
orderbycms
items
ancestors;
Buttheyarereal-worldexamples.
TomLaneO’ReillyOpenSourceConvention,July2003
RecentPostgreSQLOptimizerImprovements34
Summary
There’slotsofgoodnewstuffcomingin7.4.
Sinceinternalhashtablesareoptionsinmanymoreplacesthanbefore,
andeachonecanuseupto
SORT
MEM
kilobytes,it’smoreimportantthan
evertomakesureyousetthatparameterintelligently.Toosmallwill
handicapperformancebypreventinghash-basedplansfrombeing
considered,buttoolargecandriveyoursystemintoswaphell.
Ifyou’vehand-optimized
IN
queriesinto
EXISTS
queries,youmightwant
tothinkaboutswitchingback.Under7.4using
EXISTS
couldbea
de-optimization,becauseitconstrainstheplannertouseonlyoneofthe
typesofplansitwouldconsiderifyouused
IN.
TomLaneO’ReillyOpenSourceConvention,July2003