Perl Regular Expression Quick Reference Card
Revision 0.1 (draft) for Perl 5.8.5
Iain Truskett (formatting by Andrew Ford), refcards.com™
This is a quick reference to Perl’s regular expressions. For full
information see the perlre and perlop manual pages.
Operators
=~  determines to which variable the regex is applied. In its
    absence, $_ is used.
        $var =~ /foo/;
!~  determines to which variable the regex is applied, and negates
    the result of the match; it returns false if the match succeeds,
    and true if it fails.
        $var !~ /foo/;
m/pattern/igmsoxc
    searches a string for a pattern match, applying the given
    options.
        i  case-insensitive
        g  global - all occurrences
        m  multiline mode - ^ and $ match internal lines
        s  match as a single line - . matches \n
        o  compile pattern once
        x  extended legibility - free whitespace and comments
        c  don’t reset pos on failed matches when using /g
    If pattern is an empty string, the last successfully matched
    regex is used. Delimiters other than ‘/’ may be used for both
    this operator and the following ones.
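The match modifiers are easiest to compare side by side. The sketch below exercises /i, /g, /m, /s and /x, plus an alternate delimiter, on one sample string; the string and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $text = "Alpha beta\nALPHA gamma\n";

# /i ignores case; /g in list context returns every match.
my @hits = $text =~ /alpha/ig;

# /m lets ^ match at the start of each internal line.
my @line_starts = $text =~ /^(\w+)/mg;

# Without /s the dot stops at a newline; with /s it matches it.
my $dot_plain = ($text =~ /beta.ALPHA/)  ? 1 : 0;
my $dot_s     = ($text =~ /beta.ALPHA/s) ? 1 : 0;

# /x permits whitespace and comments; m{...} is an alternate delimiter.
my ($word) = $text =~ m{
    (\w+)    # the word we want
    \s+      # whitespace between the words
    gamma    # a literal that must follow
}x;

print "hits=@hits starts=@line_starts plain=$dot_plain s=$dot_s word=$word\n";
```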
qr/pattern/imsox
    lets you store a regex in a variable, or pass one around.
    Modifiers as for m//, and are stored within the regex.
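To illustrate storing and passing a compiled regex, here is a short sketch using qr//. The helper count_matches and the sample data are hypothetical names introduced only for this example.

```perl
use strict;
use warnings;

# The /i modifier is stored inside the compiled regex object.
my $word = qr/hello/i;

# A qr// object can be used directly on the right of =~ ...
my $direct = ("HELLO world" =~ $word) ? 1 : 0;

# ... or interpolated into a larger pattern, keeping its own modifiers.
my $embedded = ("say Hello twice" =~ /\Asay $word twice\z/) ? 1 : 0;

# Hypothetical helper: counts how many items match a regex passed in.
sub count_matches {
    my ($re, @items) = @_;
    return scalar grep { $_ =~ $re } @items;
}

my $count = count_matches($word, "hello", "HeLLo", "help");

print "direct=$direct embedded=$embedded count=$count\n";
```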
s/pattern/replacement/igmsoxe
    substitutes matches of pattern with replacement. Modifiers
    as for m// with one addition:
        e  evaluate replacement as an expression
    ‘e’ may be specified multiple times. replacement is
    interpreted as a double quoted string unless a single-quote (’)
    is the delimiter.
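The following sketch contrasts three forms of substitution: /e evaluating the replacement as code, ordinary double-quote interpolation, and single-quote delimiters suppressing interpolation. All strings and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# /e evaluates the replacement as Perl code; /g repeats for every match.
(my $doubled = "cost: 20 and 22") =~ s/(\d+)/$1 * 2/ge;

# By default the replacement interpolates like a double quoted string.
my $name = "world";
(my $greet = "hello NAME") =~ s/NAME/$name/;

# With single quotes as the delimiter, nothing is interpolated.
(my $literal = "hello NAME") =~ s'NAME'$name';

print "$doubled | $greet | $literal\n";
```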
?pattern?
    is like m/pattern/ but matches only once. No alternate
    delimiters can be used. Must be reset with reset.
Syntax
\           Escapes the character immediately following it
.           Matches any single character except a newline
            (unless /s is used)
^           Matches at the beginning of the string (or line, if /m
            is used)
$           Matches at the end of the string (or line, if /m is used)
*           Matches the preceding element 0 or more times
+           Matches the preceding element 1 or more times
?           Matches the preceding element 0 or 1 times
{...}       Specifies a range of occurrences for the element
            preceding it
[...]       Matches any one of the characters contained within
            the brackets
(...)       Groups subexpressions for capturing to $1, $2...
(?:...)     Groups subexpressions without capturing (cluster)
|           Matches either the subexpression preceding or
            following it
\1, \2...   The text from the Nth group
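A brief sketch of capturing groups, non-capturing clusters with alternation, and a backreference may help tie the syntax together. The input strings are made up for the example.

```perl
use strict;
use warnings;

# (...) captures; {n} repeats the preceding element exactly n times.
my ($year, $month) = "released 2005-07" =~ /(\d{4})-(\d{2})/;

# (?:...) groups the alternation without creating a capture.
my $is_pet = ("my cat" =~ /\b(?:cat|dog)\b/) ? 1 : 0;

# \1 matches the same text that the first group captured.
my ($dup) = "it is is fine" =~ /\b(\w+) \1\b/;

print "year=$year month=$month is_pet=$is_pet dup=$dup\n";
```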
Escape sequences
These work as in normal strings.
\a          Alarm (beep)
\e          Escape
\f          Formfeed
\n          Newline
\r          Carriage return
\t          Tab
\037        Any octal ASCII value
\x7f        Any hexadecimal ASCII value
\x{263a}    A wide hexadecimal value
\cx         Control-x
\N{name}    A named character
\l          Lowercase next character
\u          Titlecase next character
\L          Lowercase until \E
\U          Uppercase until \E
\Q          Disable pattern metacharacters until \E
\E          End case modification
This one works differently from normal strings:
\b          An assertion, not backspace, except in a character
            class
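The sketch below shows why \Q...\E matters when interpolating untrusted text, how the case-modifying escapes combine, and the two meanings of \b. The variable names and inputs are assumptions for illustration only.

```perl
use strict;
use warnings;

my $input = "a.b*c";    # contains the metacharacters . and *

# Interpolated raw, the metacharacters are live and over-match.
my $raw_hit = ("aXbbbc" =~ /$input/) ? 1 : 0;

# \Q...\E quotes them, so only the literal text matches.
my $quoted_miss = ("aXbbbc"    =~ /\Q$input\E/) ? 1 : 0;
my $quoted_hit  = ("xxa.b*cxx" =~ /\Q$input\E/) ? 1 : 0;

# \u titlecases the next character; \L lowercases until \E.
my $title = "\u\LpERL\E rocks";

# In a pattern \b is a word boundary; inside [...] it is a backspace.
my $boundary  = ("a cat sat" =~ /\bcat\b/) ? 1 : 0;
my $backspace = ("x\by" =~ /x[\b]y/) ? 1 : 0;

print "$raw_hit $quoted_miss $quoted_hit $title $boundary $backspace\n";
```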
Character classes
[amy]       Match ‘a’, ‘m’ or ‘y’
[f-j]       Dash specifies range
[f-j-]      Dash escaped or at start or end means ‘dash’
[^f-j]      Caret indicates “match any character except these”
The following sequences work within or without a character class.
The first six are locale aware, all are Unicode aware. The default
character class equivalents are given. See the perllocale and
perlunicode man pages for details.
\d          A digit                        [0-9]
\D          A nondigit                     [^0-9]
\w          A word character               [a-zA-Z0-9_]
\W          A non-word character           [^a-zA-Z0-9_]
\s          A whitespace character         [ \t\n\r\f]
\S          A non-whitespace character     [^ \t\n\r\f]
\C          Match a byte (with Unicode, ‘.’ matches a character)
\pP         Match P-named (Unicode) property
\p{...}     Match Unicode property with long name
\PP         Match non-P
\P{...}     Match lack of Unicode property with long name
\X          Match extended Unicode sequence
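As a small illustration of the shorthand classes and of bracketed classes with a negating caret or a literal dash, consider the sketch below. The sample strings are invented for the example.

```perl
use strict;
use warnings;

my $s = "Room 42-B, floor_3";

my @numbers = $s =~ /(\d+)/g;    # runs of digits
my @words   = $s =~ /(\w+)/g;    # runs of word characters (includes _)
my @chunks  = $s =~ /(\S+)/g;    # runs of non-whitespace

# A dash at the end of a class is a literal dash, not a range.
my @codes = $s =~ /([0-9B-]+)/g;

# A leading caret negates the class.
my @consonants = "banana" =~ /([^aeiou])/g;

print join("|", @numbers), "\n";
print join("|", @words), "\n";
print join("|", @chunks), "\n";
print join("|", @codes), "\n";
print join("|", @consonants), "\n";
```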
POSIX character classes and their Unicode and Perl equivalents:
alnum    IsAlnum       Alphanumeric
alpha    IsAlpha       Alphabetic
ascii    IsASCII       Any ASCII char
blank    IsSpace       Horizontal whitespace (GNU extension)        [ \t]
cntrl    IsCntrl       Control characters
digit    IsDigit       Digits                                       \d
graph    IsGraph       Alphanumeric and punctuation
lower    IsLower       Lowercase chars (locale and Unicode aware)
print    IsPrint       Alphanumeric, punct, and space
punct    IsPunct       Punctuation
space    IsSpace       Whitespace                                   [\s\ck]
         IsSpacePerl   Perl’s whitespace definition                 \s
upper    IsUpper       Uppercase chars (locale and Unicode aware)
word     IsWord        Alphanumeric plus _ (Perl extension)         \w
xdigit   IsXDigit      Hexadecimal digit                            [0-9A-Fa-f]
Within a character class:
POSIX        traditional   Unicode
[:digit:]    \d            \p{IsDigit}
[:^digit:]   \D            \P{IsDigit}
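The three notations can be checked against each other with a short sketch. Note that the POSIX form is written inside an enclosing bracketed class, e.g. [[:digit:]]. The sample string is an assumption, and the equivalence shown holds for this ASCII input.

```perl
use strict;
use warnings;

my $id = "a1b22";

# POSIX classes appear inside a bracketed class: [[:digit:]].
my @posix   = $id =~ /([[:digit:]]+)/g;
my @negated = $id =~ /([[:^digit:]]+)/g;

# Traditional and Unicode-property spellings of the same idea.
my @trad    = $id =~ /(\d+)/g;
my @unicode = $id =~ /(\p{IsDigit}+)/g;

print "posix=@posix negated=@negated trad=@trad unicode=@unicode\n";
```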
Anchors
All are zero-width assertions.
^           Match string start (or line, if /m is used)
$           Match string end (or line, if /m is used) or before
            newline
\b          Match word boundary (between \w and \W)
\B          Match except at word boundary (between \w and \w
            or \W and \W)
\A          Match string start (regardless of /m)
\Z          Match string end (before optional newline)
\z          Match absolute string end
\G          Match where previous m//g left off
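The differences between $, \Z and \z, and between ^ and \A under /m, are subtle enough to merit a quick sketch. The test strings are illustrative assumptions.

```perl
use strict;
use warnings;

my $line = "end\n";

# $ and \Z both tolerate one trailing newline; \z does not.
my $dollar  = ($line =~ /end$/)  ? 1 : 0;
my $big_z   = ($line =~ /end\Z/) ? 1 : 0;
my $small_z = ($line =~ /end\z/) ? 1 : 0;

my $multi = "one\ntwo\n";

# Under /m, ^ matches at every line start, but \A only at string start.
my @caret = $multi =~ /^(\w+)$/mg;
my @big_a = $multi =~ /\A(\w+)$/mg;

print "dollar=$dollar Z=$big_z z=$small_z caret=@caret A=@big_a\n";
```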
Quantifiers
Quantifiers are greedy by default - match the longest leftmost.
Maximal   Minimal   Allowed range
{n,m}     {n,m}?    Must occur at least n times but no more
                    than m times
{n,}      {n,}?     Must occur at least n times
{n}       {n}?      Must occur exactly n times
*         *?        0 or more times (same as {0,})
+         +?        1 or more times (same as {1,})
?         ??        0 or 1 time (same as {0,1})
There is no quantifier {,n} - that gets understood as a literal
string.
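Greedy versus minimal matching is clearest on a concrete string. The sketch below runs the maximal and minimal forms of + and {n,m} on the same input; the inputs are invented for the example.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ runs to the last possible '>'.
my ($greedy) = $html =~ /<(.+)>/;

# Minimal: .+? stops at the first possible '>'.
my ($minimal) = $html =~ /<(.+?)>/;

# The same contrast with a bounded range.
my ($max_range) = "aaaaa" =~ /(a{2,3})/;
my ($min_range) = "aaaaa" =~ /(a{2,3}?)/;

print "greedy=$greedy\nminimal=$minimal\n";
print "max_range=$max_range min_range=$min_range\n";
```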
Extended constructs
(?#text)            A comment
(?imxs-imsx:...)    Enable/disable option (as per m// modifiers)
(?=...)             Zero-width positive lookahead assertion
(?!...)             Zero-width negative lookahead assertion
(?<=...)            Zero-width positive lookbehind assertion
(?<!...)            Zero-width negative lookbehind assertion
(?>...)             Grab what we can, prohibit backtracking
(?{ code })         Embedded code, return value becomes $^R
(??{ code })        Dynamic regex, return value used as regex
(?(cond)yes|no)     cond being integer corresponding to capturing
                    parens
(?(cond)yes)        or a lookaround/eval zero-width assertion
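A few of the extended constructs in action: lookbehind and lookahead for inserting thousands separators, a negative lookahead as a filter, an inline modifier, and a conditional keyed on a capture group. This is a sketch with made-up inputs, not a canonical recipe.

```perl
use strict;
use warnings;

# Lookbehind + lookahead: insert a comma at each zero-width position
# that follows a digit and precedes a multiple of three digits.
(my $commas = "1234567") =~ s/(?<=\d)(?=(?:\d{3})+\z)/,/g;

# Negative lookahead: 'foo' not immediately followed by 'bar'.
my @not_bar = grep { /\Afoo(?!bar)/ } qw(foobar foobaz food);

# (?i:...) turns on case-insensitivity for one cluster only.
my $inline_on  = ("Hello WORLD" =~ /Hello (?i:world)/) ? 1 : 0;
my $inline_off = ("hello WORLD" =~ /Hello (?i:world)/) ? 1 : 0;

# (?(1)...) requires a closing paren only if group 1 matched an opening one.
my $balanced = qr/\A(\()?\d+(?(1)\))\z/;
my @ok = grep { $_ =~ $balanced } ("42", "(42)", "(42", "42)");

print "$commas | @not_bar | $inline_on $inline_off | @ok\n";
```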
Variables
$_          Default variable for operators to use
$*          Enable multiline matching (deprecated; not in 5.9.0
            or later)
$&          Entire matched string
$`          Everything prior to matched string
$'          Everything after matched string
The use of those last three will slow down all regex use
within your program. Consult the perlvar man page for
@LAST_MATCH_START to see equivalent expressions that won’t
cause slowdown. See also Devel::SawAmpersand.
$1, $2...   Hold the Xth captured expr
$+          Last parenthesized pattern match
$^N         Holds the most recently closed capture
$^R         Holds the result of the last (?{...}) expr
@-          Offsets of starts of groups. $-[0] holds start of
            whole match
@+          Offsets of ends of groups. $+[0] holds end of whole
            match
Captured groups are numbered according to their opening paren.
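The offset arrays @- and @+ can reproduce the prematch, match and postmatch strings with substr, following the equivalents described in perlvar. The sketch below also shows nested groups being numbered by their opening paren; the sample sentence is an assumption.

```perl
use strict;
use warnings;

my $str = "The quick brown fox";
my ($first, $second, $before, $match, $after);

if ($str =~ /(qu(ic)k)/) {
    # Groups are numbered by opening paren: outer is $1, inner is $2.
    ($first, $second) = ($1, $2);

    # substr equivalents of $`, $& and $' built from @- and @+.
    $before = substr($str, 0, $-[0]);
    $match  = substr($str, $-[0], $+[0] - $-[0]);
    $after  = substr($str, $+[0]);
}

print "first=$first second=$second\n";
print "before=[$before] match=[$match] after=[$after]\n";
```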
Functions
lc          Lowercase a string
lcfirst     Lowercase first char of a string
uc          Uppercase a string
ucfirst     Titlecase first char of a string
pos         Return or set current match position
quotemeta   Quote metacharacters
reset       Reset ?pattern? status
study       Analyze string for optimizing matching
split       Use regex to split a string into parts
The first four of these are like the escape sequences \L, \l, \U,
and \u. For Titlecase, see below.
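A short sketch of several of these functions working together: split with a regex separator, quotemeta on a string containing metacharacters, pos after a /g match in scalar context, and the case functions. Inputs and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# split takes a regex as its separator.
my @fields = split /\s*,\s*/, "a , b,c";

# quotemeta backslashes every non-word character.
my $quoted = quotemeta("1+1=2");

# After a scalar-context /g match, pos reports where it ended,
# and \G anchors the next match at that position.
my $scan = "aXbXc";
$scan =~ /X/g;
my $where = pos($scan);
my ($next) = $scan =~ /\G(\w)/g;

# lc and ucfirst mirror the \L and \u escapes.
my $proper = ucfirst(lc("pERL"));

print "@fields | $quoted | $where | $next | $proper\n";
```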
Terminology
Titlecase
Unicode concept which most often is equal to uppercase, but for
certain characters like the German ‘sharp s’ (ß) there is a
difference.
See also
perlretut for a tutorial on regular expressions.
perlrequick for a rapid tutorial.
perlre for more details.
perlvar for details on the variables.
perlop for details on the operators.
perlfunc for details on the functions.
perlfaq6 for FAQs on regular expressions.
The re module to alter behaviour and aid debugging.
“Debugging regular expressions” in perldebug.
perluniintro, perlunicode, charnames and locale for details on
regexes and internationalisation.
Mastering Regular Expressions by Jeffrey Friedl
(http://regex.info/) for a thorough grounding and reference
on the topic.
Authors
This card was created by Andrew Ford.
The original document (perlreref.pod) is part of the standard
Perl distribution. It was written by Iain Truskett, with thanks to
David P.C. Wollmann, Richard Soderberg, Sean M. Burke, Tom
Christiansen, Jim Cromie, and Jeffrey Goff for useful advice.
Perl Regular Expression Quick Reference Card
Revision 0.1 (draft) for Perl version 5.8.5 [July 2005]
A refcards.com™ quick reference card.
refcards.com is a trademark of Ford & Mason Ltd.
Published by Ford & Mason Ltd.
© Iain Truskett. This document may be distributed under the same terms
as Perl itself. Download from refcards.com.