notes/regex.md

https://cheatography.com/davechild/cheat-sheets/regular-expressions/

Anchors                                                       Quantifiers                                     Groups and Ranges

^  Start of string, or start of line in multi-line pattern    * 0 or more  {3}   Exactly 3                    .       Any character except new line (\n)
\A Start of string                                            + 1 or more  {3,}  3 or more                    (a|b)   a or b
$  End of string, or end of line in multi-line pattern        ? 0 or 1     {3,5} 3, 4 or 5                    (...)   Group
\Z End of string                                              Add a ? to a quantifier to make it ungreedy.    (?:...) Passive (non-capturing) group
\b Word boundary                                                                                              [abc]   Range (a or b or c)
\B Not word boundary                                          Escape Sequences                                [^abc]  Not (a or b or c)
\< Start of word                                                                                              [a-q]   Lower case letter from a to q
\> End of word                                                \  Escape following character                   [A-Q]   Upper case letter from A to Q
                                                              \Q Begin literal sequence                       [0-7]   Digit from 0 to 7
Character Classes                                             \E End literal sequence                         \1      Back-reference to group 1 (\2 for group 2, etc.)
                                                                                                              Ranges are inclusive.
\c Control character                                                                                          
\s White space                                                                                                Pattern Modifiers
\S Not white space                                            Common Metacharacters
\d Digit                                                                                                      g   Global match
\D Not digit                                                  ^  [  .  $                                      i * Case-insensitive
\w Word                                                       {  *  (  \                                      m * Multiple lines
\W Not word                                                   +  )  |  ?                                      s * Treat string as single line
\x Hexadecimal digit                                          <  >                                            x * Allow comments and whitespace in pattern
\O Octal digit                                                The escape character is usually \               e * Evaluate replacement
                                                                                                              U * Ungreedy pattern
POSIX                                                         Special Characters                              * PCRE modifier

[:upper:]   Upper case letters                                \n   New line                                   String Replacement
[:lower:]   Lower case letters                                \r   Carriage return
[:alpha:]   All letters                                       \t   Tab                                        $n nth non-passive group
[:alnum:]   Digits and letters                                \v   Vertical tab                               $2 "xyz" in /^(abc(xyz))$/
[:digit:]   Digits                                            \f   Form feed                                  $1 "xyz" in /^(?:abc)(xyz)$/
[:xdigit:]  Hexadecimal digits                                \xxx Octal character xxx                        $` Before matched string
[:punct:]   Punctuation                                       \xhh Hex character hh                           $' After matched string
[:blank:]   Space and tab                                                                                     $+ Last matched string
[:space:]   All white space                                                                                   $& Entire matched string
[:cntrl:]   Control characters                                                                                Some regex implementations use \ instead of $.
[:graph:]   Printed characters
[:print:]   Printed characters and spaces
[:word:]    Digits, letters and underscore
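
The anchor, quantifier and modifier entries above can be tried out with a short sketch. It assumes Python's `re` module, which is only one of many engines and does not support every entry in the table; the sample strings and variable names are made up for illustration.

```python
import re

text = "first line\nsecond line"

# ^ anchors at the start of the string only, unless the "m" modifier
# (re.MULTILINE in Python) makes it match at the start of every line.
starts_default = re.findall(r"^\w+", text)                   # ['first']
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)   # ['first', 'second']

# \A anchors at the start of the string even in multi-line mode.
starts_absolute = re.findall(r"\A\w+", text, re.MULTILINE)   # ['first']

# \b is a word boundary, so "cat" inside "concatenate" is not matched.
whole_words = re.findall(r"\bcat\b", "cat concatenate cat")  # ['cat', 'cat']

# {3,5} allows 3, 4 or 5 repetitions.
runs = re.findall(r"\b\d{3,5}\b", "12 123 12345 1234567")    # ['123', '12345']

# Quantifiers are greedy by default; adding ? makes them ungreedy.
html = "<b>one</b><b>two</b>"
greedy = re.findall(r"<b>.*</b>", html)    # ['<b>one</b><b>two</b>']
lazy = re.findall(r"<b>.*?</b>", html)     # ['<b>one</b>', '<b>two</b>']

# "i" modifier: case-insensitive match.
any_case = re.findall(r"regex", "Regex REGEX regex", re.IGNORECASE)

# "s" modifier (re.DOTALL): . also matches the newline character.
dot_default = re.findall(r"a.b", "a\nb")                 # []
dot_single_line = re.findall(r"a.b", "a\nb", re.DOTALL)  # ['a\nb']
```

Python passes modifiers as flags rather than as letters after the closing delimiter, so `m`, `i` and `s` become `re.MULTILINE`, `re.IGNORECASE` and `re.DOTALL`.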

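The group, range and string-replacement entries can be checked the same way. This is a minimal sketch, again assuming Python's `re` module, which is one of the implementations that use `\` instead of `$` in replacements; the inputs are illustrative only.

```python
import re

# (?:...) groups without capturing, so (xyz) is group 1 here...
m1 = re.match(r"^(?:abc)(xyz)$", "abcxyz")
# ...whereas with two capturing groups the inner (xyz) is group 2.
m2 = re.match(r"^(abc(xyz))$", "abcxyz")
inner1 = m1.group(1)   # 'xyz'
outer2 = m2.group(1)   # 'abcxyz'
inner2 = m2.group(2)   # 'xyz'

# Ranges are inclusive; ^ as the first character inside [] negates the set.
in_range = re.findall(r"[a-q]", "apqrz")   # ['a', 'p', 'q']
negated = re.findall(r"[^abc]", "abcd")    # ['d']

# A back-reference inside the pattern repeats what group 1 captured.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test")  # ['is']

# In the replacement, Python writes \1, \2 where other engines write $1, $2.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")   # 'world hello'

# \g<0> is Python's spelling of the entire match ($& in the table).
bracketed = re.sub(r"\d+", r"[\g<0>]", "a1b22")             # 'a[1]b[22]'
```
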
Assertions

?=          Lookahead assertion
?!          Negative lookahead
?<=         Lookbehind assertion
?<!         Negative lookbehind
?>          Once-only subexpression (atomic group)
?()         Condition [if then]
?()|        Condition [if then else]
?#          Comment
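
In a pattern, each assertion above is written inside parentheses, for example `(?=...)`. The following sketch shows the lookaround, conditional and comment forms, assuming Python's `re` module; atomic groups (`?>`) are left out because Python only supports them from version 3.11. The sample strings are hypothetical.

```python
import re

# (?=...) lookahead: "foo" only when followed by "bar"; "bar" is not consumed.
ahead = re.findall(r"foo(?=bar)", "foobar foobaz")         # ['foo']

# (?!...) negative lookahead: "foo..." only when NOT followed by "bar".
not_ahead = re.findall(r"foo(?!bar)\w*", "foobar foobaz")  # ['foobaz']

# (?<=...) lookbehind: digits that come directly after a "$".
prices = re.findall(r"(?<=\$)\d+", "cost $40, weight 12kg")        # ['40']

# (?<!...) negative lookbehind: digit runs not preceded by "$" or a digit.
others = re.findall(r"(?<![\$\d])\d+", "cost $40, weight 12kg")    # ['12']

# (?(1)then|else) condition: if group 1 matched "<", require ">",
# otherwise require the end of the string.
address = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")
bracketed_ok = address.match("<user@host.com>") is not None   # True
plain_ok = address.match("user@host.com") is not None         # True
unclosed_ok = address.match("<user@host.com") is not None     # False

# (?#...) comment: ignored by the engine.
commented = re.findall(r"\d+(?# one or more digits)", "a1b22")  # ['1', '22']
```

Lookarounds match a position without consuming characters, which is why `ahead` contains `foo` rather than `foobar`.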