Metacharacters and their behavior in the context of regular expressions:
\ (backslash)
Quote the next metacharacter.
For example, foo$ matches "foo" at the end of the string,
but foo\$ matches "foo$".
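The escaping rule can be checked with a short sketch. Python's re module is used here as a stand-in (an assumption; the engine this reference describes may differ in details):

```python
import re

# Unescaped, $ is an end-of-string anchor, so it cannot match a literal "$".
anchored = re.search(r"foo$", "price: foo$")

# Escaped, \$ stands for a literal dollar sign.
literal = re.search(r"foo\$", "price: foo$")

print(anchored)         # None
print(literal.group())  # foo$
```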
^ (caret)
Matches the position at the beginning of the string.
In Multiline mode (m modifier), ^ also matches the position following '~n' or '~r'.
For example, ^\d+ matches one or more digits at the beginning of the string.
\A
Match only at beginning of string.
$ (dollar sign)
Matches the position at the end of the string.
In Multiline mode (m modifier), $ also matches the position preceding '~n' or '~r'.
For example, \w+$ matches a word at the end of the string.
\Z
Match only at the end of string, or before newline at the end.
\z
Match only at the end of string.
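A minimal sketch of the anchors, assuming Python's re module, where the m modifier is spelled re.MULTILINE. Python treats only "\n" as a line break here, and its \Z does not follow the definition above, so \Z and \z are left out:

```python
import re

text = "10 apples\n20 pears"

# Without the m modifier, ^ matches only at the very start of the string.
first_only = re.findall(r"^\d+", text)

# With the m modifier, ^ also matches right after each newline.
every_line = re.findall(r"^\d+", text, re.MULTILINE)

# $ behaves symmetrically and also matches right before each newline.
line_ends = re.findall(r"\w+$", text, re.MULTILINE)

# \A matches only at the start of the string, even in multiline mode.
start_only = re.findall(r"\A\d+", text, re.MULTILINE)

print(first_only)   # ['10']
print(every_line)   # ['10', '20']
print(line_ends)    # ['apples', 'pears']
print(start_only)   # ['10']
```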
* (asterisk)
Matches the preceding character or subexpression zero or more times.
For example, Go*gle matches "Ggle" and "Goooogle".
* is equivalent to {0,}.
+ (plus)
Matches the preceding character or subexpression one or more times.
For example, Go+gle matches "Gogle" and "Goooogle", but not "Ggle".
+ is equivalent to {1,}.
? (question mark)
Matches the preceding character or subexpression zero or one time.
For example, Go?gle matches "Ggle" and "Gogle", but not "Google".
? is equivalent to {0,1}.
{n}
Matches the preceding character or subexpression exactly n times.
For example, Go{2}gle matches "Google".
{n,}
Matches the preceding character or subexpression at least n times.
For example, Go{2,}gle matches "Google" and "Gooogle", but not "Gogle".
{n,m}
Matches the preceding character or subexpression at least n but not more than m times.
For example, Go{2,4}gle matches "Google" and "Gooogle", but not "Gogle" or "Gooooogle".
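The quantifiers above can be compared side by side. The sketch below assumes Python's re module and records, for each pattern, how many "o" characters (0 to 5) it accepts between "G" and "gle":

```python
import re

patterns = [r"Go*gle", r"Go+gle", r"Go?gle", r"Go{2}gle", r"Go{2,}gle", r"Go{2,4}gle"]

# For each pattern, list the counts n for which "G" + n * "o" + "gle"
# matches as a whole string.
accepted = {
    p: [n for n in range(6) if re.fullmatch(p, "G" + "o" * n + "gle")]
    for p in patterns
}

for p in patterns:
    print(p, accepted[p])
```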
? (question mark)
When this character immediately follows any of the other quantifiers
(*, +, ?, {n}, {n,}, {n,m}), the matching pattern is "non-greedy".
By default, a quantified pattern is "greedy": it matches as many times as possible.
A "non-greedy" pattern matches as little of the searched string as possible.
For example, v.*y matches "very very greedy" in "I am very very greedy".
But v.*?y matches only "very" in the same string.
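The greedy and non-greedy examples above can be reproduced with this sketch, which assumes Python's re module:

```python
import re

text = "I am very very greedy"

# Greedy: .* runs as far as it can, so the match ends at the last "y".
greedy = re.search(r"v.*y", text).group()

# Non-greedy: .*? stops at the first "y" that lets the pattern succeed.
lazy = re.search(r"v.*?y", text).group()

print(greedy)  # very very greedy
print(lazy)    # very
```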
. (period)
Matches any single character except "~n" and "~r".
In Singleline mode (s modifier), it matches any character whatsoever, even "~n" and "~r".
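A small sketch of the s modifier, assuming Python's re module, where it is spelled re.DOTALL. By default Python's period excludes only "\n", not "\r", so only the newline case is shown:

```python
import re

text = "line1\nline2"

# Default mode: the period does not match the newline between the lines.
default_mode = re.search(r"line1.line2", text)

# Singleline mode: the period matches the newline as well.
singleline_mode = re.search(r"line1.line2", text, re.DOTALL)

print(default_mode)                  # None
print(singleline_mode is not None)   # True
```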
(pattern)
A subexpression that matches pattern and captures the match.
The captured match can be used in a backreference construct and in a substitution construct.
To match literal parentheses, use "\(" or "\)". See also: of_getmatch function.
(?:pattern)
A subexpression that matches pattern but does not capture the match for possible later use.
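The difference between capturing and non-capturing groups can be sketched as follows. Python's re module is assumed, with its groups() and group() methods standing in for the of_getmatch function named above:

```python
import re

# The middle group is non-capturing, so only two captures are recorded.
m = re.search(r"(\d+)-(?:\d+)-(\d+)", "call 555-867-5309")
captured = m.groups()

# Escaped parentheses match literal "(" and ")"; the inner pair captures.
inside = re.search(r"\((\w+)\)", "f(x)").group(1)

print(captured)  # ('555', '5309')
print(inside)    # x
```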
(?=pattern)
A zero-width positive look-ahead assertion.
This is a non-capturing match, that is, the match is not captured for possible later use.
For example, /\w+(?=\t)/ matches a word followed by a tab, without including the tab.
(?!pattern)
A zero-width negative look-ahead assertion.
This is a non-capturing match, that is, the match is not captured for possible later use.
For example, /\w+(?!\t)/ matches a run of word characters that is not immediately followed by a tab.
(?<=pattern)
A zero-width positive look-behind assertion. Works only for fixed-width look-behind.
This is a non-capturing match, that is, the match is not captured for possible later use.
For example, /(?<=\t)\w+/ matches a word that follows a tab, without including the tab.
(?<!pattern)
A zero-width negative look-behind assertion. Works only for fixed-width look-behind.
This is a non-capturing match, that is, the match is not captured for possible later use.
For example, /(?<!\t)\w+/ matches a run of word characters that is not immediately preceded by a tab.
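The four look-around assertions can be tried out with this sketch, which assumes Python's re module. It also shows that a negative assertion only rejects a position, so the engine may backtrack to a shorter match; adding \b is one way to pin the match to a whole word:

```python
import re

text = "key\tvalue"

# Positive look-ahead and look-behind: the tab is required but not consumed.
before_tab = re.search(r"\w+(?=\t)", text).group()
after_tab = re.search(r"(?<=\t)\w+", text).group()

# Negative look-ahead: "key" is followed by a tab, so the engine backtracks
# to "ke", which is followed by "y" and therefore satisfies the assertion.
not_before_tab = re.search(r"\w+(?!\t)", text).group()

# Requiring a word boundary first forces a whole-word match.
whole_word = re.search(r"\w+\b(?!\t)", text).group()

# Negative look-behind: "v" follows the tab and is skipped, "alue" is not.
not_after_tab = re.findall(r"(?<!\t)\w+", text)

print(before_tab, after_tab)   # key value
print(not_before_tab)          # ke
print(whole_word)              # value
print(not_after_tab)           # ['key', 'alue']
```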
(?#text)
A comment. The text is ignored.
For example, / \d+ (?#one or more digits) \w+/x
x|y
Matches either x or y. For example, mood|food matches "mood" or "food".
It is equivalent to (m|f)ood.
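A short sketch of alternation, assuming Python's re module. Both forms find the same text; the grouped form additionally captures the first letter:

```python
import re

text = "mood food good"

# Plain alternation between two whole words.
plain = re.findall(r"mood|food", text)

# Grouped alternation: finditer is used so the full match is reported
# rather than only the captured letter.
grouped = [m.group() for m in re.finditer(r"(m|f)ood", text)]

print(plain)    # ['mood', 'food']
print(grouped)  # ['mood', 'food']
```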
[abc]
A character class. Matches any one of the enclosed characters.
For example,
[lmn] matches the "l" in "apple".
[0123456789] matches any digit. It's equivalent to [0-9] and \d.
[\da-fA-F] matches any hexadecimal digit.
[^abc]
A negative character class. Matches any character not enclosed.
For example,
[^of] matches the "d" in "food".
[^0123456789] matches any non-digit character. It's equivalent to [^0-9] and \D.
[^\da-fA-F] matches any character except a hexadecimal digit.
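The character class examples above can be verified with this sketch, assuming Python's re module. The helper name is_hex is hypothetical and used only for illustration:

```python
import re

# The first character of "apple" that belongs to the class is "l".
first_lmn = re.search(r"[lmn]", "apple").group()

# The first character of "food" that is neither "o" nor "f" is "d".
first_not_of = re.search(r"[^of]", "food").group()

def is_hex(s):
    """Return True if s consists only of hexadecimal digits."""
    return re.fullmatch(r"[\da-fA-F]+", s) is not None

print(first_lmn)       # l
print(first_not_of)    # d
print(is_hex("1aF9"))  # True
print(is_hex("1G"))    # False
```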
\w
Matches any word character including underscore. Equivalent to [0-9A-Za-z_].
For example,
\w matches the "a" in "(#a)".
\w+ matches the whole word.
\W
Matches any nonword character. Equivalent to [^0-9A-Za-z_] and [^\w].
For example, \W matches the "#" in "#A09F".
\b
Matches a word boundary.
A word boundary is a spot between two characters that has a \w
on one side of it and a \W on the other side of it (in either order),
counting the imaginary characters off the beginning and end of the string
as matching a \W.
For example,
er\b matches the "er" in "lighter" but not the "er" in "terra".
\brol matches the "rol" in "role" and "roll" but not in "control".
\B
Matches a nonword boundary.
For example,
er\B matches the "er" in "terra" but not in "lighter".
\Brol matches the "rol" in "control" but not in "role" and "roll".
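The boundary examples for \b and \B can be checked with this sketch, assuming Python's re module. The helper name found is hypothetical:

```python
import re

def found(pattern, s):
    """Return True if pattern matches anywhere in s."""
    return re.search(pattern, s) is not None

# \b requires a word boundary at that position.
print(found(r"er\b", "lighter"), found(r"er\b", "terra"))    # True False
print(found(r"\brol", "role"), found(r"\brol", "control"))   # True False

# \B requires that there is no word boundary at that position.
print(found(r"er\B", "terra"), found(r"er\B", "lighter"))    # True False
print(found(r"\Brol", "control"), found(r"\Brol", "roll"))   # True False
```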
\d
Matches a digit character. Equivalent to [0-9], [0123456789] and [^\D].
For example,
\d matches the "3" in "(#a3)".
\d+ matches the whole number.
\D
Matches any nondigit character. Equivalent to [^0-9] and [^\d].
For example, \D matches the "F" in "09F".
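The shorthand classes \w, \W, \d and \D can be sketched as below. Python's re module is assumed; the re.ASCII flag is passed because Python otherwise extends \w and \d to non-ASCII letters and digits, which would not match the equivalences given above:

```python
import re

first_word_char = re.search(r"\w", "(#a)", re.ASCII).group()
first_nonword_char = re.search(r"\W", "#A09F", re.ASCII).group()
first_digit = re.search(r"\d", "(#a3)", re.ASCII).group()
first_nondigit = re.search(r"\D", "09F", re.ASCII).group()

# With +, the whole run of matching characters is returned.
word = re.search(r"\w+", "(#a3)", re.ASCII).group()
number = re.search(r"\d+", "room 1024", re.ASCII).group()

print(first_word_char, first_nonword_char)  # a #
print(first_digit, first_nondigit)          # 3 F
print(word, number)                         # a3 1024
```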
\n
Matches a newline character. Equivalent to \x0A.
\r
Matches a carriage return character. Equivalent to \x0D.
\t
Matches a tab character. Equivalent to \x09.
\s
Matches any white space character, such as space, tab, newline, or carriage return.
\S
Matches any non-white space character. Equivalent to [^\s].
\xnn
Where nn are hexadecimal digits, matches the character whose numeric value is nn.
For example, \x41 matches "A". Allows ASCII codes to be used in regular expressions.
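A brief sketch of the character escapes above, assuming Python's re module:

```python
import re

# \x41 is the character with hexadecimal code 41, that is "A".
letter = re.search(r"\x41", "ABC").group()

# \x0A and \n denote the same newline character.
same_newline = re.search(r"\x0A", "a\nb").group() == re.search(r"\n", "a\nb").group()

# \t matches each tab character.
tab_count = len(re.findall(r"\t", "a\tb\tc"))

# \s+ matches any run of white space, so it can be used to split fields.
fields = re.split(r"\s+", "one \t two\nthree")

print(letter)        # A
print(same_newline)  # True
print(tab_count)     # 2
print(fields)        # ['one', 'two', 'three']
```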
\n
Where n is a positive integer.
A reference back to the nth captured match.
For example, (.)\1 matches two consecutive identical characters.
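The backreference example can be sketched as follows, assuming Python's re module. finditer is used so that the full two-character match is reported rather than only the captured character:

```python
import re

# (.) captures one character and \1 requires the same character again.
first_pair = re.search(r"(.)\1", "hello").group()

# Collect every pair of consecutive identical characters.
all_pairs = [m.group() for m in re.finditer(r"(.)\1", "bookkeeper")]

print(first_pair)  # ll
print(all_pairs)   # ['oo', 'kk', 'ee']
```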