PBRegexp

Metacharacters

Metacharacters and their behavior in the context of regular expressions:
\ (backslash)
Quotes the next metacharacter so that it is matched literally. For example, foo$ matches "foo" at the end of the string, but foo\$ matches the literal text "foo$".
^ (caret)
Matches the position at the beginning of the string. In Multiline mode (m modifier), ^ also matches the position following '~n' or '~r'. For example, ^\d+ matches one or more digits at the beginning of the string.
\A
Matches only at the beginning of the string, regardless of Multiline mode.
$ (dollar sign)
Matches the position at the end of the string. In Multiline mode (m modifier), $ also matches the position preceding '~n' or '~r'. For example, \w+$ matches a word at the end of the string.
\Z
Matches only at the end of the string, or before a newline at the end.
\z
Matches only at the very end of the string.
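The anchor behavior described above can be sketched with Python's re module, used here only as a stand-in for PBRegexp. Assumptions: re.MULTILINE plays the role of the m modifier, Python treats only the newline character (not a carriage return) as a line break, and Python spells the end-of-string anchors differently, so only ^, $ and \A are shown. The sample text is invented for illustration.

```python
import re

text = "42 apples\n7 pears"

# Without the m modifier, ^ matches only at the very start of the string.
start_only = re.findall(r"^\d+", text)

# With the m modifier (re.MULTILINE in Python), ^ also matches after a newline.
each_line_start = re.findall(r"^\d+", text, flags=re.MULTILINE)

# $ behaves symmetrically: end of string, or end of each line in multiline mode.
end_only = re.findall(r"\w+$", text)
each_line_end = re.findall(r"\w+$", text, flags=re.MULTILINE)

# \A is not affected by the m modifier.
absolute_start = re.findall(r"\A\d+", text, flags=re.MULTILINE)

print(start_only)       # ['42']
print(each_line_start)  # ['42', '7']
print(end_only)         # ['pears']
print(each_line_end)    # ['apples', 'pears']
print(absolute_start)   # ['42']
```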
* (asterisk)
Matches the preceding character or subexpression zero or more times. For example, Go*gle matches "Ggle" and "Goooogle". * is equivalent to {0,}.
+ (plus)
Matches the preceding character or subexpression one or more times. For example, Go+gle matches "Gogle" and "Goooogle", but not "Ggle". + is equivalent to {1,}.
? (question mark)
Matches the preceding character or subexpression zero or one time. For example, Go?gle matches "Ggle" and "Gogle", but not "Google". ? is equivalent to {0,1}.
{n}
Matches the preceding character or subexpression exactly n times. For example, Go{2}gle matches "Google".
{n,}
Matches the preceding character or subexpression at least n times. For example, Go{2,}gle matches "Google" and "Gooogle", but not "Gogle".
{n,m}
Matches the preceding character or subexpression at least n but not more than m times. For example, Go{2,4}gle matches "Google" and "Gooogle", but not "Gogle" or "Gooooogle".
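The quantifier examples above can be checked mechanically. The sketch below uses Python's re module as a stand-in for PBRegexp (the quantifier syntax is the same in both); the helper name whole_match is hypothetical and tests whether the entire word matches the pattern.

```python
import re

def whole_match(pattern, word):
    """Return True if the pattern matches the entire word."""
    return re.fullmatch(pattern, word) is not None

# (pattern, word, expected result) taken from the examples in the text.
cases = [
    ("Go*gle", "Ggle", True),
    ("Go*gle", "Goooogle", True),
    ("Go+gle", "Gogle", True),
    ("Go+gle", "Ggle", False),
    ("Go?gle", "Gogle", True),
    ("Go?gle", "Google", False),
    ("Go{2}gle", "Google", True),
    ("Go{2,}gle", "Gooogle", True),
    ("Go{2,}gle", "Gogle", False),
    ("Go{2,4}gle", "Gooogle", True),
    ("Go{2,4}gle", "Gooooogle", False),
]

for pattern, word, expected in cases:
    actual = whole_match(pattern, word)
    print(f"{pattern:12} {word:10} {actual}")
    assert actual == expected
```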
? (question mark)
When this character immediately follows any of the other quantifiers (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is "non-greedy". By default, a quantified pattern is "greedy": it matches as many times as possible. A "non-greedy" pattern matches as little of the searched string as possible. For example, v.*y matches "very very greedy" in "I am very very greedy", but v.*?y matches only "very" in the same string.
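The difference between greedy and non-greedy matching can be reproduced with the example string from the text. This is a minimal sketch using Python's re module as a stand-in for PBRegexp; the variable names are illustrative.

```python
import re

text = "I am very very greedy"

# Greedy: .* grabs as much as possible, then backs off to the last "y".
greedy = re.search(r"v.*y", text).group()

# Non-greedy: .*? grabs as little as possible, stopping at the first "y".
lazy = re.search(r"v.*?y", text).group()

print(greedy)  # very very greedy
print(lazy)    # very
```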
. (period)
Matches any single character except "~n" and "~r". In Singleline mode (s modifier) matches any character whatsoever, even "~n" and "~r".
(pattern)
A subexpression that matches pattern and captures the match. The captured match can be used in a backreference construct and in a substitution construct. To match literal parenthesis characters, use "\(" or "\)". See also: of_getmatch function.
(?:pattern)
A subexpression that matches pattern but does not capture the match for possible later use.
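To illustrate capturing versus non-capturing subexpressions, the sketch below uses Python's re module as a stand-in. In PBRegexp the captured text would be retrieved through of_getmatch, whose signature is not shown here; Python's groups() is used instead, and the sample string is invented.

```python
import re

text = "pages 10-25"

# Both subexpressions capture, so two groups are available afterwards.
capturing = re.search(r"(\d+)-(\d+)", text)

# (?:...) still takes part in matching but leaves no captured group behind.
mixed = re.search(r"(?:\d+)-(\d+)", text)

print(capturing.groups())  # ('10', '25')
print(mixed.groups())      # ('25',)
print(mixed.group())       # 10-25  (the overall match is unchanged)
```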
(?=pattern)
A zero-width positive look-ahead assertion. This is a non-capturing match, that is, the match is not captured for possible later use. For example, /\w+(?=\t)/ matches a word followed by a tab, without including the tab.
(?!pattern)
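A zero-width negative look-ahead assertion. This is a non-capturing match, that is, the match is not captured for possible later use. For example, /\w+(?!\t)/ matches a run of word characters that is not immediately followed by a tab. Note that when a word is followed by a tab, backtracking still lets the pattern match the word minus its last character.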
(?<=pattern)
A zero-width positive look-behind assertion. Works only for fixed-width look-behind. This is a non-capturing match, that is, the match is not captured for possible later use. For example, /(?<=\t)\w+/ matches a word that follows a tab, without including the tab.
(?<!pattern)
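A zero-width negative look-behind assertion. Works only for fixed-width look-behind. This is a non-capturing match, that is, the match is not captured for possible later use. For example, /(?<!\t)\w+/ matches a run of word characters that does not start immediately after a tab. Note that when a word follows a tab, the pattern can still match that word starting from its second character.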
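The four look-around assertions can be compared side by side on a tab-separated string. This is a sketch using Python's re module as a stand-in for PBRegexp (the look-around syntax is the same); the sample string is invented. The negative forms show how the regex engine shifts or shortens the match rather than rejecting the word outright.

```python
import re

text = "name\tvalue"

# Positive look-ahead: a word followed by a tab, tab not included.
ahead = re.findall(r"\w+(?=\t)", text)

# Positive look-behind: a word preceded by a tab, tab not included.
behind = re.findall(r"(?<=\t)\w+", text)

# Negative look-ahead: "name" is followed by a tab, so the engine
# backtracks one character and accepts "nam" instead.
not_ahead = re.findall(r"\w+(?!\t)", text)

# Negative look-behind: "value" starts right after the tab, so the
# match starts one character later and yields "alue".
not_behind = re.findall(r"(?<!\t)\w+", text)

print(ahead)       # ['name']
print(behind)      # ['value']
print(not_ahead)   # ['nam', 'value']
print(not_behind)  # ['name', 'alue']
```

If whole words are required with the negative forms, a common remedy is to add a word boundary, for example \b\w+\b(?!\t).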
(?#text)
A comment. The text is ignored. For example, / \d+ (?#one or more digits) \w+/x
x|y
Matches either x or y. For example, mood|food matches "mood" or "food". It is equivalent to (m|f)ood.
[abc]
A character class. Matches any one of the enclosed characters.
For example,
[lmn] matches the "l" in "apple".
[0123456789] matches any digit. It's equivalent to [0-9] and \d.
[\da-fA-F] matches any hexadecimal digit.
[^abc]
A negative character class. Matches any character not enclosed.
For example,
[^of] matches the "d" in "food".
[^0123456789] matches any non-digit character. It's equivalent to [^0-9] and \D.
[^\da-fA-F] matches any character that is not a hexadecimal digit.
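The character class examples above can be verified with a short sketch. It uses Python's re module as a stand-in for PBRegexp; the helper name is_hex and the test strings "1F3a" and "1G" are illustrative additions.

```python
import re

# [lmn] finds the first character of "apple" that is l, m or n.
first_lmn = re.search(r"[lmn]", "apple").group()

# [^of] finds the first character of "food" that is neither o nor f.
first_not_of = re.search(r"[^of]", "food").group()

def is_hex(s):
    """Return True if s consists only of hexadecimal digits."""
    return re.fullmatch(r"[\da-fA-F]+", s) is not None

print(first_lmn)       # l
print(first_not_of)    # d
print(is_hex("1F3a"))  # True
print(is_hex("1G"))    # False
```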
\w
Matches any word character including underscore. Equivalent to [0-9A-Za-z_].
For example,
\w matches the "a" in "(#a)".
\w+ matches the whole word.
\W
Matches any nonword character. Equivalent to [^0-9A-Za-z_] and [^\w].
For example, \W matches the "#" in "#A09F".
\b
Matches a word boundary. A word boundary is a spot between two characters that has a \w on one side of it and a \W on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a \W.
For example,
er\b matches the "er" in "lighter" but not the "er" in "terra".
\brol matches the "rol" in "role" and "roll" but not in "control".
\B
Matches a nonword boundary.
For example,
er\B matches the "er" in "terra" but not in "lighter".
\Brol matches the "rol" in "control" but not in "role" and "roll".
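The \b and \B examples can be checked with the sketch below, which uses Python's re module as a stand-in for PBRegexp. The helper name found is hypothetical and simply reports whether the pattern matches anywhere in the word.

```python
import re

def found(pattern, word):
    """Return True if the pattern matches somewhere inside the word."""
    return re.search(pattern, word) is not None

# \b requires a word boundary at that position.
print(found(r"er\b", "lighter"), found(r"er\b", "terra"))    # True False
print(found(r"\brol", "role"), found(r"\brol", "control"))   # True False

# \B requires that there is no word boundary at that position.
print(found(r"er\B", "terra"), found(r"er\B", "lighter"))    # True False
print(found(r"\Brol", "control"), found(r"\Brol", "roll"))   # True False
```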
\d
Matches a digit character. Equivalent to [0-9], [0123456789] and [^\D].
For example,
\d matches the "3" in "(#a3)".
\d+ matches the whole number.
\D
Matches any nondigit character. Equivalent to [^0-9] and [^\d].
For example, \D matches the "F" in "09F".
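The shorthand classes \w, \W, \d and \D can be tried on the example strings from the text. The sketch uses Python's re module as a stand-in for PBRegexp. One assumption to note: Python's \w and \d are Unicode-aware by default, so re.ASCII is passed to restrict them to the [0-9A-Za-z_] and [0-9] equivalents given above.

```python
import re

# First match of each shorthand class in the example strings.
word_char = re.search(r"\w", "(#a)", flags=re.ASCII).group()
non_word = re.search(r"\W", "#A09F", flags=re.ASCII).group()
digit = re.search(r"\d", "(#a3)", flags=re.ASCII).group()
non_digit = re.search(r"\D", "09F", flags=re.ASCII).group()

# With +, the class matches a whole run of characters.
whole_word = re.search(r"\w+", "(#hello)", flags=re.ASCII).group()
whole_number = re.search(r"\d+", "(#a305)", flags=re.ASCII).group()

print(word_char, non_word, digit, non_digit)  # a # 3 F
print(whole_word, whole_number)               # hello 305
```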
\n
Matches a newline character. Equivalent to \x0A.
\r
Matches a carriage return character. Equivalent to \x0D.
\t
Matches a tab character. Equivalent to \x09.
\s
Matches any white space character including space, tab, and so on.
\S
Matches any non-white space character. Equivalent to [^\s].
\xnn
Where nn are hexadecimal digits, matches the character whose numeric value is nn.
For example, \x41 matches "A". Allows ASCII codes to be used in regular expressions.
\n (backreference)
Where n is a positive integer, a reference back to the text captured by the nth subexpression. For example, (.)\1 matches two consecutive identical characters.
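A backreference can be sketched with the (.)\1 example from the text. Python's re module is used as a stand-in for PBRegexp, and the sample word "balloon" is an illustrative choice.

```python
import re

text = "balloon"

# (.) captures one character; \1 requires the same character again.
doubled = [m.group() for m in re.finditer(r"(.)\1", text)]

# Group 1 holds the single character that was repeated.
repeated_chars = [m.group(1) for m in re.finditer(r"(.)\1", text)]

print(doubled)         # ['ll', 'oo']
print(repeated_chars)  # ['l', 'o']
```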
Copyright © NikaSoft 2004-2005