RegEXP
Regular expressions
Special Characters
. [ { ( ) \ ^ $ | ? * +
symbol | function | example |
---|---|---|
\ | Escape symbol: makes the next character literal | \. dot, \* star, \\ backslash |
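The effect of escaping can be checked with a small sketch. It assumes Python's `re` module (the table itself is not tied to one engine), and the sample strings and variable names are illustrative only.

```python
import re

sample = "1.5 and 1x5"

# An escaped dot matches only a literal dot.
escaped_dot = re.findall(r"1\.5", sample)    # ['1.5']

# An unescaped dot matches any single character.
bare_dot = re.findall(r"1.5", sample)        # ['1.5', '1x5']

# re.escape() escapes every special character in arbitrary text,
# so the text can be searched for literally.
literal = "a+b(c)"
found = re.search(re.escape(literal), "x = a+b(c)").group(0)  # 'a+b(c)'

print(escaped_dot, bare_dot, found)
```

`re.escape` is convenient when the text to match comes from user input rather than from a hand-written pattern.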
General tokens
symbol | function | example |
---|---|---|
\n | Newline | |
\N | Anything but a newline | |
\t | Tab | |
\0 | Null character | |
Meta sequences
symbol | function | example |
---|---|---|
. | Any single character other than a newline (with the /s flag it also matches line terminators) | /.+/ = a b c |
a|b | a OR b | |
\s | Any whitespace | |
\S | Any non-whitespace | |
\d | Any digit | |
\D | Any non-digit | |
\w | Any word character | |
\W | Any non-word character | |
\h | Horizontal whitespace character | |
\r | Carriage return | |
\R | Unicode newlines | |
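A short sketch of the most common meta sequences, assuming Python's `re` module. Python's `re` does not accept `\h` or `\R` (those come from PCRE-style engines), so they are left out here. The sample text is made up for illustration.

```python
import re

text = "Order 66 shipped\ton 2024-05-01"

digits = re.findall(r"\d+", text)   # ['66', '2024', '05', '01']
words = re.findall(r"\w+", text)    # ['Order', '66', 'shipped', 'on', '2024', '05', '01']
parts = re.split(r"\s+", text)      # ['Order', '66', 'shipped', 'on', '2024-05-01']

# Alternation: a OR b
pets = re.findall(r"cat|dog", "cat, dog, cow")   # ['cat', 'dog']

# The dot stops at newlines unless the "dot matches all" flag is set.
per_line = re.findall(r".+", "a\nb")             # ['a', 'b']
all_lines = re.findall(r".+", "a\nb", re.S)      # ['a\nb']

print(digits, words, parts, pets, per_line, all_lines)
```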
Character classes
symbol | function | example |
---|---|---|
[abc] | A single character: a, b, or c | |
[^abc] | Any character except a, b, or c | |
[a-zA-Z] | A character in the range a-z or A-Z | |
[a-z] | A character in the range a-z | |
[^a-z] | A character not in the range a-z | |
[[:<:]] | Start of word. POSIX-style form of the \b word boundary, interpreted as \b(?=\w) | |
[[:>:]] | End of word. POSIX-style form of the \b word boundary, interpreted as \b(?<=\w) | |
[[:alnum:]] | Letters and digits. Equivalent to [A-Za-z0-9] | |
[[:alpha:]] | Letters. Equivalent to [A-Za-z] | |
[[:ascii:]] | ASCII codes 0-127. Equivalent to [\x00-\x7F] | |
[[:blank:]] | Space or tab only (no newlines). Equivalent to [ \t] | |
[[:word:]] | Word characters: letters, digits, and underscore. Equivalent to \w or [a-zA-Z0-9_] | |
[[:punct:]] | Punctuation: characters that are not whitespace, letters, or digits | |
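The bracket classes can be tried out with a sketch like the following, assuming Python's `re` module. Python's `re` does not implement the POSIX `[[:name:]]` classes, so the explicit equivalents from the table are used instead. Sample strings are illustrative.

```python
import re

vowels = re.findall(r"[aeiou]", "regex")           # ['e', 'e']
non_vowels = re.findall(r"[^aeiou]", "regex")      # ['r', 'g', 'x']
letters = re.findall(r"[a-zA-Z]+", "abc123DEF")    # ['abc', 'DEF']

# Explicit range standing in for [[:alnum:]]
alnum = re.findall(r"[A-Za-z0-9]+", "foo_bar 42!")  # ['foo', 'bar', '42']

# Explicit set standing in for [[:blank:]] (space or tab, no newline)
blank_runs = re.findall(r"[ \t]+", "a \tb\nc")      # [' \t']

print(vowels, non_vowels, letters, alnum, blank_runs)
```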
Quantifiers
symbol | function | example |
---|---|---|
a? | Zero or one of a | .? = zero or one of any character |
a* | Zero or more of a. Greedy quantifier: matches as many characters as possible | |
a+ | One or more of a | |
a{3} | Exactly 3 of a | |
a{3,} | 3 or more of a | |
a{3,6} | Between 3 and 6 of a | |
a*? | Lazy quantifier: matches as few characters as possible | |
a*+ | Possessive quantifier: matches as many characters as possible and never backtracks | |
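The difference between greedy and lazy matching is easiest to see side by side. This is a minimal sketch assuming Python's `re` module. The possessive form `a*+` is left out because only newer Python versions accept it. The sample strings are illustrative.

```python
import re

html = "<b>one</b> <b>two</b>"

# Greedy: .* runs to the last possible </b>.
greedy = re.findall(r"<b>.*</b>", html)    # ['<b>one</b> <b>two</b>']

# Lazy: .*? stops at the first possible </b>.
lazy = re.findall(r"<b>.*?</b>", html)     # ['<b>one</b>', '<b>two</b>']

# Counted repetition, anchored on word boundaries so partial runs do not match.
exact = re.findall(r"\ba{3}\b", "aa aaa aaaa")              # ['aaa']
ranged = re.findall(r"\ba{3,6}\b", "aa aaa aaaa aaaaaaa")   # ['aaa', 'aaaa']

# Optional character.
optional = re.findall(r"colou?r", "color colour")           # ['color', 'colour']

print(greedy, lazy, exact, ranged, optional)
```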
Anchors
symbol | function | example |
---|---|---|
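\b | A word boundary | |
\B | Non-word boundary | |
^ or \A | Start of string (with the m flag, ^ also matches at the start of each line) | |
$ or \Z | End of string (with the m flag, $ also matches at the end of each line) | |
\z | Absolute end of string | |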
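A sketch of how the anchors behave, assuming Python's `re` module. One engine difference to keep in mind: in Python, `\Z` matches only at the very end of the string, which is the behaviour listed for `\z` above. The sample text is illustrative.

```python
import re

text = "first line\nsecond line"

starts = re.findall(r"^\w+", text)               # ['first']
starts_multi = re.findall(r"^\w+", text, re.M)   # ['first', 'second']
ends_multi = re.findall(r"\w+$", text, re.M)     # ['line', 'line']

# \b requires a word boundary, \B requires the absence of one.
whole_word = re.findall(r"\bcat\b", "cat concat cat.")   # ['cat', 'cat']
inside_word = re.findall(r"\Bcat", "cat concat")         # ['cat'] (the one inside "concat")

# $ tolerates one trailing newline; Python's \Z does not.
dollar_ok = re.search(r"line$", "line\n") is not None    # True
abs_end_ok = re.search(r"line\Z", "line\n") is not None  # False

print(starts, starts_multi, ends_multi, whole_word, inside_word, dollar_ok, abs_end_ok)
```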
Groups
symbol | function | example |
---|---|---|
(?: ...) | Non-capturing group: match the enclosed pattern without capturing it | |
(...) | Capturing group: match the enclosed pattern and capture it | |
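The two group types differ only in whether the matched text is stored. A minimal sketch, assuming Python's `re` module and made-up input strings:

```python
import re

# Capturing groups: each pair of parentheses stores its match.
date = re.search(r"(\d{4})-(\d{2})", "date: 2024-05")
year, month = date.groups()          # ('2024', '05')

# Non-capturing group: groups "ab" for repetition, but stores nothing.
mixed = re.search(r"(?:ab)+(c)", "ababc")
full_match = mixed.group(0)          # 'ababc'
captured = mixed.groups()            # ('c',)

print(year, month, full_match, captured)
```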
Substitution
symbol | function | example |
---|---|---|
$1 | Contents of capture group 1 | |
$` | Contents before match | |
$' | Contents after match | |
$& | Complete match content | |
\x20 | Hexadecimal replacement values | |
\x{06fa} | Hexadecimal replacement values | |
\t | Insert tab | |
\r | Insert carriage return | |
\n | Insert newline | |
\f | Insert form feed | |
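Replacement syntax varies between engines. The sketch below assumes Python's `re.sub`, where `\1` plays the role of `$1` and `\g<0>` plays the role of `$&`. Input strings are illustrative.

```python
import re

# \1 and \2 refer to capture groups 1 and 2 (written $1 and $2 in the table).
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")   # 'world hello'

# \g<0> is the complete match (written $& in the table).
wrapped = re.sub(r"\d+", r"[\g<0>]", "a1b22")               # 'a[1]b[22]'

# Escapes such as \t are expanded in the replacement string.
tabbed = re.sub(r",", r"\t", "a,b,c")                       # 'a\tb\tc'

print(swapped, wrapped, repr(tabbed))
```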
Character Escapes
symbol | code | meaning |
---|---|---|
\a | \u0007 | Bell |
\b | \u0008 | Backspace |
\t | \u0009 | Tab |
\r | \u000D | CR |
\v | \u000B | Vertical tab |
\f | \u000C | Form feed |
\n | \u000A | NL |
\e | \u001B | Escape |
\nNN | | Octal character |
\xNN | | Hex character |
\uNNNN | | Unicode character |
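Numeric escapes can be used inside a pattern to match a character by its code. A small sketch assuming Python's `re` module, which accepts `\xNN` and `\uNNNN` in string patterns. Sample strings are illustrative.

```python
import re

# \x41 is the hex escape for "A".
hex_hits = re.findall(r"\x41", "ABCA")            # ['A', 'A']

# \u00e9 is the Unicode escape for "é".
unicode_hits = re.findall(r"\u00e9", "caf\u00e9")  # ['é']

# \t inside a pattern matches a tab character.
fields = re.split(r"\t", "a\tb\tc")               # ['a', 'b', 'c']

print(hex_hits, unicode_hits, fields)
```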
Modifiers
symbol | function | example |
---|---|---|
g | Global | |
m | Multiline | |
i | Case insensitive | |
u | Unicode | |
U | Ungreedy | |
x | Ignore whitespace/verbose |
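How modifiers are passed depends on the engine. The following sketch assumes Python's `re` module, where flags are arguments rather than `/.../gmi` suffixes. Python has no `g` flag (`findall` and `finditer` are already global) and no `U` flag. Sample strings are illustrative.

```python
import re

# i: case-insensitive matching
any_case = re.findall(r"regex", "Regex REGEX regex", re.I)   # ['Regex', 'REGEX', 'regex']

# m: ^ and $ match at every line, not just at the ends of the string
line_starts = re.findall(r"^\d+", "1\n2\n3", re.M)           # ['1', '2', '3']

# x: whitespace and # comments inside the pattern are ignored
phone = re.compile(r"""
    (\d{3})    # prefix
    [-. ]?     # optional separator
    (\d{4})    # line number
""", re.X)
phone_parts = phone.search("555-1234").groups()              # ('555', '1234')

print(any_case, line_starts, phone_parts)
```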
Useful ones
Get array content without the brackets, per word
(?:\[([^\]]+)\]) /gm
*"(\w+)": "?(\w*)"?,/gm
MySQL : table structure parser
^(\w+)\s+(\w+)(\(\d*\))?.*$
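The table structure pattern above can be exercised as follows. This is a sketch under assumptions: it uses Python's `re` module with the multiline flag, and the column definitions are invented sample input with unquoted names and no leading indentation.

```python
import re

# Pattern copied from the entry above: column name, type, optional (length).
COLUMN = re.compile(r"^(\w+)\s+(\w+)(\(\d*\))?.*$", re.M)

# Hypothetical column lines, as they might appear inside a CREATE TABLE body.
ddl = (
    "id int(11) NOT NULL,\n"
    "name varchar(255) DEFAULT NULL,\n"
    "created datetime NOT NULL"
)

columns = COLUMN.findall(ddl)
# [('id', 'int', '(11)'), ('name', 'varchar', '(255)'), ('created', 'datetime', '')]

for name, col_type, length in columns:
    print(name, col_type, length or "-")
```

Lines that start with whitespace or a backtick-quoted name would not match this pattern as written.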
^\w{4,}@\w{2,}(\.com|\.co\.uk|\.net|\.info|\.xyz)$ /gm
Email regex that also accepts Gmail's plus addressing (user+tag)
Root email: user@gmail.com, Gmail plus address: user+extra@gmail.com
^[a-zA-Z0-9_.]+[+]?[a-zA-Z0-9]+[@]{1}[a-z0-9]+[\.][a-z]+$ /gm
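A quick check of the plus-address pattern above, assuming Python's `re` module. The addresses are sample input, and the `accepts` helper is a hypothetical name used only for this sketch.

```python
import re

# Pattern copied from the entry above.
PLUS_EMAIL = re.compile(
    r"^[a-zA-Z0-9_.]+[+]?[a-zA-Z0-9]+[@]{1}[a-z0-9]+[\.][a-z]+$", re.M
)

def accepts(address: str) -> bool:
    """Return True if the whole address matches the pattern."""
    return PLUS_EMAIL.match(address) is not None

root_ok = accepts("user@gmail.com")          # True
plus_ok = accepts("user+extra@gmail.com")    # True
double_at = accepts("user@@gmail.com")       # False
empty_tag = accepts("user+@gmail.com")       # False

print(root_ok, plus_ok, double_at, empty_tag)
```

The pattern allows only one dot in the domain part, so an address such as `user@mail.example.com` is rejected.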
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])? /g
\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\.){3}(?:(?:2([0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9]))\b /ig
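The IPv4 pattern above can be tried with a sketch like this one, assuming Python's `re` module. Because the last octet contains a capturing group, `finditer` with `group(0)` is used to get the whole address rather than `findall`. The host list is made up.

```python
import re

# Pattern copied from the entry above.
IPV4 = re.compile(
    r"\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\.){3}"
    r"(?:(?:2([0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9]))\b"
)

text = "hosts: 192.168.1.1, 10.0.0.256, 255.255.255.255"

# 10.0.0.256 is skipped because 256 is not a valid octet.
addresses = [m.group(0) for m in IPV4.finditer(text)]
# ['192.168.1.1', '255.255.255.255']

print(addresses)
```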
\b(?:(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9]))|\:(\d*)\b /g
^(?1)){1,6}$ /gmi
^(?:[[:xdigit:]]{1,4}:){5}:(?:[[:xdigit:]]{1,4}:){1,6}:$ /gm
<.*?script.*\/?> /ig
^\s*(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)\s*$ /gm
(?:https?:\/\/)?(?:(?:(?:www\.?)?youtube\.com(?:\/(?:(?:watch\?.*?(v=[^&\s]+).*)|(?:v(\/.*))|(channel\/.+)|(?:user\/(.+))|(?:results\?(search_query=.+))))?)|(?:youtu\.be(\/.*)?))
(\[((?:\[^\[\)*)\]\([ \t]*()<?((?:\([^)]*\)|[^()\s])*?)>?[ \t]*((['"])(.*?)\6[ \t]*)?\)) /g
\/\/(?![\S]{2,}\.[\w]).*|\/\*(.|\n)+?\*\/ /g
(?:\s)\s /g
\b0x(?:[0-9A-Fa-f]{6}|[0-9A-Fa-f]{8})\b
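The last pattern matches `0x`-prefixed hex values of exactly 6 or 8 digits. A minimal sketch, assuming Python's `re` module and written with both character classes fully bracketed. The sample values are illustrative.

```python
import re

HEX_VALUE = re.compile(r"\b0x(?:[0-9A-Fa-f]{6}|[0-9A-Fa-f]{8})\b")

text = "0xFF00FF 0x11223344 0x123 0x1234567"

# Only the 6-digit and 8-digit values match; 3 and 7 digits are rejected.
hex_values = HEX_VALUE.findall(text)   # ['0xFF00FF', '0x11223344']

print(hex_values)
```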