RegExp

Regular expressions

Special Characters

. [ { ( ) \ ^ $ | ? * +

symbol function example
\ Escape symbol - makes the next character literal \. dot
\* star
\\ backslash
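
As a small illustration of the escape rule above, here is a minimal sketch using Python's `re` module (the choice of Python is an assumption; this sheet does not name an engine). It compares an unescaped dot with an escaped one and uses `re.escape()` to add the backslashes automatically. The sample strings are made up.

```python
import re

sample = "1.5 1x5"

# An unescaped dot matches any character, so both tokens are found.
unescaped = re.findall(r"1.5", sample)

# An escaped dot matches only a literal dot.
escaped = re.findall(r"1\.5", sample)

# re.escape() inserts the backslashes when the literal text comes from a variable.
literal_pattern = re.escape("a.b*c")

print(unescaped)        # ['1.5', '1x5']
print(escaped)          # ['1.5']
print(literal_pattern)  # a\.b\*c
```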

General tokens

symbol function example
\n Newline
\N Anything but a newline
\t Tab
\0 Null character

Meta sequences

symbol function example
. Any single character other than newline (or including line terminators with the /s flag) /.+/ = a b c
a|b a OR b
\s Any whitespace
\S Any non-whitespace
\d Any digit
\D Any non-digit
\w Any word character
\W Any non-word character
\h Horizontal whitespace character
\r Carriage return
\R Unicode newlines
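
The meta sequences above can be tried out with a short Python sketch (an assumed engine; the sample text is invented). The PCRE-only tokens `\h` and `\R` are left out because Python's `re` module does not support them.

```python
import re

text = "order 42\tshipped on 2024-01-15"

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of word characters
fields = re.split(r"\s+", text)     # split on any whitespace, including the tab

# The dot stops at a newline unless the DOTALL flag (the /s flag) is set.
per_line = re.findall(r".+", "a\nb")
whole = re.findall(r".+", "a\nb", flags=re.DOTALL)

print(digits)    # ['42', '2024', '01', '15']
print(words)     # ['order', '42', 'shipped', 'on', '2024', '01', '15']
print(fields)    # ['order', '42', 'shipped', 'on', '2024-01-15']
print(per_line)  # ['a', 'b']
print(whole)     # ['a\nb']
```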

Character classes

symbol function example
[abc] A single character of a b c
[^abc] A character except a b c
[a-zA-Z] A character in range a - z or A-Z
[a-z] A character in range a - z
[^a-z] A character not in range a - z
[[:<:]] Start of word.
POSIX-style equivalent of \b (word boundary); interpreted as \b(?=\w)
[[:>:]] End of word.
POSIX-style equivalent of \b (word boundary); interpreted as \b(?<=\w)
[[:alnum:]] Letters and digits. Equivalent to [A-Za-z0-9]
[[:alpha:]] Letters.
Equivalent to [A-Za-z].
[[:ascii:]] ASCII codes 0 - 127.
Equivalent to [\x00-\x7F]
[[:blank:]] Space or Tab only (not new lines).
Equivalent to [ \t]
[[:word:]] Word character, letters, numbers, underscores.
POSIX equivalent to \w or [a-zA-Z0-9_]
[[:punct:]] Punctuation: printable characters that are not whitespace, letters or digits.
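
Not every engine accepts the POSIX bracket classes; Python's `re` module, for one, does not. The sketch below therefore uses the explicit ranges from the "Equivalent to" notes in the table. The constant names `ALNUM` and `BLANK` are hypothetical, chosen only for this example.

```python
import re

# Explicit ranges taken from the "Equivalent to" notes in the table.
ALNUM = r"[A-Za-z0-9]"   # stands in for [[:alnum:]]
BLANK = r"[ \t]"         # stands in for [[:blank:]]

# Runs of letters and digits; the hyphen and underscore split the tokens.
tokens = re.findall(ALNUM + r"+", "abc-123 x_y")

# A negated class removes everything that is not a lowercase letter.
lowercase_only = re.sub(r"[^a-z]", "", "Hello, World!")

# Collapse runs of spaces and tabs, leaving the newline alone.
collapsed = re.sub(BLANK + r"+", " ", "a \t b\nc")

print(tokens)          # ['abc', '123', 'x', 'y']
print(lowercase_only)  # elloorld
print(repr(collapsed)) # 'a b\nc'
```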

Quantifiers

symbol function example
a? Zero or one of a. .? = Zero or one of any character
a* Zero or more of a.
Greedy quantifier - matches as many characters as possible
a+ One or more of a
a{3} Exactly 3 of a
a{3,} 3 or more of a
a{3,6} Between 3 and 6 of a
a*? Lazy quantifier - matches as few characters as possible
a*+ Possessive quantifier - matches as many characters as possible and does not give them back when backtracking
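
The difference between greedy and lazy matching is easiest to see side by side. A minimal Python sketch with made-up input follows; possessive quantifiers are omitted because Python's `re` module only accepts them from version 3.11.

```python
import re

html = "<b>one</b> <b>two</b>"

# Greedy: .* runs to the last </b>, so the whole string is one match.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the first </b> it can, giving one match per tag pair.
lazy = re.findall(r"<b>.*?</b>", html)

# Counted repetition: between 3 and 6 of "a".
counted = re.findall(r"a{3,6}", "aa aaa aaaaaaa")

print(greedy)   # ['<b>one</b> <b>two</b>']
print(lazy)     # ['<b>one</b>', '<b>two</b>']
print(counted)  # ['aaa', 'aaaaaa']
```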

Anchors

symbol function example
\b A word boundary
\B Non-word boundary
^ or \A Start of string
$ or \Z End of string
\z Absolute end of string
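
A short Python sketch of the anchors in action (sample strings are invented). Anchor spellings vary by engine: in Python's `re`, `\Z` matches only at the absolute end of the string, which is the role `\z` has in the table above.

```python
import re

text = "one two\nthree four"

# Without the multiline flag, ^ and $ anchor to the whole string.
first_only = re.findall(r"^\w+", text)
last_word = re.findall(r"\w+$", text)

# With the multiline flag, ^ also matches after each newline.
each_line = re.findall(r"^\w+", text, flags=re.MULTILINE)

# \b requires a word boundary on both sides; \B requires the absence of one.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")
inside_word = re.findall(r"\Bcat", "cat concat")

print(first_only)   # ['one']
print(last_word)    # ['four']
print(each_line)    # ['one', 'three']
print(whole_words)  # ['cat', 'cat']
print(inside_word)  # ['cat']
```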

Groups

symbol function example
(?: ...) Match anything enclosed without capturing it (non-capturing group)
(...) Capture anything enclosed
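
The two group types differ only in what is kept after the match. A minimal Python sketch, with an invented sample sentence:

```python
import re

sentence = "Hello Ms. Smith"

# (?:...) groups the alternation but keeps nothing; only the name is captured.
non_capturing = re.search(r"(?:Mr|Ms)\. (\w+)", sentence)

# (...) captures the title as well, so two groups come back.
capturing = re.search(r"(Mr|Ms)\. (\w+)", sentence)

print(non_capturing.group(0))  # Ms. Smith
print(non_capturing.groups())  # ('Smith',)
print(capturing.groups())      # ('Ms', 'Smith')
```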

Substitution

symbol function example
$1 Contents of capture group 1
$` Contents before match
$' Contents after match
$& Complete match content
\x20 Hexadecimal replacement values
\x{06fa} Hexadecimal replacement values
\t Insert Tab
\r Insert carriage return
\n Insert Newline
\f Insert form-feed
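
Replacement syntax is engine-specific. The tokens above use the `$` style; Python's `re.sub()` writes group references as `\1` or `\g<1>` and the whole match as `\g<0>`. The sketch below shows those equivalents on made-up input, and emulates the "before match" and "after match" tokens with string slicing on the match object.

```python
import re

# Group references: reorder a date (equivalent of $3/$2/$1).
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2024-01-15")

# Whole match (equivalent of $&): wrap every word in brackets.
wrapped = re.sub(r"\w+", r"[\g<0>]", "a bc")

# Text before and after the match, taken from the match object.
m = re.search(r"=", "key=value")
before = m.string[:m.start()]
after = m.string[m.end():]

# Escapes such as \t are expanded inside the replacement template.
tabbed = re.sub(r", ", r"\t", "a, b")

print(swapped)        # 15/01/2024
print(wrapped)        # [a] [bc]
print(before, after)  # key value
print(repr(tabbed))   # 'a\tb'
```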

Character Escapes

symbol code meaning
\a \u0007 Bell
\b \u0008 Backspace
\t \u0009 Tab
\r \u000D CR
\v \u000B Vertical tab
\f \u000C Formfeed
\n \u000A NL
\e \u001B Escape
\NNN Octal character (N = octal digit)
\xNN Hex character
\uNNNN Unicode character

Modifiers

symbol function example
g Global
m Multiline
i Case insensitive
u Unicode
U Ungreedy
x Ignore whitespace/verbose
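
In Python the modifiers are passed as flags rather than written after the closing slash. The sketch below is an assumed mapping, not part of the original sheet: `g` corresponds to using `re.findall()` instead of `re.search()`, `i` to `re.IGNORECASE`, `m` to `re.MULTILINE` and `x` to `re.VERBOSE`. Python's `re` has no counterpart of the `U` (ungreedy) flag. The `phone` pattern is a made-up example.

```python
import re

# i: case-insensitive; findall plays the role of the g (global) flag.
caseless = re.findall(r"hello", "Hello HELLO", flags=re.IGNORECASE)

# m: ^ and $ match at every line, not only at the ends of the string.
line_ends = re.findall(r"\w+$", "ab cd\nef gh", flags=re.MULTILINE)

# x: whitespace and # comments in the pattern are ignored.
phone = re.compile(r"""
    (\d{3})   # prefix
    [-. ]?    # optional separator (a space inside [] is still literal)
    (\d{4})   # line number
""", re.VERBOSE)
parts = phone.match("555-1234").groups()

print(caseless)   # ['Hello', 'HELLO']
print(line_ends)  # ['cd', 'gh']
print(parts)      # ('555', '1234')
```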

Useful ones

Get bracketed array content without the square brackets (one capture per bracket pair)

(?:\[([^\]]+)\]) /gm
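
A quick check of this pattern in Python, on an invented input string. Because the pattern has a single capture group, `findall()` returns only the text between the brackets.

```python
import re

BRACKETED = re.compile(r"(?:\[([^\]]+)\])")

# Empty brackets are skipped because [^\]]+ needs at least one character.
items = BRACKETED.findall("colors[red] sizes[small medium] empty[]")

print(items)  # ['red', 'small medium']
```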

Extract json to csv

\s*"(\w+)": "?(\w*)"?, /gm

MySQL : table structure parser

^(\w+)\s+(\w+)(\(\d*\))?.*$
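
To show what the three groups capture, here is a minimal Python sketch. The column definitions in `ddl` are a hypothetical, simplified input (no backticks, no leading indentation), not real `SHOW CREATE TABLE` output.

```python
import re

COLUMN = re.compile(r"^(\w+)\s+(\w+)(\(\d*\))?.*$", re.MULTILINE)

# Hypothetical column definitions, one per line.
ddl = """id int(11) NOT NULL
name varchar(255) DEFAULT NULL
created datetime"""

# Each tuple is (column name, type, optional length); a missing length is ''.
columns = COLUMN.findall(ddl)

for name, sql_type, length in columns:
    print(name, sql_type, length)
```

With this input, `columns` holds `('id', 'int', '(11)')`, `('name', 'varchar', '(255)')` and `('created', 'datetime', '')`.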

Match an email address

^\w{4,}@\w{2,}(\.com|\.co\.uk|\.net|\.info|\.xyz)$ /gm

Email address, allowing Gmail-style +tag aliases
root email = user@gmail.com, with tag = user+extra@gmail.com

^[a-zA-Z0-9_.]+[+]?[a-zA-Z0-9]+[@]{1}[a-z0-9]+[\.][a-z]+$ /gm
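
The sketch below runs this pattern in Python against a few sample addresses to show what it accepts and where it is strict. The helper name `is_accepted` is hypothetical. The domain part allows only one dot, so addresses on subdomains are rejected.

```python
import re

GMAIL_STYLE = re.compile(
    r"^[a-zA-Z0-9_.]+[+]?[a-zA-Z0-9]+[@]{1}[a-z0-9]+[\.][a-z]+$"
)

def is_accepted(address: str) -> bool:
    """Return True when the whole address matches the pattern."""
    return GMAIL_STYLE.match(address) is not None

print(is_accepted("user@gmail.com"))         # True
print(is_accepted("user+extra@gmail.com"))   # True
print(is_accepted("user@@gmail.com"))        # False
print(is_accepted("user@mail.example.com"))  # False: only one dot in the domain
```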

RFC 2822 Email validation

[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])? /g

IPv4 address

\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\.){3}(?:(?:2([0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9]))\b /ig
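
A minimal Python sketch that pulls addresses out of a made-up log line with this pattern. The pattern contains one capturing group in the last octet, so the example reads `group(0)` from `finditer()` rather than using `findall()`, which would return only that group.

```python
import re

IPV4 = re.compile(
    r"\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\.){3}"
    r"(?:(?:2([0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9]))\b"
)

log = "ping 192.168.1.1 from 10.0.0.255, not 999.1.1.1"

# group(0) is the full match; octets above 255 keep the last entry out.
found = [m.group(0) for m in IPV4.finditer(log)]

print(found)  # ['192.168.1.1', '10.0.0.255']
```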

IP proxy scrape (captures the four octets, or the :port)

\b(?:(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9]))|\:(\d*)\b /g

Match an IPv6 address (full eight-group form only; the :: shorthand is not handled)

^(?:[0-9a-f]{1,4}:){7}[0-9a-f]{1,4}$ /gmi

OR

^(?:[[:xdigit:]]{1,4}:){7}[[:xdigit:]]{1,4}$ /gm

Detect script tag

<.*?script.*\/?> /ig

Phone number

^\s*(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)\s*$ /gm

YouTube URL

(?:https?:\/\/)?(?:(?:(?:www\.?)?youtube\.com(?:\/(?:(?:watch\?.*?(v=[^&\s]+).*)|(?:v(\/.*))|(channel\/.+)|(?:user\/(.+))|(?:results\?(search_query=.+))))?)|(?:youtu\.be(\/.*)?))

Markdown link

(\[((?:\[[^\]]*\]|[^\[\]])*)\]\([ \t]*()<?((?:\([^)]*\)|[^()\s])*?)>?[ \t]*((['"])(.*?)\6[ \t]*)?\)) /g

JS Comment

\/\/(?![\S]{2,}\.[\w]).*|\/\*(.|\n)+?\*\/ /g

Trim whitespace

(?:\s)\s /g

Hex color

\b0x(?:[0-9A-Fa-f]{6}|[0-9A-Fa-f]{8})\b
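
A short Python check of the hex color pattern on invented input. It assumes the second alternative is the bracketed class `[0-9A-Fa-f]{8}`, so both the 6-digit (RGB) and 8-digit (with alpha) `0x` forms are accepted.

```python
import re

HEX_COLOR = re.compile(r"\b0x(?:[0-9A-Fa-f]{6}|[0-9A-Fa-f]{8})\b")

# The trailing \b rejects values with any other digit count, such as 0xFFF.
found = HEX_COLOR.findall("fg=0xFFAA00 bg=0x80FFAA00 bad=0xFFF")

print(found)  # ['0xFFAA00', '0x80FFAA00']
```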