What Are Regular Expressions?
Regular expressions (regex or regexp) are sequences of characters that define a search pattern. They are one of the most powerful tools in a developer's toolkit, used for searching, validating, extracting, and replacing text. While the syntax can look intimidating at first, learning regex can save you a great deal of manual string processing.
A regular expression like ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ can check that a string has the basic shape of an email address in a single line of code, with no loops and no manual character checking. That's the power of regex.
Regular expressions are supported in virtually every programming language: JavaScript, Python, Java, Ruby, PHP, Go, Rust, and many more. The core syntax is largely the same across languages, though there are some differences in advanced features.
Basic Regex Syntax
Literal Characters
The simplest regex just matches literal text. The pattern hello matches the exact string "hello" anywhere in the input.
The Dot (.)
The dot matches any single character except a newline (unless the s flag is used):
- Pattern: h.t matches "hat", "hot", "hit", "h1t", "h@t"
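A minimal Python sketch of the dot's behavior, using the standard re module. The sample strings and variable names are illustrative assumptions:

```python
import re

# The dot matches any single character except a newline by default.
samples = ["hat", "hot", "h1t", "h@t", "ht", "h\nt"]
matched = [s for s in samples if re.fullmatch(r"h.t", s)]
print(matched)

# With re.DOTALL (Python's name for the s flag), the dot also matches a newline.
dotall_match = re.fullmatch(r"h.t", "h\nt", flags=re.DOTALL)
print(dotall_match is not None)
```

Here "ht" is rejected because the dot must consume exactly one character, and "h\nt" is accepted only once the dotall flag is set.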
Anchors (^ and $)
- ^ matches the start of the string (or line in multiline mode)
- $ matches the end of the string (or line in multiline mode)
- Pattern: ^hello$ matches only the exact string "hello", nothing more, nothing less
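The effect of anchors, with and without multiline mode, can be sketched in Python as follows. The two-line input is an assumption chosen for illustration:

```python
import re

text = "hello\nhello world"

# Without anchors, the pattern can match anywhere in the input.
anywhere = re.findall(r"hello", text)

# With ^ and $ the whole string must be "hello"; this input is longer.
whole_string = re.search(r"^hello$", text)

# In multiline mode, ^ and $ also match at the start and end of each line.
per_line = re.findall(r"^hello$", text, flags=re.MULTILINE)

print(anywhere, whole_string, per_line)
```

Only the first line consists of exactly "hello", so the multiline search finds one match while the unanchored search finds two.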
Escaping Special Characters (\)
To match a literal special character, escape it with a backslash:
- \. matches a literal period
- \+ matches a literal plus sign
- \\ matches a literal backslash
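A short Python sketch of why escaping matters. The sample strings are assumptions; re.escape() is the standard-library helper for escaping arbitrary text before placing it in a pattern:

```python
import re

# An unescaped dot matches any character, so "3x14" slips through.
loose = re.fullmatch(r"3.14", "3x14")

# Escaping the dot restricts the match to a literal period.
strict = re.fullmatch(r"3\.14", "3x14")
literal = re.fullmatch(r"3\.14", "3.14")

# re.escape() escapes the special characters in arbitrary text,
# which helps when a pattern is built from user input.
escaped = re.escape("1+1=2?")
found = re.search(escaped, "Is 1+1=2? Yes.")
print(found.group())
```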
Character Classes
Character classes let you match any one character from a specified set.
Custom Character Classes ([])
- [abc] matches "a", "b", or "c"
- [a-z] matches any lowercase letter
- [A-Z] matches any uppercase letter
- [0-9] matches any digit
- [a-zA-Z0-9] matches any alphanumeric character
- [^abc] matches any character except a, b, or c (the ^ inside [] means NOT)
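Custom classes can be tried out with a small Python sketch like this one. The sample text and variable names are illustrative only:

```python
import re

text = "Order A7 shipped, order b12 pending"

# [A-Z][0-9] matches one uppercase letter followed by one digit.
upper_codes = re.findall(r"[A-Z][0-9]", text)

# [a-zA-Z][0-9]+ accepts a letter of either case followed by digits.
all_codes = re.findall(r"[a-zA-Z][0-9]+", text)

# [^aeiou ] matches any character that is not a lowercase vowel or a space.
non_vowels = re.findall(r"[^aeiou ]", "regex fun")

print(upper_codes, all_codes, non_vowels)
```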
Shorthand Character Classes
These are built-in shortcuts for common character sets:
| Shorthand | Matches | Equivalent |
|---|---|---|
| \d | Any digit | [0-9] |
| \D | Any non-digit | [^0-9] |
| \w | Word character | [a-zA-Z0-9_] |
| \W | Non-word character | [^a-zA-Z0-9_] |
| \s | Whitespace | space, tab, newline |
| \S | Non-whitespace | any non-whitespace |
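A minimal Python sketch of the shorthand classes in use, with a sample string made up for illustration:

```python
import re

text = "Room 42, floor_3\tWest"

digits = re.findall(r"\d+", text)    # runs of digits
words = re.findall(r"\w+", text)     # runs of word characters
parts = re.split(r"\s+", text)       # split on any run of whitespace
non_word = re.findall(r"\W", text)   # single non-word characters

print(digits, words, parts, non_word)
```

In Python 3, \d and \w in str patterns also match non-ASCII digits and letters; passing re.ASCII restricts them to the equivalents listed in the table.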
Quantifiers
Quantifiers specify how many times the preceding element should match.
Basic Quantifiers
- * matches 0 or more times: a* matches "", "a", "aa", "aaa"
- + matches 1 or more times: a+ matches "a", "aa", "aaa" but NOT ""
- ? matches 0 or 1 time (makes it optional): colou?r matches "color" and "colour"
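The three basic quantifiers can be checked with a short Python sketch. The candidate strings are assumptions, and re.fullmatch() is used so that the whole string has to fit the pattern:

```python
import re

# ? makes the preceding "u" optional, but does not allow two of them.
candidates = ["color", "colour", "colouur"]
spellings = [w for w in candidates if re.fullmatch(r"colou?r", w)]

# * allows zero repetitions, + requires at least one.
star_on_empty = re.fullmatch(r"a*", "") is not None
plus_on_empty = re.fullmatch(r"a+", "") is not None
plus_on_run = re.fullmatch(r"a+", "aaa") is not None

print(spellings, star_on_empty, plus_on_empty, plus_on_run)
```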
Exact Quantifiers ({})
- {n} matches exactly n times: \d{4} matches exactly 4 digits
- {n,} matches n or more times: \d{2,} matches 2 or more digits
- {n,m} matches between n and m times: \d{2,4} matches 2, 3, or 4 digits
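A small Python sketch comparing the three counted forms against the same inputs. The candidate list is illustrative:

```python
import re

candidates = ["7", "42", "123", "2026", "12345"]

exactly_four = [s for s in candidates if re.fullmatch(r"\d{4}", s)]
two_or_more = [s for s in candidates if re.fullmatch(r"\d{2,}", s)]
two_to_four = [s for s in candidates if re.fullmatch(r"\d{2,4}", s)]

print(exactly_four, two_or_more, two_to_four)
```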
Greedy vs Lazy Quantifiers
By default, quantifiers are greedy — they match as much as possible. Adding ? after a quantifier makes it lazy (matches as little as possible):
Input: <b>bold</b> and <i>italic</i>
Greedy: <.+> matches <b>bold</b> and <i>italic</i> (everything)
Lazy: <.+?> matches <b> then </b> then <i> then </i> (each tag)
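A small Python sketch of the difference, assuming the input is the tagged string <b>bold</b> and <i>italic</i> (the variable names are illustrative):

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the input, producing a single match.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" it can, producing one match per tag.
lazy = re.findall(r"<.+?>", html)

print(greedy)
print(lazy)
```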
Groups and Captures
Parentheses () create groups that serve two purposes: grouping expressions for quantifiers, and capturing the matched text.
Capturing Groups
const match = "2026-03-22".match(/(\d{4})-(\d{2})-(\d{2})/);
// match[1] = "2026" (year)
// match[2] = "03" (month)
// match[3] = "22" (day)
Non-Capturing Groups (?:)
When you need to group for quantifier purposes but don't need the captured text:
(?:https?://) groups "http://" or "https://" without capturing it.
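How a non-capturing group changes group numbering can be sketched in Python. The URL and the host pattern [\w.-]+ are simplifying assumptions, not a full URL parser:

```python
import re

url = "https://example.com"

# Capturing: the scheme becomes group 1 and the host group 2.
capturing = re.match(r"(https?://)([\w.-]+)", url)

# Non-capturing: the scheme is grouped but not stored, so the host is group 1.
non_capturing = re.match(r"(?:https?://)([\w.-]+)", url)

print(capturing.groups())
print(non_capturing.groups())
```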
Named Capturing Groups (?<name>...)
Modern JavaScript supports named groups with the (?<name>...) syntax; Python supports them too, spelled (?P<name>...):
const match = "2026-03-22".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
// match.groups.year = "2026"
// match.groups.month = "03"
// match.groups.day = "22"
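In Python's re module, named groups are written (?P<name>...). A minimal sketch of the same date example, with variable names chosen for illustration:

```python
import re

match = re.match(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})",
    "2026-03-22",
)

year = match.group("year")   # access one group by name
parts = match.groupdict()    # all named groups as a dict

print(year, parts)
```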
Lookahead and Lookbehind
- Positive lookahead (?=...): match only if followed by the pattern
- Negative lookahead (?!...): match only if NOT followed by the pattern
- Positive lookbehind (?<=...): match only if preceded by the pattern
- Negative lookbehind (?<!...): match only if NOT preceded by the pattern
Example: \d+(?= dollars) matches numbers followed by " dollars" but doesn't include " dollars" in the match.
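All four lookaround forms can be exercised with a short Python sketch. The sample sentence is made up, and the \b word boundaries are an added assumption that keeps the engine from matching part of a number:

```python
import re

text = "Paid 50 dollars and 30 euros, refunded $20"

followed_by_dollars = re.findall(r"\d+(?= dollars)", text)
not_followed_by_dollars = re.findall(r"\b\d+\b(?! dollars)", text)
after_dollar_sign = re.findall(r"(?<=\$)\d+", text)
not_after_dollar_sign = re.findall(r"(?<!\$)\b\d+", text)

print(followed_by_dollars, not_followed_by_dollars)
print(after_dollar_sign, not_after_dollar_sign)
```

In each case the lookaround text itself (" dollars" or "$") is checked but never included in the match.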
Regex Flags
Flags modify how the regex engine processes the pattern. The letters below are JavaScript's; other languages offer similar options under their own names:
| Flag | Name | Effect |
|---|---|---|
| g | Global | Find all matches, not just the first |
| i | Case-insensitive | Ignore upper/lowercase |
| m | Multiline | ^ and $ match start/end of each line |
| s | Dotall | . matches newlines too |
| u | Unicode | Treat pattern as Unicode |
| y | Sticky | Match only from lastIndex position |
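For comparison, here is a sketch of the closest Python equivalents of the i, m, and s flags, passed through the flags argument of the re functions. The log-style sample text is an assumption:

```python
import re

text = "Error: disk full\nerror: retry\nERROR: gave up"

# re.IGNORECASE corresponds to the i flag.
any_case = re.findall(r"error", text, flags=re.IGNORECASE)

# re.MULTILINE corresponds to m: ^ matches at the start of each line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# re.DOTALL corresponds to s: the dot also matches newlines.
across_lines = re.search(r"disk.*retry", text, flags=re.DOTALL)

# Flags combine with the | operator.
combined = re.findall(r"^error", text, flags=re.IGNORECASE | re.MULTILINE)

print(any_case, line_starts, combined)
```

Python has no g flag: re.findall() and re.finditer() return every match, while re.search() returns only the first.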
Common Regex Patterns (Ready to Use)
Email Address
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
URL (http/https)
https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&\/=]*)
US Phone Number
^(\+1)?[\s.-]?\(?[0-9]{3}\)?[\s.-]?[0-9]{3}[\s.-]?[0-9]{4}$
Matches: (555) 123-4567, 555-123-4567, +1 555 123 4567
Date (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$
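A date pattern of this kind checks the shape of the string but not the calendar, so a value like 2026-02-31 still passes. One possible way to cover both, sketched in Python with a hypothetical helper named is_valid_date and the pattern written with its | alternations:

```python
import re
from datetime import datetime

DATE_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$")

def is_valid_date(text):
    """Check the YYYY-MM-DD shape with regex, then the calendar with datetime."""
    if not DATE_RE.match(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        # Shape was fine, but the day does not exist (for example February 31).
        return False
    return True

print(is_valid_date("2026-03-22"), is_valid_date("2026-02-31"))
```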
IPv4 Address
^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
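A quick Python check of an IPv4 pattern with the 0-255 range alternatives spelled out using |. The test addresses are illustrative:

```python
import re

IPV4_RE = re.compile(
    r"^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"
)

addresses = ["192.168.1.1", "255.255.255.255", "256.1.1.1", "1.2.3"]
results = {ip: bool(IPV4_RE.match(ip)) for ip in addresses}

print(results)
```

Python's standard ipaddress module is an alternative when a regex is not strictly required.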
US ZIP Code
^\d{5}(-\d{4})?$
Strong Password (min 8 chars, uppercase, lowercase, digit, special char)
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
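Each lookahead in the password pattern enforces one requirement independently of the others. A Python sketch with made-up sample passwords shows which rule each candidate trips:

```python
import re

PASSWORD_RE = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
)

candidates = [
    "Passw0rd!",    # meets every rule
    "password1!",   # no uppercase letter
    "PASSWORD1!",   # no lowercase letter
    "Passw0rd",     # no special character
    "Pw0!",         # too short
]
checks = {pw: bool(PASSWORD_RE.match(pw)) for pw in candidates}

print(checks)
```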
Credit Card Number (basic format)
^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11})$
Using Regex in JavaScript
JavaScript has two ways to create regex: literals and the RegExp constructor.
// Regex literal (preferred for static patterns)
const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/i;
// RegExp constructor (use when pattern is dynamic)
const variablePart = " world"; // example value; escape special characters in real input
const dynamicPattern = new RegExp("hello" + variablePart, "gi");
// .test() — returns true/false
emailRegex.test("user@example.com"); // true
emailRegex.test("not-an-email"); // false
// .match() — returns array of matches
"foo bar baz".match(/\b\w+\b/g); // ["foo", "bar", "baz"]
// .replace() — replace matches
"hello world".replace(/world/, "regex"); // "hello regex"
"2026-03-22".replace(/(\d{4})-(\d{2})-(\d{2})/, "$3/$2/$1"); // "22/03/2026"
// .split() — split string by regex
"one,two;three four".split(/[,;\s]+/); // ["one", "two", "three", "four"]
Using Regex in Python
Python's re module provides comprehensive regex support:
import re

pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
# re.match() — matches only at the start of the string
result = re.match(pattern, "user@example.com")
# re.search() — finds first match anywhere in the string
result = re.search(r"\d+", "There are 42 items in 3 boxes")
print(result.group()) # "42"
# re.findall() — returns all matches as a list
numbers = re.findall(r"\d+", "There are 42 items in 3 boxes")
print(numbers) # ['42', '3']
# re.sub() — replace matches
result = re.sub(r"\s+", "-", "hello world foo")
print(result) # "hello-world-foo"
# Compile for performance when reusing the same pattern
email_re = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", re.IGNORECASE)
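A compiled pattern exposes the same operations as methods. The sketch below reuses the unanchored email pattern on a sample sentence (the text is an assumption) to contrast match(), search(), findall(), and finditer():

```python
import re

email_re = re.compile(
    r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", re.IGNORECASE
)

text = "Contact alice@example.com or BOB@EXAMPLE.ORG for details."

# findall() returns every match as a string.
addresses = email_re.findall(text)

# match() only succeeds at the start of the string; search() scans forward.
at_start = email_re.match(text)
anywhere = email_re.search(text)

# finditer() yields match objects, which expose positions as well as text.
spans = [(m.group(), m.start()) for m in email_re.finditer(text)]

print(addresses)
print(at_start, anywhere.group())
```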
Regex Performance Tips
1. Be specific with character classes. Use [0-9] instead of . when you know you're matching digits. The more specific your pattern, the faster it runs.
2. Avoid catastrophic backtracking. Nested quantifiers like (a+)+ on a string that doesn't match can cause exponential backtracking and hang your application. Test your patterns with edge cases.
3. Compile patterns that are used repeatedly. In Python, use re.compile(). In JavaScript, define the regex literal once outside the loop.
4. Use non-capturing groups (?:) instead of capturing groups () when you don't need the captured text. This saves memory.
5. Anchor your patterns with ^ and $ when you need to match the full string. Without anchors, the engine may retry the pattern at every position in the string before giving up.
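Tips 2 and 3 can be sketched together in Python: a nested quantifier is replaced by an equivalent single quantifier, and both patterns are compiled once and reused. The inputs are kept short on purpose, since the risky pattern slows down sharply as a non-matching input grows:

```python
import re

# Nested quantifiers: (a+)+ can split a run of "a"s in exponentially many
# ways, and a failing match makes a backtracking engine try them all.
risky = re.compile(r"^(a+)+$")

# A single quantifier accepts exactly the same strings with no nesting.
safe = re.compile(r"^a+$")

# Short inputs only, so the risky pattern still finishes quickly.
samples = ["a", "aaaa", "aaaa!", ""]
same_results = all(bool(risky.match(s)) == bool(safe.match(s)) for s in samples)

print(same_results)
```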
Common Regex Mistakes
- Forgetting to escape special characters: . in a pattern matches any character. Use \. to match a literal period.
- Using greedy matching when lazy is needed: Always consider if your quantifiers should be lazy (*?, +?) for patterns inside larger strings.
- Not accounting for edge cases: Email regex in production should handle international domains. IP address regex should validate ranges 0-255.
- Relying on regex for HTML parsing: Don't parse HTML with regex. Use a proper DOM parser or library.
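As one example of the parser-based alternative, Python's standard library ships html.parser. The sketch below uses a hypothetical LinkCollector class (the name and sample markup are assumptions) to pull href values without any regex:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags using the standard-library parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

collector = LinkCollector()
collector.feed(
    '<p>See <a class="x" href="/docs">docs</a> and <a href=\'/faq\'>FAQ</a>.</p>'
)
links = collector.links
print(links)
```

The parser copes with attribute order and quoting style, two of the details that tend to break hand-written HTML regexes.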
Test and Debug Your Regex
The best way to learn regex is to experiment. Use our free Regex Tester to test your patterns against real input strings, see all matches highlighted, and understand what each part of your pattern matches. For complex patterns, try our AI Regex Generator which can convert plain English descriptions into working regex patterns.