Regex Tutorial for Developers
Regular expressions (regex) are one of the most powerful β and most feared β tools in a developer's arsenal. A 20-character regex can replace 50 lines of string-manipulation code. But they can also be cryptic, slow-to-write, and notoriously hard to read. This guide takes you from the basics to advanced patterns with practical, real-world examples.
1. What Is a Regular Expression?
A regular expression is a sequence of characters that defines a search pattern. Regex engines use these patterns to find, match, validate, extract, or replace text within strings.
// JavaScript example
const emailRegex = /^[\w.-]+@[\w.-]+\.[a-z]{2,}$/i;
console.log(emailRegex.test('[email protected]')); // true
console.log(emailRegex.test('not-an-email')); // false2. Building Blocks: The Core Syntax
2.1 Literal Characters
The simplest regex matches exact characters:
/hello/ β matches "hello" in "say hello world" /JSON/ β matches "JSON" in "parsing JSON data"
2.2 Anchors
| Anchor | Meaning | Example |
|---|---|---|
| ^ | Start of string/line | /^hello/ matches "hello world" but not "say hello" |
| $ | End of string/line | /world$/ matches "hello world" but not "world peace" |
| \b | Word boundary | /\bcat\b/ matches "cat" but not "catfish" |
| \B | Non-word boundary | /\Bcat\B/ matches "concatenate" internal cat |
2.3 Character Classes
. Any character except newline \d Digit [0-9] \D Non-digit \w Word character [a-zA-Z0-9_] \W Non-word character \s Whitespace (space, tab, newline) \S Non-whitespace [abc] One of: a, b, or c [^abc] Any character EXCEPT a, b, c [a-z] Lowercase letter range [A-Z0-9] Uppercase or digit
2.4 Quantifiers
* Zero or more (greedy)
+ One or more (greedy)
? Zero or one (optional)
{n} Exactly n times
{n,} At least n times
{n,m} Between n and m times
// Lazy (non-greedy) versions β add ?
*? +? ?? {n,m}?3. Groups and Capturing
// Capturing group β saves matched text
const dateRegex = /(\d{4})-(\d{2})-(\d{2})/;
const match = '2026-03-15'.match(dateRegex);
// match[1] = "2026", match[2] = "03", match[3] = "15"
// Named capturing group
const named = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const m = '2026-03-15'.match(named);
console.log(m.groups.year); // "2026"
console.log(m.groups.month); // "03"
// Non-capturing group β group without saving
const noCapture = /(?:https?|ftp):\/\//;
// Alternation (OR)
const proto = /^(http|https|ftp):/;4. Lookahead and Lookbehind
Lookarounds match a position, not characters β they do not consume input:
// Positive lookahead: X followed by Y
/\d+(?= dollars)/.exec('100 dollars') // matches "100"
/\d+(?= dollars)/.exec('100 euros') // null
// Negative lookahead: X NOT followed by Y
/\d+(?! dollars)/.exec('100 euros') // matches "100"
// Positive lookbehind: X preceded by Y
/(?<=\$)\d+/.exec('$100') // matches "100"
// Negative lookbehind: X NOT preceded by Y
/(?<!\$)\d+/.exec('β¬100') // matches "100"5. Flags / Modifiers
| Flag | JS | Python | Meaning |
|---|---|---|---|
| Case insensitive | i | re.IGNORECASE | /hello/i matches HELLO |
| Global (find all) | g | findall() | /a/g finds every "a" |
| Multiline | m | re.MULTILINE | ^ and $ match line start/end |
| Dot matches newline | s | re.DOTALL | . matches \n too |
| Extended/verbose | β | re.VERBOSE | Allows whitespace and comments in regex |
6. Real-World Patterns
Email validation
/^[\w.+-]+@[\w-]+\.[a-zA-Z]{2,}$/Covers most common email formats. RFC 5322 full compliance requires a much more complex pattern.
URL matching
/https?:\/\/[\w.-]+(?:\/[\w./?=&#%-]*)? /
Matches http and https URLs with optional path and query string.
IPv4 address
/^(25[0-5]|2[0-4]\d|[01]?\d\d?)(\.(25[0-5]|2[0-4]\d|[01]?\d\d?)){3}$/Validates each octet is 0β255.
Password strength
/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/At least 8 chars, with uppercase, lowercase, digit, and special character.
Hex color code
/^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$/Matches both 3-digit (#fff) and 6-digit (#ffffff) hex colors.
7. Performance Tips
- Avoid catastrophic backtracking: Patterns like
(a+)+can cause exponential time complexity on certain inputs. Use atomic groups or possessive quantifiers where available. - Anchor your patterns: Use
^and$to prevent the engine from scanning the entire string unnecessarily. - Prefer specific character classes:
[0-9]is faster than.when you only need digits. - Compile once, reuse: In Python (
re.compile()) and Java, compile the regex once and reuse the compiled object across calls. - Limit greedy quantifiers: Use lazy quantifiers (
*?,+?) when you need the shortest match to avoid over-matching.
8. Conclusion
Regular expressions reward investment. Once you understand anchors, quantifiers, groups, and lookarounds, you can solve complex text processing problems in seconds. The key is practice β build patterns incrementally, test them against multiple inputs, and document them with comments for future maintainers.
Use our free Regex Tester to build and test patterns in real time, directly in your browser.