What is a regular expression in JavaScript?

A regular expression (regex) is a pattern used to match, search, and replace text. JavaScript regex literals use /pattern/flags syntax. Common flags: g (global — find all matches), i (case insensitive), m (multiline), s (dotAll — . matches newlines). Regex is used for validation, string parsing, and text transformation.

What is the difference between test() and match() in JavaScript regex?

regex.test(string) returns a boolean — true if the pattern is found. string.match(regex) returns an array of matched substrings (or null). With the g flag, match() returns all matches. Without g, it returns the first match with capture groups. Use test() for validation, match() for extraction.

What is the global flag gotcha in JavaScript regex?

When a regex has the g flag, it maintains a lastIndex property between calls. Calling .test() or .exec() repeatedly on the same regex object advances lastIndex, causing alternating true/false results on the same string. Either create a new regex each call, or use string.match() which resets correctly.


JavaScript Regular Expressions Interview Questions

Regex is frequently tested in JavaScript interviews. Learn pattern syntax, flags, lookahead/lookbehind, and common pitfalls.

The Mental Model

Picture a very precise stencil. You press the stencil against a block of text and wherever the stencil's cut-out shape matches the text beneath, you've found a match. The stencil is your regex pattern. The block of text is your string. "Does this shape appear in the text?" is a test. "Where does this shape appear?" is a search. "Replace every place this shape appears with a different shape" is a replace operation.

What makes the stencil powerful — and difficult — is that it describes shapes, not exact text. Instead of saying "match the word 'color'", you can say "match any sequence of lowercase letters." Instead of "match '2024'", you can say "match any four consecutive digits." The stencil can describe patterns that match millions of possible strings with a single concise expression.

The key insight: regex is a mini-language inside JavaScript for describing text patterns. It has its own syntax, its own rules, its own operators. When you learn to read and write it fluently, string validation, extraction, and transformation tasks that would require dozens of lines of code collapse into a single expression. But regex that isn't understood isn't maintained — it degrades into untouchable magic. The goal is fluency, not mystery.

The Explanation

Creating regex — literal vs constructor

// Regex literal — compiled at parse time, best for static patterns
const pattern = /hello/i   // i flag = case-insensitive

// RegExp constructor — compiled at runtime, required for dynamic patterns
const word    = 'hello'
const dynamic = new RegExp(word, 'i')  // same as /hello/i
const escaped = new RegExp(word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'i')
// Must manually escape special characters in dynamic patterns!
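
To see why the escaping step matters, here is a minimal sketch. The helper names escapeRegExp and containsLiteral are hypothetical, chosen for illustration; only the escaping expression itself comes from the snippet above.

```javascript
// Hypothetical helper: put a backslash in front of every regex metacharacter.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Hypothetical helper: case-insensitive "does haystack contain this literal text?"
function containsLiteral(haystack, needle) {
  return new RegExp(escapeRegExp(needle), 'i').test(haystack)
}

// Without escaping, '+' is read as a quantifier: "one or more 1s, then a 1".
const unescapedResult = new RegExp('1+1').test('1+1=2')   // false
// With escaping, the pattern is the literal text 1+1.
const escapedResult = containsLiteral('1+1=2', '1+1')     // true
```

The same helper is worth applying to any user-supplied text before it reaches the RegExp constructor, since unescaped input can change the meaning of the pattern.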

// Six core flags, each changing how matching works (ES2022 added d, ES2024 added v):
/pattern/g  // global — find ALL matches, not just first
/pattern/i  // case-insensitive — 'A' matches 'a'
/pattern/m  // multiline — ^ and $ match line starts/ends, not just string start/end
/pattern/s  // dotAll — . matches newlines too (default: . doesn't match \n)
/pattern/u  // unicode — enables Unicode code point escapes, fixes surrogate pairs
/pattern/y  // sticky — match only at lastIndex position, advance or fail
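
The less familiar flags are easier to remember after seeing them change a result. The following is a small sketch; the sample strings and variable names are made up for the demonstration.

```javascript
const text = 'first line\nsecond line'

// m (multiline): ^ and $ also match at line breaks
const startsWithoutM = text.match(/^\w+/g)    // ['first']
const startsWithM    = text.match(/^\w+/gm)   // ['first', 'second']

// s (dotAll): . also matches the newline between the two lines
const dotDefault = /first.*second/.test(text)   // false
const dotAll     = /first.*second/s.test(text)  // true

// u (unicode): an astral symbol counts as one character, not two UTF-16 code units
const emoji = '\u{1F600}'
const oneCharDefault = /^.$/.test(emoji)    // false
const oneCharUnicode = /^.$/u.test(emoji)   // true

// y (sticky): the match must start exactly at lastIndex
const sticky = /\d+/y
sticky.lastIndex = 4
const stickyAtDigits = sticky.test('abc 123')   // true, index 4 is the '1'
sticky.lastIndex = 0
const stickyAtStart = sticky.test('abc 123')    // false, index 0 is the 'a'
```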

The essential character classes and quantifiers

// Character classes — what to match
.       // any character except newline (use /s flag to include newline)
\d      // digit [0-9]
\D      // non-digit [^0-9]
\w      // word character [a-zA-Z0-9_]
\W      // non-word character
\s      // whitespace (space, tab, newline, etc.)
\S      // non-whitespace
[abc]   // any one of: a, b, or c
[^abc]  // any character EXCEPT a, b, c
[a-z]   // range: any lowercase letter
[a-zA-Z0-9]  // union of ranges

// Anchors — where to match (zero width — they match position, not characters)
^       // start of string (or start of line with /m flag)
$       // end of string (or end of line with /m flag)
\b      // word boundary — between \w and \W
\B      // non-word boundary

// Quantifiers — how many times to match the preceding element
*       // 0 or more (greedy)
+       // 1 or more (greedy)
?       // 0 or 1 (makes preceding optional)
{3}     // exactly 3 times
{2,5}   // 2 to 5 times (greedy)
{2,}    // 2 or more times
*?      // 0 or more (LAZY — as few as possible)
+?      // 1 or more (LAZY)
{2,5}?  // 2 to 5 (LAZY)

// Examples:
/^\d{4}-\d{2}-\d{2}$/    // ISO date: 2024-03-15
/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/  // email (simplified)
/https?:\/\/\S+/          // URL starting with http or https
/\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/  // IPv4-shaped address (octet range 0-255 is not checked)
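
Patterns like the ones above check the shape of a value, not its meaning. One common approach, sketched here with hypothetical function names, is to let the regex confirm the shape and let ordinary code check the numeric ranges.

```javascript
const ISO_DATE_SHAPE = /^\d{4}-\d{2}-\d{2}$/
const IPV4_SHAPE = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/

// Shape check only: '2024-99-99' passes because the regex knows nothing about calendars.
function looksLikeIsoDate(text) {
  return ISO_DATE_SHAPE.test(text)
}

// Shape check with the regex, then a numeric range check on each captured octet.
function isIPv4(text) {
  const match = IPV4_SHAPE.exec(text)
  if (match === null) return false
  return match.slice(1).every(octet => Number(octet) <= 255)
}
```

Neither regex carries the g flag, so sharing them as module-level constants is safe. This sketch accepts octets with leading zeros such as '01'; reject those separately if your use case requires it.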

Groups — capturing, non-capturing, named

// Capturing group — (pattern) — captures the match for extraction
const date = '2024-03-15'
const match = date.match(/(\d{4})-(\d{2})-(\d{2})/)
match[0]  // '2024-03-15' — full match
match[1]  // '2024'       — first capture group
match[2]  // '03'         — second capture group
match[3]  // '15'         — third capture group

// Non-capturing group — (?:pattern) — group without capturing
// Use when you need to group for quantifiers but don't need the captured value
/(?:https?|ftp):\/\/\S+/  // groups 'https?' | 'ftp' without capturing
// Faster: non-capturing groups don't allocate capture memory

// Named capturing group — (?<name>pattern) — ES2018
const { groups } = '2024-03-15'.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/)
groups.year   // '2024'
groups.month  // '03'
groups.day    // '15'

// Backreferences — reference a previous capture group
/(['"]).*?\1/  // matches 'single quoted' or "double quoted" strings
              // \1 refers back to whatever group 1 matched (the quote character)

// Named backreference — ES2018
/(?<quote>['"]).*?\k<quote>/  // same as above with named group
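
Named groups pay off most when the match result feeds other code. A minimal sketch follows; parseIsoDate, toDayFirst, and unquote are hypothetical helper names used only for this illustration.

```javascript
const ISO_DATE = /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/

// Returns numeric parts, or null when the text is not in YYYY-MM-DD shape.
function parseIsoDate(text) {
  const match = ISO_DATE.exec(text)
  if (match === null) return null
  const { year, month, day } = match.groups
  return { year: Number(year), month: Number(month), day: Number(day) }
}

// Named groups can also be referenced in a replacement string as $<name>.
function toDayFirst(text) {
  return text.replace(ISO_DATE, '$<day>/$<month>/$<year>')
}

// Named backreference: the closing quote must be the same character as the opening one.
const QUOTED = /(?<quote>['"])(?<body>.*?)\k<quote>/

function unquote(text) {
  const match = QUOTED.exec(text)
  return match === null ? null : match.groups.body
}
```

For example, unquote applied to the text say "it's fine" now returns it's fine, because the apostrophe inside the double quotes cannot close a group that was opened with a double quote.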

The six regex methods — which to use when

const str = 'The price is $19.99 and $5.50'

// 1. regex.test(str) → boolean — does the pattern exist?
/\d+/.test(str)  // true — fast existence check, no allocation

// 2. str.match(regex) → array or null
// Without /g: returns first match + captured groups
str.match(/\$(\d+\.\d{2})/)
// ['$19.99', '19.99', index: 13, input: '...', groups: undefined]

// With /g: returns array of all full matches — NO captured groups!
str.match(/\$\d+\.\d{2}/g)  // ['$19.99', '$5.50']
// Note: with /g flag, captured groups are lost from match()

// 3. str.matchAll(regex) → iterator of all matches WITH groups — ES2020
// regex MUST have /g flag
for (const m of str.matchAll(/\$(?<amount>\d+\.\d{2})/g)) {
  console.log(m.groups.amount)  // '19.99', then '5.50'
}
// Or collect: [...str.matchAll(pattern)]

// 4. str.search(regex) → index of first match or -1
str.search(/\$/)  // 13 — index, no captured groups

// 5. str.replace(regex, replacement) / str.replaceAll(stringOrGlobalRegex, replacement)
str.replace(/\$(\d+\.\d{2})/g, '€$1')  // 'The price is €19.99 and €5.50'
// $1 in replacement string = first capture group
// $& = entire match, $` = before match, $' = after match

// Replacement as a function — full power
str.replace(/\$(\d+\.\d{2})/g, (match, amount) => {
  return '€' + (parseFloat(amount) * 0.92).toFixed(2)
})  // converts USD to EUR inline

// 6. str.split(regex) — split on a pattern
'one1two2three3four'.split(/\d/)  // ['one', 'two', 'three', 'four']
'one1two2three'.split(/(\d)/)     // includes captures: ['one','1','two','2','three']
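
When every match is needed together with its groups or position, there are two usual routes: matchAll, or a loop around exec. The sketch below shows both; the function names and the PRICE pattern are assumptions made for the example.

```javascript
const PRICE = /\$(?<dollars>\d+)\.(?<cents>\d{2})/g

// Route 1: matchAll works on an internal copy of the regex, so PRICE.lastIndex is not advanced.
function extractPrices(text) {
  return Array.from(
    text.matchAll(PRICE),
    m => Number(`${m.groups.dollars}.${m.groups.cents}`)
  )
}

// Route 2: an exec() loop. Each call resumes from lastIndex, so we loop on a
// private copy of the regex and leave the shared PRICE object untouched.
function extractPricesWithExec(text) {
  const re = new RegExp(PRICE.source, PRICE.flags)
  const found = []
  let m
  while ((m = re.exec(text)) !== null) {
    found.push({ value: Number(m[0].slice(1)), index: m.index })
  }
  return found
}

const sample = 'The price is $19.99 and $5.50'
const viaMatchAll = extractPrices(sample)       // [19.99, 5.5]
const viaExec = extractPricesWithExec(sample)   // values 19.99 and 5.5 at indexes 13 and 24
```

The exec loop is mainly useful in older environments or when you need to stop early; a pattern that can match the empty string needs extra care there, because lastIndex would not advance on its own.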

Lookahead and lookbehind — match without consuming

// Lookahead (?=...) — match X only if followed by Y
'100px 200em 50px'.match(/\d+(?=px)/g)   // ['100', '50'] — digits followed by 'px'
// The 'px' is not included in the match — it's just a condition

// Negative lookahead (?!...) — match X only if NOT followed by Y
'100px 200em 50px'.match(/\d+(?!px|\d)/g) // ['200'] — digits not followed by px or digit

// Lookbehind (?<=...) — match X only if preceded by Y — ES2018
'$100 €200 $50'.match(/(?<=\$)\d+/g)   // ['100', '50'] — digits preceded by $

// Negative lookbehind (?<!...) — match X only if NOT preceded by Y — ES2018
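
As an illustration of lookarounds acting as pure conditions, here is a hedged sketch. The sample string mirrors the lookbehind example above, and the password rule is a made-up policy for demonstration, not a recommendation.

```javascript
const amounts = '$100 €200 $50'

// Negative lookbehind: digit runs that are NOT preceded by a dollar sign.
// The \b stops the engine from matching the tail of a number such as the '00' in '100'.
const notDollar = amounts.match(/(?<!\$)\b\d+/g)   // ['200']

// Several lookaheads at the same position behave like AND conditions:
// at least one digit, one lowercase letter, one uppercase letter, and 8+ characters.
const STRONG = /^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}$/

function isStrongPassword(text) {
  return STRONG.test(text)
}
```

Each lookahead scans ahead from the start of the string and then hands the position back unchanged, which is why the final .{8,} still sees the entire input.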



Greedy vs lazy — the matching strategy

const html = '<b>bold</b> and <i>italic</i>'

// Greedy — matches as MUCH as possible (default)
html.match(/<.+>/)   // ['<b>bold</b> and <i>italic</i>'] — too much!
// The .+ expands as far right as it can before yielding

// Lazy — matches as LITTLE as possible (+? *?)
html.match(/<.+?>/g) // ['<b>', '</b>', '<i>', '</i>'] — correct
// The .+? expands only as far as needed before the > can match
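
A third option worth knowing is a negated character class, which often expresses the intent more directly than a lazy quantifier. The sketch below uses its own sample markup (an assumption for the demo) and a hypothetical stripTags helper.

```javascript
const markup = '<b>bold</b> and <i>italic</i>'

// Greedy: runs from the first '<' to the last '>'.
const greedy = markup.match(/<.+>/)[0]

// Lazy: stops at the first '>' that lets the overall match succeed.
const lazy = markup.match(/<.+?>/g)

// Negated class: "anything that is not '>'", so it can never run past a tag's end.
const negated = markup.match(/<[^>]+>/g)

// Hypothetical helper for flat, trusted snippets only; real HTML needs a parser.
function stripTags(text) {
  return text.replace(/<[^>]+>/g, '')
}
```

Here lazy and negated produce the same four tags. The negated class has the additional benefit that the engine never needs to backtrack across a closing bracket.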

// Catastrophic backtracking — a real performance issue
// A pathological regex can take exponential time on certain inputs
// Example: /^(a+)+$/ on 'aaaaaaaaaaaaaaaaaab'
// The nested quantifiers cause exponential backtracking attempts
// This is how ReDoS (Regular Expression Denial of Service) attacks work

// Safe patterns:
// - Avoid nested quantifiers on overlapping patterns: (a+)+, (a|ab)+
// - JS has no atomic groups or possessive quantifiers, so restructure the pattern instead
// - Test performance with inputs that contain near-matches

The lastIndex gotcha with /g flag

// Regex objects with /g are STATEFUL — they remember where they left off
const re = /\d+/g

re.test('abc 123')  // true  — lastIndex set to 7
re.test('abc 123')  // false — starts from lastIndex 7, finds nothing
re.test('abc 123')  // true  — lastIndex reset to 0 after failed match

// This bites you when you reuse a /g regex:
const emails = /[\w.-]+@[\w.-]+\.\w+/g
emails.test('a@b.com')   // true, lastIndex advances
emails.test('c@d.com')   // false: the search starts at the stale lastIndex (7), past the end of the string

// Fix: reset lastIndex or use a fresh regex each time
re.lastIndex = 0
// OR drop the /g flag when you only need a yes/no answer:
const hasEmail = str => /[\w.-]+@[\w.-]+\.\w+/.test(str)  // no /g, so the search always starts at index 0
// Without /g (or /y), test() and exec() ignore lastIndex, so the regex is effectively stateless
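
The bug is easiest to recognize when a shared /g regex is used inside an array callback. The following sketch reproduces it and shows two possible fixes; the variable and function names are made up for the example.

```javascript
const inputs = ['a1', 'b2', 'c3']

// Bug: one /g regex shared across calls. Every success leaves lastIndex at 2,
// so the next 2-character string is searched from its end and fails.
const sharedGlobal = /\d+/g
const buggy = inputs.map(s => sharedGlobal.test(s))   // [true, false, true]
sharedGlobal.lastIndex = 0                            // clean up after the demo

// Fix 1: no g flag, so lastIndex is never consulted.
const stateless = /\d+/
const fixedNoFlag = inputs.map(s => stateless.test(s))   // [true, true, true]

// Fix 2: keep the shared /g regex but reset it before each use.
function testFromStart(re, text) {
  re.lastIndex = 0
  return re.test(text)
}
const fixedReset = inputs.map(s => testFromStart(sharedGlobal, s))   // [true, true, true]
```

Fix 1 is usually the simpler choice for validation code, since a yes/no check has no use for the g flag in the first place.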

Common Misconceptions

⚠️Many devs think regex with the /g flag returns captured groups when using .match() — but actually when you call str.match(/pattern/g), the return value is an array of full match strings only — captured groups are completely absent. To get all matches WITH captured groups, use str.matchAll(/pattern/g) which returns an iterator of full match objects including groups (ES2020), or use a while loop with regex.exec().

⚠️Many devs think /g regexes are stateless and safe to share — but actually RegExp objects with the /g flag maintain a lastIndex property that tracks where the next search should start. Reusing a /g regex across multiple test() or exec() calls produces inconsistent results because lastIndex is not reset between calls. This is one of the most confusing stateful behaviors in JavaScript — always reset lastIndex or create a new regex instance.

⚠️Many devs think . in a regex matches any character — but actually . matches any character except newline (\n, \r, \u2028, \u2029) by default. To match truly any character including newlines, you must use the /s flag (dotAll mode, ES2018) or use a character class workaround like [\s\S] which matches whitespace OR non-whitespace — covering every possible character.

⚠️Many devs think complex regex is always the right tool for parsing complex formats — but actually regex is the wrong tool for parsing recursive or context-free grammars like HTML, JSON, nested brackets, and programming languages. The famous "don't parse HTML with regex" rule exists because HTML is not a regular language — nesting and context require a proper parser. Regex handles flat patterns; for structure, use a parser.

⚠️Many devs think catastrophic backtracking is a theoretical concern — but actually ReDoS (Regular Expression Denial of Service) is a real attack class with documented CVEs affecting Node.js, web servers, and validation libraries. Anchored patterns with nested quantifiers over overlapping alternatives (like /^(a|aa)+$/) can take exponential time on carefully crafted near-match inputs, hanging the Node.js event loop for seconds or minutes with only a few dozen characters of input.

Where You'll See This in Real Code

→Zod and Yup — the most popular validation libraries in the JavaScript ecosystem — use regex internally for string validations like .email(), .url(), and .uuid(). Zod's email validator is a carefully chosen regex that balances RFC compliance with practical false-negative avoidance. Understanding regex makes these validators debuggable when they reject valid-looking input, rather than being black boxes.

→Linters and compilers use regex for the structurally simple parts of source text. Comment directives such as // @ts-ignore and // eslint-disable-next-line are recognized by matching patterns against comment text, while the code itself is analyzed through a full AST parser. Regex handles the flat directive syntax; the parser handles the nested structure.

→URL routing in frameworks like Express.js and React Router originally used regex for route matching — Express's path-to-regexp library converts route strings like '/users/:id/posts/:postId' into regex patterns with one capture group per parameter, kept alongside a list of parameter names. Understanding that :id conceptually becomes a group like (?<id>[^/]+) makes the routing behavior — what paths match, which don't — predictable rather than magical.

→Log parsing in production monitoring is almost entirely regex-based — structured log parsers in tools like Logstash, Datadog, and custom log processors use grok patterns (named regex patterns) to extract structured fields from unstructured log lines. A single regex extracts timestamp, log level, service name, request ID, and message from a log line in one pass, then the captured groups become queryable structured data.

→Syntax highlighting in code editors — VS Code, CodeMirror, Monaco — uses TextMate grammar files which are essentially hierarchical regex patterns that tokenize source code into colored segments. Every color you see in a code editor corresponds to a regex match. The language grammar for JavaScript syntax highlighting is a carefully composed set of several hundred regex patterns for keywords, strings, comments, identifiers, and operators.
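
To make the routing example concrete, here is a deliberately simplified sketch of turning a route string into a regex with named groups. compileRoute and matchRoute are hypothetical names, and this is not how path-to-regexp is actually implemented; it ignores optional segments, wildcards, and trailing slashes.

```javascript
// Hypothetical converter: ':name' segments become named groups, everything else is escaped.
// Parameter names are assumed to be valid group names (letters, digits, underscore).
function compileRoute(route) {
  const source = route
    .split('/')
    .map(segment =>
      segment.startsWith(':')
        ? `(?<${segment.slice(1)}>[^/]+)`
        : segment.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
    )
    .join('/')
  return new RegExp(`^${source}$`)
}

// Returns an object of extracted parameters, or null when the path does not match.
function matchRoute(route, path) {
  const match = compileRoute(route).exec(path)
  return match === null ? null : { ...match.groups }
}

const params = matchRoute('/users/:id/posts/:postId', '/users/42/posts/7')
// params.id is '42' and params.postId is '7'
```

Because each parameter is [^/]+, a parameter can never swallow a slash, which is why '/users/42' does not match the two-parameter route above.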

⚡Interview Cheat Sheet

✦Literal /pattern/flags: static; new RegExp(str, flags): dynamic — escape special chars in dynamic
✦Flags: g (all matches), i (case-insensitive), m (multiline ^$), s (dot matches \n), u (unicode), y (sticky)
✦\d \w \s and their uppercase inverses; [abc] custom class; [^abc] negated class
✦Quantifiers: * (0+), + (1+), ? (0/1), {n,m}; add ? for lazy: *? +? {n,m}?
✦Anchors: ^ $ \b \B — zero-width position matchers
✦(group): capturing; (?:group): non-capturing; (?<name>group): named capture (ES2018)
✦test(): boolean; match(): first match or all strings with /g; matchAll(): all with groups; replace(): string or fn
✦/g regex is stateful — lastIndex persists between exec()/test() calls
✦Greedy expands max; lazy (+? *?) expands min — use lazy for HTML tag matching
✦Lookahead (?=) (?!); lookbehind (?<=) (?<!): conditions without consuming characters
✦ReDoS: nested quantifiers on overlapping patterns cause exponential backtracking

💡How to Answer in an Interview

1.The global flag + .test() alternating bug is a classic gotcha question
2.Email validation regex is a classic implementation request — have a solid one ready
3.Named capture groups: /(?<year>\d{4})/ → match.groups.year — shows modern knowledge

