
regex-syntax: escape ASCII whitespace in escape() and escape_into() #1353

Closed
tryHivemind wants to merge 1 commit into rust-lang:master from tryHivemind:fix/escape-whitespace


@tryHivemind

escape() output is not safe inside verbose mode (?x:...) where ASCII whitespace is ignored, breaking the documented guarantee. Fix escape_into() to emit hex escapes for ASCII whitespace so output is valid in all regex contexts. Fixes #1323

escape() documents that its output "may be safely used as a literal in
a regular expression", but unescaped whitespace characters break that
guarantee when the result is embedded in a verbose-mode group (?x:...),
where ASCII whitespace is treated as insignificant and ignored:

    use regex::Regex;
    let lit = regex_syntax::escape(" "); // returns " "
    let re = Regex::new(&format!("^(?x:{lit})$")).unwrap();
    assert!(!re.is_match(" ")); // false: the space inside (?x:...) is ignored

Fix escape_into() to emit \xNN hex escapes for ASCII whitespace
characters, making the output valid in all regex contexts including
verbose mode. Non-ASCII whitespace (e.g. U+2000 EN QUAD) is not a
concern because verbose mode only ignores ASCII whitespace per the
regex crate's own definition.
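
The diff itself is not shown in this conversation, so the following is only a minimal std-only sketch of the behaviour described above, not the actual patch. The names `escape_sketch`, `escape_into_sketch`, and `is_meta` are hypothetical. The meta-character set is an approximation of what `regex_syntax::is_meta_character` accepts. Using `char::is_ascii_whitespace` to decide what counts as ASCII whitespace is an assumption.

```rust
use std::fmt::Write;

/// Hypothetical stand-in for `regex_syntax::is_meta_character`.
/// The exact set used by the crate is assumed here, not copied from it.
fn is_meta(c: char) -> bool {
    matches!(
        c,
        '\\' | '.' | '+' | '*' | '?' | '(' | ')' | '|' | '[' | ']'
            | '{' | '}' | '^' | '$' | '#' | '&' | '-' | '~'
    )
}

/// Sketch of an `escape_into` that also escapes ASCII whitespace.
/// ASCII whitespace becomes a two-digit `\xNN` hex escape; meta
/// characters get a leading backslash; everything else is copied as-is.
fn escape_into_sketch(text: &str, buf: &mut String) {
    buf.reserve(text.len());
    for c in text.chars() {
        if c.is_ascii_whitespace() {
            // e.g. ' ' is written as \x20 and '\t' as \x09.
            // Writing to a String cannot fail, so unwrap is safe here.
            write!(buf, "\\x{:02X}", c as u32).unwrap();
        } else {
            if is_meta(c) {
                buf.push('\\');
            }
            buf.push(c);
        }
    }
}

/// Convenience wrapper mirroring the shape of `escape()`.
fn escape_sketch(text: &str) -> String {
    let mut buf = String::new();
    escape_into_sketch(text, &mut buf);
    buf
}
```

Under these assumptions, `escape_sketch(" ")` yields the four characters `\x20`, which a verbose-mode group would not skip, while non-ASCII whitespace such as U+2000 passes through unchanged.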

Fixes #1323

tryHivemind closed this by deleting the head repository on May 29, 2026

Issues this pull request may close: regex_syntax::escape() does not escape whitespace