perf: literal-prefix capture-extraction fast path #1350

Draft
Dandandan wants to merge 6 commits into rust-lang:master from Dandandan:perf/literal-prefix-capture-extract

@Dandandan Dandandan commented May 19, 2026

For anchored regexes of the shape

^<literal-prefix-set>([^X]+)X.*$    with replacement `${1}` / `$1`

capture 1's bounds are structurally trivial: skip the prefix, find the terminator with memchr. This PR makes both the meta engine and Regex::replacen exploit that.
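To make "skip the prefix, find the terminator" concrete, here is a minimal stdlib-only sketch of the extraction step. It is not the code in this PR: `extract_capture1` is a hypothetical helper, the prefix set is passed in already expanded in preference order, and a plain byte scan stands in for the `memchr` crate.

```rust
/// Hypothetical sketch of the capture-1 extraction for
/// `^<prefix-set>([^X]+)X.*$`. Not the PR's implementation.
///
/// `prefixes` is the expanded literal-alternation set in preference order;
/// `terminator` is the ASCII byte X.
fn extract_capture1<'h>(
    haystack: &'h str,
    prefixes: &[&str],
    terminator: u8,
) -> Option<&'h str> {
    // An ASCII byte never occurs inside a multi-byte UTF-8 sequence, so
    // slicing at its offset always lands on a char boundary.
    debug_assert!(terminator.is_ascii());
    for prefix in prefixes {
        // Step 1: skip the literal prefix.
        let Some(rest) = haystack.strip_prefix(prefix) else {
            continue;
        };
        // Step 2: find the terminator (the real strategy would use memchr).
        let Some(end) = rest.as_bytes().iter().position(|&b| b == terminator) else {
            continue;
        };
        // `[^X]+` needs at least one byte before the terminator.
        if end == 0 {
            continue;
        }
        // With default flags `.` does not match '\n' and `$` only matches at
        // the end of the haystack, so the tail must be newline-free.
        if rest.as_bytes()[end + 1..].contains(&b'\n') {
            continue;
        }
        return Some(&rest[..end]);
    }
    None
}
```

With the prefix set `["https://www.", "https://", "http://www.", "http://"]` and terminator `b'/'`, the haystack `https://www.example.com/path` yields `example.com`. Trying the remaining prefixes after a failure mimics the backtracking a general engine would do over the alternation.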

Also see BurntSushi/rebar#28 for the benchmark, where it shows a 14x improvement.

Changes

  • regex-automata: new LiteralPrefixCapture strategy in meta/strategy.rs. It recognizes the shape via HIR walking (single pattern, anchored at start and end, default flags, ASCII terminator, finite literal-alternation prefix set capped at 32). The strategy methods extract the slots directly with memchr, bypassing PikeVM/BoundedBacktracker. It is wired in alongside the existing reverse strategies. If the pattern fails recognition, Err(core) is returned and strategy selection falls through.
  • regex: Regex::replacen gets a borrowed-output fast path for replacements that are exactly $N / ${N}, detected via a new Replacer::single_capture_ref method (default None; opted into for &str / String / Cow<str>). When limit == 1 and the match covers the whole haystack, it returns Cow::Borrowed of the captured slice: no Captures::expand, no output String allocation.
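The replacement-side fast path can be sketched in isolation as follows. This is an illustration under assumptions, not the PR's code: the free function `single_capture_ref` only borrows the name of the new trait method, `try_borrowed_replace` is hypothetical, and capture groups are modelled as a slice of optional byte ranges with group 0 first.

```rust
use std::borrow::Cow;
use std::ops::Range;

/// Returns `Some(N)` when `rep` is exactly `$N` or `${N}` (decimal N),
/// and `None` for anything else, including `$$`, `$name` and `$1x`.
fn single_capture_ref(rep: &str) -> Option<usize> {
    let body = rep.strip_prefix('$')?;
    let digits = match body.strip_prefix('{') {
        Some(inner) => inner.strip_suffix('}')?,
        None => body,
    };
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok()
}

/// Hypothetical fast path: returns a borrowed slice when the replacement is
/// a single capture reference, `limit == 1`, and group 0 spans the whole
/// haystack. `None` means "use the general replacement path instead".
fn try_borrowed_replace<'h>(
    haystack: &'h str,
    rep: &str,
    limit: usize,
    groups: &[Option<Range<usize>>],
) -> Option<Cow<'h, str>> {
    if limit != 1 {
        return None;
    }
    let n = single_capture_ref(rep)?;
    let whole = groups.first()?.clone()?;
    if whole != (0..haystack.len()) {
        // Text outside the match would have to be copied, so no borrow.
        return None;
    }
    // Assumption: a missing or non-participating group expands to "".
    let slice = match groups.get(n).cloned().flatten() {
        Some(range) => &haystack[range],
        None => "",
    };
    Some(Cow::Borrowed(slice))
}
```

Because the match covers the entire haystack, the result of the replacement is exactly the captured slice, which is why it can be returned as `Cow::Borrowed` without building a new `String`.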

Bench

500k synthetic Referer-style rows, 5-iter mean, same machine:

| Pattern, corpus | Before | After |
| --- | --- | --- |
| `^https?://(?:www\.)?([^/]+)/.*$`, 80% match | 281 ms | 39 ms (7.3x) |
| `^key=([^,]+),.*$`, 100% match | 113 ms | 27 ms (4.2x) |

Remaining overhead

Regex::captures_read with a reused CaptureLocations takes only ~1.2x the time of the prototype direct-memchr path on the same workload, so the meta-engine + Strategy dispatch is essentially free. The remaining gap to the prototype is dominated by the per-call Captures slot allocation in Regex::captures; pooling that is a separate change.

@Dandandan Dandandan force-pushed the perf/literal-prefix-capture-extract branch from 72bb5fe to a41f810 Compare May 19, 2026 07:41
For anchored patterns of the shape

    ^<literal-prefix-set>([^X]+)X.*$    with replacement `${1}` (or `$1`)

capture 1's bounds are structurally trivial — skip the prefix, find the
terminator with memchr — so the engine doesn't need to track captures
at all.

Two changes work together:

1. A new `LiteralPrefixCapture` strategy in `regex-automata`'s meta
   engine recognizes the shape via HIR walking (single-pattern only,
   anchored at both ends, default flags, ASCII terminator, finite
   literal-alternation prefix set capped at 32 variants). Strategy
   methods extract the match and capture-1 slots directly with memchr,
   bypassing PikeVM / BoundedBacktracker. Wires in alongside the
   existing reverse strategies.

2. `Regex::replacen` gets a borrowed-output fast path for replacements
   that are exactly `$N` / `${N}`. Detected via a new
   `Replacer::single_capture_ref` method (default `None`, opted into
   for `&str`/`String`/`Cow<str>`). For `limit == 1` with a match
   covering the whole haystack, returns `Cow::Borrowed` of the
   captured slice — no `Captures::expand`, no output string
   allocation.

Bench (500k synthetic Referer rows, 5-iter mean, on the same machine):

    Regex::replacen, q28 pattern, 80% match
        before:  281 ms
        after:    39 ms   (7.3x)

    Regex::replacen, ^key=([^,]+),.*$, 100% match
        before:  113 ms
        after:    27 ms   (4.2x)

Tests: 257 / 257 pass (regex-automata --lib + --test integration, regex
--test integration). No regressions.
@Dandandan Dandandan force-pushed the perf/literal-prefix-capture-extract branch from a41f810 to 54e8d02 Compare May 19, 2026 07:57