perf: literal-prefix capture-extraction fast path #1350
Draft
Dandandan wants to merge 6 commits into
For anchored patterns of the shape
^<literal-prefix-set>([^X]+)X.*$ with replacement `${1}` (or `$1`),
capture 1's bounds are structurally trivial: skip the prefix, find the
terminator with memchr. The engine therefore doesn't need to track
captures at all.
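
As a rough illustration of the "skip the prefix, find the terminator" idea, here is a minimal std-only sketch. It is not the PR's code: `capture1_bounds` is a hypothetical helper, the prefix set is passed in already expanded, and `bytes().position` stands in for `memchr`. It assumes a `&str` haystack, default flags (so `.` does not match `\n`), and an ASCII terminator.

```rust
/// Hypothetical helper (not from the PR): byte range of capture 1 for a
/// pattern of the shape ^(?:p1|p2|...)([^X]+)X.*$, or None if no match.
fn capture1_bounds(
    haystack: &str,
    prefixes: &[&str],
    terminator: u8,
) -> Option<(usize, usize)> {
    // An ASCII terminator keeps every computed offset on a char boundary.
    assert!(terminator.is_ascii());
    // Try prefixes in order, mirroring leftmost-first alternation.
    for prefix in prefixes {
        let rest = match haystack.strip_prefix(*prefix) {
            Some(rest) => rest,
            None => continue,
        };
        // Stand-in for memchr: first terminator byte after the prefix.
        let len = match rest.bytes().position(|b| b == terminator) {
            Some(len) => len,
            None => continue,
        };
        // `[^X]+` needs at least one byte before the terminator.
        if len == 0 {
            continue;
        }
        // `.*$` with default flags cannot cross a line feed.
        if rest[len + 1..].contains('\n') {
            continue;
        }
        let start = prefix.len();
        return Some((start, start + len));
    }
    None
}
```

For example, with the prefixes `http://www.`, `http://`, `https://www.`, `https://` and terminator `/`, the haystack `http://www.example.com/a/b` yields the range covering `example.com`.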
Two changes work together:
1. A new `LiteralPrefixCapture` strategy in `regex-automata`'s meta
engine recognizes the shape via HIR walking (single-pattern only,
anchored at both ends, default flags, ASCII terminator, finite
literal-alternation prefix set capped at 32 variants). Strategy
methods extract the match and capture-1 slots directly with memchr,
bypassing PikeVM / BoundedBacktracker. Wires in alongside the
existing reverse strategies.
2. `Regex::replacen` gets a borrowed-output fast path for replacements
that are exactly `$N` / `${N}`. Detected via a new
`Replacer::single_capture_ref` method (default `None`, opted into
for `&str`/`String`/`Cow<str>`). For `limit == 1` with a match
covering the whole haystack, returns `Cow::Borrowed` of the
captured slice — no `Captures::expand`, no output string
allocation.
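
The second change can be sketched in isolation. The two functions below are hypothetical stand-ins, not the PR's code: the free function `single_capture_ref` only mirrors the name of the new `Replacer` method, and `replace_first` takes already-computed match and capture ranges instead of running a regex. The sketch assumes a non-participating group expands to the empty string.

```rust
use std::borrow::Cow;
use std::ops::Range;

/// Hypothetical detector: Some(N) if the replacement is exactly `$N` or `${N}`.
fn single_capture_ref(rep: &str) -> Option<usize> {
    let body = rep.strip_prefix('$')?;
    let digits = match body.strip_prefix('{') {
        Some(inner) => inner.strip_suffix('}')?,
        None => body,
    };
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok()
}

/// Hypothetical limit == 1 replacement, given the match span and the span
/// of the referenced group (None if the group did not participate).
fn replace_first<'h>(
    haystack: &'h str,
    whole: Range<usize>,
    group: Option<Range<usize>>,
) -> Cow<'h, str> {
    let captured = group.map_or("", |r| &haystack[r]);
    // Match covers the whole haystack: the output is just the captured
    // slice, so it can be borrowed with no allocation.
    if whole.start == 0 && whole.end == haystack.len() {
        return Cow::Borrowed(captured);
    }
    // Otherwise splice the captured text between the unmatched parts.
    let mut out = String::with_capacity(haystack.len());
    out.push_str(&haystack[..whole.start]);
    out.push_str(captured);
    out.push_str(&haystack[whole.end..]);
    Cow::Owned(out)
}
```

The borrowed branch is the case the fast path targets: a `^...$` pattern that matches always spans the full haystack, so every successful replacement returns a slice of the input.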
Bench (500k synthetic Referer rows, 5-iter mean, on the same machine):
Regex::replacen, q28 pattern, 80% match
before: 281 ms
after: 39 ms (7.3x)
Regex::replacen, ^key=([^,]+),.*$, 100% match
before: 113 ms
after: 27 ms (4.2x)
Tests: 257 / 257 pass (regex-automata --lib + --test integration, regex
--test integration). No regressions.
For anchored regexes of the shape `^<literal-prefix-set>([^X]+)X.*$`, capture 1's bounds are structurally trivial: skip the prefix, find the terminator with memchr. This PR makes both the meta engine and `Regex::replacen` exploit that.

Also see BurntSushi/rebar#28 for the benchmark; here it shows a 14x improvement.
Changes
- `regex-automata`: new `LiteralPrefixCapture` strategy in `meta/strategy.rs`. Recognizes the shape via HIR walking (single pattern, anchored start+end, default flags, ASCII terminator, finite literal-alternation prefix set capped at 32). Strategy methods extract the slots directly with memchr, bypassing PikeVM/BoundedBacktracker. Wires in alongside the existing reverse strategies. If a pattern fails recognition, `Err(core)` falls through.
- `regex`: `Regex::replacen` gets a borrowed-output fast path for replacements that are exactly `$N` / `${N}`, detected via a new `Replacer::single_capture_ref` method (default `None`; opted into for `&str`/`String`/`Cow<str>`). When `limit == 1` and the match covers the whole haystack, it returns `Cow::Borrowed` of the captured slice: no `Captures::expand`, no output `String` allocation.

Bench
500k synthetic Referer-style rows, 5-iter mean, same machine:
- `^https?://(?:www\.)?([^/]+)/.*$`, 80% match
- `^key=([^,]+),.*$`, 100% match

Remaining overhead
`Regex::captures_read` with a reused `CaptureLocations` is only ~1.2x the prototype direct-memchr path on the same workload, so the meta-engine + Strategy dispatch is essentially free. The remaining gap to the prototype is dominated by the per-call `Captures` slot allocation in `Regex::captures`; pooling that is a separate change.