meta: fix non-leftmost match in reverse-suffix/inner optimizations #1346
Open
stefanobaghino wants to merge 1 commit into
The reverse-suffix and reverse-inner meta strategies returned on the first successful reverse search, which assumed that the leftmost suffix (resp. inner literal) occurrence corresponds to the leftmost-first match. That assumption fails when the regex prefix is non-monotonic — e.g. an optional group that can absorb the suffix character between consecutive occurrences — so a strictly later literal occurrence may have a strictly earlier overall match start. Both loops now track the best (smallest-start) candidate and bound each subsequent reverse search by `best.offset()`. The existing limited reverse DFA returns `Quadratic` the moment scanning past that bound is needed, which propagates to a `Core` fallback. An early-exit triggers as soon as the candidate already starts at `input.start()`. Adds a regression test covering the minimal repro from rust-lang#1345 (`[^()]*(?:\([^()]*\))?[^()]*:` on `$(:):`). Closes rust-lang#1345.
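The non-monotonic behaviour can be checked by hand for the repro. The sketch below is a hand-derived reverse automaton for this one pattern only; it is not the crate's DFA, and `State`, `step`, and `reverse_start` are hypothetical names introduced for illustration. It scans right to left from the end of a `:` occurrence and reports the smallest start offset at which the pattern matches.

```rust
/// States of a hand-written reverse automaton for
/// `[^()]*(?:\([^()]*\))?[^()]*:` (illustrative only, not regex-automata code).
#[derive(Clone, Copy, PartialEq)]
enum State {
    Start, // nothing consumed yet; the first byte read must be the final `:`
    Tail,  // consumed `:` and a run of non-paren bytes; a match could start here
    Group, // consumed `)`; inside the optional group body, waiting for `(`
    Head,  // consumed the group's `(`; a match could start here
    Dead,  // no match can be extended further to the left
}

/// One transition, reading `byte` while moving right to left.
fn step(state: State, byte: u8) -> State {
    let paren = byte == b'(' || byte == b')';
    match (state, byte) {
        (State::Start, b':') => State::Tail,
        (State::Tail, b')') => State::Group,
        (State::Tail, _) if !paren => State::Tail,
        (State::Group, b'(') => State::Head,
        (State::Group, _) if !paren => State::Group,
        (State::Head, _) if !paren => State::Head,
        _ => State::Dead,
    }
}

/// Smallest `start` such that `hay[start..end]` matches the pattern,
/// or `None` if no match ends at `end`. Assumes `end <= hay.len()`.
fn reverse_start(hay: &[u8], end: usize) -> Option<usize> {
    let mut state = State::Start;
    let mut best = None;
    for at in (0..end).rev() {
        state = step(state, hay[at]);
        match state {
            State::Dead => break,
            // Matching states: remember the leftmost one seen so far.
            State::Tail | State::Head => best = Some(at),
            _ => {}
        }
    }
    best
}
```

On `$(:):`, `reverse_start(b"$(:):", 3)` (the first `:`) gives `Some(2)`, while `reverse_start(b"$(:):", 5)` (the second `:`) gives `Some(0)`. The later literal occurrence therefore has the earlier match start, which is the case the first-hit return missed.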
This was referenced May 11, 2026
Closes #1345.

`ReverseSuffix::try_search_half_start` and `ReverseInner::try_search_full` returned on the first successful reverse search, which assumed that the leftmost occurrence of the suffix (resp. inner) literal corresponds to the end of the leftmost-first match. That assumption breaks when the regex prefix is non-monotonic (e.g. an optional group whose body can absorb the literal between consecutive occurrences), so a strictly later literal occurrence may have a strictly earlier overall match start. The minimal repro filed in #1345 is `[^()]*(?:\([^()]*\))?[^()]*:` on `$(:):`: `meta` was returning `2..3`, while the leftmost-first match is `0..5` (the optional group absorbs `(:)` and pins to the second `:`).

Both loops now track the best (smallest-start) candidate and bound each subsequent reverse search by `best.offset()`. The existing limited reverse DFA in `regex-automata/src/meta/limited.rs` returns `RetryError::Quadratic` the moment scanning past the bound is required, which propagates through `?` and triggers the existing `Core` fallback in `Strategy::search`. The loops also early-exit when the candidate already starts at `input.start()`, so the common (monotonic-prefix) cases stay one-shot.

The bug was reduced from real-world `.sublime-syntax` grammars driven through fancy-regex's seek-prefilter approximation; see fancy-regex/fancy-regex#249 for the downstream context.

`testdata/regression.toml` gets the minimal repro as `non-monotonic-reverse-suffix`; it failed against `meta` and passed against `pikevm`/`hybrid` before this change.
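As a rough model of the loop shape described in this PR (not the actual diff), the sketch below tracks the smallest start seen so far, passes it as the bound for later reverse searches, and lets a `Quadratic` error propagate through `?`. Everything here is an assumption made for illustration: `Quadratic` stands in for `RetryError::Quadratic`, `leftmost_start` and `repro` are hypothetical names, the reverse search is a caller-supplied closure, and any bound the real loop applies before a candidate exists is left out. The scan figures in `repro` were traced by hand for `$(:):`.

```rust
/// Stand-in for `RetryError::Quadratic` (hypothetical; not the crate's type).
#[derive(Debug, PartialEq)]
struct Quadratic;

/// Model of the candidate-tracking loop.
///
/// * `literal_ends`: end offset of each literal occurrence, left to right.
/// * `rev(end, min_start)`: bounded reverse search anchored at `end`. It
///   returns the match start, `Ok(None)` when nothing matches, or
///   `Err(Quadratic)` when it would have to read before `min_start`.
fn leftmost_start<F>(
    input_start: usize,
    literal_ends: &[usize],
    mut rev: F,
) -> Result<Option<usize>, Quadratic>
where
    F: FnMut(usize, usize) -> Result<Option<usize>, Quadratic>,
{
    let mut best: Option<usize> = None;
    let mut min_start = input_start;
    for &end in literal_ends {
        // `?` hands a Quadratic error to the caller, which is expected to
        // retry with a slower but always-correct engine.
        if let Some(start) = rev(end, min_start)? {
            if best.map_or(true, |b| start < b) {
                best = Some(start);
            }
        }
        if let Some(b) = best {
            if b == input_start {
                // Nothing can start earlier than the input itself.
                return Ok(best);
            }
            // Later reverse searches may not read past the best start.
            min_start = b;
        }
    }
    Ok(best)
}

/// The `$(:):` repro with a table-driven reverse search. Each entry is
/// (literal end, unbounded match start, lowest offset the scan reads).
fn repro() -> Result<Option<usize>, Quadratic> {
    let scans = [(3usize, Some(2usize), 1usize), (5, Some(0), 0)];
    leftmost_start(0, &[3, 5], |end, min_start| {
        let &(_, start, lowest) = scans.iter().find(|s| s.0 == end).unwrap();
        if lowest < min_start { Err(Quadratic) } else { Ok(start) }
    })
}
```

In this model, `repro()` returns `Err(Quadratic)`: the first `:` yields the candidate `2`, and the reverse scan from the second `:` would need to read before offset 2, so the search is handed to the fallback instead of returning `2..3`. When the first candidate already starts at `input_start`, the function returns after a single reverse search.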