Skip to content

fix: dispatch LLMLingua-2 models by loosened name markers (#168)#248

Open
ousamabenyounes wants to merge 1 commit into
microsoft:mainfrom
ousamabenyounes:fix/issue-168
Open

fix: dispatch LLMLingua-2 models by loosened name markers (#168)#248
ousamabenyounes wants to merge 1 commit into
microsoft:mainfrom
ousamabenyounes:fix/issue-168

Conversation

@ousamabenyounes
Copy link
Copy Markdown

@ousamabenyounes ousamabenyounes commented Apr 11, 2026

What does this PR do?

Fixes #168

`is_begin_of_new_word` and `get_pure_token` in `llmlingua/utils.py` dispatch on literal substrings — `"bert-base-multilingual-cased"` for the BERT family and `"xlm-roberta-large"` for the XLM-RoBERTa family. Users who download LLMLingua-2 weights and load them from a renamed local directory (for example the reporter's `/home/webservice/llm/compressFromNet/llmlingua-2-xlm`) fall straight through to `raise NotImplementedError` deep inside `compress_prompt_llmlingua2`, with no hint of what went wrong.

Fix

Factor the dispatch into `_model_family(model_name)` and broaden the marker lists:

Family Old markers New markers
BERT `bert-base-multilingual-cased`, `tinybert`, `mobilebert` + `bert-base-multilingual` (no `-cased` suffix), `llmlingua-2-bert`
XLM-RoBERTa `xlm-roberta-large`, `slingua`, `securitylingua` + `xlm-roberta` (no `-large` suffix), `llmlingua-2-xlm`

`None`/empty `model_name` now returns `None` cleanly instead of crashing on the old `"..." in model_name` check, and the fallback `NotImplementedError` now lists the supported markers so diagnosing a bad model_name no longer requires reading the source.

The canonical model names, `tinybert`, `mobilebert`, `slingua` and `securitylingua` all keep working unchanged.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Was this discussed/approved via a Github issue? Please add a link to it if that's the case — [Bug]: NotImplementedError #168.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests? — `tests/test_issue_168.py` has 17 new tests (runs in ~5s, no model download): the reporter's exact local path, canonical XLM-RoBERTa / BERT names, bare `xlm-roberta-base`, `tinybert`, `mobilebert`, `slingua`, unknown models, empty/`None` input, and end-to-end `is_begin_of_new_word` / `get_pure_token` round-trips for both local-path families.

Verification

  • Baseline: 2 tests pass, 3 tests fail with a pre-existing `ValueError: too many values to unpack (expected 2)` in `iterative_compress_prompt` (transformers DynamicCache API change — addressed separately in fix: normalize past_key_values across transformers DynamicCache API (#210) #246).
  • Post-fix: 2 baseline tests still pass + 17 new tests pass (19 total), same 3 pre-existing failures — zero new regressions.

Generated by Ora Studio
Vibe coded by ousamabenyounes


Vibe Coded by Ousama Ben Younes
Developed With Ora Studio (Claude Code)

…icrosoft#235)

When use_slingua=True, __init__ already called init_llmlingua2 correctly,
but compress_prompt only checked self.use_llmlingua2 and fell through to
the LLMLingua-1 causal-LM path, which passes past_key_values to
XLMRobertaForTokenClassification and crashes.

Generated by Claude Code
Vibe coded by ousamabenyounes

Co-Authored-By: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: NotImplementedError

1 participant