
feat(vendor): add LLM2Vec embedding model #637

Open

davidberenstein1957 wants to merge 2 commits into main from feat/vlm-pr-1-vendor

Conversation


davidberenstein1957 (Member) commented Apr 25, 2026

Summary

Vendors OneIG's LLM2Vec implementation used by downstream OneIG-based VLM metrics.

This PR includes two minimal vendor compatibility fixes:

  • honor constructor-provided doc_max_length
  • honor convert_to_numpy=True in encode()

Stack Position: 1 of 10 (see Review Flow below)

Files

  • src/pruna/evaluation/metrics/vendor/NOTICE.oneig_llm2vec
  • src/pruna/evaluation/metrics/vendor/oneig_llm2vec/llm2vec.py
  • src/pruna/evaluation/metrics/vendor/oneig_llm2vec/modeling_llama_encoder.py
  • src/pruna/evaluation/metrics/vendor/oneig_llm2vec/models/bidirectional_llama.py

Test Plan

python -c "from pruna.evaluation.metrics.vendor.oneig_llm2vec.llm2vec import LLM2Vec; print('OK')"
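
For a slightly deeper smoke test, a hedged sketch (not part of the PR's test plan): the checkpoint name is illustrative, and the call pattern follows the upstream LLM2Vec API (from_pretrained / encode) referenced in the review comments below.

# Hedged sketch: checkpoint name and call pattern are assumptions, not part of this PR.
from pruna.evaluation.metrics.vendor.oneig_llm2vec.llm2vec import LLM2Vec

l2v = LLM2Vec.from_pretrained("McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp")
vecs = l2v.encode(["a photo of a cat"], convert_to_numpy=True)
# After the convert_to_numpy fix, this should print <class 'numpy.ndarray'>.
print(type(vecs))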

Review Focus

  • License and attribution completeness
  • Correctness of the two vendor compatibility fixes
  • No coupling to Pruna internals

Review Flow (Order)

Review the stack in this exact order:

  1. feat(vendor): add LLM2Vec embedding model #637 vendor
  2. feat(infrastructure): add VLM base classes and utilities #638 infrastructure
  3. feat(text-metrics): split qa_accuracy #645 qa_accuracy
  4. feat(text-metrics): split oneig_alignment #646 oneig_alignment
  5. feat(text-metrics): split text_score pair #647 text_score pair
  6. feat(text-metrics): split oneig_reasoning #648 oneig_reasoning
  7. feat(vision-metrics): split vqa #649 vqa
  8. feat(vision-metrics): split vie_score #650 vie_score
  9. feat(vision-metrics): split img_edit_score #651 img_edit_score
  10. feat(e2e-tests): stacked e2e after split metrics #641 e2e tests

This PR in the flow (1/10)


cursor (Bot) left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit c933c4d.

self.pooling_mode = pooling_mode
self.skip_instruction = skip_instruction
self.max_length = max_length
self.doc_max_length = 512

Constructor ignores doc_max_length parameter, hardcodes 512

High Severity

self.doc_max_length is hardcoded to 512 instead of using the doc_max_length parameter. Any custom value passed to __init__ (including values loaded from llm2vec_config.json via from_pretrained) is silently discarded. This affects _convert_to_str, which uses self.doc_max_length for document length truncation, and save, which persists the (always-512) value back to config.
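
A minimal sketch of the intended fix, assuming a constructor whose remaining parameters mirror the snippet above; the parameter list and defaults shown here are illustrative, not the exact vendored signature.

# Illustrative only: parameter names and defaults are assumptions; the point is that
# doc_max_length must be stored from the argument, not hardcoded to 512.
class LLM2VecSketch:
    def __init__(self, model, tokenizer, pooling_mode="mean", skip_instruction=True,
                 max_length=512, doc_max_length=400):
        self.model = model
        self.tokenizer = tokenizer
        self.pooling_mode = pooling_mode
        self.skip_instruction = skip_instruction
        self.max_length = max_length
        # Honor the caller-provided value so settings from llm2vec_config.json
        # (loaded via from_pretrained) propagate to _convert_to_str and save().
        self.doc_max_length = doc_max_length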


all_embeddings = torch.cat(all_embeddings, dim=0)
all_embeddings = all_embeddings[np.argsort(length_sorted_idx)]
all_embeddings = all_embeddings.to(torch.float32)
return all_embeddings

convert_to_numpy parameter is accepted but never applied

Medium Severity

The convert_to_numpy parameter in encode() is documented as returning a NumPy array when True, but the return path always calls torch.cat, indexes with np.argsort, and calls .to(torch.float32) — unconditionally returning a torch.Tensor. The _encode helper also accepts convert_to_numpy but never uses it. A caller passing convert_to_tensor=False, convert_to_numpy=True will silently receive a tensor instead of the expected numpy.ndarray.
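
One way the return path could honor the flag, sketched below; the helper name is hypothetical and the tensor handling mirrors the snippet above.

import numpy as np
import torch

# Hypothetical helper illustrating the fix: restore the original order, cast to
# float32, and only then decide between torch.Tensor and numpy.ndarray output.
def finalize_embeddings(all_embeddings, length_sorted_idx, convert_to_numpy=False):
    out = torch.cat(all_embeddings, dim=0)
    out = out[np.argsort(length_sorted_idx)]
    out = out.to(torch.float32)
    if convert_to_numpy:
        return out.cpu().numpy()
    return out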


davidberenstein1957 and others added 2 commits May 8, 2026 11:01
- Add LLM2Vec from OneIG vendor source
- Includes Llama encoder and bidirectional models
- Self-contained, no dependencies on Pruna internals
- Licensed under Apache 2.0

Patch two upstream llm2vec behavior bugs found in review so downstream VLM metrics use the caller-provided doc_max_length and can return numpy when requested. Document Pruna's vendor deviations in NOTICE for traceability.

Co-authored-by: Cursor <cursoragent@cursor.com>
