feat(vision-metrics): split vie_score #650

Open
davidberenstein1957 wants to merge 1 commit into feat/vlm-pr-4a-vqa from feat/vlm-pr-4b-vie-score

davidberenstein1957 commented Apr 28, 2026

Summary

Splits vie_score into its own stacked PR, adds VieScoreMetric, and wires up the GEditBench benchmark entry with focused VIE test coverage.

Stack Position

Files

  • src/pruna/evaluation/metrics/metric_vie_score.py
  • src/pruna/evaluation/benchmarks.py
  • tests/evaluation/test_vision_metrics.py

Test Plan

uv run pytest tests/evaluation/test_vision_metrics.py -k vie_score

Review Focus

  • VIE sub-score parsing/aggregation
  • GEditBench wiring
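
For reviewers unfamiliar with VIE-style scoring, here is a minimal sketch of what sub-score parsing and aggregation can look like. It is not the code in metric_vie_score.py. It assumes the judge model replies with a JSON object holding a "score" list on a 0-10 scale, and that the overall score follows the VIEScore paper's convention of taking the geometric mean of the minimum semantic-consistency and minimum perceptual-quality sub-scores. The function names `parse_sub_scores` and `aggregate_vie` are hypothetical.

```python
import json
import math
import re


def parse_sub_scores(response: str) -> list[float]:
    """Extract the "score" list from a JSON object embedded in a judge reply.

    Assumption: the reply contains one JSON object with a "score" key whose
    value is a number or a list of numbers on a 0-10 scale.
    """
    # Grab everything from the first "{" to the last "}" so that any chatter
    # around the JSON object is ignored.
    match = re.search(r"\{.*\}", response, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    payload = json.loads(match.group(0))
    scores = payload["score"]
    if not isinstance(scores, list):
        scores = [scores]
    # Clamp each value to [0, 10] to guard against out-of-range replies.
    return [min(max(float(s), 0.0), 10.0) for s in scores]


def aggregate_vie(sc_scores: list[float], pq_scores: list[float]) -> float:
    """Combine sub-scores as sqrt(min(SC) * min(PQ)).

    Assumption: this mirrors the aggregation described in the VIEScore paper;
    the metric in this PR may normalise or weight the values differently.
    """
    if not sc_scores or not pq_scores:
        raise ValueError("both sub-score lists must be non-empty")
    return math.sqrt(min(sc_scores) * min(pq_scores))
```

Taking the minimum within each group means one failed criterion (for example, an edit that ignores the instruction) pulls the overall score down even when the other sub-scores are high, which is the behaviour to check when reviewing the aggregation path.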

Review Flow (Order)

Review the stack in this exact order:

  1. feat(vendor): add LLM2Vec embedding model #637 vendor
  2. feat(infrastructure): add VLM base classes and utilities #638 infrastructure
  3. feat(text-metrics): split qa_accuracy #645 qa_accuracy
  4. feat(text-metrics): split oneig_alignment #646 oneig_alignment
  5. feat(text-metrics): split text_score pair #647 text_score pair
  6. feat(text-metrics): split oneig_reasoning #648 oneig_reasoning
  7. feat(vision-metrics): split vqa #649 vqa
  8. feat(vision-metrics): split vie_score #650 vie_score
  9. feat(vision-metrics): split img_edit_score #651 img_edit_score
  10. feat(e2e-tests): stacked e2e after split metrics #641 e2e tests

This PR in the flow (8/10)

Adds VieScoreMetric with GEditBench benchmark wiring and focused unit coverage while keeping image-edit scoring changes for the next stacked PR.

Made-with: Cursor