feat(vision-metrics): split vie_score by davidberenstein1957 · Pull Request #650 · PrunaAI/pruna

davidberenstein1957 · 2026-04-28T13:04:10Z

Summary

Splits vie_score into its own stacked PR, adds VieScoreMetric, and wires up the GEditBench benchmark entry with focused VIE unit-test coverage.

Stack Position

Base: PR feat(vision-metrics): split vqa #649 (feat/vlm-pr-4a-vqa)
Next: PR feat(vision-metrics): split img_edit_score #651 (feat/vlm-pr-4c-img-edit-score)
Final integration: PR feat(e2e-tests): stacked e2e after split metrics #641 (feat/vlm-pr-5-e2e-tests)
Canonical umbrella reference: PR feat(evaluation): add VLMMetrics #545 (feat/metrics-vlm-support)

Files

src/pruna/evaluation/metrics/metric_vie_score.py
src/pruna/evaluation/benchmarks.py
tests/evaluation/test_vision_metrics.py

Test Plan

uv run pytest tests/evaluation/test_vision_metrics.py -k vie_score

Review Focus

VIE sub-score parsing/aggregation
GEditBench wiring
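
For orientation while reviewing the first item, here is a minimal sketch of what sub-score parsing and aggregation can look like. It is not the code in metric_vie_score.py: the function names, the JSON reply format with a "score" list, and the 0-10 scale are all assumptions made for illustration. The aggregation rule used is the one commonly described for VIEScore (take the minimum within each sub-score group, then the geometric mean of the two groups).

```python
import json
import math
import re


def parse_sub_scores(response: str) -> list[float]:
    """Extract the "score" list from a judge reply (hypothetical reply format)."""
    # Assumption: the judge embeds one JSON object somewhere in its reply.
    match = re.search(r"\{.*\}", response, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in judge response")
    payload = json.loads(match.group(0))
    scores = payload["score"]
    if not isinstance(scores, list):
        scores = [scores]  # tolerate a single scalar score
    return [float(s) for s in scores]


def aggregate_vie(sc_scores: list[float], pq_scores: list[float],
                  max_score: float = 10.0) -> float:
    """Combine semantic-consistency and perceptual-quality sub-scores.

    Each group is reduced with min(), the two results are combined with a
    geometric mean, and the value is normalized to [0, 1] (assumed 0-10 scale).
    """
    sc = min(sc_scores)
    pq = min(pq_scores)
    return math.sqrt(sc * pq) / max_score
```

Because each group is reduced with min(), one failing sub-score pulls the whole metric down, and a zero in either group yields an overall score of zero. That is the behavior worth checking against the actual tests.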

Review Flow (Order)

Review the stack in this exact order:

1. feat(vendor): add LLM2Vec embedding model #637 (vendor)
2. feat(infrastructure): add VLM base classes and utilities #638 (infrastructure)
3. feat(text-metrics): split qa_accuracy #645 (qa_accuracy)
4. feat(text-metrics): split oneig_alignment #646 (oneig_alignment)
5. feat(text-metrics): split text_score pair #647 (text_score pair)
6. feat(text-metrics): split oneig_reasoning #648 (oneig_reasoning)
7. feat(vision-metrics): split vqa #649 (vqa)
8. feat(vision-metrics): split vie_score #650 (vie_score)
9. feat(vision-metrics): split img_edit_score #651 (img_edit_score)
10. feat(e2e-tests): stacked e2e after split metrics #641 (e2e tests)

This PR in the flow (8/10)

Review after PR feat(vision-metrics): split vqa #649.
Next PR to review: feat(vision-metrics): split img_edit_score #651.
Confirm this PR's tests and scope before continuing.

Adds VieScoreMetric with GEditBench benchmark wiring and focused unit coverage while keeping image-edit scoring changes for the next stacked PR. Made-with: Cursor

This was referenced Apr 28, 2026

feat(text-metrics): add text-based VLM judge metrics #639 (Closed)
feat(vision-metrics): add vision-based VLM judge metrics #640 (Closed)

feat(vision-metrics): split vie_score into dedicated branch

693f888

davidberenstein1957 force-pushed the feat/vlm-pr-4a-vqa branch from 3d76a02 to 92109f0 on May 8, 2026 09:01

davidberenstein1957 force-pushed the feat/vlm-pr-4b-vie-score branch from 4c9b3d3 to 693f888 on May 8, 2026 09:01

davidberenstein1957 wants to merge 1 commit into feat/vlm-pr-4a-vqa from feat/vlm-pr-4b-vie-score
