
Releases: CSHVienna/LLMScholarBench

v2.0.0-rc.1

17 Feb 11:27
ae22342

Pre-release

Submission artifact for KDD 2026 (Datasets and Benchmarks Track). This release corresponds to the code and plotting scripts used in the submitted manuscript and may change after rebuttal and acceptance.

v1.0.0

12 Feb 12:01


This release provides the full codebase for systematically querying large language models and evaluating their outputs in the context of scholar recommendation. It includes the LLMCaller module for structured, reproducible LLM querying and the Auditor module for systematic evaluation of model responses. The evaluation measures consistency and factual accuracy, and includes a descriptive analysis of biases with respect to gender, ethnicity, and popularity.
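As a loose illustration of the consistency measurement mentioned above (the function names and metric choice here are assumptions for the sketch, not the repository's actual API), repeated runs of the same recommendation prompt could be compared via their mean pairwise Jaccard overlap:

```python
# Illustrative sketch only -- NOT the LLMScholarBench API. One plausible way to
# quantify response consistency: query the model several times with the same
# prompt and compute the mean pairwise Jaccard overlap of the returned names.
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (defined as 1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def consistency(runs: list[list[str]]) -> float:
    """Mean pairwise Jaccard overlap across repeated recommendation lists."""
    sets = [set(name.lower() for name in run) for run in runs]
    pairs = list(combinations(sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical outputs from three runs of the same prompt.
runs = [
    ["Ada Lovelace", "Alan Turing", "Grace Hopper"],
    ["Alan Turing", "Grace Hopper", "John von Neumann"],
    ["Ada Lovelace", "Alan Turing", "Grace Hopper"],
]
print(round(consistency(runs), 3))  # -> 0.667
```

A score of 1.0 would mean every run returned exactly the same scholars; values near 0 indicate the model recommends different names on each call.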

This release corresponds to the paper "Whose Name Comes Up? Auditing LLM-Based Scholar Recommendations" (arXiv:2506.00074) and reflects the methods and experiments reported there.