Multi-turn tool call chat completions conversations#712
Conversation
sjmonson
left a comment
There was a problem hiding this comment.
Not finished reviewing; will add more comments in a bit. Github is acting up and not letting me add to this review for some reason.
|
This pull request has merge conflicts that must be resolved before it can be |
dbutenhof
left a comment
There was a problem hiding this comment.
Finally got through this. Looks good. Needs a rebase.
|
@jaredoconnell, this project requires a linear history on feature branches. You can do this by running: |
|
augment review |
🤖 Augment PR SummarySummary: This PR adds support for pre-planned multi-turn tool-calling conversations when benchmarking OpenAI-compatible Changes:
🤖 Was this summary useful? React with 👍 or 👎 |
|
This pull request has merge conflicts that must be resolved before it can be |
|
@Mergifyio rebase |
❌ Base branch update has failedDetailsGit reported the following error: |
|
I will rebase locally. I enabled |
dbutenhof
left a comment
There was a problem hiding this comment.
The code looks good; I ran it against my server with Qwen/Qwen3-0.6B and the GuideLLM sample command from tool_calling.md, and it seems to run OK:
ℹ Tool Call Metrics Statistics (Completed Requests)
|===========|======|=======|=======|=======|======|=======|======|======|
| Benchmark | Output Tokens |||| Output Count ||||
| Strategy | Per Request || Per Second || Per Request || Per Second ||
| | Mdn | p95 | Mdn | Mean | Mdn | p95 | Mdn | Mean |
|-----------|------|-------|-------|-------|------|-------|------|------|
| constant | 77.0 | 692.0 | 152.7 | 309.4 | 1.0 | 19.0 | 1.5 | 4.8 |
|===========|======|=======|=======|=======|======|=======|======|======|Assisted-by: Cursor AI Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Assisted-by: Cursor AI Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Adds variable size responses for synthetic data, and better handles edge cases for external datasets. Also improves documentation. Assisted-by: Cursor AI Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Assisted-by: Cursor AI Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Assisted-by: Cursor AI Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Extracts functionality to new static methods. Assisted-by: Claude Code Sonnet 4.5 Signed-off-by: Jared O'Connell <joconnel@redhat.com>
These are the diffs recommended in the comments. They are untested, and require some follow-up changes. Co-authored-by: Samuel Monson <smonson@irbash.net> Signed-off-by: Jared O'Connell <46976761+jaredoconnell@users.noreply.github.com>
Moves documentation. Switches fully to exceptions to stop conversations early. Also includes info gained from vLLM contributor. Assisted-by: Cursor AI Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Assisted-by: Cursor AI Claude 4.6 Opus High Thinking Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Assisted-by: Cursor AI Claude 4.6 Opus High Thinking Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Assisted-by: Cursor AI Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Removed all worker logic except changed conversations to end if any exception occurs. Assisted-by: Cursor AI Claude 4.6 Opus High Signed-off-by: Jared O'Connell <joconnel@redhat.com>
Assisted-by: Cursor AI Signed-off-by: Jared O'Connell <joconnel@redhat.com>
5fb71c1 to
fbc667c
Compare
Summary
This PR adds client-side chat completions conversations to the http backend.
Details
autoorrequiredto the model. Good for testing various scenarios. Models behave differently depending on the value set.requiredis best for predictability.Test Plan
Related Issues
Use of AI
## WRITTEN BY AI ##)