fix: Qwen3.5 tool-call chat-template tokenization #634
Conversation
PeterWofford left a comment
Nice fix — clean scope, good test, and the template detection ("tool_call.arguments|items" check) is a smart way to avoid unnecessary work for non-Qwen models.
A few things I noticed while testing this against our IVR nav training pipeline (Qwen3.5-35B-A3B):
1. `assert isinstance(arguments_json, str)` is too strict (line 58)
If arguments are already a dict (e.g. from an upstream normalization path), this crashes instead of being a no-op. We hit this when combining your PR with another fix that normalized earlier in the pipeline.
Suggest replacing the assert with a type check:

```python
arguments = function["arguments"]
if isinstance(arguments, str):
    arguments = json.loads(arguments)
elif not isinstance(arguments, dict):
    raise TypeError(f"Expected str or dict for tool_call arguments, got {type(arguments)}")
```

2. No `json.loads` error handling
If a tool_call has malformed JSON in `arguments`, this will raise `JSONDecodeError` with no context. Low probability, but worth a defensive try/except for a library:

```python
try:
    arguments = json.loads(arguments_json)
except (json.JSONDecodeError, ValueError):
    arguments = {}
```

3. `chat_template` can be a dict
Some multi-template tokenizers return `dict[str, str]` from `tokenizer.chat_template`. The `assert isinstance(chat_template, str)` on line 35 would crash for those. Could guard with:

```python
if not isinstance(chat_template, str):
    return messages
```

4. Test coverage gap — context messages
The test covers a `Choice` object (generated turn) but not a plain dict message with string `tool_call` arguments. Training data context messages often have this shape too and flow through the same path. Worth adding a test case with a context dict that has string arguments alongside the `Choice`.
Tested this PR end-to-end on our Qwen3.5-35B-A3B GRPO+RULER training run — with the assert relaxed, it works correctly. Training is running clean.
Choosing not to be too defensive unless these scenarios would actually happen.
This PR fixes Qwen3.5 training-time tokenization for assistant tool calls.
What changed
- Training preprocessing now parses assistant `Choice` tool calls whose `function.arguments` are still OpenAI-style JSON strings into dicts before `tokenizer.apply_chat_template()`, and only when the loaded chat template uses `tool_call.arguments|items`.

Why
Qwen3.5's Hugging Face chat template iterates over tool-call arguments with `tool_call.arguments|items`, which requires a mapping. ART was passing the OpenAI response shape through training preprocessing unchanged, where `Choice.message.tool_calls[*].function.arguments` is a JSON string. Once a sampled trajectory with assistant tool calls reached tokenization, Jinja raised `TypeError: Can only get item pairs from a mapping.` ART already handled this shape on the rendering side; the training preprocessing path was the missing piece.
Impact
Qwen3.5 training runs that include assistant tool calls no longer fail during `apply_chat_template()`.
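Putting the PR's behavior and the reviewer's suggestions together, the preprocessing step might look roughly like this — a sketch with a hypothetical helper name, not ART's actual implementation:

```python
import json
from typing import Any


def normalize_tool_call_arguments(
    messages: list[dict[str, Any]], chat_template: Any
) -> list[dict[str, Any]]:
    """Parse JSON-string tool-call arguments into dicts (hypothetical helper)."""
    # Skip templates that never iterate argument pairs, and skip multi-template
    # tokenizers where chat_template is a dict (reviewer point 3).
    if not isinstance(chat_template, str):
        return messages
    if "tool_call.arguments|items" not in chat_template:
        return messages
    for message in messages:
        for tool_call in message.get("tool_calls") or []:
            function = tool_call.get("function") or {}
            arguments = function.get("arguments")
            if isinstance(arguments, str):
                try:
                    function["arguments"] = json.loads(arguments)
                except (json.JSONDecodeError, ValueError):
                    # Defensive fallback for malformed JSON (reviewer point 2).
                    function["arguments"] = {}
            # Already-parsed dict arguments pass through unchanged (reviewer point 1).
    return messages


# Example: a message with OpenAI-style string arguments is normalized in place.
messages = [{
    "role": "assistant",
    "tool_calls": [{"function": {"name": "lookup", "arguments": '{"id": 7}'}}],
}]
normalize_tool_call_arguments(messages, "{% for k, v in tool_call.arguments|items %}{% endfor %}")
assert messages[0]["tool_calls"][0]["function"]["arguments"] == {"id": 7}
```

The early returns keep the step a no-op for non-Qwen models, matching the template-detection approach the reviewer praised.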