Skip to content

fix: keep all CallToolResult content items#6149

Open
tsubasakong wants to merge 1 commit intoAstrBotDevs:masterfrom
tsubasakong:fix/6140-tool-result-content
Open

fix: keep all CallToolResult content items#6149
tsubasakong wants to merge 1 commit intoAstrBotDevs:masterfrom
tsubasakong:fix/6140-tool-result-content

Conversation

@tsubasakong
Copy link

@tsubasakong tsubasakong commented Mar 12, 2026

Closes #6140

Modifications / 改动点

  • iterate through every CallToolResult.content item instead of only reading content[0]

  • keep caching image results for multimodal review while appending any accompanying text/resource content into the same tool result passed back to the LLM

  • add a regression test covering mixed image + text tool results

  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

  • python3.11 -m pytest tests/test_tool_loop_agent_runner.py -q
  • ruff format astrbot/core/agent/runners/tool_loop_agent_runner.py tests/test_tool_loop_agent_runner.py
  • ruff check astrbot/core/agent/runners/tool_loop_agent_runner.py tests/test_tool_loop_agent_runner.py
  • git diff --check

Checklist / 检查清单

  • 😊 No new feature is introduced in this PR, so no prior feature discussion was needed. / 此 PR 未引入新功能,因此无需额外的功能预讨论。
  • 👀 My changes have been well-tested, and the verification steps / test results are provided above. / 我的更改已经过测试,上方已提供验证步骤和结果。
  • 🤓 I did not introduce any new dependencies. / 我没有引入新的依赖。
  • 😮 My changes do not introduce malicious code. / 我的更改没有引入恶意代码。

Summary by Sourcery

确保工具循环代理在处理 CallToolResult 时,保留并呈现其中的所有内容项,包括图文混合输出。

Bug Fixes:

  • 处理包含多个内容项的 CallToolResult 响应,确保所有文本和图片条目都会传递给 LLM,而不是只使用第一个。
  • 当某个工具返回空的内容列表时,提供回退消息,并以更安全的方式处理不受支持的资源类型。

Tests:

  • 添加回归测试,验证图文混合的 CallToolResult 内容会完整包含在最终的工具消息中,并且图片会以正确的元数据进行缓存。
Original summary in English

Summary by Sourcery

Ensure tool loop agent preserves and surfaces all content items from CallToolResult, including mixed image and text outputs.

Bug Fixes:

  • Handle CallToolResult responses with multiple content items so that all text and image entries are surfaced to the LLM instead of only the first one.
  • Provide a fallback message when a tool returns an empty content list and handle unsupported resource types more safely.

Tests:

  • Add a regression test verifying that mixed image and text CallToolResult content is fully included in the resulting tool message and that images are cached with correct metadata.

Checklist / 检查清单

  • 😊 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。/ If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
  • 👀 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。/ My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
  • 🤓 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到了 requirements.txtpyproject.toml 文件相应位置。/ I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
  • 😮 我的更改没有引入恶意代码。/ My changes do not introduce malicious code.

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 12, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a limitation where CallToolResult objects were only partially processed, leading to potential data loss, especially in multimodal interactions. By updating the processing logic to handle all content items within a CallToolResult, the agent can now fully utilize complex tool outputs, improving its understanding and response generation capabilities. A new test case has been introduced to validate this enhanced functionality.

Highlights

  • Tool Result Processing: Modified the tool result handling to iterate through all items in CallToolResult.content instead of only processing the first item, ensuring all information from tool calls is captured.
  • Multimodal Content Handling: Enhanced the caching of image results for multimodal review, now appending any accompanying text or resource content into the same tool result passed back to the LLM.
  • Regression Test: Added a new regression test to cover scenarios involving mixed image and text content within tool results, ensuring the fix works as expected.
Changelog
  • astrbot/core/agent/runners/tool_loop_agent_runner.py
    • Refactored tool result processing to iterate over all content items in CallToolResult.
    • Implemented logic to combine text, image, and embedded resource content into a single tool result string.
    • Added a check for empty CallToolResult.content.
  • tests/test_tool_loop_agent_runner.py
    • Imported SimpleNamespace for mocking purposes.
    • Introduced MockMixedContentToolExecutor to simulate tool calls returning both image and text content.
    • Added test_tool_result_includes_all_calltoolresult_content to verify comprehensive content processing from tool results.
Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@auto-assign auto-assign bot requested review from Raven95676 and Soulter March 12, 2026 17:50
@dosubot dosubot bot added the area:core The bug / feature is about astrbot's core, backend label Mar 12, 2026
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

你好——我发现了 2 个问题,并给出了一些总体性的反馈:

  • 现在 ImageContentEmbeddedResource 的图片处理分支中都重复了相同的缓存和消息处理逻辑;可以考虑抽取一个小的辅助函数(例如 cache_tool_image(content_item, index)),以减少重复,并让后续改动集中在一个地方。
  • 在混合内容聚合逻辑中,你目前是通过 "\n\n".join(result_parts) 把结果拼成单个字符串;如果后续的使用方需要区分各个独立部分,可能更适合保留结构化表示(例如拆成多条消息或带标签的格式),而不是把所有内容都压平成纯文本。
  • 新增的测试 test_tool_result_includes_all_calltoolresult_content 依赖硬编码的 tool_call_idtool_namecall_123, test_tool);如果这些值在 runner 中发生变动,测试就会在没有明显原因的情况下失败,因此可以考虑从实际的 tool call 中推导这些值,或者放宽断言,避免过于紧耦合。
给 AI Agent 的提示
Please address the comments from this code review:

## Overall Comments
- 现在 `ImageContent``EmbeddedResource` 的图片处理分支中都重复了相同的缓存和消息处理逻辑;可以考虑抽取一个小的辅助函数(例如 `cache_tool_image(content_item, index)`),以减少重复,并让后续改动集中在一个地方。
- 在混合内容聚合逻辑中,你目前是通过 `"\n\n".join(result_parts)` 把结果拼成单个字符串;如果后续的使用方需要区分各个独立部分,可能更适合保留结构化表示(例如拆成多条消息或带标签的格式),而不是把所有内容都压平成纯文本。
- 新增的测试 `test_tool_result_includes_all_calltoolresult_content` 依赖硬编码的 `tool_call_id``tool_name``call_123`, `test_tool`);如果这些值在 runner 中发生变动,测试就会在没有明显原因的情况下失败,因此可以考虑从实际的 tool call 中推导这些值,或者放宽断言,避免过于紧耦合。

## Individual Comments

### Comment 1
<location path="tests/test_tool_loop_agent_runner.py" line_range="94" />
<code_context>
         return generator()


+class MockMixedContentToolExecutor:
+    """模拟返回图片 + 文本的工具执行器"""
+
</code_context>
<issue_to_address>
**suggestion (testing):** 考虑添加一个混合内容的 mock,以同时覆盖 EmbeddedResource 分支

`MockMixedContentToolExecutor` 目前很好地覆盖了 `ImageContent + TextContent` 路径,但实现中对 `EmbeddedResource``TextResourceContents` 和类图片的 `BlobResourceContents`)也有专门的处理。请扩展这个 mock(或新增一个),使其同时返回文本和图片类型的 `EmbeddedResource`,从而在测试中覆盖这些分支,减少资源处理相关回归的可能性。

建议实现如下:

```python
class MockMixedContentToolExecutor:
    """模拟返回图片 + 文本的工具执行器
    同时覆盖 EmbeddedResource 文本与图片分支,以便测试资源处理逻辑。
    """

    @classmethod
    def execute(cls, tool, run_context, **tool_args):
        async def generator():
            # 内联文本内容
            inline_text = SimpleNamespace(
                type="text",
                text="mock inline text from mixed-content tool",
            )

            # 内联图片内容(保持与现有测试中图片内容约定一致,如需可调整字段名)
            inline_image = SimpleNamespace(
                type="image",
                image_url="mock://inline-image",
            )

            # 模拟 TextResourceContents 对应的 EmbeddedResource
            text_embedded_resource = SimpleNamespace(
                kind="text",
                uri="mock://embedded-text-resource",
                mime_type="text/plain",
                text="mock embedded text resource",
            )

            text_embedded_content = SimpleNamespace(
                type="embedded_resource",
                resource=text_embedded_resource,
            )

            # 模拟 BlobResourceContents(图片类)对应的 EmbeddedResource
            image_embedded_resource = SimpleNamespace(
                kind="image",
                uri="mock://embedded-image-resource",
                mime_type="image/png",
                data=b"\x89PNG\r\n\x1a\n",  # 仅作占位,不必是真实图片
            )

            image_embedded_content = SimpleNamespace(
                type="embedded_resource",
                resource=image_embedded_resource,
            )

            tool_name = getattr(tool, "name", "mock_mixed_content_tool")

            # 保持与其他 mock executor 一致:返回一次性流结果,
            # 其中包含文本、图片以及两种 EmbeddedResource
            yield [
                SimpleNamespace(
                    role="tool",
                    name=tool_name,
                    content=[
                        inline_text,
                        inline_image,
                        text_embedded_content,
                        image_embedded_content,
                    ],
                )
            ]

        return generator()

```

1. 如果生产代码中 `EmbeddedResource` / `TextResourceContents` / `BlobResourceContents` 有专门的类或字段名(例如 `EmbeddedResource`, `TextResourceContent`, `BlobResourceContent` 等),请将上面使用的 `kind`, `uri`, `mime_type`, `text`, `data`、以及 `type="embedded_resource"` 等字段替换为项目中实际使用的类型和字段。
2. 确认其他 mock 执行器在测试中的返回结构(例如是否直接返回 `SimpleNamespace(role=..., content=[...])`、或封装在更高层对象中),如果有差异,请将 `yield [...]` 部分调整为与现有 mock 的结构完全一致,以确保新的 EmbeddedResource 分支被现有测试路径正确消费。
</issue_to_address>

### Comment 2
<location path="tests/test_tool_loop_agent_runner.py" line_range="400" />
<code_context>


+@pytest.mark.asyncio
+async def test_tool_result_includes_all_calltoolresult_content(
+    runner, mock_provider, provider_request, mock_hooks, monkeypatch
+):
</code_context>
<issue_to_address>
**suggestion (testing):** 增加一个变体测试,在 CallToolResult.content 为空时验证“无内容”分支

由于现在存在一个 `if not res.content:` 分支会追加 "The tool returned no content." 并提前返回,请新增一个测试,让 mock 工具返回 `CallToolResult(content=[])`(或在合法的情况下返回 `content=None`)。该测试应验证:的确走到了这个分支,并且不会调用 `save_image`。

建议实现如下:

```python
    async def text_chat(self, **kwargs) -> LLMResponse:
        self.call_count += 1
    assert mock_hooks.tool_end_called, "on_tool_end应该被调用"


@pytest.mark.asyncio
async def test_tool_result_handles_empty_calltoolresult_content(
    runner, mock_provider, provider_request, mock_hooks, monkeypatch
):
    """当工具返回空 content 时,应走 'no content' 分支且不保存图片。"""

    from astrbot.core.agent.tool_image_cache import tool_image_cache
    from astrbot.core.provider import CallToolResult

    mock_provider.should_call_tools = True
    mock_provider.max_calls_before_normal_response = 1

    saved_images = []

    def fake_save_image(*args, **kwargs):
        saved_images.append((args, kwargs))

    # monkeypatch 掉图片保存函数,方便断言不会被调用
    monkeypatch.setattr(tool_image_cache, "save_image", fake_save_image)

    # 让 mock 工具返回空的 CallToolResult.content,以覆盖 `if not res.content` 分支
    async def fake_tool_callable(*args, **kwargs):
        return CallToolResult(content=[])

    # 根据现有测试/MockProvider 约定,挂上假的工具实现
    mock_provider.tool_callable = fake_tool_callable

    result = await runner.run(provider_request)

    # 断言走到了 "no content" 分支
    # (如果实现使用了其它提示文案,请同步更新下面的断言)
    assert "The tool returned no content." in result.msgs[-1].content[0].text
    # 不应保存任何图片
    assert saved_images == []
    # 工具生命周期 Hook 仍然应该被调用
    assert mock_hooks.tool_end_called, "on_tool_end应该被调用"

@pytest.mark.asyncio
async def test_tool_result_includes_all_calltoolresult_content(
    runner, mock_provider, provider_request, mock_hooks, monkeypatch
):

```

1. 确认 `astrbot.core.provider.CallToolResult` 的导入路径及构造签名是否为 `CallToolResult(content=[...])`,如不一致请调整导入路径和构造参数。
2. 上面的断言 `result.msgs[-1].content[0].text` 假设最终的 LLM 输出以这种结构暴露文本;如果实际返回结构不同,请根据现有的 `test_tool_result_includes_all_calltoolresult_content` 中用于检查 tool result 文本的方式进行同步修改。
3. `mock_provider.tool_callable` 是根据常见模式猜测的挂载点,请对照项目中 MockProvider 的实现,改为实际用于返回工具结果的属性或方法(例如 `mock_provider.tools[0].fn``mock_provider.mock_tool_result` 等),保证在 runner 调用工具时会返回 `CallToolResult(content=[])`4. 如果 “no content” 分支的提示文案与 `"The tool returned no content."` 略有出入,请将断言字符串改成与实现完全一致,或使用 `in` 断言其中的关键片段。
</issue_to_address>

Sourcery 对开源项目免费使用——如果你喜欢我们的评审,欢迎帮忙分享 ✨
请帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据这些反馈改进之后的代码评审。
Original comment in English

Hey - I've found 2 issues, and left some high level feedback:

  • The image-handling branches for ImageContent and EmbeddedResource now duplicate the same cache-and-message logic; consider extracting a small helper (e.g. cache_tool_image(content_item, index)) to reduce repetition and keep future changes in one place.
  • In the mixed-content aggregation you currently "\n\n".join(result_parts) into a single string; if downstream consumers ever need to distinguish individual parts, it might be worth keeping a structured representation (e.g. separate messages or a tagged format) rather than flattening everything into text.
  • The new test test_tool_result_includes_all_calltoolresult_content relies on the hardcoded tool_call_id and tool_name (call_123, test_tool); if those ever change in the runner, the test will fail unexpectedly, so consider deriving these values from the actual tool call or relaxing the assertion to avoid tight coupling.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The image-handling branches for `ImageContent` and `EmbeddedResource` now duplicate the same cache-and-message logic; consider extracting a small helper (e.g. `cache_tool_image(content_item, index)`) to reduce repetition and keep future changes in one place.
- In the mixed-content aggregation you currently `"\n\n".join(result_parts)` into a single string; if downstream consumers ever need to distinguish individual parts, it might be worth keeping a structured representation (e.g. separate messages or a tagged format) rather than flattening everything into text.
- The new test `test_tool_result_includes_all_calltoolresult_content` relies on the hardcoded `tool_call_id` and `tool_name` (`call_123`, `test_tool`); if those ever change in the runner, the test will fail unexpectedly, so consider deriving these values from the actual tool call or relaxing the assertion to avoid tight coupling.

## Individual Comments

### Comment 1
<location path="tests/test_tool_loop_agent_runner.py" line_range="94" />
<code_context>
         return generator()


+class MockMixedContentToolExecutor:
+    """模拟返回图片 + 文本的工具执行器"""
+
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a mixed content mock that also exercises EmbeddedResource branches

`MockMixedContentToolExecutor` usefully covers the `ImageContent + TextContent` path, but the implementation also has specific handling for `EmbeddedResource` (`TextResourceContents` and image-like `BlobResourceContents`). Please extend this mock (or add another) to return both text and image `EmbeddedResource` values so those branches are exercised by tests and resource-handling regressions are less likely.

Suggested implementation:

```python
class MockMixedContentToolExecutor:
    """模拟返回图片 + 文本的工具执行器
    同时覆盖 EmbeddedResource 文本与图片分支,以便测试资源处理逻辑。
    """

    @classmethod
    def execute(cls, tool, run_context, **tool_args):
        async def generator():
            # 内联文本内容
            inline_text = SimpleNamespace(
                type="text",
                text="mock inline text from mixed-content tool",
            )

            # 内联图片内容(保持与现有测试中图片内容约定一致,如需可调整字段名)
            inline_image = SimpleNamespace(
                type="image",
                image_url="mock://inline-image",
            )

            # 模拟 TextResourceContents 对应的 EmbeddedResource
            text_embedded_resource = SimpleNamespace(
                kind="text",
                uri="mock://embedded-text-resource",
                mime_type="text/plain",
                text="mock embedded text resource",
            )

            text_embedded_content = SimpleNamespace(
                type="embedded_resource",
                resource=text_embedded_resource,
            )

            # 模拟 BlobResourceContents(图片类)对应的 EmbeddedResource
            image_embedded_resource = SimpleNamespace(
                kind="image",
                uri="mock://embedded-image-resource",
                mime_type="image/png",
                data=b"\x89PNG\r\n\x1a\n",  # 仅作占位,不必是真实图片
            )

            image_embedded_content = SimpleNamespace(
                type="embedded_resource",
                resource=image_embedded_resource,
            )

            tool_name = getattr(tool, "name", "mock_mixed_content_tool")

            # 保持与其他 mock executor 一致:返回一次性流结果,
            # 其中包含文本、图片以及两种 EmbeddedResource
            yield [
                SimpleNamespace(
                    role="tool",
                    name=tool_name,
                    content=[
                        inline_text,
                        inline_image,
                        text_embedded_content,
                        image_embedded_content,
                    ],
                )
            ]

        return generator()

```

1. 如果生产代码中 `EmbeddedResource` / `TextResourceContents` / `BlobResourceContents` 有专门的类或字段名(例如 `EmbeddedResource`, `TextResourceContent`, `BlobResourceContent` 等),请将上面使用的 `kind`, `uri`, `mime_type`, `text`, `data`、以及 `type="embedded_resource"` 等字段替换为项目中实际使用的类型和字段。
2. 确认其他 mock 执行器在测试中的返回结构(例如是否直接返回 `SimpleNamespace(role=..., content=[...])`、或封装在更高层对象中),如果有差异,请将 `yield [...]` 部分调整为与现有 mock 的结构完全一致,以确保新的 EmbeddedResource 分支被现有测试路径正确消费。
</issue_to_address>

### Comment 2
<location path="tests/test_tool_loop_agent_runner.py" line_range="400" />
<code_context>


+@pytest.mark.asyncio
+async def test_tool_result_includes_all_calltoolresult_content(
+    runner, mock_provider, provider_request, mock_hooks, monkeypatch
+):
</code_context>
<issue_to_address>
**suggestion (testing):** Add a variant of this test where CallToolResult.content is empty to verify the 'no content' path

Since there is now an `if not res.content:` branch that appends "The tool returned no content." and exits early, please add a test where the mock tool returns `CallToolResult(content=[])` (or `content=None` if valid). That test should verify that this branch is taken and that no `save_image` calls are made.

Suggested implementation:

```python
    async def text_chat(self, **kwargs) -> LLMResponse:
        self.call_count += 1
    assert mock_hooks.tool_end_called, "on_tool_end应该被调用"


@pytest.mark.asyncio
async def test_tool_result_handles_empty_calltoolresult_content(
    runner, mock_provider, provider_request, mock_hooks, monkeypatch
):
    """当工具返回空 content 时,应走 'no content' 分支且不保存图片。"""

    from astrbot.core.agent.tool_image_cache import tool_image_cache
    from astrbot.core.provider import CallToolResult

    mock_provider.should_call_tools = True
    mock_provider.max_calls_before_normal_response = 1

    saved_images = []

    def fake_save_image(*args, **kwargs):
        saved_images.append((args, kwargs))

    # monkeypatch 掉图片保存函数,方便断言不会被调用
    monkeypatch.setattr(tool_image_cache, "save_image", fake_save_image)

    # 让 mock 工具返回空的 CallToolResult.content,以覆盖 `if not res.content` 分支
    async def fake_tool_callable(*args, **kwargs):
        return CallToolResult(content=[])

    # 根据现有测试/MockProvider 约定,挂上假的工具实现
    mock_provider.tool_callable = fake_tool_callable

    result = await runner.run(provider_request)

    # 断言走到了 "no content" 分支
    # (如果实现使用了其它提示文案,请同步更新下面的断言)
    assert "The tool returned no content." in result.msgs[-1].content[0].text
    # 不应保存任何图片
    assert saved_images == []
    # 工具生命周期 Hook 仍然应该被调用
    assert mock_hooks.tool_end_called, "on_tool_end应该被调用"

@pytest.mark.asyncio
async def test_tool_result_includes_all_calltoolresult_content(
    runner, mock_provider, provider_request, mock_hooks, monkeypatch
):

```

1. 确认 `astrbot.core.provider.CallToolResult` 的导入路径及构造签名是否为 `CallToolResult(content=[...])`,如不一致请调整导入路径和构造参数。
2. 上面的断言 `result.msgs[-1].content[0].text` 假设最终的 LLM 输出以这种结构暴露文本;如果实际返回结构不同,请根据现有的 `test_tool_result_includes_all_calltoolresult_content` 中用于检查 tool result 文本的方式进行同步修改。
3. `mock_provider.tool_callable` 是根据常见模式猜测的挂载点,请对照项目中 MockProvider 的实现,改为实际用于返回工具结果的属性或方法(例如 `mock_provider.tools[0].fn``mock_provider.mock_tool_result` 等),保证在 runner 调用工具时会返回 `CallToolResult(content=[])`4. 如果 “no content” 分支的提示文案与 `"The tool returned no content."` 略有出入,请将断言字符串改成与实现完全一致,或使用 `in` 断言其中的关键片段。
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

return generator()


class MockMixedContentToolExecutor:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): 考虑添加一个混合内容的 mock,以同时覆盖 EmbeddedResource 分支

MockMixedContentToolExecutor 目前很好地覆盖了 ImageContent + TextContent 路径,但实现中对 EmbeddedResourceTextResourceContents 和类图片的 BlobResourceContents)也有专门的处理。请扩展这个 mock(或新增一个),使其同时返回文本和图片类型的 EmbeddedResource,从而在测试中覆盖这些分支,减少资源处理相关回归的可能性。

建议实现如下:

class MockMixedContentToolExecutor:
    """模拟返回图片 + 文本的工具执行器
    同时覆盖 EmbeddedResource 文本与图片分支,以便测试资源处理逻辑。
    """

    @classmethod
    def execute(cls, tool, run_context, **tool_args):
        async def generator():
            # 内联文本内容
            inline_text = SimpleNamespace(
                type="text",
                text="mock inline text from mixed-content tool",
            )

            # 内联图片内容(保持与现有测试中图片内容约定一致,如需可调整字段名)
            inline_image = SimpleNamespace(
                type="image",
                image_url="mock://inline-image",
            )

            # 模拟 TextResourceContents 对应的 EmbeddedResource
            text_embedded_resource = SimpleNamespace(
                kind="text",
                uri="mock://embedded-text-resource",
                mime_type="text/plain",
                text="mock embedded text resource",
            )

            text_embedded_content = SimpleNamespace(
                type="embedded_resource",
                resource=text_embedded_resource,
            )

            # 模拟 BlobResourceContents(图片类)对应的 EmbeddedResource
            image_embedded_resource = SimpleNamespace(
                kind="image",
                uri="mock://embedded-image-resource",
                mime_type="image/png",
                data=b"\x89PNG\r\n\x1a\n",  # 仅作占位,不必是真实图片
            )

            image_embedded_content = SimpleNamespace(
                type="embedded_resource",
                resource=image_embedded_resource,
            )

            tool_name = getattr(tool, "name", "mock_mixed_content_tool")

            # 保持与其他 mock executor 一致:返回一次性流结果,
            # 其中包含文本、图片以及两种 EmbeddedResource
            yield [
                SimpleNamespace(
                    role="tool",
                    name=tool_name,
                    content=[
                        inline_text,
                        inline_image,
                        text_embedded_content,
                        image_embedded_content,
                    ],
                )
            ]

        return generator()
  1. 如果生产代码中 EmbeddedResource / TextResourceContents / BlobResourceContents 有专门的类或字段名(例如 EmbeddedResource, TextResourceContent, BlobResourceContent 等),请将上面使用的 kind, uri, mime_type, text, data、以及 type="embedded_resource" 等字段替换为项目中实际使用的类型和字段。
  2. 确认其他 mock 执行器在测试中的返回结构(例如是否直接返回 SimpleNamespace(role=..., content=[...])、或封装在更高层对象中),如果有差异,请将 yield [...] 部分调整为与现有 mock 的结构完全一致,以确保新的 EmbeddedResource 分支被现有测试路径正确消费。
Original comment in English

suggestion (testing): Consider adding a mixed content mock that also exercises EmbeddedResource branches

MockMixedContentToolExecutor usefully covers the ImageContent + TextContent path, but the implementation also has specific handling for EmbeddedResource (TextResourceContents and image-like BlobResourceContents). Please extend this mock (or add another) to return both text and image EmbeddedResource values so those branches are exercised by tests and resource-handling regressions are less likely.

Suggested implementation:

class MockMixedContentToolExecutor:
    """模拟返回图片 + 文本的工具执行器
    同时覆盖 EmbeddedResource 文本与图片分支,以便测试资源处理逻辑。
    """

    @classmethod
    def execute(cls, tool, run_context, **tool_args):
        async def generator():
            # 内联文本内容
            inline_text = SimpleNamespace(
                type="text",
                text="mock inline text from mixed-content tool",
            )

            # 内联图片内容(保持与现有测试中图片内容约定一致,如需可调整字段名)
            inline_image = SimpleNamespace(
                type="image",
                image_url="mock://inline-image",
            )

            # 模拟 TextResourceContents 对应的 EmbeddedResource
            text_embedded_resource = SimpleNamespace(
                kind="text",
                uri="mock://embedded-text-resource",
                mime_type="text/plain",
                text="mock embedded text resource",
            )

            text_embedded_content = SimpleNamespace(
                type="embedded_resource",
                resource=text_embedded_resource,
            )

            # 模拟 BlobResourceContents(图片类)对应的 EmbeddedResource
            image_embedded_resource = SimpleNamespace(
                kind="image",
                uri="mock://embedded-image-resource",
                mime_type="image/png",
                data=b"\x89PNG\r\n\x1a\n",  # 仅作占位,不必是真实图片
            )

            image_embedded_content = SimpleNamespace(
                type="embedded_resource",
                resource=image_embedded_resource,
            )

            tool_name = getattr(tool, "name", "mock_mixed_content_tool")

            # 保持与其他 mock executor 一致:返回一次性流结果,
            # 其中包含文本、图片以及两种 EmbeddedResource
            yield [
                SimpleNamespace(
                    role="tool",
                    name=tool_name,
                    content=[
                        inline_text,
                        inline_image,
                        text_embedded_content,
                        image_embedded_content,
                    ],
                )
            ]

        return generator()
  1. 如果生产代码中 EmbeddedResource / TextResourceContents / BlobResourceContents 有专门的类或字段名(例如 EmbeddedResource, TextResourceContent, BlobResourceContent 等),请将上面使用的 kind, uri, mime_type, text, data、以及 type="embedded_resource" 等字段替换为项目中实际使用的类型和字段。
  2. 确认其他 mock 执行器在测试中的返回结构(例如是否直接返回 SimpleNamespace(role=..., content=[...])、或封装在更高层对象中),如果有差异,请将 yield [...] 部分调整为与现有 mock 的结构完全一致,以确保新的 EmbeddedResource 分支被现有测试路径正确消费。



@pytest.mark.asyncio
async def test_tool_result_includes_all_calltoolresult_content(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): 增加一个该测试的变体,在 CallToolResult.content 为空时,验证“无内容”分支

由于现在存在一个 if not res.content: 分支会追加 "The tool returned no content." 并提前返回,请新增一个测试,让 mock 工具返回 CallToolResult(content=[])(或在合法的情况下返回 content=None)。该测试应验证:的确走到了这个分支,并且不会调用 save_image

建议实现如下:

    async def text_chat(self, **kwargs) -> LLMResponse:
        self.call_count += 1
    assert mock_hooks.tool_end_called, "on_tool_end应该被调用"


@pytest.mark.asyncio
async def test_tool_result_handles_empty_calltoolresult_content(
    runner, mock_provider, provider_request, mock_hooks, monkeypatch
):
    """当工具返回空 content 时,应走 'no content' 分支且不保存图片。"""

    from astrbot.core.agent.tool_image_cache import tool_image_cache
    from astrbot.core.provider import CallToolResult

    mock_provider.should_call_tools = True
    mock_provider.max_calls_before_normal_response = 1

    saved_images = []

    def fake_save_image(*args, **kwargs):
        saved_images.append((args, kwargs))

    # monkeypatch 掉图片保存函数,方便断言不会被调用
    monkeypatch.setattr(tool_image_cache, "save_image", fake_save_image)

    # 让 mock 工具返回空的 CallToolResult.content,以覆盖 `if not res.content` 分支
    async def fake_tool_callable(*args, **kwargs):
        return CallToolResult(content=[])

    # 根据现有测试/MockProvider 约定,挂上假的工具实现
    mock_provider.tool_callable = fake_tool_callable

    result = await runner.run(provider_request)

    # 断言走到了 "no content" 分支
    # (如果实现使用了其它提示文案,请同步更新下面的断言)
    assert "The tool returned no content." in result.msgs[-1].content[0].text
    # 不应保存任何图片
    assert saved_images == []
    # 工具生命周期 Hook 仍然应该被调用
    assert mock_hooks.tool_end_called, "on_tool_end应该被调用"

@pytest.mark.asyncio
async def test_tool_result_includes_all_calltoolresult_content(
    runner, mock_provider, provider_request, mock_hooks, monkeypatch
):
  1. 确认 astrbot.core.provider.CallToolResult 的导入路径及构造签名是否为 CallToolResult(content=[...]),如不一致请调整导入路径和构造参数。
  2. 上面的断言 result.msgs[-1].content[0].text 假设最终的 LLM 输出以这种结构暴露文本;如果实际返回结构不同,请根据现有的 test_tool_result_includes_all_calltoolresult_content 中用于检查 tool result 文本的方式进行同步修改。
  3. mock_provider.tool_callable 是根据常见模式猜测的挂载点,请对照项目中 MockProvider 的实现,改为实际用于返回工具结果的属性或方法(例如 mock_provider.tools[0].fnmock_provider.mock_tool_result 等),保证在 runner 调用工具时会返回 CallToolResult(content=[])
  4. 如果 “no content” 分支的提示文案与 "The tool returned no content." 略有出入,请将断言字符串改成与实现完全一致,或使用 in 断言其中的关键片段。
Original comment in English

suggestion (testing): Add a variant of this test where CallToolResult.content is empty to verify the 'no content' path

Since there is now an if not res.content: branch that appends "The tool returned no content." and exits early, please add a test where the mock tool returns CallToolResult(content=[]) (or content=None if valid). That test should verify that this branch is taken and that no save_image calls are made.

Suggested implementation:

    async def text_chat(self, **kwargs) -> LLMResponse:
        self.call_count += 1
    assert mock_hooks.tool_end_called, "on_tool_end应该被调用"


@pytest.mark.asyncio
async def test_tool_result_handles_empty_calltoolresult_content(
    runner, mock_provider, provider_request, mock_hooks, monkeypatch
):
    """当工具返回空 content 时,应走 'no content' 分支且不保存图片。"""

    from astrbot.core.agent.tool_image_cache import tool_image_cache
    from astrbot.core.provider import CallToolResult

    mock_provider.should_call_tools = True
    mock_provider.max_calls_before_normal_response = 1

    saved_images = []

    def fake_save_image(*args, **kwargs):
        saved_images.append((args, kwargs))

    # monkeypatch 掉图片保存函数,方便断言不会被调用
    monkeypatch.setattr(tool_image_cache, "save_image", fake_save_image)

    # 让 mock 工具返回空的 CallToolResult.content,以覆盖 `if not res.content` 分支
    async def fake_tool_callable(*args, **kwargs):
        return CallToolResult(content=[])

    # 根据现有测试/MockProvider 约定,挂上假的工具实现
    mock_provider.tool_callable = fake_tool_callable

    result = await runner.run(provider_request)

    # 断言走到了 "no content" 分支
    # (如果实现使用了其它提示文案,请同步更新下面的断言)
    assert "The tool returned no content." in result.msgs[-1].content[0].text
    # 不应保存任何图片
    assert saved_images == []
    # 工具生命周期 Hook 仍然应该被调用
    assert mock_hooks.tool_end_called, "on_tool_end应该被调用"

@pytest.mark.asyncio
async def test_tool_result_includes_all_calltoolresult_content(
    runner, mock_provider, provider_request, mock_hooks, monkeypatch
):
  1. 确认 astrbot.core.provider.CallToolResult 的导入路径及构造签名是否为 CallToolResult(content=[...]),如不一致请调整导入路径和构造参数。
  2. 上面的断言 result.msgs[-1].content[0].text 假设最终的 LLM 输出以这种结构暴露文本;如果实际返回结构不同,请根据现有的 test_tool_result_includes_all_calltoolresult_content 中用于检查 tool result 文本的方式进行同步修改。
  3. mock_provider.tool_callable 是根据常见模式猜测的挂载点,请对照项目中 MockProvider 的实现,改为实际用于返回工具结果的属性或方法(例如 mock_provider.tools[0].fnmock_provider.mock_tool_result 等),保证在 runner 调用工具时会返回 CallToolResult(content=[])
  4. 如果 “no content” 分支的提示文案与 "The tool returned no content." 略有出入,请将断言字符串改成与实现完全一致,或使用 in 断言其中的关键片段。

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request refactors the _append_tool_call_result function to correctly process and aggregate multiple content items (text, images, embedded resources) from a single tool call result, rather than only handling the first item. This includes iterating through all content items, caching multiple images with unique indices, and combining all processed results into a single tool message. A new test case has been added to verify this multi-content handling functionality, ensuring that both image and text content are correctly extracted, cached, and included in the final tool message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core The bug / feature is about astrbot's core, backend size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Tool 返回 CallToolResult 中包含多个内容时 LLM 只获取到第一个内容

1 participant