omlx Provider Fails to Report Token Usage — Dashboard Shows ?/131k (?%) and Compactions: 0 #1069

@l-jessie

Description

Describe the bug

When using the omlx provider (MLX local inference via http://127.0.0.1:8200/v1), the OpenClaw dashboard UI fails to display or track context-window usage and compaction statistics. Specifically, the /status output shows 📚 Context: ?/131k (?%) and 🧹 Compactions: 0 even after the session has accumulated substantial conversation history.

The ? placeholders indicate that OpenClaw cannot resolve token counts from the omlx provider's response, making it impossible to monitor how close the session is to its 131k context window limit or verify whether auto-compaction is functioning properly.

To Reproduce

  1. Configure the omlx provider in openclaw.json with a local MLX endpoint (e.g., http://127.0.0.1:8200/v1) serving Qwen3.6-35B-A3B-UD-MLX-4bit with contextWindow: 131072
  2. Start a direct chat session and send multiple messages to build up conversation history
  3. Run /status or check the dashboard session card
  4. Observe 📚 Context: ?/131k (?%) instead of actual token counts like 45k/131k (34%)
  5. Run openclaw status --usage — token usage data is similarly unavailable for omlx sessions

Expected behavior

  • The context usage should display actual token counts (e.g., 📚 Context: 45,231/131,072 (34%)) based on the provider's usage field in the completion response
  • Compaction count should increment correctly when auto-compaction triggers near the context limit
  • The ? and (?%) placeholders should only appear when token tracking is genuinely unavailable, not when the provider returns valid usage data
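For reference, the usage object that OpenAI-compatible clients generally rely on has this shape (field names per the OpenAI Chat Completions API; the counts here are illustrative, and I'm assuming OpenClaw's openai-completions path reads the same fields):

```json
{
  "id": "chatcmpl-...",
  "model": "Qwen3.6-35B-A3B-UD-MLX-4bit",
  "choices": [],
  "usage": {
    "prompt_tokens": 45231,
    "completion_tokens": 812,
    "total_tokens": 46043
  }
}
```

If the MLX server's raw response (e.g. inspected with curl against http://127.0.0.1:8200/v1) omits this object or renames its fields, that would explain the ? placeholders.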

Screenshots

N/A — the issue manifests as ? placeholders in the status display rather than a visual error.

Desktop (please complete the following information):

  • macOS Version: 26.4.1
  • oMLX Version: 0.3.8 (OpenClaw 2026.5.4, provider is a local MLX inference server on port 8200)

Additional context

The omlx provider is configured as a standard OpenAI-compatible endpoint (api: "openai-completions") pointing to a local MLX inference server. The model definition includes contextWindow: 131072 and maxTokens: 8192.

The issue likely stems from one of the following:

  1. The MLX server's completion response doesn't include a properly structured usage object (with prompt_tokens, completion_tokens, total_tokens fields) that OpenClaw expects from OpenAI-compatible APIs
  2. OpenClaw's token counting logic doesn't handle the response format from this specific MLX backend correctly
  3. The cost fields are all 0 (free/self-hosted), which may cause the token accounting pipeline to skip usage tracking entirely

The provider config looks like:

"omlx": {
  "baseUrl": "http://127.0.0.1:8200/v1",
  "apiKey": "omlx",
  "api": "openai-completions",
  "models": [{
    "id": "Qwen3.6-35B-A3B-UD-MLX-4bit",
    "contextWindow": 131072,
    "maxTokens": 8192,
    "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }
  }]
}

Other providers (mimo, volcengine, my-gpt) display token counts correctly, so this appears specific to the omlx/MLX provider path.
