[Feature] Add MLX inference backend #457

## Summary

Add an MLX-based inference backend as an alternative to the existing llama.cpp path, targeting Apple Silicon Macs.

## Problem

The current local inference path uses llama.cpp, which runs on the CPU and can offload work to the GPU through GGML's Metal backend. MLX is Apple's own machine learning framework, built for Apple Silicon's unified memory architecture. On M-series chips, MLX can deliver better throughput and lower latency than llama.cpp for the same model because it is designed from the ground up for the hardware, though the size of the gain depends on the model and quantization.

Users with Apple Silicon Macs would get faster completions and lower energy draw from local inference without switching to the Apple Intelligence engine.

## Proposed direction

- Add a new `SuggestionEngineKind` case (e.g. `.llamaMLX` or `.mlx`) alongside the existing `.llamaOpenSource` and `.appleIntelligence` cases.
- Implement an `MLXSuggestionEngine` conforming to the existing `SuggestionEngineProtocol` contract in `SuggestionSubsystemContracts.swift`.
- Route through `SuggestionEngineRouter` the same way the llama path does today.
- Support GGUF or MLX-native quantized weights (e.g. models published under the `mlx-community` organization on Hugging Face). Reuse the existing `ModelDownloadManager` / `BundledRuntimeLocator` where possible.
- Gate the engine option on Apple Silicon availability so it never appears on Intel Macs.
- Surface the new engine in the Engine picker in both the menu bar popup and Settings > Engine & Model.
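
As a rough illustration of the first two bullets and the Apple Silicon gate, here is a minimal sketch. The enum and protocol below are simplified stand-ins, not the real declarations from `SuggestionSubsystemContracts.swift`. The `.mlx` case name, `availableKinds(isAppleSilicon:)`, and the injected `generate` closure are assumptions made for illustration only.

```swift
// Hypothetical stand-in for the app's real enum; the actual cases and
// raw values may differ.
enum SuggestionEngineKind: String, CaseIterable {
    case llamaOpenSource
    case appleIntelligence
    case mlx // proposed case; the final name (`.llamaMLX` or `.mlx`) is still open

    /// Compile-time check. An x86_64 slice running under Rosetta reports
    /// `false`, which errs on the side of hiding the MLX option.
    static var isAppleSilicon: Bool {
        #if arch(arm64)
        return true
        #else
        return false
        #endif
    }

    /// Engine kinds the picker should offer. The parameter is injectable
    /// so the gating logic can be tested on any machine.
    static func availableKinds(
        isAppleSilicon: Bool = SuggestionEngineKind.isAppleSilicon
    ) -> [SuggestionEngineKind] {
        allCases.filter { $0 != .mlx || isAppleSilicon }
    }
}

// Hypothetical stand-in for the real protocol contract.
protocol SuggestionEngineProtocol {
    var kind: SuggestionEngineKind { get }
    func suggest(for prefix: String) async throws -> String
}

struct MLXSuggestionEngine: SuggestionEngineProtocol {
    let kind = SuggestionEngineKind.mlx

    /// Placeholder for the actual MLX generation call, injected so this
    /// sketch does not depend on mlx-swift.
    let generate: (String) async throws -> String

    func suggest(for prefix: String) async throws -> String {
        try await generate(prefix)
    }
}
```

Both the menu bar popup and Settings > Engine & Model could build their picker from `SuggestionEngineKind.availableKinds()`, so the gate lives in one place rather than in each view.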

## Additional context

- MLX Swift bindings: https://github.com/ml-explore/mlx-swift
- MLX Examples (LLM inference): https://github.com/ml-explore/mlx-swift-examples
- The existing llama.cpp integration lives in `LlamaRuntimeCore`, `LlamaRuntimeManager`, and `LlamaSuggestionEngine` and is a good structural reference.
- The `CotabbyInference` Swift package currently wraps llama.cpp; MLX could live in a new target in that package or in a dedicated `CotabbyMLX` package.
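
If the dedicated-package option is chosen, the manifest might look roughly like the sketch below. The target layout, the minimum macOS version, the tools version, and the use of `branch: "main"` are all assumptions; a real manifest would pin a tagged mlx-swift release and follow that release's platform requirement.

```swift
// swift-tools-version:5.9
// Hypothetical Package.swift for a dedicated `CotabbyMLX` package.
import PackageDescription

let package = Package(
    name: "CotabbyMLX",
    platforms: [
        .macOS(.v14) // assumption; check the requirement of the pinned mlx-swift release
    ],
    products: [
        .library(name: "CotabbyMLX", targets: ["CotabbyMLX"])
    ],
    dependencies: [
        // Tracking `main` only for illustration; pin a release tag in practice.
        .package(url: "https://github.com/ml-explore/mlx-swift", branch: "main")
    ],
    targets: [
        .target(
            name: "CotabbyMLX",
            dependencies: [
                .product(name: "MLX", package: "mlx-swift"),
                .product(name: "MLXNN", package: "mlx-swift")
            ]
        )
    ]
)
```

Keeping MLX in its own package would let Intel builds skip the dependency entirely, at the cost of a second package to version alongside `CotabbyInference`.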
