hack-ink/voxit

Voxit

AI dictation app for macOS (MVP scaffold).

Feature Highlights

What is implemented in v1

  • Menubar dictation app on macOS with start/stop hotkey control.
  • ChatGPT login flow: browser OAuth by default, with a device-code fallback when the browser flow is unavailable.
  • Real-time pass-1 transcription from mic with committed/draft streaming assembly.
  • Pass-2 finalize pass using gpt-4o-transcribe for better punctuation and stability.
  • Optional Pass-3 rewrite for cleaner English output, with protection for numbers and proper nouns.
  • Auto-paste into the app that was frontmost when recording began.
  • Configurable behavior and models via config.toml.
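The committed/draft streaming assembly mentioned above can be pictured as a small buffer in which committed segments are final and the draft is overwritten by each new partial result. The sketch below is only an illustration under that assumption; `TranscriptBuffer` and its methods are hypothetical names, not Voxit's actual types.

```rust
/// Hypothetical transcript buffer (not Voxit's real type):
/// committed segments are final, the draft may still change.
#[derive(Debug, Default)]
struct TranscriptBuffer {
    committed: Vec<String>,
    draft: String,
}

impl TranscriptBuffer {
    /// Replace the in-flight draft with the latest partial hypothesis.
    fn update_draft(&mut self, partial: &str) {
        self.draft = partial.trim().to_string();
    }

    /// Promote a finished segment to committed text and clear the draft.
    /// Whitespace-only segments are ignored.
    fn commit(&mut self, segment: &str) {
        let segment = segment.trim();
        if !segment.is_empty() {
            self.committed.push(segment.to_string());
        }
        self.draft.clear();
    }

    /// Text for the live preview: committed segments followed by the draft.
    fn preview(&self) -> String {
        let mut parts: Vec<&str> = self.committed.iter().map(String::as_str).collect();
        if !self.draft.is_empty() {
            parts.push(self.draft.as_str());
        }
        parts.join(" ")
    }
}
```

Keeping the draft separate from the committed segments means a revised partial result never rewrites text that has already been marked stable.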

For the normative product contract, constraints, and gaps, see System Spec v1.

Status

The V1 target is macOS-first and aligned with the English-only voice input design.

  • Status: ✅ Core MVP loop is implemented (record → stream preview → finalize → optional rewrite → paste).
  • Scope: ✅ Native macOS mic capture + OpenAI model pipeline only.
  • Limitation: ✅ Linux/Windows build is intentionally disabled.
  • Limitation: ⚠️ Known gaps vs full spec are documented in System Spec v1 (hotkey configurability, tray menu behavior, CPAL fallback robustness, and rollout cleanup items).
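The core loop listed under Status (record → stream preview → finalize → optional rewrite → paste) can be read as a linear state machine with one optional stage. The following is a minimal sketch of that reading, not the app's real state type; `Stage`, `next_stage`, and `run_once` are names invented for illustration.

```rust
/// Hypothetical pipeline stages; the stream preview happens while `Recording`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    Idle,
    Recording,
    Finalizing,
    Rewriting,
    Pasting,
}

/// Advance one step; the rewrite stage is skipped when it is disabled.
fn next_stage(stage: Stage, rewrite_enabled: bool) -> Stage {
    match stage {
        Stage::Idle => Stage::Recording,
        Stage::Recording => Stage::Finalizing,
        Stage::Finalizing if rewrite_enabled => Stage::Rewriting,
        Stage::Finalizing => Stage::Pasting,
        Stage::Rewriting => Stage::Pasting,
        Stage::Pasting => Stage::Idle,
    }
}

/// Collect the stages visited in one dictation, starting from `Idle`.
fn run_once(rewrite_enabled: bool) -> Vec<Stage> {
    let mut stages = vec![Stage::Idle];
    loop {
        let next = next_stage(*stages.last().unwrap(), rewrite_enabled);
        if next == Stage::Idle {
            break;
        }
        stages.push(next);
    }
    stages
}
```

With rewriting enabled, `run_once(true)` visits all five stages; with it disabled, the `Rewriting` stage is skipped and finalize leads straight to paste.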

Usage

Installation

Build from Source

# Clone the repository.
git clone https://github.com/hack-ink/voxit
cd voxit

# To install Rust on macOS and Unix, run the following command.
#
# To install Rust on Windows, download and run the installer from `https://rustup.rs`.
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain stable

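# Install the necessary dependencies. (Unix only)
# Note: V1 targets macOS only and Linux/Windows builds are intentionally disabled,
# so this step is kept for reference. Using Ubuntu as an example; the exact
# packages depend on your distribution.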
sudo apt-get update
sudo apt-get install <DEPENDENCIES>

# Build the voxit package; the binary will be available at `target/release/voxit`.
cargo build --release -p voxit

# If you are a macOS user and want to have a `Voxit.app`, run the following command.
# Install `cargo-bundle` to pack the binary into an app.
cargo install cargo-bundle
# Pack the app; it will be available at `target/release/bundle/osx/Voxit.app`.
cargo bundle --release -p voxit

Download Pre-built Binary

  • macOS
  • Windows / Linux
    • Not included in the V1 release target (macOS-only).

Configuration

Voxit stores settings in:

$HOME/Library/Application Support/voxit/config.toml

Currently supported keys are:

[ui]
start_hidden = true
panel_width_px = 420
panel_height_px = 260

[hotkey]
chord = "ctrl+shift+space"
mode = "toggle" # toggle | hold

[audio]
backend = "voice_processing" # voice_processing | cpal
input_sample_rate_hz = 48000
input_device_name = ""
input_device_id = 0
realtime_target_rate_hz = 24000

[openai]
api_base_url = "https://api.openai.com/v1"
realtime_model = "gpt-4o-mini-transcribe"
finalize_model = "gpt-4o-transcribe"
rewrite_model = "gpt-5.2-mini"
language = "en"

[openai.realtime]
noise_reduction = "near_field" # near_field | far_field | off

[rewrite]
enabled = true
auto = true
guard_numbers = true
max_output_chars = 8000
style = "clean" # clean | formal | concise

[paste]
lock_frontmost_app = true
method = "clipboard_cmd_v"
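To make the `[hotkey]` settings above more concrete, here is a rough sketch of how a chord string and the two modes could be interpreted. It is an assumption-laden illustration, not Voxit's implementation: `parse_chord`, `parse_mode`, `apply_event`, `HotkeyMode`, and `KeyEvent` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HotkeyMode {
    Toggle,
    Hold,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeyEvent {
    Pressed,
    Released,
}

/// Split a chord such as "ctrl+shift+space" into lowercase key names.
fn parse_chord(chord: &str) -> Vec<String> {
    chord
        .split('+')
        .map(|part| part.trim().to_ascii_lowercase())
        .filter(|part| !part.is_empty())
        .collect()
}

/// Map the `mode` string to a variant; unknown values yield `None`.
fn parse_mode(value: &str) -> Option<HotkeyMode> {
    match value.trim().to_ascii_lowercase().as_str() {
        "toggle" => Some(HotkeyMode::Toggle),
        "hold" => Some(HotkeyMode::Hold),
        _ => None,
    }
}

/// Return whether recording should be active after `event`.
/// Toggle flips on each press; hold records only while the chord is down.
fn apply_event(mode: HotkeyMode, recording: bool, event: KeyEvent) -> bool {
    match (mode, event) {
        (HotkeyMode::Toggle, KeyEvent::Pressed) => !recording,
        (HotkeyMode::Toggle, KeyEvent::Released) => recording,
        (HotkeyMode::Hold, KeyEvent::Pressed) => true,
        (HotkeyMode::Hold, KeyEvent::Released) => false,
    }
}
```

In this sketch, toggle mode ignores key releases entirely, while hold mode ignores the previous recording state and follows the key position.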

First-run onboarding checklist:

  • Sign in with ChatGPT.
  • Microphone permission in System Settings → Privacy & Security → Microphone.
  • Accessibility permission in Privacy & Security → Accessibility (for Cmd+V fallback).
  • Input Monitoring permission in Privacy & Security → Input Monitoring (for global hotkey hooks).
  • Voxit uses request buttons to guide you through the permission prompts in sequence (Microphone → Accessibility → Input Monitoring); grant each permission and re-check when prompted.
  • Verify the paste flow after granting the permissions, and restart the app if needed.

The app saves updates to the same config.toml path when settings are changed.
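For scripts or tooling that need to locate the file, the path above can be derived from `HOME` using only the standard library. This is a small sketch; `config_path_in` and `config_path` are hypothetical helpers, and Voxit itself may resolve the directory differently (for example through a platform-directories crate).

```rust
use std::path::{Path, PathBuf};

/// Build the config path under a given home directory.
fn config_path_in(home: &Path) -> PathBuf {
    home.join("Library")
        .join("Application Support")
        .join("voxit")
        .join("config.toml")
}

/// Resolve the config path from the HOME environment variable, if it is set.
fn config_path() -> Option<PathBuf> {
    std::env::var_os("HOME").map(|home| config_path_in(Path::new(&home)))
}
```

Splitting the pure path construction from the environment lookup keeps the first function easy to test with any directory.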

Interaction

Runtime behavior

  • Start recording: press the configured hotkey (default Ctrl+Shift+Space); in hold mode, keep it pressed while speaking.
  • While listening: panel shows live draft text and committed segments.
  • Stop recording: press the hotkey again in toggle mode, or release it in hold mode.
  • Finalize: Pass-2 runs automatically; rewrite runs by default unless disabled in settings.
  • Microphone input selection is persisted in config as audio.input_device_id and audio.input_device_name.
  • Refresh workflow: the microphone picker lists input-capable devices; the list is refreshed at startup and whenever you use the Refresh microphones control.
  • Runtime fallback: if a saved explicit device id is unavailable, Voxit falls back to the system default input device and continues recording.
  • Paste behavior: by default, the rewritten text is pasted after finalize; the raw transcript can be pasted instead via the available controls.
  • Output target: text is pasted into the app that was frontmost when dictation started.
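The runtime fallback for microphone selection described above could look roughly like the sketch below. It assumes that `input_device_id = 0` means "no explicit device", which is a guess based on the default config value, and it uses hypothetical names (`InputDevice`, `Selection`, `select_input`) rather than Voxit's real API.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
struct InputDevice {
    id: u32,
    name: String,
}

#[derive(Debug, PartialEq, Eq)]
enum Selection<'a> {
    /// The saved device is present in the current device list.
    Saved(&'a InputDevice),
    /// No explicit device saved, or the saved one is unavailable.
    SystemDefault,
}

/// Pick the saved device when it is available, otherwise use the system default.
/// Assumption: `saved_id == 0` means that no explicit device was chosen.
fn select_input<'a>(devices: &'a [InputDevice], saved_id: u32) -> Selection<'a> {
    if saved_id == 0 {
        return Selection::SystemDefault;
    }
    devices
        .iter()
        .find(|device| device.id == saved_id)
        .map(Selection::Saved)
        .unwrap_or(Selection::SystemDefault)
}
```

Returning `SystemDefault` instead of an error matches the documented behavior that recording continues when the saved device has disappeared.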

Update

Changelog

Development

Architecture

Implementation snapshot

  • eframe/egui panel + menubar entrypoint.
  • Dedicated auth/session/config/rewrite/paste pipeline and typed application state.
  • macOS frontmost-app capture + clipboard/command-paste integration.

Support Me

If you find this project helpful and would like to support its development, you can buy me a coffee!

Your support is greatly appreciated and motivates me to keep improving this project.

  • Fiat
  • Crypto
    • Bitcoin
      • bc1pedlrf67ss52md29qqkzr2avma6ghyrt4jx9ecp9457qsl75x247sqcp43c
    • Ethereum
      • 0x3e25247CfF03F99a7D83b28F207112234feE73a6
    • Polkadot
      • 156HGo9setPcU2qhFMVWLkcmtCEGySLwNqa3DaEiYSWtte4Y

Thank you for your support!

Appreciation

We would like to extend our heartfelt gratitude to the following projects and contributors:

  • The Rust community for their continuous support and development of the Rust ecosystem.

Additional Acknowledgements

  • Not yet populated.

License

Licensed under GPL-3.0.

About

Voxly.to — Speak. It types. It polishes.

Generated from hack-ink/vibe-mono