Skip to content
@gpustack

GPUStack

GPU cluster manager for optimized AI model deployment

Pinned Loading

  1. gpustack gpustack Public

    A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for high-performance AI model deployment.

    Python 4.7k 484

  2. runner runner Public

    Collection of Dockerfiles to build images for various inference services across different accelerated backends.

    Dockerfile 10 9

  3. runtime runtime Public

    Provides a unified interface to detect GPU resources and manages GPU workloads.

    Python 11 14

  4. gguf-parser-go gguf-parser-go Public

    Review/Check GGUF files and estimate the memory usage and maximum tokens per second.

    Go 250 24

  5. vox-box vox-box Public

    A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.

    Python 201 33

Repositories

Showing 10 of 15 repositories

Most used topics

Loading…