A simple worker which fetches a prompt job every 2s and processes it through an AI endpoint.
Tested only with LM Studio model API
This is a multirepo project, and depends on the Async Queue.
- Async: Process jobs from a queue.
- Multiple instance There's no limit on the amount of simultaneous active workers, each will fetch a different Job.
- Private/Public Each worker can be assigned to users(Giving priority), once there's no exclusive jobs for the worker... It will fetch non assigned prompts(Public)
git clone https://github.com/Angel-del-dev/Async-AI-Prompt-Local-Worker.git async-ai-worker
cd async-ai-worker
cp .env-example .env
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt--psql
-- Create a worker
|> insert into workers(created_at) values (current_timestamp);sqlalchemypandaspython-dotenvpsycopg2-binaryrequests
(venv) |> python3 main.py- Python (Version3.13.5)
- PostgreSQL engine.
- LM Studio (Either GUI or headless)
- AI Model (Installed via LM Studio)
LM Studio API seems to give better results when Enable thinking is disabled in inference/Custom Fields
Project startup depends on .env variables.
| Var | Description | Type | Required | Domain |
|---|---|---|---|---|
DB_HOST |
Database host. | string |
true |
DB |
DB_NAME |
Database name. | string |
true |
DB |
DB_USER |
Database username. | string |
true |
DB |
DB_PASSWORD |
Database password. | string |
true |
DB |
DB_PORT |
Database port. | int |
true |
DB |
WORKER |
UUID obtained when inserting the new worker in the DB. | string |
true |
ENV |
MODEL |
Model code given by LM Studio(Ex: qwen/qwen3.5-9b, google/gemma-4-e4b). | string |
true |
AI |
MODEL_URL |
LM Studio local API Url. | string |
true |
LM_STUDIO |