An LLM semantic caching system that improves user experience by serving cached query-result pairs to reduce response time.
Redis Vector Library (RedisVL) -- the AI-native Python client for Redis.
mimir is a drop-in proxy that caches LLM API responses using semantic similarity, reducing costs and latency for repeated or similar queries.
SmarterRouter: An intelligent LLM gateway and VRAM-aware router for Ollama, llama.cpp, and OpenAI. Features semantic caching, model profiling, and automatic failover for local AI labs.
Reliable and Efficient Semantic Prompt Caching with vCache
One API for 25+ LLM providers, including OpenAI, Anthropic, Bedrock, and Azure. Caching, guardrails, and cost controls. A Go-native alternative to LiteLLM and Kong AI Gateway.
Redis Vector Library (RedisVL) -- the AI-native Java client for Redis.
A RAG-based chatbot that incorporates a semantic cache and guardrails.
This repository contains sample code demonstrating how to implement a verified semantic cache using Amazon Bedrock Knowledge Bases to prevent hallucinations in Large Language Model (LLM) responses while improving latency and reducing costs.
High-performance LLM query cache with semantic search. Reduces API costs by 80% and cuts latency from 8.5s to 1ms using Redis plus a Qdrant vector DB. Multi-provider support (OpenAI, Anthropic).
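The Redis + Qdrant pairing described above usually means an exact-match key-value tier in front of a vector-similarity tier. Below is a rough sketch of that two-tier flow, not that project's actual code: the collection name, 0.9 similarity threshold, embedding model, and call_llm() stub are all assumptions for illustration.

```python
# Sketch of a two-tier cache: exact hits in Redis, semantic near-matches in Qdrant.
# Collection name, threshold, model, and call_llm() are illustrative assumptions.
import hashlib
import uuid

import redis
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

r = redis.Redis()                      # tier 1: exact-match key-value store
qdrant = QdrantClient(":memory:")      # tier 2: vector store for near-matches
encoder = SentenceTransformer("all-MiniLM-L6-v2")
qdrant.create_collection(
    collection_name="llm_cache",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

def call_llm(prompt: str) -> str:
    return f"LLM answer for: {prompt}"  # placeholder for a real provider call

def complete(prompt: str, threshold: float = 0.9) -> str:
    key = "llm:" + hashlib.sha256(prompt.encode()).hexdigest()
    if (hit := r.get(key)) is not None:            # exact repeat of a prior prompt
        return hit.decode()
    vec = encoder.encode(prompt, normalize_embeddings=True).tolist()
    found = qdrant.search(collection_name="llm_cache", query_vector=vec, limit=1)
    if found and found[0].score >= threshold:      # semantically similar prompt
        return found[0].payload["response"]
    response = call_llm(prompt)                    # miss on both tiers
    r.set(key, response)
    qdrant.upsert(
        collection_name="llm_cache",
        points=[PointStruct(
            id=str(uuid.uuid5(uuid.NAMESPACE_URL, prompt)),
            vector=vec,
            payload={"response": response},
        )],
    )
    return response
```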
Enhance LLM retrieval performance with Azure Cosmos DB Semantic Cache. Learn how to integrate and optimize caching strategies in real-world web applications.
Ultra-fast Semantic Cache Proxy written in pure C
Redis Vector Similarity Search, Semantic Caching, Recommendation Systems and RAG
VCAL Core — high-performance semantic cache and vector cache library for LLM applications.
A chatbot using Redis Vector Similarity Search that recommends blogs based on user prompts.
Optimized RAG Retrieval with Indexing, Quantization, Hybrid Search and Caching
Adaptive semantic cache for LLMs with streaming support, ML-based thresholds, and real-time cost tracking. Built in Rust for sub-millisecond performance.
An operating system for autonomous AI agents — 5-tier cache-first routing (97.5% cost reduction), Ed25519 constitution enforcement, 130 agents, 106 plugins. Rust.
Semantic cache layer for LLM APIs — embed prompts locally, find near-matches, skip redundant LLM calls.
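That last description captures the core loop most of these projects share: embed the prompt locally, look for a near-match among cached prompts, and only call the LLM on a miss. A minimal, self-contained sketch of that loop follows; the model name, 0.9 threshold, and call_llm() stub are assumptions for illustration, not any listed project's code.

```python
# Minimal semantic cache: embed locally, dot-product near-match, skip the LLM on a hit.
# Model name, threshold, and call_llm() are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small local embedding model
cache: list[tuple[np.ndarray, str]] = []         # (prompt embedding, cached response)

def call_llm(prompt: str) -> str:
    return f"LLM answer for: {prompt}"           # placeholder for a real provider call

def cached_completion(prompt: str, threshold: float = 0.9) -> str:
    query = model.encode(prompt, normalize_embeddings=True)
    # With normalized vectors, cosine similarity reduces to a dot product.
    for vec, response in cache:
        if float(np.dot(query, vec)) >= threshold:
            return response                      # semantic hit: skip the LLM call
    response = call_llm(prompt)                  # miss: call the model, store the pair
    cache.append((query, response))
    return response

print(cached_completion("How do I reset my password?"))
print(cached_completion("What's the way to reset my password?"))  # likely served from cache
```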