semantic-caching

Here are 31 public repositories matching this topic...

An intelligent gateway for Claude APIs that dynamically routes requests to the most cost-efficient model, caches responses, and escalates based on confidence signals — reducing LLM spend without sacrificing quality.

  • Updated May 6, 2026
  • Python

A retrieval-augmented generation pipeline in Python with a rigorous offline evaluation harness. Chunks and embeds documents, retrieves by vector similarity, and generates grounded answers — with pluggable LLM providers (including a deterministic local fake for tests) and metrics for retrieval quality and answer faithfulness. No API key required.

  • Updated Jun 1, 2026
  • Python

A systems research platform for semantic KV-cache orchestration, topology-aware memory placement, distributed prefix reuse, and rack-scale inference memory simulation.

  • Updated May 25, 2026
  • Python

Machine-readable companion to the IEEE OJ-CS survey 'Semantic Caching and Response Reuse for Large Language Model Services: A Survey' (Chukkapalli, Mishra, Naik, 2026): 21-work evidence matrix, systematic-search log, proposed benchmark trace schema, stdlib-only contract validator, and CPU pilot. Code MIT; data CC-BY-4.0.

  • Updated Jun 5, 2026
  • Python
