38 skills found · Page 1 of 2
open-compress / Claw Compactor: 14-stage Fusion Pipeline for LLM token compression — reversible compression, AST-aware code analysis, intelligent content routing. Zero LLM inference cost. MIT licensed.
diegosouzapw / OmniRoute: An OpenAI-compatible AI gateway for multi-provider LLMs with smart routing, load balancing, retries, and fallbacks. Add policies, rate limits, caching, and observability for reliable, cost-aware inference.
thushan / Olla: High-performance, lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover, and unified model discovery across local and remote inference backends.
greynewell / Infermux: Route inference across LLM providers. Track cost per request.
T-Sunm / Rag Ops: This project applies the core knowledge from the LLMOps module, including the design and implementation of the API Layer, Inference Layer, Observability Layer, Cache Layer, Guardrails Layer, Routing Layer, and the Data Ingestion Pipeline.
ZhenweiAn / Dynamic MoE: Inference code for the paper "Harder Tasks Need More Experts: Dynamic Routing in MoE Models".
JakeFenley / Koa Zod Router: Build typesafe routes for Koa with ease. Uses TypeScript, Zod, and Koa-Router to provide an easy solution for I/O validation and type inference.
jrf0110 / 8track: A service-worker router with async middleware and neato type inference, inspired by Koa.
nhevers / Mica Plugin: Claude Code plugin that routes compute through MVM nodes on cheap renewable energy. Save tokens, cut inference costs.
shahghasiadil / Laravel Bruno Generator: Generate Bruno API collections from Laravel routes with automatic request body inference and environment support.
pmh / Funkyweb: A Clojure web framework with route inference.
pmerolla / Fomoe: Fast Opportunistic Mixture-of-Experts. From-scratch C/HIP MoE inference with multi-tier caching and cache-aware routing. First ever example of running Qwen3.5-397B at 5–9 tok/s on a $2,100 desktop.
bug-ops / Zeph: Rust AI agent where every context token earns its place. Self-learning skills, temporal graph memory, cascade quality routing, OWASP AI security. Hybrid inference: Ollama · Claude · Gemini · OpenAI · GGUF. MCP + ACP. One binary.
lingticio / Llmg: 🧘 Extensive LLM endpoints, extended capabilities through your favorite protocols: 🕸️ GraphQL, ↔️ gRPC, ♾️ WebSocket. Extended SOTA support for structured data, function calling, instruction mapping, load balancing, grouping, intelli-routing. Advanced tracing and inference tracking.
aiming-lab / CITER: [COLM'25] CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing.
huanyuhello / Awesome Dynamic Inference: A list of dynamic inference research, including dynamic routing, anytime inference, and conditional computation.
ialacol / Text Inference Batcher: A high-performance batching router that optimises maximum throughput for text inference workloads.
www-norma-dev / IONOS Simple Chatbot: IONOS AI Chatbot is a starter pack built around a core ReAct agent, with a FastAPI backend and Streamlit frontend for building intelligent conversational AI. It connects to IONOS Hub for inference models and IONOS Studio for fine-tuned models, and supports real-time web search, model routing, and tool calling. It’s your European go-to solution for Infra
olwal / Scope AI Language: Generative AI plugins for language-driven, real-time video inference and generation. Ollama VLM/LLM pipelines and UDP prompt routing, built on shared libraries for communication and AI services (scope-bus, scope-language).
expresso / Router: Express router with automatic type inference, validation, and OpenAPI documentation generation.