diegosouzapw / OmniRoute: OmniRoute is an AI gateway for multi-provider LLMs: an OpenAI-compatible endpoint with smart routing, load balancing, retries, and fallbacks. Add policies, rate limits, caching, and observability for reliable, cost-aware inference.
kubernetes-sigs / Gateway API Inference Extension
lightseekorg / Smg: Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across SGLang, vLLM, TRT-LLM, OpenAI, Gemini & more. Industry-first gRPC pipeline, KV cache-aware routing, chat history, tokenization caching, Responses API, embeddings, WASM plugins, MCP, and multi-tenant auth.
Nayjest / Lm Proxy: OpenAI-compatible HTTP LLM proxy/gateway for multi-provider inference (Google, Anthropic, OpenAI, PyTorch). Lightweight, extensible Python/FastAPI; use it as a library or a standalone service.
inference-gateway / Inference Gateway: An open-source, cloud-native, high-performance gateway unifying multiple LLM providers, from local solutions like Ollama to major cloud providers such as OpenAI, Groq, Cohere, Anthropic, Cloudflare, and DeepSeek.
NightmareAI / Cogflare: A flexible gateway for running ML inference jobs through cloud providers or your own GPU. Powered by Replicate and Cloudflare Workers.
modelgw / Modelgw: Gateway and load balancer for your LLM inference endpoints.
Cognipeer / Console: Operate inference, LLM gateways, vector stores, tracing, guardrails, RAG, config, and incident workflows behind one production-ready console with tenant isolation built in.
inference-gateway / Adk: An Agent Development Kit (ADK) for seamlessly building A2A-compatible agents in Go.
microsoft / InnerEye Gateway: The InnerEye-Gateway is a Windows service that acts as a DICOM endpoint to run inference on https://github.com/microsoft/InnerEye-DeepLearning models.
inference-gateway / Google Calendar Agent: An A2A agent server enabling Google Calendar scheduling, retrieval, and automation.
cameronking4 / Programmatic Tool Calling AI SDK: ⚡ Cut LLM inference costs 80% with Programmatic Tool Calling. Instead of N tool-call round-trips, generate JavaScript that orchestrates tools in a Vercel Sandbox. Supports Anthropic, OpenAI, and 100+ models via AI Gateway; a novel MCP Bridge handles external service integration.
alvaropaco / Haif: Production-ready microservices framework for AI inference over RPC. It provides a Gateway for client requests, an Orchestrator that schedules work, a Registry for model metadata, Workers that run inference, and a full observability stack (Prometheus, Grafana, Loki, Jaeger), all wired together with Docker Compose.
sofianhamiti / Amazon Sagemaker Pipelines Serverless Inference: Deploying a serverless inference service with Amazon SageMaker Pipelines, AWS Lambda, Amazon API Gateway, and the AWS CDK.
busthorne / Simp: A simple point of consumption for text inference providers, and an OpenAI-compatible gateway daemon.
forpublicai / Platform.publicai.co: The API gateway for the Public AI Inference Utility, based on Zuplo.
sofianhamiti / Aws Lambda R Inference: Deploying a serverless R inference service using AWS Lambda, Amazon API Gateway, and the AWS CDK.
savinims / DATAS Causal Discovery: Causal inference tutorials written as part of the Data Analysis Tools for Atmospheric Scientists (DATAS) Gateway.
sofianhamiti / Aws Lambda Multi Model Express Workflow: Deploying a multi-model inference service with AWS Lambda, Synchronous Express Workflows, Amazon API Gateway, and the AWS CDK.
forsterdan51 / Edge Vision AI: A secure, lightweight inference runtime optimized for ARM boards and IoT gateways. It enables real-time computer-vision processing without cloud dependencies.
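A common thread in this list is the OpenAI-compatible endpoint: gateways like OmniRoute, Smg, Lm Proxy, Simp, and Inference Gateway expose the same `/v1/chat/completions` shape, so a client switches providers by changing only the base URL and the gateway-issued key. A minimal sketch of that request shape, assuming a hypothetical gateway at `http://localhost:8080/v1` (the URL, model name, and key below are illustrative, not from any specific project):

```python
import json

# Hypothetical gateway address; any OpenAI-compatible gateway above would fit here.
GATEWAY_BASE_URL = "http://localhost:8080/v1"


def build_chat_request(model, prompt, api_key):
    """Build the URL, headers, and JSON body for an OpenAI-compatible
    chat-completions call routed through a gateway."""
    url = f"{GATEWAY_BASE_URL}/chat/completions"
    headers = {
        # Gateway-issued key; the gateway holds the real provider credentials.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        # Gateways typically route, load-balance, or fall back based on this field.
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body


url, headers, body = build_chat_request("gpt-4o-mini", "Hello", "sk-gateway-key")
print(url)  # http://localhost:8080/v1/chat/completions
```

Because the wire format is fixed, the gateway can transparently add the retries, caching, and fallback behavior these projects advertise without any client-side changes.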