Results for "llm-harness"

75 skills found · Page 1 of 3

langroid / Langroid

4.0k

Harness LLMs with Multi-Agent Programming

universal

agents, ai, chatgpt, +15

Updated 1d ago

claw-eval / Claw Eval

387

Claw-Eval is a harness for evaluating LLMs as agents. All tasks are verified by humans.

universal

agent, harness, llm, +1

Updated 2h ago

suyoumo / OpenClawProBench

325

OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.

universal

agent, benchmark, evaluation, +4

Updated 32m ago

digiteinfotech / Kairon

274

Agentic AI platform that harnesses Visual LLM Chaining to build proactive digital assistants

universal

bot, bot-framework, botkit, +17

Updated 9h ago

XiaoxinHe / TAPE

268

Official Implementation of ICLR 2024 paper "Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning"

universal

Updated 8d ago

bolt-foundry / Gambit

227

Agent harness framework for building, running, and verifying LLM workflows

universal

Updated 1d ago

junchenzhi / Awesome LLM Ensemble

215

A curated list of Awesome-LLM-Ensemble papers for the survey "Harnessing Multiple Large Language Models: A Survey on LLM Ensemble"

universal

ensemble, ensemble-learning, ensemble-machine-learning, +11

Updated 6d ago

bmd1905 / ChatOpsLLM

136

Simplifies and streamlines LLM operations so that developers and organizations can harness the full potential of large language models with ease.

universal

chatbot, devops, langfuse, +5

Updated 7d ago

lt-asset / Resym

130

For our CCS24 paper 🏆 "ReSym: Harnessing LLMs to Recover Variable and Data Structure Symbols from Stripped Binaries" by Danning Xie, Zhuo Zhang, Nan Jiang, Xiangzhe Xu, Lin Tan, and Xiangyu Zhang. 🏆 ACM SIGSAC Distinguished Paper Award Winner

universal

binaryanalysis, llm, llm4code, +1

Updated 15h ago

1038lab / ComfyUI SparkTTS

125

ComfyUI-SparkTTS is a custom ComfyUI node implementation of SparkTTS, an advanced text-to-speech system that harnesses the power of large language models (LLMs) to generate highly accurate and natural-sounding speech.

universal

comfyui, comfyui-nodes, sparktts, +1

Updated 16d ago

UnicomAI / Hexagent

109

HexAgent – An Agent harness that gives any LLM a computer to complete tasks the way humans do

universal

agent-harness, agents, cowork

Updated 4d ago

project-etalon / Etalon

LLM Serving Performance Evaluation Harness

universal

latency, llm-inference, throughput, +1

Updated 10d ago

logic-star-ai / Swt Bench

[NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation

universal

benchmark, evaluation-framework, llm, +1

Updated 6d ago

poloclub / ClickDiffusion

ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing

universal

Updated 1mo ago

deeplearning-wisc / Haloscope

Source code for the NeurIPS'24 paper "HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection"

universal

Updated 22d ago

ritun16 / Llm Text Summarization

A comprehensive guide and codebase for text summarization using Large Language Models (LLMs). Dive into techniques, from chunking to clustering, and harness the power of LLMs like GPT-3.5 and GPT-4.

universal

Updated 1mo ago

boramorka / LLM Book

This book is a comprehensive manual designed to empower professionals to harness the potential of AI technologies responsibly and innovatively. The book addresses the technical, ethical, and practical aspects of AI development, offering a roadmap for those looking to advance in the rapidly evolving field of LLM Ops.

universal

Updated 6d ago

sfw / Loom

Loom is an AI harness for complex tasks that can be used with local or cloud LLMs. It decomposes work, drives execution through a verification harness, and keeps models on track with structured state instead of history. It can route between thinking and acting models, verify outputs, and expose an APP/TUI/API/CLI/MCP for both humans and agents.

claude code, cursor

adhoc, ai, cowork, +6

Updated 2h ago

GregorStocks / Mage Bench

Magic: The Gathering harness for LLMs

universal

Updated 3d ago

peteromallet / Megaplan

General-purpose planning and execution harness for LLMs — structured phases, critique, gating, and review

universal

Updated 16h ago