75 skills found · Page 1 of 3
langroid / Langroid - Harness LLMs with Multi-Agent Programming
claw-eval / Claw Eval - Claw-Eval is an evaluation harness for evaluating LLMs as agents. All tasks are verified by humans.
suyoumo / OpenClawProBench - OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.
digiteinfotech / Kairon - Agentic AI platform that harnesses Visual LLM Chaining to build proactive digital assistants
XiaoxinHe / TAPE - Official Implementation of ICLR 2024 paper "Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning"
bolt-foundry / Gambit - Agent harness framework for building, running, and verifying LLM workflows
junchenzhi / Awesome LLM Ensemble - A curated list of Awesome-LLM-Ensemble papers for the survey "Harnessing Multiple Large Language Models: A Survey on LLM Ensemble"
bmd1905 / ChatOpsLLM - To simplify and streamline LLM operations, empowering developers and organizations to harness the full potential of large language models with ease.
lt-asset / Resym - For our CCS24 paper 🏆 "ReSym: Harnessing LLMs to Recover Variable and Data Structure Symbols from Stripped Binaries" by Danning Xie, Zhuo Zhang, Nan Jiang, Xiangzhe Xu, Lin Tan, and Xiangyu Zhang. 🏆 ACM SIGSAC Distinguished Paper Award Winner
1038lab / ComfyUI SparkTTS - ComfyUI-SparkTTS is a custom ComfyUI node implementation of SparkTTS, an advanced text-to-speech system that harnesses the power of large language models (LLMs) to generate highly accurate and natural-sounding speech.
UnicomAI / Hexagent - HexAgent – An Agent harness that gives any LLM a computer to complete tasks the way humans do
project-etalon / Etalon - LLM Serving Performance Evaluation Harness
logic-star-ai / Swt Bench - [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation
poloclub / ClickDiffusion - ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing
deeplearning-wisc / Haloscope - Source code for the NeurIPS'24 paper "HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection"
ritun16 / Llm Text Summarization - A comprehensive guide and codebase for text summarization using Large Language Models (LLMs). Dive into techniques, from chunking to clustering, and harness the power of LLMs like GPT-3.5 and GPT-4.
boramorka / LLM Book - This book is a comprehensive manual designed to empower professionals to harness the potential of AI technologies responsibly and innovatively. The book addresses the technical, ethical, and practical aspects of AI development, offering a roadmap for those looking to advance in the rapidly evolving field of LLM Ops.
sfw / Loom - Loom is an AI harness that can be used with local or cloud LLMs for complex tasks. It decomposes work, drives execution through a verification harness, and keeps models on track with structured state instead of history. It routes between thinking and acting models, verifies outputs, and exposes an APP/TUI/API/CLI/MCP for both humans and agents.
GregorStocks / Mage Bench - Magic: The Gathering harness for LLMs
peteromallet / Megaplan - General-purpose planning and execution harness for LLMs: structured phases, critique, gating, and review