Results for "benchmark-test"

871 skills found · Page 1 of 30

LearningCircuit / Local Deep Research

4.3k

Local Deep Research achieves ~95% on SimpleQA benchmark (tested with GPT-4.1-mini). Supports local and cloud LLMs (Ollama, Google, Anthropic, ...). Searches 10+ sources - arXiv, PubMed, web, and your private documents. Everything Local & Encrypted.

claude code, claude desktop

academia, anthropic, arxiv, +17

Updated 4h ago

denji / Awesome Http Benchmark

3.7k

HTTP(S) benchmark tools, testing/debugging, and REST API (RESTful)

universal

awesome, awesome-list, benchmark, +15

Updated 1d ago

minitest / Minitest

3.4k

minitest provides a complete suite of testing facilities supporting TDD, BDD, and benchmarking.

universal

minitest, ruby, seattlerb, +2

Updated 1d ago

bojand / Ghz

3.3k

Simple gRPC benchmarking and load testing tool

universal

grpc, hacktoberfest

Updated 6h ago

phoronix-test-suite / Phoronix Test Suite

3.0k

The Phoronix Test Suite is open-source, cross-platform automated testing/benchmarking software.

universal

benchmark, benchmarking, bsd, +6

Updated 1d ago

joedicastro / Vps Comparison

1.4k

A comparison of several VPS providers. It uses Ansible to run a series of automated benchmark tests on the VPS servers that you specify. Anyone who wants to compare these results with their own can reproduce the tests. All the test results are available, to provide independence and transparency.

universal

ansible, cloud, comparison, +6

Updated 2d ago

kubernetes / Perf Tests

969

Performance tests and benchmarks

universal

Updated 2d ago

HewlettPackard / Netperf

955

Netperf is a benchmark that can be used to measure the performance of many different types of networking. It provides tests for both unidirectional throughput and end-to-end latency.

universal

Updated 6d ago

n-st / Nench

914

VPS benchmark script — based on the popular bench.sh, plus CPU and ioping tests, and dual-stack IPv4 and v6 speedtests by default

universal

benchmark, speedtest, vps

Updated 14d ago

microsoft / WindowsAgentArena

849

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.

universal

agentic, ai, ai-agent, +6

Updated 2d ago

maxim-saplin / CrossPlatformDiskTest

838

Windows, macOS and Android storage (HDD, SSD, RAM) speed testing/performance benchmarking app

universal

android, benchmark, desktop, +10

Updated 1d ago

OWASP-Benchmark / BenchmarkJava

786

OWASP Benchmark is a test suite designed to verify the speed and accuracy of software vulnerability detection tools. A fully runnable web app written in Java, it supports analysis by Static (SAST), Dynamic (DAST), and Runtime (IAST) tools that support Java. The idea is that since it is fully runnable and all the vulnerabilities are actually exploitable, it’s a fair test for any kind of vulnerability detection tool. For more details on this project, please see the OWASP Benchmark Project home page.

universal

Updated 6d ago

dotnet / Performance

759

This repo contains benchmarks used for testing the performance of all .NET Runtimes

universal

Updated 4d ago

howardjohn / Gateway Api Bench

723

Gateway API Benchmarks provides a common set of tests to evaluate a Gateway API implementation.

universal

Updated 9h ago

SanMuzZzZz / LuaN1aoAgent

672

LuaN1aoAgent is a cognitive-driven AI hacker. It is a fully autonomous AI penetration testing agent powered by DeepSeek V3.2. Using dual-graph reasoning, LuaN1ao achieves a success rate of over 90% on the XBOW Benchmark, with a median exploit cost of just $0.09.

universal

agents, ai, ai-agents, +14

Updated 4h ago

ServiceNow / AgentLab

556

AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and reproducibility.

universal

agent, agents, benchmark, +6

Updated 11h ago