CLUEbenchmark / CLUE - Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
FreedomIntelligence / Awesome AI4Med - A curated list of medical LLMs, multimodal systems, datasets, benchmarks, and more. 🏥
github / CodeSearchNet - Datasets, tools, and benchmarks for representation learning of code.
beir-cellar / Beir - A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
snap-stanford / Ogb - Benchmark datasets, data loaders, and evaluators for graph machine learning
Thinklab-SJTU / Bench2Drive - [NeurIPS 2024 Datasets and Benchmarks Track] Closed-Loop E2E-AD Benchmark Enhanced by World Model RL Expert
ChineseGLUE / ChineseGLUE - Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus and leaderboard
RoboVerseOrg / RoboVerse - RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
doc-analysis / TableBank - TableBank: A Benchmark Dataset for Table Detection and Recognition
RuihengZhang / IFSOD Dataset - Dataset introduced in "A Benchmark and Frequency Compression Method for Infrared Few-Shot Object Detection"
YerevaNN / Mimic3 Benchmarks - Python suite to construct benchmark machine learning datasets from the MIMIC-III 💊 clinical database.
EpistasisLab / Pmlb - PMLB: A large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms.
LAMDA-Tabular / TALENT - A comprehensive toolkit and benchmark for tabular data learning, featuring 35+ deep methods, more than 10 classical methods, and 300 diverse tabular datasets.
pangeo-data / WeatherBench - A benchmark dataset for data-driven weather forecasting
bcmi / Image Harmonization Dataset IHarmony4 - [CVPR 2020] The first large-scale public benchmark dataset for image harmonization, with the code used in our paper "DoveNet: Deep Image Harmonization via Domain Verification" (CVPR 2020). Useful for image harmonization, image composition, etc.
RobustBench / Robustbench - RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track]
syncora-ai / Syncora Benchmarks - A lightweight, plug-and-play benchmark kit for synthetic data. Compare Syncora against other generators (e.g., Gretel, MostlyAI) by dropping in CSVs, then auto-compute fidelity and similarity metrics. Works with any dataset via simple file naming; no heavy setup needed.
google-research / Nasbench - NASBench: A Neural Architecture Search Dataset and Benchmark
DataScienceUIBK / Rankify - 🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation 🔥. The toolkit integrates 40 pre-retrieved benchmark datasets and supports 7+ retrieval techniques, 24+ state-of-the-art re-ranking models, and multiple RAG methods.
OpenDriveLab / OpenLane V2 - [NeurIPS 2023 Track Datasets and Benchmarks] OpenLane-V2: The First Perception and Reasoning Benchmark for Road Driving