Benchmark

A time & energy benchmark suite for generative AI

Generate Convert Improve

Install / Use

/learn @ml-energy/Benchmark

About this skill

Quality Score

0/100

README

The ML.ENERGY Benchmark

Benchmarking framework for measuring energy consumption and performance of generative AI models like Large Language Models (LLMs), Multimodal LLMs (MLLMs), and Diffusion models.

You can browse The ML.ENERGY Leaderboard for the latest benchmarking results.

Overview: Tasks, datasets, runtime
Data Preparation: Downloading necessary datasets and processing them
Running Benchmarks: Job generation and manual execution
Analyzing Results: Analyzing and understanding benchmarking results

Citation

@inproceedings{mlenergy-neuripsdb25,
    title={The {ML.ENERGY Benchmark}: Toward Automated Inference Energy Measurement and Optimization}, 
    author={Jae-Won Chung and Jeff J. Ma and Ruofan Wu and Jiachen Liu and Oh Jun Kweon and Yuxuan Xia and Zhiyu Wu and Mosharaf Chowdhury},
    year={2025},
    booktitle={NeurIPS Datasets and Benchmarks},
}

Related Skills

node-connect

354.3k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

112.3k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

354.3k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

354.3k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。