Benchmark
A time & energy benchmark suite for generative AI
The ML.ENERGY Benchmark
A benchmarking framework for measuring the energy consumption and performance of generative AI models such as Large Language Models (LLMs), Multimodal LLMs (MLLMs), and diffusion models.
You can browse The ML.ENERGY Leaderboard for the latest benchmarking results.
- Overview: Tasks, datasets, runtime
- Data Preparation: Downloading necessary datasets and processing them
- Running Benchmarks: Job generation and manual execution
- Analyzing Results: Analyzing and understanding benchmarking results
Citation
@inproceedings{mlenergy-neuripsdb25,
title={The {ML.ENERGY Benchmark}: Toward Automated Inference Energy Measurement and Optimization},
author={Jae-Won Chung and Jeff J. Ma and Ruofan Wu and Jiachen Liu and Oh Jun Kweon and Yuxuan Xia and Zhiyu Wu and Mosharaf Chowdhury},
year={2025},
booktitle={NeurIPS Datasets and Benchmarks},
}
