
BigOBench

BigOBench assesses the capacity of Large Language Models (LLMs) to comprehend time-space computational complexity of input or generated code.

Install / Use

/learn @facebookresearch/BigOBench

README

<h1 align="center"> <!-- <p><b><i>BigO(Bench)</b></i></p> --> <img style="width: 500px" src="./docs/images/logo_transparent.png" alt="logo"> </h1> <div align="center" style="line-height: 1;"> <a href="https://facebookresearch.github.io/BigOBench" target="_blank" style="margin: 2px; text-decoration: none !important;"><img alt="HomePage" src="https://img.shields.io/badge/🏡%20HomePage-BigOBench-green" style="display: inline-block; vertical-align: middle;"/></a> <a href="https://facebookresearch.github.io/BigOBench/leaderboard.html" target="_blank" style="margin: 2px; text-decoration: none !important;"><img alt="Leaderboard" src="https://img.shields.io/badge/🏆%20Leaderboard-BigOBench-yellow" style="display: inline-block; vertical-align: middle;"/></a> <a href="https://facebookresearch.github.io/BigOBench/demo.html" target="_blank" style="margin: 2px; text-decoration: none !important;"><img alt="Explorer" src="https://img.shields.io/badge/🔎%20Explorer-BigOBench-white" style="display: inline-block; vertical-align: middle;"/></a> </div> <div align="center" style="line-height: 1;"> <a href="https://github.com/facebookresearch/BigOBench" target="_blank" style="margin: 2px; text-decoration: none !important;"><img alt="Github" src="https://img.shields.io/badge/Github-facebookresearch/BigOBench-black?logo=github" style="display: inline-block; vertical-align: middle;"/></a> <a href="https://huggingface.co/datasets/facebook/BigOBench" target="_blank" style="margin: 2px; text-decoration: none !important;"><img alt="HuggingFace" src="https://img.shields.io/badge/🤗%20HuggingFace-facebook/BigOBench-ffc107" style="display: inline-block; vertical-align: middle;"/></a> </div> <div align="center" style="line-height: 1;"><a href="https://arxiv.org/abs/2503.15242" target="_blank" style="margin: 2px; text-decoration: none !important;"><img alt="ArXiv" src="https://img.shields.io/badge/arXiv-2503.15242-b5212f?logo=arxiv" style="display: inline-block; vertical-align: middle;"/></a> </div> 
<h2 align="center"> <p><i>Can LLMs Generate Code with <br> Controlled Time and Space Complexity?</i></p> </h2>

👋 Overview

> [!NOTE]
> Significant refactoring efforts have been made to enhance the usability and clarity of our codebase for public users. As we continue to identify and address any bugs, we will be pushing regular patches. If you encounter any issues or spot a bug, please don't hesitate to reach out – we would be delighted to investigate and resolve it promptly.

🧐 Introduction <sub><sup>(back to top)</sup></sub>


<span style="font-variant: small-caps;"><b>BigO(Bench)</b></span> is a benchmark of ~300 code problems to be solved in Python that evaluates whether LLMs can find the time-space complexity of code solutions, or generate code solutions that themselves respect a time-space complexity requirement. This benchmark addresses a gap in current evaluations, which often overlook a model's ability to comprehend and produce code constrained by computational complexity. <span style="font-variant: small-caps;"><b>BigO(Bench)</b></span> includes a complexity inference framework that can run any Python code snippet, measure multiple runtime and memory-footprint values, and infer its algorithmic time-space complexity. It also includes a set of 3,105 coding problems and 1,190,250 solutions from Code Contests, annotated with (synthetic) time and space complexity labels inferred by the complexity framework, as well as corresponding runtime and memory-footprint values for a large set of input sizes.
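The measurement-and-inference idea can be sketched in plain Python: time a routine at several input sizes and keep the complexity class whose predicted growth best matches the observations. Everything below (the `measure` helper, the candidate classes, the spread heuristic) is an illustrative sketch, not BigOBench's actual framework:

```python
import math
import time

def measure(fn, sizes):
    """Time fn on a list input of each size; return (n, seconds) pairs."""
    samples = []
    for n in sizes:
        data = list(range(n))
        start = time.perf_counter()
        fn(data)
        samples.append((n, time.perf_counter() - start))
    return samples

# Candidate growth functions of n (an illustrative subset of classes).
CLASSES = {
    "O(1)": lambda n: 1.0,
    "O(n)": lambda n: float(n),
    "O(n log n)": lambda n: n * math.log(n),
    "O(n^2)": lambda n: float(n * n),
}

def infer_class(samples):
    """Pick the class whose observed/predicted ratio is most stable
    (smallest relative spread) across the measured sizes."""
    best, best_spread = None, float("inf")
    for name, growth in CLASSES.items():
        ratios = [t / growth(n) for n, t in samples if t > 0]
        mean = sum(ratios) / len(ratios)
        spread = max(abs(r - mean) for r in ratios) / mean
        if spread < best_spread:
            best, best_spread = name, spread
    return best

def pairwise_sums(xs):          # deliberately quadratic
    return sum(a + b for a in xs for b in xs)

print(infer_class(measure(pairwise_sums, [200, 400, 800, 1600])))
# usually reports "O(n^2)" for this quadratic routine
```

The real framework additionally tracks memory footprint and a much wider range of complexity classes; see src/complexity/README.md.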

🙌 Project Overview <sub><sup>(back to top)</sup></sub>

Our project contains the following modules, each documented in its own README file!

📋 Getting Started with the CODE <sub><sup>(back to top)</sup></sub>

To clone the repository, run

```bash
git clone git@github.com:facebookresearch/BigOBench.git
cd BigOBench
```

To install everything at once and run the BigOBench benchmark:

```bash
cd src
bash create_bigobench_env.sh
```

Then navigate to src/README.md to learn how to run the full BigOBench evaluation pipeline.

Otherwise, if you are specifically interested in one of our modules, you can install the dependencies of each module separately:

  • For the complexity framework:

    ```bash
    cd src/complexity
    bash create_complexity_env.sh
    ```

    You can then navigate to src/complexity/README.md to get to know the complexity framework.

  • For the inference engine:

    ```bash
    cd src/inference
    bash create_vllm_env.sh
    ```

    You can then navigate to src/inference/README.md to get to know the inference engine.

  • For the evaluation harness:

    ```bash
    cd src/eval
    bash create_eval_env.sh
    ```

    You can then navigate to src/eval/README.md to get to know the evaluation harness.

📚 Getting Started with the DATA <sub><sup>(back to top)</sup></sub>

The data is available in five .jsonl files hosted on HuggingFace Datasets.
You can download them directly from the HuggingFace website, or use the CLI.
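Once downloaded, each line of a .jsonl file is a standalone JSON record. A minimal reader can be sketched as follows; note that the `problem_id` and `time_complexity` field names here are illustrative placeholders, not the dataset's confirmed schema (check the HuggingFace dataset card for the real fields):

```python
import json
from pathlib import Path

def read_jsonl(path):
    """Yield one dict per non-empty line of a JSON-Lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

# Tiny self-contained demo; real files come from facebook/BigOBench.
demo = Path("demo.jsonl")
demo.write_text(
    json.dumps({"problem_id": "p1", "time_complexity": "O(n)"}) + "\n",
    encoding="utf-8",
)
records = list(read_jsonl(demo))
print(records[0]["time_complexity"])  # -> O(n)
```

Streaming line by line rather than loading whole files matters here: with over a million annotated solutions, some of the files are large.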
