Codescientist

CodeScientist: An automated scientific discovery system for code-based experiments

<div align="center">

CodeScientist

<img src="images/codescientist-flow-diagram.png" style="width: 800px; border: 1px solid lightgray;"> </div>

This is the repository for CodeScientist, an end-to-end semi-automated scientific discovery system that designs, iterates on, and analyzes scientific experiments that can be expressed as (Python) code. CodeScientist generates novel ideas to explore using a genetic-mutation approach: an LLM acts as the mutator, mutating combinations of scientific articles and code examples, where the code examples show, for instance, how to prompt an LLM, make a plot, or use a specific benchmark. The experiment ideas can then be implemented using the Experiment Builder, which automatically creates, runs, and debugs the experiment code in a container. When an experiment completes, CodeScientist writes a report on the results. Usually, CodeScientist makes several (for example, 5) independent attempts at creating experiments for a given idea, and can create a meta-analysis describing the overall results across those attempts.
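The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not CodeScientist's actual implementation: `Idea`, `mutate_with_llm`, `run_experiment_attempt`, and `meta_analyze` are hypothetical names, the LLM call and the container run are replaced with deterministic stubs so the sketch runs offline, and the paper and codeblock names are placeholders.

```python
import random
from dataclasses import dataclass


@dataclass
class Idea:
    description: str
    papers: list[str]
    codeblocks: list[str]


def mutate_with_llm(papers, codeblocks, rng):
    """Hypothetical stand-in for the LLM-as-a-mutator step.

    A real system would prompt an LLM; this stub only samples a
    combination of papers and code examples.
    """
    chosen_papers = rng.sample(papers, k=min(2, len(papers)))
    chosen_blocks = rng.sample(codeblocks, k=min(2, len(codeblocks)))
    description = f"Combine {' + '.join(chosen_papers)} using {', '.join(chosen_blocks)}"
    return Idea(description, chosen_papers, chosen_blocks)


def run_experiment_attempt(idea, attempt_id, max_debug_iterations=3):
    """Hypothetical stand-in for the Experiment Builder (create, run, debug)."""
    # Stub: each attempt "needs" a fixed number of debug iterations to work.
    # Some attempts need more than the budget allows and therefore fail.
    needed = attempt_id % (max_debug_iterations + 1) + 1
    for iteration in range(1, max_debug_iterations + 1):
        if iteration >= needed:
            return {"attempt": attempt_id, "iterations": iteration, "completed": True}
    return {"attempt": attempt_id, "iterations": max_debug_iterations, "completed": False}


def meta_analyze(idea, attempts):
    """Summarize results across the independent attempts for one idea."""
    completed = [a for a in attempts if a["completed"]]
    return {
        "idea": idea.description,
        "attempts": len(attempts),
        "completed": len(completed),
    }


rng = random.Random(0)
papers = ["paper_A", "paper_B", "paper_C"]            # placeholder names
codeblocks = ["llm_prompting", "plotting", "benchmark_X"]  # placeholder names

idea = mutate_with_llm(papers, codeblocks, rng)
attempts = [run_experiment_attempt(idea, attempt_id=i) for i in range(5)]
summary = meta_analyze(idea, attempts)
print(summary)
```

Running several independent attempts per idea and summarizing them afterwards reflects the point made above: a single attempt may fail or be unrepresentative, so the meta-analysis looks across all of them.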

CodeScientist can be run in two modes:

  • Human-in-the-loop: A human helps build code examples, filter experiment ideas to run, and provides short comments on the ideas that might help their implementation. This is the primary mode we report in the paper.
  • Fully-automatic: You can run CodeScientist in fully automatic mode with a few clicks, though it is less efficient at producing scientific results.
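One way to picture the difference between the two modes is as an optional review step between idea generation and experiment building. The sketch below is illustrative only: `select_ideas` and the `reviewer` callback are hypothetical names rather than part of CodeScientist, and the assumption that fully automatic mode runs every idea unfiltered is a simplification.

```python
from typing import Callable, Optional

# A reviewer takes an idea and returns (keep, comment).
Reviewer = Callable[[str], tuple[bool, str]]


def select_ideas(ideas: list[str], reviewer: Optional[Reviewer] = None) -> list[dict]:
    """Return the ideas to run, with any human comments attached."""
    selected = []
    for idea in ideas:
        if reviewer is None:
            # Fully automatic (assumed): run every idea, with no comment.
            selected.append({"idea": idea, "comment": ""})
            continue
        # Human-in-the-loop: the reviewer filters ideas and may add a hint.
        keep, comment = reviewer(idea)
        if keep:
            selected.append({"idea": idea, "comment": comment})
    return selected


def example_reviewer(idea: str) -> tuple[bool, str]:
    # Hypothetical human decision: reject one idea, annotate the rest.
    return (idea != "idea 2", "start with a small pilot run")


ideas = ["idea 1", "idea 2", "idea 3"]
automatic = select_ideas(ideas)
reviewed = select_ideas(ideas, reviewer=example_reviewer)
print(len(automatic), len(reviewed))
```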

What you'll find in this repository:

  • CodeScientist Software: CodeScientist is open source, and this repository includes the full set of software and installation instructions.

  • Reports: The CodeScientist paper highlights a set of 20 candidate discoveries (in Table 4). These are readily available here: Example CodeScientist-Generated Experiment Reports and Code

  • Raw Data: The repository also includes a great deal of raw data: full experiment code, logs, ideas, external reviewer ratings, etc.


<span id='0-paper'/>

0. Paper

CodeScientist is described in the following paper: CodeScientist: End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation (ACL Findings 2025).

<span id='1-quick-start'/>

1. Quick Start

<span id="1-1-i-want-to-read-about-codescientist"/>

1.1. I want to read about CodeScientist

The CodeScientist paper is available here: Section 0. Paper

<span id="1-2-i-want-to-examine-the-papers-code-and-other-results-created-by-codescientist"/>

1.2. I want to examine the papers, code, and other results created by CodeScientist

  • Appendix: A number of the highest quality experimental results (as rated by humans) are in the paper's Appendix.
  • Example Papers: You can also see the above highly rated experimental results (and a number of rejected papers) here: example_papers/
  • Lots of Papers: If you'd like all the details, including both high-quality and low-quality experiments with their papers, code, results, and logs, they are available in bulk here: generated_experments/
<span id="1-3-i-want-to-run-codescientist-on-my-local-machine"/>

1.3. I want to run CodeScientist on my local machine

Please see the installation instructions in Section 3.1. Installation

<span id="1-4-i-would-like-to-use-codescientist-in-my-own-domain"/>

1.4. I would like to use CodeScientist in my own domain.

To use CodeScientist in a subdomain other than the provided domain (i.e. agents and environments), there are two steps:

  • Add papers in the subdomain: This is as easy as pasting arXiv links into the Create New Ideas (from Papers) menu item.
  • Add codeblocks: If you need specialized codeblocks for your domain other than the general ones provided in this repository, simply add them to the codeblocks directory in the required format.

More information on these steps is provided in Section 3. Installation and Running and Section 4. Using CodeScientist

<span id="1-5-i-want-to-manually-provide-codescientist-an-idea-to-create-an-experiment-for"/>

1.5. I want to manually provide CodeScientist an idea to create an experiment for, instead of using LLM-generated ideas.

You can do this by pressing the Create New Experiment (Manual) button on the main menu. More detailed instructions on running CodeScientist are provided in Section 4.2. Create New Experiment (Manual)

<span id="1-6-i-want-to-feed-ideas-into-codescientist-from-another-system"/>

1.6. I want to feed ideas into CodeScientist that were made from some other system.

You can do this in bulk using the Run Benchmark button. For an example of the format CodeScientist expects, see Section 4.8. Pregenerated Ideas/Filtering Ideas Externally, followed by Section 4.5. Run Benchmark

<span id="1-7-how-do-i-use-a-specific-aspect-of-codescientist"/>

1.7. How do I use [specific aspect of CodeScientist]?

Please see the instructions for various components in the detailed usage instructions of Section 4. Usage Instructions

<span id="1-8-i-have-a-question-not-answered-here"/>

1.8. I have a question not answered here.

Please see the documentation below. If your question isn't answered, please open an issue.

<span id="2-example-codescientist-generated-experiment-reports-and-code"/>

2. Example CodeScientist-Generated Experiment Reports and Code

Below are six example experiment reports (and s
