
Whole Cell Model - Escherichia coli

Notice: This repository contains previous release snapshots of the Covert Lab's Whole Cell Model for Escherichia coli. For the most recent versions of the E. coli whole-cell model that are undergoing active development, please visit the wcEcoli and the vivarium-ecoli repositories. This repository should only be used for the purpose of replicating model outputs generated for previous publications listed below. We do not plan to merge Pull Requests into this repository except documentation and installation fixes.

You can reach us at WholeCellTeam.

This repository contains code for the following publications:

Setup

See docs/README.md for docs on how to set up and run the model.

In short, there are two ways to set up and run the model: in a Docker container or in a pyenv Python virtual environment. Docker containers are easier to build and are isolated from your development computer, but they run slower. (PyCharm should support debugging into a Docker container, but we haven't tested that.) pyenv virtual environments take more steps to build and depend on your computer's OS, but they are lighter weight and easier to debug.

With Docker, you can start running a simulation with these steps:

  1. Create a GitHub personal access token with at least the read:packages permission selected.
  2. Place the token in github_personal_access_token.txt.
  3. Log in to docker.pkg.github.com:
    cat github_personal_access_token.txt | docker login https://docker.pkg.github.com -u USERNAME --password-stdin
    
    You should see an output message like "Login Succeeded".
  4. Pull the Docker image:
    docker pull docker.pkg.github.com/covertlab/wholecellecolirelease/wcm-full:latest
    
  5. Run the Docker container:
    docker run --name=wcm -it --rm docker.pkg.github.com/covertlab/wholecellecolirelease/wcm-full
    
  6. Inside the container, run the model:
    python runscripts/manual/runSim.py
    

Quick start

When running this code, prepare with these steps (the wcm-code Docker container already prepares this for you):

  1. cd to the top level of your wcEcoli directory.

  2. Set the $PYTHONPATH:

    export PYTHONPATH="$PWD"
    
  3. In the wcEcoli directory, compile the Cython code:

    make clean compile
    

Ways to run the model:

  1. Use the manual runscripts.

    They run each step directly in-process, which is particularly handy to use with a debugger. But you're responsible for properly sequencing all the steps: parameter calculation, cell simulation generations, and analyses. The manual runscripts work with a Docker container and also with a pyenv virtual environment.

  2. Queue up a Fireworks workflow, then run it.

    You configure it for the desired variants, number of generations, and other options, then Fireworks will automatically run all the steps including parameter calculation, simulations, and all the analysis plots.

    The workflow tasks can be distributed over multiple processes or even multiple computers, but they must all have access to a shared file system such as NFS and to the pyenv virtual environment (or copies of it). We have not tested Fireworks with Docker containers.

  3. Run on the Google Cloud Platform using Docker containers and our custom workflow software.

  4. Use the multi-scale agent-based framework.

    This can run several cells interactively on a simulated microscope slide.

Using the manual runscripts

These scripts will:

  • run the parameter calculator (ParCa),
  • run cell simulations, and
  • run analysis plots

All these steps run directly, in-process, without any workflow software or MongoDB. This is handy for development, e.g. running under the PyCharm debugger. But you're responsible for running the scripts in order and for re-running the ParCa after relevant code changes.

You can run just the parts you want and rerun them as needed, but the manual scripts don't automate dependency management. It's on you to rerun code when things change, to run runSim before the analyses, and to delete runSim output before running it again. (That last part should be improved! Also note that some analysis scripts get confused if the sim runs are more varied than expected. See Issue #199.)
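The ordering requirement can be captured in a small driver script. What follows is a minimal sketch, not part of the repository: build_pipeline and run_pipeline are hypothetical helper names, only the three script paths and the --generations option come from this README, and the sketch assumes it is launched from the top-level directory with PYTHONPATH already set.

```python
import subprocess
import sys

# Script paths as listed in this README; the order is what matters.
PARCA = "runscripts/manual/runParca.py"
SIM = "runscripts/manual/runSim.py"
ANALYSIS = "runscripts/manual/analysisSingle.py"


def build_pipeline(generations=1):
    """Return the ParCa, simulation, and analysis commands in run order."""
    return [
        [sys.executable, PARCA],
        [sys.executable, SIM, "--generations", str(generations)],
        [sys.executable, ANALYSIS],
    ]


def run_pipeline(commands, runner=subprocess.check_call):
    """Run each command in sequence.

    check_call raises on the first non-zero exit status, so a later step
    never runs against missing or partial output from an earlier one.
    """
    for command in commands:
        runner(command)


# Usage (from the top-level directory): run_pipeline(build_pipeline(generations=2))
```

The runner argument is injectable so the sequence can be inspected or dry-run without launching a simulation. The sketch does not delete earlier runSim output, which remains a manual step.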

These scripts have command line interfaces built on argparse, so you can shorten option names as long as they remain unambiguous, and you can use one-letter forms. For example, --cpus 8, --cpu 8, and -c8 are equivalent.
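The abbreviation behavior comes from argparse itself and can be tried in isolation. The parser below is a toy stand-in for illustration only: the -g short form is an assumption made for this sketch, and the real scripts define many more options.

```python
import argparse

# Toy parser for illustration; not the parser used by the runscripts.
parser = argparse.ArgumentParser()
parser.add_argument("-c", "--cpus", type=int, default=1)
parser.add_argument("-g", "--generations", type=int, default=1)

# The full name, an unambiguous prefix, and the one-letter form
# all set the same attribute, args.cpus.
for argv in (["--cpus", "8"], ["--cpu", "8"], ["-c8"]):
    args = parser.parse_args(argv)
    print(argv, "gives cpus =", args.cpus)
```

argparse rejects a prefix that matches more than one option, so an abbreviation that works for one script may be ambiguous for another script that defines more options.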

NOTE: Use the -h or --help switch to get complete, up-to-date documentation on the command line options. Below are just some of the command line options.

To run the parameter calculator (ParCa), which is needed to prepare data for the simulation:

python runscripts/manual/runParca.py [-h] [--cpus CPUS] [sim_outdir]

To simulate one or more cell generations with optional variants:

python runscripts/manual/runSim.py [-h] [--variant VARIANT_TYPE FIRST_INDEX LAST_INDEX] [--generations GENERATIONS] [--init-sims INIT_SIMS] [--seed SEED] [sim_dir]

To interactively select from the data that is saved during a simulation for visualization:

python runscripts/manual/analysis_interactive.py [-h] [sim_dir]

Running the command without any arguments will populate drop-down menus for each set of simulations in out/, where you can select the desired variant/seed/generation/daughter and view the values that were saved during the simulations. Some simple data processing options are available. This interface mainly lets you select time traces or create scatter plots that can be used to compare different variants, generations, etc.

To run predefined analysis plots on the simulation output in a given sim_dir (use the -h parameter to get complete help on the command line options):

python runscripts/manual/analysisParca.py [-h] [-p PLOT [PLOT ...]] [--cpus CPUS] [sim_dir]

python runscripts/manual/analysisVariant.py [-h] [--plot PLOT [PLOT ...]] [--cpus CPUS] [sim_dir]

python runscripts/manual/analysisCohort.py [-h] [--plot PLOT [PLOT ...]] [--cpus CPUS] [--variant-index VARIANT_INDEX] [--variant-range START_VARIANT END_VARIANT] [sim_dir]

python runscripts/manual/analysisMultigen.py [-h] [--plot PLOT [PLOT ...]] [--cpus CPUS] [--variant-index VARIANT_INDEX] [--seed SEED] [--variant-range START_VARIANT END_VARIANT] [--seed-range START_SEED END_SEED] [sim_dir]

python runscripts/manual/analysisSingle.py [-h] [--plot PLOT [PLOT ...]] [--cpus CPUS] [--variant-index VARIANT_INDEX] [--seed SEED] [--generation GENERATION] [--daughter DAUGHTER] [--variant-range START_VARIANT END_VARIANT] [--seed-range START_SEED END_SEED] [--generation-range START_GENERATION END_GENERATION] [sim_dir]


If you default the analysis parameters, these scripts will pick the latest simulation directory, the first variant, the first generation, and so on. To get full analyses across all variants, generations, etc., run:

  • analysisVariant.py
  • analysisCohort.py for each --variant_index you simulated
  • analysisMultigen.py for each combination of --variant_index and --seed you simulated
  • analysisSingle.py for each combination of --variant_index, --seed, and --generation you simulated
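The list above grows quickly with the number of variants, seeds, and generations. As a rough sketch, the invocations can be enumerated with a short script. analysis_commands is a hypothetical helper, not something the repository provides, and it assumes that variants, seeds, and generations are identified by the integer indexes you simulated.

```python
from itertools import product


def analysis_commands(variants, seeds, generations):
    """List the analysis invocations that cover every simulated combination."""
    variants, seeds, generations = list(variants), list(seeds), list(generations)
    base = "python runscripts/manual/"
    # One variant-level analysis covers the whole simulation directory.
    commands = [base + "analysisVariant.py"]
    # One cohort analysis per variant.
    for v in variants:
        commands.append(f"{base}analysisCohort.py --variant_index {v}")
    # One multigen analysis per (variant, seed) pair.
    for v, s in product(variants, seeds):
        commands.append(
            f"{base}analysisMultigen.py --variant_index {v} --seed {s}"
        )
    # One single-cell analysis per (variant, seed, generation) triple.
    for v, s, g in product(variants, seeds, generations):
        commands.append(
            f"{base}analysisSingle.py --variant_index {v} --seed {s} --generation {g}"
        )
    return commands


# Example: 2 variants, 2 seeds, and 3 generations.
for command in analysis_commands(range(2), range(2), range(3)):
    print(command)
```

For 2 variants, 2 seeds, and 3 generations this produces 1 + 2 + 4 + 12 = 19 commands, which is why a workflow tool is usually preferable for full analysis runs.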

The --plot (or -p) optional parameter lets you pick one or more specific PLOTs to run. The list of PLOTs can include analysis class filenames like aaCounts (or aaCounts.py) and analysis group TAGS like CORE. See the __init__.py file in each analysis class directory for the available analysis classes and group TAGS. The default is to run the DEFAULT tag, which runs the CORE group of plots recommended for everyday development plus any variant-specific plots with the corresponding variant tag.
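To make the selection rule concrete, here is a minimal sketch of how a mixed list of filenames and tags could be expanded. It is not the repository's implementation: resolve_plots is a hypothetical function, and the TAGS contents below are invented for illustration (the filenames are borrowed from examples in this README, but their membership in CORE is an assumption). The real mappings live in each analysis directory's __init__.py.

```python
# Hypothetical stand-in for a TAGS mapping; the real tag names and
# members are defined in the repository's analysis __init__.py files.
TAGS = {
    "CORE": ["aaCounts.py", "compositionFitting.py"],
    "DEFAULT": ["CORE"],
}


def resolve_plots(requested, tags):
    """Expand filenames and group tags into an ordered list of filenames."""
    resolved = []
    for name in requested:
        if name in tags:
            # A tag expands, recursively, to its members.
            members = resolve_plots(tags[name], tags)
        else:
            # Accept both "aaCounts" and "aaCounts.py".
            members = [name if name.endswith(".py") else name + ".py"]
        for member in members:
            if member not in resolved:  # drop duplicates, keep first-seen order
                resolved.append(member)
    return resolved


print(resolve_plots(["DEFAULT"], TAGS))
print(resolve_plots(["aaCounts", "CORE", "figure2e.py"], TAGS))
```

In this sketch a name is treated as a tag only if it appears in the mapping; everything else is assumed to be an analysis filename.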

For example, to run two analysis plots on simulation variant #3 and put a filename prefix "v3_" on their output files (to distinguish them from other analysis runs):

python runscripts/manual/analysisCohort.py --plot compositionFitting.py figure2e.py --variant_index 3 --output_prefix v3_

Set the environment variable DEBUG_GC=1 if you want to check for Python memory leaks when running the analysis plots.

There's another way to run an individual analysis plot:

python models/ecoli/analysis/cohort/transcriptFrequency.py [-h] [-o OUTPUT_PREFIX] [-v VARIANT_INDEX] [sim_dir]

Running a Fireworks workflow

See [wholece
