# :snowflake::dragon: cryoDRGN: Deep Reconstructing Generative Networks for cryo-EM and cryo-ET heterogeneous reconstruction
CryoDRGN is a neural network-based algorithm for heterogeneous cryo-EM reconstruction. In particular, the method models a continuous distribution over 3D structures by using a neural network-based representation of the volume.
## Documentation
The latest documentation for cryoDRGN is available in our user guide, including an overview and walkthrough of cryoDRGN installation, training and analysis. A brief quick start is provided below.
For any feedback, questions, or bugs, please file a GitHub issue or start a GitHub discussion.
## Updates in Version 4.2.x
- [NEW] cryoDRGN-AI ab initio reconstruction method integrated into cryoDRGN as `cryodrgn abinit`
- former ab initio reconstruction methods are deprecated as `cryodrgn abinit_het_old` and `cryodrgn abinit_homo_old`
- `cryodrgn analyze`, `landscape`, etc. now support cryoDRGN-AI models as well as the previous cryoDRGN models
- more memory-efficient ab initio reconstruction
- support for Python 3.13 and PyTorch 2.9; PyTorch <2.0 is no longer supported
A full list of cryoDRGN version updates can be found at our release notes.
## Installation
cryodrgn may be installed via pip, and we recommend installing it in a clean conda environment.
Our package is compatible with Python versions 3.10 through 3.13;
we recommend using the latest available Python version:
# Create and activate conda environment
(base) $ conda create --name cryodrgn python=3.13
(base) $ conda activate cryodrgn
# install cryodrgn
(cryodrgn) $ pip install cryodrgn
You can alternatively install a newer, less stable development version of cryodrgn from our beta release channel:
(cryodrgn) $ pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ cryodrgn --pre
More installation instructions can be found in the documentation.
## Quickstart: heterogeneous reconstruction with consensus poses
### 1. Preprocess image stack
First resize your particle images using the cryodrgn downsample command:
<details><summary><code>$ cryodrgn downsample -h</code></summary>
usage: cryodrgn downsample [-h] -D D -o MRCS [--is-vol] [--chunk CHUNK]
                           [--datadir DATADIR]
                           mrcs

Downsample an image stack or volume by clipping fourier frequencies

positional arguments:
  mrcs               Input images or volume (.mrc, .mrcs, .star, .cs, or .txt)

optional arguments:
  -h, --help         show this help message and exit
  -D D               New box size in pixels, must be even
  -o MRCS            Output image stack (.mrcs) or volume (.mrc)
  --is-vol           Flag if input .mrc is a volume
  --chunk CHUNK      Chunksize (in # of images) to split particle stack when
                     saving
  --relion31         Flag for relion3.1 star format
  --datadir DATADIR  Optionally provide path to input .mrcs if loading from a
                     .star or .cs file
  --max-threads MAX_THREADS
                     Maximum number of CPU cores for parallelization (default: 16)
  --ind PKL          Filter image stack by these indices
</details>
We recommend first downsampling images to 128x128 since larger images can take much longer to train:
$ cryodrgn downsample [input particle stack] -D 128 -o particles.128.mrcs
The maximum recommended image size is D=256; if your images are larger than 256x256, downsample them to D=256:
$ cryodrgn downsample [input particle stack] -D 256 -o particles.256.mrcs
The input file format can be a single .mrcs file, a .txt file containing paths to multiple .mrcs files, a RELION
.star file, or a cryoSPARC .cs file. For the latter two options, if the relative paths to the .mrcs are broken,
the argument --datadir can be used to supply the path to where the .mrcs files are located.
If there are memory issues with downsampling large particle stacks, add the --chunk 10000 argument to
save images as separate .mrcs files of 10k images.
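The help text above notes that downsampling works by clipping Fourier frequencies. A minimal NumPy sketch of that idea (this is an illustration of the principle, not cryoDRGN's actual implementation; `fourier_downsample` is a hypothetical name):

```python
import numpy as np

def fourier_downsample(img: np.ndarray, new_D: int) -> np.ndarray:
    """Downsample a square image by keeping only the central (low-frequency)
    block of its Fourier transform -- the idea behind `cryodrgn downsample`."""
    D = img.shape[0]
    assert new_D <= D and new_D % 2 == 0
    ft = np.fft.fftshift(np.fft.fft2(img))      # move zero frequency to center
    start = (D - new_D) // 2
    ft_crop = ft[start:start + new_D, start:start + new_D]  # clip high freqs
    out = np.fft.ifft2(np.fft.ifftshift(ft_crop)).real
    return out * (new_D / D) ** 2               # rescale for FFT normalization

rng = np.random.default_rng(0)
img = rng.standard_normal((256, 256))
small = fourier_downsample(img, 128)
print(small.shape)  # (128, 128)
```

Because only the high frequencies are discarded, the mean intensity (the DC component) of the image is preserved by this operation.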
### 2. Parse image poses from a consensus homogeneous reconstruction
CryoDRGN expects image poses to be stored in a binary pickle format (.pkl). Use the parse_pose_star or
parse_pose_csparc command to extract the poses from a .star file or a .cs file, respectively.
Example usage to parse image poses from a RELION 3.1 starfile:
$ cryodrgn parse_pose_star particles.star -o pose.pkl
Example usage to parse image poses from a cryoSPARC homogeneous refinement particles.cs file:
$ cryodrgn parse_pose_csparc cryosparc_P27_J3_005_particles.cs -o pose.pkl -D 300
Note: The -D argument should be the box size of the consensus refinement (and not the downsampled
images from step 1) so that the units for translation shifts are parsed correctly.
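For orientation, here is a sketch of what the pose `.pkl` contains. The exact layout is an assumption based on recent cryoDRGN versions (verify against your version's docs): a `(rotations, translations)` tuple, with rotations as N x 3 x 3 matrices and translations as N x 2 in-plane shifts expressed as fractions of the box size.

```python
import io
import pickle
import numpy as np

# Assumed pose .pkl layout: (rotations, translations) tuple.
N = 5
rots = np.tile(np.eye(3, dtype=np.float32), (N, 1, 1))  # identity orientations
trans = np.zeros((N, 2), dtype=np.float32)              # zero shifts

buf = io.BytesIO()
pickle.dump((rots, trans), buf)   # same round-trip parse_pose_* performs to disk
buf.seek(0)
r, t = pickle.load(buf)
print(r.shape, t.shape)  # (5, 3, 3) (5, 2)
```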
### 3. Parse CTF parameters from a .star/.cs file
CryoDRGN expects CTF parameters to be stored in a binary pickle format (.pkl).
Use the parse_ctf_star or parse_ctf_csparc command to extract the relevant CTF parameters from a .star file
or a .cs file, respectively.
Example usage for a .star file:
$ cryodrgn parse_ctf_star particles.star -o ctf.pkl
If the box size and Angstrom/pixel values are not included in the .star file under fields _rlnImageSize and
_rlnImagePixelSize respectively, the -D and --Apix arguments to parse_ctf_star should be used instead to
provide the original parameters of the input file (before any downsampling):
$ cryodrgn parse_ctf_star particles.star -D 300 --Apix 1.03 -o ctf.pkl
Example usage for a .cs file:
$ cryodrgn parse_ctf_csparc cryosparc_P27_J3_005_particles.cs -o ctf.pkl
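As a rough sketch of what the CTF `.pkl` holds (an assumption based on recent cryoDRGN versions; verify against your version's docs), each particle gets one row of CTF parameters in a single float array:

```python
import numpy as np

# Assumed CTF .pkl contents: an N x 9 float array with columns
# [D, Apix, defocusU (A), defocusV (A), defocus angle (deg),
#  voltage (kV), spherical aberration (mm), amplitude contrast,
#  phase shift (deg)].
N = 3
ctf = np.zeros((N, 9), dtype=np.float32)
ctf[:, 0] = 300.0    # box size of the original (pre-downsampling) images
ctf[:, 1] = 1.03     # pixel size in Angstrom
ctf[:, 2:4] = 15000  # defocusU / defocusV
ctf[:, 5] = 300.0    # accelerating voltage
ctf[:, 6] = 2.7      # spherical aberration (Cs)
ctf[:, 7] = 0.1      # amplitude contrast
print(ctf.shape)  # (3, 9)
```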
### 4. (Optional) Test pose/CTF parameter parsing
Next, test that the pose and CTF parameters were parsed correctly by running a quick voxel-based backprojection. The goal is to verify, before training, that there are no major problems with the extracted values and that the output structure resembles the structure from the consensus reconstruction.
Example usage:
$ cryodrgn backproject_voxel projections.128.mrcs \
--poses pose.pkl \
--ctf ctf.pkl \
-o backproject.128 \
--first 10000
The output structure backproject.128/backproject.mrc will not be identical to the consensus reconstruction because we
only used the first 10k particle images for quicker results.
If the structure is too noisy to interpret, you can use more images with --first 25000 or use the
entire particle stack (by leaving off the --first flag).
Note: If the volume does not resemble your structure, you may need to use the flag --uninvert-data.
This flips the data sign (e.g. light-on-dark or dark-on-light), which may be needed depending on the
convention used in upstream processing tools.
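To build intuition for the sign convention, here is a toy heuristic (not cryoDRGN's actual check; the function name is hypothetical) that compares the particle region at the box center against the background near the edge:

```python
import numpy as np

def particle_brighter_than_background(img: np.ndarray) -> bool:
    """Toy heuristic: compare the mean intensity of a central disk, where
    the particle sits, against a ring near the box edge. If the result
    disagrees with the convention the rest of your pipeline assumes,
    --uninvert-data may be needed."""
    D = img.shape[0]
    y, x = np.ogrid[:D, :D]
    r = np.hypot(y - D / 2, x - D / 2)
    center_mean = img[r < D / 4].mean()    # particle region
    edge_mean = img[r > 0.45 * D].mean()   # background ring
    return bool(center_mean > edge_mean)

# synthetic example: a bright square "particle" on a dark background
img = np.zeros((64, 64))
img[24:40, 24:40] = 1.0
print(particle_brighter_than_background(img))  # True
```

Negating the image (`-img`) flips the answer, which is exactly the sign flip that `--uninvert-data` applies to the whole stack.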
### 5. Running cryoDRGN heterogeneous reconstruction
When the input images (.mrcs), poses (.pkl), and CTF parameters (.pkl) have been prepared, a cryoDRGN model can be trained with the following command:
<details><summary><code>$ cryodrgn train_vae -h</code></summary>
usage: cryodrgn train_vae [-h] -o OUTDIR --zdim ZDIM --poses POSES [--ctf pkl]
[--load WEIGHTS.PKL] [--checkpoint CHECKPOINT]
[--log-interval LOG_INTERVAL] [-v] [--seed SEED]
[--ind PKL] [--uninvert-data] [--no-window]
[--window-r WINDOW_R] [--datadir DATADIR] [--lazy]
[--max-threads MAX_THREADS]
[--tilt TILT] [--tilt-deg TILT_DEG] [-n NUM_EPOCHS]
[-b BATCH_SIZE] [--wd WD] [--lr LR] [--beta BETA]
[--beta-control BETA_CONTROL] [--norm NORM NORM]
[--no-amp] [--multigpu] [--do-pose-sgd]
[--pretrain PRETRAIN] [--emb-type {s2s2,quat}]
[--pose-lr POSE_LR] [--enc-layers QLAYERS]
[--enc-dim QDIM]
[--encode-mode {conv,resid,mlp,tilt}]
[--enc-mask ENC_MASK] [--use-real]
[--dec-layers PLAYERS] [--dec-dim PDIM]
[--pe-type {geom_ft,geom_full,geom_lowf,geom_nohighf,linear_lowf,gaussian,none}]
[--feat-sigma FEAT_SIGMA] [--pe-dim PE_DIM]
[--domain {hartley,fourier}]
[--activation {relu,leaky_relu}]
particles
Train a VAE for heterogeneous reconstruction with known pose
positional arguments:
particles Input particles (.mrcs, .star, .cs, or .txt)
optional arguments:
-h, --help show this help message and exit
-o OUTDIR, --outdir OUTDIR
Output directory to save model
  --zdim ZDIM           Dimension of latent variable
[...]
</details>