BatchConvert

A Nextflow-based tool that wraps bfconvert and bioformats2raw to convert image data collections to OME-TIFF and OME-Zarr, respectively, in parallel.


BatchConvert DOI:10.5281

A command line tool for converting image data into either of the standard file formats OME-TIFF or OME-Zarr.

The tool wraps the dedicated file converters bfconvert and bioformats2raw to convert into OME-TIFF or OME-Zarr, respectively. The workflow management system Nextflow is used to perform conversion in parallel for batches of images.

The tool also wraps s3 and Aspera clients (go-mc and aspera-cli, respectively). Therefore, input and output locations can be specified as local or remote storage and file transfer will be performed automatically. The conversion can be run on HPC with Slurm.

Important note: So far, the package has only been tested on Ubuntu 20.04.

Installation & Dependencies

Installation via conda

A conda package for BatchConvert exists: https://anaconda.org/Euro-BioImaging/batchconvert

Conda installation of BatchConvert to a new conda environment is recommended. Simply run the following command:

conda install -c euro-bioimaging -c conda-forge -c bioconda -c ome batchconvert
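Note that conda install adds the package to whichever environment is currently active. A minimal sketch of creating a dedicated environment instead is given below. The environment name batchconvert is an assumption, and run is a hypothetical helper that only prints the command unless RUN=1 is set, so the snippet is safe to try first.

```shell
# Assumed environment name; any name works.
ENV_NAME="batchconvert"
# Same channels as in the install command above.
CHANNELS="-c euro-bioimaging -c conda-forge -c bioconda -c ome"

# Hypothetical helper: print the command by default, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Create a new environment that contains BatchConvert.
# $CHANNELS is left unquoted on purpose so that it splits into separate arguments.
run conda create -y -n "$ENV_NAME" $CHANNELS batchconvert
```

After the environment has been created, it can be activated with conda activate followed by the chosen environment name.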

Installation from the source

BatchConvert can also be acquired by cloning this repository. In this case, Nextflow, which is the minimal dependency to run BatchConvert, should be installed and made accessible from the command line.

If conda exists on your system, you can install BatchConvert together with Nextflow using the following script:

git clone https://github.com/Euro-BioImaging/BatchConvert.git && \
source BatchConvert/installation/install_with_nextflow.sh

If you already have Nextflow installed and accessible from the command line (or if you prefer to install it manually, e.g., by following the official Nextflow installation instructions), you can also install BatchConvert alone, using the following script:

git clone https://github.com/Euro-BioImaging/BatchConvert.git && \
source BatchConvert/installation/install.sh

Regardless of how BatchConvert has been installed, the specific workflow dependencies (listed below) will be automatically installed:

  • bioformats2raw (entrypoint bioformats2raw)
  • bftools (entrypoint bfconvert)
  • go-mc (entrypoint mc)
  • aspera-cli (entrypoint ascp)

These dependencies will be pulled and cached automatically at the first execution of the conversion command. The mode of dependency management can be specified by using the command line option --profile or -pf. Depending on how this option is specified, the dependencies will be acquired / run either via conda or via docker/singularity containers.

Specifying --profile conda (default) will install the dependencies to an environment at ./.condaCache and use this environment to run the workflow. This option requires that miniconda/anaconda is installed on your system.

Alternatively, specifying --profile docker or --profile singularity will pull a docker or singularity image with the dependencies, respectively, and use this image to run the workflow. These options assume that the respective container runtime (docker or singularity) is available on your system. If singularity is being used, a cache directory will be created at the path ./.singularityCache where the singularity image is stored.

Finally, you can still choose to install the dependencies manually and use your own installations to run the workflow. In this case, you should specify --profile standard and make sure the entrypoints specified above are recognised by your shell.
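Before relying on --profile standard, it can help to confirm that the four entry points listed above are actually visible to the shell. The following is a minimal sketch of such a check; missing_entrypoints is a hypothetical helper and not part of BatchConvert.

```shell
# Entry points required when the dependencies are installed manually.
ENTRYPOINTS="bioformats2raw bfconvert mc ascp"

# Hypothetical helper: print every given command name that is not on PATH.
missing_entrypoints() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# $ENTRYPOINTS is left unquoted so that each name becomes its own argument.
missing=$(missing_entrypoints $ENTRYPOINTS)

if [ -z "$missing" ]; then
    echo "all entry points found"
else
    echo "not on PATH:" $missing
fi
```

If any name is reported as missing, either add its installation directory to PATH or fall back to one of the managed profiles (conda, docker or singularity).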

Configuration

BatchConvert can be configured with default options for file conversion and transfer. The most important parameters to configure are probably the credentials for the remote stores. The easiest way to configure remote stores is to run the interactive configuration commands described below.

Configuration of the s3 object store

Run the interactive configuration command:

batchconvert configure_s3_remote

This will start a sequence of prompts for s3 credentials such as name, url and access key. Enter each requested credential and press Enter. Continue until the process is finished. Upon completing the configuration, the terminal session should look roughly like this:

oezdemir@pc-ellenberg108:~$ batchconvert configure_s3_remote
enter remote name (for example s3)
s3
enter url:
https://s3.embl.de
enter access key:
"your-access-key"
enter secret key:
"your-secret-key"
enter bucket name:
"your-bucket"
Configuration of the default s3 credentials is complete

Configuration of the BioStudies user space

Run the interactive configuration command:

batchconvert configure_bia_remote

This will prompt for the secret directory to connect to. Enter the secret directory for your user space and press Enter. Upon completing the configuration, the terminal session should look roughly like this:

oezdemir@pc-ellenberg108:~$ batchconvert configure_bia_remote
enter the secret directory for BioImage Archive user space:
"your-secret-directory"
configuration of the default bia credentials is complete

Configuration of the slurm options

BatchConvert can also run on Slurm clusters. To configure the Slurm parameters, run the interactive configuration command:

batchconvert configure_slurm

This will start a sequence of prompts for Slurm options. Enter each requested option and press Enter. Continue until the process is finished. Upon completing the configuration, the terminal session should look roughly like this:

oezdemir@pc-ellenberg108:~$ batchconvert configure_slurm
Please enter value for queue_size
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the current value ´50´
s
Please enter value for submit_rate_limit
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the current value ´10/2min´
s
Please enter value for cluster_options
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the current value ´--mem-per-cpu=3140 --cpus-per-task=16´
s
Please enter value for time
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the current value ´6h´
s
configuration of the default slurm parameters is complete
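As a quick sanity check on the cluster_options value shown in the transcript above: Slurm reads --mem-per-cpu in megabytes when no unit suffix is given, so the memory requested per task is the product of the two numbers. The sketch below works this out, assuming the options are applied to each submitted conversion job; the variable names are illustrative only.

```shell
# Values taken from the cluster_options shown in the transcript.
MEM_PER_CPU_MB=3140   # --mem-per-cpu=3140 (megabytes by default)
CPUS_PER_TASK=16      # --cpus-per-task=16

# Memory requested per task is memory per CPU times CPUs per task.
TOTAL_MB=$((MEM_PER_CPU_MB * CPUS_PER_TASK))

echo "memory per task: ${TOTAL_MB} MB"
```

With these values a single job asks for 50240 MB, which is worth comparing against the memory available on the nodes of the target partition.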

Configuration of the default conversion parameters

While all conversion parameters can be specified as command line arguments, it can be useful for the users to set their own default parameters to avoid re-entering those parameters for subsequent executions. BatchConvert allows for interactive configuration of conversion in the same way as configuration of the remote stores described above.

To configure the conversion into OME-TIFF, run the following command:

batchconvert configure_ometiff

This will prompt the user to enter a series of parameters, which will then be saved as the default parameters to be passed to the batchconvert ometiff command. Upon completing the configuration, the sequence of commands should look similar to:

oezdemir@pc-ellenberg108:~$ batchconvert configure_ometiff
Please enter value for noflat
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the parameter´s current value, which is None
s
Please enter value for series
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the parameter´s current value, which is None
s
Please enter value for timepoint
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the parameter´s current value, which is None
s
...
...
...
...
...
...
Configuration of the default parameters for 'bfconvert' is complete

To configure the conversion into OME-Zarr, run the following command:

batchconvert configure_omezarr

Similarly, this will prompt the user to enter a series of parameters, which will then be saved as the default parameters to be passed to the batchconvert omezarr command. Upon completing the configuration, the sequence of commands should look similar to:

oezdemir@pc-ellenberg108:~$ batchconvert configure_omezarr
Please enter value for resolutions_zarr
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the parameter´s current value, which is None
s
Please enter value for chunk_y
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the parameter´s current value, which is None
s
Please enter value for chunk_x
Click enter if this parameter is not applicable
Enter "skip" or "s" if you would like to keep the parameter´s current value, which is None
...
...
...
...
...
...
Configuration of the default parameters for 'bioformats2raw' is complete

It is important to note that the initial defaults for the conversion parameters are the same as the defaults of the backend tools bfconvert and bioformats2raw, which is why the prompts above report the current value as None. Through interactive configuration, the user overrides these initial defaults and sets their own. The initial defaults can be restored by running the following command:

batchconvert reset_defaults

Another important point is that any of these configured parameters can be overridden by passing a value for that parameter on the command line. For instance, in the following command, the value 20 will be assigned to the chunk_y parameter even if the configuration file holds a different value for it.

batchconvert omezarr --chunk_y 20 "path/to/input" "path/to/output"
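The resulting order of precedence (command line first, then the configured default, then the default of the backend tool) can be illustrated with a short sketch. This is not BatchConvert's implementation; resolve_param is a hypothetical helper, and the values 64 and 1024 are placeholders chosen for illustration.

```shell
# resolve_param CLI_VALUE CONFIGURED_VALUE BACKEND_DEFAULT
# Hypothetical helper: print the first non-empty argument.
resolve_param() {
    for value in "$@"; do
        if [ -n "$value" ]; then
            echo "$value"
            return 0
        fi
    done
    return 1
}

resolve_param "20" "64" "1024"   # value given on the command line is used
resolve_param ""   "64" "1024"   # no command line value: configured default is used
resolve_param ""   ""   "1024"   # nothing configured: backend default is used
```

The three calls print 20, 64 and 1024 in that order, one for each level of the fallback chain.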

Examples

Local conversion

Parallel conversion of files to separate OME-TIFFs / OME-Zarrs:

Convert a batch of images on your local storage into OME-TIFF format. Note that the input_path in the command given below is typically a directory with m
