Bioformats2raw
Bio-Formats image file format to raw format converter
bioformats2raw converter
Java application to convert image file formats, including .mrxs, to an intermediate Zarr structure compatible with the OME-NGFF specification. The raw2ometiff application can then be used to produce a Bio-Formats 5.9.x ("Faas") or Bio-Formats 6.x (true OME-TIFF) pyramid.
Requirements
As of 0.12.0, Java 11 or later is required to run bioformats2raw.
libblosc (https://github.com/Blosc/c-blosc) version 1.9.0 or later must be installed separately. The native libraries are not packaged with any relevant jars. See also note in jzarr readme (https://github.com/bcdev/jzarr/blob/master/README.md)
- macOS: brew install c-blosc then set JAVA_OPTS=-Djna.library.path=$(echo $(brew --cellar c-blosc)/*/lib/)
- Windows: Pre-built blosc DLLs are available from the Fiji project. Rename the downloaded DLL to blosc.dll and place it in a fixed location, then set JAVA_OPTS="-Djna.library.path=C:\path\to\blosc\folder".
- Ubuntu: apt-get install libblosc1
- RHEL/Rocky Linux: dnf install blosc after enabling the EPEL repository.
- conda: Installing bioformats2raw via conda (see below) will include blosc as a dependency.
If using features that rely on OpenCV (see the Downsampling type section below), minimum supported versions are:
- Ubuntu 18.04
- RHEL 8
- Windows 10
  - expect to see warnings as described in https://github.com/opencv/opencv/issues/20113; these can be ignored
NOTE: If you are setting jna.library.path via the JAVA_OPTS environment variable, make sure the path points to the folder containing the library, not to the library file itself.
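As a minimal sketch of the note above, the folder can be derived from the library's full path with dirname. The library location used here is a hypothetical placeholder, not a path that bioformats2raw or blosc guarantees.

```shell
# Hypothetical full path to the blosc shared library; adjust for your system.
BLOSC_LIB="/usr/local/lib/libblosc.so"

# jna.library.path expects the containing folder, so drop the file name.
BLOSC_DIR="$(dirname "$BLOSC_LIB")"

JAVA_OPTS="-Djna.library.path=${BLOSC_DIR}"
export JAVA_OPTS
printf '%s\n' "$JAVA_OPTS"
```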
Temporary directory usage
Beginning with 0.10.0, if the default temporary directory (usually /tmp/) is mounted as noexec, conversion will fail immediately by default.
CIS security benchmarks recommend that /tmp/ be mounted as noexec; these standards are increasingly
being adopted by major Linux distributions. For certain types of input data (e.g. NDPI, JPEG-XR compression), bioformats2raw needs
to extract a native library from one or more jars to a temporary location. In these cases, the extracted native library must be executable in order
to read image data.
The recommended solution is to choose a different temporary directory by adding -Djava.io.tmpdir=/path/to/alternate/tmp to JAVA_OPTS.
If multiple properties need to be set via JAVA_OPTS, separate them with a space, e.g. JAVA_OPTS="-Djava.io.tmpdir=/path/to/alternate/tmp -Djna.library.path=/path/to/blosc".
If you know this issue does not affect your input data and wish to warn instead of immediately stopping conversion, the --warn-no-exec option can be used.
For input data that relies on native library loading (e.g. NDPI, JPEG-XR compression), using --warn-no-exec instead of specifying an alternate
temporary directory will simply cause the conversion to fail at a later point.
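The following is a rough sketch of how one might check for a noexec temporary directory before converting and then add java.io.tmpdir to any existing JAVA_OPTS. The helper names is_noexec and with_tmpdir are made up for this illustration and are not part of bioformats2raw. The probe simply tries to execute a small script from the directory, which is an assumption about how to detect the condition rather than what bioformats2raw itself does.

```shell
# Return 0 if files in directory $1 cannot be executed (e.g. a noexec mount).
is_noexec() {
  probe="$(mktemp "${1%/}/bf2raw-probe.XXXXXX")" || return 2
  printf '#!/bin/sh\nexit 0\n' > "$probe"
  chmod +x "$probe"
  if "$probe" 2>/dev/null; then rc=1; else rc=0; fi
  rm -f "$probe"
  return "$rc"
}

# Print the current JAVA_OPTS (if any) with java.io.tmpdir=$1 appended,
# separated by a space.
with_tmpdir() {
  if [ -n "${JAVA_OPTS:-}" ]; then
    printf '%s %s\n' "$JAVA_OPTS" "-Djava.io.tmpdir=$1"
  else
    printf '%s\n' "-Djava.io.tmpdir=$1"
  fi
}

# Placeholder path; it must exist and allow execution.
ALT_TMP="/path/to/alternate/tmp"

if is_noexec "${TMPDIR:-/tmp}"; then
  JAVA_OPTS="$(with_tmpdir "$ALT_TMP")"
  export JAVA_OPTS
fi
```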
For additional information, please see:
- https://github.com/glencoesoftware/bioformats2raw/pull/252
- https://github.com/scijava/native-lib-loader/issues/41
- https://forum.image.sc/t/after-omero-server-upgrade-hamamatsu-ndpi-files-display-only-in-low-resolution/81868
- https://forum.image.sc/t/bioformats-unable-to-read-czi-files-with-jpegxr-compression-on-almalinux-os-java-lang-unsatisfiedlinkerror-ome-jxrlib-jxrjni-new-decodecontext-j/82646
Installation
- Download and unpack a release artifact: https://github.com/glencoesoftware/bioformats2raw/releases
- OR, install via conda as described at conda-bioformats2raw.
Development Installation
As of 0.12.0, Java 17 is required to build bioformats2raw.
- Clone the repository:
  git clone https://github.com/glencoesoftware/bioformats2raw.git
- Run the Gradle build as required. A list of available tasks can be found by running:
  ./gradlew tasks
Configuring Logging
Logging is provided using the logback library. The logback.xml file in src/dist/lib/config/ provides a default configuration for the command line tool.
In release and snapshot artifacts, logback.xml is in lib/config/.
You can configure logging by editing the provided logback.xml or by specifying the path to a different file:
JAVA_OPTS="-Dlogback.configurationFile=/path/to/external/logback.xml" \
bioformats2raw ...
Alternatively you can use the --debug flag, optionally writing the stdout to a file:
bioformats2raw /path/to/file.mrxs /path/to/zarr-pyramid --debug > bf2raw.log
The --log-level option takes an slf4j logging level for additional simple logging configuration.
--log-level DEBUG is equivalent to --debug. For even more verbose logging:
bioformats2raw /path/to/file.mrxs /path/to/zarr-pyramid --log-level TRACE
Eclipse Configuration
- Run the Gradle Eclipse task:
  ./gradlew eclipse
- Add the logback configuration in src/dist/lib/config/ to your CLASSPATH.
Usage
Run the conversion:
bioformats2raw /path/to/file.mrxs /path/to/zarr-pyramid
bioformats2raw /path/to/file.svs /path/to/zarr-pyramid
By default, the resolutions will be set so that the smallest resolution is no greater than 256x256.
The target size of the smallest resolution can be configured with --target-min-size, e.g. to ensure
that the smallest resolution is no greater than 128x128:
bioformats2raw /path/to/file.mrxs /path/to/zarr-pyramid --target-min-size 128
bioformats2raw /path/to/file.svs /path/to/zarr-pyramid --target-min-size 128
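To estimate how many resolution levels a given --target-min-size implies, the arithmetic can be sketched as below. This assumes each level halves the width and height of the previous one; the factor of 2 and the function name count_resolutions are assumptions for illustration, and the converter's own calculation may differ in detail.

```shell
# Estimate the number of pyramid levels for an image of $1 x $2 pixels,
# halving both dimensions (assumed scale factor of 2) until neither
# exceeds the target minimum size $3 (default 256).
count_resolutions() {
  width=$1
  height=$2
  min_size=${3:-256}
  levels=1
  while [ "$width" -gt "$min_size" ] || [ "$height" -gt "$min_size" ]; do
    width=$((width / 2))
    height=$((height / 2))
    levels=$((levels + 1))
  done
  printf '%s\n' "$levels"
}

# Example: a 1024x1024 image with the default and with a smaller target.
count_resolutions 1024 1024
count_resolutions 1024 1024 128
```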
Alternatively, the --resolutions option can be passed to specify the exact number of resolution levels:
bioformats2raw /path/to/file.mrxs /path/to/zarr-pyramid --resolutions 6
bioformats2raw /path/to/file.svs /path/to/zarr-pyramid --resolutions 6
Maximum tile dimensions can be configured with the --tile-width and --tile-height options. Defaults can be viewed with
bioformats2raw --help. Be mindful of the downstream workflow when selecting a tile size other than the default.
A smaller than default tile size is rarely recommended.
If the input file has multiple series, a subset of the series can be converted by specifying a comma-separated list of indexes:
bioformats2raw /path/to/file.scn /path/to/zarr-pyramid --series 0,2,3,4
By default, several additional readers are added to the beginning of Bio-Formats' list of reader classes. These readers are considered to be experimental and as a result only a limited range of input data is supported. See the Additional readers section below for more information.
Any of these readers can be excluded with the --extra-readers option:
# only include the reader for .mrxs
bioformats2raw /path/to/file.tiff /path/to/zarr-pyramid --extra-readers com.glencoesoftware.bioformats2raw.MiraxReader
# don't add any additional readers, just use the ones provided by Bio-Formats
bioformats2raw /path/to/file.mrxs /path/to/zarr-pyramid --extra-readers
Reader-specific options can be specified using --options:
bioformats2raw /path/to/file.mrxs /path/to/zarr-pyramid --options mirax.use_metadata_dimensions=false
Be aware when experimenting with different values for --options that the corresponding memo (cache) file may need to be
removed in order for new options to take effect. This file will be e.g. /path/to/.file.mrxs.bfmemo.
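Based on the example path above, the memo file appears to sit next to the input file, with a leading dot and a .bfmemo suffix. A small sketch for locating and removing it follows; the helper name memo_path is hypothetical, and the naming pattern is inferred from that single example.

```shell
# Print the memo file path for input file $1, following the pattern
# /path/to/file.mrxs -> /path/to/.file.mrxs.bfmemo
memo_path() {
  printf '%s/.%s.bfmemo\n' "$(dirname "$1")" "$(basename "$1")"
}

INPUT="/path/to/file.mrxs"
MEMO="$(memo_path "$INPUT")"

# Remove the memo file only if it exists, so new --options take effect.
if [ -f "$MEMO" ]; then
  rm -- "$MEMO"
fi
```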
The output in /path/to/zarr-pyramid can be passed to raw2ometiff to produce
an OME-TIFF that can be opened in ImageJ, imported into OMERO, etc. See
https://github.com/glencoesoftware/raw2ometiff for more information.
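The two-step workflow could be scripted roughly as follows. The raw2ometiff argument order (Zarr directory, then output file) is an assumption here, so check the raw2ometiff documentation before relying on it. The run and convert_to_ometiff helpers and the DRY_RUN variable are hypothetical names for this sketch; by default the commands are only printed.

```shell
# Set DRY_RUN=0 to execute the commands instead of printing them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 = input image, $2 = intermediate Zarr directory, $3 = output OME-TIFF
convert_to_ometiff() {
  run bioformats2raw "$1" "$2" &&
    run raw2ometiff "$2" "$3"
}

convert_to_ometiff /path/to/file.mrxs /path/to/zarr-pyramid /path/to/file.ome.tiff
```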
Compression Options
By default, output is compressed with Blosc using the lz4 codec.
To change the overall compression type, use --compression <type>. Supported types are blosc, zlib, and null (uncompressed).
To change type-specific options, use --compression-properties <key=value>.
Supported options for blosc are:
- cname=<codec>, where the default is cname=lz4. zstd, zlib, blosclz, and lz4hc are also valid values of cname.
- clevel=<level>, where the default is clevel=5. Valid values are integers from 1 to 9 inclusive.
Supported options for zlib are:
- level=<level>, where the default is level=1. Valid values are integers from 1 to 9 inclusive.
There are no supported compression options for type null, as this is uncompressed data.
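When scripting conversions, it can help to check blosc properties against the values listed above before starting a long run. The function below is a hypothetical helper for that purpose, not something bioformats2raw provides.

```shell
# Return 0 if $1 is a listed blosc codec name and $2 is a level from 1 to 9.
valid_blosc_props() {
  case "$1" in
    lz4|zstd|zlib|blosclz|lz4hc) ;;
    *) return 1 ;;
  esac
  case "$2" in
    [1-9]) return 0 ;;
    *) return 1 ;;
  esac
}

if valid_blosc_props zstd 3; then
  printf '%s\n' "--compression blosc --compression-properties cname=zstd --compression-properties clevel=3"
fi
```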
While --compression blosc --compression-properties cname=lz4 --compression-properties clevel=5 is the default,
some datasets perform better in time and/or space with different choices. For workflows where the size of the output Zarr,
total conversion time, and/or time required to decompress a chunk are important, it is a good idea to
benchmark several different options with the real input data being used. See also the Performance section below.
In some tests, we have found that --compression blosc --compression-properties cname=zstd --compression-properties clevel=3
may be a reasonable choice if compressed size is more important than conversion or decompression times.
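A benchmark along the lines suggested above might look like the sketch below. The INPUT and OUT_ROOT paths, the codec and level combinations, and the compression_args helper are illustrative assumptions. The loop only runs conversions when bioformats2raw is on the PATH and the input exists; otherwise it prints the commands it would run.

```shell
INPUT="${INPUT:-/path/to/file.mrxs}"
OUT_ROOT="${OUT_ROOT:-/path/to/benchmarks}"

# Print the blosc compression arguments for codec $1 at level $2.
compression_args() {
  printf '%s\n' "--compression blosc --compression-properties cname=$1 --compression-properties clevel=$2"
}

for combo in lz4:5 zstd:3 blosclz:5; do
  cname="${combo%%:*}"
  clevel="${combo##*:}"
  out="${OUT_ROOT}/${cname}-${clevel}.zarr"
  args="$(compression_args "$cname" "$clevel")"
  if command -v bioformats2raw >/dev/null 2>&1 && [ -e "$INPUT" ]; then
    start=$(date +%s)
    # $args is intentionally unquoted so it splits into separate arguments.
    bioformats2raw "$INPUT" "$out" $args
    end=$(date +%s)
    printf '%s clevel=%s: %ss, %s\n' "$cname" "$clevel" "$((end - start))" "$(du -sh "$out" | cut -f1)"
  else
    printf 'would run: bioformats2raw %s %s %s\n' "$INPUT" "$out" "$args"
  fi
done
```

Comparing elapsed time and output size across runs on the real input data gives a more reliable basis for choosing options than the defaults alone.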
Output Formatting Options
By default, the output of bioformats2raw will be a
Zarr dataset which follows the
metadata conventions defined by the
OME-NGFF 0.4 specification including the
bioformats2raw.layout specification.
Several formatting options can be pass
