SediNet
Deep learning framework for optical granulometry (estimation of sedimentological variables from sediment imagery)
SediNet: Build your own sediment descriptor
<!-- ______ ______ _____ __ __ __ ______ ______ --> <!--/\ ___\ /\ ___\ /\ __-. /\ \ /\ "-.\ \ /\ ___\ /\__ _\ --> <!--\ \___ \ \ \ __\ \ \ \/\ \ \ \ \ \ \ \-. \ \ \ __\ \/_/\ \/ --> <!-- \/\_____\ \ \_____\ \ \____- \ \_\ \ \_\\"\_\ \ \_____\ \ \_\ --> <!-- \/_____/ \/_____/ \/____/ \/_/ \/_/ \/_/ \/_____/ \/_/ -->

By Dr Daniel Buscombe
daniel@mardascience.com
About SediNet
A configurable machine-learning framework for estimating continuous variables, categorical variables, or both, from a photographic image of clastic sediment. It has wide potential application, even to subpixel imagery and complex mixtures, because the dimensions of the grains are not measured directly or indirectly; instead, a machine learning algorithm, which you train on examples of your own data, learns a mapping from the image to the requested outputs.
For more details, please see the paper:
Buscombe, D. (2019). SediNet: a configurable deep learning model for mixed qualitative and quantitative optical granulometry. Earth Surface Processes and Landforms 45 (3), 638-651. https://onlinelibrary.wiley.com/doi/abs/10.1002/esp.4760
A free preprint is also available on EarthArXiv.
This repository contains code and data to reproduce the above paper, as well as additional examples and Jupyter notebooks that you can run in the cloud and use as templates to build your own SediNet sediment descriptor.
The algorithm implementation has changed since the paper was published, so the results differ slightly, but the concepts, the data, and most everything else have not changed.
SediNet can be configured and trained to estimate:
- up to nine numeric grain-size metrics, in pixels, from a single input image. Grain size is then recovered using the physical size of a pixel (note that SediNet does not help you estimate that). Appropriate metrics include the mean, the median, or any other percentile
- equivalent sieve diameters directly from image features, without the need for area-to-mass conversion formulas and without even knowing the physical size of one pixel. SediNet may also be useful for other distributional metrics such as sorting (standard deviation), skewness, etc.; multiple quantities can potentially be estimated from the same imagery
- categorical variables such as grain shape, population, colour, etc
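Since the first class of metrics above is predicted in pixels, converting to physical units is a single multiplication by the pixel size, which you must measure yourself (for example from a scale bar). A minimal sketch, where the predicted percentiles and the pixel size are made-up illustrative values:

```python
# Convert grain-size percentiles predicted in pixels to millimetres.
# All numbers below are illustrative, not real model output.
predicted_percentiles_px = {"P10": 4.2, "P50": 9.7, "P90": 21.3}  # pixels
mm_per_pixel = 0.1  # measured from a scale bar; SediNet does not estimate this

percentiles_mm = {k: round(v * mm_per_pixel, 3)
                  for k, v in predicted_percentiles_px.items()}
print(percentiles_mm)  # {'P10': 0.42, 'P50': 0.97, 'P90': 2.13}
```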
The motivating idea behind SediNet is community development of tools for information extraction from images of sediment. You can use SediNet "off-the-shelf", or other people's models, or configure it for your own purposes.
<!-- You can even choose to contribute imagery back to the project, so we can build bigger and better models collaboratively. If that sounds like something you would like to do, there is a [special repo](https://github.com/MARDAScience/SediNet-Contrib) for you wonderful people -->

Within this package there are several examples of different ways SediNet can be configured for estimating categorical variables and various numbers of continuous variables.
You can use the models in this repository for your purposes (and you might find them useful because they have been trained on large numbers of images). If that doesn't work for you, you can train SediNet for your own purposes even on small datasets.
The examples have been curated with the following hardware specification in mind: 16 GB RAM, and Nvidia GPU with 11 GB of DDR4 or DDR6 memory (e.g. RTX 2080 Ti). If you have access to larger GPU memory, you can use larger imagery and larger batch sizes and you should achieve better accuracy.
How SediNet works
SediNet is a deep learning model: a type of machine learning model that uses large neural networks to automatically extract features from data in order to make predictions. For imagery, the network layers typically use convolutions, so such models are called convolutional neural networks, or CNNs for short.
CNNs have multiple processing layers (called convolutional layers or blocks) and nonlinear transformations (that include batch normalization, activation, and dropout), with the outputs from each layer passed as inputs to the next. The model architecture is summarised below:
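The block structure just described (convolution, then normalization, then a nonlinear activation) can be illustrated with a toy numpy example. This is not SediNet's actual implementation, just a minimal sketch of what one such block computes on a single-channel image:

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Naive 'valid' 2D convolution (really cross-correlation, as in CNNs)."""
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def conv_block(x, kernel, eps=1e-5):
    """One block: convolution -> normalization -> ReLU activation."""
    y = conv2d_valid(x, kernel)
    y = (y - y.mean()) / np.sqrt(y.var() + eps)  # normalize the feature map
    return np.maximum(y, 0.0)                    # ReLU keeps positive responses

rng = np.random.default_rng(0)
img = rng.random((8, 8))                         # a toy single-channel "image"
edge_kernel = np.array([[1., -1.], [1., -1.]])   # crude vertical-edge detector
features = conv_block(img, edge_kernel)
print(features.shape)  # (7, 7)
```

In a real CNN many such blocks are stacked, each with many learned kernels, and the output of one block is the input to the next.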

SediNet is very configurable, and is designed primarily to be a research tool. There are two in-built model sizes, selected via a true/false shallow option, and numerous options for how to train and treat the data. For example, data inputs can optionally be scaled. Various image sizes can be used. A single batch size may be chosen, or a model might be constructed using multiple batch sizes. It might therefore take some experimentation to achieve optimal results for a particular dataset. Hopefully, this toolbox makes such experimentation straightforward. It isn't always obvious which combination of settings to use, so be prepared to construct models using a variety of settings, then use the model with the best validation scores.
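The settings discussed above are collected in a JSON config file per experiment. The sketch below is hypothetical: the key names are illustrative only and do not necessarily match the schema of the config/*.json files in this repository.

```python
import json

# Hypothetical configuration sketch; key names are illustrative only.
config = {
    "csvfile": "my_dataset.csv",         # table of per-image target variables
    "res_folder": "my_results",          # where weights and plots are written
    "variables": ["P16", "P50", "P84"],  # continuous targets, in pixels
    "shallow": True,                     # the smaller of the two model sizes
    "scale": False,                      # whether to scale the data inputs
    "im_size": 768,                      # square image size, in pixels
    "batch_sizes": [12, 13, 14],         # one model is trained per batch size
}
print(json.dumps(config, indent=2))
```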
<!-- -------------------------------------------------------------------------------- ## Run in your browser! The following links will open jupyter notebooks in Google Colab, which is a free cloud computing service ### Categorical ##### Use SediNet to estimate sediment population [Open this link]() ##### Use SediNet to estimate sediment shape [Open this link]() --> <!-- #### Continuous ##### Sediment grain size prediction (sieve size) on a small population of beach sands [Open this link]() ##### Sediment grain size prediction (9 percentiles of the cumulative distribution) on a small population of beach sands [Open this link]() ##### Sediment grain size prediction (9 percentiles of the cumulative distribution) on a large 400 image dataset [Open this link]() -->

Install and run on your computer
You must have Python 3, pip for Python 3, git, and conda. On Windows, I recommend the latest Anaconda release.
Windows:
git clone --depth 1 https://github.com/MARDAScience/SediNet.git
Linux/Mac:
git clone --depth 1 git@github.com:MARDAScience/SediNet.git
Anaconda/miniconda:
If you do NOT want to use your GPU for computations with tensorflow, edit conda_env/sedinet.yml, replacing tensorflow-gpu with tensorflow. This is NOT recommended for training models, only for using them for prediction.
(if you are a regular or long-term conda user, perhaps this is a good time to conda clean --packages and conda update -n base conda?)
conda env create -f conda_env/sedinet.yml
conda activate sedinet
(Later, when you're done ... conda deactivate)
Train and use the provided example models yourself
The following examples have been selected to demonstrate the range of options you can choose when optimizing a SediNet model for a particular dataset. They serve as a guide, rather than a gallery of the best possible model outcomes. I encourage you to experiment with a few sets of options before deciding on a final configuration and defaults file. Sometimes, using multiple batch sizes can be advantageous.
Continuous
Train SediNet for sediment grain size prediction (9 percentiles of the cumulative distribution) on a large population of 400 images
python sedinet_train.py -c config/config_9percentiles.json
Subsequently predict using:
python sedinet_predict.py -c config/config_9percentiles.json -1 grain_size_global/res/global_9prcs_simo_batch12_im768_768_9vars_pinball_noaug.hdf5 -2 grain_size_global/res/global_9prcs_simo_batch13_im768_768_9vars_pinball_noaug.hdf5 -3 grain_size_global/res/global_9prcs_simo_batch14_im768_768_9vars_pinball_noaug.hdf5
The above model has been trained with multiple batch sizes of 12, 13, and 14, with 768x768 pixel imagery, no augmentation, and no variable scaling.
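When several weights files trained at different batch sizes are passed in, a natural way to combine them is to average their per-image predictions. A minimal numpy sketch of that ensembling idea, using fabricated prediction arrays rather than real model output:

```python
import numpy as np

# Fabricated predictions of 9 grain-size percentiles (columns) for 2 images
# (rows), one array per model trained with a different batch size.
preds_batch12 = np.array([[2., 3., 4., 5., 6., 7., 8., 9., 10.],
                          [1., 2., 3., 4., 5., 6., 7., 8., 9.]])
preds_batch13 = preds_batch12 + 0.2   # fabricated offsets standing in for
preds_batch14 = preds_batch12 - 0.2   # model-to-model variability

# Ensemble estimate: the mean over the three models.
ensemble = np.mean([preds_batch12, preds_batch13, preds_batch14], axis=0)
print(ensemble.shape)  # (2, 9): one row of 9 percentiles per image
```

Averaging several models trained with slightly different settings tends to reduce the variance of the final estimate.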
To use the model to predict on a single image:
python sedinet_predict1image.py -c config/config_9percentiles.json -i images/Cal_16.tif -1 grain_size_global/res/global_9prcs_simo_batch12_im768_768_9vars_pinball_noaug.hdf5 -2 grain_size_global/res/global_9prcs_simo_batch13_im768_768_9vars_pinball_noaug.hdf5 -3 grain_size_global/res/global_9prcs_simo_batch14_im768_768_9vars_pinball_noaug.hdf5
To use the model to predict on all images in a folder:
python sedinet_predictfolder.py -c config/config_9percentiles.json -w grain_size_global/res/global_9prcs_simo_batch14_im768_768_9vars_pinball_noaug.hdf5 -i images/
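Internally, per-folder prediction amounts to listing the image files and running the model on each one. A self-contained sketch with a stand-in predictor (the real script loads the trained network instead, and the file names here are throwaway examples):

```python
import glob
import os
import tempfile

def predict_image(path):
    """Stand-in for the real model; returns a fabricated median grain size."""
    return {"image": os.path.basename(path), "P50_px": 9.7}

# Build a throwaway folder of empty "images" so the sketch runs end to end.
folder = tempfile.mkdtemp()
for name in ("a.jpg", "b.jpg"):
    open(os.path.join(folder, name), "w").close()

# Glob the folder and predict on every image, in a stable order.
results = [predict_image(f)
           for f in sorted(glob.glob(os.path.join(folder, "*.jpg")))]
print(results)
```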
Train SediNet for sediment grain size prediction (4 percentiles of the cumulative distribution plus sieve size) on a small population of beach sands
python sedinet_train.py -c config/config_sievedsand_sieve_plus.json
Subsequently predict using:
python sedinet_predict.py -c config/config_sievedsand_sieve_plus.json -w grain_size_sieved_sands/res_sieve_plus/sievesand_sieve_plus_simo_batch8_im512_512_6vars_pinball_aug_scale.hdf5
