SegCaps
A clone of the original SegCaps source code, with enhancements for the MS COCO dataset.
Capsules for Object Segmentation (SegCaps)
by Rodney LaLonde and Ulas Bagci
Modified by Cheng-Lin Li
Objectives: Build an end-to-end pipeline for object segmentation experiments with SegCaps, covering not only 3D CT images (LUNA16) but also 2D color images (MS COCO 2017) on binary image segmentation tasks.
This repository was forked from the official SegCaps implementation, with program restructuring and enhancements.
The original paper for SegCaps can be found at https://arxiv.org/abs/1804.04241.
The original source code can be found at https://github.com/lalonderodney/SegCaps
Author's project page for this work can be found at https://rodneylalonde.wixsite.com/personal/research-blog/capsules-for-object-segmentation.
Getting Started Guide
This is the presentation file for this project.
This is my experiment to test SegCaps Net R3. I overfit the model on a single image, then tested how the model performed as the image orientation was changed. <img src="imgs/overfit-test.png" width="900px"/> Pre-trained weights are included in 'data/saved_models/segcapsr3/split-0_batch-1_shuff-1_aug-0_loss-mar_slic-1_sub--1_strid-1_lr-0.01_recon-20.0_model_20180723-235354.hdf5'
Enhancements & Modifications
- The program was modified to support Python 3.6 on Ubuntu 18.04 and Windows 10.
- Supports not only 3D computed tomography (CT) scan images but also 2D Microsoft Common Objects in COntext (MS COCO) dataset images.
- Changed the dice loss function from the Sørensen coefficient to the Jaccard coefficient for similarity comparison.
- Added a Kfold parameter so users can customize the cross-validation task. K = 1 forces the model to overfit.
- Added a retrain parameter so users can reload pre-trained weights and retrain the model.
- Added an initial learning rate parameter for users to adjust.
- Added a steps-per-epoch parameter for users to adjust.
- Added a patience parameter for early stopping of training.
- Added a 'bce_dice' loss function: binary cross-entropy plus soft dice coefficient.
- Revised the 'train', 'test', and 'manip' options from 0/1 arguments to presence/absence flags that control the behavior of the main program.
- Added a new webcam integration program for video stream segmentation.
- Accepts images of any size; the program automatically resizes them to 512 x 512 resolution.
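The 'bce_dice' loss listed above combines binary cross-entropy with a soft dice term. The NumPy sketch below illustrates the idea only; the names, smoothing constant, and equal weighting of the two terms are assumptions, not the repo's exact Keras implementation:

```python
import numpy as np

def bce_dice_loss(y_true, y_pred, smooth=1.0, eps=1e-7):
    """Illustrative binary cross-entropy + soft dice loss (not the repo's code)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # Binary cross-entropy, averaged over all pixels.
    bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    # Soft dice coefficient: 2*|A.B| / (|A| + |B|), smoothed to avoid 0/0.
    intersection = np.sum(y_true * y_pred)
    dice = (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)
    # Minimizing (1 - dice) maximizes overlap; BCE sharpens per-pixel probabilities.
    return bce + (1.0 - dice)

y = np.array([0.0, 1.0, 1.0, 0.0])
print(bce_dice_loss(y, np.array([0.01, 0.99, 0.99, 0.01])))  # near-perfect: small loss
print(bce_dice_loss(y, np.array([0.5, 0.5, 0.5, 0.5])))      # uninformative: larger loss
```

A near-perfect prediction drives both terms toward zero, while a flat 0.5 prediction is penalized by both.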
Procedures
1. Download this repo to your own folder
1-1. Download this repository via https://github.com/Cheng-Lin-Li/SegCaps/archive/master.zip
1-2. Extract the zip file into a folder.
1-3. Change your current directory to project folder.
cd ./SegCaps-master/SegCaps-master
2. Install Required Packages on Ubuntu / Windows
This code is written for Keras using the TensorFlow backend. The requirements.txt installs the CPU version of TensorFlow by default. You may need to adjust the requirements.txt file for your environment (CPU-only or GPU TensorFlow installation).
Please install all required packages before using programs.
pip install -r requirements.txt
You may need to install an additional library on Ubuntu 17 or later.
If you get the following error:
ImportError: libjasper.so.1: cannot open shared object file: No such file or directory
These steps will resolve it:
sudo apt-get update
sudo apt-get install libjasper-dev
3. Make your data directory.
3-1. Create a root folder named 'data' in the repo folder. All models, results, etc. are saved to this root directory.
3-2. Create 'imgs' and 'masks' folders for image and mask files.
3-3. If you would like to use the data folder that comes with this repo, leave it as is.
The commands below create this structure:
mkdir data
chmod 755 data
cd ./data
mkdir imgs
mkdir masks
chmod 755 *
cd ..
4. Select Your dataset
4-1. Test the result on the original LUNA16 dataset.
- Go to the LUng Nodule Analysis 2016 (LUNA16) Grand Challenge website.
- Register for an account.
- Join the 'LUNA16' challenge by clicking 'All Challenges' in the top tab, then click 'Join' and go to the 'Download' section to get your data.
- Copy your image files into BOTH the ./data/imgs and ./data/masks folders.
4-2. Test on the Microsoft Common Objects in COntext (MS COCO) 2017 dataset.
The repo includes a crawler program to download images of your chosen classes for training, but you have to download the annotation file first.
Click Microsoft COCO 2017 to download it.
The zip file contains two JSON files. Extract them into a folder.
In this example, these two annotation files were extracted into the folder ~/SegCaps/annotations/
Example 1: Download 10 images and mask files with 'person' class from MS COCO validation dataset.
cd ./cococrawler
$python3 getcoco17.py --data_root_dir ../data --category person --annotation_file ./annotations/instances_val2017.json --number 10
Example 2: Download image IDs 22228 and 178040 with mask images for only the person class from the MS COCO 2017 training dataset.
cd ./cococrawler
$python3 getcoco17.py --data_root_dir ../data/coco --category person --annotation_file ./annotations/instances_train2017.json --number 10 --id 22228 178040
You can choose multiple classes if you want; just separate the category names with spaces.
Example: --category person dog cat
Try the command below to list all parameters of the crawler program.
python3 getcoco17.py -h
usage: getcoco17.py [-h] [--data_root_dir DATA_ROOT_DIR]
[--category CATEGORY [CATEGORY ...]]
[--annotation_file ANNOTATION_FILE]
[--resolution RESOLUTION] [--id ID [ID ...]]
[--number NUMBER]
Download COCO 2017 image Data
optional arguments:
-h, --help show this help message and exit
--data_root_dir DATA_ROOT_DIR
The root directory for your data.
--category CATEGORY [CATEGORY ...]
MS COCO object categories list (--category person dog
cat). default value is person
--annotation_file ANNOTATION_FILE
The annotation json file directory of MS COCO object
categories list. file name should be
instances_val2017.json
--resolution RESOLUTION
The resolution of images you want to transfer. It will
be a square image.Default is 0. resolution = 0 will
keep original image resolution
--id ID [ID ...] The id of images you want to download from MS COCO
dataset.Number of images is equal to the number of
ids. Masking will base on category.
--number NUMBER The total number of images you want to download.
4-3. Test on your own dataset.
The program has only been tested on the LUNA16 and MS COCO 2017 datasets, but it can support your own dataset too.
4-3-1. For 2D images
- Dimension of images: (Width, Height, Channels)
- Channels = 1 or 3
- Program parameters: --dataset mscoco17
4-3-2. For 3D images
- Dimension of images: (Width, Height, Slices)
- Slices = 1 (default) or an integer (number of slices to include for training/testing)
- Program parameters: --dataset luna16 --slices 1
4-3-3. Mask files
The program only supports binary image segmentation.
Each pixel of the mask should be either 0 (background, black) or 1 (foreground, white).
Dimension of masks: (Width, Height, 1)
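If your masks use an arbitrary grayscale range (e.g. 0-255), they can be binarized to the expected {0, 1} values before training. This is a hypothetical NumPy helper for illustration, not part of the repo; the function name and threshold are assumptions:

```python
import numpy as np

def binarize_mask(mask, threshold=127):
    """Map a grayscale mask to {0, 1} with shape (Width, Height, 1)."""
    mask = np.asarray(mask)
    if mask.ndim == 2:
        # Add the trailing channel axis expected by the program.
        mask = mask[..., np.newaxis]
    assert mask.ndim == 3 and mask.shape[-1] == 1, "expected shape (W, H, 1)"
    # Pixels above the threshold become foreground (1), the rest background (0).
    return (mask > threshold).astype(np.uint8)

m = binarize_mask(np.array([[0, 200], [64, 255]]))
print(m.shape)        # (2, 2, 1)
print(np.unique(m))   # [0 1]
```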
5. Train your model
5-1 Main File
From the main file (main.py) you can train, test, and manipulate the segmentation capsules of various networks. Simply set the --train, --test, or --manip flags to turn these on respectively. The argument --data_root_dir is the only required argument and should be set to the directory containing your 'imgs' and 'masks' folders. There are many more arguments that can be set and these are all explained in the main.py file.
Please be aware that the manipulate function only supports 3D images; this version of the source code does NOT support 2D images for manipulation.
The program will convert all image files into numpy format then store training/testing images into ./data/np_files and training (and testing) file lists under ./data/split_list folders. You need to remove these two folders every time if you want to replace your training image and mask files. The program will only read data from np_files folders.
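For instance, the cached data can be cleared like this before swapping in new image and mask files (paths as stated above):

```shell
# Remove cached numpy data and split lists so new images/masks are re-read.
rm -rf ./data/np_files ./data/split_list
```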
5-2 Train your model:
The program uses KFold cross-training and testing, with K = 4 as the default. If you have fewer than 4 image files, please indicate the number of image files you have.
Example: if you have only 2 images and indicate --Kfold 2, then 1 image file is used for training and 1 image file for testing.
Example command: Train SegCaps R3 on MS COCO dataset without GPU support. Assume you have 4 or more images.
python3 ./main.py --train --initial_lr 0.1 --net segcapsr3 --loss dice --data_root_dir=data --which_gpus=-2 --gpus=0 --dataset mscoco17
Example command: Train basic Capsule Net on MS COCO dataset with GPU support. Number of GPU = 1. K = 1 = You only train your model on one image for overfitting test.
python3 ./main.py --train --data_root_dir=data --net capsbasic --initial_lr 0.0001 --loglevel 2 --Kfold 1 --loss dice --dataset mscoco17 --recon_wei 20 --which_gpu -1 --gpus 1 --aug_data 0
5-3 Test your model:
For testing you will also have to specify the number of Kfolds as above if you have fewer than 4 images.
Again, the program will convert all image files into numpy format and store the testing images under the ./data/np_files folder.