> Use this instead: https://github.com/facebookresearch/maskrcnn-benchmark

# A Pytorch Implementation of Detectron
<div align="center">
  <img src="demo/33823288584_1d21cf0a26_k-pydetectron-R101-FPN.jpg" width="700px"/>
  <p>Example output of <b>e2e_mask_rcnn-R-101-FPN_2x</b> using Detectron pretrained weight.</p>
  <img src="demo/33823288584_1d21cf0a26_k-detectron-R101-FPN.jpg" width="700px"/>
  <p>Corresponding example output from Detectron.</p>
  <img src="demo/img1_keypoints-pydetectron-R50-FPN.jpg" width="700px"/>
  <p>Example output of <b>e2e_keypoint_rcnn-R-50-FPN_s1x</b> using Detectron pretrained weight.</p>
</div>

This code follows the implementation architecture of Detectron. Only part of the functionality is supported. Check this section for more information.
With this code, you can...
- Train your model from scratch.
- Run inference using pretrained weight files (`*.pkl`) from Detectron.
This repository was originally built on jwyang/faster-rcnn.pytorch. However, after many modifications, the structure has changed a lot and is now more similar to Detectron. I deliberately made everything similar or identical to Detectron's implementation, so results can be reproduced directly from official pretrained weight files.
This implementation has the following features:

- It is pure Pytorch code, apart from some custom CUDA ops.
- It supports multi-image batch training.
- It supports multiple-GPU training.
- It supports three pooling methods. Notice that only roi align is revised to match the implementation in Caffe2, so use that one.
- It is memory efficient. For data batching, two techniques are available to reduce memory usage: 1) aspect grouping: group images with similar aspect ratios into a batch; 2) aspect cropping: crop images that are too long. Aspect grouping is implemented in Detectron, so it is used by default. Aspect cropping is an idea from jwyang/faster-rcnn.pytorch and is not used by default.

Besides that, I implement a customized `nn.DataParallel` module which enables different batch blob sizes on different GPUs. Check the My `nn.DataParallel` section for more details.
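The aspect-grouping idea above can be sketched roughly as follows; the helper below is a hypothetical illustration, not the repo's actual batching code:

```python
# Hypothetical sketch of aspect grouping: put "wide" and "tall" images
# into separate batches, so images within a batch have similar aspect
# ratios and per-batch zero-padding stays small.
def group_by_aspect(widths, heights, batch_size):
    wide = [i for i, (w, h) in enumerate(zip(widths, heights)) if w >= h]
    tall = [i for i, (w, h) in enumerate(zip(widths, heights)) if w < h]
    batches = []
    for bucket in (wide, tall):
        for start in range(0, len(bucket), batch_size):
            batches.append(bucket[start:start + batch_size])
    return batches
```

Each batch then only needs to be padded up to the largest image inside it, which is what saves memory compared to mixing orientations.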
## News

- (2018/05/25) Support ResNeXt backbones.
- (2018/05/22) Add group normalization baselines.
- (2018/05/15) PyTorch 0.4 is now supported!
## Getting Started

Clone the repo:

```shell
git clone https://github.com/roytseng-tw/mask-rcnn.pytorch.git
```
## Requirements
Tested under python3.
- python packages
  - pytorch>=0.3.1
  - torchvision>=0.2.0
  - cython
  - matplotlib
  - numpy
  - scipy
  - opencv
  - pyyaml
  - packaging
  - pycocotools: for the COCO dataset, also available from pip
  - tensorboardX: for logging the losses to Tensorboard
- An NVIDIA GPU and CUDA 8.0 or higher. Some operations only have GPU implementations.
- NOTICE: different versions of the Pytorch package have different memory usages.
## Compilation

Compile the CUDA code:

```shell
cd lib  # please change to this directory
sh make.sh
```
If you are using Volta GPUs, uncomment this line in `lib/make.sh` and remember to append a backslash at the end of the line above. `CUDA_PATH` defaults to `/usr/local/cuda`. If you want to use a CUDA library at a different path, change this line accordingly.

It will compile all the modules you need, including NMS, ROI_Pooling, ROI_Crop and ROI_Align. (Actually the GPU NMS is never used ...)

Note that, if you use `CUDA_VISIBLE_DEVICES` to set GPUs, make sure at least one GPU is visible when compiling the code.
## Data Preparation

Create a data folder under the repo:

```shell
cd {repo_root}
mkdir data
```
- **COCO**: Download the coco images and annotations from the coco website, and make sure to put the files in the following structure:

  ```
  coco
  ├── annotations
  │   ├── instances_minival2014.json
  │   ├── instances_train2014.json
  │   ├── instances_train2017.json
  │   ├── instances_val2014.json
  │   ├── instances_val2017.json
  │   ├── instances_valminusminival2014.json
  │   └── ...
  └── images
      ├── train2014
      ├── train2017
      ├── val2014
      ├── val2017
      └── ...
  ```

  Download the coco mini annotations from here. Please note that minival is exactly equivalent to the recently defined 2017 val set. Similarly, the union of valminusminival and the 2014 train set is exactly equivalent to the 2017 train set.
  Feel free to put the dataset at any place you want, and then soft link the dataset under the `data/` folder:

  ```shell
  ln -s path/to/coco data/coco
  ```

  It is recommended to put the images on an SSD for possible better training performance.
## Pretrained Model

I use ImageNet pretrained weights from Caffe for the backbone networks. Download them and put them into `{repo_root}/data/pretrained_model`. You can use the following command to download them all:
- extra required packages: `argparse_color_formater`, `colorama`, `requests`

```shell
python tools/download_imagenet_weights.py
```
NOTE: Caffe pretrained weights have slightly better performance than Pytorch pretrained ones. We suggest using the Caffe pretrained models from the above link to reproduce the results. By the way, Detectron also uses pretrained weights from Caffe.
If you want to use Pytorch pre-trained models, please remember to convert images from BGR to RGB, and also use the same data preprocessing (subtract mean and normalize) as used for the Pytorch pretrained models.
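For illustration, a minimal sketch of that preprocessing, assuming an OpenCV-loaded BGR `uint8` image and the standard torchvision ImageNet mean/std (the function name is made up):

```python
import numpy as np

# Standard ImageNet normalization constants used by torchvision models.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def preprocess_for_torchvision(img_bgr):
    # BGR -> RGB and scale to [0, 1]
    img_rgb = img_bgr[:, :, ::-1].astype(np.float32) / 255.0
    # Subtract mean and normalize per channel
    img_rgb = (img_rgb - IMAGENET_MEAN) / IMAGENET_STD
    # HWC -> CHW, as Pytorch models expect
    return img_rgb.transpose(2, 0, 1)
```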
### ImageNet Pretrained Model provided by Detectron
Besides using the pretrained weights for ResNet above, you can also use the weights from Detectron by changing the corresponding line in the model config file as follows:
```yaml
RESNETS:
  IMAGENET_PRETRAINED_WEIGHTS: 'data/pretrained_model/R-50.pkl'
```
`R-50-GN.pkl` and `R-101-GN.pkl` are required for gn_baselines.

`X-101-32x8d.pkl`, `X-101-64x4d.pkl` and `X-152-32x8d-IN5k.pkl` are required for ResNeXt backbones.
## Training

**DO NOT CHANGE anything in the provided config files (`configs/**/xxxx.yml`) unless you know what you are doing.**

Use the environment variable `CUDA_VISIBLE_DEVICES` to control which GPUs to use.
### Adaptive config adjustment

Let's define some terms first:

- batch_size: `NUM_GPUS` x `TRAIN.IMS_PER_BATCH`
- effective_batch_size: batch_size x `iter_size`
- change of something: new value of something / old value of something

The following config options will be adjusted automatically according to the actual training setup: 1) number of GPUs `NUM_GPUS`, 2) batch size per GPU `TRAIN.IMS_PER_BATCH`, 3) update period `iter_size`.

- `SOLVER.BASE_LR`: adjusted directly proportional to the change of batch_size.
- `SOLVER.STEPS`, `SOLVER.MAX_ITER`: adjusted inversely proportional to the change of effective_batch_size.
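The two scaling rules can be sketched as follows (function and argument names are illustrative, not taken from the repo's config code):

```python
# Sketch of the automatic adjustment described above.
# change = new value / old value: the LR follows batch_size directly,
# while the schedule follows effective_batch_size inversely.
def adjust_config(base_lr, steps, max_iter,
                  old_bs, new_bs, old_eff_bs, new_eff_bs):
    lr = base_lr * (new_bs / old_bs)   # SOLVER.BASE_LR: directly proportional
    scale = old_eff_bs / new_eff_bs    # SOLVER.STEPS, SOLVER.MAX_ITER: inversely proportional
    steps = [int(round(s * scale)) for s in steps]
    return lr, steps, int(round(max_iter * scale))
```

For example, with a hypothetical default of batch_size 16 and iter_size 1, running with `--bs 4 --iter_size 4` keeps the effective batch size at 16: the LR shrinks 4x while `STEPS` and `MAX_ITER` stay unchanged.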
### Train from scratch

Take mask-rcnn with a res50 backbone for example:

```shell
python tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_mask_rcnn_R-50-C4.yml --use_tfboard --bs {batch_size} --nw {num_workers}
```
Use `--bs` to overwrite the default batch size to a proper value that fits into your GPUs. Similarly for `--nw`: the number of data loader threads defaults to 4 in config.py.

Specify `--use_tfboard` to log the losses on Tensorboard.
NOTE: use `--dataset keypoints_coco2017` when training for keypoint-rcnn.
### The use of `--iter_size`

As in Caffe, the network is updated once (`optimizer.step()`) every `iter_size` iterations (forward + backward). This gives a larger effective batch size for training. Notice that the step count is only increased after a network update.
```shell
python tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_mask_rcnn_R-50-C4.yml --bs 4 --iter_size 4
```

`iter_size` defaults to 1.
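A minimal sketch of this accumulation loop (a hypothetical helper, not the repo's actual training loop; assumes a standard Pytorch model and optimizer):

```python
import torch

# Hypothetical sketch of gradient accumulation with iter_size: the
# optimizer updates the network once every iter_size forward/backward
# passes, and the step count advances only on an update.
def train_steps(model, optimizer, loss_fn, batches, iter_size):
    step = 0
    optimizer.zero_grad()
    for i, (x, y) in enumerate(batches, 1):
        loss = loss_fn(model(x), y) / iter_size  # average over the window
        loss.backward()                          # gradients accumulate
        if i % iter_size == 0:
            optimizer.step()                     # one network update
            optimizer.zero_grad()
            step += 1                            # step count after update
    return step
```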
### Finetune from a pretrained checkpoint

```shell
python tools/train_net_step.py ... --load_ckpt {path/to/the/checkpoint}
```

or using Detectron's checkpoint file:

```shell
python tools/train_net_step.py ... --load_detectron {path/to/the/checkpoint}
```
### Resume training with the same dataset and batch size

```shell
python tools/train_net_step.py ... --load_ckpt {path/to/the/checkpoint} --resume
```

When resuming training, the step count and optimizer state are also restored from the checkpoint.
