MobileNetV3

An implementation of MobileNetV3 in PyTorch

Theory

 The MobileNetV3 paper is Searching for MobileNetV3.

Prepare data

  • CIFAR-10
  • CIFAR-100
  • SVHN
  • Tiny-ImageNet
  • ImageNet: move the validation images into labeled subfolders; you can use the script here.
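
The repository links its own script for the ImageNet step. As a rough illustration only, the sketch below assumes a hypothetical plain-text mapping file in which each line holds an image filename and its class ID separated by whitespace. The function name `move_val_images`, the mapping-file format, and the paths are assumptions and not the interface of the linked script.

```python
import os
import shutil


def move_val_images(val_dir, mapping_file):
    """Move each validation image into a subfolder named after its class.

    Assumes (hypothetically) that every non-empty line of `mapping_file`
    looks like "<image filename> <class id>".
    """
    moved = 0
    with open(mapping_file) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            filename, class_id = parts[0], parts[1]
            src = os.path.join(val_dir, filename)
            if not os.path.isfile(src):
                continue  # image already moved or not present
            class_dir = os.path.join(val_dir, class_id)
            os.makedirs(class_dir, exist_ok=True)
            shutil.move(src, os.path.join(class_dir, filename))
            moved += 1
    return moved
```

The function returns the number of files it moved, so running it a second time on the same folder should report zero. A one-subfolder-per-class layout is what folder-based loaders such as torchvision's `ImageFolder` expect.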

Train

  • Train from scratch:
CUDA_VISIBLE_DEVICES=3 python train.py --batch-size=128 --mode=small \
--print-freq=100 --dataset=CIFAR100 --ema-decay=0 --label-smoothing=0.1 \
--lr=0.3 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
--warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200 --width-multiplier=1 \
-nbd -zero-gamma -mixup

where the parameters have the following meanings:

  • batch-size: number of samples per mini-batch.
  • mode: use MobileNetV3-Small (if set to small) or MobileNetV3-Large (if set to large).
  • dataset: which dataset to use (CIFAR10, CIFAR100, SVHN, TinyImageNet or ImageNet).
  • ema-decay: decay rate of the EMA (exponential moving average); if set to 0, EMA is not used.
  • label-smoothing: the $\epsilon$ used in label smoothing; if set to 0, label smoothing is not used.
  • lr-decay: learning rate decay schedule, step or cos.
  • lr-min: minimum learning rate in cosine decay.
  • warmup-epochs: number of warmup epochs used with cosine decay.
  • num-epochs: total number of training epochs.
  • nbd: no bias decay.
  • zero-gamma: zero-initialize the $\gamma$ of the last BN in each block.
  • mixup: use Mixup.
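
To make the interaction of lr, lr-min, warmup-epochs and num-epochs concrete, here is a minimal sketch of a cosine schedule with warmup. It assumes a linear warmup and a learning rate that changes once per epoch; train.py may use a different warmup shape or update per iteration. The function name `lr_at_epoch` is hypothetical, and the defaults are taken from the command above.

```python
import math


def lr_at_epoch(epoch, base_lr=0.3, lr_min=0.0, warmup_epochs=5, num_epochs=200):
    """Learning rate for a 0-indexed epoch (assumed per-epoch update)."""
    if epoch < warmup_epochs:
        # Assumed linear warmup: reach base_lr at the last warmup epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr towards lr_min over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, num_epochs - warmup_epochs)
    return lr_min + 0.5 * (base_lr - lr_min) * (1 + math.cos(math.pi * progress))


for epoch in (0, 4, 5, 100, 199):
    print(epoch, round(lr_at_epoch(epoch), 5))
```

With these defaults the rate climbs to 0.3 by the end of warmup and then decays smoothly towards lr-min by the final epoch.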

Pretrained models

 A pretrained MobileNetV3-Small model is provided in the pretrained folder.

Experiments

Training setting

on ImageNet

CUDA_VISIBLE_DEVICES=5 python train.py --batch-size=128 --mode=small --print-freq=2000 --dataset=imagenet \
--ema-decay=0.99 --label-smoothing=0.1 --lr=0.1 --save-epoch-freq=50 --lr-decay=cos --lr-min=0 --warmup-epochs=5 \
--weight-decay=1e-5 --num-epochs=250 --num-workers=2 --width-multiplier=1 -dali -nbd -mixup -zero-gamma -save
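
The --ema-decay=0.99 flag in the command above turns on an exponential moving average of the model weights. The sketch below shows one common way to keep such an average in PyTorch; the class name `WeightEMA` is hypothetical, and copying buffers (such as BatchNorm running statistics) instead of averaging them is an assumption, not necessarily what train.py does.

```python
import copy

import torch
import torch.nn as nn


class WeightEMA:
    """Keep a shadow copy of a model whose weights follow an EMA."""

    def __init__(self, model, decay=0.99):
        self.decay = decay
        self.shadow = copy.deepcopy(model).eval()
        for p in self.shadow.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        # shadow = decay * shadow + (1 - decay) * current weights
        for ema_p, p in zip(self.shadow.parameters(), model.parameters()):
            ema_p.mul_(self.decay).add_(p.detach(), alpha=1 - self.decay)
        # Assumption: buffers are copied as-is rather than averaged.
        for ema_b, b in zip(self.shadow.buffers(), model.buffers()):
            ema_b.copy_(b)
```

In a training loop, `update` would typically be called after each optimizer step, and `shadow` would be the model used for evaluation.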

on CIFAR-10

CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR10\
  --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0\
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1

on CIFAR-100

CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR100\
  --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0\
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1

 Using more tricks:

CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR100\
  --ema-decay=0.999 --label-smoothing=0.1 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0\
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1\
  -zero-gamma -nbd -mixup
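
Two of the tricks in the command above, -nbd and -zero-gamma, can be sketched as follows. This is an illustration under stated assumptions: parameters are split by dimensionality so that biases and BatchNorm scale/shift terms receive no weight decay, and `ToyResidualBlock` is a hypothetical stand-in, not a MobileNetV3 block from this repository. The momentum value is also assumed.

```python
import torch
import torch.nn as nn


def split_decay_groups(model, weight_decay=6e-5):
    """Apply weight decay only to tensors with more than one dimension.

    Biases and BatchNorm scale/shift parameters are 1-D, so they land in
    the no-decay group (an assumed interpretation of "no bias decay").
    """
    decay, no_decay = [], []
    for p in model.parameters():
        if not p.requires_grad:
            continue
        (no_decay if p.ndim <= 1 else decay).append(p)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


class ToyResidualBlock(nn.Module):
    """Hypothetical residual block, for illustration only."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)  # last BN in the block
        nn.init.zeros_(self.bn2.weight)  # zero gamma: block starts as identity

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return x + out


block = ToyResidualBlock(8)
optimizer = torch.optim.SGD(split_decay_groups(block), lr=0.35, momentum=0.9)
```

Because the last BatchNorm scale starts at zero, the residual branch contributes nothing at initialization and the block initially passes its input through unchanged.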

on SVHN

CUDA_VISIBLE_DEVICES=3 python train.py --batch-size=128 --mode=small --print-freq=1000 --dataset=SVHN\
  --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0\
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=20 --num-workers=2 --width-multiplier=1

on Tiny-ImageNet

CUDA_VISIBLE_DEVICES=7 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=tinyimagenet\
  --data-dir=/media/data2/chenjiarong/ImageData/tiny-imagenet --ema-decay=0 --label-smoothing=0 --lr=0.15\
  --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200\
  --num-workers=2 --width-multiplier=1 -dali

 Using more tricks:

CUDA_VISIBLE_DEVICES=7 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=tinyimagenet\
  --data-dir=/media/data2/chenjiarong/ImageData/tiny-imagenet --ema-decay=0.999 --label-smoothing=0.1 --lr=0.15\
  --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200\
  --num-workers=2 --width-multiplier=1 -dali -nbd -mixup
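
The -mixup flag and --label-smoothing=0.1 in the command above can be combined roughly as sketched below. The function names are hypothetical, the Beta parameter `alpha=0.2` is an assumed value, and smoothing uniformly over all classes is one common convention; the repository's own implementation may differ on each point.

```python
import torch
import torch.nn.functional as F


def smoothed_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy against targets smoothed uniformly over all classes."""
    log_probs = F.log_softmax(logits, dim=1)
    nll = -log_probs.gather(1, target.unsqueeze(1)).squeeze(1)
    uniform = -log_probs.mean(dim=1)
    return ((1 - epsilon) * nll + epsilon * uniform).mean()


def mixup_batch(x, y, alpha=0.2):
    """Blend each sample with a randomly chosen partner from the batch."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    mixed_x = lam * x + (1 - lam) * x[perm]
    return mixed_x, y, y[perm], lam


def mixup_loss(logits, y_a, y_b, lam, epsilon=0.1):
    """Weight the loss for both label sets by the mixing coefficient."""
    loss_a = smoothed_cross_entropy(logits, y_a, epsilon)
    loss_b = smoothed_cross_entropy(logits, y_b, epsilon)
    return lam * loss_a + (1 - lam) * loss_b
```

In a training step, the model would be run on `mixed_x` and the resulting logits passed to `mixup_loss` together with both label tensors and `lam`.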

MobileNetV3-Large

on ImageNet

| | MAdds | Parameters | Top1-acc | Top5-acc |
| ----------- | --------- | ---------- | --------- | --------- |
| Official 1.0 | 219 M | 5.4 M | 75.2% | - |
| Ours 1.0 | 216.6 M | 5.47 M | - | - |

on CIFAR-10

| | MAdds | Parameters | Top1-acc | Top5-acc |
| ----------- | --------- | ---------- | --------- | --------- |
| Ours 1.0 | 66.47 M | 4.21 M | - | - |

on CIFAR-100

| | MAdds | Parameters | Top1-acc | Top5-acc |
| ----------- | --------- | ---------- | --------- | --------- |
| Ours 1.0 | 66.58 M | 4.32 M | - | - |

MobileNetV3-Small

on ImageNet

| | MAdds | Parameters | Top1-acc | Top5-acc |
| ----------- | --------- | ---------- | --------- | --------- |
| Official 1.0 | 56.5 M | 2.53 M | 67.4% | - |
| Ours 1.0 | 56.51 M | 2.53 M | 67.52% | 87.58% |

 The pretrained model with top-1 accuracy 67.52% is provided in the folder pretrained.

on CIFAR-10 (Average accuracy of 5 runs)

| | MAdds | Parameters | Top1-acc | Top5-acc |
| ----------- | --------- | ---------- | --------- | --------- |
| Ours 1.0 | 17.51 M | 1.52 M | 92.97% | - |

on CIFAR-100 (Average accuracy of 5 runs)

| | MAdds | Parameters | Top1-acc | Top5-acc |
| ----------- | --------- | ---------- | --------- | --------- |
| Ours 1.0 | 17.60 M | 1.61 M | 73.69% | 92.31% |
| More Tricks | same | same | 76.24% | 92.58% |

on SVHN (Average accuracy of 5 runs)

| | MAdds | Parameters | Top1-acc | Top5-acc |
| ----------- | --------- | ---------- | --------- | --------- |
| Ours 1.0 | 17.51 M | 1.52 M | 97.92% | - |

on Tiny-ImageNet (Average accuracy of 5 runs)

| | MAdds | Parameters | Top1-acc | Top5-acc |
| ----------- | --------- | ---------- | --------- | --------- |
| Ours 1.0 | 51.63 M | 1.71 M | 59.32% | 81.38% |
| More Tricks | same | same | 62.62% | 84.04% |

Dependency

 This project uses Python 3.7 and PyTorch 1.1.0. The FLOPs and parameter counts are measured using torchsummaryX.
