DeepLearningBenchmarks

Benchmarks across Deep Learning Frameworks in Julia and Python

Popular Computer Vision Model Benchmarks

Input Dimensions

  1. Batch Size = 8, Image = 3 x 224 x 224 (default; used when nothing else is specified and for all CPU runs)
  2. Batch Size = 4, Image = 3 x 224 x 224, used for:
    • Resnet 101
    • Resnet 152

GPU USED --- Titan 1080Ti 12 GB

|Model|Framework|Forward Pass|Backward Pass|Total Time|Inference|
|:---:|:---:|:---:|:---:|:---:|:---:|
|VGG16|Pytorch 0.4.1|0.0245 s|0.0606 s|0.0852 s|0.0234 s|
||Flux 0.6.8+|0.0287 s|0.0760 s|0.1047 s|0.0288 s|
|VGG16 BN|Pytorch 0.4.1|0.0271 s|0.0672 s|0.0943 s|0.0273 s|
||Flux 0.6.8+|0.0333 s|0.0818 s|0.1151 s|0.0327 s|
|VGG19|Pytorch 0.4.1|0.0281 s|0.0741 s|0.1021 s|0.0280 s|
||Flux 0.6.8+|0.0355 s|0.0923 s|0.1278 s|0.0356 s|
|VGG19 BN|Pytorch 0.4.1|0.0321 s|0.0812 s|0.1134 s|0.0325 s|
||Flux 0.6.8+|0.0377 s|0.0965 s|0.1342 s|0.0371 s|
|Resnet18|Pytorch 0.4.1|0.0064 s|0.0125 s|0.0190 s|0.0050 s|
||Flux 0.6.8+|0.0079 s|0.0218 s|0.0297 s|0.0079 s|
|Resnet34|Pytorch 0.4.1|0.0092 s|0.0216 s|0.0307 s|0.0092 s|
||Flux 0.6.8+|0.0137 s|0.0313 s|0.0450 s|0.0151 s|
|Resnet50|Pytorch 0.4.1|0.0155 s|0.0351 s|0.0506 s|0.0152 s|
||Flux 0.6.8+|0.0205 s|0.1795 s|0.2000 s|-|
|Resnet101|Pytorch 0.4.1|0.0297 s|0.0379 s|0.0676 s|0.0298 s|
||Flux 0.6.8+|0.0215 s|0.0616 s|0.0831 s|0.0208 s|
|Resnet152|Pytorch 0.4.1|0.0431 s|0.05337 s|0.0965 s|0.0429 s|
||Flux 0.6.8+|0.0308 s|0.0807 s|0.1115 s|0.0298 s|
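
The per-model gap between the two frameworks is easier to read as a ratio. The sketch below copies a few Total Time values from the GPU table above and divides the Flux figure by the PyTorch figure. The names `gpu_total_s` and `flux_to_pytorch_ratio` are illustrative and not part of the benchmark code.

```python
# Total Time values (seconds) copied from the GPU table above.
gpu_total_s = {
    "VGG16":     {"pytorch": 0.0852, "flux": 0.1047},
    "VGG19":     {"pytorch": 0.1021, "flux": 0.1278},
    "Resnet18":  {"pytorch": 0.0190, "flux": 0.0297},
    "Resnet101": {"pytorch": 0.0676, "flux": 0.0831},
}


def flux_to_pytorch_ratio(totals):
    """Return Flux total time divided by PyTorch total time, per model."""
    return {model: t["flux"] / t["pytorch"] for model, t in totals.items()}


ratios = flux_to_pytorch_ratio(gpu_total_s)
for model, ratio in ratios.items():
    print(f"{model:10s} Flux/PyTorch total time = {ratio:.2f}x")
```

A ratio above 1 means Flux took longer for a combined forward and backward pass. The Resnet101 row was measured at batch size 4, so it is only comparable within that row.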

CPU USED --- Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz

|Model|Framework|Forward Pass|Backward Pass|Total Time|Inference|
|:---:|:---:|:---:|:---:|:---:|:---:|
|VGG16|Pytorch 0.4.1|6.6024 s|9.4336 s|16.036 s|6.4216 s|
||Flux 0.6.8+|10.458 s|10.245 s|20.703 s|10.111 s|
|VGG16 BN|Pytorch 0.4.1|7.0793 s|9.0536 s|16.132 s|6.7909 s|
||Flux 0.6.8+|29.633 s|18.649 s|49.282 s|24.047 s|
|VGG19|Pytorch 0.4.1|8.3075 s|10.899 s|19.207 s|8.0593 s|
||Flux 0.6.8+|12.226 s|12.457 s|24.683 s|12.029 s|
|VGG19 BN|Pytorch 0.4.1|8.7794 s|12.739 s|21.519 s|8.4044 s|
||Flux 0.6.8+|28.518 s|21.464 s|49.982 s|22.649 s|

<!-- |Resnet18|Pytorch 0.4.1||||| ||Flux 0.6.8+||||| |Resnet34|Pytorch 0.4.1||||| ||Flux 0.6.8+||||| |Resnet50|Pytorch 0.4.1||||| ||Flux 0.6.8+||||| |Resnet101|Pytorch 0.4.1||||| ||Flux 0.6.8+||||| |Resnet152|Pytorch 0.4.1||||| ||Flux 0.6.8+||||| -->

Individual Layer Benchmarks

Layer Descriptions

  1. Conv3x3/1 = Conv2d, 3x3 Kernel, 1x1 Padding, 1x1 Stride
  2. Conv5x5/1 = Conv2d, 5x5 Kernel, 2x2 Padding, 1x1 Stride
  3. Conv3x3/2 = Conv2d, 3x3 Kernel, 1x1 Padding, 2x2 Stride
  4. Conv5x5/2 = Conv2d, 5x5 Kernel, 2x2 Padding, 2x2 Stride
  5. Dense = 1024 => 512
  6. BatchNorm = BatchNorm2d
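
To make the layer descriptions above concrete, here is a minimal sketch that computes the spatial output size of each convolution from its kernel, padding and stride, using the standard formula `(in + 2*padding - kernel) // stride + 1` for dilation 1. The 224-pixel input is an assumption, because the layer-benchmark input size is not stated here. The Dense parameter count also assumes the layer has a bias.

```python
def conv_out_size(in_size, kernel, padding, stride):
    """Spatial output size of a convolution along one dimension (dilation 1)."""
    return (in_size + 2 * padding - kernel) // stride + 1


# (kernel, padding, stride) taken from the layer descriptions above.
layers = {
    "Conv3x3/1": (3, 1, 1),
    "Conv5x5/1": (5, 2, 1),
    "Conv3x3/2": (3, 1, 2),
    "Conv5x5/2": (5, 2, 2),
}

IN_SIZE = 224  # assumption: the layer-benchmark input size is not given

out_sizes = {name: conv_out_size(IN_SIZE, *cfg) for name, cfg in layers.items()}
for name, size in out_sizes.items():
    print(f"{name}: {IN_SIZE} -> {size}")

# Dense 1024 => 512: weight matrix plus bias vector (bias assumed present).
dense_params = 1024 * 512 + 512
print(f"Dense parameters: {dense_params}")
```

With these padding choices the stride-1 layers preserve the spatial size, and the stride-2 layers halve it.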

GPU USED --- Titan 1080Ti 12 GB

|Layer|Framework|Forward Pass|Backward Pass|Total Time|
|:---:|:---:|:---:|:---:|:---:|
|Conv3x3/1|Pytorch 0.4.1|0.2312 ms|0.5359 ms|0.7736 ms|
||Flux 0.6.8+|0.1984 ms|0.7640 ms|0.9624 ms|
|Conv5x5/1|Pytorch 0.4.1|0.2667 ms|0.5345 ms|0.8299 ms|
||Flux 0.6.8+|0.2065 ms|0.8075 ms|1.014 ms|
|Conv3x3/2|Pytorch 0.4.1|0.1170 ms|0.2203 ms|0.3376 ms|
||Flux 0.6.8+|0.0927 ms|0.5988 ms|0.6915 ms|
|Conv5x5/2|Pytorch 0.4.1|0.1233 ms|0.2162 ms|0.3407 ms|
||Flux 0.6.8+|0.0941 ms|0.6515 ms|0.7456 ms|
|Dense|Pytorch 0.4.1|0.0887 ms|0.1523 ms|0.2411 ms|
||Flux 0.6.8+|0.0432 ms|0.2044 ms|0.2476 ms|
|BatchNorm|Pytorch 0.4.1|0.1096 ms|0.1999 ms|0.3095 ms|
||Flux 0.6.8+|0.2211 ms|0.2849 ms|0.5060 ms|
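
The README does not say how these sub-millisecond timings were collected. The sketch below shows one common approach, not the method used for the numbers above: discard a few warm-up calls, then average wall-clock time over repeated calls. `time_call`, `fake_forward` and `fake_backward` are hypothetical names, and the workloads are stand-ins for real forward and backward passes. Taking the total as the sum of the two is also an assumption.

```python
import time


def time_call(fn, warmup=3, repeats=10, sync=None):
    """Return the mean wall-clock seconds per call of fn().

    sync is an optional callable that blocks until queued device work has
    finished (for PyTorch on a GPU this would be torch.cuda.synchronize).
    """
    for _ in range(warmup):  # warm-up runs are not timed
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()  # wait for device work before stopping the clock
    return (time.perf_counter() - start) / repeats


# Stand-in workloads (hypothetical); swap in real forward and backward passes.
def fake_forward():
    return sum(i * i for i in range(10_000))


def fake_backward():
    return sum(i * 3 for i in range(20_000))


forward_s = time_call(fake_forward)
backward_s = time_call(fake_backward)
total_s = forward_s + backward_s  # assumption: total taken as the sum
print(f"forward {forward_s * 1e3:.4f} ms, "
      f"backward {backward_s * 1e3:.4f} ms, "
      f"total {total_s * 1e3:.4f} ms")
```

GPU kernels are launched asynchronously, so a timing loop that does not synchronize before reading the clock can under-report layer times at this scale.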

<!-- ## CPU USED --- Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz |Layer|Framework|Forward Pass|Backward Pass|Total Time| |:---:|:---:|:---:|:---:|:---:| |Conv3x3/1|Pytorch 0.4.1|||| ||Flux 0.6.8+|||| |Conv5x5/1|Pytorch 0.4.1|||| ||Flux 0.6.8+|||| |Conv3x3/2|Pytorch 0.4.1|||| ||Flux 0.6.8+|||| |Conv5x5/2|Pytorch 0.4.1|||| ||Flux 0.6.8+|||| |Dense|Pytorch 0.4.1|||| ||Flux 0.6.8+|||| |BatchNorm|Pytorch 0.4.1|||| ||Flux 0.6.8+|||| -->

NOTE

To reproduce the benchmarks, check out Flux 0.6.8+ from avik-pal/cudnn_batchnorm together with CuArrays master. BatchNorm on the GPU is broken on Flux 0.6.8+ master, so the benchmarks cannot be run against it.
