DeepLearningBenchmarks

Benchmarks across Deep Learning Frameworks in Julia and Python


Popular Computer Vision Model Benchmarks

Input Dimensions

- Batch Size = 8, Image = 3 x 224 x 224 (default when nothing else is specified, and for all CPU runs)
- Batch Size = 4, Image = 3 x 224 x 224 for:
  - Resnet101
  - Resnet152

GPU USED --- Titan 1080Ti 12 GB

|Model|Framework|Forward Pass|Backward Pass|Total Time|Inference|
|:---:|:---:|:---:|:---:|:---:|:---:|
|VGG16|Pytorch 0.4.1|0.0245 s|0.0606 s|0.0852 s|0.0234 s|
||Flux 0.6.8+|0.0287 s|0.0760 s|0.1047 s|0.0288 s|
|VGG16 BN|Pytorch 0.4.1|0.0271 s|0.0672 s|0.0943 s|0.0273 s|
||Flux 0.6.8+|0.0333 s|0.0818 s|0.1151 s|0.0327 s|
|VGG19|Pytorch 0.4.1|0.0281 s|0.0741 s|0.1021 s|0.0280 s|
||Flux 0.6.8+|0.0355 s|0.0923 s|0.1278 s|0.0356 s|
|VGG19 BN|Pytorch 0.4.1|0.0321 s|0.0812 s|0.1134 s|0.0325 s|
||Flux 0.6.8+|0.0377 s|0.0965 s|0.1342 s|0.0371 s|
|Resnet18|Pytorch 0.4.1|0.0064 s|0.0125 s|0.0190 s|0.0050 s|
||Flux 0.6.8+|0.0079 s|0.0218 s|0.0297 s|0.0079 s|
|Resnet34|Pytorch 0.4.1|0.0092 s|0.0216 s|0.0307 s|0.0092 s|
||Flux 0.6.8+|0.0137 s|0.0313 s|0.0450 s|0.0151 s|
|Resnet50|Pytorch 0.4.1|0.0155 s|0.0351 s|0.0506 s|0.0152 s|
||Flux 0.6.8+|0.0205 s|0.1795 s|0.2000 s|-|
|Resnet101|Pytorch 0.4.1|0.0297 s|0.0379 s|0.0676 s|0.0298 s|
||Flux 0.6.8+|0.0215 s|0.0616 s|0.0831 s|0.0208 s|
|Resnet152|Pytorch 0.4.1|0.0431 s|0.05337 s|0.0965 s|0.0429 s|
||Flux 0.6.8+|0.0308 s|0.0807 s|0.1115 s|0.0298 s|

CPU USED --- Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz

|Model|Framework|Forward Pass|Backward Pass|Total Time|Inference|
|:---:|:---:|:---:|:---:|:---:|:---:|
|VGG16|Pytorch 0.4.1|6.6024 s|9.4336 s|16.036 s|6.4216 s|
||Flux 0.6.8+|10.458 s|10.245 s|20.703 s|10.111 s|
|VGG16 BN|Pytorch 0.4.1|7.0793 s|9.0536 s|16.132 s|6.7909 s|
||Flux 0.6.8+|29.633 s|18.649 s|49.282 s|24.047 s|
|VGG19|Pytorch 0.4.1|8.3075 s|10.899 s|19.207 s|8.0593 s|
||Flux 0.6.8+|12.226 s|12.457 s|24.683 s|12.029 s|
|VGG19 BN|Pytorch 0.4.1|8.7794 s|12.739 s|21.519 s|8.4044 s|
||Flux 0.6.8+|28.518 s|21.464 s|49.982 s|22.649 s|

Individual Layer Benchmarks

Layer Descriptions

Conv3x3/1 = Conv2d, 3x3 Kernel, 1x1 Padding, 1x1 Stride
Conv5x5/1 = Conv2d, 5x5 Kernel, 2x2 Padding, 1x1 Stride
Conv3x3/2 = Conv2d, 3x3 Kernel, 1x1 Padding, 2x2 Stride
Conv5x5/2 = Conv2d, 5x5 Kernel, 2x2 Padding, 2x2 Stride
Dense = 1024 => 512
BatchNorm = BatchNorm2d
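
To make the shorthand above concrete, here is a minimal sketch that computes the output spatial size each convolution configuration would produce, using the standard formula `floor((size + 2 * padding - kernel) / stride) + 1`. The 224 x 224 input is an assumption borrowed from the model benchmarks; the input size used for the layer benchmarks is not stated here. The names `conv_output_size`, `LAYERS`, and `INPUT_SIZE` are illustrative only.

```python
def conv_output_size(size, kernel, padding, stride):
    """Output length along one spatial axis for a 2D convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, padding, stride) for each convolution layer described above.
LAYERS = {
    "Conv3x3/1": (3, 1, 1),
    "Conv5x5/1": (5, 2, 1),
    "Conv3x3/2": (3, 1, 2),
    "Conv5x5/2": (5, 2, 2),
}

# Assumption: 224 x 224 input, as in the model benchmarks.
INPUT_SIZE = 224

for name, (kernel, padding, stride) in LAYERS.items():
    out = conv_output_size(INPUT_SIZE, kernel, padding, stride)
    print(f"{name}: {INPUT_SIZE} x {INPUT_SIZE} -> {out} x {out}")
```

Under that assumption, the stride-1 layers preserve the spatial size and the stride-2 layers halve it.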

GPU USED --- Titan 1080Ti 12 GB

|Layer|Framework|Forward Pass|Backward Pass|Total Time|
|:---:|:---:|:---:|:---:|:---:|
|Conv3x3/1|Pytorch 0.4.1|0.2312 ms|0.5359 ms|0.7736 ms|
||Flux 0.6.8+|0.1984 ms|0.7640 ms|0.9624 ms|
|Conv5x5/1|Pytorch 0.4.1|0.2667 ms|0.5345 ms|0.8299 ms|
||Flux 0.6.8+|0.2065 ms|0.8075 ms|1.014 ms|
|Conv3x3/2|Pytorch 0.4.1|0.1170 ms|0.2203 ms|0.3376 ms|
||Flux 0.6.8+|0.0927 ms|0.5988 ms|0.6915 ms|
|Conv5x5/2|Pytorch 0.4.1|0.1233 ms|0.2162 ms|0.3407 ms|
||Flux 0.6.8+|0.0941 ms|0.6515 ms|0.7456 ms|
|Dense|Pytorch 0.4.1|0.0887 ms|0.1523 ms|0.2411 ms|
||Flux 0.6.8+|0.0432 ms|0.2044 ms|0.2476 ms|
|BatchNorm|Pytorch 0.4.1|0.1096 ms|0.1999 ms|0.3095 ms|
||Flux 0.6.8+|0.2211 ms|0.2849 ms|0.5060 ms|

NOTE

To reproduce the benchmarks, check out Flux 0.6.8+ (avik-pal/cudnn_batchnorm) and CuArrays master. GPU BatchNorm is broken on Flux 0.6.8+ master, so the benchmarks cannot be run against it.
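
For anyone re-running the measurements, the following is a minimal, framework-agnostic sketch of a wall-clock timing helper. It is not the harness that produced the numbers above; `time_call`, `forward`, and `backward` are hypothetical names, and the two workloads are plain-Python stand-ins for a model's forward and backward pass.

```python
import statistics
import time


def time_call(fn, repeats=5, warmup=1):
    """Return the median wall-clock time, in seconds, of calling fn()."""
    for _ in range(warmup):
        fn()  # warm-up calls are discarded (compilation, cache effects)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Hypothetical stand-ins; replace with the real forward and backward pass.
def forward():
    return sum(i * i for i in range(10_000))


def backward():
    return sum(i * 3 for i in range(20_000))


forward_s = time_call(forward)
backward_s = time_call(backward)
print(f"forward {forward_s:.6f} s, backward {backward_s:.6f} s")
```

When timing GPU code, the function passed to `time_call` should wait for the device to finish before returning, since kernel launches are asynchronous; otherwise the measured time covers only the launch.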
