# TLCBench

Benchmark scripts for TVM
## Content
## Requirements

Tested with:

- TVM commit id: 91e07e1f3a7 (Feb. 5, 2021)
- mxnet==1.7.0
- gluonnlp==0.10.0
## Intel CPU

### Results on AWS c5.9xlarge (Intel Xeon Platinum 8124m @ 3.00GHz, 18-core)
#### AutoTVM

| Network Name | Batch size | Mean Inference Time (std dev) |
|--------------|------------|-------------------------------|
| resnet_50    | 1          | 5.40 ms (0.08 ms)             |
| mobilenet_v2 | 1          | 1.33 ms (0.05 ms)             |
| bert         | 1          | 31.31 ms (0.11 ms)            |

#### AutoScheduler

| Network Name | Batch size | Mean Inference Time (std dev) |
|--------------|------------|-------------------------------|
| resnet_50    | 1          | 5.30 ms (0.05 ms)             |
| mobilenet_v2 | 1          | 0.91 ms (0.02 ms)             |
| bert         | 1          | 16.52 ms (0.16 ms)            |
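The mean and standard deviation columns above summarize repeated timed runs of each network. As a rough illustration only (this is not the repository's actual measurement code), the summary string in the tables can be produced from a list of per-run latencies like this:

```python
import statistics

def summarize_latency(latencies_ms):
    """Summarize per-run latencies (in milliseconds) into the
    'mean (std dev)' format used in the result tables above."""
    mean = statistics.mean(latencies_ms)
    std = statistics.pstdev(latencies_ms)  # population std dev
    return f"{mean:.2f} ms ({std:.2f} ms)"

# Hypothetical timings, chosen only for illustration
print(summarize_latency([5.32, 5.45, 5.40, 5.48, 5.35]))
```

TVM's runtime time evaluator reports similar per-run statistics; the function above just shows how the table entries should be read.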
### Benchmark All Networks

The following commands read pre-tuned logs from the directory `saved_logs/latest` and benchmark the latency of all networks.

Commands for AutoTVM:

```bash
python3 benchmark_autotvm.py --network all --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m" --logdir saved_logs/latest
```

Commands for AutoScheduler:

```bash
python3 benchmark_autoscheduler.py --network all --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m" --logdir saved_logs/latest
```
### Benchmark One Network

The following commands read pre-tuned logs from the directory `saved_logs/latest` and benchmark the latency of one network.
You can replace `resnet_50` below with `mobilenet_v2` or `bert`.

Commands for AutoTVM:

```bash
python3 benchmark_autotvm.py --network resnet_50 --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m" --logdir saved_logs/latest
```

Commands for AutoScheduler:

```bash
python3 benchmark_autoscheduler.py --network resnet_50 --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m" --logdir saved_logs/latest
```
### Tuning

The following commands perform auto-tuning for one or all networks and save the tuning logs to the directory `tmp_logs`.
After tuning, you can benchmark with these logs by rerunning the benchmark commands above with the last argument replaced by `--logdir tmp_logs`.

Commands for AutoTVM:

```bash
# Tune one network
python3 tune_autotvm.py --network resnet_50 --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m"

# Tune all networks
python3 tune_autotvm.py --network all --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m"
```

Commands for AutoScheduler:

```bash
# Tune one network
python3 tune_autoscheduler.py --network resnet_50 --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m"

# Tune all networks
python3 tune_autoscheduler.py --network all --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m"
```
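The `--target` strings used throughout these commands follow TVM's target format: a backend kind (`llvm` for CPU, `cuda` for GPU) followed by `-key=value` attributes such as the CPU microarchitecture (`-mcpu`) and device model (`-model`). As a simplified sketch (not TVM's actual target parser), the structure can be read like this:

```python
def parse_target(target):
    """Split a TVM-style target string into its backend kind and
    -key=value attributes. Simplified illustration only; TVM's real
    parser handles quoting, nesting, and typed attributes."""
    kind, *options = target.split()
    attrs = {}
    for opt in options:
        key, _, value = opt.lstrip("-").partition("=")
        attrs[key] = value
    return kind, attrs

kind, attrs = parse_target("llvm -mcpu=skylake-avx512 -model=platinum-8124m")
```

Passing an accurate `-mcpu` matters on CPU: it tells the LLVM backend which vector instructions (here, AVX-512) it may generate.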
## Nvidia GPU

### Results on AWS g4dn.4xlarge (NVIDIA T4)
#### AutoTVM

| Network Name | Batch size | Mean Inference Time (std dev) |
|--------------|------------|-------------------------------|
| resnet_50    | 1          | 3.54 ms (0.02 ms)             |
| mobilenet_v2 | 1          | 0.74 ms (0.00 ms)             |
| bert         | 1          | 89.06 ms (1.22 ms)            |

#### AutoScheduler

| Network Name | Batch size | Mean Inference Time (std dev) |
|--------------|------------|-------------------------------|
| resnet_50    | 1          | 2.90 ms (0.01 ms)             |
| mobilenet_v2 | 1          | 0.57 ms (0.00 ms)             |
| bert         | 1          | 9.95 ms (0.01 ms)             |
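Comparing the two GPU tables, AutoScheduler's largest gain is on bert. The per-network speedups can be sanity-checked directly from the mean latencies reported above:

```python
# Mean latencies (ms) copied from the T4 result tables above
autotvm = {"resnet_50": 3.54, "mobilenet_v2": 0.74, "bert": 89.06}
autoscheduler = {"resnet_50": 2.90, "mobilenet_v2": 0.57, "bert": 9.95}

for net in autotvm:
    speedup = autotvm[net] / autoscheduler[net]
    print(f"{net}: {speedup:.2f}x")
```

On this instance AutoScheduler is roughly 1.2-1.3x faster on the vision models and close to 9x faster on bert.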
### Benchmark All Networks

The following commands read pre-tuned logs from the directory `saved_logs/latest` and benchmark the latency of all networks.

Commands for AutoTVM:

```bash
python3 benchmark_autotvm.py --network all --target "cuda -model=t4" --logdir saved_logs/latest
```

Commands for AutoScheduler:

```bash
python3 benchmark_autoscheduler.py --network all --target "cuda -model=t4" --logdir saved_logs/latest
```
### Benchmark One Network

The following commands read pre-tuned logs from the directory `saved_logs/latest` and benchmark the latency of one network.
You can replace `resnet_50` below with `mobilenet_v2` or `bert`.

Commands for AutoTVM:

```bash
python3 benchmark_autotvm.py --network resnet_50 --target "cuda -model=t4" --logdir saved_logs/latest
```

Commands for AutoScheduler:

```bash
python3 benchmark_autoscheduler.py --network resnet_50 --target "cuda -model=t4" --logdir saved_logs/latest
```
### Tuning

The following commands perform auto-tuning for one or all networks and save the tuning logs to the directory `tmp_logs`.
After tuning, you can benchmark with these logs by rerunning the benchmark commands above with the last argument replaced by `--logdir tmp_logs`.

Commands for AutoTVM:

```bash
# Tune one network
python3 tune_autotvm.py --network resnet_50 --target "cuda -model=t4"

# Tune all networks
python3 tune_autotvm.py --network all --target "cuda -model=t4"
```

Commands for AutoScheduler:

```bash
# Tune one network
python3 tune_autoscheduler.py --network resnet_50 --target "cuda -model=t4"

# Tune all networks
python3 tune_autoscheduler.py --network all --target "cuda -model=t4"
```
