SpeedTests

comparing the execution speeds of various programming languages

Speed Tests

When I learn a new programming language, I always implement the Münchausen numbers problem in the given language. The problem is simple but it includes a lot of computations, thus it gives an idea of the execution speed of a language.

Münchausen numbers

A Münchausen number is a number equal to the sum of its digits, each raised to the power of itself.

For instance, 3435 is a Münchausen number because 3<sup>3</sup>+4<sup>4</sup>+3<sup>3</sup>+5<sup>5</sup> = 3435.

0<sup>0</sup> is not well-defined, thus we'll consider 0<sup>0</sup>=0. In this case there are four Münchausen numbers: 0, 1, 3435, and 438579088.
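As a minimal sketch of the definition above (not necessarily the code used in this repository), the check can be written in Python with a small lookup table. The names `POWERS` and `is_munchausen` are hypothetical and chosen for illustration only. The table stores d<sup>d</sup> for every digit, with the 0<sup>0</sup>=0 convention applied explicitly.

```python
# Lookup table: POWERS[d] == d**d, except that 0**0 is treated as 0
# (the convention adopted above; Python itself evaluates 0**0 to 1).
POWERS = [0] + [d ** d for d in range(1, 10)]


def is_munchausen(number: int) -> bool:
    """Return True if `number` equals the sum of its digits, each raised to itself."""
    total = 0
    n = number
    while n > 0:
        total += POWERS[n % 10]  # add digit**digit for the last digit
        n //= 10                 # drop the last digit
    return total == number


if __name__ == "__main__":
    for candidate in (0, 1, 3435, 438579088, 3434):
        print(candidate, is_munchausen(candidate))
```

Precomputing the ten possible powers avoids repeating the exponentiation for every digit of every candidate.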

Exercise

Write a program that finds all the Münchausen numbers. We know that the largest Münchausen number is less than 440 million.
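One possible brute-force approach to the exercise is sketched below in Python. It is an illustration under assumptions, not the reference solution: the function name `find_munchausen` and its `limit` parameter are hypothetical, and the early exit inside the digit loop is an optional shortcut. The demo uses a small limit because scanning the full range up to 440 million takes a long time in pure Python.

```python
def find_munchausen(limit: int) -> list[int]:
    """Return all Münchausen numbers n with 0 <= n < limit (using 0**0 == 0)."""
    # cache[d] == d**d for each digit d, with the 0**0 == 0 convention
    cache = [0] + [d ** d for d in range(1, 10)]
    found = []
    for number in range(limit):
        total = 0
        n = number
        while n > 0:
            total += cache[n % 10]
            if total > number:  # the sum can only grow, so stop early
                break
            n //= 10
        if total == number:
            found.append(number)
    return found


if __name__ == "__main__":
    # Small demo range; pass 440_000_000 to search the full range.
    print(find_munchausen(10_000))  # prints [0, 1, 3435]
```

Since every term in the sum is non-negative, a partial sum that already exceeds the candidate can never come back down, which is why the early `break` is safe.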

Updates

Dates are in yyyy-month format.

2025-July: F# was added.

2025-April: Python 3 with Rust removed. Common LISP updated. C3 added.

Implementations

In the implementations I tried to use the same (simple) algorithm in order to make the comparisons as fair as possible.

All the tests were run on my home desktop machine (Intel Core i7-4771 CPU @ 3.50GHz with 8 logical CPU cores) using Manjaro Linux. Execution times are wall-clock times and they are measured with hyperfine (warmup runs: 1, benchmarked runs: 2).
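With hyperfine, the settings above most likely correspond to `hyperfine --warmup 1 --runs 2 ./main`, where `./main` is an assumed binary name. To illustrate what a "mean ± standard deviation" wall-clock figure means, here is a rough Python sketch of the same measurement idea. It is not how hyperfine is implemented, and the helper name `time_command` is hypothetical.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, warmup=1, runs=2):
    """Return (mean, stdev) of wall-clock seconds over `runs` executions of `cmd`."""
    # Warmup runs are executed but not measured (fills disk caches, etc.).
    for _ in range(warmup):
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)

    # The sample standard deviation needs at least two measurements.
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stdev


if __name__ == "__main__":
    # Stand-in command for the demo; replace with e.g. ["./main"].
    mean, stdev = time_command([sys.executable, "-c", "pass"])
    print(f"{mean:.3f} ± {stdev:.3f} s")
```

With only two benchmarked runs the standard deviation is a coarse estimate, so small differences between rows in the tables below should be read with some caution.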

The following implementations were received in the form of pull requests:

  • Clojure, Common LISP, Crystal, D, F#, FASM, Forth, Fortran, Haskell, JavaScript, Lua, Mojo, NASM, OCaml, Pascal, Perl, PHP, Python 3 with Numba, Racket, Ruby, Scala 3, Scheme, Swift, Toit, V, Zig

Thanks for the contributions!

If you know how to make something faster, let me know!

Languages are listed in alphabetical order.

The size of the EXE files can be further reduced with the command strip -s. If it's applicable, then the stripped EXE size is also shown in the table.

Below, you can find single-threaded implementations. We also have some multi-threaded implementations, see here.


C

  • gcc (GCC) 13.2.1 20230801
  • clang version 16.0.6
  • Benchmark date: 2024-02-05 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (bytes) |
|-----|:---:|:---:|:---:|
| gcc -O3 main.c -o main -lm | 3.893 ± 0.01 | 15,560 | 14,408 |
| gcc -O2 main.c -o main -lm | 3.892 ± 0.001 | 15,560 | 14,408 |
| clang -O3 main.c -o main -lm | 2.684 ± 0.013 | 15,528 | 14,416 |
| clang -O2 main.c -o main -lm | 2.672 ± 0.001 | 15,528 | 14,416 |

Notes:

  • No real difference between the switches -O2 and -O3. It's enough to use -O2.
  • clang is better in this case

see source

C++

  • g++ (GCC) 13.2.1 20230801
  • clang version 16.0.6
  • Benchmark date: 2024-02-05 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (bytes) |
|-----|:---:|:---:|:---:|
| g++ -O3 --std=c++2a main.cpp -o main | 3.865 ± 0.01 | 15,936 | 14,432 |
| g++ -O2 --std=c++2a main.cpp -o main | 3.849 ± 0.012 | 15,936 | 14,432 |
| clang++ -O3 --std=c++2a main.cpp -o main | 2.913 ± 0.01 | 15,904 | 14,440 |
| clang++ -O2 --std=c++2a main.cpp -o main | 2.827 ± 0.015 | 15,904 | 14,440 |

Notes:

  • No big difference between the switches -O2 and -O3. Using -O2 is even better.
  • clang is better in this case

see source

C#

  • dotnet 9.0.106
  • Benchmark date: 2025-07-21 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | -- |
|-----|:---:|:---:|:---:|
| dotnet publish -o dist -c Release | 5.097 ± 0.043 | 77,736 | -- |

Notes:

  • Similar performance to Java.
  • 0.6 seconds faster than .NET 8.0.

see source

C3

  • C3 Compiler Version: 0.7.0
  • Benchmark date: 2025-04-29 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (bytes) |
|-----|:---:|:---:|:---:|
| c3c compile -O5 -g0 main.c3 | 3.125 ± 0.01 | 110,752 | 90,920 |

Notes:

  • Similar performance to C.
  • More info about the language: https://c3-lang.org

see source

Clojure

  • Clojure CLI version 1.12.0.1479
  • Benchmark date: 2024-10-08 [yyyy-mm-dd]

| Execution | Runtime (sec) | compiled / transpiled output size (bytes) | -- |
|-----|:---:|:---:|:---:|
| clj -M -m main | 5.631 ± 0.112 | -- | -- |
| mkdir classes && java -cp `clj -Spath` main | 5.339 ± 0.101 | -- | -- |

see source

Notes:

  • A bit slower than Java.

Codon

  • codon 0.15.5
  • Benchmark date: 2023-04-02 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (bytes) |
|-----|:---:|:---:|:---:|
| codon build -release main.py | 5.369 ± 0.006 | 28,400 | 26,864 |

Notes:

  • Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead.
  • It's a bit faster than C#!
  • The code is unchanged Python code. No type annotations are needed.

See https://github.com/exaloop/codon for more information about this compiler.

see source

Common LISP

  • GNU CLISP 2.49.93+ (2018-02-18) (built on root2 [65.108.105.205])
  • SBCL 2.5.1
  • Benchmark date: 2025-04-10 [yyyy-mm-dd]

| Execution | Runtime (sec) | -- | -- |
|-----|:---:|:---:|:---:|
| clisp -C main2.cl | 517.914 ± 1.032 | -- | -- |
| clisp -C main.cl | 322.324 ± 0.98 | -- | -- |
| sbcl --script main.cl | 7.277 ± 0.003 | -- | -- |
| sbcl --script main2.cl | 4.897 ± 0.007 | -- | -- |

Notes:

  • clisp is very slow. Even worse than Python. And without the -C switch, it's ten times slower.
  • With sbcl, you can get excellent performance.

see source

Crystal

  • Crystal 1.13.2 (2024-09-08); LLVM: 18.1.8; Default target: x86_64-pc-linux-gnu
  • Benchmark date: 2024-10-13 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (bytes) |
|-----|:---:|:---:|:---:|
| crystal build --release main.cr | 4.237 ± 0.077 | 807,432 | 273,424 |

Notes:

  • The runtime is very good, similar to Go.
  • The source code is almost identical to the Ruby source code.
  • The build time is also good. In a previous version (2022) it was painfully slow.

See https://crystal-lang.org for more info about this language.

see source

D

  • DMD64 D Compiler v2.100.0
  • LDC - the LLVM D compiler (1.29.0)
  • Benchmark date: 2022-07-28 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (bytes) |
|-----|:---:|:---:|:---:|
| dmd -release -O main.d | 9.987 ± 0.045 | 993,816 | 712,504 |
| ldc2 -release -O main.d | 3.089 ± 0.008 | 34,584 | 23,008 |

Notes:

  • the runtime is comparable to C/C++
  • the official compiler dmd is slow
  • ldc2 is the best in this case

see source

Dart

  • Dart SDK version: 2.17.6 (stable) (Tue Jul 12 12:54:37 2022 +0200) on "linux_x64"
  • Node.js v18.6.0
  • Benchmark date: 2022-07-28 [yyyy-mm-dd]

| Execution | Runtime (sec) | compiled / transpiled output size (bytes) | -- |
|-----|:---:|:---:|:---:|
| dart main.dart | 23.909 ± 0.581 | -- | -- |
| dart compile js main.dart -O2 -m -o main.js && node main.js | 10.509 ± 0.032 | 31,684 | -- |
| dart compile exe main.dart -o main && ./main | 8.377 ± 0.009 | 5,925,856 | -- |

(*): in the first case, the Dart code is executed as a script

Notes:

  • If you execute it as a script (JIT), it's slow.
  • If you compile to native code (AOT), it's fast (though slower than Java/C#).
  • stripping damaged the EXE file

see source

Elixir

  • Erlang/OTP 24 [erts-12.3] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]; Elixir 1.13.2 (compiled with Erlang/OTP 24)
  • Benchmark date: 2022-07-28 [yyyy-mm-dd]

| Execution | Runtime (sec) | -- | -- |
|-----|:---:|:---:|:---:|
| elixir main.exs | 227.963 ± 0.543 | -- | -- |
| elixirc munchausen.ex && elixir caller.exs | 217.528 ± 0.762 | -- | -- |

Notes:

  • Elixir doesn't excel in CPU-intensive tasks.
  • In the second case, the modules were compiled to .beam files. However, it didn't make the program much faster. The difference is very small.

see source

F#

  • dotnet 9.0.106
  • Benchmark date: 2025-07-21 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | -- |
|-----|:---:|:---:|:---:|
| dotnet publish -o dist -c Release | 4.872 ± 0.015 | 77,736 | -- |

Notes:

  • Excellent performance. Even a bit faster than C#.

see source

FASM

  • flat assembler version 1.73.30
  • Benchmark date: 2022-07-28 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (bytes) |
|-----|:---:|:---:|:---:|
| # FASM x64, see v1 in Makefile | 15.792 ± 0.018 | 532 | 532 |
| # FASM x86, see v2 in Makefile | 15.207 ± 0.023 | 444 | 444 |

Note: no real difference between the 32-bit and 64-bit versions.

See https://en.wikipedia.org/wiki/FASM for more info about FASM.

see source

Forth

  • gforth 0.7.3
  • Benchmark date: 2025-03-02 [yyyy-mm-dd]

| Execution | Runtime (sec) | -- | -- |
|-----|:---:|:---:|:---:|
| gforth-fast main.fs | 73.734 ± 0.034 | -- | -- |

see source

Fortran

  • GNU Fortran (GCC) 12.1.0
  • Benchmark date: 2022-07-28 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (bytes) |
|-----|:---:|:---:|:---:|
| gfortran -O2 main.f08 -o main | 3.884 ± 0.054 | 21,016 | 14,456 |

Note: its speed is comparable to C.

see source

Go

  • go version go1.23.1 linux/amd64
  • Benchmark date: 2024-10-08 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (bytes) |
|-----|:---:|:---:|:---:|
| # using int, see v1 in Makefile | 4.122 ± 0.034 | 2,137,820 | 1,391,192 |
| # using uint and uint32, see v2 in Makefile | 3.5 ± 0.045 | 2,137,756 | 1,391,192 |

Notes:

  • The speed is between C and Java (slower than C, faster than Java).
  • Using uint and uint32, you can get better performance.
  • The EXE is quite big.

see source

Haskell

  • The Glorious Glasgow Haskell Compilation System, version 8.10.7
  • Benchmark date: 2022-07-28 [yyyy-mm-dd]

| Compilation | Runtime (sec) | EXE (bytes) | stripped EXE (by
