Benchmarks
This section describes a range of performance benchmarks that have been run comparing this library with the standard library, and how to run your own benchmarks if required.
The values in the ratio column show how many times longer a specific operation takes compared with the same operation on a double.
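As a worked illustration of how to read the ratio column, the minimal sketch below divides a measured runtime by the runtime measured for double. The helper name `ratio_to_double` is hypothetical and not part of the library; it only restates the arithmetic described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the library): how many times longer an
// operation took compared with the same operation on double.
double ratio_to_double(std::uint64_t runtime_us, std::uint64_t double_runtime_us)
{
    assert(double_runtime_us > 0); // the baseline must be a real measurement
    return static_cast<double>(runtime_us) / static_cast<double>(double_runtime_us);
}
```

For instance, a runtime of 1,485,786 us against a double baseline of 143,924 us gives a ratio of roughly 10.32, which is how the 10.323 entry in the x64 Linux comparison results is obtained.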
On nearly all platforms there is hardware support for binary floating point math, so we are comparing hardware to software runtimes; Decimal will be slower.
The results for both the Intel and GCC types come from very similar, but not identical, benchmark routines, since they are written in C instead of C++. We assume they are close enough, and that the differences between the C and C++ compilers are small enough, for a fair comparison.
How to run the Benchmarks
To run the benchmarks yourself, navigate to the test folder and define BOOST_DECIMAL_RUN_BENCHMARKS when running the tests.
An example on Linux with b2:
../../../b2 cxxstd=20 toolset=gcc-13 define=BOOST_DECIMAL_RUN_BENCHMARKS benchmarks -a release
To also run the <charconv> benchmarks:
../../../b2 cxxstd=20 toolset=gcc-13 define=BOOST_DECIMAL_RUN_BENCHMARKS=1,BOOST_DECIMAL_BENCHMARK_CHARCONV=1 benchmarks -a release
To run the GCC benchmarks you can use the following command: gcc benchmark_libdfp.c -O3 -std=c17
followed by: ./a.out
To run the Intel benchmarks you will need both the Intel Compiler, and the library.
You can then use the following command: icx benchmark_libbid.c -O3 $PATH_TO_LIBBID/libbid.a -std=c17
followed by: ./a.out
The Intel benchmarks can only be run on one of Intel's supported architectures: IA-32, IA-64, and Intel x64.
Methodology
Comparisons
The benchmark for comparisons generates a random vector containing 20 million elements and performs the operations >, >=, <, <=, ==, and != between vec[i] and vec[i + 1].
This is repeated 5 times to generate stable results.
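The loop described above can be sketched roughly as follows. This is a minimal illustration rather than the library's actual benchmark code: the function names, the fixed seed, the value range, and the `hits` counter (used so the compiler cannot discard the comparisons) are all assumptions, and `double` stands in for whichever type is being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Fill a vector with pseudo-random values; a fixed seed keeps runs repeatable.
template <typename T>
std::vector<T> make_random_vector(std::size_t n, unsigned seed = 42U)
{
    std::mt19937_64 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0e6);
    std::vector<T> vec(n);
    for (auto& x : vec) { x = static_cast<T>(dist(gen)); }
    return vec;
}

// Apply all six comparison operators to neighbouring elements, `repeats`
// times, and return how many comparisons evaluated to true.
template <typename T>
std::uint64_t compare_neighbours(const std::vector<T>& vec, int repeats = 5)
{
    assert(repeats > 0);
    std::uint64_t hits = 0;
    for (int r = 0; r < repeats; ++r)
    {
        for (std::size_t i = 0; i + 1 < vec.size(); ++i)
        {
            const T& a = vec[i];
            const T& b = vec[i + 1];
            hits += static_cast<std::uint64_t>(a > b);
            hits += static_cast<std::uint64_t>(a >= b);
            hits += static_cast<std::uint64_t>(a < b);
            hits += static_cast<std::uint64_t>(a <= b);
            hits += static_cast<std::uint64_t>(a == b);
            hits += static_cast<std::uint64_t>(a != b);
        }
    }
    return hits;
}

// Time the whole comparison loop and report the elapsed microseconds.
template <typename T>
std::int64_t time_comparisons_us(const std::vector<T>& vec, std::uint64_t& hits_out)
{
    const auto start = std::chrono::steady_clock::now();
    hits_out = compare_neighbours(vec);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `time_comparisons_us` on `make_random_vector<double>(20000000)` approximates the workload size described above. For values that are never NaN, exactly three of the six operators are true for every pair, which gives a cheap sanity check on the returned count.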
Basic Mathematical Operations
The benchmark for these operations generates a random vector containing 20 million elements and performs the operations +, -, *, and / between vec[i] and vec[i + 1].
This is repeated five times to generate stable results.
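A rough sketch of this arithmetic loop is shown below, with the operation passed in as a function object. The name `apply_to_neighbours` and the running sum used to keep the results observable are assumptions made for illustration; the library's own benchmark routine may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Apply a binary operation to every neighbouring pair, `repeats` times, and
// fold the results into a running sum so the compiler cannot discard the work.
template <typename T, typename Op>
T apply_to_neighbours(const std::vector<T>& vec, Op op, int repeats = 5)
{
    assert(repeats > 0);
    T sink {};
    for (int r = 0; r < repeats; ++r)
    {
        for (std::size_t i = 0; i + 1 < vec.size(); ++i)
        {
            sink += op(vec[i], vec[i + 1]);
        }
    }
    return sink;
}
```

Passing `std::plus<double>{}`, `std::minus<double>{}`, `std::multiplies<double>{}`, or `std::divides<double>{}` selects the operation, and wrapping the call between two `std::chrono::steady_clock::now()` readings yields a runtime in the same spirit as the figures reported in this section.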
<charconv>
Parsing and serializing numbers exactly is one of the key features of decimal floating point types, so we must compare the performance of <charconv>.
All of the following results compare against the STL-provided <charconv> for 20 million conversions.
Since <charconv> is fully implemented in software for each type, the performance gap between built-in float and double vs decimal32_t and decimal64_t is significantly smaller (or the decimal performance is better) than the hardware vs. software performance gap seen for basic operations.
To run these benchmarks yourself, you will need a compiler with a complete implementation of <charconv>, and you must run the benchmarks under C++17 or higher.
At the time of writing, this is limited to:
- GCC 11 or newer
- MSVC 19.24 or newer
These benchmarks are automatically disabled if your compiler does not provide a feature-complete <charconv> or if the language standard is set to C++14.
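To make the conversions being timed concrete, here is a minimal sketch of the standard library side only, using `std::to_chars` and `std::from_chars` on double. The wrapper names (`serialize_shortest`, `serialize_scientific`, `parse_double`) and the 64-byte buffer are assumptions for illustration; the sketch needs C++17 and a standard library with floating point <charconv> support.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Shortest representation that round-trips (the "shortest precision" case).
std::string serialize_shortest(double value)
{
    char buffer[64];
    const auto result = std::to_chars(buffer, buffer + sizeof(buffer), value);
    assert(result.ec == std::errc{});
    return std::string(buffer, result.ptr);
}

// Scientific format with a fixed number of digits after the decimal point.
std::string serialize_scientific(double value, int precision)
{
    char buffer[64];
    const auto result = std::to_chars(buffer, buffer + sizeof(buffer), value,
                                      std::chars_format::scientific, precision);
    assert(result.ec == std::errc{});
    return std::string(buffer, result.ptr);
}

// Parse the whole string; returns false on error or trailing characters.
bool parse_double(const std::string& text, double& out)
{
    const char* const last = text.data() + text.size();
    const auto result = std::from_chars(text.data(), last, out);
    return result.ec == std::errc{} && result.ptr == last;
}
```

A benchmark in this style would call one of these conversions once per element of a large input vector and time the whole loop, then repeat the measurement with the decimal types in place of double.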
x64 Linux
Run using an Intel i9-11900k chipset running Ubuntu 24.04 and Intel oneAPI compiler 2025.2.0 or GCC 13.3.0.
Comparisons
Intel Compiler:
| Type | Runtime (us) | Ratio to double |
| | 72,696 | 0.505 |
| | 143,924 | 1.000 |
| | 1,485,786 | 10.323 |
| | 1,653,991 | 11.492 |
| | 4,662,704 | 32.397 |
| | 619,662 | 4.305 |
| | 606,382 | 4.213 |
| | 698,945 | 4.856 |
| Intel | 2,411,294 | 16.754 |
| Intel | 3,158,422 | 21.945 |
| Intel | 3,389,883 | 23.553 |
GCC:
| Type | Runtime (us) | Ratio to double |
| | 56,457 | 0.916 |
| | 61,615 | 1.000 |
| | 1,404,638 | 22.797 |
| | 1,408,074 | 22.853 |
| | 4,974,170 | 80.730 |
| | 546,836 | 8.875 |
| | 472,387 | 7.667 |
| | 480,853 | 7.804 |
| GCC | 816,703 | 13.255 |
| GCC | 501,479 | 8.139 |
| GCC | 914,600 | 14.844 |
Addition
Intel Compiler:
| Type | Runtime (us) | Ratio to double |
| | 118,040 | 1.303 |
| | 90,579 | 1.000 |
| | 1,712,396 | 18.905 |
| | 1,575,893 | 17.398 |
| | 3,181,562 | 35.125 |
| | 729,257 | 8.051 |
| | 1,083,923 | 11.967 |
| | 1,367,004 | 15.092 |
| Intel | 1,242,797 | 13.721 |
| Intel | 1,689,585 | 18.653 |
| Intel | 1,958,345 | 21.620 |
GCC:
| Type | Runtime (us) | Ratio to double |
| | 79,256 | 1.085 |
| | 73,017 | 1.000 |
| | 1,501,645 | 20.566 |
| | 1,567,250 | 21.464 |
| | 4,609,413 | 63.128 |
| | 735,864 | 10.078 |
| | 1,002,119 | 13.724 |
| | 1,329,644 | 18.210 |
| GCC | 2,975,146 | 40.746 |
| GCC | 2,186,565 | 29.946 |
| GCC | 3,368,864 | 46.138 |
Subtraction
Intel Compiler:
| Type | Runtime (us) | Ratio to double |
| | 78,250 | 1.069 |
| | 73,193 | 1.000 |
| | 1,480,678 | 20.229 |
| | 1,371,677 | 18.741 |
| | 2,768,955 | 37.831 |
| | 1,040,587 | 14.217 |
| | 1,055,980 | 14.427 |
| | 1,212,405 | 16.564 |
| Intel | 1,922,108 | 26.261 |
| Intel | 1,793,879 | 24.509 |
| Intel | 2,397,372 | 32.754 |
GCC:
| Type | Runtime (us) | Ratio to double |
| | 275,230 | 0.936 |
| | 293,907 | 1.000 |
| | 1,451,610 | 4.939 |
| | 1,456,587 | 4.956 |
| | 4,332,644 | 14.742 |
| | 842,910 | 2.868 |
| | 968,939 | 3.297 |
| | 1,327,411 | 4.516 |
| GCC | 2,045,306 | 6.959 |
| GCC | 1,355,777 | 4.613 |
| GCC | 3,178,891 | 10.816 |
Multiplication
Intel Compiler:
| Type | Runtime (us) | Ratio to double |
| | 78,445 | 1.078 |
| | 72,798 | 1.000 |
| | 1,735,239 | 23.836 |
| | 2,272,739 | 31.220 |
| | 6,396,750 | 87.870 |
| | 993,256 | 13.644 |
| | 1,670,141 | 22.942 |
| | 5,959,977 | 81.870 |
| Intel | 1,375,434 | 18.894 |
| Intel | 2,052,278 | 28.191 |
| Intel | 5,964,489 | 81.932 |
GCC:
| Type | Runtime (us) | Ratio to double |
| | 76,238 | 1.161 |
| | 65,652 | 1.000 |
| | 1,703,365 | 25.945 |
| | 2,564,605 | 39.063 |
| | 7,115,514 | 108.382 |
| | 1,225,047 | 18.660 |
| | 1,904,509 | 29.009 |
| | 6,056,348 | 92.249 |
| GCC | 2,635,531 | 40.144 |
| GCC | 2,545,441 | 38.772 |
| GCC | 7,050,299 | 107.289 |
Division
Intel Compiler:
| Type | Runtime (us) | Ratio to double |
| | 100,799 | 0.971 |
| | 103,796 | 1.000 |
| | 2,134,312 | 20.563 |
| | 5,399,276 | 52.018 |
| | 10,012,578 | 96.464 |
| | 1,558,774 | 15.018 |
| | 1,597,873 | 15.394 |
| | 8,105,004 | 78.086 |
| Intel | 1,561,213 | 15.041 |
| Intel | 3,115,862 | 30.019 |
| Intel | 7,474,712 | 72.013 |
GCC:
| Type | Runtime (us) | Ratio to double |
| | 60,277 | 0.747 |
| | 80,676 | 1.000 |
| | 2,396,732 | 29.708 |
| | 4,021,720 | 49.850 |
| | 10,677,625 | 132.352 |
| | 1,083,011 | 13.424 |
| | 1,851,520 | 22.950 |
| | 8,121,160 | 100.664 |
| GCC | 5,082,812 | 63.002 |
| GCC | 3,005,153 | 37.250 |
| GCC | 10,257,437 | 130.490 |
from_chars
to_chars
General Format Shortest Precision
| Type | Runtime (us) | Ratio to double |
| | 2,920,036 | 0.850 |
| | 3,436,919 | 1.000 |
| | 4,136,631 | 1.204 |
| | 4,318,996 | 1.257 |
| | 14,624,180 | 4.255 |
| | 4,752,219 | 1.383 |
| | 4,382,014 | 1.275 |
| | 17,350,588 | 5.048 |
General Format 6 digits Precision
| Type | Runtime (us) | Ratio to double |
| | 5,541,073 | 0.969 |
| | 5,716,626 | 1.000 |
| | 3,527,433 | 0.617 |
| | 4,125,772 | 0.722 |
| | 6,967,211 | 1.219 |
| | 3,654,219 | 0.639 |
| | 3,386,125 | 0.592 |
| | 6,018,439 | 1.053 |
Scientific Format Shortest Precision
| Type | Runtime (us) | Ratio to double |
| | 2,841,569 | 0.827 |
| | 3,437,387 | 1.000 |
| | 2,564,053 | 0.750 |
| | 2,856,944 | 0.831 |
| | 12,147,039 | 3.534 |
| | 2,878,507 | 0.837 |
| | 2,933,273 | 0.853 |
| | 15,010,374 | 4.367 |
Scientific Format 6 digits Precision
| Type | Runtime (us) | Ratio to double |
| | 4,896,523 | 0.958 |
| | 5,112,924 | 1.000 |
| | 2,542,237 | 0.497 |
| | 3,119,552 | 0.610 |
| | 4,811,741 | 0.941 |
| | 2,890,023 | 0.565 |
| | 2,956,466 | 0.578 |
| | 5,476,431 | 1.071 |
x64 Windows
Run using an Intel i9-11900k chipset running Windows 11 and Visual Studio 17.14.10.
Comparisons
| Type | Runtime (us) | Ratio to double |
| | 191,653 | 1.028 |
| | 186,424 | 1.000 |
| | 2,391,863 | 12.830 |
| | 2,491,239 | 13.363 |
| | 16,643,031 | 89.275 |
| | 872,997 | 4.682 |
| | 793,997 | 4.259 |
| | 801,708 | 4.300 |
Addition
| Type | Runtime (us) | Ratio to double |
| | 76,777 | 0.961 |
| | 79,897 | 1.000 |
| | 2,902,356 | 36.326 |
| | 3,569,820 | 44.680 |
| | 12,075,529 | 151.139 |
| | 1,940,333 | 24.285 |
| | 3,064,073 | 38.350 |
| | 3,109,101 | 38.914 |
Subtraction
| Type | Runtime (us) | Ratio to double |
| | 336,960 | 1.042 |
| | 323,282 | 1.000 |
| | 3,040,167 | 9.404 |
| | 3,617,843 | 11.191 |
| | 12,325,962 | 38.128 |
| | 2,313,234 | 7.155 |
| | 2,935,476 | 9.080 |
| | 2,963,570 | 9.167 |
Multiplication
| Type | Runtime (us) | Ratio to double |
| | 78,634 | 1.000 |
| | 78,649 | 1.000 |
| | 2,636,784 | 33.526 |
| | 3,861,139 | 49.093 |
| | 11,349,378 | 144.304 |
| | 2,688,661 | 34.186 |
| | 3,504,172 | 44.554 |
| | 9,236,110 | 117.434 |
Division
| Type | Runtime (us) | Ratio to double |
| | 83,566 | 0.936 |
| | 89,317 | 1.000 |
| | 3,048,254 | 34.128 |
| | 3,282,819 | 36.755 |
| | 16,648,799 | 186.401 |
| | 2,059,743 | 23.061 |
| | 5,105,018 | 57.156 |
| | 11,587,763 | 129.737 |
from_chars
to_chars
General Format Shortest Precision
| Type | Runtime (us) | Ratio to double |
| | 3,181,029 | 0.826 |
| | 3,852,857 | 1.000 |
| | 5,242,934 | 1.361 |
| | 5,586,541 | 1.450 |
| | 13,955,214 | 3.622 |
| | 6,053,804 | 1.571 |
| | 7,957,278 | 2.065 |
| | 20,202,107 | 5.243 |
General Format 6 digits Precision
| Type | Runtime (us) | Ratio to double |
| | 6,111,231 | 0.949 |
| | 6,433,885 | 1.000 |
| | 4,605,311 | 0.716 |
| | 4,742,497 | 0.737 |
| | 12,372,901 | 1.923 |
| | 4,716,827 | 0.733 |
| | 4,861,975 | 0.756 |
| | 10,779,778 | 1.675 |
Scientific Format Shortest Precision
| Type | Runtime (us) | Ratio to double |
| | 3,107,509 | 0.773 |
| | 4,020,767 | 1.000 |
| | 3,428,517 | 0.853 |
| | 4,095,802 | 1.019 |
| | 11,577,791 | 2.879 |
| | 3,375,975 | 0.840 |
| | 4,427,563 | 1.101 |
| | 13,581,654 | 3.378 |
Scientific Format 6 digits Precision
| Type | Runtime (us) | Ratio to double |
| | 4,938,623 | 0.930 |
| | 5,309,818 | 1.000 |
| | 3,435,843 | 0.647 |
| | 3,682,980 | 0.694 |
| | 9,223,227 | 1.737 |
| | 3,379,702 | 0.637 |
| | 3,892,990 | 0.733 |
| | 10,158,657 | 1.913 |
ARM64 macOS
Run using a MacBook Pro with M4 Max chipset running macOS Sequoia 15.5 and Homebrew Clang 20.1.8.
Comparisons
| Type | Runtime (us) | Ratio to double |
| | 64,639 | 1.606 |
| | 40,255 | 1.000 |
| | 957,179 | 23.778 |
| | 897,409 | 22.293 |
| | 2,131,391 | 52.947 |
| | 380,892 | 9.462 |
| | 481,455 | 11.960 |
| | 465,461 | 11.563 |
Addition
| Type | Runtime (us) | Ratio to double |
| | 11,853 | 0.964 |
| | 12,295 | 1.000 |
| | 1,338,796 | 108.889 |
| | 1,231,462 | 100.160 |
| | 2,262,808 | 184.043 |
| | 608,660 | 49.505 |
| | 847,512 | 68.931 |
| | 1,030,662 | 83.827 |
Subtraction
| Type | Runtime (us) | Ratio to double |
| | 11,939 | 0.951 |
| | 12,551 | 1.000 |
| | 1,296,430 | 103.293 |
| | 1,180,456 | 94.053 |
| | 2,078,008 | 165.565 |
| | 817,989 | 65.173 |
| | 823,569 | 65.618 |
| | 993,447 | 79.153 |
Multiplication
| Type | Runtime (us) | Ratio to double |
| | 12,186 | 0.944 |
| | 12,914 | 1.000 |
| | 1,441,141 | 111.595 |
| | 2,117,061 | 163.935 |
| | 5,376,470 | 416.329 |
| | 923,346 | 71.500 |
| | 1,766,419 | 136.783 |
| | 5,463,675 | 423.082 |
Division
| Type | Runtime (us) | Ratio to double |
| | 12,576 | 0.722 |
| | 17,145 | 1.000 |
| | 1,732,611 | 101.056 |
| | 3,558,094 | 207.529 |
| | 8,985,521 | 524.090 |
| | 1,075,184 | 62.711 |
| | 2,027,533 | 118.258 |
| | 7,583,016 | 442.287 |
from_chars
to_chars
General Format Shortest Precision
| Type | Runtime (us) | Ratio to double |
| | 2,223,891 | 0.882 |
| | 2,520,203 | 1.000 |
| | 2,983,523 | 1.184 |
| | 3,348,702 | 1.329 |
| | 8,899,289 | 3.531 |
| | 3,383,567 | 1.343 |
| | 3,436,470 | 1.364 |
| | 12,509,443 | 4.964 |
General Format 6 digits Precision
| Type | Runtime (us) | Ratio to double |
| | 4,664,538 | 0.948 |
| | 4,915,699 | 1.000 |
| | 2,570,339 | 0.523 |
| | 3,309,343 | 0.673 |
| | 5,962,030 | 1.212 |
| | 2,213,792 | 0.450 |
| | 3,067,584 | 0.624 |
| | 6,006,157 | 1.222 |
Scientific Format Shortest Precision
| Type | Runtime (us) | Ratio to double |
| | 2,119,538 | 0.848 |
| | 2,500,900 | 1.000 |
| | 1,757,416 | 0.703 |
| | 2,187,911 | 0.875 |
| | 6,976,380 | 2.790 |
| | 1,739,069 | 0.695 |
| | 2,060,848 | 0.824 |
| | 12,509,443 | 5.002 |
Scientific Format 6 digits Precision
| Type | Runtime (us) | Ratio to double |
| | 4,157,977 | 0.933 |
| | 4,457,878 | 1.000 |
| | 1,764,018 | 0.395 |
| | 2,625,621 | 0.589 |
| | 4,060,487 | 0.911 |
| | 1,728,473 | 0.388 |
| | 2,734,955 | 0.614 |
| | 5,300,774 | 1.189 |