rieMiner/Benchmarks

This page shows some rieMiner benchmark results to help compare different processors, give an idea of how to tune the parameters, and highlight some observations about current Riecoin mining.

Ratios and blocks per day

As mentioned in the benchmarking guide and the Stella page, the ratio is an essential metric of Riecoin mining; the candidates/s metric alone usually does not mean much. Due to how the mining algorithm is constructed, the ratio $r$ can actually be computed using the formula

$$r = D \ln(2) \prod_{p \leq \mathrm{PTL}} \left(1 - \frac{1}{p}\right)$$

where $D$ is the Difficulty (searched numbers will be around $2^D$), $\mathrm{PTL}$ is the Prime Table Limit, and the product runs over the primes $p$ up to $\mathrm{PTL}$.

It is not obvious in normal circumstances that the ratios between the $k$-tuple and $(k+1)$-tuple counts or rates are the same for any $k$, though the tendency may be observed after long mining sessions or when generating very large numbers of tuples in a benchmark.

Here are some values of the product for various PrimeTableLimits.

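| PrimeTableLimit | Product |
|---|---|
| $2^{35}$ | 0.0231432770 |
| $2^{34}$ | 0.0238239564 |
| $2^{33}$ | 0.0245458897 |
| $2^{32}$ | 0.0253129494 |
| $2^{31}$ | 0.0261294878 |
| $2^{30}$ | 0.0270004472 |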
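The following Python sketch shows how a calculated ratio can be obtained from one of the product values above. It assumes that the ratio is $r = D \ln(2) \cdot \text{product}$, which reproduces the r* values used in the benchmarks below (18.546 at Difficulty 1024 and 37.093 at Difficulty 2048 for the product 0.0261294878). The function names are hypothetical, and the Mertens approximation $e^{-\gamma} / \ln(\mathrm{PTL})$ is only added as a plausibility check, not as the way the table was produced.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def calculated_ratio(difficulty: float, product: float) -> float:
    """Assumed formula: r = D * ln(2) * prod_{p <= PTL} (1 - 1/p)."""
    return difficulty * math.log(2) * product


def mertens_product(prime_table_limit: float) -> float:
    """Approximate prod_{p <= PTL} (1 - 1/p) with Mertens' third theorem."""
    return math.exp(-EULER_GAMMA) / math.log(prime_table_limit)


if __name__ == "__main__":
    product = 0.0261294878  # value copied from the table
    print(calculated_ratio(1024, product))  # about 18.546
    print(calculated_ratio(2048, product))  # about 37.093
    print(mertens_product(2 ** 31))         # close to the table value
```

The approximation is convenient for estimating the ratio at a Prime Table Limit that is not listed, at the cost of a small error for low limits.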

Calculated ratios will be used in the benchmarks below. The blocks/day is then given by

$$\text{blocks/day} = \frac{86400 \, c}{r^n}$$

where $c$ is the candidates/s, $r$ the ratio, and $n$ the constellation length, 7 since the second fork.
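As a minimal sketch, the blocks/day estimate can be written as a small Python function. It assumes the relation blocks/day = 86400 · (candidates/s) / ratio^length, which matches the b/d columns of the tables on this page; the function name and its parameters are hypothetical.

```python
SECONDS_PER_DAY = 86400


def blocks_per_day(candidates_per_s: float, ratio: float, length: int = 7) -> float:
    """Expected blocks per day, assuming one candidate in ratio**length is a full tuple."""
    return SECONDS_PER_DAY * candidates_per_s / ratio ** length


if __name__ == "__main__":
    # Ryzen 3700X row of the processor table: 19385.4 c/s, r* = 18.546, 7-tuples.
    print(blocks_per_day(19385.4, 18.546))        # about 2.219
    # 5-tuple row of the constellation pattern table: 3778.6 c/s, r* = 37.093.
    print(blocks_per_day(3778.6, 37.093, 5))      # about 4.649
```

Because the ratio is raised to the constellation length, a small change in the ratio outweighs a much larger change in candidates/s.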

Results

Unless mentioned otherwise, an AMD Ryzen R7 3700X was used for the benchmarks with all 16 threads and default settings; the constellation pattern is 0, 2, 4, 2, 4, 6, 2 (7-tuples, given as successive offsets). The benchmarks were run during a Debian 10 Live USB session, and rieMiner was recompiled for the machine during the live session just before the benchmarks.
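The default pattern is written here as successive offsets (0, 2, 4, 2, 4, 6, 2), while the constellation pattern table further down lists cumulative offsets (0, 2, 6, 8, 12, 18, 20). The short Python sketch below converts between the two notations, assuming that this is indeed how they relate; the helper names are hypothetical.

```python
from itertools import accumulate


def offsets_to_cumulative(offsets):
    """Turn successive offsets into offsets relative to the first prime."""
    return list(accumulate(offsets))


def cumulative_to_offsets(pattern):
    """Inverse conversion: differences between consecutive entries."""
    return [pattern[0]] + [b - a for a, b in zip(pattern, pattern[1:])]


if __name__ == "__main__":
    print(offsets_to_cumulative([0, 2, 4, 2, 4, 6, 2]))
    print(cumulative_to_offsets([0, 2, 6, 8, 12, 18, 20]))
```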

Different processors

Here are benchmarks with different CPUs.

  • Difficulty 1024;
  • 150 s Block Interval, during 16 minutes;
  • Prime Table Limit $2^{31}$. By default, 1 Sieve Worker and 25 Sieve Bits;
  • The calculated ratio is used, $r^* \approx 18.546$.

The turbo/boost features were disabled and the CPU always ran at the mentioned frequency.

$a$ is a normalized metric: the candidates/s without HT/SMT divided by the number of cores and by the frequency in GHz. The result can be interpreted as the architecture performance (speed of a single core at 1 GHz for this benchmark). This number is useful for building Riecoin profitability calculators like this one, as processors sharing the same architecture should have a similar $a$. The list is sorted by decreasing c/s.
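The normalized metric can be sketched in a few lines of Python, assuming it is simply the candidates/s measured without HT/SMT divided by the core count and the clock in GHz. The function names are hypothetical; the sample values are taken from the remarks of the processor table.

```python
def normalized_metric(cps_without_smt: float, cores: int, ghz: float) -> float:
    """Candidates/s of a single core at 1 GHz (the 'a' column)."""
    return cps_without_smt / cores / ghz


def smt_speedup(cps_with_smt: float, cps_without_smt: float) -> float:
    """Gain obtained by enabling HT/SMT."""
    return cps_with_smt / cps_without_smt


if __name__ == "__main__":
    # Ryzen 3700X: 14897.8 c/s with 8 threads on 8 cores at 4 GHz.
    print(normalized_metric(14897.8, 8, 4.0))   # about 465.6
    print(smt_speedup(19385.4, 14897.8))        # about 1.301
```

An estimate for another processor of the same architecture would then be the metric multiplied by its core count, its frequency in GHz and the assumed SMT speedup.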

Lines with darker background are benchmarks done with actual hardware. Others were extrapolated.

Don't compare these values with the ones that you currently obtain while mining! To compare your CPU, you must run the Benchmark Mode in the same conditions as these benchmarks (see above)!

| Processor (memory) | Architecture | c/s | r* | b/d | a | Remarks or specific parameters |
|---|---|---|---|---|---|---|
| AMD Ryzen R9 5950X @ 4 GHz (DDR4 3200 CL14) | Zen 3 | 46137.3 | 18.546 | 5.282 | 554.0 | Extrapolated from 3700X using 19% IPC improvement over Zen 2. 35456.6 c/s extrapolated without SMT (speedup 1.301x). |
| Intel Core i9-10900K @ 4 GHz (DDR4 3200 CL14) | Skylake | 21162.5 | 18.546 | 2.422 | 472.4 | Extrapolated using old rieMiner benchmarks for 6700K. HT speedup assumed to be 1.12x (18895.1 c/s). |
| AMD Ryzen R7 3700X @ 4 GHz (DDR4 3200 CL14) | Zen 2 | 19385.4 | 18.546 | 2.219 | 465.6 | 4 Sieve Workers. 14897.8 c/s for 8 Threads (3 Sieve Workers), meaning that the SMT speedup is about 1.301x. |
| AMD Ryzen R7 2700X @ 4 GHz (DDR4 3200 CL14) | Zen+ | 16446.4 | 18.546 | 1.882 | 395.0 | Extrapolated from 3700X using old rieMiner benchmarks. 12639.2 c/s extrapolated without SMT (speedup 1.301x). |
| AMD Ryzen R7 1800X @ 4 GHz (DDR4 3200 CL14) | Zen | 15663.2 | 18.546 | 1.793 | 376.2 | Extrapolated from 2700X assuming 5% IPC improvement over Zen. 12037.3 c/s extrapolated without SMT (speedup 1.301x). |
| Intel Core i7-5775C @ 4 GHz (DDR3 1600 CL8) | Broadwell | 7614.8 | 18.546 | 0.872 | 427.5 | 2 Sieve Workers. 6839.5 c/s for 4 Threads (1 Sieve Worker), meaning that the HyperThreading speedup is about 1.113x. |
| Intel Core i7-4790K @ 4 GHz (DDR3 1600 CL8) | Haswell | 6406.5 | 18.546 | 0.733 | 369.1 | 2 Sieve Workers. 5905.0 c/s for 4 Threads (1 Sieve Worker), meaning that the HyperThreading speedup is about 1.085x. |
| Intel Core i7-3770K @ 4 GHz (DDR3 1600 CL8) | Ivy Bridge | 5910.4 | 18.546 | 0.677 | 327.9 | 2 Sieve Workers. 5245.7 c/s for 4 Threads (1 Sieve Worker), meaning that the HyperThreading speedup is about 1.127x. |
| Intel Core i7-2700K @ 4 GHz (DDR3 1600 CL8) | Sandy Bridge | 5628.9 | 18.546 | 0.644 | 312.2 | Extrapolated from 3770K assuming 5% IPC improvement over Sandy Bridge. 4995.9 c/s extrapolated without HT (speedup 1.127x). |
| Intel Core i7-875K @ 4 GHz (DDR3 1600 CL8) | Nehalem | 4690.8 | 18.546 | 0.537 | 261.8 | Extrapolated from 2700K assuming 20% IPC improvement over Nehalem. HT speedup assumed to be 1.12x (4188.2 c/s). |
| Intel Core 2 Quad QX9650 @ 4 GHz (DDR3 1600 CL8) | Core 2 | 3707.1 | 18.546 | 0.424 | 231.7 | |
| Broadcom BCM2711 @ 1.6 GHz | Cortex-A72 | 918.1 | 18.546 | 0.105 | 143.5 | Raspberry Pi 4, rieMinerL, Raspberry Pi OS 64 bits, 23 Sieve Bits, 24 Sieve Iterations |
| Intel Pentium D 925 @ 4 GHz (DDR3 1067 CL7) | Netburst | 429.9 | 18.546 | 0.0492 | 53.7 | 24 Sieve Bits |

Different memory speeds

Memory speed does not matter much, despite rieMiner using a lot of memory: with much worse frequency and latency (DDR4 2400 CL18 instead of 3200 CL14), mining is only about 3% slower.

  • Difficulty 1024;
  • PrimeTableLimit $2^{31}$, 4 Sieve Workers, 150 s Block Interval, during 16 minutes;
  • The calculated ratio is used, $r^* \approx 18.546$.

| Memory Speed | c/s | r* | b/d |
|---|---|---|---|
| DDR4 3200 CL14 | 19385.4 | 18.546 | 2.219 |
| DDR4 3200 CL18 | 19025.1 | 18.546 | 2.178 |
| DDR4 2400 CL14 | 19011.2 | 18.546 | 2.176 |
| DDR4 2400 CL18 | 18794.4 | 18.546 | 2.152 |

The prime table generation is more sensitive to memory performance (especially the frequency).

| Memory Speed | Prime table generation time (s) |
|---|---|
| DDR4 3200 CL14 | 5.37404 |
| DDR4 3200 CL18 | 5.63299 |
| DDR4 2400 CL14 | 6.31868 |
| DDR4 2400 CL18 | 6.55031 |

Different Difficulties

The notable observation is that the ratio is proportional to the difficulty and follows the formula above. The benchmarks also give an idea of how the candidates/s metric depends on the difficulty, though the relation is difficult to establish. It can be approximated by assuming that the candidates/s is inversely proportional to a power $D^x$ of the difficulty, with $x$ between about 2.3 and 2.6 in the measurements below (a fixed exponent is used in the Riecoin protocol).

  • Benchmark Mode;
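  • PrimeTableLimit $2^{31}$, 1 Sieve Worker (except for 1024 where 4 are needed), no blocks, during 15 minutes;
  • r is the actual ratio, r* the calculated ratio, the latter is used to calculate the blocks/day;
  • The inverse c/s factor is the c/s at difficulty 1024 divided by the c/s at the given difficulty; the value in parentheses is the exponent $x$ such that $(D/1024)^x$ equals this factor.

| Difficulty | c/s | r | r* | b/d | Inverse c/s factor ($x$) |
|---|---|---|---|---|---|
| 8192 | 100.2 | 156.58 | 148.370 | 0.00000000547 | 197.537 (2.542) |
| 6144 | 205.0 | 111.10 | 111.278 | 0.0000000839 | 96.541 (2.551) |
| 4096 | 561.4 | 74.04 | 74.185 | 0.00000392 | 35.260 (2.570) |
| 3072 | 1256.7 | 56.47 | 55.639 | 0.0000658 | 15.751 (2.509) |
| 2048 | 3703.0 | 37.10 | 37.093 | 0.00331 | 5.346 (2.418) |
| 1536 | 7909.3 | 27.75 | 27.819 | 0.0530 | 2.503 (2.263) |
| 1024 | 19795.2 | 18.54 | 18.546 | 2.266 | 1.000 |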

For example, mining at difficulty 3072 is about 34 000 times harder than at difficulty 1024 (ratio of the two blocks/day values).
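The figures in this section can be rechecked with a short Python sketch. It assumes that the inverse c/s factor is the c/s at the reference difficulty divided by the c/s at the given difficulty, that the exponent in parentheses is $\log(\text{factor}) / \log(D / D_{ref})$, and that the ratio is proportional to the difficulty so that the hardness grows as $(D / D_{ref})^7$ times the factor for 7-tuples. All function names are hypothetical.

```python
import math


def inverse_cps_factor(cps: float, cps_ref: float) -> float:
    """How many times slower the candidate rate is than at the reference difficulty."""
    return cps_ref / cps


def difficulty_exponent(d: float, cps: float, d_ref: float, cps_ref: float) -> float:
    """Exponent x such that (d / d_ref) ** x equals the inverse c/s factor."""
    return math.log(inverse_cps_factor(cps, cps_ref)) / math.log(d / d_ref)


def relative_hardness(d: float, cps: float, d_ref: float, cps_ref: float,
                      length: int = 7) -> float:
    """Ratio of expected blocks/day at d_ref versus d, with the ratio proportional to d."""
    return (d / d_ref) ** length * inverse_cps_factor(cps, cps_ref)


if __name__ == "__main__":
    # Values taken from the rows for difficulties 8192, 3072 and 1024.
    print(difficulty_exponent(8192, 100.2, 1024, 19795.2))   # about 2.54
    print(relative_hardness(3072, 1256.7, 1024, 19795.2))    # roughly 34 000
```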

Different Prime Table Limits

These benchmarks highlight the importance of the PrimeTableLimit parameter and that it is important to not just look at the candidates/s metric. They were run at Difficulty 2048 as there is no CPU Underuse with only 1 Sieve Worker in every case. The higher the PrimeTableLimit is, the lower is the ratio, but also the candidates per second.

  • Difficulty 2048;
  • 1 Sieve Worker, no blocks, during 15 minutes;
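  • r is the actual ratio, r* the calculated ratio, the latter is used to calculate the blocks/day.

| PrimeTableLimit | c/s | r | r* | b/d |
|---|---|---|---|---|
| $2^{34}$ | 3334.6 | 33.95 | 33.820 | 0.005693 |
| $2^{33}$ | 3536.9 | 34.89 | 34.844 | 0.004900 |
| $2^{32}$ | 3641.8 | 35.96 | 35.933 | 0.004068 |
| $2^{31}$ | 3703.7 | 37.10 | 37.093 | 0.003312 |
| $2^{30}$ | 3738.7 | 38.38 | 38.329 | 0.002658 |
| $2^{24}$ | 3806.8 | 47.81 | 47.911 | 0.000567 |
| $2^{16}$ | 3843.1 | 71.99 | 71.849 | 0.000033 |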

Despite the candidates/s being lower at higher Prime Table Limits, the blocks/day are better.
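To illustrate why the candidates/s metric alone is misleading, the Python sketch below recomputes the blocks/day for each row of the table (in the same order) and compares the two rankings. It assumes blocks/day = 86400 · (candidates/s) / ratio^7; the names are hypothetical and the data are the c/s and r* columns copied from the table.

```python
SECONDS_PER_DAY = 86400

# (candidates/s, calculated ratio r*) for each row of the table, in order.
ROWS = [
    (3334.6, 33.820),
    (3536.9, 34.844),
    (3641.8, 35.933),
    (3703.7, 37.093),
    (3738.7, 38.329),
    (3806.8, 47.911),
    (3843.1, 71.849),
]


def blocks_per_day(cps: float, ratio: float, length: int = 7) -> float:
    """Expected blocks per day for a given candidate rate and ratio."""
    return SECONDS_PER_DAY * cps / ratio ** length


def best_rows(rows):
    """Return (index of the highest c/s, index of the highest blocks/day)."""
    best_cps = max(range(len(rows)), key=lambda i: rows[i][0])
    best_bpd = max(range(len(rows)), key=lambda i: blocks_per_day(*rows[i]))
    return best_cps, best_bpd


if __name__ == "__main__":
    for cps, ratio in ROWS:
        print(f"{cps:8.1f} c/s  r* = {ratio:7.3f}  b/d = {blocks_per_day(cps, ratio):.6f}")
    print(best_rows(ROWS))
```

With these numbers, the row with the highest candidates/s is the last one, while the row with the highest blocks/day is the first one.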

Different Constellation Patterns

  • Difficulty 2048;
  • No blocks, during 15 minutes;
  • Prime Table Limit $2^{31}$. By default, 1 Sieve Worker and 25 Sieve Bits;
  • The calculated ratio is used, $r^* \approx 37.093$.

| Length | Pattern | c/s | r* | b/d | Remarks |
|---|---|---|---|---|---|
| 5 | 0, 2, 6, 8, 12 | 3778.6 | 37.093 | 4.649 | |
| 6 | 0, 4, 6, 10, 12, 16 | 3767.9 | 37.093 | 0.125 | |
| 7 | 0, 2, 6, 8, 12, 18, 20 | 3703.7 | 37.093 | 0.00331 | |
| 8 | 0, 2, 6, 8, 12, 18, 20, 26 | 3534.3 | 37.093 | 0.0000852 | |
| 9 | 0, 2, 6, 8, 12, 18, 20, 26, 30 | 3002.0 | 37.093 | 0.00000195 | 3 Sieve Workers |