rieMiner/Benchmarking and Tuning

This page explains how to tune rieMiner's parameters to get the best performance, and how to benchmark and compare results. We assume that you have already read one of the other rieMiner guides that explain how to mine or find a record.

Benchmarking

Comparing Riecoin mining performance is relatively difficult. Here is what you should know before comparing performance or tuning the settings.

Some benchmarks are available to give an idea of how a given computer should perform and to illustrate the remarks below.

Metrics

In rieMiner, the performance is described by two metrics:

  • The candidates/s: how many candidates (numbers that could be the first member of a prime constellation) are generated and tested every second. Higher is better. The higher the Difficulty, the lower the candidates/s;
  • The ratio: the ratio of candidates tested to prime numbers found. Lower is better, because it means that you will find more blocks for the same candidates/s. The higher the Difficulty, the larger the ratio (it is proportional to the Difficulty).

If you are looking for $k$-tuples, you can calculate the $k$-tuple find rate (tuples per second) by doing $\frac{c}{r^k}$, where $c$ is the candidates/s and $r$ is the ratio. So, $\frac{86400\,c}{r^k}$ will give the estimated average number of $k$-tuples every day. This is the relevant metric for comparing performance. Computing the inverse of the find rate, $\frac{r^k}{c}$, gives the average time in seconds to find a $k$-tuple. This is how the estimated time to find a block is calculated in rieMiner.

That means, in general, don't just consider the candidates/s metric! If it is lower after changing a setting (in particular, the PrimeTableLimit), it does not always mean that the mining performance was reduced. You must look at the $k$-tuple rate or the average time to find one instead. Similarly, a lower candidates/s at a higher Difficulty does not mean that the mining performance is lower.
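As a minimal sketch of the comparison described above, the following Python snippet computes the tuple find rate as the candidates/s divided by the ratio raised to the tuple length. The function names and the two sets of readings are hypothetical and made up for illustration only; they are not rieMiner output.

```python
def tuple_rate(candidates_per_s: float, ratio: float, k: int) -> float:
    """Estimated number of k-tuples found per second: c / r^k."""
    return candidates_per_s / ratio ** k


def seconds_per_tuple(candidates_per_s: float, ratio: float, k: int) -> float:
    """Estimated average time (in seconds) to find one k-tuple: r^k / c."""
    return ratio ** k / candidates_per_s


if __name__ == "__main__":
    k = 6  # assumed tuple length, adapt it to your constellation pattern
    # Made-up readings before and after raising the PrimeTableLimit.
    readings = {
        "before": {"cps": 5000.0, "ratio": 30.0},
        "after": {"cps": 4000.0, "ratio": 28.0},
    }
    for name, m in readings.items():
        per_day = 86400 * tuple_rate(m["cps"], m["ratio"], k)
        hours = seconds_per_tuple(m["cps"], m["ratio"], k) / 3600
        print(f"{name}: {per_day:.4f} tuples/day, {hours:.1f} h per tuple")
```

With these illustrative numbers, the "after" setting has a lower candidates/s but a higher tuple rate, which is exactly the situation where looking at the candidates/s alone would be misleading.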

There are some specific situations where it is enough to consider the candidates/s. This is the case if you can guarantee that the ratio and the Difficulty are always the same across the different benchmarks.

Convergence

The performance metrics take some time to converge, so don't draw conclusions about the performance too quickly! Test actual mining or use the Benchmark Mode for 10-20 minutes or more. Testing for a couple of minutes will in general not be enough. Note that if you test during mining and more blocks are found, this will reduce the candidates/s a bit, so you may want to take this into account when comparing the metrics.
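One way to judge convergence less subjectively is to note a metric (for example the estimated time to find a block) at regular intervals and check whether the latest readings have stopped moving. The sketch below is only an illustration: the helper name, the window of 5 readings and the 2 % tolerance are arbitrary assumptions, not something rieMiner provides.

```python
def has_converged(samples, window=5, tolerance=0.02):
    """Return True if the last `window` samples all lie within
    `tolerance` (relative deviation) of their own mean."""
    if len(samples) < window:
        return False  # not enough readings yet
    recent = samples[-window:]
    mean = sum(recent) / window
    if mean == 0:
        return False  # relative deviation is undefined
    return all(abs(s - mean) / abs(mean) <= tolerance for s in recent)


if __name__ == "__main__":
    # Made-up readings taken once per minute (arbitrary units).
    readings = [10.0, 5.0, 3.0, 2.0, 2.01, 1.99, 2.0, 2.0]
    print(has_converged(readings))
```

A stable window is only a hint: the 10-20 minutes recommended above remain the safer baseline.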

Benchmark Mode

You should use this mode to compare the performance of different computers or settings. Indeed, measuring performance during mining is subject to the random occurrence of blocks, which, as said above, affects the performance. The Benchmark Mode allows you to do "dummy mining" under reproducible conditions and to compare performance more easily.

Here is a template for the Benchmark Mode, for a benchmark at Difficulty 1024 lasting 16 minutes, with a block appearing every 150 s.

Mode = Benchmark
Difficulty = 1024
BenchmarkBlockInterval = 150
BenchmarkTimeLimit = 960
BenchmarkPrimeCountLimit = 0
# ConstellationPattern = 0, 4, 2, 4, 2, 4
# PrimorialNumber = 40

You must reproduce the current mining conditions: use the current Difficulty and, if you are mining before the second Fork or in Testnet, an appropriate constellation pattern. You should also use the same PrimorialNumber as the one used in mining (the guessed value is slightly different between the modes).

The Search Mode is an alternative for benchmarking, but it is less reproducible and does not provide dummy blocks. Conversely, do not use the Benchmark Mode to find new records!
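When running many benchmarks, it can be convenient to generate the template from a few parameters. The following Python sketch only builds the text of the options listed in the template above; the function name and its arguments are hypothetical, and it is up to you to save the result in your configuration file.

```python
def benchmark_config(difficulty, minutes, block_interval_s=150,
                     primorial_number=None, constellation_pattern=None):
    """Build the text of a Benchmark Mode configuration.

    Only option names shown in the template above are used.
    """
    lines = [
        "Mode = Benchmark",
        f"Difficulty = {difficulty}",
        f"BenchmarkBlockInterval = {block_interval_s}",
        f"BenchmarkTimeLimit = {minutes * 60}",  # the limit is in seconds
        "BenchmarkPrimeCountLimit = 0",
    ]
    if constellation_pattern is not None:
        offsets = ", ".join(str(o) for o in constellation_pattern)
        lines.append(f"ConstellationPattern = {offsets}")
    if primorial_number is not None:
        lines.append(f"PrimorialNumber = {primorial_number}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Same values as the template: Difficulty 1024 during 16 minutes.
    print(benchmark_config(1024, 16, primorial_number=40))
```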

Relevant configuration options

The options that can affect the mining performance are

  • PrimeTableLimit;
  • SieveWorkers;
  • SieveBits;
  • SieveIterations.

The Threads option can also be set to limit the number of threads used, if wanted.

Here is a template (to append to the templates from the other guides or the Benchmark template above).

Threads = 0
PrimeTableLimit = 0
SieveWorkers = 0
SieveBits = 0
SieveIterations = 0

You can learn what these settings actually mean by reading the mining algorithm explanation.

0 is a special value that makes rieMiner pick an initial but rough guess. Start the miner once with the automatic settings and note the guessed values, which are shown at startup. Then, use these values as starting points, tune the parameters as explained below, and progressively fill the configuration file with manual values.

PrimeTableLimit and SieveWorkers

These are the main parameters for rieMiner tuning. Generally:

  • A higher PrimeTableLimit is better up to a certain point, though increasing it will also increase the memory usage and may cause CPU Underuse. When the PrimeTableLimit is increased, the candidates/s metric will be lower, but so will the ratio. So, don't assume that the mining is slower because of a lower candidates/s: you must use the estimated time to find a block instead, as explained above;
  • Fewer SieveWorkers is better, as more of them will increase the memory usage and reduce the candidates/s a bit. However, there is a required minimum, as not having enough SieveWorkers will cause CPU Underuse.

To tune them, first look at your CPU usage during mining. It should be maxed out most of the time. If not, then you are experiencing CPU Underuse. You can check this with, for example, the CPU usage graph of the Windows 10 Task Manager.

If there is no CPU Underuse, try both of the following, in no particular order (you can rely on your intuition after a few tries):

  • If you have free memory available, increase the PrimeTableLimit until you get some CPU Underuse, run out of memory, or lose performance;
  • Try to decrease the SieveWorkers until there is CPU Underuse.

If there is CPU Underuse, do the opposite: decrease the PrimeTableLimit or increase the SieveWorkers.

Repeat the process until you feel that the settings are optimal. In all cases, it is trial and error, and there is no precise amount by which to increase or decrease. Multiply or divide the PrimeTableLimit by something like 1.1, 1.5, 2, 3 or another factor. However, you should vary the SieveWorkers only in steps of 1 or 2.
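The trial-and-error procedure above can be sketched as a small helper that lists the next values worth trying. This is only an illustration of the rules of thumb from this section: the function name is hypothetical, the factors and steps are the ones suggested above, and the cap of 16 is the default SieveWorkers limit mentioned in the remarks for record attempters. Each suggestion still has to be benchmarked by hand.

```python
def next_candidates(prime_table_limit, sieve_workers, cpu_underuse,
                    factors=(1.1, 1.5, 2, 3), worker_steps=(1, 2),
                    max_workers=16):
    """Suggest the next PrimeTableLimit and SieveWorkers values to try."""
    if cpu_underuse:
        # CPU Underuse: lower the PrimeTableLimit, add SieveWorkers.
        limits = [int(prime_table_limit / f) for f in factors]
        workers = [sieve_workers + s for s in worker_steps
                   if sieve_workers + s <= max_workers]
    else:
        # CPU fully used: raise the PrimeTableLimit, remove SieveWorkers.
        limits = [int(prime_table_limit * f) for f in factors]
        workers = [sieve_workers - s for s in worker_steps
                   if sieve_workers - s >= 1]
    return {"PrimeTableLimit": limits, "SieveWorkers": workers}


if __name__ == "__main__":
    # Made-up starting point, e.g. values guessed by the miner.
    print(next_candidates(100_000_000, 3, cpu_underuse=False))
    print(next_candidates(100_000_000, 3, cpu_underuse=True))
```

Try one change at a time and compare the estimated time to find a block, not just the candidates/s, before keeping it.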

Other parameters

  • SieveBits: higher is better up to a certain point, but 25 is normally already a good value. If you have a CPU with less than 8 MiB of L3 Cache, or use a lot of SieveWorkers (more than 4), you can try to decrease it. If you have a lot of L3 Cache (for example with a server CPU), you may also try 26;
  • SieveIterations: 16 is normally a good value and you should not have to touch it. It is unclear how this option affects performance. You can try to change the value a bit and see if there is any improvement. Smaller values will reduce memory usage.

If you change these values, you should try to retune the PrimeTableLimit and SieveWorkers to see if you can still gain more performance.

Remarks for record attempters

The instructions above are also valid for those using the Search Mode or mining for records. Here are a few additional remarks for these cases:

  • The longer the constellation pattern, the lower the PrimeTableLimit should be. While it can be much higher for 5-tuples and shorter, it should not exceed a few million or tens of millions for 10-tuples (at Difficulty ~540) and 9-tuples (~725), for example;
  • Longer tuples will also usually require a lot of SieveWorkers, so don't be surprised if you need to raise this number a lot. However, by default you cannot use more than 16 SieveWorkers. If you need more, you will have to manually add some PrimorialOffsets in the options, though in that case you should rather look for shorter tuples.