Advanced topics
The NNUE evaluation
This approach assigns a value to a position that is then used in the alpha-beta (PVS) search to find the best move. The NNUE evaluation computes this value with a neural network based on basic inputs (e.g. piece positions only). The network is optimized and trained on the evaluations of millions of positions at moderate search depth.
The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward. It can be evaluated efficiently on CPUs, and exploits the fact that only parts of the neural network need to be updated after a typical chess move. The nodchip repository provided the first version of the needed tools to train and develop the NNUE networks. Today, more advanced training tools are available in the nnue-pytorch repository, while data generation tools are available in a dedicated branch.
On CPUs supporting modern vector instructions (avx2 and similar), the NNUE evaluation results in much stronger play, even though the number of nodes per second computed by the engine is somewhat lower (roughly 50% of the nps is typical).
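If you are not sure whether your CPU provides these vector instructions, a quick check on Linux is to read the flags from /proc/cpuinfo (a minimal sketch; the flag names are the ones exposed by the kernel):

```bash
# List the relevant SIMD/bit-manipulation flags advertised by the CPU
# (Linux only); a flag missing from the output means that extension is not available.
grep -o -E 'avx2|bmi2|avx512[a-z_0-9]*' /proc/cpuinfo | sort -u
```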
Notes:
- The NNUE evaluation depends on the Pikafish binary and the network parameter file (see the EvalFile UCI option). Not every parameter file is compatible with a given Pikafish binary, but the default value of the EvalFile UCI option is the name of a network that is guaranteed to be compatible with that binary.
- To use the NNUE evaluation, the additional data file with the neural network parameters needs to be available. The filename for the default (recommended) net can be found as the default value of the EvalFile UCI option, with the format `pikafish.nnue`. This file can be downloaded from http://test.pikafish.org. A minimal example of pointing the engine at a network file is shown below.
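As an illustration, the network file can be selected at run time with the standard UCI setoption command. A minimal session looks like this (the binary path ./pikafish and the file name pikafish.nnue are placeholders; the file must match the binary as noted above):

```bash
# Send a short UCI session to the engine: load a network file, then run
# the built-in bench to confirm the engine starts and evaluates with it.
./pikafish << EOF
uci
setoption name EvalFile value ./pikafish.nnue
isready
bench
quit
EOF
```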
Large pages
Pikafish supports large pages on Linux and Windows. Large pages make the hash access more efficient, improving the engine speed, especially with large hash sizes.
The support is automatic: Pikafish attempts to use large pages when available and falls back to regular memory allocation when this is not the case.
Typical increases are 5-10% in terms of nodes per second, but speed increases up to 30% have been measured.
Large page support on Linux is provided by the kernel's transparent huge pages functionality. Typically, transparent huge pages are already enabled, and no configuration is needed.
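You can verify this through sysfs (assuming a standard kernel with transparent huge pages compiled in; the active mode is the one shown in brackets):

```bash
# Show the current transparent huge pages mode, e.g. "[always] madvise never".
cat /sys/kernel/mm/transparent_hugepage/enabled
```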
On Windows, the use of large pages requires the "Lock Pages in Memory" privilege. See Enable the Lock Pages in Memory Option (Windows) on how to enable this privilege, then run RAMMap to double-check that large pages are used.
We suggest that you reboot your computer after you have enabled large pages, because long Windows sessions suffer from memory fragmentation, which may prevent Pikafish from getting large pages: a fresh session is better in this regard.
The "speed of Pikafish" is the number of nodes (positions) Pikafish can search per second. Nodes per second (nps) is a useful benchmark number as the same version of Pikafish playing will play stronger with larger nps. Different versions of Pikafish will play at different nps, for example, if the NNUE network architecture changes, but in this case the nps difference is not related to the strength difference.
Notes:
- Stop all other applications when measuring the speedup of Pikafish
- Run at least 20 default benches (depth 13) for each build of Pikafish to obtain accurate measurements (a single default bench is shown below)
- A speedup of 0.3% could be meaningless (i.e. within the measurement noise)
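For orientation, a single default bench can be run by hand to see where the nps figure comes from. This sketch assumes, as the bench_parallel.sh script below does, that the engine writes its bench summary, including the nodes-per-second line, to stderr:

```bash
# Run one default bench (16 MB hash, 1 thread, depth 13, default positions)
# and keep only the nodes-per-second summary line from stderr.
./pikafish << EOF 2>&1 >/dev/null | grep -i second
bench 16 1 13 default depth
EOF
```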
To measure the speed of several builds of Pikafish, use one of these applications:
- All OS:
  - bash script bench_parallel.sh (run `bash bench_parallel.sh` for the help); you might have to install gawk on your system to avoid syntax errors:
```bash
#!/bin/bash
_bench () {
${1} << EOF > /dev/null 2>> ${2}
bench 16 1 ${depth} default depth
EOF
}
# _bench function customization example
# setoption name SyzygyPath value C:\table_bases\wdl345;C:\table_bases\dtz345
# bench 128 4 ${depth} default depth

if [[ ${#} -ne 4 ]]; then
cat << EOF
usage: ${0} ./pikafish_base ./pikafish_test depth n_runs

fast bench:
${0} ./pikafish_base ./pikafish_test 13 10

slow bench:
${0} ./pikafish_base ./pikafish_test 20 10
EOF
exit 1
fi

pf_base=${1}
pf_test=${2}
depth=${3}
n_runs=${4}

# preload of CPU/cache/memory
printf "preload CPU"
(_bench ${pf_base} pf_base0.txt)&
(_bench ${pf_test} pf_test0.txt)&
wait

# temporary files initialization
: > pf_base0.txt
: > pf_test0.txt
: > pf_temp0.txt

# bench loop: SMP bench with background subshells
for ((k=1; k<=${n_runs}; k++)); do
printf "\rrun %3d /%3d" ${k} ${n_runs}

# swap the execution order to avoid bias
if [ $((k%2)) -eq 0 ]; then
(_bench ${pf_base} pf_base0.txt)&
(_bench ${pf_test} pf_test0.txt)&
wait
else
(_bench ${pf_test} pf_test0.txt)&
(_bench ${pf_base} pf_base0.txt)&
wait
fi
done

# text processing to extract nps values
cat pf_base0.txt | grep second | grep -Eo '[0-9]{1,}' > pf_base1.txt
cat pf_test0.txt | grep second | grep -Eo '[0-9]{1,}' > pf_test1.txt

for ((k=1; k<=${n_runs}; k++)); do
echo ${k} >> pf_temp0.txt
done

printf "\rrun  pf_base  pf_test     diff\n"
paste pf_temp0.txt pf_base1.txt pf_test1.txt | awk '{printf "%3d %8d %8d %+8d\n", $1, $2, $3, $3-$2}'
#paste pf_temp0.txt pf_base1.txt pf_test1.txt | awk '{printf "%3d\t%8d\t%8d\t%+7d\n", $1, $2, $3, $3-$2}'

paste pf_base1.txt pf_test1.txt | awk '{printf "%d\t%d\t%d\n", $1, $2, $2-$1}' > pf_temp0.txt

# compute: sample mean, 1.96 * std of sample mean (95% of samples), speedup
# std of sample mean = sqrt(NR/(NR-1)) * (std population) / sqrt(NR)
cat pf_temp0.txt | awk '{sum1 += $1 ; sumq1 += $1**2 ; sum2 += $2 ; sumq2 += $2**2 ; sum3 += $3 ; sumq3 += $3**2 } END {printf "\npf_base = %8d +/- %6d (95%%)\npf_test = %8d +/- %6d (95%%)\ndiff    = %8d +/- %6d (95%%)\nspeedup = %.5f%% +/- %.3f%% (95%%)\n\n", sum1/NR , 1.96 * sqrt(sumq1/NR - (sum1/NR)**2)/sqrt(NR-1) , sum2/NR , 1.96 * sqrt(sumq2/NR - (sum2/NR)**2)/sqrt(NR-1) , sum3/NR , 1.96 * sqrt(sumq3/NR - (sum3/NR)**2)/sqrt(NR-1) , 100*(sum2 - sum1)/sum1 , 100 * (1.96 * sqrt(sumq3/NR - (sum3/NR)**2)/sqrt(NR-1)) / (sum1/NR) }'

# remove temporary files
rm -f pf_base0.txt pf_test0.txt pf_temp0.txt pf_base1.txt pf_test1.txt
```
To bench two git branches with bench_parallel.sh, a helper script like the following can build both branches and run the comparison:
```bash
#!/bin/bash
if [ "$#" -ne 5 ]; then
    echo "Usage: $0 branch1 branch2 depth runs compile_flags"
    exit 1
fi

BRANCH1=$1
BRANCH2=$2
DEPTH=$3
RUNS=$4
COMPILE_FLAGS=$5

set -e

echo "Switching to $BRANCH1 and building..."
git switch $BRANCH1
make clean
make -j profile-build EXE=pikafish-$BRANCH1 $COMPILE_FLAGS

echo "Switching to $BRANCH2 and building..."
git switch $BRANCH2
make clean
make -j profile-build EXE=pikafish-$BRANCH2 $COMPILE_FLAGS

echo "Running bench_parallel.sh with pikafish-$BRANCH1 and pikafish-$BRANCH2..."
./bench_parallel.sh ./pikafish-$BRANCH1 ./pikafish-$BRANCH2 $DEPTH $RUNS
```
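For example, assuming the helper above is saved as bench_branches.sh next to the Makefile (the script name, branch names, and ARCH value below are placeholders; pass "" as compile_flags to keep the Makefile defaults):

```bash
# Compare master against a feature branch at depth 13 with 20 runs,
# building both binaries for the x86-64-avx2 target.
bash bench_branches.sh master my-feature 13 20 "ARCH=x86-64-avx2"
```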
- Windows only:
  - FishBench: latest release Fishbench v6.0
  - Buildtester: latest release Buildtester 1.4.7.0