v2.2.1-A - Community Builds Thread #69
Replies: 4 comments 18 replies
-
If anyone has any idea how to build with PGO and BOLT on Windows (preferably with ClangCL, as it was always faster in all our tests), please comment and teach us! (Or open something down below or somewhere on GitHub, idk.)
svt-av1-psy v2.2.1-A
The zip is a matryoshka because GitHub doesn't support 7z.
Always prefer the build optimized for your march over the generic build (or do some testing yourself to see which one is best for your hardware; in my case, on a Ryzen 7900, znver4 is a lot faster than the generic x86-64-v4).
Infos + my findings:
Special thanks to myself, as I'm the reason we finally have optimized builds on Windows too (let's go!! This matter even reached mainline), and to @Uranite, because he's the one who made all this real: he initially came to verify my doubts about the encoding speeds of the builds available at the time (several months ago), and went on to find every possible way to build our binaries in the most optimized way possible.
Sorry for the long post; this first time I just wanted to give some info on the work we did to reach this point and thank everyone who helped. If anyone finds anything new, we are happy to learn more!
Finally, if anything doesn't work as it should, feel free to comment so we can find out what I did wrong lmao (I normally just build and post the znver4 and x86-64-v4 binaries on Discord, so I don't know whether the others work properly).
-
arm and arm64 build for Termux
-
PGO Guide for LLVM and Visual Studio...
For me, enabling PGO results in a single-digit percent speedup. I didn't do a lot of benchmarking, but it's significant, because it's free and I haven't observed any speed regressions yet. Intel users might want to give the free ICC 2024.2 a try; it has settings for specific Intel platforms and adds "hardware PGO". The new ICX is LLVM-based and even works with -march=znver4 (overriding the front-end), but on my Zen4 it's slower than LLVM 17 or 19.
cmake --fresh -B svt_build_llvm -T ClangCL -DBUILD_SHARED_LIBS=OFF -DENABLE_AVX512=ON -DCMAKE_C_FLAGS_RELEASE="/DNDEBUG /W0 /clang:-O2 -march=znver4 -mtune=znver4 -flto -fprofile-generate=c:\pgo\llvm.profraw -Xclang -ffast-math"
You can leave out and/or override the .profraw location by setting the LLVM_PROFILE_FILE env var.
For LLVM 17 use "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\Llvm\x64\bin\llvm-profdata.exe", or for LLVM 19 the llvm-profdata.exe from the LLVM 19 installer.
You can automate 1-4 with Intel's one-click PGO (only works with the ICX compiler), or add 2-3 to the VS GUI Build Events=>Post-Build Event.
You can either re-run cmake or edit C/C++=>Command Line in the VS GUI. For LLVM 19 (using https://github.com/zufuliu/llvm-utils) you can set the respective clang options in the LLVM settings in the VS GUI that appear after switching C/C++=>Platform Toolset to LLVM_143.
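The numbered steps this comment refers to did not survive in the post, so here is a rough sketch of a generic LLVM instrumentation-PGO cycle; it is not necessarily what the author actually ran. It's a small Python helper that only builds and prints the command lines for the stages that follow an instrumented (-fprofile-generate) configure like the one above: build, run a training encode, merge the raw profiles with llvm-profdata, then reconfigure with -fprofile-use and rebuild. The function name, the c:\pgo paths, the sample clip and the training command are all assumptions for illustration.

```python
# Sketch of a generic LLVM PGO cycle. It only prints commands and runs nothing.
# All paths, the sample clip and the function name are hypothetical.
import subprocess


def pgo_commands(build_dir, profraw_inputs, profdata, train_cmd,
                 base_flags, configure_args=()):
    """Return argument lists for the stages after an instrumented configure.

    base_flags are the C flags WITHOUT any -fprofile-generate option;
    configure_args are the other cmake options from the first configure,
    which must be repeated because --fresh discards the cache.
    """
    use_flags = f"{base_flags} -fprofile-use={profdata}"
    build = ["cmake", "--build", build_dir, "--config", "Release"]
    return [
        # Build the instrumented binaries.
        build,
        # Run a representative workload so .profraw files get written.
        list(train_cmd),
        # Merge the raw profiles into one indexed .profdata file.
        ["llvm-profdata", "merge", f"-output={profdata}", *profraw_inputs],
        # Reconfigure with the merged profile instead of instrumentation.
        ["cmake", "--fresh", "-B", build_dir, "-T", "ClangCL",
         *configure_args, f"-DCMAKE_C_FLAGS_RELEASE={use_flags}"],
        # Rebuild; this is the optimized binary.
        build,
    ]


if __name__ == "__main__":
    commands = pgo_commands(
        build_dir="svt_build_llvm",
        profraw_inputs=[r"c:\pgo\run1.profraw"],      # assumed file name
        profdata=r"c:\pgo\merged.profdata",           # assumed file name
        train_cmd=["SvtAv1EncApp.exe", "-i", "sample.y4m", "-b", "out.ivf"],
        base_flags="/DNDEBUG /W0 /clang:-O2 -march=znver4 -flto",
        configure_args=["-DBUILD_SHARED_LIBS=OFF", "-DENABLE_AVX512=ON"],
    )
    for cmd in commands:
        print(subprocess.list2cmdline(cmd))
```

The training encode matters more than anything else here: the profile only helps for code paths it actually exercised, so use presets and content close to what you really encode.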
-
I'm attaching sample -march=znver4 PGO binaries built with LLVM 17 and 19 from the current testing branch. On my Zen4, unlike with vanilla builds, LLVM 19 with PGO is about the same as (or even a little faster than) Visual Studio's built-in LLVM 17. However, using a laptop with thermal management for benchmarking is horrible.
-
Community Builds Thread
This is a place for the community to share unofficial tools not affiliated with the project -- mainly consisting of binaries compiled by community members.
Trust
Architecture
When downloading pre-compiled binaries, you might see AVX, AVX2, AVX-512, x86-64-v3, etc. If you don't know exactly what ISA extensions your CPU supports, here is a chart to help you quickly understand your hardware's support:
AMD (Desktop)
Intel (Desktop)
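If you'd rather check than read a chart, here is a minimal sketch that maps a set of CPU feature flags to the highest x86-64 microarchitecture level (x86-64-v2/v3/v4) it satisfies. The flag spellings follow Linux /proc/cpuinfo (for example pni for SSE3 and abm for LZCNT), which is an assumption; Windows tools report the same features under other names. The function names are made up for this example.

```python
# Sketch: classify a set of CPU flags into an x86-64 microarchitecture level.
# Flag names use Linux /proc/cpuinfo spelling (assumption); helper names are
# hypothetical.

LEVELS = [
    ("x86-64-v2", {"cx16", "lahf_lm", "popcnt", "pni",
                   "sse4_1", "sse4_2", "ssse3"}),
    ("x86-64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c",
                   "fma", "abm", "movbe", "xsave"}),
    ("x86-64-v4", {"avx512f", "avx512bw", "avx512cd",
                   "avx512dq", "avx512vl"}),
]


def highest_level(flags):
    """Return the highest level whose requirements, and those of every
    lower level, are all present in flags. Baseline x86-64 is assumed."""
    have = set(flags)
    level = "x86-64"
    for name, required in LEVELS:
        if not required <= have:
            break  # levels are cumulative, so stop at the first gap
        level = name
    return level


def read_cpuinfo_flags(path="/proc/cpuinfo"):
    """Read the flags of the first CPU listed (Linux on x86 only).
    Returns an empty set if no 'flags' line is found."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


if __name__ == "__main__":
    try:
        print(highest_level(read_cpuinfo_flags()))
    except OSError:
        print("no /proc/cpuinfo here; pass the flags in yourself")
```

Note that a level like x86-64-v4 is only a common baseline; a build for your exact -march (such as znver4) can still be faster, as reported earlier in this thread.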
It is also helpful to know the various options available with C compilers, like what is available when specifying -march & -mtune on x64 CPUs:
Known valid x64 arguments for -march=:
Known valid x64 arguments for -mtune=:
I'll leave it to you all to look up which options work for your hardware. -march=foo implies -mtune=foo unless you also specify a different -mtune. This is one reason why using -march is better than just enabling options like -mavx without doing anything about tuning.
Antivirus
Be wary of antivirus software on Windows detecting EXEs distributed here as malicious software. While they may not always legitimately be malicious, it is important to maintain a healthy level of skepticism when running code that someone else has compiled.