-
Notifications
You must be signed in to change notification settings - Fork 570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MueLu: change default for "aggregation: deterministic" to true #12351
base: develop
Are you sure you want to change the base?
MueLu: change default for "aggregation: deterministic" to true #12351
Conversation
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: Trilinos_PR_gcc-8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_gcc-8.3.0-serial
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_gcc-8.3.0-debug
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_clang-11.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_python3
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_cuda-11.4.2-uvm-off
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_intel-2021.3
Jenkins Parameters
Using Repos:
Pull Request Author: GrahamBenHarper |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: Trilinos_PR_gcc-8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_gcc-8.3.0-serial
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_gcc-8.3.0-debug
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_clang-11.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_python3
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_cuda-11.4.2-uvm-off
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_intel-2021.3
Jenkins Parameters
|
@cgcgcg @jhux2
We see a slowdown in setup of 28% on the 100x100x100 problem and a slowdown in setup of 18% on the 200x200x200 problem. Does that seem reasonable, or is that too significant? |
While you're at it, can you also benchmark the coloring on device? |
@GrahamBenHarper Can we replicate 100x, please? Numbers that small are generally noisy. |
@cgcgcg @csiefer2 @rppawlo @jhux2 scaling.xml
I did 100 runs on three grid sizes: 60x60x60 (2 levels, for comparison with EMPIRE), 100x100x100 (3 levels), and 200x200x200 (3 levels), with each of the following aggregation scenarios: mis2 coarsening
mis2 aggregation
vertex based
vertex based bit set
edge filtering
net based bit set
deterministic (serial coloring)
Summaryserial coloring (which is the route that deterministic aggregation must take) is faster than all of the algorithms by a wide margin except for the two mis2 algorithms, which beat serial coloring by a lot sadly. Mis2 coarsening being 50% faster on extremely large (8M DOFs/GPU) problems is pretty tragic but maybe not too surprising, but for EMPIRE-scale problems, the numbers are closer to 30%. 30% still seems awfully large to merit changing the default aggregation to deterministic, but I want to know what everybody else thinks. |
Just to be clear, the mis2 methods are also deterministic. The only downside really is you can't set the min/max agg sizes with it, which might make it less appealing as the default. |
@GrahamBenHarper Thanks for doing these experiments. Here's a table summarizing the timer "Driver: 2 - MueLu Setup":
|
@brian-kelley that's right. I've realized is that @jhux2 also, it might be important to note that setup data uses |
Sure, but I assume the main differentiator is the aggregation. |
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: PR_gcc-openmpi-openmp
Jenkins Parameters
Build InformationTest Name: PR_gcc
Jenkins Parameters
Build InformationTest Name: PR_gcc-openmpi_debug
Jenkins Parameters
Build InformationTest Name: PR_clang
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_python3
Jenkins Parameters
Build InformationTest Name: PR_cuda
Jenkins Parameters
Build InformationTest Name: PR_intel
Jenkins Parameters
Build InformationTest Name: PR_cuda-uvm
Jenkins Parameters
Using Repos:
Pull Request Author: GrahamBenHarper |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: PR_gcc-openmpi-openmp
Jenkins Parameters
Build InformationTest Name: PR_gcc
Jenkins Parameters
Build InformationTest Name: PR_gcc-openmpi_debug
Jenkins Parameters
Build InformationTest Name: PR_clang
Jenkins Parameters
Build InformationTest Name: Trilinos_PR_python3
Jenkins Parameters
Build InformationTest Name: PR_cuda
Jenkins Parameters
Build InformationTest Name: PR_intel
Jenkins Parameters
Build InformationTest Name: PR_cuda-uvm
Jenkins Parameters
|
@trilinos/muelu @cgcgcg
Motivation
This has been a long-needed change, and has been suggested by applications (e.g. @rppawlo in #11026), but it needs careful evaluation. We discussed it at standup today, and we decided we will pursue this if we can be certain from a performance standpoint that we don't horribly increase aggregation time running with default settings. I'll follow-up with performance testing data in this PR once it's done. This is backwards incompatible, so it's appropriate to try to merge before Trilinos 15.0 is released if the performance looks reasonable.
I'll ping other relevant application contacts once I'm certain this is okay to merge.
Related Issues
Closes #11026.
See also issues linked in #11026, for example.
Testing
This was never covered by testing before, as our usual "fix" was to manually specify
"aggregation: deterministic" = true
in testing, c.f. #11022 for example. Between the autotester and the performance tests, we should have everything covered.