Fix error in unit test from unsynched bulk #1260

Merged
merged 1 commit into from
Apr 18, 2024

Conversation

marchdf
Contributor

@marchdf marchdf commented Apr 18, 2024

Fixes these errors:

error: 'show' is not a valid command.
C++ exception with description "Requirement( bulkData.in_synchronized_state() ) FAILED
Error occurred at: stk_mesh/stk_mesh/base/SkinBoundary.cpp:88

Error: Cannot use create_all_sides while in another mod cycle.
" thrown in the test body.

@marchdf marchdf requested review from psakievich and alanw0 April 18, 2024 15:55
@marchdf
Contributor Author

marchdf commented Apr 18, 2024

These edits fix the errors above, but I now get a different error later in the unit tests:

[----------] 2 tests from ConductionResidualFixture
[ RUN      ] ConductionResidualFixture.residual_executes
[       OK ] ConductionResidualFixture.residual_executes (0 ms)
[ RUN      ] ConductionResidualFixture.linearized_residual_executes
Process 16543 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x2e44dd734)
    frame #0: 0x00000001844e6e08 libsystem_malloc.dylib`tiny_free_list_remove_ptr + 112
libsystem_malloc.dylib`tiny_free_list_remove_ptr:
->  0x1844e6e08 <+112>: ldr    x12, [x1, #0x8]!
    0x1844e6e0c <+116>: mov    x11, x12
    0x1844e6e10 <+120>: xpacd  x11
    0x1844e6e14 <+124>: mov    x17, x11
Target 0: (unittestX) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x2e44dd734)
  * frame #0: 0x00000001844e6e08 libsystem_malloc.dylib`tiny_free_list_remove_ptr + 112
    frame #1: 0x00000001844e66c8 libsystem_malloc.dylib`tiny_free_no_lock + 1060
    frame #2: 0x00000001844e6120 libsystem_malloc.dylib`free_tiny + 496
    frame #3: 0x0000000100bb381c unittestX`Kokkos::HostSpace::impl_deallocate(char const*, void*, unsigned long, unsigned long, Kokkos_Profiling_SpaceHandle) const + 332
    frame #4: 0x0000000100bb3900 unittestX`Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace, void>::~SharedAllocationRecord() + 120
    frame #5: 0x000000010003873c unittestX`Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace, Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, unsigned int, true>>::~SharedAllocationRecord(this=0x0000600003018bd0) at Kokkos_SharedAlloc.hpp:281:7
    frame #6: 0x000000010003867c unittestX`Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace, Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, unsigned int, true>>::~SharedAllocationRecord(this=0x0000600003018bd0) at Kokkos_SharedAlloc.hpp:281:7
    frame #7: 0x00000001000386a8 unittestX`Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace, Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, unsigned int, true>>::~SharedAllocationRecord(this=0x0000600003018bd0) at Kokkos_SharedAlloc.hpp:281:7
    frame #8: 0x000000010c1778bc libnalu.dylib`Kokkos::Impl::SharedAllocationRecord<void, void>::decrement(Kokkos::Impl::SharedAllocationRecord<void, void>*) + 60
    frame #9: 0x000000010001a0fc unittestX`Kokkos::Impl::SharedAllocationTracker::~SharedAllocationTracker(this=0x0000000106090300) at Kokkos_SharedAlloc.hpp:419:30
    frame #10: 0x000000010001a0ac unittestX`Kokkos::Impl::SharedAllocationTracker::~SharedAllocationTracker(this=0x0000000106090300) at Kokkos_SharedAlloc.hpp:419:29
    frame #11: 0x0000000100032144 unittestX`Kokkos::Impl::ViewTracker<Kokkos::View<unsigned int [2], Kokkos::LayoutLeft, Kokkos::HostSpace>>::~ViewTracker(this=0x0000000106090300) at Kokkos_ViewTracker.hpp:39:8
    frame #12: 0x00000001000320d4 unittestX`Kokkos::Impl::ViewTracker<Kokkos::View<unsigned int [2], Kokkos::LayoutLeft, Kokkos::HostSpace>>::~ViewTracker(this=0x0000000106090300) at Kokkos_ViewTracker.hpp:39:8
    frame #13: 0x000000010003244c unittestX`Kokkos::View<unsigned int [2], Kokkos::LayoutLeft, Kokkos::HostSpace>::~View(this=0x0000000106090300) at Kokkos_View.hpp:1269:19
    frame #14: 0x0000000100031fec unittestX`Kokkos::View<unsigned int [2], Kokkos::LayoutLeft, Kokkos::HostSpace>::~View(this=0x0000000106090300) at Kokkos_View.hpp:1269:19
    frame #15: 0x000000010043bc98 unittestX`Kokkos::DualView<double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, void>::~DualView(this=0x0000000106090300) at Kokkos_DualView.hpp:113:7
    frame #16: 0x000000010043bc54 unittestX`Kokkos::DualView<double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, void>::~DualView(this=0x0000000106090300) at Kokkos_DualView.hpp:113:7
    frame #17: 0x000000010043bc28 unittestX`Tpetra::Details::WrappedDualView<Kokkos::DualView<double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, void>>::~WrappedDualView(this=0x0000000106090300) at Tpetra_Details_WrappedDualView.hpp:143:7
    frame #18: 0x000000010043ba24 unittestX`Tpetra::Details::WrappedDualView<Kokkos::DualView<double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, void>>::~WrappedDualView(this=0x0000000106090300) at Tpetra_Details_WrappedDualView.hpp:143:7
    frame #19: 0x000000010043b80c unittestX`Tpetra::MultiVector<double, int, long, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace>>::~MultiVector(this=0x00000001060901a0, vtt=0x0000000100f2cb88) at Tpetra_MultiVector_decl.hpp:830:37
    frame #20: 0x00000001004300cc unittestX`Tpetra::MultiVector<double, int, long, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace>>::~MultiVector(this=0x00000001060901a0) at Tpetra_MultiVector_decl.hpp:830:37
    frame #21: 0x0000000100452fc4 unittestX`sierra::nalu::matrix_free::ConductionResidualFixture::~ConductionResidualFixture(this=0x000000010608fe00) at UnitTestConductionInterior.C:77:7
    frame #22: 0x000000010045304c unittestX`sierra::nalu::matrix_free::ConductionResidualFixture_linearized_residual_executes_Test::~ConductionResidualFixture_linearized_residual_executes_Test(this=0x000000010608fe00) at UnitTestConductionInterior.C:123:1
    frame #23: 0x0000000100450714 unittestX`sierra::nalu::matrix_free::ConductionResidualFixture_linearized_residual_executes_Test::~ConductionResidualFixture_linearized_residual_executes_Test(this=0x000000010608fe00) at UnitTestConductionInterior.C:123:1
    frame #24: 0x0000000100450740 unittestX`sierra::nalu::matrix_free::ConductionResidualFixture_linearized_residual_executes_Test::~ConductionResidualFixture_linearized_residual_executes_Test(this=0x000000010608fe00) at UnitTestConductionInterior.C:123:1
    frame #25: 0x0000000100bc1540 unittestX`void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 80
    frame #26: 0x0000000100bc2db8 unittestX`testing::TestInfo::Run() + 336
    frame #27: 0x0000000100bc37e8 unittestX`testing::TestSuite::Run() + 288
    frame #28: 0x0000000100bd0da8 unittestX`testing::internal::UnitTestImpl::RunAllTests() + 984
    frame #29: 0x0000000100bd07dc unittestX`bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 80
    frame #30: 0x0000000100bd0758 unittestX`testing::UnitTest::Run() + 124
    frame #31: 0x0000000100006c24 unittestX`RUN_ALL_TESTS() at gtest.h:14808:46
    frame #32: 0x0000000100006a10 unittestX`main(argc=1, argv=0x000000016fdf9a58) at unit_tests.C:60:17
    frame #33: 0x0000000184367f28 dyld`start + 2236

@alanw0 and @rcknaus : @psakievich thought you might be able to help on this one?

@marchdf
Contributor Author

marchdf commented Apr 18, 2024

I updated the comment above with a stack trace from a debug build.

@marchdf marchdf requested a review from rcknaus April 18, 2024 16:08
Contributor

@alanw0 alanw0 left a comment

Yes, those changes look good to me.
That is, create_all_sides needs to be called outside of the modification_begin/modification_end pair of calls.
STK used to allow it to be called inside that pair, but the behavior was sometimes incorrect.
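
The ordering rule described above can be sketched as follows. This is a minimal illustration, not the STK implementation: `MockBulk` and `mock_create_all_sides` are hypothetical stand-ins that model only the modification-cycle state, and the idea that the side-creation step opens its own cycle is an assumption read off the "another mod cycle" wording in the error message.

```cpp
#include <stdexcept>

// Hypothetical stand-in for stk::mesh::BulkData. It tracks only whether a
// modification cycle is open; it does not hold a real mesh.
class MockBulk {
public:
  void modification_begin() { in_mod_cycle_ = true; }
  void modification_end() { in_mod_cycle_ = false; }
  bool in_synchronized_state() const { return !in_mod_cycle_; }
  void add_sides(int n) { side_count_ += n; }
  int side_count() const { return side_count_; }

private:
  bool in_mod_cycle_ = false;
  int side_count_ = 0;
};

// Hypothetical stand-in for stk::mesh::create_all_sides. It rejects calls made
// while a modification cycle is open, mirroring the requirement in the error.
void mock_create_all_sides(MockBulk& bulk) {
  if (!bulk.in_synchronized_state()) {
    throw std::logic_error(
      "Cannot use create_all_sides while in another mod cycle.");
  }
  bulk.modification_begin();  // assumption: the helper runs its own cycle
  bulk.add_sides(6);          // placeholder for the real side creation
  bulk.modification_end();
}

// Old pattern: side creation inside an open cycle. Returns true if it throws.
bool broken_setup_throws() {
  MockBulk bulk;
  bulk.modification_begin();
  try {
    mock_create_all_sides(bulk);
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}

// Fixed pattern: close the cycle first, then create the sides.
int fixed_setup() {
  MockBulk bulk;
  bulk.modification_begin();
  // ... declare elements and nodes here ...
  bulk.modification_end();
  mock_create_all_sides(bulk);
  return bulk.side_count();
}
```

In a test fixture, the same reordering amounts to moving the side-creation call to just after the `modification_end()` that closes the mesh-construction cycle.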

@rcknaus
Contributor

rcknaus commented Apr 18, 2024

> @alanw0 and @rcknaus : @psakievich thought you might be able to help on this one?

Looks like it was trying to write past the end of the multivector. Do we still have ASan testing?
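
A write past the end of a heap allocation often goes unnoticed until the allocator later frees the block, which would be consistent with the crash appearing inside a destructor in the stack trace above. The sketch below is an illustration only and makes no claim about the actual Tpetra code: `CheckedVector` is a hypothetical stand-in that uses `std::vector::at` so that an out-of-range write raises an error at the faulting line instead of silently corrupting the heap. AddressSanitizer (enabled with `-fsanitize=address` in GCC and Clang) reports the same class of bug without code changes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a vector of values; this is not the Tpetra API.
class CheckedVector {
public:
  explicit CheckedVector(std::size_t n) : data_(n, 0.0) {}

  // std::vector::at throws std::out_of_range rather than writing out of bounds.
  void set(std::size_t i, double v) { data_.at(i) = v; }
  double get(std::size_t i) const { return data_.at(i); }
  std::size_t size() const { return data_.size(); }

private:
  std::vector<double> data_;
};

// Returns true if a write at index i is rejected as out of range.
bool write_is_rejected(CheckedVector& v, std::size_t i) {
  try {
    v.set(i, 1.0);
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}
```

With a vector of length 4, a write at index 3 succeeds and a write at index 4 is rejected immediately, rather than surfacing later as an `EXC_BAD_ACCESS` in `free`.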

@marchdf
Contributor Author

marchdf commented Apr 18, 2024

[----------] Global test environment tear-down
[==========] 572 tests from 136 test suites ran. (202985 ms total)
[  PASSED  ] 571 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] VOFKernelHex8Mesh.NGP_adv_diff_edge_tpetra

I think we can merge. The failing test has a 1e-14 diff. Thanks @rcknaus for fixing that last issue!

@psakievich psakievich merged commit 83c6597 into Exawind:master Apr 18, 2024
1 of 3 checks passed
@marchdf marchdf deleted the fix-error branch April 18, 2024 18:55
4 participants