Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TestJlmRemoteMemoryNoAuth intermittently fails with 'Peak Usage used memory smaller than Current Usage used memory' #274

Open
Mesbah-Alam opened this issue Jun 10, 2019 · 36 comments
Assignees
Labels

Comments

@Mesbah-Alam
Copy link
Contributor

Mesbah-Alam commented Jun 10, 2019

TestJlmRemoteMemoryNoAuth test intermittently fails with Peak Usage used memory smaller than Current Usage used memory.

We should revisit the unstable test to check its validity.

Example:

https://ci.adoptopenjdk.net/view/Test_system/job/openjdk12_hs_systemtest_x86-64_windows/71/tapResults/

CL1 stderr Exception in thread "main" java.lang.AssertionError: Peak Usage used memory smaller than Current Usage used memory
CL1 stderr 	at org.junit.Assert.fail(Assert.java:88)
CL1 stderr 	at net.adoptopenjdk.test.jlm.resources.MemoryData.checkPeakAndCurrentMemoryUsage(MemoryData.java:521)
CL1 stderr 	at net.adoptopenjdk.test.jlm.resources.MemoryData.writeData(MemoryData.java:182)
CL1 stderr 	at net.adoptopenjdk.test.jlm.remote.MemoryProfiler.getStatsViaProxy(MemoryProfiler.java:183)
CL1 stderr 	at net.adoptopenjdk.test.jlm.remote.MemoryProfiler.main(MemoryProfiler.java:110)
STF 11:07:08.791 - **FAILED** Process CL1 ended with exit code (1) and not the expected exit code/s (0)
@Mesbah-Alam Mesbah-Alam changed the title JLMRemoteMemoryNoAuth test intermittently fails with Peak Usage used memory smaller than Current Usage used memory TestJlmRemoteMemoryNoAuth test intermittently fails with Peak Usage used memory smaller than Current Usage used memory Jun 10, 2019
@Mesbah-Alam Mesbah-Alam changed the title TestJlmRemoteMemoryNoAuth test intermittently fails with Peak Usage used memory smaller than Current Usage used memory TestJlmRemoteMemoryNoAuth intermittently fails with Peak Usage used memory smaller than Current Usage used memory Jun 10, 2019
@Mesbah-Alam Mesbah-Alam changed the title TestJlmRemoteMemoryNoAuth intermittently fails with Peak Usage used memory smaller than Current Usage used memory TestJlmRemoteMemoryNoAuth intermittently fails with 'Peak Usage used memory smaller than Current Usage used memory' Jun 10, 2019
@Mesbah-Alam Mesbah-Alam self-assigned this Jun 10, 2019
@Mesbah-Alam
Copy link
Contributor Author

Mesbah-Alam commented Jun 10, 2019

@Mesbah-Alam
Copy link
Contributor Author

Mesbah-Alam commented Jun 10, 2019

Summary of what the failing test is doing:

  1. Given a list of instantiated MemoryPoolMXBean objects, the test invokes the java.lang.management.MemoryPoolMXBean.getUsage() API to get the "current usage" value; and then it invokes java.lang.management.MemoryPoolMXBean.getPeakUsage() to get the "peak usage" value.
  2. It then performs the following validation:
long peakUsageUsed = peak_usage.getUsed();
		long currentUsageUsed = current_usage.getUsed();

		if (peakUsageUsed < currentUsageUsed) {
			Message.logErr("  Peak usage smaller than current usage here:");
			Message.logErr("  Memory Pool:              " + poolName);
			Message.logErr("  Memory Type:              " + poolType);
			Message.logErr("  Peak Usage:               " + peak_usage.toString());
			Message.logErr("  Current Usage:            " + current_usage.toString());
			Assert.fail("Peak Usage used memory smaller than Current Usage used memory");
		}

So, intermittently, on various platforms, this validation is failing.

Question is - whether or not this failure indicates something wrong with what we are testing - i.e., the JLM API of the SDK under test.

Java doc for java.lang.management.MemoryPoolMXBean.getUsage()

Returns an estimate of the memory usage of this memory pool. This method returns null if this memory pool is not valid (i.e. no longer exists).

This method requests the Java virtual machine to make a best-effort estimate of the current memory usage of this memory pool. For some memory pools, this method may be an expensive operation that requires some computation to determine the estimate. An implementation should document when this is the case.

This method is designed for use in monitoring system memory usage and detecting low memory condition.

Javadoc for java.lang.management.MemoryPoolMXBean.getPeakUsage()

Returns the peak memory usage of this memory pool since the Java virtual machine was started or since the peak was reset. This method returns null if this memory pool is not valid (i.e. no longer exists).

FYI..
@lumpfish @JasonFengJ9

@pshipton
Copy link
Contributor

@LinHu2016 can you please take a look. Any relation to eclipse-openj9/openj9#5324?

@LinHu2016
Copy link

@pshipton I will take a look the issue, Thanks

@LinHu2016
Copy link

LinHu2016 commented Jun 12, 2019

base on example (https://ci.adoptopenjdk.net/view/Test_system/job/openjdk12_hs_systemtest_x86-64_windows/71/tapResults/), the error is for OpenJDK hotspot G1, please provide the error case on OpenJ9, Thanks

CL1 stderr j> 2019/06/10 11:07:07.499 Peak usage smaller than current usage here:
CL1 stderr j> 2019/06/10 11:07:07.500 Memory Pool: G1 Old Gen
CL1 stderr j> 2019/06/10 11:07:07.501 Memory Type: Heap memory
CL1 stderr j> 2019/06/10 11:07:07.501 Peak Usage: init = 118489088(115712K) used = 12219608(11933K) committed = 118489088(115712K) max = 268435456(262144K)
CL1 stderr j> 2019/06/10 11:07:07.502 Current Usage: init = 118489088(115712K) used = 12241080(11954K) committed = 50331648(49152K) max = 268435456(262144K)
CL1 stderr Exception in thread "main" java.lang.AssertionError: Peak Usage used memory smaller than Current Usage used memory
CL1 stderr at org.junit.Assert.fail(Assert.java:88)
CL1 stderr at net.adoptopenjdk.test.jlm.resources.MemoryData.checkPeakAndCurrentMemoryUsage(MemoryData.java:521)
CL1 stderr at net.adoptopenjdk.test.jlm.resources.MemoryData.writeData(MemoryData.java:182)
CL1 stderr at net.adoptopenjdk.test.jlm.remote.MemoryProfiler.getStatsViaProxy(MemoryProfiler.java:183)
CL1 stderr at net.adoptopenjdk.test.jlm.remote.MemoryProfiler.main(MemoryProfiler.java:110)

@Mesbah-Alam
Copy link
Contributor Author

@LinHu2016 : The error is intermittent. Started a Grinder for OpenJ9 here: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1664/console

@LinHu2016
Copy link

Hi @Mesbah-Alam,Could you please share test code or point the test src code location? if it has the issue on both Hotspot and OpenJ9, sounds like issue is on JCL or testing code.

@LinHu2016
Copy link

another thought it might relate nonheap memory pool(such as codecache?), wait the grinder to catch the issue on OpenJ9 for investigating the details.

@LinHu2016
Copy link

@LinHu2016 : The error is intermittent. Started a Grinder for OpenJ9 here: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1664/console

@Mesbah-Alam the Grinder failed to run the tests.

@Mesbah-Alam
Copy link
Contributor Author

Likely something is wrong with the Windows build at Adopt. The failure above happened due to AUTO_DETECT mechanism being turned on but the java -version output not containing "Compressed references" string. I launched another Grinder with AUTO_DETECT off, using the published build at Adopt, that failed too: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1675/tapResults/

@lumpfish
Copy link
Contributor

@lumpfish
Copy link
Contributor

This is still occurring in the test builds.
@LinHu2016 - are you able to provide any insight into when the values returned by the API calls

java.lang.management.MemoryPoolMXBean.getUsage()
java.lang.management.MemoryPoolMXBean.getPeakUsage()

are updated by the JVM?

I'm looking for a possible explanation for the intermittent test case behavior - for instance, is the data returned by getUsage() always 'live' data but the getPeakUsage() data only updated periodically?

@LinHu2016
Copy link

Hi @lumpfish Could you please provide a failure case? I would like investigate the case. In general for heap memory pools, the peakUsage would be updated at the begin of GC and at the end of GC or when getUsage(), getPeakUsage() and resetPeakUsage() are called by user, so usage should never be larger than peakUsage. previously there was a bug for retrieving usage for the non heap memory pool, I would like confirm if the failure is related with the fix. Thanks

@LinHu2016
Copy link

also getUsage(), getPeakUsage() are two independent api, heap usage could be changed between api calls, so for realtime you should call getUsage before calling getPeakUsage if you want to compare between the two sizes.

@LinHu2016
Copy link

so above the validation code should be like this

	long currentUsageUsed = current_usage.getUsed();
        long peakUsageUsed = peak_usage.getUsed();

		if (peakUsageUsed < currentUsageUsed) {
			Message.logErr("  Peak usage smaller than current usage here:");
			Message.logErr("  Memory Pool:              " + poolName);
			Message.logErr("  Memory Type:              " + poolType);
			Message.logErr("  Peak Usage:               " + peak_usage.toString());
			Message.logErr("  Current Usage:            " + current_usage.toString());
			Assert.fail("Peak Usage used memory smaller than Current Usage used memory");
		}

@lumpfish
Copy link
Contributor

@LinHu2016 - thank you for the insight about why current usage might be reported as higher than peak usage, I'll try reversing the API calls as you suggest and see if it makes a difference.

I haven't managed to recreate this failure in a smaller, simpler test case, nor reproduce it locally - I have only seen the failure on Jenkins slave machines.

If you want to run the test locally:

  1. Build the openjdk-systemtest and stf repositories.
    Either build via testKitGen (https://github.com/AdoptOpenJDK/openjdk-tests/blob/master/doc/userGuide.md) or follow https://github.com/adoptopenjdk/openjdk-systemtest

  2. Run the test.

    For Windows:

set JAVA_HOME=java_to_test
set PATH=java_to_test\bin;%PATH%
set SYSTEMTEST_HOME=git_checkout_location - e.g. c:\Users\user\git
set RESULTS_HOME=somewhere_to_put_results - e.g. c:\Users\user\stf_results
perl %SYSTEMTEST_HOME%/stf/stf.core/scripts/stf.pl -test-root="%SYSTEMTEST_HOME%/stf;%SYSTEMTEST_HOME%/openjdk-systemtest" -systemtest-prereqs="%SYSTEMTEST_HOME%/../systemtest_prereqs" -results-root="%RESULTS_HOME%/TestJlmRemoteMemoryNoAuth" -test=TestJlmRemoteMemoryNoAuth

For Unix

export JAVA_HOME=java_to_test
export PATH=java_to_test/bin:$PATH
export SYSTEMTEST_HOME=git_checkout_location - e.g. /home/user/git
export RESULTS_HOME=somewhere_to_put_results - e.g. /home/user/stf_results
perl $SYSTEMTEST_HOME/stf/stf.core/scripts/stf.pl -test-root="$SYSTEMTEST_HOME/stf;$SYSTEMTEST_HOME/openjdk-systemtest" -systemtest-prereqs="$SYSTEMTEST_HOME/../systemtest_prereqs" -results-root="$RESULTS_HOME/TestJlmRemoteMemoryNoAuth" -test=LambdaLoadTest

The test creates a new directory under RESULTS_HOME. In there you will see an execute.pl file which is run during the test but can be rerun manually. So the commands issued in the test can be tweaked (e.g. to add extra arguments to java command lines) and the test rerun.

Each time the test is run via stf.pl a new directory is created with a newly generated execute.pl etc. files.

The java test case code is in the openjdk.test.jlm directory in the openjdk-systemtest repository (under SYSTEMTEST_HOME). If you make want to try changes to those files, edit, rebuild and rerun.

@LinHu2016
Copy link

so above the validation code should be like this

	long currentUsageUsed = current_usage.getUsed();
        long peakUsageUsed = peak_usage.getUsed();

		if (peakUsageUsed < currentUsageUsed) {
			Message.logErr("  Peak usage smaller than current usage here:");
			Message.logErr("  Memory Pool:              " + poolName);
			Message.logErr("  Memory Type:              " + poolType);
			Message.logErr("  Peak Usage:               " + peak_usage.toString());
			Message.logErr("  Current Usage:            " + current_usage.toString());
			Assert.fail("Peak Usage used memory smaller than Current Usage used memory");
		}

I mean call getUsagge() before calling getPeakUsage(), not current_usage.getUsed() before peak_usage.getUsed(), so for the confusion.

@lumpfish
Copy link
Contributor

lumpfish commented Feb 3, 2021

After more investigation (using hotspot jdk8 mac, which shows this failure more often than other platforms (about 1 in 10 test runs)), is seems that peak_usage is not updated often enough to guarantee that it will always be equal to or greater than the maximum value that current usage has ever had.

Adding some additional code to the test case demonstrates this:

  Memory Pool:              Code Cache
  Memory Type:              Non-heap memory
  Part of Memory System:    Yes
174: Getting current usage:
  Current Memory Usage:     init = 2555904(2496K) used = 26777216(26149K) committed = 27262976(26624K) max = 251658240(245760K)
178: Getting peak usage:
  Peak Memory Usage:        init = 2555904(2496K) used = 26774912(26147K) committed = 27262976(26624K) max = 251658240(245760K)
185: Checking peak usage:
Trace:  peakUsageUsed = 26774912, currentUsageUsed = 26777216
Trace:  peakUsageUsed = 26774912, currentUsageUsed = 26777216
Trace:  peakUsageUsed less than currentUsageUsed, getting values again
Trace:  new values peakUsageUsed_2 = 26774912, currentUsageUsed_2 = 26772096

The test case was modified such that if the returned peak usage value was less than the returned current usage value the values were retrieved once more. Upon retrieving the values a second time, the peak usage value was unchanged, but the current usage value has fallen back below the peak usage value.

The openjdk test case https://github.com/openjdk/jdk/blob/05a764f4ffb8030d6b768f2d362c388e5aabd92d/test/hotspot/jtreg/compiler/codecache/jmx/PeakUsageTest.java#L77 performs a similar test, suggesting that the behavior observed here is not expected.

@caozg2021
Copy link

I add some debug info in CodeHeapPool::used_in_bytes to print _next_segment and _freelist_segments

https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/9dbeec9bf0ab6ab16ce60e51bbf978774d929e74/hotspot/src/share/vm/services/memoryPool.hpp#L222

debug info like this

tty->print("Code Cache printInfo %d, %d\n", _next_segment,  _freelist_segments);

and print the CodeBlob's name at CodeCache::free

https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/9dbeec9bf0ab6ab16ce60e51bbf978774d929e74/hotspot/src/share/vm/code/codeCache.cpp#L214

and open log in NMethodSweeper::sweep_code_cache

https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/9dbeec9bf0ab6ab16ce60e51bbf978774d929e74/hotspot/src/share/vm/runtime/sweeper.cpp#L347

then reproduce this issue ,the debug is below

Code Cache printInfo 358516, 1984
Code Cache printInfo 358516, 1984
Code Cache printInfo 358516, 1984
### Sweep at 0 out of 4252. Invocations left: 16
Code Cache printInfo 358557, 1984
UsageMemory Code Cache init 2555904, used 22820672,  commit 23003136, max 251658240
### Sweep at 265 out of 4253. Invocations left: 15
CodeCache::free nmethod
### Sweep at 530 out of 4253. Invocations left: 14
CodeCache::free nmethod
### Sweep at 795 out of 4253. Invocations left: 13
CodeCache::free nmethod
Code Cache printInfo 358598, 2060
PeakMemory Code Cache init 2555904, used 22818432,  commit 23003136, max 251658240
Code Cache printInfo 358598, 2060
Code Cache printInfo 358598, 2060
Code Cache printInfo 358598, 2060

I think this issue is caused by CodeCache sweep,
First get the CodeCache pool 's UsageMemory _next_segment is 358557 _freelist_segments is 1984, but now the MemoryPool:_peak_usage not update ,then the sweeper sweep the CodeCache , the _freelist_segments increase , after that , get the CodeCache pool's PeakMemory, _next_segment is 358598 _freelist_segments is 2060 . so the CodeCache's peak memory usage is less than the current memory usage.

@lumpfish
Copy link
Contributor

I had been investigating along these lines:

  1. Peak usage is updated in record_peak_memory_usage(), which gets the current memory usage and updates peak usage if it is now smaller than current usage
  2. record_peak_memory_usage() looks to be called at the start of each gc cycle, and whenever getPeakUsage is called via java.lang.management
  3. getUsage() (i.e. get current usage) just returns the value from get_memory_usage().
  4. So I guessed there is a timing window where in between two calls to record_peak_memory_usage a call to get_memory_usage might return a value greater than the last recorded peak_usage, but which was never recorded as a 'peak'.

Running locally

  1. I've never seen this issue running with a 1 CPU virtual machine
  2. I've run a test where peak usage is updated in get_memory_usage() whenever the current memory is greater than the previous peak usage, and have not seen the failure then. Such a 'fix' would have a performance hit.

I never thought to look at Code Cache sweep though.

@caozg2021
Copy link

the CodeCache usage is related to _next_segment - _freelist_segments

https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/9dbeec9bf0ab6ab16ce60e51bbf978774d929e74/hotspot/src/share/vm/memory/heap.cpp#L317

In my debug log , _next_segment alaways increase, but _freelist_segments increase or decrease dynamically .
@lumpfish did you modified all the memorypool‘s get_memory_usage , and how to test this modify would have a performance hit?

@lumpfish
Copy link
Contributor

lumpfish commented Mar 12, 2021

In my experiment I only modified MemoryUsage CodeHeapPool::get_memory_usage() - that is the only memory pool I've seen the error in - but I can't see a logical reason why that would be the only memory pool affected. Perhaps it is simply more dynamic in its behavior?

I added a call:

update_peak_memory_usage(initial_size(), used, committed, maxSize);

before MemoryUsage CodeHeapPool::get_memory_usage() returns its MemoryUsage object, and created

MemoryPool::update_peak_memory_usage(size_t initial_size, size_t used, size_t committed, size_t max_size)

with the same logic as record_peak_memory_usage() but without the call to get_memory_usage(), using the passed in parameters instead.

It appears to stop the problem occurring - but now peak_usage will be updated more frequently. As a rough performance assessment I was going to add counters for the number of times peak_usage is updated with and without the fix. I haven't got round to it yet - busy on other things at the moment.

I tried reproducing the issue using the openjdk head stream (openjdk/jdk repository), but could not. Have you seen it on a release other than jdk8?

@caozg2021
Copy link

My working JDK is jdk8,I only run the systemtest on jdk8.

@lumpfish
Copy link
Contributor

I've created a branch with my attempt at a fix:
AdoptOpenJDK/openjdk-jdk8u@master...lumpfish:fix_peak_memory_usage

The change currently includes code to turn the behaviour on and off and print some diagnostics using environment variables:

  1. If the environment variable PRINT_PEAK_USAGE_UPDATE_COUNT is set, the number of times _peak_usage has been updated (precision of 100 updates) is printed.
  2. If the environment variable UPDATE_PEAK_USAGE_IN_GET_MEMORY_USAGE is set _peak_usage is updated in get_memory_usage rather than in record_peak_memory_usage
  • This results in _peak_usage being updated more frequently, but closes the window for the result of calls to getUsage() exceeding the result of calls to getPeakUsage().

Running the test with and without UPDATE_PEAK_USAGE_IN_GET_MEMORY_USAGE set shows that with the change _peak_usage is updated approximately 1.5 to 2.5 times more frequently with the change depending on the memory pool. The test runs for approximately 5 minutes.

Output at end of the test without the change (i.e. without enviroment variable UPDATE_PEAK_USAGE_IN_GET_MEMORY_USAGE set):

_peak_usage_update_count for memoryPool Metaspace now 13700
_peak_usage_update_count for memoryPool Code Cache now 20600
_peak_usage_update_count for memoryPool PS Eden Space now 12800
_peak_usage_update_count for memoryPool PS Survivor Space now 12800
_peak_usage_update_count for memoryPool PS Old Gen now 12800
_peak_usage_update_count for memoryPool Compressed Class Space now 12900

Output at end of the test with the change (i.e. with enviroment variable UPDATE_PEAK_USAGE_IN_GET_MEMORY_USAGE set):

_peak_usage_update_count for memoryPool PS Eden Space now 32800
_peak_usage_update_count for memoryPool PS Survivor Space now 32800
_peak_usage_update_count for memoryPool PS Old Gen now 26400
_peak_usage_update_count for memoryPool Compressed Class Space now 26200
_peak_usage_update_count for memoryPool Code Cache now 34100
_peak_usage_update_count for memoryPool Metaspace now 27100

@lumpfish
Copy link
Contributor

lumpfish commented Mar 25, 2021

I have created a standalone client which can attach to any java process (started with the required command line options) and monitor the peak usage in that process.

Using this client the issue can also be seen when running the java db-shootout benchmark from the renaissance benchmarks project https://github.com/renaissance-benchmarks/renaissance. (Presumably other benchmarks would also demonstrate the issue.)

When running the benchmark for 5 minutes, peak usage being lower than current usage is reported about one in eight runs.

This is the client test case:
PeakUsageRemoteTest.java.txt. (The .txt extension is required for github to accept the file. Remove the .txt extension (leaving the .java extension) after downloading.

To allow the client to be attached to a running process, that process must have the following options added to its command line:

-Dcom.sun.management.jmxremote.port=1234 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false

To reproduce using the db-shootout benchmark:

  1. The test needs a machine with multiple CPUs. I have reproduced the failure on an Virtual Box Ubuntu guest with 4 cpus.
  2. Run using hotspot jdk8 (this issue has not (yet?) been seen on jdk11+).
  3. Compile the test case e.g. to /home/user/PeakUsageRemoteTest.class.
  4. Download the benchmark from https://github.com/renaissance-benchmarks/renaissance/releases/download/v0.11.0/renaissance-gpl-0.11.0.jar e.g. to /home/user/Downloads
  5. Start the benchmark with the command:
java -Dcom.sun.management.jmxremote.port=1234 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -jar /home/user/Downloads/renaissance-gpl-0.11.0.jar db-shootout -t 300
  1. From another terminal, run the test case
cd /home/user && java -cp . PeakUsageRemoteTest

The test case monitors the peak usage every second 240 times (so should complete before the benchmark). If the benchmark ends while the client test case is still running the client ends abnormally with various exceptions (ConnectException, SocketException or others).

If at any point peak usage is lower than current usage the test case outputs a message:

Code Cache : *** Peak usage (8827328) smaller than current usage (8834368) ***

showing the name of the memory pool with the error (Code Cache), and the usage values returned.
I've only ever seen the error reported for 'Code Cache'.

Since the error occurs infrequently this shell script will run both the benchmark and the client repeatedly and should ensure that the benchmark never ends before the client test case.

a=0
rc=0
client_secs=240
benchmark_secs=300
while [ $rc -eq 0 ]
do
  a=$((a + 1))
  echo "Iteration" $a
  # Run benchmark in background
  java -Dcom.sun.management.jmxremote.port=1234 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -jar /home/user/Downloads/renaissance-gpl-0.11.0.jar db-shootout -t $benchmark_secs >/dev/null 2>&1 &
  bpid=$!
  # Run client in foreground for less time than the benchmark
  java -cp . PeakUsageRemoteTest
  # If the test failed stop the loop
  if [ $? -ne 0 ]
  then
    rc=1
  fi
  # wait for the benchmark to finish
  echo "PeakUsageRemoteTest finished, waiting for background process $bpid to finish"
  wait $bpid
  echo "Background process $bpid finished"
  echo "Iteration" $a "finished"
done

@lumpfish
Copy link
Contributor

@andrew-m-leonard - FYI, #274 (comment)

@lumpfish
Copy link
Contributor

lumpfish commented Mar 31, 2021

I've now seen this on a jdk11 hotspot aarch64_linux nightly build for a different memory pool.
So not unique to jdk8 or to the 'Code Cache' memory pool.
Failing job link: https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.system_aarch64_linux/477/consoleFull

20:42:40  CL2 j> 2021/03/30 19:42:39.276 Starting to write data
20:43:04  CL2 stderr j> 2021/03/30 19:43:02.041   Peak usage smaller than current usage here:
20:43:04  CL2 stderr j> 2021/03/30 19:43:02.042   Memory Pool:              G1 Old Gen
20:43:04  CL2 stderr j> 2021/03/30 19:43:02.045   Memory Type:              HEAP
20:43:04  CL2 stderr j> 2021/03/30 19:43:02.045   Peak Usage:               init = 106954752(104448K) used = 82918016(80974K) committed = 119537664(116736K) max = 268435456(262144K)
20:43:04  CL2 stderr j> 2021/03/30 19:43:02.046   Current Usage:            init = 106954752(104448K) used = 83638704(81678K) committed = 119537664(116736K) max = 268435456(262144K)
20:43:04  CL2 stderr Exception in thread "main" java.lang.AssertionError: Peak Usage used memory smaller than Current Usage used memory

@sophia-guo
Copy link
Contributor

sophia-guo commented Jun 20, 2022

The test fails randomly with jdk8 at a high rate and according to the test history reports it's not platform nor machine related. While the test validity is still under investigation I will exclude it for jdk8 temurin.

sophia-guo added a commit to sophia-guo/openjdk-tests that referenced this issue Jun 20, 2022
llxia pushed a commit to adoptium/aqa-tests that referenced this issue Jun 24, 2022
* Exclude TestJlmRemoteMemoryNoAuth with jdk8 temurin

adoptium/aqa-systemtest#274 (comment)

Signed-off-by: Sophia Guo <[email protected]>

* Update os-arch to be linux-all

Test fails running within a docker container, should be excluded for
linux-all

Signed-off-by: Sophia Guo <[email protected]>

* Update exclude os-arch

Signed-off-by: Sophia Guo <[email protected]>

* Temp change for test only

Signed-off-by: Sophia Guo <[email protected]>

* update exclude tagged issues

Signed-off-by: Sophia Guo <[email protected]>

* Remove duplicate excludes

Signed-off-by: Sophia Guo <[email protected]>

* java/lang/ProcessBuilder/Basic.java on solaris-x64 jdk8

Signed-off-by: Sophia Guo <[email protected]>

* Remove duplicate exclude item

Signed-off-by: Sophia Guo <[email protected]>

* exclude known issue

Signed-off-by: Sophia Guo <[email protected]>

* Fix the indentation

Signed-off-by: Sophia Guo <[email protected]>
@sophia-guo
Copy link
Contributor

also happened with TestJlmRemoteMemoryAuth on alpine-linux

@jiekang
Copy link
Contributor

jiekang commented Oct 18, 2023

The similar JDK test was failing and excluded temporarily, but fixed here:
openjdk/jdk@c0cfad4

The peak usage is acquired before current usage which is done in TestJlmRemoteMemoryAuth, however they also have calls to bean.resetPeakUsage() before acquiring peak usage. TestJlmRemoteMemoryAuth does not seem to call that. Maybe doing so will bypass the issue where peak usage isn't up-to-date because it will force peak usage to be up-to-date.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

9 participants