t/op/threads.t: Test can hang after only 9 out of 30 unit tests run #264

Open
jkeenan opened this issue Sep 4, 2020 · 8 comments
Labels
bug (Something isn't working), threads (threads-related problems)

Comments

Collaborator

jkeenan commented Sep 4, 2020

t/op/threads.t appears to be vulnerable to hangs which cause the file to be graded as FAIL.

See this smoke-test report of our simulation in Perl 5: http://perl5.test-smoke.org/report/118070.

I went to the NetBSD VM that generated this smoke testing report. Using the same perl that generated the second of the two FAILS in that report, I first called:

$ ./perl -Ilib -V:config_args
config_args='-des -Dusedevel -Duseithreads -DDEBUGGING';
[perl-reporter-08] $ ./perl -Ilib t/op/threads.t
1..30
ok 1 - delete() under threads
ok 2 - weaken ref under threads
ok 3 - weaken ref \#2 under threads
 # parent 19460: continue
 # kid 1 before sort
 # parent 19460: continue
 # kid 2 before sort
 # parent 19460: waiting for join
 # kid 1 after sort, sleeping 1
 # kid 2 after sort, sleeping 1
 # kid 1 exit
 # parent 19460: thread exited
 # parent 19460: waiting for join
 # kid 2 exit
 # parent 19460: thread exited
ok 4
ok 5 - cloning constant subs
ok 6 - Ensure PL_linestr can be cloned
ok 7 - threads in CHECK block
ok 8 - threads in INIT block
ok 9 - Bug \#41138
ok 10 - [perl \#45053]
ok 11
ok 12
ok 13 - clone seen-evals
ok 14 - undefing a typeglob doesn't cause a crash during cloning
ok 15 - No del_backref panic [perl \#70748]
ok 16 - No del_backref panic [perl \#70748] (2)
ok 17 - returning a closure
ok 18 - Test for 34394ecd06e704e9
ok 19 - RT \#73046
ok 20 - 0 refcnt neither on tmps stack nor in @_
ok 21 - RT \#73086 - clone used to clone active pads
ok 22 - Just special casing lexicals in ?{ ... }
ok 23 - 0 refcnt during CLONE
ok 24 - avoid peephole recursion
ok 25 - Pipes shared between threads do not block when closed
ok 26 - globs cloned and joined are not recloned
ok 27 - no crash when deleting $::{INC} in thread
ok 28 - no crash modifying extended array element
ok 29 - RT \#36664: Strange behavior of shared array
ok 30 - RT \#41121 binmode(STDOUT,":encoding(utf8) does not crash

So far so good. But then I called:

[perl-reporter-08] $ cd t ; ./perl harness -v op/threads.t; cd -

ok 1 - delete() under threads
ok 2 - weaken ref under threads
ok 3 - weaken ref \#2 under threads
ok 4
ok 5 - cloning constant subs
ok 6 - Ensure PL_linestr can be cloned
ok 7 - threads in CHECK block
ok 8 - threads in INIT block
ok 9 - Bug \#41138

The program hung there for several minutes, during which I began typing this report. Then ...

 # Test process timed out - terminating
Failed 21/30 subtests 

Test Summary Report
-------------------
op/threads.t (Wstat: 139 Tests: 9 Failed: 0)
  Non-zero wait status: 139
  Parse errors: Bad plan.  You planned 30 tests but ran 9.
Files=1, Tests=9, 183 wallclock secs ( 0.02 usr  0.05 sys +  2.36 cusr  0.57 csys =  3.00 CPU)
Result: FAIL
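
For reading the `Wstat: 139` line above, here is a minimal sketch. It assumes the harness reports the child's raw wait status in the layout perlvar documents for `$?`: the low 7 bits are the signal number, bit 0x80 is the core-dump flag, and the high byte is the exit code. The function name `decode_wstat` is made up for illustration.

```shell
# Split a raw wait status into signal, core-dump flag, and exit code.
# Assumes the traditional layout described for $? in perlvar.
decode_wstat() {
    echo "signal=$(( $1 & 127 )) core=$(( ($1 & 128) >> 7 )) exit=$(( $1 >> 8 ))"
}

decode_wstat 139    # prints: signal=11 core=1 exit=0
```

Under that assumption, the test process was terminated by signal 11 with the core flag set rather than exiting normally. Signal numbering is platform-specific; `kill -l 11` prints the signal's name on the local system.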

ISTR this problem has appeared previously. Granted, this failure occurred in the smoke-me/jkeenan/cumberland-blues branch in Perl 5 -- not in tag alpha-02-MC-4 in our repository. But I know this is not the first time I've seen a hang at this point.

Thank you very much.
Jim Keenan

jkeenan added the bug (Something isn't working), threads (threads-related problems), and blocker-to-strict-by-default (Blocks completion of Objective 2, strict on by default) labels Sep 4, 2020
Collaborator Author

jkeenan commented Sep 4, 2020

> t/op/threads.t appears to be vulnerable to hangs which cause the file to be graded as FAIL.
>
> See this smoke-test report of our simulation in Perl 5: http://perl5.test-smoke.org/report/118070.
>
> Parse errors: Bad plan. You planned 30 tests but ran 9.
> Files=1, Tests=9, 183 wallclock secs ( 0.02 usr 0.05 sys + 2.36 cusr 0.57 csys = 3.00 CPU)
> Result: FAIL
>
> ISTR this problem has appeared previously. Granted, this failure occurred in the smoke-me/jkeenan/cumberland-blues branch in Perl 5 -- not in tag `alpha-02-MC-4` in our repository. But I know this is not the first time I've seen a hang at this point.

Indeed, we got it a lot when we ran alpha-01 thru smoke-testing. See http://perl5.test-smoke.org/submatrix?test=../t/op/threads.t&pversion=7.0.0

Collaborator Author

jkeenan commented Sep 4, 2020

Relevant code in t/op/threads.t:

# [perl #45053] Memory corruption with heavy module loading in threads
#
# run-time usage of newCONSTSUB (as done by the IO boot code) wasn't
# thread-safe - got occasional coredumps or malloc corruption
watchdog(180, "process");
{
    local $SIG{__WARN__} = sub {};   # Ignore any thread creation failure warnings
    my @t;
    for (1..10) {
        my $thr = threads->create( sub { require IO });
        last if !defined($thr);      # Probably ran out of memory
        push(@t, $thr);
    }
    $_->join for @t;
    ok(1, '[perl #45053]');
}

sub matchit {
    is (ref $_[1], "Regexp");
    like ($_[0], $_[1]);
}

threads->new(\&matchit, "Pie", qr/pie/i)->join();

# tests in threads don't get counted, so
curr_test(curr_test() + 2);

Collaborator Author

jkeenan commented Sep 5, 2020

@atoomic, @toddr

This ticket is the last one that I think we need to resolve before merging alpha-dev-02-strict into alpha and deeming Objective 2 achieved.

I think we need to rule out the possibility that these (admittedly intermittent) test failures were caused by changes we made after beginning work on strict-by-default.

If, instead, the failures are due to a poor interaction between the unit test and a memory-constrained environment, then we will need to take up the issue with P5P.
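
One way to probe the memory-constrained hypothesis locally would be to run the test under a lowered virtual-memory limit. This is a sketch only: `with_mem_limit` is a hypothetical helper, the 512 MB figure is an arbitrary choice, and `ulimit -v` is a shell extension that not every shell or platform honors.

```shell
# Run a command in a subshell with a capped virtual-memory limit (in KB),
# so the limit does not leak into the calling shell.
with_mem_limit() {
    limit_kb=$1
    shift
    ( ulimit -v "$limit_kb" && "$@" )
}

# Example (from the top of a built perl checkout):
#   with_mem_limit 524288 ./perl -Ilib t/op/threads.t
```

If the hang only shows up under a tight limit, that would point toward the environment rather than toward our changes.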

Can you take a look?

Thank you very much.
Jim Keenan

Owner

atoomic commented Sep 15, 2020

@jkeenan do you know whether this issue is specific to this branch, or does blead have the same problem?

Collaborator Author

jkeenan commented Sep 15, 2020

> @jkeenan do you know whether this issue is specific to this branch, or does blead have the same problem?

Unfortunately, with the apparent demise of perl5.test-smoke.org, I am no longer able to answer that question beyond what I've already posted in this ticket. :-(

Owner

atoomic commented Sep 17, 2020

I recompiled Perl on FreeBSD (on the NYC perlmonger server) from alpha-dev-02-strict@a9a5af8e53:

> git clean -dxf; ./Configure -Dcc="ccache gcc" -Dusedevel -Duseithreads -DDEBUGGING -des
> TEST_JOBS=4 make -j4 test_harness
> ./perl -Ilib -V:config_args
config_args='-Dcc=ccache gcc -Dusedevel -Duseithreads -DDEBUGGING -des';

Then I ran the test 100 times, and all 100 runs passed:

> cd t; for i in $(seq 100); do echo "====== $i"; ./perl harness -v op/threads.t; done; cd -
...
...
====== 100
op/threads.t ..
1..30
ok 1 - delete() under threads
ok 2 - weaken ref under threads
ok 3 - weaken ref \#2 under threads
# parent 83305: continue
# kid 1 before sort
# parent 83305: continue
# parent 83305: waiting for join
# kid 2 before sort
# kid 1 after sort, sleeping 1
# kid 2 after sort, sleeping 1
# kid 1 exit
# parent 83305: thread exited
# parent 83305: waiting for join
# kid 2 exit
# parent 83305: thread exited
ok 4
ok 5 - cloning constant subs
ok 6 - Ensure PL_linestr can be cloned
ok 7 - threads in CHECK block
ok 8 - threads in INIT block
ok 9 - Bug \#41138
ok 10 - [perl \#45053]
ok 11
ok 12
ok 13 - clone seen-evals
ok 14 - undefing a typeglob doesn't cause a crash during cloning
ok 15 - No del_backref panic [perl \#70748]
ok 16 - No del_backref panic [perl \#70748] (2)
ok 17 - returning a closure
ok 18 - Test for 34394ecd06e704e9
ok 19 - RT \#73046
ok 20 - 0 refcnt neither on tmps stack nor in @_
ok 21 - RT \#73086 - clone used to clone active pads
ok 22 - Just special casing lexicals in ?{ ... }
ok 23 - 0 refcnt during CLONE
ok 24 - avoid peephole recursion
ok 25 - Pipes shared between threads do not block when closed
ok 26 - globs cloned and joined are not recloned
ok 27 - no crash when deleting $::{INC} in thread
ok 28 - no crash modifying extended array element
ok 29 - RT \#36664: Strange behavior of shared array
ok 30 - RT \#41121 binmode(STDOUT,":encoding(utf8) does not crash
ok
All tests successful.
Files=1, Tests=30,  2 wallclock secs ( 0.02 usr  0.00 sys +  2.03 cusr  0.09 csys =  2.14 CPU)
Result: PASS
~/perl7
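
For an intermittent hang, a tally of failed runs is easier to compare across machines than 100 transcripts. Below is a sketch of the same loop that counts non-zero exits. `count_failures` is a hypothetical helper, not part of the perl test suite, and it assumes the harness exits non-zero when a test file fails.

```shell
# Run a command N times and report how many runs exited non-zero.
count_failures() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Discard the transcript; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
            echo "run $i: FAIL"
        fi
        i=$((i + 1))
    done
    echo "failures: $failures/$runs"
}

# Example (from the top of a built perl checkout):
#   cd t && count_failures 100 ./perl harness op/threads.t
```

A hang still costs the full watchdog timeout per failing run, so a batch with several failures can take a while.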

I cannot reproduce the described issue.

I am not saying it does not exist, but this seems to be an uncommon issue.

If there is such an issue, I also doubt that it is related to strict.
It is more likely a problem with the interaction between threads/processes.

I do not think this issue should block us from moving forward.
If it becomes a common pattern later, we can tackle it at that time.

Collaborator Author

jkeenan commented Sep 17, 2020

[snip]

> I cannot reproduce the described issue.
>
> I am not saying it does not exist, but this seems to be an uncommon issue.
>
> If there is such an issue, I also doubt that it is related to strict.
> It is more likely a problem with the interaction between threads/processes.
>
> I do not think this issue should block us from moving forward.
> If it becomes a common pattern later, we can tackle it at that time.

Okay, thanks for investigating this. I will close this issue and prepare a Merge Candidate tag.

@jkeenan jkeenan closed this as completed Sep 17, 2020
jkeenan removed the blocker-to-strict-by-default (Blocks completion of Objective 2, strict on by default) label Sep 17, 2020
Collaborator Author

jkeenan commented Sep 19, 2020

I'm going to re-open this issue, though not as a blocker to the completion of Objective 2.

Here is an additional case where this test failed in our simulation branch in Perl 5:
http://perl.develop-help.com/raw/?id=257242

NetBSD 9.0, 7 out of 8 configurations PASS; FAIL on -Duseithreads -Duse64bitall without debugging.

jimk

@jkeenan jkeenan reopened this Sep 19, 2020