Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constrain the most common constructs in non-writable files to not change. #391

Merged
merged 6 commits into from
Jan 29, 2021

Conversation

mattmccutchen-cci
Copy link
Member

@mattmccutchen-cci mattmccutchen-cci commented Jan 21, 2021

Summary (to go in squash commit message)

This PR addresses function and variable declarations (because they are the most obvious case and reasonably straightforward) and checked regions (because they came up in some existing regression tests). We'll leave #387 open for the tail of unhandled cases.

Also:

  • When 3C tries to change a non-writable file, issue an error diagnostic (not an assertion failure because there are known unhandled cases) rather than silently discarding the change.
  • Add a -dump-unwritable-changes flag to the 3c tool to dump the new version of the file when that diagnostic appears.
  • Add an error diagnostic when 3C tries to change a file under the base dir other than the main file in stdout mode. This is a separate feature (part of Better behavior with unchanged files #328) but ended up being easy to implement along with the diagnostic for a non-writable file.
  • Add tests for all the fixes (but not -dump-unwritable-changes).
  • Fix a bug where -warn-all-root-cause didn't work without -warn-root-cause, because this affected one of the new tests. The use of -warn-all-root-cause without -warn-root-cause in the affected test will serve as a regression test for this fix as well.

Fixes part of #387.

Old notes

A new test case tests the same combination of a .c file and a .h file in two configurations: (1) with -base-dir defaulting to a working directory that does not contain the .h file and (2) with -base-dir explicitly specified to contain the .h file. In both configurations, we check that the annotation in the .c file remains consistent with the .h file. Configuration 1 also tests the root cause diagnostic.

The old code fails the new test:

  • If I remove only the new constraint code in ProgramInfo.cpp, then the new assertion fails.
  • If I additionally remove the new assertion, then 3C produces no root cause warning and diagnostic verification fails.
  • If I additionally disable diagnostic verification, then the CHECK_LOWER in the .c file fails.

For review:

  • Is it worth testing multiple declarations of the same element, some in writable files and some not, or can we consider that an orthogonal concern? That is, if one occurrence in a non-writable file generates a constraint and the declaration merging code works correctly, then can we assume that this PR will behave correctly in the presence of multiple declarations? (The declaration merging code is currently broken in at least one case (Constraints on declaration after definition of same function are lost #399), but I could probably find a reasonable way to test this PR without hitting that other bug.) Decision: I think we decided not to test this case specifically.
  • How much risk do we think there is that our benchmark tests or other ongoing work trigger a 3C bug that trips the new error? Now that it is an error instead of an assertion failure and 3C will make all the rest of the changes it is supposed to make, we may be able to just ignore the error and nonzero exit code in manual work. But if the benchmark tests fail on a nonzero exit code (which they should and it appears they do: check_call raises an exception on a nonzero exit), we may have to do something. We could temporarily add a flag to 3C to downgrade the error to a warning. Alternatively, would it be feasible for me (or us) to run this PR on all the benchmark tests and in-progress ports before merging? Sounds like we won't bother.

Also, add an assertion that 3C doesn't generate changes to a
non-canWrite file.

Fixes #387.
@mattmccutchen-cci mattmccutchen-cci marked this pull request as ready for review January 26, 2021 18:04
@mattmccutchen-cci mattmccutchen-cci changed the title Constrain pointers in non-canWrite files to not change. (WIP) Constrain pointers in non-canWrite files to not change. Jan 26, 2021
Copy link
Collaborator

@john-h-kastner john-h-kastner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern is that there might be other places where we should be adding this constraint. In particular, this could apply to typdefs, type arguments, and c-style casts since these get constraint variables that might need to be rewritten.

On the other hand, I don't think these expressions get the correct constraint when they appear in macros either.

@mattmccutchen-cci
Copy link
Member Author

Good observation. Let me take a step back. While all of those cases may theoretically be able to cause type checking failures in which one file depends on a discarded change to an unwritable file, I'm assuming that no such failures (not even in the most basic case of a variable or function) have come up in our porting work so far or I would have heard about it, right?

When I realized that 3C wasn't constraining unwritable files at all, I thought it might be worth trying to fix that bug while I was thinking about it even though it hadn't bitten us yet. I hoped, perhaps naively in retrospect, that there would be a reasonably simple and complete fix. Now it's clear that that isn't the case. For type arguments and casts, I imagine it would take some serious work to figure out what constraints we should generate, and even then, it would be hard to be sure we didn't miss anything. I hoped that typedefs might be analogous to variables and functions and I might be able to add them easily to this PR, but when I looked into that, I learned that the way 3C currently handles typedefs is nutty: it generates a separate constraint variable for each reference to the typedef and then constrains them all equal, and there's another case having to do with pointers to structures that I don't understand.

So I think I'm going to cut my losses and propose that, for now, we just merge my fix for functions and variables since it's already written and at least the function case is tested (perhaps I should add a test with variables if I can figure out how to write a reasonable one). Again, we don't know that even that case is important, but at least the test gives us some coverage of different values of -base-dir having a different effect where AFAIK we previously had none. I'd leave #387 open as an umbrella for the remaining cases.

Re what to do when 3C generates a change to an unwritable file F: Since we believe there are many (so far rare but theoretically possible) cases in which that can happen, an assertion failure definitely seems inappropriate. Should I just make it an error diagnostic? Should I have 3C write its new version of F in some special location (not according to the original -output-postfix or -output-dir) so the user can see what the change was, or is it not worth implementing this until we run into the problem?

@john-h-kastner
Copy link
Collaborator

I'm happy to have this merged. It definitely improves the situation, and it may not be worthwhile to track down all the cases where it could potentially go wrong. I think were it's most likely to go wrong is with a typedefed pointer type in an included library. I believe @aaronjeline is working on a change that will mean a single constraint variable is allocated for each typedef. That should be easier to correctly add this constraint onto typdefs.

Long term (when we've tracked down and fixed any potential problems), it should definitely be an assertion eventually. Since we believe that 3C will sometimes attempt to write to these files, it makes sense to limit it to an error diagnostic for the movement. That is still a definite improvement over a silent failure to write. I don't think it's worth implementing anything to dump the new version of a file - it should be fairly rare, and we have a good idea of how to constraint a variable to prevent the rewriting from happening.

@mattmccutchen-cci
Copy link
Member Author

All sounds good. I'm working on (1) changing the assertion to an error diagnostic, (2) adding a test with a variable, and (3) adding a test with a typedef that intentionally triggers the new diagnostic. I'm mostly done, but...

I realized I had unintentionally applied the new error only to output-postfix mode and not to stdout mode. When I refactored the code to apply it to stdout mode too, 4 existing regression tests failed (3d-allocation, graphs, graphs2, linkedlist). So I may need that feature to output the new version of the unwritable file sooner than we thought. 🙁 Investigating now.

- Add -dump-unwritable-changes option.

- Add a test for canWrite constraints on variables.

- When we generate a change to a non-canWrite file, issue an error
  diagnostic instead of an assertion failure.  Also apply the error to
  stdout mode as well as -output-postfix, which exposed some new
  failures in existing tests.

- Add a test that a known case in which canWrite constraints are not
  implemented (a typedef) generates the expected diagnostic.

- While I'm here, go ahead and add the planned error diagnostic when
  stdout mode tries to write a file under the base dir other than the
  main file.  It didn't break any existing regression tests.
Also clean up some duplicative / circuitous code.

Add a dedicated test case for checked regions in unwritable files.
@mattmccutchen-cci mattmccutchen-cci changed the title Constrain pointers in non-canWrite files to not change. Constrain the most common constructs in non-writable files to not change. Jan 28, 2021
@mattmccutchen-cci
Copy link
Member Author

This PR snowballed from three lines in ProgramInfo.cpp to rather more than that... I think everything is in order now! 🎉

@mattmccutchen-cci
Copy link
Member Author

@kyleheadley @aaronjeline You're invited to look at this PR too if you're interested.

The code in 3C.cpp tried to make -warn-all-root-cause imply
-warn-root-cause, but this apparently only works if
_3CInterface::solveConstraints computes the interim state, which was
previously conditional on the _original_ -warn-root-cause option.
Copy link
Collaborator

@john-h-kastner john-h-kastner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@mattmccutchen-cci
Copy link
Member Author

Woohoo! I did not expect approval of a major revision to this PR on the first try.

@kyleheadley Do you want to review anything before I merge the PR? I remember you had concerns about the changes I made to tests in #365.

@kyleheadley
Copy link
Member

I can tell you to use more temp files if you want, but they aren't strictly necessary until a few other changes happen. Otherwise, my only concerns are about things I know nothing about, so if a more senior reviewer approves without questions, it's not worth the delay.

@mattmccutchen-cci
Copy link
Member Author

Thanks Kyle!

I can tell you to use more temp files if you want, but they aren't strictly necessary until a few other changes happen.

Right. As you probably know, I'm not happy about -output-postfix creating temporary files under clang/test/3C either, and fixing that for all test cases using the upcoming -output-dir option will be one of the next things I work on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants