-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better behavior with unchanged files #328
Comments
Today I learned that there's at least one case in which |
I asked in today's meeting if anyone had more information about this. Mike said the inconsistent behavior should be considered a bug. But if we decide to always output unchanged files, we wouldn't have to do the work to investigate where the current inconsistency is coming from. Should we do that or is it useful to not output unchanged files? (In terms of workflow, do users like to use |
I would be OK with outputting the new file every time. We could also output a diagnostic under a "verbose" flag that says the content of the file was not changed. This diagnostic can be ignored most of the time (i.e., the flag is not on by default). |
Thanks. Per the last sentence that I recently added to the initial comment: are you OK with outputting all header files in the base dir (and thus allowed to be written by 3C) that are transitively included from the files on the command line, even if there are no changes to make to them? That could be a huge number of unchanged files. But if the user has a script to diff or whatever, a larger number of files may not be a problem. |
Hm, good point. After a little more thought, I feel that this is not an issue worth pursuing. We are spending a lot of time on what seems to be to be a minor problem. Whether the file is output or not when there is no change, you have to deal with it in post-processing. To me, to avoid clutter to the file system, I lean to the current behavior. Let's just implement the "output directory" feature and aim to only output files that changed, as we have now. If occasionally one is output that should not be, we can decide whether it rises to the level of a real problem and fix it then. |
Sounds reasonable for the modes that create files ( For the mode that outputs to stdout, I still think it's bad that a script can suddenly start to silently exhibit incorrect behavior if the file is unchanged and the script assumes it is supposed to be empty. I might even be in favor of deprecating the mode that outputs to stdout altogether, since nearly all real users will have multi-file programs (which this mode can't handle in general) and the regression tests will no longer use this mode. But if you think this is low priority, let me know and we can mark the issue as such. |
For the output mode to stdout, I can be persuaded that we should always output the result. It's surprising, when running this mode, and nothing is emitted, I'll agree. |
Updates: At Thursday's meeting, some team members were emphatic that:
Outstanding design questions (stretching the scope of this issue a bit, but I think it's a lesser evil than filing a separate one):
|
I think we care less about (2) than about (1). To me it's fine to just reject rewriting a program in stdout mode if included headers would also have to be changed. I think that addresses your issue? |
Included headers never "have" to be changed: we can always constrain them to not change. (In the worst case, this will just cause the new version of the main file to be identical to the original.) To review, I think we're considering three modes here:
|
For stdout mode, it should fail if you give it more than one file on the command line. When you give it one file, the only question is files it |
We discussed the sub-modes of stdout mode in today's meeting: We don't want mode 2 after all, but we do want mode 3 for the case where the user wants to solve for all files under the Remaining questions: Do we want to support mode 1? (I'm not clear if we previously made a decision about this or, if so, whether the decision stands.) If so, what flag(s) control the choice of mode, and what is the default if none of those flags are passed? In mode 3, is the default |
Update from today's meeting: Stdout mode will support only mode 3 and the default |
…nge. (#391) This PR addresses function and variable declarations (because they are the most obvious case and reasonably straightforward) and checked regions (because they came up in some existing regression tests). We'll leave #387 open for the tail of unhandled cases. Also: - When 3C tries to change a non-writable file, issue an error diagnostic (not an assertion failure because there are known unhandled cases) rather than silently discarding the change. - Add a `-dump-unwritable-changes` flag to the `3c` tool to dump the new version of the file when that diagnostic appears. - Add an error diagnostic when 3C tries to change a file under the base dir other than the main file in stdout mode. This is a separate feature (part of #328) but ended up being easy to implement along with the diagnostic for a non-writable file. - Add tests for all the fixes (but not `-dump-unwritable-changes`). - Fix a bug where `-warn-all-root-cause` didn't work without `-warn-root-cause`, because this affected one of the new tests. The use of `-warn-all-root-cause` without `-warn-root-cause` in the affected test will serve as a regression test for this fix as well. Fixes part of #387 and a few unrelated minor issues.
- Add -output-dir option to write updated files to a directory structure parallel to the base dir (#347). When -output-dir is used, a source file outside the base dir can't be handled because there is no way to compute its output path. For consistency, this is now an error even when -output-dir is not used. - Convert all 3C regression tests from -output-postfix to -output-dir to avoid leaving temporary files in the clang/test/3C directory (#378). - Expand "3c -help" documentation. In particular, direct the user to pass "--" when they don't want to use a compilation database to avoid accidentally using unwanted compiler options and suppress the warning if no compilation database is found (#343). - For consistency, have stdout mode output the main file even if it is unchanged (#328). - Fix bugs in matching of file paths against the base dir (#327). - Other minor bug fixes: see the pull request description for details. Co-authored-by: John Kastner <[email protected]>
Normally,
3c
prints the rewritten version of a filefoo.c
to standard output, or if-output-postfix=checked
is passed, the rewritten version is written tofoo.checked.c
. However, if the rewritten version is identical to the original, then:-output-postfix
, the output to standard output is empty. This edge case is bound to cause scripts to behave unexpectedly if they think the result of rewriting is actually an empty file. John pointed out that many of the regression tests take advantage of this behavior to assert that a file needs no rewriting by checking that the output is empty (look forcount 0
), but we think the tests could just as easily (and more clearly!)cmp
to the original file.-output-postfix
, thefoo.checked.c
file is not created. Hopefully scripts not designed for this case will just error out instead of doing something wrong. Some users might consider this useful and there might be a performance benefit in some workflows, but there is at least one other reason why a given file might not be written (-base-dir
), soif we keep the behavior, we should tell the user why a given file wasn't written.Update 2021-02-13: We've deprioritized that part because it's now an error if 3C generates a change to a file outside the base dir (Properly constrain non-canWrite
files to not change (remaining cases) #387), so we're not aware of any other common reason for not writing a file that would need to be distinguished from the file being unchanged; if one arises, we'll try to make sure that that one generates a distinctive message. And the code that writes the files only has a list of modified rewrite buffers; I'm not aware of an easy way to get a list of files that were read but not modified.This issue is to decide whether to stop treating unchanged files specially in one or both cases (or maybe make it an option), and if we keep the special treatment with
-output-postfix
, provide better diagnostics.Note that even if we write unchanged files passed on the command line (by default or as an option), we might not want to write unchanged files included from the
-base-dir
, at least without an additional option.The text was updated successfully, but these errors were encountered: