Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

configuration: fix incorrect handling of out-dir argument #415

Closed

Conversation

perillo
Copy link
Contributor

@perillo perillo commented Mar 6, 2024

Currently the argument parsing code expands all the arguments.

This means that an out-dir argument having the same name of an executable is fully expanded, thus becoming an executable instead of a directory.

Examples:

> kcov python echo
kcov: error: Can't find or open echo
> kcov python /bin/echo
kcov: error: Can't open directory /usr/bin/python3.11/

Remove checking if the first argument is an elf file, since the executable must be after the out-dir argument.

Skip options and don't expand arguments, unless lstat fails.

Issue #414: kcov: error when out-dir is an executable

TODO

Add tests

@perillo perillo changed the title configuration: fix incorrect handling of out-dir configuration: fix incorrect handling of out-dir argument Mar 6, 2024
Currently the argument parsing code expands all the arguments.
This means that an out-dir parameter having the same name of an
executable is fully expanded, thus becoming an executable instead of a
directory.

Examples:
   > kcov python echo
   kcov: error: Can't find or open echo

   > kcov python /bin/echo
   kcov: error: Can't open directory /usr/bin/python3.11/

Remove checking if the first argument is an elf file, since the
executable must be after the out-dir argument.

Skip options and don't expand arguments, unless lstat fails.

Issue SimonKagstrom#414: kcov: error when out-dir is an executable
@perillo
Copy link
Contributor Author

perillo commented Mar 6, 2024

The code in this PR does not fully solve the problem. When the out-dir directory does not exists, the code behave incorrectly.

@SimonKagstrom
Copy link
Owner

Thanks for your PR!

One of the tests fail though: https://github.com/SimonKagstrom/kcov/actions/runs/8175261352/job/22356631660?pr=415, which was related to this issue: #368

@perillo
Copy link
Contributor Author

perillo commented Mar 6, 2024

Thanks for your PR!

One of the tests fail though: https://github.com/SimonKagstrom/kcov/actions/runs/8175261352/job/22356631660?pr=415, which was related to this issue: #368

Yes, it was my fault sorry. I tried to skip all options, but that was clearly incorrect.

Still, as I wrote the previous comment, the current change does not fully solve the bug.

I would like to run tests locally, but how do you run tests?

Thanks.

UPDATE

The problem does not seem to be related to my code to skip all the option when searching for an elf executable. Not sure what's causing the bug.

@SimonKagstrom
Copy link
Owner

Unfortunately, it's somewhat convoluted.

Perhaps the easiest is to look at how the ci does it, which is in .github/workflows/ci.yml.

You need to build some test binaries (i.e., cmake against the tests/ directory) and then run tests/tools/run-tests to excecute them. The .github/workflows/ci-run-tests.sh does that, but tests/tools/run-tests will print the arguments it expects.

@SimonKagstrom
Copy link
Owner

... and also look in .github/workflows/generic-build.sh, for the cmake + make command to bind the test binarieis.

@perillo
Copy link
Contributor Author

perillo commented Mar 6, 2024

@SimonKagstrom what do you think about following the standard way to mark extra arguments?
As an example:

kcov --clean coverage tests/python/link_main -- 5 --foo

instead of:

kcov --clean coverage tests/python/link_main 5 --foo

I also tried to remove completely the code scanning for an elf executable, and it seems to still work, but I need to run the tests.

@SimonKagstrom
Copy link
Owner

Well, I kind of like the current method, since it allows this pattern:

tests/unit-tests/ut --test-case="..."                  # First run without kcov
kcov /tmp/kcov tests/unit-tests/ut --test-case="..."   # Then arrow up and just add kcov at the beginnig

anyway, isn't the problem really that all paths are expanded with $PATH? Only the binary should be processed that way, and perhaps that can be done after the loop?

@perillo

This comment was marked as outdated.

@perillo
Copy link
Contributor Author

perillo commented Mar 7, 2024

Well, I kind of like the current method, since it allows this pattern:

tests/unit-tests/ut --test-case="..."                  # First run without kcov
kcov /tmp/kcov tests/unit-tests/ut --test-case="..."   # Then arrow up and just add kcov at the beginnig

anyway, isn't the problem really that all paths are expanded with $PATH? Only the binary should be processed that way, and perhaps that can be done after the loop?

The purpose of the scan is to find the argv index of the executable, so that getopt_long can stop processing arguments after the executable.

Unfortunately I think this is an impossible task, since you can never tell what parameter is an executable, when they are relative names instead of full paths.

@SimonKagstrom
Copy link
Owner

Yeah, sorry, I haven't touched that code in a long time, so I had forgotten quite how it worked...

Anyway, shouldn't it work to do:

  1. Skip through all options (i.e., starting with -)
  2. The next argument is the output directory, which should not be looked up in PATH
  3. After that is the binary, and all else are arguments to the binary

The path in 3 should be looked up in $PATH, if not found.

Your patch seems to pretty much do this, but maybe I'm missing something.

@perillo
Copy link
Contributor Author

perillo commented Mar 8, 2024

Yeah, sorry, I haven't touched that code in a long time, so I had forgotten quite how it worked...

Anyway, shouldn't it work to do:

1. Skip through all options (i.e., starting with -)

2. The next argument is the output directory, which should not be looked up in PATH

3. After that is the binary, and all else are arguments to the binary

The path in 3 should be looked up in $PATH, if not found.

Your patch seems to pretty much do this, but maybe I'm missing something.

Coincidentally I also found a possible solution:

  1. Expand all positional arguments, but stop only when the second argument is found

Unfortunately there is still a small window for a bug:

kcov --include-path python

This is possible because GNU getopt_long parses both --name=arg and --name arg.

Additionally I would like to improve the current code, by adding two additional functions in utils.cc:

  • look_path, for expanding an executable (name borrowed from Go os.exec std package)
  • is_elf, a simple wrapper for IParserManager::getInstance().matchParser

These change should make the code more readable.

I will try to implement this, testing the code locally before push the code.

Let me know if you don't like adding the two new functions.

@SimonKagstrom
Copy link
Owner

Sounds very good, thanks!

The only comment I have is that is_elf is a bit Linux/FreeBSD:ish. I mainly run on MacOS now, which uses Mach-O. Now, is_macho would be a fun name, but I think is_binary_executable is more generic and probably better.

@perillo
Copy link
Contributor Author

perillo commented Mar 9, 2024

@SimonKagstrom I compared how kcov behaves between this change and the old code.

The example is:

build/bin/kcov --python-parser=python2 cover tests/python/main 5 --foo

With this change I get:

NOT FOUND: 6, (null)
build/bin/kcov: unrecognized option '--foo'
kcov: error: Unrecognized option: -

With the old master:

NOT FOUND: 1, --python-parser=python2
NOT FOUND: 2, cover
hello!
...

So the old code inner for loop (over PATH) was the reason why the code actually worked!
This was not documented.

This can be probably fixed in two way:

  1. Use -- after the executable (this is the correct way)
  2. Make IParserManager.matchParse also parse python and bash scripts
    I assumed this was true, so I named the function is_supported_executable.
  3. Add an inner loop, but I have no idea how many iterations are needed.
    Some testing is necessary, but I feels this totally unreadable.

@SimonKagstrom
Copy link
Owner

  1. Use -- after the executable (this is the correct way)
  2. Make IParserManager.matchParse also parse python and bash scripts
    I assumed this was true, so I named the function is_supported_executable.

It should already match python and bash scripts. It calls matchParser in e.g., python-engine.cc, which looks for python in the shebang at the start of the file, so I don't quite understand why it doesn't work

@SimonKagstrom
Copy link
Owner

I wonder if it's just something like this that's needed:

--- a/src/utils.cc
+++ b/src/utils.cc
@@ -778,6 +778,13 @@ uint32_t hash_file(const std::string &filename)

 int look_path(const std::string &file, std::string *out)
 {
+       struct stat st;
+       if (file_exists(file))
+       {
+               *out = get_real_path(file);
+               return 0;
+       }
+
        const char *sys_path = getenv("PATH");
        if (!sys_path)
                return -1;
@@ -788,7 +795,6 @@ int look_path(const std::string &file, std::string *out)
        {
                const std::string root = *it;
                const std::string path = get_real_path(root + "/" + file);
-               struct stat st;

                if (lstat(path.c_str(), &st) < 0)
                        continue;

I.e., accept the file if it's either already given in a relative or absolute path?

@perillo
Copy link
Contributor Author

perillo commented Mar 9, 2024

I wonder if it's just something like this that's needed:
...
I.e., accept the file if it's either already given in a relative or absolute path?

Yes, I have reached the same conclusion. The problem was that tests/python/main is always incorrectly expanded. It has nothing to do with parser matching.

As an example, Go os.exec package tries to resolve a file name with slash without consulting PATH. In an older version it could return a path relative to the current directory (this is probably what we want).

Currently I'm trying to split look_path into two functions: find_executable and look_path (that calls find_executable).

@perillo
Copy link
Contributor Author

perillo commented Mar 9, 2024

The code seems to work but I got two errors:

======================================================================
FAIL: runTest (bash_linux_only.bash_sh_shebang.runTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/manlio/src/contrib/kcov/github/perillo/kcov/tests/tools/bash_linux_only.py", line 12, in runTest
    assert parse_cobertura.hitsPerLine(dom, "sh-shebang.sh", 4) == 1
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

======================================================================
FAIL: runTest (python.python_issue_368_can_handle_symlink_target.runTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/manlio/src/contrib/kcov/github/perillo/kcov/tests/tools/python.py", line 39, in runTest
    assert b"unrecognized option '--foo'" not in o
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

@SimonKagstrom
Copy link
Owner

SimonKagstrom commented Mar 10, 2024

Funnily enough, the python test works fine here (MacOS, but the python stuff should be exactly the same):

runTest (python.python_can_set_legal_parser.runTest) ... ok
runTest (python.python_exit_status.runTest) ... ok
runTest (python.python_issue_368_can_handle_symlink_target.runTest) ... ok
runTest (python.python_tricky_single_dict_assignment.runTest) ... expected failure
runTest (python.python_tricky_single_line_string_assignment.runTest) ... ok
runTest (python.python_unittest.runTest) ... ok

So since it's an improvement, I'd suggest to add that to the PR.

Edit: The bash failure is Linux-specific, so wasn't run on my machine (so all tests ran fine here).

@perillo
Copy link
Contributor Author

perillo commented Mar 10, 2024

My handle-out-dir-correctly branch is in a diverged state (due to me messing up when trying to replace the last commit). I will try to solve the problem without opening a new PR.

@SimonKagstrom
Copy link
Owner

@perillo is this PR still relevant, or was it solved by the other?

@perillo
Copy link
Contributor Author

perillo commented Mar 12, 2024 via email

@perillo
Copy link
Contributor Author

perillo commented Mar 13, 2024

I have a change that should finally fix the problem, but I think that reusing this PR is not a good idea.

@SimonKagstrom
Copy link
Owner

OK, so feel free to close this and open a new one when you feel ready! Thanks!

@perillo
Copy link
Contributor Author

perillo commented Mar 15, 2024

Replaced by #426.

@perillo perillo closed this Mar 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants