Regression: GPL false positive license detections with v32.3.0 #4005

Open
alexzurbonsen opened this issue Dec 5, 2024 · 4 comments · May be fixed by #4009

@alexzurbonsen
Contributor

alexzurbonsen commented Dec 5, 2024

Description

With v32.3.0 we are observing false positive GPL license detections that did not occur with v32.2.1.

The examples we have found are caused by matches with the gpl_bare_word_only.RULE. In v32.2.1 these detections were categorized as license_clues.

An example:

https://github.com/steinwurf/boost/blob/ade3189e2c03fd975dbfa667a4f49e98a49d2fdf/boost/assign/ptr_list_of.hpp#L196

In that file, lines 196-198

        assign_detail::generic_ptr_list<T> gpl;
        gpl();
        return gpl;

yield three GPL detections with v32.3.0. (There are other similar snippets in the file.)
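
To illustrate why a bare-word rule trips on this snippet, here is a minimal Python sketch. It is a toy stand-in and not ScanCode's actual matching engine: the regular expression `BARE_GPL` and the helper `bare_word_hits` are hypothetical names introduced only for this illustration. It shows that the identifier `gpl` appears as a standalone token on each of the three lines, while `generic_ptr_list` does not contain it.

```python
import re

# Toy stand-in for a "bare word" rule: the token "gpl" on its own.
# Assumption: this is NOT how ScanCode matches rules, only an illustration.
BARE_GPL = re.compile(r"\bgpl\b", re.IGNORECASE)

# Lines 196-198 of boost/assign/ptr_list_of.hpp, as quoted in this issue.
SNIPPET = """\
assign_detail::generic_ptr_list<T> gpl;
gpl();
return gpl;
"""


def bare_word_hits(text, first_line=1):
    """Return (line_number, stripped_line) pairs that contain the bare token."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=first_line)
        if BARE_GPL.search(line)
    ]


hits = bare_word_hits(SNIPPET, first_line=196)
for number, line in hits:
    print(number, line)
```

Any matcher that treats the bare token as license evidence needs additional context to tell a C++ variable name from a license reference.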

How To Reproduce

scancode -l <your path to boost repo>/boost/boost/assign/ptr_list_of.hpp --json scancode.json

Run the scan once with v32.3.0 and once with v32.2.1.

See attached scancode files for my results.
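
One possible way to compare the two result files is sketched below in Python. It assumes the JSON layout I see in my 32.x output: a top-level `files` list whose entries carry `license_detections` (each with a `matches` list) and `license_clues`, where every match has a `rule_identifier`. The helpers `rule_counts`, `load_scan` and `compare` are hypothetical names for this sketch and are not part of ScanCode.

```python
import json
from collections import Counter


def rule_counts(scan):
    """Count rule identifiers seen in detections and in clues.

    Assumes the key names described above; missing keys are
    treated as empty so the sketch degrades gracefully.
    """
    detections, clues = Counter(), Counter()
    for entry in scan.get("files") or []:
        for detection in entry.get("license_detections") or []:
            for match in detection.get("matches") or []:
                detections[match.get("rule_identifier") or "<unknown>"] += 1
        for clue in entry.get("license_clues") or []:
            clues[clue.get("rule_identifier") or "<unknown>"] += 1
    return detections, clues


def load_scan(path):
    """Load a ScanCode JSON result file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def compare(old_scan, new_scan):
    """Return per-rule counts of detections and clues for both scans."""
    old_det, old_clues = rule_counts(old_scan)
    new_det, new_clues = rule_counts(new_scan)
    rules = set(old_det) | set(old_clues) | set(new_det) | set(new_clues)
    return {
        rule: {
            "old_detections": old_det[rule],
            "old_clues": old_clues[rule],
            "new_detections": new_det[rule],
            "new_clues": new_clues[rule],
        }
        for rule in sorted(rules)
    }
```

With the attached files, this would be called as `compare(load_scan("scancode_32.2.1.json"), load_scan("scancode_32.3.0.json"))`. A rule whose count moves from the clues column to the detections column is a candidate for the regression described here.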

System configuration

For bug reports, it really helps us to know:

  • What OS are you running on? macOS 15.1.1
  • What version of scancode-toolkit was used to generate the scan file? See above.
  • What installation method was used to install/run scancode? pip with git version tag checked out

scancode_32.2.1.json
scancode_32.3.0.json

@alexzurbonsen
Contributor Author

tagging @meretp @leslielazzarino

@AyanSinhaMahapatra
Member

@alexzurbonsen thanks a lot for your report! At first glance, this does seem like a side effect of modifying the false positive detection heuristics in f9863e6 (we report possible false positives as license clues), which caused the regression. This would be a nice example to add, and a chance to further refine this part.

@alexzurbonsen
Contributor Author

@AyanSinhaMahapatra thanks for the swift reply! Not sure I got it right: Are you looking into it or should I do something?

alexzurbonsen linked a pull request on Dec 9, 2024 that will close this issue
@alexzurbonsen
Contributor Author

Hey @AyanSinhaMahapatra, I think I found the problem and opened #4009, see above.
