Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use matcher from a .gitignore against includePatterns #1297

Merged
merged 2 commits into from
Oct 30, 2024

Conversation

Blugatroff
Copy link
Contributor

Description of the change

When attempting to determine whether a .gitignore should be respected, we previously tried to see, whether any of the patterns in the .gitignore match against the includePatterns.
As @fsoikin noticed here this does not really make any sense in case the includePatterns really are actual patterns and not just paths.
Therefore, instead of matching the .gitignore patterns against the includePatterns directly, we now only match them against the base of each includePattern. The base of a pattern is always a path. However, some includePatterns may not have a base, for example when searching for **/foo/bar. What should one do here? I think disrespecting .gitignores is not really feasible in such a case, because how can we tell if something that the .gitignore excludes could match the baseless pattern **/foo/bar without actually walking the excluded directories?

So for example, if there was a file: ./abc/foo and a .gitignore with the pattern abc and we glob for **/foo, then we do not find ./abc/foo, because the .gitignore was respected.

I think, this is the behaviour we would want anyway, we only ignore a .gitignore, when it is obvious that the glob target is necessarily within an ignored directory. Like globbing for
.spago/p/arrays-7.3.0/src/**/*.purs when .spago is gitignored.

This was already the behaviour previously. In the above example, we would have previously matched abc against **/foo, which would be false and therefore no conflict would have been dectected, so the .gitignore would also have been respected.

This PR only makes this behaviour explicit and removes the, incidentally working, match of a pattern against another pattern. The new test also passes without this change, but i think it's a nice kind of documentation.

I wonder, whether the globbing code should even attempt to disrespect .gitignore files, doesn't the calling code always know, whether it should limit its search to the non-ignored files?
For example, when globbing for the source files of a package in .spago, we know beforehand, that a nonGitignoringGlob (otherwise known as glob :^D) would suffice.

Checklist:

  • Added the change to the "Unreleased" section of the changelog
  • Added some example of the new feature to the README
  • Added a test for the contribution (if applicable)

When attempting to determine whether a .gitignore should be respected,
we previously tried to see, whether any of the patterns in the
.gitignore match against the includePatterns.
As @fsoikin noticed [here](https://github.com/purescript/spago/pull/1296/files/b84981dd086219bc1157a440667aca92ea23efb4#r1815996597)
this does not really make any sense in case the includePatterns really
are actual patterns and not just paths.
Therefore, instead of matching the .gitignore patterns against the
includePatterns directly, we now only match them against the base of
each includePattern. The base of a pattern is always a path.
However, some includePatterns may not have a base, for example when
searching for `**/foo/bar`. What should one do here?
I think disrespecting .gitignores is not really feasible in such a case,
because how can we tell if something that the .gitignore excludes
could match the baseless pattern `**/foo/bar` without actually walking
the excluded directories?

So for example, if there was a file: `./abc/foo` and a .gitignore with
the pattern `abc` and we glob for `**/foo`, then we do not find
`./abc/foo`, because the .gitignore was respected.

I think, this is the behaviour we would want anyway, we only ignore a
.gitignore, when it is obvious that the glob target is necessarily
within an ignored directory. Like globbing for
`.spago/p/arrays-7.3.0/src/**/*.purs` when `.spago` is gitignored.

This was already the behaviour previously. In the above example, we
would have previously matched `abc` against `**/foo`, which would be
false and therefore no conflict would have been dectected, so the
.gitignore would also have been respected.

This PR only makes this behaviour explicit and removes the, incidentally
working, match of a pattern against another pattern.
The new test also passes without this change, but i think it's a nice
kind of documentation.

I wonder, whether the globbing code should even attempt to disrespect
.gitignore files, doesn't the calling code always know, whether it
should limit its search to the non-ignored files?
For example, when globbing for the source files of a package in .spago,
we know beforehand, that a nonGitignoringGlob (otherwise known as
`glob` :^D) would suffice.
Copy link
Member

@f-f f-f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍

@fsoikin could you also take a look?

Copy link
Collaborator

@fsoikin fsoikin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic makes sense to me (though it took me some time to wrap my head around it), but I take (a minor) issue with the code structure.

  1. Inconsistency between let and where: since includePatternBases is where-bound, why is allIncludePatternHaveBase then let-bound? Seems random and creates noise.
    • Also: should it be "patterns" instead of "pattern" in that name?
  2. Nested or: there is an or in the defintion of newMatchers and then it's nested under another or in addMatcher. How about making newMatchers an array instead?
    • This would also allow you to define newMatchers with guards instead of case true false, making it much more concise and readable.

@Blugatroff
Copy link
Contributor Author

includePatternBases and allIncludePatternHaveBase were actually both part of the same single huge let. They were just very far apart and in an unintuitive order for no reason.
I think i arranged them better now.

Copy link
Collaborator

@fsoikin fsoikin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Nice fix, thanks!

@f-f
Copy link
Member

f-f commented Oct 30, 2024

Lovely. Thank you both!

@f-f f-f merged commit c4de1e7 into purescript:master Oct 30, 2024
5 checks passed
@Blugatroff Blugatroff deleted the dont-match-glob-with-pattern branch October 30, 2024 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants