-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: cloud paths checking against patterns #1094
Conversation
The _check_path_matches_patterns function was failing for cloud (e.g. s3, gc3) URIs. 1) the path.relative_to() function doesn't work for cloud URIs (it just returns the original). 2) combining Path('/') with the result of relative_to now also throws an error when using cloud (or non-posix) paths. This fix drops the uri prefix by using .path to get the posix-like part of the path, so that the regex check works as expected.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1094 +/- ##
==========================================
- Coverage 89.79% 89.78% -0.01%
==========================================
Files 63 64 +1
Lines 7141 7154 +13
Branches 1367 835 -532
==========================================
+ Hits 6412 6423 +11
- Misses 531 534 +3
+ Partials 198 197 -1 ☔ View full report in Codecov by Sentry. |
was just for local use
Could you add tests with examples? That will make it easier to be explicit about what was going wrong and allow us to consider other solutions quickly. |
The check if in a derivatives folder was not working when the root had a uri prefix. This fixes it so the uri prefix is stripped off for the check.
Checks the number of files in an example openneuro s3 dataset. Also adds s3fs as a test dependency.
Yes good idea, I've added one simple test using an s3 uri from openneuro, and also made another minor fix. The test passes locally, but I see now it is failing in the CI as botocore can't find the credentials file.. I'll see if there is a good workaround for this later.. For reference: running the test on the code before this PR gives this error:
|
Okay, I'm starting to get a handle on the issue. Basically, although we can specify things like If |
Ah I see. I wonder if we could use the storage_options dict for this purpose? This is something inherited from fsspec so can be used with universal_pathlib. The api within storage_options is specific to the underlying filesystem (eg anon is there for s3fs but not for others), but I hoping that shouldn't prevent it from being passed along and serialized with the object? |
If you can figure out how to get I've run out of time to poke at this this week and am unlikely to get back to it very soon. Ping @adelavega in case you have ideas on how to tackle this. |
Sounds good will give it a shot, thanks for helping out with this!! If it turns out trickier than expected, wondering if a stopgap solution is just to document that custom options are not currently passed along. Doesn't resolve the anon s3 access issue but does resolve any other use of universal paths where custom options aren't needed. Eg I didn't even know there was an issue with anon s3 access because I had a credentials file set-up already.. anyways can consider that later if/when we get to that point.. |
Created #1098 to track the lesser bug. Let's merge this now. |
The _check_path_matches_patterns function was failing for cloud (e.g. s3, gc3) URIs (added in #1074):
(it just returns the original).
an error when using cloud paths.
So if using universal_pathlib >0.2.3, any cloud URIs as bids datasets would throw an error message.
This fix drops the uri prefix by using .path, so that the regex check works as expected.