-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Typecheck spatial module #605
base: development
Are you sure you want to change the base?
Conversation
👈 Launch a binder notebook on this branch for commit e314990 I will automatically update this comment whenever this PR is modified 👈 Launch a binder notebook on this branch for commit 7b563b1 👈 Launch a binder notebook on this branch for commit 1f1cfe5 👈 Launch a binder notebook on this branch for commit 749fc03 👈 Launch a binder notebook on this branch for commit a63ae13 👈 Launch a binder notebook on this branch for commit bfa9562 👈 Launch a binder notebook on this branch for commit 153c307 👈 Launch a binder notebook on this branch for commit e0ef967 👈 Launch a binder notebook on this branch for commit 130160b 👈 Launch a binder notebook on this branch for commit 830f471 👈 Launch a binder notebook on this branch for commit 0ca4485 👈 Launch a binder notebook on this branch for commit b035695 👈 Launch a binder notebook on this branch for commit cbc761c 👈 Launch a binder notebook on this branch for commit 52037e9 👈 Launch a binder notebook on this branch for commit 7c64991 👈 Launch a binder notebook on this branch for commit 2adc38d 👈 Launch a binder notebook on this branch for commit 9df3ec7 👈 Launch a binder notebook on this branch for commit 6f771d8 👈 Launch a binder notebook on this branch for commit 3f923a6 👈 Launch a binder notebook on this branch for commit 27de548 👈 Launch a binder notebook on this branch for commit 191181e 👈 Launch a binder notebook on this branch for commit 9aa54dc 👈 Launch a binder notebook on this branch for commit 207be53 👈 Launch a binder notebook on this branch for commit c89df21 👈 Launch a binder notebook on this branch for commit b632617 👈 Launch a binder notebook on this branch for commit 7675584 👈 Launch a binder notebook on this branch for commit 7bd668c |
e314990
to
7b563b1
Compare
7b563b1
to
1f1cfe5
Compare
1f1cfe5
to
749fc03
Compare
e805f01
to
7b57037
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## development #605 +/- ##
===============================================
- Coverage 71.81% 67.74% -4.07%
===============================================
Files 38 36 -2
Lines 3136 3150 +14
Branches 426 430 +4
===============================================
- Hits 2252 2134 -118
- Misses 774 932 +158
+ Partials 110 84 -26 ☔ View full report in Codecov by Sentry. |
@mfisher87 Any idea if this was rebased after merging #604? Trying to determine where to start for fixing merge conflicts and moving this forward. |
a63ae13
to
bfa9562
Compare
icepyx/core/spatial.py
Outdated
|
||
def geodataframe( | ||
extent_type: ExtentType, | ||
spatial_extent: Union[str, list[float]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spatial_extent: Union[str, list[float]], | |
spatial_extent: Union[str, list[float, tuple[float]], Polygon], |
Is this the proper way to do this? The list can be a list of floats or a list of tuples (which contain floats)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm, should this also accept a Polygon
?
Assuming that's the case, I think this is what we would want:
spatial_extent: Union[str, list[float]], | |
spatial_extent: Union[str, list[float], list[tuple[float, ...]], Polygon], |
This accepts either a string, a list of floats, a list of tuples containing floats, or a Polygon.
Edit: I tried adding this annotation to the code and ran pyright
. This change resulted in 23 errors that will need to be resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm tackling this incrementally. Added support for Polygon
input with: 1c1c490
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this adds support for the list[tuple[float, float]]
case: 52037e9. I also add unit tests for each of these new cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... I'm surprised this resulted in so many errors (I haven't gotten fully up to speed on pyright, so I was just trying to make the typing match what the code already did). It should have already been handling Polygons (now circa line 137) and lists (of either floats or tuples, now circa line 140). I wonder if pyright is choking on the fact that which types are valid for spatial_extent
depends on the input for extent_type
?
Thanks for adding those tests!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only thing still outstanding for this PR is we still need to resolve the duplicate functionality for handling Polygons (lines 85 + 135) and lists (92+140) that was added in your above linked commits. I think the original ordering got confusing when @mfisher87 moved the if file is True bit to the top, so it transitioned from handling all extent_type==bounding_box and then all extent_type==polygon cases to handling extent_type==polygon if it's from a file, extent_type==bounding_box, and then the rest of the extent_type==polygon cases. I don't think it particularly matters what order we do things in, so perhaps moving the rest of the polygon handling above the bounding_box handling is the clearest thing to do?
I'll start taking a look at this now! I'll be working on getting as much done with icepyx/harmony integration between now and our discussion Thursday as possible. Look forward to chatting then :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think another part of the confusion here is that the extent_type
and what's allowed by spatial_extent
are closely tied together. Adding typechecking has found some code paths that are possible, but do not technically work.
For example, the check_dateline
function:
- If
extent_type
is"bounding_box"
, thenspatial_extent
must be a list of floats (list[float]
). - If
extent_type
ispolygon
,spatial_extent
must be either a list or a tuplelist | tuple
and assumes that the list/tuple contains pairs of lon/lat.
This function does not support some of the other input types that have been typed on the geodataframe
. There we have spatial_extent
accepting a string, a list of floats, a list of tuples, or a Polygon
. So we need to do some pre-processing to make the input spatial_extent
match what called functions expect (like check_dateline
) or we need to update those functions to accept the full list of possible extent_type
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, here's a commit that refactors the geodataframe
function to more clearly show what is and is not supported: c89df21. Note that I removed the "redundant" handling I added in previous commits.
Most of the code assumes that the spatial_extent
is a list of floats.
spatial_extent
can be a shapely Polygon
only if the extent_type
is "polygon"
and xdateline
is passed in as a parameter. If None
is given (the default) for xdatetline
, the function will fail.
Otherwise spatial_extent
must be a string (the file case) or a list of floats. A list of tuples or a list of lists (e.g., [(lon, lat), (lon, lat)]
is not supported by this code and will require additional effort to support. I'll see what I can do, but may need to put this down in favor of focusing on Harmony adoption soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JessicaS11 Ok, I added a function that does the conversion to list[float]
and track that as a separate variable that gets used where a list of floats is expected. I think things are working as expected now!
Note that I did have to update some unit tests because test data contained lists of int
. I converted those to list[float]
. Maybe we should support list[float]
and list[int]
? Currently, the code will fail if a list of ints is passed instead of the list of floats, as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think another part of the confusion here is that the extent_type and what's allowed by spatial_extent are closely tied together. Adding typechecking has found some code paths that are possible, but do not technically work.
Makes sense. I think another [major] part of the issues we're running into (which I didn't realize fully until today) was how much of the context of when the functions are actually called isn't being accounted for. As a result, we're trying to type geodataframe
according to ways it wasn't intended to be used. Some of the docstrings throughout this module were definitely not-quite-right, but the geodataframe
one was almost spot on (with the critical caveats you need to interpret it correctly based on your above statement, which is definitely NOT ideal, and it was missing that we handle shapely.Polygon
types too). For instance, if extent_type == 'polygon'
, then the only options for spatial_extent
are string (if it's a filename), list (of int, float, np.int64, if it's a list of coordinates), or a valid Polygon object. The conversion of the list of tuples or list of lists has already happened in validate_polygon_pairs
and validate_polygon_list
, which are called during the __init__
.
Otherwise spatial_extent must be a string (the file case) or a list of floats. A list of tuples or a list of lists (e.g., [(lon, lat), (lon, lat)] is not supported by this code and will require additional effort to support.
Ok, I added a function that does the conversion to list[float] and track that as a separate variable that gets used where a list of floats is expected. I think things are working as expected now!
As I alluded to above, at a glance some of these additions are now duplicating the functionality that's in the validate_...
set of functions, which are set up to handle all of the possible user inputs and turn them into one common format (a list of floats, in the case of a polygon input) that is what would actually be passed into geodataframe
.
I'll see what I can do, but may need to put this down in favor of focusing on Harmony adoption soon.
Please don't hesitate to do so. This PR has turned into a way bigger effort than anticipated, and has brought up some interesting discussions more broadly (e.g. related to nsidc/earthaccess#804) about what inputs we/earthaccess should/shouldn't accept and/or check for the user. The spatial module was our first effort towards isolating handling the various input types and returning something uniform that icepyx could latch on to for any further spatial formatting needs. It clearly needs some streamlining and updating (and my hope is to move some revised version of it upstream to earthaccess, which I'm starting conversations about there).
Thanks, @mfisher87! This PR is pretty close - a few suggestions where we may need to make type adjustments (would love a second set of eyes on these) EDIT: DONE and some notes for some docstring and error message edits. |
@JessicaS11 just a heads-up that I am beginning to review and work on the Harmony integration tasks that @mfisher87 started, beginning with this PR. I'll be looking over things here and will try to respond to your comments this afternoon! |
for more information, see https://pre-commit.ci
Also added tests for the Polygon and `list[tuple[float, float]]` case.
Trying to resolve code coverage complaints...
for more information, see https://pre-commit.ci
Not sure about the failing Aside from that issue, this should be ready for re-review! |
Co-authored-by: Jessica Scheick <[email protected]>
No description provided.