Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typecheck spatial module #605

Open
wants to merge 25 commits into
base: development
Choose a base branch
from
Open

Typecheck spatial module #605

wants to merge 25 commits into from

Conversation

mfisher87
Copy link
Member

@mfisher87 mfisher87 commented Sep 10, 2024

No description provided.

Copy link

github-actions bot commented Sep 10, 2024

Binder 👈 Launch a binder notebook on this branch for commit e314990

I will automatically update this comment whenever this PR is modified

Binder 👈 Launch a binder notebook on this branch for commit 7b563b1

Binder 👈 Launch a binder notebook on this branch for commit 1f1cfe5

Binder 👈 Launch a binder notebook on this branch for commit 749fc03

Binder 👈 Launch a binder notebook on this branch for commit a63ae13

Binder 👈 Launch a binder notebook on this branch for commit bfa9562

Binder 👈 Launch a binder notebook on this branch for commit 153c307

Binder 👈 Launch a binder notebook on this branch for commit e0ef967

Binder 👈 Launch a binder notebook on this branch for commit 130160b

Binder 👈 Launch a binder notebook on this branch for commit 830f471

Binder 👈 Launch a binder notebook on this branch for commit 0ca4485

Binder 👈 Launch a binder notebook on this branch for commit b035695

Binder 👈 Launch a binder notebook on this branch for commit cbc761c

Binder 👈 Launch a binder notebook on this branch for commit 52037e9

Binder 👈 Launch a binder notebook on this branch for commit 7c64991

Binder 👈 Launch a binder notebook on this branch for commit 2adc38d

Binder 👈 Launch a binder notebook on this branch for commit 9df3ec7

Binder 👈 Launch a binder notebook on this branch for commit 6f771d8

Binder 👈 Launch a binder notebook on this branch for commit 3f923a6

Binder 👈 Launch a binder notebook on this branch for commit 27de548

Binder 👈 Launch a binder notebook on this branch for commit 191181e

Binder 👈 Launch a binder notebook on this branch for commit 9aa54dc

Binder 👈 Launch a binder notebook on this branch for commit 207be53

Binder 👈 Launch a binder notebook on this branch for commit c89df21

Binder 👈 Launch a binder notebook on this branch for commit b632617

Binder 👈 Launch a binder notebook on this branch for commit 7675584

Binder 👈 Launch a binder notebook on this branch for commit 7bd668c

icepyx/core/spatial.py Outdated Show resolved Hide resolved
icepyx/core/spatial.py Outdated Show resolved Hide resolved
icepyx/core/spatial.py Outdated Show resolved Hide resolved
icepyx/core/spatial.py Outdated Show resolved Hide resolved
Copy link

codecov bot commented Sep 18, 2024

Codecov Report

Attention: Patch coverage is 94.73684% with 6 lines in your changes missing coverage. Please review.

Project coverage is 67.74%. Comparing base (c454aab) to head (7bd668c).
Report is 3 commits behind head on development.

Files with missing lines Patch % Lines
icepyx/core/spatial.py 90.62% 3 Missing and 3 partials ⚠️
Additional details and impacted files
@@               Coverage Diff               @@
##           development     #605      +/-   ##
===============================================
- Coverage        71.81%   67.74%   -4.07%     
===============================================
  Files               38       36       -2     
  Lines             3136     3150      +14     
  Branches           426      430       +4     
===============================================
- Hits              2252     2134     -118     
- Misses             774      932     +158     
+ Partials           110       84      -26     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Base automatically changed from typecheck-temporal to development September 27, 2024 17:07
@JessicaS11
Copy link
Member

@mfisher87 Any idea if this was rebased after merging #604? Trying to determine where to start for fixing merge conflicts and moving this forward.

@mfisher87
Copy link
Member Author

Rebased this PR (and #612 and #613) and fixed merge conflicts!

icepyx/core/query.py Outdated Show resolved Hide resolved

def geodataframe(
extent_type: ExtentType,
spatial_extent: Union[str, list[float]],
Copy link
Member

@JessicaS11 JessicaS11 Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
spatial_extent: Union[str, list[float]],
spatial_extent: Union[str, list[float, tuple[float]], Polygon],

Is this the proper way to do this? The list can be a list of floats or a list of tuples (which contain floats)?

Copy link
Contributor

@trey-stafford trey-stafford Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm, should this also accept a Polygon?

Assuming that's the case, I think this is what we would want:

Suggested change
spatial_extent: Union[str, list[float]],
spatial_extent: Union[str, list[float], list[tuple[float, ...]], Polygon],

This accepts either a string, a list of floats, a list of tuples containing floats, or a Polygon.

Edit: I tried adding this annotation to the code and ran pyright. This change resulted in 23 errors that will need to be resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm tackling this incrementally. Added support for Polygon input with: 1c1c490

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this adds support for the list[tuple[float, float]] case: 52037e9. I also add unit tests for each of these new cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... I'm surprised this resulted in so many errors (I haven't gotten fully up to speed on pyright, so I was just trying to make the typing match what the code already did). It should have already been handling Polygons (now circa line 137) and lists (of either floats or tuples, now circa line 140). I wonder if pyright is choking on the fact that which types are valid for spatial_extent depends on the input for extent_type?

Thanks for adding those tests!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only thing still outstanding for this PR is we still need to resolve the duplicate functionality for handling Polygons (lines 85 + 135) and lists (92+140) that was added in your above linked commits. I think the original ordering got confusing when @mfisher87 moved the if file is True bit to the top, so it transitioned from handling all extent_type==bounding_box and then all extent_type==polygon cases to handling extent_type==polygon if it's from a file, extent_type==bounding_box, and then the rest of the extent_type==polygon cases. I don't think it particularly matters what order we do things in, so perhaps moving the rest of the polygon handling above the bounding_box handling is the clearest thing to do?

I'll start taking a look at this now! I'll be working on getting as much done with icepyx/harmony integration between now and our discussion Thursday as possible. Look forward to chatting then :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think another part of the confusion here is that the extent_type and what's allowed by spatial_extent are closely tied together. Adding typechecking has found some code paths that are possible, but do not technically work.

For example, the check_dateline function:

  • If extent_type is "bounding_box", then spatial_extent must be a list of floats (list[float]).
  • If extent_type is polygon, spatial_extent must be either a list or a tuple list | tuple and assumes that the list/tuple contains pairs of lon/lat.

This function does not support some of the other input types that have been typed on the geodataframe. There we have spatial_extent accepting a string, a list of floats, a list of tuples, or a Polygon. So we need to do some pre-processing to make the input spatial_extent match what called functions expect (like check_dateline) or we need to update those functions to accept the full list of possible extent_types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, here's a commit that refactors the geodataframe function to more clearly show what is and is not supported: c89df21. Note that I removed the "redundant" handling I added in previous commits.

Most of the code assumes that the spatial_extent is a list of floats.

spatial_extent can be a shapely Polygon only if the extent_type is "polygon" and xdateline is passed in as a parameter. If None is given (the default) for xdatetline, the function will fail.

Otherwise spatial_extent must be a string (the file case) or a list of floats. A list of tuples or a list of lists (e.g., [(lon, lat), (lon, lat)] is not supported by this code and will require additional effort to support. I'll see what I can do, but may need to put this down in favor of focusing on Harmony adoption soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JessicaS11 Ok, I added a function that does the conversion to list[float] and track that as a separate variable that gets used where a list of floats is expected. I think things are working as expected now!

Note that I did have to update some unit tests because test data contained lists of int. I converted those to list[float]. Maybe we should support list[float] and list[int]? Currently, the code will fail if a list of ints is passed instead of the list of floats, as expected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think another part of the confusion here is that the extent_type and what's allowed by spatial_extent are closely tied together. Adding typechecking has found some code paths that are possible, but do not technically work.

Makes sense. I think another [major] part of the issues we're running into (which I didn't realize fully until today) was how much of the context of when the functions are actually called isn't being accounted for. As a result, we're trying to type geodataframe according to ways it wasn't intended to be used. Some of the docstrings throughout this module were definitely not-quite-right, but the geodataframe one was almost spot on (with the critical caveats you need to interpret it correctly based on your above statement, which is definitely NOT ideal, and it was missing that we handle shapely.Polygon types too). For instance, if extent_type == 'polygon', then the only options for spatial_extent are string (if it's a filename), list (of int, float, np.int64, if it's a list of coordinates), or a valid Polygon object. The conversion of the list of tuples or list of lists has already happened in validate_polygon_pairs and validate_polygon_list, which are called during the __init__.

Otherwise spatial_extent must be a string (the file case) or a list of floats. A list of tuples or a list of lists (e.g., [(lon, lat), (lon, lat)] is not supported by this code and will require additional effort to support.

Ok, I added a function that does the conversion to list[float] and track that as a separate variable that gets used where a list of floats is expected. I think things are working as expected now!

As I alluded to above, at a glance some of these additions are now duplicating the functionality that's in the validate_... set of functions, which are set up to handle all of the possible user inputs and turn them into one common format (a list of floats, in the case of a polygon input) that is what would actually be passed into geodataframe.

I'll see what I can do, but may need to put this down in favor of focusing on Harmony adoption soon.

Please don't hesitate to do so. This PR has turned into a way bigger effort than anticipated, and has brought up some interesting discussions more broadly (e.g. related to nsidc/earthaccess#804) about what inputs we/earthaccess should/shouldn't accept and/or check for the user. The spatial module was our first effort towards isolating handling the various input types and returning something uniform that icepyx could latch on to for any further spatial formatting needs. It clearly needs some streamlining and updating (and my hope is to move some revised version of it upstream to earthaccess, which I'm starting conversations about there).

icepyx/core/spatial.py Outdated Show resolved Hide resolved
icepyx/core/spatial.py Outdated Show resolved Hide resolved
icepyx/core/spatial.py Outdated Show resolved Hide resolved
icepyx/core/spatial.py Outdated Show resolved Hide resolved
icepyx/core/spatial.py Outdated Show resolved Hide resolved
icepyx/core/spatial.py Outdated Show resolved Hide resolved
@JessicaS11
Copy link
Member

JessicaS11 commented Oct 24, 2024

Rebased this PR (and #612 and #613) and fixed merge conflicts!

Thanks, @mfisher87! This PR is pretty close - a few suggestions where we may need to make type adjustments (would love a second set of eyes on these) EDIT: DONE and some notes for some docstring and error message edits.

@trey-stafford
Copy link
Contributor

@JessicaS11 just a heads-up that I am beginning to review and work on the Harmony integration tasks that @mfisher87 started, beginning with this PR. I'll be looking over things here and will try to respond to your comments this afternoon!

@trey-stafford
Copy link
Contributor

Not sure about the failing codecov/project check but I have added some additional unit tests to cover branches identified by codecov as not being covered by this PR.

Aside from that issue, this should be ready for re-review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants