Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Confusing error when submitting list instead of tuple for temporal parameter #804

Open
1 task done
mfisher87 opened this issue Sep 14, 2024 · 3 comments
Open
1 task done
Labels
enhancement New feature or request

Comments

@mfisher87
Copy link
Collaborator

mfisher87 commented Sep 14, 2024

Is this issue already tracked somewhere, or is this a new report?

  • I've reviewed existing issues and couldn't find a duplicate for this problem.

Current Behavior

The error is confusing because it comes from python-cmr after earthaccess has done stuff with the parameters and passed them on. This message tells me I should pass a string instead of a list when it should be telling me to pass a tuple instead of a list.

>>> results = earthaccess.search_data(short_name="GLDAS_NOAH025_3H", temporal=["2020-01-01", "2020-01-07"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/robatt/Projects/nsidc/earthaccess/earthaccess/api.py", line 125, in search_data
    query = DataGranules().parameters(**kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/robatt/Projects/nsidc/earthaccess/earthaccess/search.py", line 559, in parameters
    methods[key](val)
  File "/home/robatt/Projects/nsidc/earthaccess/earthaccess/search.py", line 844, in temporal
    return super().temporal(date_from, date_to, exclude_boundary)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/robatt/.local/share/miniforge3/envs/earthaccess/lib/python3.12/site-packages/cmr/queries.py", line 447, in temporal
    date_from, date_to = self._format_date(date_from, date_to)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/robatt/.local/share/miniforge3/envs/earthaccess/lib/python3.12/site-packages/cmr/queries.py", line 387, in _format_date
    date_from = convert_to_string(date_from, datetime(1, 1, 1, 0, 0, 0, tzinfo=timezone.utc))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/robatt/.local/share/miniforge3/envs/earthaccess/lib/python3.12/site-packages/cmr/queries.py", line 374, in convert_to_string
    raise TypeError(
TypeError: Date must be a date object or ISO 8601 string, not list.

Expected Behavior

Error should reflect that the container is the wrong type, e.g. Temporal must be a tuple of date objects or ISO8601 strings, or earthaccess should handle the input in a way that gives python-cmr the correct container when the user makes a trivial mistake like this.

Steps To Reproduce

earthaccess.search_data(short_name="GLDAS_NOAH025_3H", temporal=["2020-01-01", "2020-01-07"])

Environment

- OS: Pop!_OS 22
- Python: 3.12

Additional Context

No response

@mfisher87 mfisher87 changed the title Confusing error when submitting list instead of tuple for temporal Confusing error when submitting list instead of tuple for temporal parameter Sep 14, 2024
@itcarroll
Copy link
Collaborator

Our implementation of the temporal filter does nothing other than override typing and docstring of the python_cmr method.

Is the best resolution for this issue to add a type check on the argument before calling super()?

@override
def temporal(
self,
date_from: Optional[Union[str, dt.date, dt.datetime]] = None,
date_to: Optional[Union[str, dt.date, dt.datetime]] = None,
exclude_boundary: bool = False,
) -> Self:
"""Filter by an open or closed date range. Dates can be provided as date objects
or ISO 8601 strings. Multiple ranges can be provided by successive method calls.
???+ Tip
Giving either `datetime.date(YYYY, MM, DD)` or `"YYYY-MM-DD"` as the `date_to`
parameter includes that entire day (i.e. the time is set to `23:59:59`).
Using `datetime.datetime(YYYY, MM, DD)` is different, because `datetime.datetime`
objects have `00:00:00` as their built-in default.
Parameters:
date_from: start of temporal range
date_to: end of temporal range
exclude_boundary: whether to exclude the date_from and date_to in the matched range
Returns:
self
Raises:
ValueError: `date_from` or `date_to` is a non-`None` value that is
neither a datetime object nor a string that can be parsed as a datetime
object; or `date_from` and `date_to` are both datetime objects (or
parsable as such) and `date_from` is after `date_to`.
"""
return super().temporal(date_from, date_to, exclude_boundary)

Good to bear in mind #488, and (I don't see an issue for this) that we could easily support multiple temporal ranges because python_cmr already supports it.

@itcarroll itcarroll added the enhancement New feature or request label Sep 17, 2024
@mfisher87
Copy link
Collaborator Author

mfisher87 commented Sep 17, 2024

Our implementation of the temporal filter does nothing other than override typing and docstring of the python_cmr method.

We also splat the arg's container IFF the container is a tuple:

# call the method
if isinstance(val, tuple):
methods[key](*val)
else:
methods[key](val)

Is the best resolution for this issue to add a type check on the argument before calling super()?

Yeah, I think so! At least maybe for now.

But also, I feel like the "splat the arg only if it's a tuple" behavior is a bit confusing, or at the wrong level of abstraction. I feel like we should never mess with the args at the parameters method, instead passing them whole to the handler method so that when we ask "what are we doing with the temporal argument before handing it off to python-cmr?" we can find all of the things being done to it in one place!

If we add this handling to the temporal method, we're just adding compensation for the behavior at the parameters method by checking if we've received a Union[str, dt.date, dt.datetime], in which case the tuple-breaking code worked. I'd rather be able to check in the temporal method if we got a sequence of two things and break those up, and if not provide a better error message than python-cmr, since our API is different than python-cmr's. Maybe "Received a {Type}, but expected a sequence of two items". Perhaps it would be better to do this refactoring in the course of addressing this issue. 🤔

What do you think, Ian?

@JessicaS11
Copy link
Collaborator

Just picking up on this thread as I explore how we could replace icepyx.Query with earthaccess.search (icesat2py/icepyx#575). One of the things I'm finding (here temporal is the relevant input, but it's also the case for spatial) is that icepyx is a lot more flexible and forgiving (and broad) in terms of the inputs it accepts. For instance, users can input not only dates but times (if they want), and icepyx will either use defaults or pass those days/times through all functions. To do this, we essentially have a temporal module that takes the user input (tuple or list, of strings, date, or datetime objects) with optional start and end time kwargs (if they're not already attached to an input datetime object) and makes sure the user inputs can all be turned into valid datetime objects. Then we attach these internally-defined temporal objects as private variables to our query object, and use that temporal module to manipulate our "validated" datetimes for whereever we need to pass it next (like CMR).

The motivation for this (and especially for the analogous spatial version; the temporal module is much simpler!) was that we could turn multiple types of user input into a single "desired" (and validated) object, and then depending on what the user asks for later we have a "known" version of it we can easily manipulate (e.g. to pass in whatever format the API we're calling needs - like spatial.fmt_for_CMR).

I bring this up here because I've been grappling with some questions that I think are relevant to this conversation (I'll limit it to temporal ones for thread relevance):

  • Is it better to just require everyone to turn their input into a tuple, or accept both tuples and lists?
  • within those tuples/lists, what input types are acceptable? Where does it transition from the user being required to turn their YYYY-DOY string into a valid datetime object vs earthaccess providing this as a courtesy behind-the-scenes conversion?
  • should earthaccess also deal explicitly with start or end times? Or only if the user submits a datetime object (does earthaccess then even use these times)?
  • would there be a benefit to transitioning something like icepyx.temporal to earthaccess, or is icepyx too nice and we need to steer our users a bit more strongly towards a particular input format?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: 🆕 New
Development

No branches or pull requests

3 participants