Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't fail if pyarrow is not installed. #6

Merged
merged 3 commits into from
Nov 21, 2023

Conversation

thetorpedodog
Copy link
Contributor

Since this doesn't directly depend upon pyarrow, it is possible that some end-user code does import pyarrow_hotfix as a defense mechanism, despite not needing pyarrow_hotfix itself. If this happens, it should not be a failure.


This is a simple proposal to take or leave; it also may be reasonable to add a warning that Arrow isn’t installed.

Since this doesn't directly depend upon pyarrow, it is possible that
some end-user code does `import pyarrow_hotfix` as a defense mechanism,
despite not needing `pyarrow_hotfix` itself. If this happens, it should
not be a failure.
@pitrou
Copy link
Owner

pitrou commented Nov 20, 2023

I don't have a strong opinion on this, but I'm a bit wary of potential mishaps due to silently silencing the error. Since this is a security feature, I'd rather be strict here.

cc @jorisvandenbossche for opinions

@jorisvandenbossche
Copy link
Collaborator

An advantage of this is that it makes it easier to adopt in case you only have an optional dependency on pyarrow; you can just import it without having to care about pyarrow being available or not.

However, I think it is also rather easy to get this as well by just only importing this where you import pyarrow (at least for a library depending on the hotfix package, that should be easy I think).
So for that reason, this might not be needed. @thetorpedodog you are more thinking about end-users rather than library developers?

@thetorpedodog
Copy link
Contributor Author

The case I am thinking of is generally the "second-level" dependency of Arrow on the part of either a library developer or an end developer.

Let's say that I as a library developer have my_library that depends upon other_library that depends upon pyarrow. other_library is rarely maintained and (for whatever reason) has a restrictive Arrow dependency. my_library is targeted at users who are not primarily developers (e.g. scientists). I want to provide library users with a secure-by-default installation, so I put import pyarrow_hotfix in my_library/__init__.py.

If other_library is updated and no longer requires pyarrow, suddenly my_library starts crashing, despite the fact that users are no less secure than before.

Similarly, if I as an end developer depend upon other_library in its insecure state, if other_library drops its Arrow dependency entirely, my program suddenly breaks. While it is easier for me to fix (just remove the import from my main module), it is still strange that a security fix caused my program to stop working even though it doesn't have a direct dependency.

Both of these may be somewhat of corner cases, and honestly I completely understand if you would prefer not to accommodate them directly for other reasons.

Alternately, a really short justification might be "if your library does not depend upon another library, the other library's absence should not cause any problems," but again I understand that this particular library is kind of a special case.

Either way, hopefully this at least makes clear what was going through my head when writing this.

@jorisvandenbossche
Copy link
Collaborator

Personally I am fine with this change if it helps adopting the hotfix package.
I think silencing the potential import error shouldn't be much of a problem, because this only would be a problem if you actually use pyarrow, in which case you do another import of pyarrow which will then raise the ImportError anyway?

@pitrou
Copy link
Owner

pitrou commented Nov 21, 2023

Ok, if this is desirable, can a test be added for it?

import pyarrow as pa
try:
import pyarrow as pa
except ImportError:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ModuleNotFoundError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Changed here and below.

Copy link
Contributor Author

@thetorpedodog thetorpedodog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test added as well.

import pyarrow as pa
try:
import pyarrow as pa
except ImportError:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Changed here and below.

@pitrou
Copy link
Owner

pitrou commented Nov 21, 2023

Ow, sorry: ModuleNotFoundError appeared in Python 3.6. A fallback is needed for 3.5...

@thetorpedodog
Copy link
Contributor Author

Ow, sorry: ModuleNotFoundError appeared in Python 3.6. A fallback is needed for 3.5...

one more round!

@pitrou pitrou merged commit 1e6f9d3 into pitrou:main Nov 21, 2023
24 checks passed
@pitrou
Copy link
Owner

pitrou commented Nov 21, 2023

Thanks @thetorpedodog ! I've released 0.6 with this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants