Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editable installs can set __package__ wrong for an extension submodule of a package #893

Closed
godlygeek opened this issue Sep 9, 2024 · 16 comments

Comments

@godlygeek
Copy link

godlygeek commented Sep 9, 2024

Editable installs can set __package__ to the wrong value for an extension submodule within a package.

Steps to reproduce:

git clone https://github.com/godlygeek/scikit-build-core-bug-reproducer
cd scikit-build-core-bug-reproducer
python3 -m venv venv
source venv/bin/activate
pip install .
python -c "import demo"  # suceeds
pip install -e .
python -c "import demo"  # fails

The demo package's __init__.py contains an assertion that the extension submodule has __package__ set correctly (to demo, the name of the package containing the extension module). When you do a regular pip install, it does, but when you do an editable pip install the assertion fails because __package__ has been incorrectly set to demo._demo instead of to demo.

The problem goes away if you edit the _demo_editable.py in the virtualenv and change:

        if fullname in self.known_wheel_files:

to

        if fullname in self.known_wheel_files:
            submodule_search_locations = None

though I assume that change would break other things. The problem is definitely caused by the submodule_search_locations being passed to importlib.util.spec_from_file_location though, and it seems to be triggered by the fact that the module's path demo._demo matches the name of a folder in the repo demo/_demo. Note that this is a common naming convention for extension modules though, including in CPython itself (e.g. Modules/_decimal/_decimal.c in CPython).

This was found in a real world codebase, in bloomberg/memray#673

@godlygeek
Copy link
Author

CC @pablogsal who originally noticed this issue.

@henryiii
Copy link
Collaborator

henryiii commented Sep 10, 2024

As a quick fix for this very specific case, you could filter the source files from the wheel (wheel.exclude). Is demo._demo the extension module with an incorrect __package__, or is it the folder instead of the extension module? Edit: the former.

@henryiii
Copy link
Collaborator

henryiii commented Sep 10, 2024

FYI, your example run code can be written as a one-liner:

$ nox -s downstream -- https://github.com/godlygeek/scikit-build-core-bug-reproducer --editable -c "import demo"
nox > Running session downstream
nox > Creating virtual environment (uv) using python in .nox/downstream
nox > uv pip install build hatch-vcs hatchling
nox > uv pip install . --no-build-isolation
nox > git clone https://github.com/godlygeek/scikit-build-core-bug-reproducer .nox/downstream/tmp/https:__github.com_godlygeek_scikit-build-core-bug-reproducer --recurse-submodules
Cloning into '.nox/downstream/tmp/https:__github.com_godlygeek_scikit-build-core-bug-reproducer'...
remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (10/10), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 10 (delta 0), reused 10 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (10/10), done.
nox > cd .nox/downstream/tmp/https:__github.com_godlygeek_scikit-build-core-bug-reproducer
nox > uv pip install -e.
nox > python -c 'import demo'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/henryschreiner/git/scikit-build/scikit-build-core/.nox/downstream/tmp/https:__github.com_godlygeek_scikit-build-core-bug-reproducer/src/demo/__init__.py", line 3, in <module>
    assert _demo.__package__ == "demo"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
nox > Command python -c 'import demo' failed with exit code 1
nox > Session downstream failed.

This also uses the current source for scikit-build-core so it's easy to iterate on a solution. :)

I'll try to look into this tomorrow.

@henryiii
Copy link
Collaborator

$ .nox/downstream/bin/python
>>> import sys
>>> import pprint
>>> sk = sys.meta_path[2]
>>> pprint.pprint(sk.__dict__)
{'build_options': [],
 'dir': '/Users/henryschreiner/git/scikit-build/scikit-build-core/.nox/downstream/lib/python3.12/site-packages',
 'install_dir': '/Users/henryschreiner/git/scikit-build/scikit-build-core/.nox/downstream/lib/python3.12/site-packages/',
 'install_options': [],
 'known_source_files': {'demo': '/Users/henryschreiner/git/scikit-build/scikit-build-core/.nox/downstream/tmp/https:__github.com_godlygeek_scikit-build-core-bug-reproducer/src/demo/__init__.py',
                        'demo._demo._demo': '/Users/henryschreiner/git/scikit-build/scikit-build-core/.nox/downstream/tmp/https:__github.com_godlygeek_scikit-build-core-bug-reproducer/src/demo/_demo/_demo.c'},
 'known_wheel_files': {'demo._demo': 'demo/_demo.cpython-312-darwin.so'},
 'path': None,
 'pkgs': ['demo'],
 'rebuild_flag': False,
 'submodule_search_locations': {'demo': {'/Users/henryschreiner/git/scikit-build/scikit-build-core/.nox/downstream/lib/python3.12/site-packages/demo',
                                         '/Users/henryschreiner/git/scikit-build/scikit-build-core/.nox/downstream/tmp/https:__github.com_godlygeek_scikit-build-core-bug-reproducer/src/demo'},
                                'demo._demo': {'/Users/henryschreiner/git/scikit-build/scikit-build-core/.nox/downstream/tmp/https:__github.com_godlygeek_scikit-build-core-bug-reproducer/src/demo/_demo'}},
 'verbose': True}

It's adding demo/_demo since it thinks you might need to access demo/_demo/_demo.c. I think we only need to add submodule_search_locations if this is an __init__.py file, so I think this patch fixes it:

--- a/src/scikit_build_core/resources/_editable_redirect.py
+++ b/src/scikit_build_core/resources/_editable_redirect.py
@@ -94,14 +95,18 @@ class ScikitBuildRedirectingFinder(importlib.abc.MetaPathFinder):
             return importlib.util.spec_from_file_location(
                 fullname,
                 os.path.join(self.dir, redir),
-                submodule_search_locations=submodule_search_locations,
+                submodule_search_locations=submodule_search_locations
+                if redir.endswith("__init__.py")
+                else None,
             )
         if fullname in self.known_source_files:
             redir = self.known_source_files[fullname]
             return importlib.util.spec_from_file_location(
                 fullname,
                 redir,
-                submodule_search_locations=submodule_search_locations,
+                submodule_search_locations=submodule_search_locations
+                if redir.endswith("__init__.py")
+                else None,
             )
         return None

@LecrisUT and/or @vyasr, does that sound right?

@LecrisUT
Copy link
Collaborator

It's adding demo/_demo since it thinks you might need to access demo/_demo/_demo.c. I think we only need to add submodule_search_locations if this is an __init__.py file, so I think this patch fixes it:

I think it makes sense, mesonpy does it similar with loader.is_package.

But isn't it suspicious that known_source_files is populated by demo._demo._demo? Maybe we should explicitly filter for only .py, .pyi files in there since .c, .pyx would be used in the compilation during the build?

@henryiii
Copy link
Collaborator

henryiii commented Sep 10, 2024

You can load via importlib.resources, though? That would the only reason to keep files like this in the wheel in the first place.

@henryiii
Copy link
Collaborator

Though in that case it should contain the extension.

@henryiii
Copy link
Collaborator

henryiii commented Sep 10, 2024

And I don't think we need to handle pyi, since that's not a runtime thing. pyc is possible, though quite unlikely.

@LecrisUT
Copy link
Collaborator

You can load via importlib.resources, though? That would the only reason to keep files like this in the wheel in the first place.

🤔 but wouldn't it be lying about what the source and python module are represented there since the file is in demo._demo._demo.c but it is compiled to demo._demo module

Though in that case it should contain the extension.

Right but then there would be a huge ambiguity when trying to resolve it importlib.resources.files("demo._demo._demo.c"). They would still be able to navigate as a regular resource file, i.e. files("demo").joinpath("_demo/_demo.c").

There is also the functionality of queering the source-code which could be related, but then you would have again more ambiguity with cython, would it be .c code or .pyx. Probably best not to make that linkage.

@godlygeek
Copy link
Author

Is it possible to recognize that the module being loaded is an extension module and set submodule_search_locations to None in that case? Extension modules can't have submodules, AFAIU

@henryiii
Copy link
Collaborator

That's what this does as well, since submodules won't be named __init__.py.

henryiii added a commit that referenced this issue Sep 10, 2024
Proposed fix for
#893.

- **fix: nox + uv downstream improvements**
- **fix: editable subpackage issue**

---------

Signed-off-by: Henry Schreiner <[email protected]>
@godlygeek
Copy link
Author

As a quick fix for this very specific case, you could filter the source files from the wheel (wheel.exclude)

For what it's worth, I've tested and this doesn't seem to be the case. I see the same behavior even if I add:

wheel.exclude = ["demo/_demo/**"]

I can confirm with a python -m build that the wheel indeed no longer contains the demo/_demo directory, but it's still there in the source tree, and that seems to still be enough to break the editable install.

@henryiii
Copy link
Collaborator

Ahh, yes, editable installs ignore the ignore lists. The idea was that the files will still physically be there, didn't realize that removing these entries would help. Might be worth implementing. 0.10.6 should fix your initial problem directly, though.

@henryiii
Copy link
Collaborator

Can you verify 0.10.6 fixed this?

@godlygeek
Copy link
Author

Yes, I confirm that 0.10.6 fixes things for Memray. Thanks for the quick turnaround!

@vyasr
Copy link
Contributor

vyasr commented Sep 23, 2024

Apologies for the delayed response, I was out on vacation. Yes, the proposed fix looks correct to me, and I agree that we can ignore pyi files since they aren't relevant for importing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants