Unspecified Numpy version causing installation issue #477

Closed
RJ3 opened this issue Aug 23, 2024 · 4 comments

@RJ3

RJ3 commented Aug 23, 2024

Because environment.yml leaves the NumPy version unspecified, a fresh install pulls in NumPy 2.1.0 as of today. This breaks installation of the pinned deepspeed version (0.12.4 in the log below).

Pinning NumPy to version 1.26 appears to get past the installation error, but I have not checked whether it causes other regressions.

- numpy
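To see which NumPy series an environment actually resolved to before the deepspeed build starts, a small check along these lines may help. This is only a sketch: the helper names `major_version`, `is_numpy_1x`, and `installed_numpy_version` are made up for illustration and are not part of this project.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.1.0'."""
    return int(version.split(".")[0])


def is_numpy_1x(version: str) -> bool:
    """True if the version string belongs to the NumPy 1.x series."""
    return major_version(version) < 2


def installed_numpy_version() -> Optional[str]:
    """Return the installed NumPy version, or None if NumPy is absent."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("numpy is not installed in this environment")
    elif is_numpy_1x(found):
        print(f"numpy {found}: 1.x series, matches the workaround")
    else:
        print(f"numpy {found}: 2.x series, the build may fail as in the log")
```

The check reads package metadata instead of importing NumPy, so it does not trigger the import-time warning itself.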

  Using cached deepspeed-0.12.4.tar.gz (1.2 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [34 lines of output]
      
      A module that was compiled using NumPy 1.x cannot be run in
      NumPy 2.1.0 as it may crash. To support both 1.x and 2.x
      versions of NumPy, modules must be compiled with NumPy 2.0.
      Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
      
      If you are a user of the module, the easiest solution will be to
      downgrade to 'numpy<2' or try to upgrade the affected module.
      We expect that some modules will need time to support NumPy 2.
      
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-ukaj6i9k/deepspeed_2c7096c05c1647b692f817716c6fb3f3/setup.py", line 31, in <module>
          import torch
        File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/__init__.py", line 1382, in <module>
          from .functional import *  # noqa: F403
        File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/functional.py", line 7, in <module>
          import torch.nn.functional as F
        File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/nn/__init__.py", line 1, in <module>
          from .modules import *  # noqa: F403
        File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
          from .transformer import TransformerEncoder, TransformerDecoder, \
        File "/home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
          device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      /home/ra29435/micromamba/envs/openfold-pl/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /opt/conda/conda-bld/pytorch_1702400410390/work/torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-ukaj6i9k/deepspeed_2c7096c05c1647b692f817716c6fb3f3/setup.py", line 100, in <module>
          cuda_major_ver, cuda_minor_ver = installed_cuda_version()
        File "/tmp/pip-install-ukaj6i9k/deepspeed_2c7096c05c1647b692f817716c6fb3f3/op_builder/builder.py", line 50, in installed_cuda_version
          raise MissingCUDAException("CUDA_HOME does not exist, unable to compile CUDA op(s)")
      op_builder.builder.MissingCUDAException: CUDA_HOME does not exist, unable to compile CUDA op(s)
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
critical libmamba pip failed to install packages
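The last exception in the log above is a `MissingCUDAException` about `CUDA_HOME`, raised after the NumPy 2.x warning is printed. A rough pre-flight check for the CUDA side could look like the sketch below. It assumes the toolkit can be found through the `CUDA_HOME` environment variable or an `nvcc` on `PATH`. The thread does not show how deepspeed's setup locates CUDA, and `find_cuda_home` is a hypothetical helper, not a deepspeed or torch function.

```python
import os
import shutil
from typing import Mapping, Optional


def find_cuda_home(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Best-effort guess at the CUDA toolkit root.

    Checks CUDA_HOME first, then falls back to the directory two levels
    above an `nvcc` found on PATH (assumption: <root>/bin/nvcc layout).
    """
    cuda_home = env.get("CUDA_HOME")
    if cuda_home and os.path.isdir(cuda_home):
        return cuda_home
    nvcc = shutil.which("nvcc", path=env.get("PATH"))
    if nvcc:
        return os.path.dirname(os.path.dirname(os.path.realpath(nvcc)))
    return None


if __name__ == "__main__":
    root = find_cuda_home()
    if root is None:
        print("No CUDA toolkit found via CUDA_HOME or nvcc on PATH")
    else:
        print(f"CUDA toolkit root candidate: {root}")
```

If this reports nothing on a machine where pinning NumPy alone fixes the install, the lookup assumption here is probably too narrow.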
@ahmedselim2017

I have been facing the same problem too. Did you see any effects of pinning the NumPy version on training or inference?

@vaclavhanzl
Contributor

vaclavhanzl commented Oct 23, 2024

Specifying numpy<2.0.0 works for me in the pl_upgrades branch. It is now bundled with other fixes for recent problems in my PR #496.
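For anyone applying the same pin to a local copy of environment.yml by hand, here is a minimal sketch of the edit as a text substitution. The actual change in PR #496 is not shown in this thread, so `pin_numpy` and the sample file contents are hypothetical.

```python
import re


def pin_numpy(env_text: str, spec: str = "numpy<2.0.0") -> str:
    """Replace a bare '- numpy' dependency line with a pinned spec.

    Lines that already carry a constraint (e.g. '- numpy=1.26') or name a
    different package (e.g. '- numpy-quaternion') are left untouched.
    Indentation is preserved.
    """
    pattern = re.compile(r"^([ \t]*-[ \t]*)numpy[ \t]*$", flags=re.MULTILINE)
    return pattern.sub(lambda m: f"{m.group(1)}{spec}", env_text)


if __name__ == "__main__":
    # Made-up file contents, only to show the substitution.
    example = "dependencies:\n  - python=3.10\n  - numpy\n  - pip\n"
    print(pin_numpy(example))
```

A plain regex is used here instead of a YAML parser so that comments and ordering in the file stay exactly as they were.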

@vaclavhanzl
Contributor

@RJ3, PR #496 is now merged into the pl_upgrades branch, which may fix this (at least in that branch, which is sort of officially the right one for CUDA 12).

@RJ3
Author

RJ3 commented Nov 8, 2024

Thank you @vaclavhanzl, closing this.

RJ3 closed this as completed Nov 8, 2024