Illegal instruction (SIGILL) (core dumped) when importing netket #670
-
Hello, I use the Linux Mint distribution with the following parameters: System: Machine: CPU: Graphics: I had exactly the same problem with Ubuntu 20.10 (Groovy Gorilla), so I tried Mint instead, but the problem remains. Thank you for your advice.
Replies: 25 comments
-
Can you give more details? How did you install netket? What is the full stack trace?
-
I installed netket with the command pip install --pre netket (according to the website).
-
Sorry I don't understand...
-
When I import jax, I get a crash.
-
And the same goes when I import mpi4jax.
-
Yeah, well, mpi4jax imports jax internally, so if importing jax crashes, importing mpi4jax will crash too. I have no idea what might be wrong. I suggest you open an issue on the google/jax repository.
-
Maybe you can try installing it without conda? conda can be problematic...
-
Ah wait, maybe I know what it is.
-
I didn't use conda, I installed everything with pip. Oh, ok, I will try to downgrade it, thanks.
-
Try jaxlib 0.1.61 (0.1.62 requires AVX, and I think your processor does not support it).
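For anyone who wants to confirm the AVX guess before downgrading, here is a minimal sketch that reads the CPU feature flags on Linux. It assumes the kernel exposes them in /proc/cpuinfo on a "flags" line; the helper names cpu_flags and has_avx are made up for this example and are not part of jax or netket.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 kernels label the line "flags"; some ARM kernels use "Features".
        if key.strip().lower() in {"flags", "features"}:
            flags.update(value.split())
    return flags


def has_avx(cpuinfo_text):
    """True if the exact flag 'avx' is listed (not just 'avx2' etc.)."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("AVX supported:", has_avx(cpuinfo.read_text()))
    else:
        print("No /proc/cpuinfo here; this sketch only covers Linux.")
```

If this reports that AVX is not supported, a prebuilt jaxlib that was compiled with AVX enabled would be expected to die with an illegal instruction on import, which matches the crash reported here.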
-
I had version 0.1.64, so I uninstalled it and installed 0.1.61 instead.
-
Decrease the jax version one release at a time until you find one that works.
-
Yes, I'm now sure that's it. I have to say that jax could at least fail gracefully and say what the problem is instead of just giving cryptic information... jax 0.2.11 requires jaxlib 0.1.62, so you probably must go back to jax 0.2.10.
-
Alternatively, you can build jaxlib yourself with AVX disabled. You must pass the flags specified in the changelog of version 0.1.62 I linked above. (By the way, sorry if my first messages sounded a bit cold (they weren't supposed to be), but I was trying to answer quickly from my phone, and I can be a bit direct sometimes.)
-
Ok, thank you very much, I will try it, hope it will work =) Don't worry about that, I appreciate your prompt reaction.
-
Let me know how it goes. That way, if anybody stumbles on this again, they'll at least find some decent documentation on it, since jax doesn't provide it.
-
Ok, now I have jax 0.2.10 and jaxlib 0.1.61. When I run the program, the terminal prints: 2021-04-29 16:11:46.739706: W external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory Is there some problem with tensorflow? I have version 2.4.1 installed.
-
How did you install jaxlib? Did you do pip install jaxlib==0.1.61 ? Or did you install it from the Google URL to get NVIDIA GPU support? It seems to me that jaxlib is trying to load CUDA, but you have no CUDA installed (according to your original post, you don't have an NVIDIA GPU in your computer). The reason the message mentions tensorflow is that jaxlib is basically a part of tensorflow. -- About the error, ah.
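To check the "no CUDA installed" guess directly, here is a minimal sketch that asks the dynamic linker for the same library named in the warning. It only tests whether the shared object can be opened, which is roughly what the failing loader does; the helper name can_load is made up for this example.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic linker can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # Library name copied from the warning quoted in this thread.
    name = "libcudart.so.11.0"
    print(name, "loadable:", can_load(name))
```

If this reports that the library is not loadable, the warning is what you would expect from a CUDA build of jaxlib on a machine without the CUDA runtime, and the CPU-only wheel is probably the one to install.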
-
I installed jaxlib with this command: pip install -U jax jaxlib==0.1.61+cuda111 -f https://storage.googleapis.com/jax-releases/jax_releases.html.
-
Do you actually have a cuda gpu?
-
It seems I don't have a cuda gpu.
-
So just install jaxlib with pip install jaxlib==x.y.z
-
Ok, so now I have installed jaxlib according to your last advice, I still get this error:
-
It should not be a problem for most things. I fixed it for you; install with pip install -U 'git+https://github.com/netket/netket.git@compat' Eventually the fix will end up in the official version, but for a while it will only be on that branch.
-
Ok, it seems it works. Thank you very much for your help!
Yes, I'm now sure that's it.
If you read the jax changelog entry for jaxlib 0.1.62, you will see that they enabled AVX by default. However, your CPU does not have AVX, only SSE4.2, so you get an illegal instruction.
I have to say that jax could at least fail gracefully and say what the problem is instead of just giving cryptic information...
jax 0.2.11 requires jaxlib 0.1.62, so you probably must go back to jax 0.2.10.
Mpi4jax requires jaxlib 0.1.62 too, so you will have to uninstall mpi4jax as well (mpi4jax is basically broken for all jaxlib versions between 0.1.58 and 0.1.61, inclusive; see mpi4jax#46).
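As a rough way to see which side of the AVX change an installation is on, here is a sketch that reads the installed versions without importing jax (importing is what crashes). The 0.1.62 threshold is taken from this thread and applies to prebuilt wheels only; the helper names are made up for this example, and the version parsing is deliberately simplistic.

```python
import re
from importlib import metadata

# Assumption from this thread: prebuilt jaxlib wheels need AVX from 0.1.62 on.
FIRST_AVX_ONLY_JAXLIB = (0, 1, 62)


def release_tuple(version):
    """Turn '0.1.61' or '0.1.61+cuda111' into (0, 1, 61)."""
    public = version.split("+", 1)[0]  # drop a local tag such as +cuda111
    return tuple(int(part) for part in re.findall(r"\d+", public)[:3])


def wheel_needs_avx(jaxlib_version):
    """True if this jaxlib release is at or past the assumed AVX threshold."""
    return release_tuple(jaxlib_version) >= FIRST_AVX_ONLY_JAXLIB


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("jax", "jaxlib", "mpi4jax"):
        print(name, installed_version(name))
    jaxlib_version = installed_version("jaxlib")
    if jaxlib_version is not None and wheel_needs_avx(jaxlib_version):
        print("This jaxlib release is expected to need AVX if it is a prebuilt wheel.")
```

A jaxlib built from source with AVX disabled would report the same version number, so treat the result as a hint rather than proof.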