I have been using the PySKL framework with the specified conda environment to train PoseC3D and ST-GCN++, and training and inference work fine.
However, when I try the MS-G3D config (configs/msg3d/msg3d_pyskl_ntu60_xsub_hrnet), PyTorch throws an error about an in-place operation in the model structure as soon as training starts.
I have experimented with setting the ReLU activations in MS-G3D to inplace=False, without success. Any help is much appreciated.
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [32, 192, 25, 85]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 5051) of binary: /opt/conda/envs/pyskl/bin/python
Hi, I was facing the same problem today. What worked for me was changing these lines in the msg3d_utils.py file: line 139, line 232, and line 316. Basically, replace every "something1 += something2" with "something1 = something1 + something2".
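For anyone curious why that fix works: `+=` on a tensor mutates it in place and bumps its autograd version counter, so if autograd saved that tensor for the backward pass (ReLU saves its output, which is why the error names ReluBackward0), backward fails with exactly this version-mismatch error. Writing `a = a + b` instead creates a new tensor and leaves the saved one untouched. Below is a minimal standalone reproduction of the failure mode and the fix; it is illustrative only, not PySKL code:

```python
import torch

# In-place version: ReLU's backward saves its output `y`; mutating `y`
# afterwards bumps its version counter and breaks the backward pass.
x = torch.ones(3, requires_grad=True)
y = torch.relu(x)   # autograd saves y to compute ReLU's gradient
z = y * 2
y += 1              # in-place mutation of a saved tensor
try:
    z.sum().backward()
    inplace_failed = False
except RuntimeError:
    # "one of the variables needed for gradient computation has been
    #  modified by an inplace operation ... output 0 of ReluBackward0"
    inplace_failed = True

# Out-of-place version: `y2 = y2 + 1` builds a new tensor, so the copy
# saved for ReLU's backward keeps version 0 and backward succeeds.
x2 = torch.ones(3, requires_grad=True)
y2 = torch.relu(x2)
z2 = y2 * 2
y2 = y2 + 1         # rebinds the name; the saved tensor is untouched
z2.sum().backward()

print(inplace_failed)   # True
print(x2.grad)          # tensor([2., 2., 2.])
```

The same reasoning explains why setting `inplace=False` on the ReLU modules alone did not help: the offending in-place writes are the `+=` residual additions in msg3d_utils.py, not the activations themselves.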