Python2 crashes if tensorboard_logger is imported before torch #25

Open
kukuruza opened this issue Dec 12, 2018 · 3 comments

kukuruza commented Dec 12, 2018

Thanks for the great package; it has been really valuable to me. However, I recently came across a Python crash:

*** Error in `python': malloc(): memory corruption: 0x000000007842e4c0 ***
Aborted (core dumped)

Steps to reproduce. Running the script below causes the crash on the last line (forward pass of the network).

from tensorboard_logger import configure

import torch
from torch.autograd import Variable

mymodel = torch.nn.Sequential(torch.nn.Conv2d(3, 10, kernel_size=3, bias=True))
imgs = Variable(torch.zeros((1,3,64,64), dtype=torch.float32)).cuda()
mymodel.cuda()
mymodel(imgs)

I also found that switching the order of the imports solves the problem. The following works fine.

import torch
from torch.autograd import Variable

from tensorboard_logger import configure

mymodel = torch.nn.Sequential(torch.nn.Conv2d(3, 10, kernel_size=3, bias=True))
imgs = Variable(torch.zeros((1,3,64,64), dtype=torch.float32)).cuda()
mymodel.cuda()
mymodel(imgs)

If I am not using .cuda() in the code, any order works fine.
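One way to compare the two import orders without a crash taking down the current session is to run each variant in a fresh interpreter and look at its exit status. The following is only a minimal sketch: `run_isolated`, `describe`, `ORDER_A` and `ORDER_B` are hypothetical names introduced here, and the two strings stop after the imports, so the rest of the reproduction script above would have to be appended to them. On POSIX systems a negative return code means the child was killed by a signal.

```python
import subprocess
import sys


def run_isolated(source, python=sys.executable):
    """Run `source` in a fresh interpreter; return (returncode, stderr text)."""
    proc = subprocess.Popen(
        [python, "-c", source],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    _, err = proc.communicate()
    return proc.returncode, err.decode("utf-8", "replace")


def describe(returncode):
    """Turn a return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # POSIX: a negative value is the number of the signal that killed the child.
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


# Hypothetical placeholders: only the imports are shown here; the model
# construction and forward pass from the issue would follow in each string.
ORDER_A = "from tensorboard_logger import configure\nimport torch\n"
ORDER_B = "import torch\nfrom tensorboard_logger import configure\n"
```

Calling `describe(run_isolated(ORDER_A)[0])` and the same for `ORDER_B` would then report each variant separately; an abort like the one above would be expected to show up as a signal rather than a normal exit status.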

System:

Ubuntu 14.04.5 LTS
CUDA 8.0, V8.0.61

Packages:

python                    2.7.15               h33da82c_4    conda-forge
pytorch                   0.4.1                py27__9.0.176_7.1.2_2
tensorboard-logger        0.1.0

I installed them with

conda install pytorch torchvision -c pytorch
pip install tensorboard_logger

I assume the import order was tested before, so my only guess is that the conda and pip installs don't work well together and end up loading different versions of some package.
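A rough way to test that guess is to list which shared libraries each import maps into the process and see where they come from. This is a sketch under assumptions, not a diagnosis: it relies on Linux's `/proc/self/maps`, and `loaded_shared_objects` and `newly_loaded` are hypothetical helper names introduced here.

```python
import os


def loaded_shared_objects(maps_path="/proc/self/maps"):
    """Return the set of shared-library paths mapped into this process (Linux only)."""
    paths = set()
    if not os.path.exists(maps_path):
        return paths  # not Linux, or /proc is unavailable
    with open(maps_path) as maps:
        for line in maps:
            fields = line.split(None, 5)
            # The sixth field, when present, is the path of the mapped file.
            if len(fields) == 6:
                path = fields[5].strip()
                if ".so" in os.path.basename(path):
                    paths.add(path)
    return paths


def newly_loaded(module_name):
    """Import `module_name` and return the shared objects that import mapped in."""
    before = loaded_shared_objects()
    __import__(module_name)
    return sorted(loaded_shared_objects() - before)
```

Running `newly_loaded("tensorboard_logger")` in one fresh interpreter and `newly_loaded("torch")` in another, then comparing the paths, might show whether the same library is being picked up from two different locations (for example, one copy inside the conda environment and one from a pip-installed package).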

lopuhin commented Dec 12, 2018

Wow, that's quite a nasty bug! Thanks for reducing it to a small example.

"If I am not using .cuda() in the code, any order works fine."

So the crash does not happen on import, right?

I tried running the first (problematic) version of the script, and it didn't crash for me (using Python 3.6 and the same package versions). I remember having issues with torch import order like this one: pytorch/pytorch#2083, but not memory corruption.

To sum up, I'm not sure I'll be able to help much here, sorry. tensorboard_logger is pure Python and is not supposed to do anything nasty, so I can't explain why this error is happening. It could be an unrelated issue in PyTorch or some other C library that is triggered only under specific conditions. If you can obtain a backtrace from the crash, that might help narrow it down.
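As a first step toward a backtrace, the `faulthandler` module can dump the Python-level stack when the process receives a fatal signal such as SIGABRT. The sketch below is hedged accordingly: `enable_crash_traceback` is a hypothetical helper name, `faulthandler` is part of the standard library only on Python 3.3+, and on Python 2.7 it is assumed that a backport has been installed separately. It shows Python frames only; a native (C-level) backtrace would still need a debugger such as gdb.

```python
import sys


def enable_crash_traceback():
    """Try to enable faulthandler; return True on success, False otherwise."""
    try:
        # Standard library on Python 3.3+; assumed to be a separately
        # installed backport on Python 2.7.
        import faulthandler
    except ImportError:
        return False
    try:
        faulthandler.enable(file=sys.stderr, all_threads=True)
    except (AttributeError, ValueError, OSError, RuntimeError):
        # stderr may be missing or replaced by an object without a real
        # file descriptor (for example under some test runners).
        return False
    return True
```

Calling `enable_crash_traceback()` at the very top of the reproduction script, before any other import, should make the abort print the Python stack of each thread to stderr before the process dies.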

And thank you for the kind words about the library.

kukuruza commented

That's correct, the crash happens on the last line, which is the forward pass of the network. (I edited the issue to make this clearer for others.)

I mainly wanted to document this as an issue so that anyone who comes across similar behavior has one thing to try.

If I figure out the combination of factors that causes the problem, I'll comment here.

lopuhin commented Dec 13, 2018

Thanks, @kukuruza. Let's keep it open so that it's more visible in case someone else runs into this problem.

lopuhin reopened this Dec 13, 2018