Hello, I am trying to train YOLOX-X on my custom dataset in COCO format. Training worked fine on a small version of my dataset (~3K images) with the default settings in yolox_base.py (only the batch size changed, set to 4 because of limited GPU memory). With the big version of my dataset (~10K images), however, I cannot find any way to avoid getting NaN for every class, starting from the first epoch. I have tried many things:
Decreasing self.basic_lr_per_img by a factor of 10
Decreasing self.basic_lr_per_img by a factor of 100
Decreasing the number of epochs to 100, then to 75 (thinking that the YOLOX cosine warmup learning rate scheduler is set up according to the total number of iterations, and since I now have about 3x more iterations per epoch, decreasing the max epoch could...)
Using multiple GPUs so that I can set batch size = 16, which gives me a number of iterations per epoch very similar to my previous training
None of them worked. I don't know what else to try. Does anyone have any idea?
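For reference, the arithmetic behind the batch-size and learning-rate changes listed above can be sketched as follows. This is only an illustration: it assumes the effective learning rate is basic_lr_per_img multiplied by the total batch size, and it assumes a default of 0.01 / 64 for basic_lr_per_img. Both should be checked against your own copy of yolox_base.py. The dataset sizes are the approximate figures from the post.

```python
import math

# Assumed default; verify against your yolox_base.py.
ASSUMED_BASIC_LR_PER_IMG = 0.01 / 64.0


def iters_per_epoch(num_images: int, batch_size: int) -> int:
    """Iterations per epoch, rounded up.

    A loader that drops the last partial batch would round down instead.
    """
    return math.ceil(num_images / batch_size)


def effective_lr(basic_lr_per_img: float, batch_size: int) -> float:
    """Effective learning rate under the assumed linear scaling rule."""
    return basic_lr_per_img * batch_size


# (label, approximate image count, total batch size) taken from the post.
setups = [
    ("small dataset, batch 4", 3000, 4),
    ("big dataset, batch 4", 10000, 4),
    ("big dataset, batch 16", 10000, 16),
]

for label, num_images, batch_size in setups:
    n_iters = iters_per_epoch(num_images, batch_size)
    lr = effective_lr(ASSUMED_BASIC_LR_PER_IMG, batch_size)
    print(f"{label}: {n_iters} iters/epoch, effective lr {lr:.6f}")
```

Under these assumptions, moving to batch size 16 brings the iteration count per epoch close to the small-dataset run, but it also raises the effective learning rate by 4x compared with batch size 4.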
Apart from that, I have checked my COCO-format labels in the platform I used to convert them, and they all seem fine. But maybe something goes wrong in the YOLOX dataloader. How can I easily visualize my ground truths after loading a batch during YOLOX training?
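As a complement to visual inspection, the COCO annotation file itself can be checked programmatically. The sketch below does not use any YOLOX API and does not draw anything. It only relies on the standard COCO layout (images with id, width, height; annotations with image_id, category_id, and bbox as [x, y, width, height]). The function name find_bad_annotations and the sample data are hypothetical, and whether such annotations are the cause of the NaN values here is not established.

```python
import math


def find_bad_annotations(coco: dict) -> list:
    """Return (annotation id, reason) pairs for suspicious annotations."""
    images = {img["id"]: img for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    problems = []
    for ann in coco["annotations"]:
        ann_id = ann.get("id")
        img = images.get(ann["image_id"])
        if img is None:
            problems.append((ann_id, "unknown image_id"))
            continue
        if ann["category_id"] not in category_ids:
            problems.append((ann_id, "unknown category_id"))
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        if not all(math.isfinite(v) for v in (x, y, w, h)):
            problems.append((ann_id, "non-finite bbox value"))
            continue
        if w <= 0 or h <= 0:
            problems.append((ann_id, "non-positive width or height"))
        if x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            problems.append((ann_id, "bbox outside image bounds"))
    return problems


# Hypothetical in-memory sample; in practice, load your file with json.load.
sample = {
    "images": [{"id": 1, "width": 100, "height": 80}],
    "categories": [{"id": 1, "name": "car"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 10, 20, 20]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [90, 10, 20, 0]},
        {"id": 12, "image_id": 2, "category_id": 1, "bbox": [0, 0, 5, 5]},
    ],
}

for ann_id, reason in find_bad_annotations(sample):
    print(f"annotation {ann_id}: {reason}")
```

An empty result on the real annotation file would suggest that the labels are at least structurally sound, which narrows the search to the training configuration or the evaluation step.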
How many samples are there in your validation dataset? A common way to get NaN values is not having enough samples of each class to perform an evaluation.
For instance, if your validation dataset contains 0 "car" instances, your "car" mAP will always be NaN. Also, if you have too few cars, the validation batch may not contain any (since YOLOX uses a random subset of this dataset for each epoch's evaluation).
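One way to check this is to count the annotated instances per class in the validation annotation file. The sketch below assumes the standard COCO layout (categories with id and name, annotations with category_id). The function name instances_per_class and the sample data are hypothetical.

```python
from collections import Counter


def instances_per_class(coco: dict) -> dict:
    """Map each category name to its number of annotated instances."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(ann["category_id"] for ann in coco["annotations"])
    # Categories that never appear in the annotations get a count of 0.
    return {name: counts.get(cat_id, 0) for cat_id, name in names.items()}


# Hypothetical in-memory sample; in practice, load your validation
# annotation file with json.load.
sample_val = {
    "categories": [{"id": 1, "name": "car"}, {"id": 2, "name": "person"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 2},
        {"id": 2, "image_id": 1, "category_id": 2},
    ],
}

per_class = instances_per_class(sample_val)
empty_classes = [name for name, count in per_class.items() if count == 0]
print("instances per class:", per_class)
print("classes with no validation instances:", empty_classes)
```

Any class listed as having no validation instances cannot receive a meaningful per-class score, which matches the "car" example above.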