When I try to run
python tools/train_net.py --config-file configs/R_50/CTW1500/finetune_96voc_50maxlen.yaml --num-gpus 4
some errors occur. However, it runs correctly if I set --num-gpus 1 or change the settings in the config file configs/R_50/CTW1500/pretrain_96voc_50maxlen.yaml.
The error occurs when I set TRAIN: ("syntext1_96voc","ic13_train_96voc","totaltext_train_96voc"):
[10/11 08:19:10 adet.data.dataset_mapper]: Cropping used in training: RandomCropWithInstance(crop_type='relative_range', crop_size=[0.1, 0.1], crop_instance=False)
[10/11 08:19:11 adet.data.datasets.text]: Loaded 229 images in COCO format from /dataset/ic13/train_96voc.json
[10/11 08:19:46 adet.data.datasets.text]: Loading /dataset/syntext1/annotations/train_96voc.json takes 35.33 seconds.
[10/11 08:19:47 adet.data.datasets.text]: Loaded 94723 images in COCO format from /dataset/syntext1/annotations/train_96voc.json
[10/11 08:24:02 d2.data.build]: Removed 0 images with no usable annotations. 94950 images left.
[10/11 08:24:02 d2.data.build]: Using training sampler TrainingSampler
[10/11 08:24:03 d2.data.common]: Serializing 94950 elements to byte tensors and concatenating them all ...
Traceback (most recent call last):
  File "train_net.py", line 304, in <module>
    launch(
  File "/usr/local/lib/python3.8/dist-packages/detectron2/engine/launch.py", line 67, in launch
    mp.spawn(
  File "/usr/local/lib/python3.8/dist-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/usr/local/lib/python3.8/dist-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/usr/local/lib/python3.8/dist-packages/torch/multiprocessing/spawn.py", line 130, in join
    raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with signal SIGKILL
Try to reduce the number of workers? Maybe it's a memory issue?
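One rough way to check the memory hypothesis is to estimate how much host RAM the dataset records need once every spawned process holds its own copy. The sketch below is only an illustration under assumptions: the function names and the sample records are hypothetical, it assumes each of the `--num-gpus` processes loads the full annotation list independently, and the pickled size is a lower bound, since the parsed JSON held as Python objects usually takes considerably more memory.

```python
import pickle


def serialized_size_bytes(records):
    """Return the total pickled size of all records, in bytes."""
    return sum(
        len(pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL))
        for record in records
    )


def estimated_total_bytes(records, num_processes):
    """Estimate host memory if every process keeps its own copy (assumption)."""
    return serialized_size_bytes(records) * num_processes


# Hypothetical stand-in for the real annotation records.
sample = [
    {"file_name": f"img_{i}.jpg", "annotations": [{"bbox": [0, 0, 10, 10]}]}
    for i in range(1000)
]

per_process = serialized_size_bytes(sample)
total = estimated_total_bytes(sample, num_processes=4)
print(f"per process: {per_process} bytes, 4 processes: {total} bytes")
```

If the estimate for 4 processes approaches the machine's available RAM while the estimate for 1 process does not, that would be consistent with the job running with --num-gpus 1 and being killed with --num-gpus 4.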
I tried reducing the number of workers; however, the same error still occurs.
But interestingly, it works if I set
DATASETS:
  TRAIN: ("totaltext_train_96voc")
  TEST: ("ctw1500_test",)
and I can correctly run
python tools/train_net.py --config-file configs/R_50/pretrain/150k_tt_mlt_13_15.yaml --num-gpus 4
python tools/train_net.py --config-file configs/R_50/TotalText/finetune_150k_tt_mlt_13_15.yaml --num-gpus 4
python tools/train_net.py --config-file configs/R_50/IC15/finetune_150k_tt_mlt_13_15.yaml --num-gpus 4
but it cannot run on the datasets syntext1_96voc and syntext2_96voc.
Thanks for your reply!
The structure tree of my dataset is as follows: