
RuntimeError: CUDA out of memory // Requirements on graphics card? #25

Open
HartmannSa opened this issue Nov 26, 2020 · 10 comments

@HartmannSa

Hi,

while executing
python -m cosypose.scripts.run_cosypose_eval --config tless-siso
I receive the following error message:

RuntimeError: CUDA out of memory. Tried to allocate 1.35 GiB (GPU 0; 5.93 GiB total capacity; 1.47 GiB already allocated; 866.50 MiB free; 36.31 MiB cached)

According to my internet research, reducing the batch size is recommended. However, I don't know where to set it, and in my understanding the batch size shouldn't play any role for this command, since I am only running the already pre-trained network?!

Could the cause of the error be that there are certain hardware requirements for reproducing the results?
I am using Ubuntu 18.04.5 LTS and an NVIDIA GeForce GTX 1060 6GB (and the nvidia-driver-450).

Here is a larger part of my terminal output:

1:06:35.398140 - Scene: [6]
1:06:35.398203 - Views: [359]
1:06:35.398260 - Group: [2732]
1:06:35.398285 - Image has 5 gt detections. (not used)
1:06:35.701966 - Pose prediction on 4 detections (n_iterations=1): 0:00:00.063503
1:06:35.954221 - Pose prediction on 4 detections (n_iterations=4): 0:00:00.250793
1:06:35.720832 - --------------------------------------------------------------------------------
100%|███████████████████████████████████████████████████████████| 10080/10080 [1:06:24<00:00, 2.53it/s]
1:06:47.763242 - Done with predictions
100%|█████████████████████████████████████████████████████████████| 10080/10080 [39:28<00:00, 4.26it/s]
1:46:18.765271 - Skipped: pix2pose_detections/coarse/iteration=1 (N=50023)
1:46:18.765351 - Skipped: pix2pose_detections/refiner/iteration=1 (N=50023)
1:46:18.765377 - Skipped: pix2pose_detections/refiner/iteration=2 (N=50023)
1:46:18.765398 - Skipped: pix2pose_detections/refiner/iteration=3 (N=50023)
1:46:18.765419 - Evaluation : pix2pose_detections/refiner/iteration=4 (N=50023)
0%| | 0/10080 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/rosmatch/anaconda3/envs/cosypose/lib/python3.7/runpy.py", line 193, in run_module_as_main
"main", mod_spec)
File "/home/rosmatch/anaconda3/envs/cosypose/lib/python3.7/runpy.py", line 85, in run_code
exec(code, run_globals)
File "/home/rosmatch/cosypose/cosypose/scripts/run_cosypose_eval.py", line 491, in
main()
File "/home/rosmatch/cosypose/cosypose/scripts/run_cosypose_eval.py", line 433, in main
eval_metrics[preds_k], eval_dfs[preds_k] = eval_runner.evaluate(preds)
File "/home/rosmatch/cosypose/cosypose/evaluation/eval_runner/pose_eval.py", line 67, in evaluate
meter.add(obj_predictions, obj_data_gt.to(device))
File "/home/rosmatch/cosypose/cosypose/evaluation/meters/pose_meters.py", line 172, in add
cand_infos['label'].values)
File "/home/rosmatch/cosypose/cosypose/evaluation/meters/pose_meters.py", line 101, in compute_errors_batch
errors.append(self.compute_errors(TXO_pred_, TXO_gt_, labels_))
File "/home/rosmatch/cosypose/cosypose/evaluation/meters/pose_meters.py", line 70, in compute_errors
dists = dists_add_symmetric(TXO_pred, TXO_gt, points)
File "/home/rosmatch/cosypose/cosypose/lib3d/distances.py", line 16, in dists_add_symmetric
dists_norm_squared = (dists ** 2).sum(dim=-1)
RuntimeError: CUDA out of memory. Tried to allocate 1.35 GiB (GPU 0; 5.93 GiB total capacity; 1.47 GiB already allocated; 866.50 MiB free; 36.31 MiB cached)
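
For context, the allocation that fails here is the large intermediate tensor built inside dists_add_symmetric when all candidate poses are processed at once. Below is a minimal sketch of how such a symmetric ADD-style distance can be chunked over candidates to cap peak GPU memory; the tensor shapes, chunk size, and function name are assumptions for illustration, not the cosypose implementation:

import torch

def dists_add_symmetric_chunked(TXO_pred, TXO_gt, points, chunk_size=256):
    """Chunked ADD-S-style distance, a sketch to limit peak GPU memory.

    TXO_pred, TXO_gt: (N, 4, 4) pose matrices; points: (N, P, 3) model points.
    Processing `chunk_size` candidates at a time avoids materialising the
    full (N, P, P, 3) difference tensor in a single allocation.
    """
    dists = []
    for start in range(0, TXO_pred.shape[0], chunk_size):
        sl = slice(start, start + chunk_size)
        # Transform the model points with predicted and ground-truth poses.
        pts_pred = points[sl] @ TXO_pred[sl, :3, :3].transpose(1, 2) + TXO_pred[sl, None, :3, 3]
        pts_gt = points[sl] @ TXO_gt[sl, :3, :3].transpose(1, 2) + TXO_gt[sl, None, :3, 3]
        # For each predicted point, distance to the closest ground-truth point.
        diff = pts_pred[:, :, None, :] - pts_gt[:, None, :, :]
        dists_norm_squared = (diff ** 2).sum(dim=-1)
        dists.append(dists_norm_squared.min(dim=2).values.sqrt().mean(dim=1))
    return torch.cat(dists)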

@salimkhazem

Hello, I have the same issue, did you fix it?
Thanks a lot for your answer.

@JohannesAma

Same here

@yupei-git

This may be done by changing the batch_size in run_pose_training.py.
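
For reference, this would be a config-level change. A sketch of the kind of edit meant here; the attribute names below are assumptions and may not match run_pose_training.py exactly:

# Sketch only: lower the batch size (and dataloader workers) wherever the
# run_pose_training.py configuration defines them. The attribute names
# below are assumptions, not the exact cosypose code.
cfg.batch_size = 8            # e.g. down from 32; smaller batches reduce peak GPU memory
cfg.n_dataloader_workers = 1  # fewer workers also lowers memory pressure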

@JohannesAma

I solved this with the following changes (see the sketch after this list):
bullet_batch_renderer.py -> workers from 8 to 1
multiview_predictor.py -> batch size (nsym) from 64 to 1
run_bop_inference.py -> workers from 8 to 1
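
A sketch of what these edits look like in place; the constructor and variable names are based on the file names above and may differ between cosypose versions, so check your checkout before applying them:

# bullet_batch_renderer.py: fewer rendering worker processes
# (verify the actual constructor signature in your checkout)
renderer = BulletBatchRenderer(object_set, n_workers=1)  # was n_workers=8

# multiview_predictor.py: smaller symmetry batch when scoring candidate poses
nsym_batch_size = 1  # was 64, as described in the comment above

# run_bop_inference.py: fewer dataloader workers
n_workers = 1  # was 8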

@AlexandraPapadaki

Same here. Is there any other suggestion? Unfortunately, Johannes's solution didn't work for me.
@JohannesAma did it really work for you for the siso tless case?

@JohannesAma

Same here. Is there any other suggestion? Unfortunately, Johannes's solution didn't work for me.
@JohannesAma did it really work for you for the siso tless case?

My NVIDIA card has 8 GB of memory; maybe yours is smaller and you have to reduce the batch size and workers in some more modules that are used in the siso tless case.

@smoothumut

I have the same problem and the suggested solution didn't work. Is there any solution??? Thanks in advance.

@JohannesAma

I'm sorry, I don't know about another solution.
Workers and batch size are the parameters that define the load on the graphics card.
Maybe you have to set them even smaller.
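
A quick way to see whether such reductions are actually helping is to log GPU memory around the step that fails. This is a plain PyTorch helper with no cosypose-specific names:

import torch

def log_gpu_memory(tag=""):
    """Print current and peak GPU memory use; handy while lowering
    workers/batch sizes to check whether the changes take effect."""
    alloc = torch.cuda.memory_allocated() / 1024**3
    reserved = torch.cuda.memory_reserved() / 1024**3
    peak = torch.cuda.max_memory_allocated() / 1024**3
    print(f"[{tag}] allocated={alloc:.2f} GiB, reserved={reserved:.2f} GiB, peak={peak:.2f} GiB")

# e.g. call log_gpu_memory("before eval") and log_gpu_memory("after eval")
# around the evaluation call you are debugging.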

@nturaymond

The main reason for this problem is that the dataset used in the evaluation is too large while the GPU running the program has less than 8 GB of memory.
The root cause is this line of code: run_cosypose_eval.py, line 443
eval_metrics[preds_k], eval_dfs[preds_k] = eval_runner.evaluate(preds)

Possible solutions:

  1. Go to the "local_data" folder and delete some data. Then run the pre-training; usually the results will not be a problem, and then run the evaluation again.
  2. Drop GPU usage and move all data and models to the CPU (requires constant code debugging).
  3. Modify the model to use AMP (a minimal autocast sketch follows below). However, the workload is large, and it is easy to make the whole program hard to run if you are not careful.

In fact, running the evaluation is not only about verifying whether the results are correct; the model can also be used to evaluate other datasets. The main part to modify for that is LOCAL_DATA_DIR.
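
For option 3, a minimal sketch of what automatic mixed precision could look like around the memory-heavy error computation. The names mirror the traceback above for illustration only; this is not a tested cosypose patch:

import torch

# Option 3 sketch: run the distance/error computation under autocast so
# large intermediate tensors are allocated in float16 where safe.
# `meter`, `TXO_pred`, `TXO_gt`, `labels` mirror the traceback above and
# are placeholders, not the actual cosypose objects.
def compute_errors_amp(meter, TXO_pred, TXO_gt, labels):
    with torch.cuda.amp.autocast():
        return meter.compute_errors(TXO_pred, TXO_gt, labels)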

