
Assertion Error During Testing #130

Open

dianchia opened this issue Feb 12, 2024 · 1 comment

Comments

@dianchia

Did you check docs and existing issues?

  • I have read all the docs
  • I have searched the existing issues

Version Information

>>> python -V
Python 3.7.12
mmpose 0.24.0
mmcv-full 1.3.9
Full version info:
addict                   2.4.0
certifi                  2023.11.17
charset-normalizer       3.3.2
chumpy                   0.70
cycler                   0.11.0
Cython                   3.0.8
einops                   0.6.1
fonttools                4.38.0
idna                     3.6
importlib-metadata       6.7.0
json-tricks              3.17.3
kiwisolver               1.4.5
matplotlib               3.5.3
mmcv-full                1.3.9      $HOME/projects/pose_estimation/ViTPose/mmcv
mmpose                   0.24.0     $HOME/projects/pose_estimation/ViTPose/ViTPose
munkres                  1.1.4
numpy                    1.21.6
nvidia-cublas-cu11       11.10.3.66
nvidia-cuda-nvrtc-cu11   11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11        8.5.0.96
opencv-python            4.9.0.80
packaging                23.2
Pillow                   9.5.0
pip                      23.3.2
platformdirs             4.0.0
pyparsing                3.1.1
python-dateutil          2.8.2
PyYAML                   6.0.1
requests                 2.31.0
scipy                    1.7.3
setuptools               69.0.3
six                      1.16.0
timm                     0.4.9
tomli                    2.0.1
torch                    1.13.1
torchvision              0.14.1
typing_extensions        4.7.1
urllib3                  2.0.7
wheel                    0.42.0
xtcocotools              1.14.3
yapf                     0.40.2
zipp                     3.15.0

Operating System

Ubuntu

Describe the bug

An AssertionError is raised when testing with the script tools/dist_test.sh. A shortened version of the error is included below.

File "tools/test.py", line 184, in <module>
    main()
  File "tools/test.py", line 167, in main
    args.gpu_collect)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/apis/test.py", line 70, in multi_gpu_test
    result = model(return_loss=False, **data)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/mmcv/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 141, in forward
    img, img_metas, return_heatmap=return_heatmap, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 165, in forward_test
    assert img.size(0) == len(img_metas)
AssertionError
Full error message:
$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py:188: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

  FutureWarning,
apex is not installed
apex is not installed
apex is not installed
$HOME/projects/pose_estimation/ViTPose/mmcv/mmcv/cnn/bricks/transformer.py:27: UserWarning: Fail to import ``MultiScaleDeformableAttention`` from ``mmcv.ops.multi_scale_deform_attn``, You should install ``mmcv-full`` if you need this module.
  warnings.warn('Fail to import ``MultiScaleDeformableAttention`` from '
$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/utils/setup_env.py:33: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting OMP_NUM_THREADS environment variable for each process '
$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/utils/setup_env.py:43: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting MKL_NUM_THREADS environment variable for each process '
loading annotations into memory...
Done (t=1.00s)
creating index...
index created!
=> Total boxes: 104125
=> Total boxes after filter low score@0.0: 104125
=> num_images: 5000
=> load 104125 samples
Use load_from_local loader
The model and loaded state dict do not match exactly

unexpected key in source state_dict: backbone.blocks.0.mlp.experts.0.weight, backbone.blocks.0.mlp.experts.0.bias, backbone.blocks.0.mlp.experts.1.weight, backbone.blocks.0.mlp.experts.1.bias, backbone.blocks.0.mlp.experts.2.weight, backbone.blocks.0.mlp.experts.2.bias, backbone.blocks.0.mlp.experts.3.weight, backbone.blocks.0.mlp.experts.3.bias, backbone.blocks.0.mlp.experts.4.weight, backbone.blocks.0.mlp.experts.4.bias, backbone.blocks.0.mlp.experts.5.weight, backbone.blocks.0.mlp.experts.5.bias, backbone.blocks.1.mlp.experts.0.weight, backbone.blocks.1.mlp.experts.0.bias, backbone.blocks.1.mlp.experts.1.weight, backbone.blocks.1.mlp.experts.1.bias, backbone.blocks.1.mlp.experts.2.weight, backbone.blocks.1.mlp.experts.2.bias, backbone.blocks.1.mlp.experts.3.weight, backbone.blocks.1.mlp.experts.3.bias, backbone.blocks.1.mlp.experts.4.weight, backbone.blocks.1.mlp.experts.4.bias, backbone.blocks.1.mlp.experts.5.weight, backbone.blocks.1.mlp.experts.5.bias, backbone.blocks.2.mlp.experts.0.weight, backbone.blocks.2.mlp.experts.0.bias, backbone.blocks.2.mlp.experts.1.weight, backbone.blocks.2.mlp.experts.1.bias, backbone.blocks.2.mlp.experts.2.weight, backbone.blocks.2.mlp.experts.2.bias, backbone.blocks.2.mlp.experts.3.weight, backbone.blocks.2.mlp.experts.3.bias, backbone.blocks.2.mlp.experts.4.weight, backbone.blocks.2.mlp.experts.4.bias, backbone.blocks.2.mlp.experts.5.weight, backbone.blocks.2.mlp.experts.5.bias, backbone.blocks.3.mlp.experts.0.weight, backbone.blocks.3.mlp.experts.0.bias, backbone.blocks.3.mlp.experts.1.weight, backbone.blocks.3.mlp.experts.1.bias, backbone.blocks.3.mlp.experts.2.weight, backbone.blocks.3.mlp.experts.2.bias, backbone.blocks.3.mlp.experts.3.weight, backbone.blocks.3.mlp.experts.3.bias, backbone.blocks.3.mlp.experts.4.weight, backbone.blocks.3.mlp.experts.4.bias, backbone.blocks.3.mlp.experts.5.weight, backbone.blocks.3.mlp.experts.5.bias, backbone.blocks.4.mlp.experts.0.weight, backbone.blocks.4.mlp.experts.0.bias, backbone.blocks.4.mlp.experts.1.weight, backbone.blocks.4.mlp.experts.1.bias, backbone.blocks.4.mlp.experts.2.weight, backbone.blocks.4.mlp.experts.2.bias, backbone.blocks.4.mlp.experts.3.weight, backbone.blocks.4.mlp.experts.3.bias, backbone.blocks.4.mlp.experts.4.weight, backbone.blocks.4.mlp.experts.4.bias, backbone.blocks.4.mlp.experts.5.weight, backbone.blocks.4.mlp.experts.5.bias, backbone.blocks.5.mlp.experts.0.weight, backbone.blocks.5.mlp.experts.0.bias, backbone.blocks.5.mlp.experts.1.weight, backbone.blocks.5.mlp.experts.1.bias, backbone.blocks.5.mlp.experts.2.weight, backbone.blocks.5.mlp.experts.2.bias, backbone.blocks.5.mlp.experts.3.weight, backbone.blocks.5.mlp.experts.3.bias, backbone.blocks.5.mlp.experts.4.weight, backbone.blocks.5.mlp.experts.4.bias, backbone.blocks.5.mlp.experts.5.weight, backbone.blocks.5.mlp.experts.5.bias, backbone.blocks.6.mlp.experts.0.weight, backbone.blocks.6.mlp.experts.0.bias, backbone.blocks.6.mlp.experts.1.weight, backbone.blocks.6.mlp.experts.1.bias, backbone.blocks.6.mlp.experts.2.weight, backbone.blocks.6.mlp.experts.2.bias, backbone.blocks.6.mlp.experts.3.weight, backbone.blocks.6.mlp.experts.3.bias, backbone.blocks.6.mlp.experts.4.weight, backbone.blocks.6.mlp.experts.4.bias, backbone.blocks.6.mlp.experts.5.weight, backbone.blocks.6.mlp.experts.5.bias, backbone.blocks.7.mlp.experts.0.weight, backbone.blocks.7.mlp.experts.0.bias, backbone.blocks.7.mlp.experts.1.weight, backbone.blocks.7.mlp.experts.1.bias, backbone.blocks.7.mlp.experts.2.weight, backbone.blocks.7.mlp.experts.2.bias, 
backbone.blocks.7.mlp.experts.3.weight, backbone.blocks.7.mlp.experts.3.bias, backbone.blocks.7.mlp.experts.4.weight, backbone.blocks.7.mlp.experts.4.bias, backbone.blocks.7.mlp.experts.5.weight, backbone.blocks.7.mlp.experts.5.bias, backbone.blocks.8.mlp.experts.0.weight, backbone.blocks.8.mlp.experts.0.bias, backbone.blocks.8.mlp.experts.1.weight, backbone.blocks.8.mlp.experts.1.bias, backbone.blocks.8.mlp.experts.2.weight, backbone.blocks.8.mlp.experts.2.bias, backbone.blocks.8.mlp.experts.3.weight, backbone.blocks.8.mlp.experts.3.bias, backbone.blocks.8.mlp.experts.4.weight, backbone.blocks.8.mlp.experts.4.bias, backbone.blocks.8.mlp.experts.5.weight, backbone.blocks.8.mlp.experts.5.bias, backbone.blocks.9.mlp.experts.0.weight, backbone.blocks.9.mlp.experts.0.bias, backbone.blocks.9.mlp.experts.1.weight, backbone.blocks.9.mlp.experts.1.bias, backbone.blocks.9.mlp.experts.2.weight, backbone.blocks.9.mlp.experts.2.bias, backbone.blocks.9.mlp.experts.3.weight, backbone.blocks.9.mlp.experts.3.bias, backbone.blocks.9.mlp.experts.4.weight, backbone.blocks.9.mlp.experts.4.bias, backbone.blocks.9.mlp.experts.5.weight, backbone.blocks.9.mlp.experts.5.bias, backbone.blocks.10.mlp.experts.0.weight, backbone.blocks.10.mlp.experts.0.bias, backbone.blocks.10.mlp.experts.1.weight, backbone.blocks.10.mlp.experts.1.bias, backbone.blocks.10.mlp.experts.2.weight, backbone.blocks.10.mlp.experts.2.bias, backbone.blocks.10.mlp.experts.3.weight, backbone.blocks.10.mlp.experts.3.bias, backbone.blocks.10.mlp.experts.4.weight, backbone.blocks.10.mlp.experts.4.bias, backbone.blocks.10.mlp.experts.5.weight, backbone.blocks.10.mlp.experts.5.bias, backbone.blocks.11.mlp.experts.0.weight, backbone.blocks.11.mlp.experts.0.bias, backbone.blocks.11.mlp.experts.1.weight, backbone.blocks.11.mlp.experts.1.bias, backbone.blocks.11.mlp.experts.2.weight, backbone.blocks.11.mlp.experts.2.bias, backbone.blocks.11.mlp.experts.3.weight, backbone.blocks.11.mlp.experts.3.bias, backbone.blocks.11.mlp.experts.4.weight, backbone.blocks.11.mlp.experts.4.bias, backbone.blocks.11.mlp.experts.5.weight, backbone.blocks.11.mlp.experts.5.bias

[                                                  ] 0/104125, elapsed: 0s, ETA:Traceback (most recent call last):
  File "tools/test.py", line 184, in <module>
    main()
  File "tools/test.py", line 167, in main
    args.gpu_collect)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/apis/test.py", line 70, in multi_gpu_test
    result = model(return_loss=False, **data)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/mmcv/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 141, in forward
    img, img_metas, return_heatmap=return_heatmap, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 165, in forward_test
    assert img.size(0) == len(img_metas)
AssertionError
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 3740731) of binary: $HOME/miniforge3/envs/vitpose/bin/python
Traceback (most recent call last):
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py", line 195, in <module>
    main()
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py", line 191, in main
    launch(args)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py", line 176, in launch
    run(args)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/run.py", line 756, in run
    )(*cmd_args)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 248, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
tools/test.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-02-12_17:58:43
  host      : host
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 3740731)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Steps to reproduce

  1. Clone the repository with git clone https://github.com/ViTAE-Transformer/ViTPose.git --depth 1
  2. Follow the installation instructions in README.md
  3. Download the dataset from the official COCO website, specifically the 2017 Train/Val/Test images.
  4. Put the downloaded archives into ./data/coco/ and unzip all of them.
  5. Download the annotation files from here and put them into ./data/coco/annotations/ (see the layout sketch after this list).
  6. Download any of the wholebody pretrained models.
  7. Start testing with this command: bash tools/dist_test.sh configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/ViTPose_base_wholebody_256x192.py pretrained/wholebody.pth 1
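For reference, after steps 3-5 the data directory should look roughly like this (a sketch assuming the standard mmpose COCO-WholeBody layout; the exact annotation file names depend on the archive you downloaded):

data/coco/
├── annotations/
│   ├── coco_wholebody_train_v1.0.json
│   └── coco_wholebody_val_v1.0.json
├── train2017/
├── val2017/
└── test2017/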

Expected behaviour

Expected the testing to run smoothly without errors.

@LancasterLi

I corrected this error by replacing every occurrence of "img_metas" with "img_metas.data[0]" in ./detectors/top_down.py
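For context, a minimal sketch of what that change amounts to in forward_test (paraphrased, not the exact upstream code). The traceback above goes through torch's native DistributedDataParallel (torch/nn/parallel/distributed.py) rather than mmcv's MMDistributedDataParallel, and native DDP does not unwrap mmcv DataContainer objects, so img_metas can arrive still wrapped and len(img_metas) no longer matches the batch size:

from mmcv.parallel import DataContainer

def forward_test(self, img, img_metas, return_heatmap=False, **kwargs):
    # Under torch's native DistributedDataParallel, img_metas may still be
    # an mmcv DataContainer rather than a plain list of per-image dicts;
    # unwrap it so the batch-size check below holds again.
    if isinstance(img_metas, DataContainer):
        img_metas = img_metas.data[0]
    assert img.size(0) == len(img_metas)
    ...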
