
How to get benchmark statistics? #2543

Open
shink opened this issue Nov 19, 2024 · 4 comments

Comments

@shink
Contributor

shink commented Nov 19, 2024

I'm building a CI to test some models on certain types of devices. I want to get benchmark statistics such as which model cases failed, and which tests were skipped and why. These statistics will be used to generate a table like this:

| Devices | BERT_pytorch train | BERT_pytorch eval | hf_GPT2 train | hf_GPT2 eval |
| ------- | ------------------ | ----------------- | ------------- | ------------ |
| CPU     |                    |                   |               |              |
| CUDA    |                    |                   |               |              |
| Foo     | ❌ (failed)        | ⚠️ (skipped)      |               |              |

So how can I get benchmark statistics? Is there a recommended way to do this? Can anyone give suggestions? Thanks so much!

@xuzhao9
Contributor

xuzhao9 commented Nov 19, 2024

I suggest writing your own userbenchmark for this use case: https://github.com/pytorch/benchmark/blob/main/userbenchmark/ADDING_USERBENCHMARKS.md
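The linked doc describes adding a directory under `userbenchmark/` with a `run.py` entry point that TorchBench invokes. The sketch below is only an illustration of that shape: `run_one_model` is a hypothetical placeholder (the real entry point would call into TorchBench's model loaders), and the result-dict layout just mirrors the `test_bench` JSON shown later in this thread, not an official schema.

```python
# Hypothetical minimal userbenchmark run.py sketch.
# TorchBench calls the run(args) entry point of each userbenchmark
# (see ADDING_USERBENCHMARKS.md for the real requirements).
import json
from typing import List


def run_one_model(name: str, test: str, device: str) -> dict:
    """Placeholder for invoking a TorchBench model; returns a fake metric.

    A real userbenchmark would load the model, run the requested test,
    and record latency / memory / pass-fail results here.
    """
    return {f"model={name}, test={test}, device={device}, metric=latencies": 1.0}


def run(args: List[str]) -> None:
    # Collect metrics for each model/test/device combination of interest.
    metrics = {}
    for model in ("BERT_pytorch",):
        metrics.update(run_one_model(model, "eval", "cpu"))
    result = {"name": "my_benchmark", "metrics": metrics}
    print(json.dumps(result, indent=4))


if __name__ == "__main__":
    run([])
```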

@shink
Contributor Author

shink commented Nov 20, 2024

@xuzhao9 Thanks for your guidance! Actually, I'm building a CI to test some out-of-tree devices, such as Intel HPU and Ascend NPU, so I want to run some official model cases on them to prove the quality of these backends. Are there any userbenchmarks that can be used directly? Or, if I write my own benchmark, will it be merged?

@shink
Contributor Author

shink commented Nov 20, 2024

Ah, I think `test_bench` might be a good choice.

@shink
Contributor Author

shink commented Nov 20, 2024

But it only supports `cpu` and `cuda`. I will send a pull request to allow it to run on other devices.
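One way such a change could detect the devices present on a machine is sketched below. This is only an illustration: `get_available_devices` is a hypothetical name (the actual change lives in the PR referenced below), and the `torch.cuda.is_available()` / `torch.backends.mps.is_available()` checks are real PyTorch APIs, guarded so the sketch also runs on a machine without accelerators or without PyTorch installed.

```python
# Sketch of device detection for a benchmark runner. The function name
# get_available_devices is hypothetical; out-of-tree backends (HPU, NPU)
# would need their own availability checks added here.
def get_available_devices() -> list:
    devices = ["cpu"]  # CPU is always present
    try:
        import torch
        if torch.cuda.is_available():
            devices.append("cuda")
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            devices.append("mps")
    except ImportError:
        pass  # torch not installed; fall back to CPU only
    return devices
```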

facebook-github-bot pushed a commit that referenced this issue Nov 25, 2024
…_devices()` (#2545)

Summary:
from #2543 (comment)

This change will allow all userbenchmarks to run on available devices.

## Userbenchmark - test_bench - BERT_pytorch

cuda:

```
$ python run_benchmark.py test_bench --models BERT_pytorch --device cuda
Running TorchBenchModelConfig(name='BERT_pytorch', test='eval', device='cuda', batch_size=None, extra_args=[], extra_env=None, output_dir=None) ... [done]
{
    "name": "test_bench",
    "environ": {
        "pytorch_git_version": "ac47a2d9714278889923ddd40e4210d242d8d4ee",
        "pytorch_version": "2.6.0.dev20241121+cu124",
        "device": "Tesla T4"
    },
    "metrics": {
        "model=BERT_pytorch, test=eval, device=cuda, bs=None, extra_args=[], metric=latencies": 122.69141,
        "model=BERT_pytorch, test=eval, device=cuda, bs=None, extra_args=[], metric=cpu_peak_mem": 0.6962890625,
        "model=BERT_pytorch, test=eval, device=cuda, bs=None, extra_args=[], metric=gpu_peak_mem": 1.573486328125
    }
}
```

mps:

```
$ python run_benchmark.py test_bench --models BERT_pytorch --device mps
Running TorchBenchModelConfig(name='BERT_pytorch', test='eval', device='mps', batch_size=None, extra_args=[], extra_env=None, output_dir=None) ... [done]
{
    "name": "test_bench",
    "environ": {
        "pytorch_git_version": "dd2e6d61409aac22198ec771560a38adb0018ba2",
        "pytorch_version": "2.6.0.dev20241120"
    },
    "metrics": {
        "model=BERT_pytorch, test=eval, device=mps, bs=None, extra_args=[], metric=latencies": 133.299,
        "model=BERT_pytorch, test=eval, device=mps, bs=None, extra_args=[], metric=cpu_peak_mem": 19.832832,
        "model=BERT_pytorch, test=eval, device=mps, bs=None, extra_args=[], metric=gpu_peak_mem": "failed"
    }
}
```

ascend npu:

```
$ python run_benchmark.py test_bench --models BERT_pytorch --device npu
Running TorchBenchModelConfig(name='BERT_pytorch', test='eval', device='npu', batch_size=None, extra_args=[], extra_env=None, output_dir=None) ... [done]
{
    "name": "test_bench",
    "environ": {
        "pytorch_git_version": "64141411e0de61b61857e216ae7a8766f4f5969b",
        "pytorch_version": "2.6.0.dev20240923"
    },
    "metrics": {
        "model=BERT_pytorch, test=eval, device=npu, bs=None, extra_args=[], metric=latencies": 21.688104,
        "model=BERT_pytorch, test=eval, device=npu, bs=None, extra_args=[], metric=cpu_peak_mem": 47.261696,
        "model=BERT_pytorch, test=eval, device=npu, bs=None, extra_args=[], metric=gpu_peak_mem": "failed"
    }
}
```
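The JSON emitted above can be post-processed into the pass/fail table the issue asks for. A minimal sketch, assuming the metric-key format shown in these outputs (comma-separated `key=value` fields, with the literal string `"failed"` marking a failed metric); the ⚠️ skipped state would need skip information that the metrics shown here don't carry, so it is omitted:

```python
# Turn test_bench-style metrics (key strings like
# "model=BERT_pytorch, test=eval, device=npu, ..., metric=latencies")
# into a per-(model, test, device) pass/fail status. Note: splitting on
# ", " assumes extra_args contains no commas, as in the outputs above.
def summarize(metrics: dict) -> dict:
    status = {}
    for key, value in metrics.items():
        fields = dict(part.split("=", 1) for part in key.split(", "))
        cell = (fields["model"], fields["test"], fields["device"])
        failed = value == "failed"
        # A cell fails if any of its metrics failed.
        status[cell] = "❌" if failed or status.get(cell) == "❌" else "✅"
    return status
```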

cc: xuzhao9 jgong5 FFFrog

Pull Request resolved: #2545

Reviewed By: xuzhao9

Differential Revision: D66457386

Pulled By: FindHao

fbshipit-source-id: 0f3a8aba97a2cb2efc3f77f01bcd28cfc7182e0b