[Refactor] Put values, lengths and offsets of NJTs together in storage #1023
Open: vmoens wants to merge 10 commits into gh/vmoens/24/base from gh/vmoens/24/head
This was referenced Oct 2, 2024
facebook-github-bot added the CLA Signed label Oct 2, 2024
vmoens added a commit that referenced this pull request Oct 2, 2024
ghstack-source-id: 47735ead77292fc62e19200469c6a8731048f231
Pull Request resolved: #1023
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 49.0610μs | 23.5053μs | 42.5436 KOps/s | 39.1417 KOps/s | |
test_plain_set_stack_nested | 52.0170μs | 23.6152μs | 42.3457 KOps/s | 38.4767 KOps/s | |
test_plain_set_nested_inplace | 75.7610μs | 25.7347μs | 38.8580 KOps/s | 35.3896 KOps/s | |
test_plain_set_stack_nested_inplace | 66.6240μs | 25.6589μs | 38.9728 KOps/s | 35.4439 KOps/s | |
test_items | 23.1530μs | 4.2308μs | 236.3605 KOps/s | 243.3109 KOps/s | |
test_items_nested | 0.5120ms | 0.3846ms | 2.6001 KOps/s | 2.4514 KOps/s | |
test_items_nested_locked | 0.5626ms | 0.3830ms | 2.6113 KOps/s | 2.5881 KOps/s | |
test_items_nested_leaf | 0.1561ms | 80.2421μs | 12.4623 KOps/s | 12.2595 KOps/s | |
test_items_stack_nested | 0.5271ms | 0.3849ms | 2.5983 KOps/s | 2.5406 KOps/s | |
test_items_stack_nested_leaf | 0.1499ms | 83.1392μs | 12.0280 KOps/s | 11.7947 KOps/s | |
test_items_stack_nested_locked | 0.5837ms | 0.3837ms | 2.6061 KOps/s | 2.5818 KOps/s | |
test_keys | 36.3170μs | 3.5698μs | 280.1312 KOps/s | 283.5856 KOps/s | |
test_keys_nested | 0.2309ms | 0.1336ms | 7.4870 KOps/s | 7.1516 KOps/s | |
test_keys_nested_locked | 0.7723ms | 0.1386ms | 7.2163 KOps/s | 6.9005 KOps/s | |
test_keys_nested_leaf | 0.2032ms | 0.1167ms | 8.5726 KOps/s | 8.1743 KOps/s | |
test_keys_stack_nested | 0.2267ms | 0.1346ms | 7.4292 KOps/s | 7.1854 KOps/s | |
test_keys_stack_nested_leaf | 0.2002ms | 0.1172ms | 8.5342 KOps/s | 8.4332 KOps/s | |
test_keys_stack_nested_locked | 0.2335ms | 0.1387ms | 7.2113 KOps/s | 6.9625 KOps/s | |
test_values | 7.8486μs | 1.0476μs | 954.5256 KOps/s | 939.6585 KOps/s | |
test_values_nested | 0.1545ms | 93.4116μs | 10.7053 KOps/s | 10.3933 KOps/s | |
test_values_nested_locked | 0.1571ms | 93.3294μs | 10.7147 KOps/s | 10.3669 KOps/s | |
test_values_nested_leaf | 0.1441ms | 80.2601μs | 12.4595 KOps/s | 12.0563 KOps/s | |
test_values_stack_nested | 0.1582ms | 92.6340μs | 10.7952 KOps/s | 10.1839 KOps/s | |
test_values_stack_nested_leaf | 0.1459ms | 79.6106μs | 12.5611 KOps/s | 12.0492 KOps/s | |
test_values_stack_nested_locked | 0.1616ms | 93.8749μs | 10.6525 KOps/s | 10.4612 KOps/s | |
test_membership | 5.4416μs | 0.7386μs | 1.3539 MOps/s | 1.3163 MOps/s | |
test_membership_nested | 37.7400μs | 2.7543μs | 363.0657 KOps/s | 362.9317 KOps/s | |
test_membership_nested_leaf | 25.7080μs | 2.7826μs | 359.3811 KOps/s | 356.3370 KOps/s | |
test_membership_stacked_nested | 28.2120μs | 2.7079μs | 369.2878 KOps/s | 367.7906 KOps/s | |
test_membership_stacked_nested_leaf | 17.8530μs | 2.7418μs | 364.7223 KOps/s | 357.9529 KOps/s | |
test_membership_nested_last | 40.4740μs | 4.1380μs | 241.6627 KOps/s | 237.1864 KOps/s | |
test_membership_nested_leaf_last | 27.5610μs | 4.1981μs | 238.2009 KOps/s | 239.6246 KOps/s | |
test_membership_stacked_nested_last | 35.1760μs | 4.8159μs | 207.6446 KOps/s | 131.6535 KOps/s | |
test_membership_stacked_nested_leaf_last | 27.7710μs | 4.9234μs | 203.1109 KOps/s | 132.1912 KOps/s | |
test_nested_getleaf | 37.1590μs | 10.9382μs | 91.4228 KOps/s | 91.5765 KOps/s | |
test_nested_get | 46.8700μs | 9.9937μs | 100.0632 KOps/s | 96.6923 KOps/s | |
test_stacked_getleaf | 30.9170μs | 10.5633μs | 94.6678 KOps/s | 92.3649 KOps/s | |
test_stacked_get | 30.6480μs | 10.0350μs | 99.6512 KOps/s | 87.4757 KOps/s | |
test_nested_getitemleaf | 39.1630μs | 10.9390μs | 91.4164 KOps/s | 88.6413 KOps/s | |
test_nested_getitem | 45.1940μs | 10.2140μs | 97.9051 KOps/s | 91.2810 KOps/s | |
test_stacked_getitemleaf | 43.0300μs | 10.9572μs | 91.2644 KOps/s | 90.1011 KOps/s | |
test_stacked_getitem | 42.4790μs | 10.2504μs | 97.5571 KOps/s | 95.8118 KOps/s | |
test_lock_nested | 83.7694ms | 0.5897ms | 1.6957 KOps/s | 1.9658 KOps/s | |
test_lock_stack_nested | 0.8638ms | 0.4741ms | 2.1094 KOps/s | 2.1409 KOps/s | |
test_unlock_nested | 88.3304ms | 0.5127ms | 1.9504 KOps/s | 2.3529 KOps/s | |
test_unlock_stack_nested | 0.6036ms | 0.3875ms | 2.5804 KOps/s | 2.5931 KOps/s | |
test_flatten_speed | 0.1699ms | 0.1009ms | 9.9127 KOps/s | 9.8866 KOps/s | |
test_unflatten_speed | 1.0662ms | 0.5114ms | 1.9553 KOps/s | 1.8977 KOps/s | |
test_common_ops | 5.0005ms | 1.1439ms | 874.2084 Ops/s | 849.5130 Ops/s | |
test_creation | 83.2550μs | 2.0554μs | 486.5146 KOps/s | 466.4653 KOps/s | |
test_creation_empty | 63.1480μs | 17.3041μs | 57.7898 KOps/s | 47.6935 KOps/s | |
test_creation_nested_1 | 93.9050μs | 20.7416μs | 48.2122 KOps/s | 41.5624 KOps/s | |
test_creation_nested_2 | 69.0590μs | 25.4009μs | 39.3687 KOps/s | 34.7097 KOps/s | |
test_clone | 0.2069ms | 16.8872μs | 59.2165 KOps/s | 56.8080 KOps/s | |
test_getitem[int] | 1.0482ms | 17.1227μs | 58.4021 KOps/s | 57.6856 KOps/s | |
test_getitem[slice_int] | 0.1320ms | 30.6116μs | 32.6674 KOps/s | 31.5211 KOps/s | |
test_getitem[range] | 0.1785ms | 57.7666μs | 17.3110 KOps/s | 17.2966 KOps/s | |
test_getitem[tuple] | 0.1281ms | 25.4129μs | 39.3501 KOps/s | 38.1120 KOps/s | |
test_getitem[list] | 0.1795ms | 54.1270μs | 18.4751 KOps/s | 18.8048 KOps/s | |
test_setitem_dim[int] | 75.9520μs | 31.6939μs | 31.5518 KOps/s | 30.5202 KOps/s | |
test_setitem_dim[slice_int] | 0.1109ms | 59.4150μs | 16.8308 KOps/s | 16.1975 KOps/s | |
test_setitem_dim[range] | 0.1265ms | 82.5450μs | 12.1146 KOps/s | 11.7560 KOps/s | |
test_setitem_dim[tuple] | 92.8430μs | 47.6918μs | 20.9680 KOps/s | 20.3218 KOps/s | |
test_setitem | 66.8550μs | 29.3897μs | 34.0256 KOps/s | 31.9962 KOps/s | |
test_set | 82.7140μs | 29.1163μs | 34.3450 KOps/s | 32.6248 KOps/s | |
test_set_shared | 3.8988ms | 0.2177ms | 4.5940 KOps/s | 4.4901 KOps/s | |
test_update | 0.2293ms | 37.5342μs | 26.6423 KOps/s | 24.3149 KOps/s | |
test_update_nested | 0.2333ms | 48.8261μs | 20.4808 KOps/s | 19.5792 KOps/s | |
test_update__nested | 0.5303ms | 45.0250μs | 22.2099 KOps/s | 22.0291 KOps/s | |
test_set_nested | 0.2418ms | 32.0446μs | 31.2065 KOps/s | 29.4335 KOps/s | |
test_set_nested_new | 82.2430μs | 37.2347μs | 26.8567 KOps/s | 25.9132 KOps/s | |
test_select | 0.2799ms | 54.2937μs | 18.4183 KOps/s | 17.6881 KOps/s | |
test_select_nested | 0.1273ms | 61.0514μs | 16.3796 KOps/s | 16.6846 KOps/s | |
test_exclude_nested | 0.1586ms | 75.1628μs | 13.3045 KOps/s | 13.2250 KOps/s | |
test_empty[True] | 0.6481ms | 0.3540ms | 2.8251 KOps/s | 2.8052 KOps/s | |
test_empty[False] | 8.6462μs | 1.2748μs | 784.4391 KOps/s | 809.3043 KOps/s | |
test_unbind_speed | 0.6478ms | 0.3019ms | 3.3119 KOps/s | 3.2610 KOps/s | |
test_unbind_speed_stack0 | 0.6184ms | 0.3032ms | 3.2977 KOps/s | 3.3980 KOps/s | |
test_unbind_speed_stack1 | 0.1002s | 0.8241ms | 1.2135 KOps/s | 1.3627 KOps/s | |
test_split | 3.0370ms | 2.0072ms | 498.2038 Ops/s | 447.2298 Ops/s | |
test_chunk | 99.2087ms | 2.4158ms | 413.9353 Ops/s | 441.4800 Ops/s | |
test_creation[device0] | 0.2424ms | 0.1153ms | 8.6727 KOps/s | 8.6340 KOps/s | |
test_creation_from_tensor | 4.0644ms | 0.1186ms | 8.4292 KOps/s | 8.4827 KOps/s | |
test_add_one[memmap_tensor0] | 78.0560μs | 6.9743μs | 143.3827 KOps/s | 140.7443 KOps/s | |
test_contiguous[memmap_tensor0] | 16.9310μs | 1.9563μs | 511.1746 KOps/s | 448.4936 KOps/s | |
test_stack[memmap_tensor0] | 29.2240μs | 5.4897μs | 182.1588 KOps/s | 144.1686 KOps/s | |
test_memmaptd_index | 1.1261ms | 0.4188ms | 2.3875 KOps/s | 2.3605 KOps/s | |
test_memmaptd_index_astensor | 1.1792ms | 0.5196ms | 1.9247 KOps/s | 1.8909 KOps/s | |
test_memmaptd_index_op | 1.8109ms | 1.0249ms | 975.6852 Ops/s | 898.5474 Ops/s | |
test_serialize_model | 0.1228s | 0.1174s | 8.5182 Ops/s | 8.5903 Ops/s | |
test_serialize_model_pickle | 0.4448s | 0.3889s | 2.5714 Ops/s | 2.4938 Ops/s | |
test_serialize_weights | 0.1259s | 0.1184s | 8.4445 Ops/s | 8.5402 Ops/s | |
test_serialize_weights_returnearly | 0.2702s | 0.1740s | 5.7472 Ops/s | 6.4035 Ops/s | |
test_serialize_weights_pickle | 0.4813s | 0.3882s | 2.5762 Ops/s | 1.0960 Ops/s | |
test_serialize_weights_filesystem | 0.1455s | 0.1407s | 7.1093 Ops/s | 7.0708 Ops/s | |
test_serialize_model_filesystem | 0.1603s | 0.1510s | 6.6245 Ops/s | 6.9618 Ops/s | |
test_reshape_pytree | 94.5970μs | 39.2474μs | 25.4794 KOps/s | 24.7041 KOps/s | |
test_reshape_td | 95.8190μs | 48.0832μs | 20.7973 KOps/s | 20.6583 KOps/s | |
test_view_pytree | 81.6420μs | 39.1785μs | 25.5242 KOps/s | 24.6678 KOps/s | |
test_view_td | 0.1071ms | 53.4581μs | 18.7062 KOps/s | 18.3162 KOps/s | |
test_unbind_pytree | 89.5570μs | 35.9759μs | 27.7964 KOps/s | 26.3323 KOps/s | |
test_unbind_td | 0.2898ms | 45.0976μs | 22.1741 KOps/s | 21.5521 KOps/s | |
test_split_pytree | 78.8570μs | 38.3060μs | 26.1056 KOps/s | 25.1049 KOps/s | |
test_split_td | 0.1985ms | 59.7583μs | 16.7341 KOps/s | 17.0388 KOps/s | |
test_add_pytree | 0.1083ms | 44.5684μs | 22.4374 KOps/s | 21.9518 KOps/s | |
test_add_td | 0.6759ms | 84.8142μs | 11.7905 KOps/s | 10.5725 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1426ms | 59.7438μs | 16.7381 KOps/s | 16.1710 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.9754ms | 0.1975ms | 5.0623 KOps/s | 5.0544 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.4995ms | 59.1942μs | 16.8936 KOps/s | 17.3048 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2727ms | 0.1395ms | 7.1670 KOps/s | 7.0769 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 72.6560μs | 24.9649μs | 40.0562 KOps/s | 42.3826 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1392ms | 73.5309μs | 13.5997 KOps/s | 12.9529 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1715ms | 76.7046μs | 13.0370 KOps/s | 12.7214 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4148ms | 68.7638μs | 14.5425 KOps/s | 14.0121 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3275ms | 0.1813ms | 5.5167 KOps/s | 5.4814 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3840ms | 0.2358ms | 4.2413 KOps/s | 4.0446 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1111ms | 48.3547μs | 20.6805 KOps/s | 21.1166 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4334ms | 76.4686μs | 13.0773 KOps/s | 10.7469 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3696ms | 0.1778ms | 5.6238 KOps/s | 5.6469 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5953ms | 0.2846ms | 3.5131 KOps/s | 3.4617 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4031ms | 0.2742ms | 3.6464 KOps/s | 3.4868 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5476ms | 0.1874ms | 5.3370 KOps/s | 5.4416 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1868ms | 74.3900μs | 13.4427 KOps/s | 13.5028 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1058ms | 47.8239μs | 20.9100 KOps/s | 20.6431 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4071ms | 0.2317ms | 4.3162 KOps/s | 4.3574 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2734ms | 0.1741ms | 5.7424 KOps/s | 5.6310 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2501ms | 0.1114ms | 8.9739 KOps/s | 8.8710 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1781ms | 79.1425μs | 12.6354 KOps/s | 12.2842 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1481ms | 78.7643μs | 12.6961 KOps/s | 12.1962 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1542ms | 69.3490μs | 14.4198 KOps/s | 13.8430 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3817ms | 0.1934ms | 5.1701 KOps/s | 5.2188 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.2561ms | 1.7400ms | 574.7133 Ops/s | 561.8206 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2841ms | 0.1868ms | 5.3520 KOps/s | 5.1807 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.2326ms | 1.0864ms | 920.4682 Ops/s | 883.2286 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.7310ms | 0.4163ms | 2.4018 KOps/s | 2.3944 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.1494ms | 3.9825ms | 251.0975 Ops/s | 236.1811 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 80.5400μs | 33.0560μs | 30.2517 KOps/s | 28.6803 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.1363ms | 47.3647μs | 21.1128 KOps/s | 20.2003 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 78.2960μs | 28.7367μs | 34.7987 KOps/s | 32.6825 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 80.0800μs | 29.3675μs | 34.0513 KOps/s | 33.2060 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 72.6260μs | 28.5836μs | 34.9851 KOps/s | 32.7506 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 77.5640μs | 28.8515μs | 34.6602 KOps/s | 33.3390 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1501ms | 72.1994μs | 13.8505 KOps/s | 13.4752 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5581ms | 27.9301μs | 35.8036 KOps/s | 35.6697 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1889ms | 67.6504μs | 14.7819 KOps/s | 14.5571 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 65.0920μs | 23.5475μs | 42.4674 KOps/s | 40.7201 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1449ms | 67.3576μs | 14.8461 KOps/s | 14.6915 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 65.6920μs | 23.2089μs | 43.0870 KOps/s | 41.1881 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1630ms | 72.2276μs | 13.8451 KOps/s | 13.4283 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8570ms | 27.7778μs | 36.0000 KOps/s | 36.2061 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1523ms | 67.4436μs | 14.8272 KOps/s | 14.8126 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 61.3040μs | 22.9795μs | 43.5171 KOps/s | 41.7125 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1561ms | 66.5148μs | 15.0342 KOps/s | 14.6826 KOps/s | |
test_compile_indexing[int-pytree-eager] | 69.6700μs | 23.3166μs | 42.8880 KOps/s | 41.2940 KOps/s | |
test_mod_add[eager] | 67.6760μs | 23.6471μs | 42.2884 KOps/s | 36.0950 KOps/s | |
test_mod_add[compile] | 92.9140μs | 37.7610μs | 26.4823 KOps/s | 25.6845 KOps/s | |
test_mod_add[compile-overhead] | 86.8420μs | 37.6462μs | 26.5631 KOps/s | 26.1177 KOps/s | |
test_mod_wrap[eager] | 0.3145ms | 0.2022ms | 4.9467 KOps/s | 4.8563 KOps/s | |
test_mod_wrap[compile] | 0.3854ms | 0.2283ms | 4.3796 KOps/s | 4.3407 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3635ms | 0.2288ms | 4.3715 KOps/s | 4.3411 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.0761ms | 10.6616ms | 93.7946 Ops/s | 92.5318 Ops/s | |
test_mod_wrap_and_backward[compile] | 11.6413ms | 10.7579ms | 92.9546 Ops/s | 83.7756 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.0332ms | 10.7137ms | 93.3385 Ops/s | 87.0019 Ops/s | |
test_seq_add[eager] | 0.1712ms | 89.5440μs | 11.1677 KOps/s | 10.4428 KOps/s | |
test_seq_add[compile] | 0.1590ms | 63.3212μs | 15.7925 KOps/s | 15.3861 KOps/s | |
test_seq_add[compile-overhead] | 0.1377ms | 63.1119μs | 15.8449 KOps/s | 15.7873 KOps/s | |
test_seq_wrap[eager] | 0.7073ms | 0.3699ms | 2.7031 KOps/s | 2.5554 KOps/s | |
test_seq_wrap[compile] | 0.4813ms | 0.2654ms | 3.7684 KOps/s | 3.7220 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4846ms | 0.2670ms | 3.7448 KOps/s | 3.7036 KOps/s | |
test_func_call_runtime[False-eager] | 0.7349ms | 0.5136ms | 1.9471 KOps/s | 1.9021 KOps/s | |
test_func_call_runtime[False-compile] | 0.5841ms | 0.4979ms | 2.0083 KOps/s | 1.9760 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.7267ms | 0.4985ms | 2.0061 KOps/s | 1.9881 KOps/s | |
test_func_call_runtime[True-eager] | 2.3652ms | 0.7640ms | 1.3089 KOps/s | 1.3414 KOps/s | |
test_func_call_runtime[True-compile] | 0.9092ms | 0.5194ms | 1.9254 KOps/s | 1.9485 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6640ms | 0.5158ms | 1.9388 KOps/s | 1.9513 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.1546ms | 0.5138ms | 1.9461 KOps/s | 1.9337 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9346ms | 0.5030ms | 1.9883 KOps/s | 1.9708 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7341ms | 0.5017ms | 1.9931 KOps/s | 1.9884 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4229ms | 0.8832ms | 1.1322 KOps/s | 1.1156 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8862ms | 0.7269ms | 1.3757 KOps/s | 1.3609 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8427ms | 0.7229ms | 1.3834 KOps/s | 1.3515 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 3.3373ms | 1.8798ms | 531.9826 Ops/s | 530.0901 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 6.0250ms | 1.9845ms | 503.9058 Ops/s | 512.7588 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.6732ms | 1.9213ms | 520.4714 Ops/s | 513.4657 Ops/s | |
test_distributed | 0.2295ms | 0.1252ms | 7.9867 KOps/s | 7.7268 KOps/s | |
test_tdmodule | 82.6340μs | 17.6779μs | 56.5677 KOps/s | 51.9173 KOps/s | |
test_tdmodule_dispatch | 55.1730μs | 35.1623μs | 28.4396 KOps/s | 26.1455 KOps/s | |
test_tdseq | 37.7000μs | 19.9882μs | 50.0294 KOps/s | 45.1731 KOps/s | |
test_tdseq_dispatch | 67.2650μs | 40.5113μs | 24.6845 KOps/s | 22.7598 KOps/s | |
test_instantiation_functorch | 1.8460ms | 1.5567ms | 642.3942 Ops/s | 634.8262 Ops/s | |
test_exec_functorch | 0.3755ms | 0.1856ms | 5.3878 KOps/s | 5.2992 KOps/s | |
test_exec_functional_call | 0.7564ms | 0.1722ms | 5.8069 KOps/s | 5.7438 KOps/s | |
test_exec_td_decorator | 0.5415ms | 0.2314ms | 4.3218 KOps/s | 4.1385 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0285ms | 0.6299ms | 1.5875 KOps/s | 1.5426 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9416ms | 0.6284ms | 1.5913 KOps/s | 1.5488 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8152ms | 0.5220ms | 1.9156 KOps/s | 1.8434 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7339ms | 0.5192ms | 1.9260 KOps/s | 1.8951 KOps/s | |
test_to_module_speed[True] | 2.3430ms | 1.4281ms | 700.2525 Ops/s | 702.4027 Ops/s | |
test_to_module_speed[False] | 1.6341ms | 1.3853ms | 721.8777 Ops/s | 723.2419 Ops/s | |
test_tc_init | 83.7870μs | 44.5585μs | 22.4424 KOps/s | 19.7760 KOps/s | |
test_tc_init_nested | 0.2009ms | 90.3639μs | 11.0664 KOps/s | 10.0457 KOps/s | |
test_tc_first_layer_tensor | 31.8800μs | 1.4356μs | 696.5690 KOps/s | 659.5404 KOps/s | |
test_tc_first_layer_nontensor | 29.4550μs | 4.6366μs | 215.6762 KOps/s | 212.0546 KOps/s | |
test_tc_second_layer_tensor | 28.1930μs | 2.7470μs | 364.0282 KOps/s | 360.8835 KOps/s | |
test_tc_second_layer_nontensor | 31.6590μs | 5.8864μs | 169.8840 KOps/s | 165.6390 KOps/s | |
test_unbind | 12.5131ms | 7.5823ms | 131.8854 Ops/s | 75.4141 Ops/s | |
test_full_like | 18.4921ms | 12.4539ms | 80.2964 Ops/s | 136.7223 Ops/s | |
test_zeros_like | 17.1823ms | 7.7189ms | 129.5513 Ops/s | 362.6966 Ops/s | |
test_ones_like | 15.9530ms | 7.5213ms | 132.9563 Ops/s | 316.4179 Ops/s | |
test_clone | 18.8352ms | 9.3441ms | 107.0195 Ops/s | 198.3276 Ops/s | |
test_squeeze | 59.5920μs | 12.1084μs | 82.5876 KOps/s | 77.9240 KOps/s | |
test_unsqueeze | 0.1538ms | 89.2859μs | 11.2000 KOps/s | 10.6004 KOps/s | |
test_split | 0.4312ms | 0.1914ms | 5.2247 KOps/s | 5.0981 KOps/s | |
test_permute | 0.3569ms | 0.2219ms | 4.5059 KOps/s | 4.5453 KOps/s | |
test_stack | 28.5902ms | 26.1015ms | 38.3120 Ops/s | 36.1989 Ops/s | |
test_cat | 28.9102ms | 26.0643ms | 38.3666 Ops/s | 38.7258 Ops/s |
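
The Change column is empty in the table above, so the comparison between Ops and Ops on Repo HEAD has to be read off by hand. A minimal sketch of one way to do that follows. It assumes the change is defined as (Ops - Ops on Repo HEAD) / Ops on Repo HEAD, which may not match what the benchmark tooling reports; `parse_ops` and `relative_change` are hypothetical helpers, not part of tensordict.

```python
# Hypothetical helpers for reading the table; not part of tensordict
# or of its benchmark tooling.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '42.5436 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Assumed definition: (PR - HEAD) / HEAD. Positive means the PR is faster."""
    pr, head = parse_ops(ops), parse_ops(ops_head)
    return (pr - head) / head


# Two rows copied from the table above: (Ops, Ops on Repo HEAD).
rows = {
    "test_plain_set_nested": ("42.5436 KOps/s", "39.1417 KOps/s"),
    "test_zeros_like": ("129.5513 Ops/s", "362.6966 Ops/s"),
}

for name, (ops, head) in rows.items():
    print(f"{name}: {relative_change(ops, head):+.1%}")
```

Single-run differences of a few percent are probably within noise, given how far some Max values sit from the corresponding Mean, so only the larger gaps are likely worth investigating.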
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1517ms | 17.1475μs | 58.3176 KOps/s | 55.0363 KOps/s | |
test_plain_set_stack_nested | 40.7400μs | 17.1156μs | 58.4261 KOps/s | 54.4979 KOps/s | |
test_plain_set_nested_inplace | 47.4210μs | 18.2672μs | 54.7430 KOps/s | 51.9111 KOps/s | |
test_plain_set_stack_nested_inplace | 44.8910μs | 18.0773μs | 55.3180 KOps/s | 51.6800 KOps/s | |
test_items | 24.0310μs | 2.8303μs | 353.3226 KOps/s | 345.1482 KOps/s | |
test_items_nested | 0.3859ms | 0.3399ms | 2.9423 KOps/s | 2.9549 KOps/s | |
test_items_nested_locked | 0.3956ms | 0.3394ms | 2.9467 KOps/s | 2.9420 KOps/s | |
test_items_nested_leaf | 87.3710μs | 62.1304μs | 16.0952 KOps/s | 15.9758 KOps/s | |
test_items_stack_nested | 0.3922ms | 0.3449ms | 2.8997 KOps/s | 2.9127 KOps/s | |
test_items_stack_nested_leaf | 96.9420μs | 65.0871μs | 15.3640 KOps/s | 15.4783 KOps/s | |
test_items_stack_nested_locked | 0.4929ms | 0.3451ms | 2.8979 KOps/s | 2.9286 KOps/s | |
test_keys | 25.8600μs | 3.3984μs | 294.2552 KOps/s | 289.3804 KOps/s | |
test_keys_nested | 0.1032ms | 70.4676μs | 14.1909 KOps/s | 14.4181 KOps/s | |
test_keys_nested_locked | 2.3615ms | 76.6026μs | 13.0544 KOps/s | 13.0627 KOps/s | |
test_keys_nested_leaf | 0.1043ms | 61.3138μs | 16.3096 KOps/s | 16.1535 KOps/s | |
test_keys_stack_nested | 0.1134ms | 71.6165μs | 13.9633 KOps/s | 13.7908 KOps/s | |
test_keys_stack_nested_leaf | 94.1210μs | 63.6168μs | 15.7191 KOps/s | 15.8691 KOps/s | |
test_keys_stack_nested_locked | 0.1061ms | 76.6588μs | 13.0448 KOps/s | 12.8398 KOps/s | |
test_values | 5.6152μs | 0.8609μs | 1.1615 MOps/s | 1.1419 MOps/s | |
test_values_nested | 84.6620μs | 48.7275μs | 20.5223 KOps/s | 20.5075 KOps/s | |
test_values_nested_locked | 84.0020μs | 49.9683μs | 20.0127 KOps/s | 19.9963 KOps/s | |
test_values_nested_leaf | 67.8910μs | 42.4974μs | 23.5308 KOps/s | 23.5198 KOps/s | |
test_values_stack_nested | 92.0120μs | 50.4062μs | 19.8388 KOps/s | 19.8073 KOps/s | |
test_values_stack_nested_leaf | 75.5120μs | 43.8721μs | 22.7936 KOps/s | 22.6345 KOps/s | |
test_values_stack_nested_locked | 0.1021ms | 51.3167μs | 19.4868 KOps/s | 19.4054 KOps/s | |
test_membership | 2.3016μs | 0.5090μs | 1.9646 MOps/s | 1.9964 MOps/s | |
test_membership_nested | 15.4100μs | 1.8727μs | 533.9772 KOps/s | 524.8455 KOps/s | |
test_membership_nested_leaf | 10.1833μs | 1.8494μs | 540.7154 KOps/s | 541.0576 KOps/s | |
test_membership_stacked_nested | 32.4710μs | 1.9414μs | 515.0924 KOps/s | 514.5154 KOps/s | |
test_membership_stacked_nested_leaf | 24.7910μs | 1.9563μs | 511.1659 KOps/s | 511.4671 KOps/s | |
test_membership_nested_last | 31.1900μs | 2.9817μs | 335.3756 KOps/s | 334.2882 KOps/s | |
test_membership_nested_leaf_last | 33.9010μs | 2.9618μs | 337.6335 KOps/s | 337.6237 KOps/s | |
test_membership_stacked_nested_last | 39.6610μs | 2.9999μs | 333.3495 KOps/s | 287.3409 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.3910μs | 2.9499μs | 338.9966 KOps/s | 287.5203 KOps/s | |
test_nested_getleaf | 34.3210μs | 6.0716μs | 164.7015 KOps/s | 162.4108 KOps/s | |
test_nested_get | 35.2710μs | 5.7023μs | 175.3685 KOps/s | 173.4923 KOps/s | |
test_stacked_getleaf | 29.7400μs | 6.0218μs | 166.0647 KOps/s | 163.7060 KOps/s | |
test_stacked_get | 37.9000μs | 5.6100μs | 178.2530 KOps/s | 175.2396 KOps/s | |
test_nested_getitemleaf | 32.6900μs | 6.0595μs | 165.0313 KOps/s | 163.5802 KOps/s | |
test_nested_getitem | 36.5910μs | 5.7529μs | 173.8242 KOps/s | 175.5089 KOps/s | |
test_stacked_getitemleaf | 37.2100μs | 6.0221μs | 166.0546 KOps/s | 162.4451 KOps/s | |
test_stacked_getitem | 32.5310μs | 5.6681μs | 176.4255 KOps/s | 172.7817 KOps/s | |
test_lock_nested | 5.1619ms | 0.4273ms | 2.3403 KOps/s | 2.3246 KOps/s | |
test_lock_stack_nested | 0.4405ms | 0.3848ms | 2.5987 KOps/s | 2.5551 KOps/s | |
test_unlock_nested | 0.7637ms | 0.3601ms | 2.7774 KOps/s | 2.7592 KOps/s | |
test_unlock_stack_nested | 0.3761ms | 0.3241ms | 3.0852 KOps/s | 3.0567 KOps/s | |
test_flatten_speed | 0.1561ms | 76.1800μs | 13.1268 KOps/s | 13.0477 KOps/s | |
test_unflatten_speed | 0.3566ms | 0.3179ms | 3.1454 KOps/s | 3.0914 KOps/s | |
test_common_ops | 1.5553ms | 1.2444ms | 803.6185 Ops/s | 755.4151 Ops/s | |
test_creation | 23.4100μs | 1.4847μs | 673.5264 KOps/s | 671.1538 KOps/s | |
test_creation_empty | 41.1010μs | 15.9574μs | 62.6669 KOps/s | 53.8095 KOps/s | |
test_creation_nested_1 | 65.7110μs | 17.5298μs | 57.0456 KOps/s | 49.7706 KOps/s | |
test_creation_nested_2 | 57.1620μs | 20.0053μs | 49.9868 KOps/s | 43.4168 KOps/s | |
test_clone | 65.6610μs | 28.5231μs | 35.0592 KOps/s | 33.3449 KOps/s | |
test_getitem[int] | 1.2499ms | 15.7499μs | 63.4924 KOps/s | 61.5917 KOps/s | |
test_getitem[slice_int] | 0.1240ms | 27.2373μs | 36.7144 KOps/s | 36.4322 KOps/s | |
test_getitem[range] | 0.2870ms | 0.1088ms | 9.1928 KOps/s | 8.9759 KOps/s | |
test_getitem[tuple] | 0.1158ms | 23.4333μs | 42.6743 KOps/s | 41.9694 KOps/s | |
test_getitem[list] | 0.1859ms | 98.3322μs | 10.1696 KOps/s | 10.0307 KOps/s | |
test_setitem_dim[int] | 67.9210μs | 45.0706μs | 22.1874 KOps/s | 22.2760 KOps/s | |
test_setitem_dim[slice_int] | 0.1025ms | 66.9515μs | 14.9362 KOps/s | 15.0777 KOps/s | |
test_setitem_dim[range] | 0.1638ms | 0.1278ms | 7.8258 KOps/s | 7.8374 KOps/s | |
test_setitem_dim[tuple] | 86.6610μs | 60.3174μs | 16.5790 KOps/s | 16.4583 KOps/s | |
test_setitem | 93.4210μs | 41.6709μs | 23.9976 KOps/s | 22.3965 KOps/s | |
test_set | 69.5310μs | 39.9387μs | 25.0384 KOps/s | 23.1146 KOps/s | |
test_set_shared | 0.3546ms | 52.6819μs | 18.9819 KOps/s | 18.4579 KOps/s | |
test_update | 97.0120μs | 49.8032μs | 20.0790 KOps/s | 18.5230 KOps/s | |
test_update_nested | 98.2220μs | 57.2448μs | 17.4688 KOps/s | 16.4095 KOps/s | |
test_update__nested | 0.1739ms | 61.4051μs | 16.2853 KOps/s | 16.1746 KOps/s | |
test_set_nested | 82.2920μs | 43.5732μs | 22.9499 KOps/s | 21.6484 KOps/s | |
test_set_nested_new | 85.9410μs | 47.4392μs | 21.0796 KOps/s | 20.4008 KOps/s | |
test_select | 93.8010μs | 60.3386μs | 16.5731 KOps/s | 15.9872 KOps/s | |
test_select_nested | 0.4263ms | 42.2588μs | 23.6637 KOps/s | 23.6528 KOps/s | |
test_exclude_nested | 96.1120μs | 59.0294μs | 16.9407 KOps/s | 16.0882 KOps/s | |
test_empty[True] | 0.3096ms | 0.2575ms | 3.8829 KOps/s | 3.8296 KOps/s | |
test_empty[False] | 3.2620μs | 0.7376μs | 1.3558 MOps/s | 1.3384 MOps/s | |
test_to | 61.4410μs | 27.1775μs | 36.7952 KOps/s | 36.3677 KOps/s | |
test_to_nonblocking | 54.9210μs | 25.6531μs | 38.9817 KOps/s | 38.2869 KOps/s | |
test_unbind_speed | 0.3361ms | 0.2801ms | 3.5708 KOps/s | 3.5519 KOps/s | |
test_unbind_speed_stack0 | 0.3249ms | 0.2780ms | 3.5968 KOps/s | 3.5610 KOps/s | |
test_unbind_speed_stack1 | 92.7001ms | 0.7050ms | 1.4183 KOps/s | 1.4153 KOps/s | |
test_split | 95.2307ms | 2.2037ms | 453.7753 Ops/s | 454.0773 Ops/s | |
test_chunk | 95.4556ms | 2.2161ms | 451.2427 Ops/s | 452.6570 Ops/s | |
test_creation[device0] | 0.3428ms | 0.1286ms | 7.7779 KOps/s | 7.7598 KOps/s | |
test_creation_from_tensor | 0.3510ms | 0.1296ms | 7.7180 KOps/s | 7.6752 KOps/s | |
test_add_one[memmap_tensor0] | 0.2333ms | 8.7528μs | 114.2497 KOps/s | 112.7858 KOps/s | |
test_contiguous[memmap_tensor0] | 21.5710μs | 2.2284μs | 448.7479 KOps/s | 442.3126 KOps/s | |
test_stack[memmap_tensor0] | 37.6810μs | 7.1204μs | 140.4410 KOps/s | 141.1180 KOps/s | |
test_memmaptd_index | 1.2196ms | 0.4431ms | 2.2569 KOps/s | 2.2662 KOps/s | |
test_memmaptd_index_astensor | 0.7785ms | 0.5137ms | 1.9467 KOps/s | 1.9276 KOps/s | |
test_memmaptd_index_op | 1.4350ms | 1.0412ms | 960.4328 Ops/s | 929.9012 Ops/s | |
test_serialize_model | 0.1304s | 0.1298s | 7.7067 Ops/s | 7.7258 Ops/s | |
test_serialize_model_pickle | 1.3477s | 1.2121s | 0.8250 Ops/s | 0.8221 Ops/s | |
test_serialize_weights | 0.1306s | 0.1295s | 7.7235 Ops/s | 7.7620 Ops/s | |
test_serialize_weights_returnearly | 0.2405s | 63.1841ms | 15.8268 Ops/s | 15.7426 Ops/s | |
test_serialize_weights_pickle | 1.3472s | 1.1853s | 0.8437 Ops/s | 0.8197 Ops/s | |
test_reshape_pytree | 62.2210μs | 36.6218μs | 27.3061 KOps/s | 27.0492 KOps/s | |
test_reshape_td | 95.2420μs | 43.1747μs | 23.1617 KOps/s | 22.5901 KOps/s | |
test_view_pytree | 67.3510μs | 35.6397μs | 28.0586 KOps/s | 27.3141 KOps/s | |
test_view_td | 86.1810μs | 46.7242μs | 21.4022 KOps/s | 20.8592 KOps/s | |
test_unbind_pytree | 68.9010μs | 34.7130μs | 28.8077 KOps/s | 29.0452 KOps/s | |
test_unbind_td | 0.5209ms | 43.6801μs | 22.8937 KOps/s | 23.0665 KOps/s | |
test_split_pytree | 85.3420μs | 46.0940μs | 21.6948 KOps/s | 21.1232 KOps/s | |
test_split_td | 0.6926ms | 55.2295μs | 18.1062 KOps/s | 17.5159 KOps/s | |
test_add_pytree | 0.1080ms | 56.7415μs | 17.6238 KOps/s | 16.7382 KOps/s | |
test_add_td | 0.2288ms | 97.1708μs | 10.2912 KOps/s | 10.1511 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2574ms | 0.1612ms | 6.2020 KOps/s | 6.0281 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.5605ms | 0.1604ms | 6.2341 KOps/s | 6.3055 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2152ms | 0.1534ms | 6.5176 KOps/s | 6.1371 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.5797ms | 0.1865ms | 5.3620 KOps/s | 4.9255 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.4046ms | 21.7103μs | 46.0611 KOps/s | 47.5989 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1276ms | 47.7419μs | 20.9459 KOps/s | 20.2171 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4531ms | 64.4790μs | 15.5089 KOps/s | 16.0400 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4355ms | 49.9122μs | 20.0352 KOps/s | 20.0599 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4106ms | 0.3172ms | 3.1525 KOps/s | 3.0922 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6309ms | 0.2296ms | 4.3558 KOps/s | 4.3071 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1774ms | 0.1269ms | 7.8791 KOps/s | 7.7935 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4749ms | 64.8658μs | 15.4164 KOps/s | 15.3428 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.7312ms | 0.3258ms | 3.0690 KOps/s | 3.0265 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 1.0231ms | 0.6364ms | 1.5713 KOps/s | 1.4663 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.6787ms | 0.2806ms | 3.5636 KOps/s | 3.5245 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3602ms | 0.3188ms | 3.1372 KOps/s | 3.0786 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1612ms | 76.5751μs | 13.0591 KOps/s | 12.9942 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1931ms | 0.1291ms | 7.7446 KOps/s | 7.7718 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6483ms | 0.5344ms | 1.8712 KOps/s | 1.6820 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3933ms | 0.3258ms | 3.0694 KOps/s | 3.0107 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 86.9810μs | 20.5835μs | 48.5826 KOps/s | 48.2196 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 94.6920μs | 37.4553μs | 26.6985 KOps/s | 25.6605 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1541ms | 68.5090μs | 14.5966 KOps/s | 14.4313 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.4313ms | 51.2940μs | 19.4955 KOps/s | 19.5822 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.4331ms | 0.8431ms | 1.1861 KOps/s | 1.0845 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.6000ms | 3.2498ms | 307.7135 Ops/s | 294.0750 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.4478ms | 0.8494ms | 1.1773 KOps/s | 1.0736 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.3706ms | 3.2046ms | 312.0560 Ops/s | 283.9896 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1566ms | 0.1176ms | 8.5070 KOps/s | 8.0505 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1826ms | 62.5291μs | 15.9926 KOps/s | 15.7138 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1642ms | 0.1130ms | 8.8475 KOps/s | 8.4767 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1056ms | 44.3390μs | 22.5535 KOps/s | 21.2692 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1758ms | 0.1196ms | 8.3643 KOps/s | 8.2274 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 85.8920μs | 43.4603μs | 23.0095 KOps/s | 21.1474 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1874ms | 0.1447ms | 6.9090 KOps/s | 6.6865 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1527ms | 25.5059μs | 39.2066 KOps/s | 37.6781 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1818ms | 0.1443ms | 6.9302 KOps/s | 6.9680 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 62.6710μs | 21.6055μs | 46.2844 KOps/s | 48.2264 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1913ms | 0.1414ms | 7.0735 KOps/s | 6.8211 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 55.5510μs | 21.0783μs | 47.4422 KOps/s | 48.7111 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2740ms | 0.1451ms | 6.8931 KOps/s | 6.5633 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4872ms | 24.3249μs | 41.1101 KOps/s | 38.1537 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2046ms | 0.1394ms | 7.1718 KOps/s | 6.7676 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 58.4210μs | 20.6386μs | 48.4530 KOps/s | 48.0781 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1818ms | 0.1394ms | 7.1734 KOps/s | 6.7964 KOps/s | |
test_compile_indexing[int-pytree-eager] | 52.7310μs | 21.2432μs | 47.0740 KOps/s | 48.4152 KOps/s | |
test_mod_add[eager] | 65.6420μs | 32.3338μs | 30.9274 KOps/s | 27.6327 KOps/s | |
test_mod_add[compile] | 0.1205ms | 80.6134μs | 12.4049 KOps/s | 11.4541 KOps/s | |
test_mod_add[compile-overhead] | 0.3052ms | 0.1525ms | 6.5557 KOps/s | 6.1021 KOps/s | |
test_mod_wrap[eager] | 0.3452ms | 0.2393ms | 4.1785 KOps/s | 3.8287 KOps/s | |
test_mod_wrap[compile] | 1.5572ms | 0.3069ms | 3.2581 KOps/s | 3.1779 KOps/s | |
test_mod_wrap[compile-overhead] | 7.6448ms | 4.0528ms | 246.7436 Ops/s | 246.2503 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4795ms | 1.3117ms | 762.3685 Ops/s | 696.9353 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5539ms | 1.3182ms | 758.6131 Ops/s | 690.9030 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3355ms | 0.8937ms | 1.1190 KOps/s | 977.1837 Ops/s | |
test_seq_add[eager] | 0.1633ms | 0.1043ms | 9.5921 KOps/s | 9.4509 KOps/s | |
test_seq_add[compile] | 0.3339ms | 95.9950μs | 10.4172 KOps/s | 10.9596 KOps/s | |
test_seq_add[compile-overhead] | 0.1764ms | 0.1243ms | 8.0471 KOps/s | 8.0774 KOps/s | |
test_seq_wrap[eager] | 0.4952ms | 0.3935ms | 2.5415 KOps/s | 2.4247 KOps/s | |
test_seq_wrap[compile] | 0.3492ms | 0.3128ms | 3.1969 KOps/s | 3.1037 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2694ms | 0.2188ms | 4.5697 KOps/s | 4.4759 KOps/s | |
test_func_call_runtime[False-eager] | 0.8018ms | 0.7218ms | 1.3854 KOps/s | 1.2778 KOps/s | |
test_func_call_runtime[False-compile] | 0.8300ms | 0.7794ms | 1.2831 KOps/s | 1.2383 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4249ms | 0.3589ms | 2.7862 KOps/s | 2.7545 KOps/s | |
test_func_call_runtime[True-eager] | 0.9579ms | 0.8762ms | 1.1413 KOps/s | 1.1093 KOps/s | |
test_func_call_runtime[True-compile] | 0.9467ms | 0.8156ms | 1.2260 KOps/s | 1.2150 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4316ms | 0.3806ms | 2.6276 KOps/s | 2.5965 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7677ms | 0.7095ms | 1.4094 KOps/s | 1.2975 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8649ms | 0.7964ms | 1.2557 KOps/s | 1.2426 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4159ms | 0.3593ms | 2.7832 KOps/s | 2.7377 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1181ms | 0.9906ms | 1.0095 KOps/s | 999.6554 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8999ms | 0.8412ms | 1.1888 KOps/s | 1.1489 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4484ms | 0.4025ms | 2.4848 KOps/s | 2.4169 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5408ms | 2.0429ms | 489.5014 Ops/s | 482.6524 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9299ms | 0.8566ms | 1.1674 KOps/s | 1.1409 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4674ms | 0.4065ms | 2.4601 KOps/s | 2.4160 KOps/s | |
test_distributed | 5.4632ms | 0.2290ms | 4.3671 KOps/s | 8.5867 KOps/s | |
test_tdmodule | 48.6110μs | 15.0568μs | 66.4152 KOps/s | 58.6941 KOps/s | |
test_tdmodule_dispatch | 42.0010μs | 29.1315μs | 34.3271 KOps/s | 31.0034 KOps/s | |
test_tdseq | 34.3210μs | 16.0410μs | 62.3404 KOps/s | 57.3394 KOps/s | |
test_tdseq_dispatch | 52.3710μs | 32.0821μs | 31.1701 KOps/s | 28.3368 KOps/s | |
test_instantiation_functorch | 2.0309ms | 1.8671ms | 535.5892 Ops/s | 528.4144 Ops/s | |
test_exec_functorch | 0.2556ms | 0.2074ms | 4.8217 KOps/s | 4.7783 KOps/s | |
test_exec_functional_call | 0.3090ms | 0.2010ms | 4.9764 KOps/s | 4.8287 KOps/s | |
test_exec_td_decorator | 0.4416ms | 0.2608ms | 3.8337 KOps/s | 3.8678 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7976ms | 0.6663ms | 1.5008 KOps/s | 1.4660 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7960ms | 0.6673ms | 1.4987 KOps/s | 1.4748 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6858ms | 0.5834ms | 1.7140 KOps/s | 1.6910 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6866ms | 0.5810ms | 1.7211 KOps/s | 1.6921 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.7858ms | 19.1357ms | 52.2583 Ops/s | 51.6777 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.2965ms | 19.1114ms | 52.3249 Ops/s | 51.6290 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.9821ms | 19.0215ms | 52.5720 Ops/s | 52.1584 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.0894ms | 18.9667ms | 52.7240 Ops/s | 52.0173 Ops/s | |
test_to_module_speed[True] | 1.5217ms | 0.9990ms | 1.0010 KOps/s | 988.5761 Ops/s | |
test_to_module_speed[False] | 1.3945ms | 0.9792ms | 1.0212 KOps/s | 1.0280 KOps/s | |
test_tc_init | 61.7110μs | 33.5859μs | 29.7744 KOps/s | 27.3196 KOps/s | |
test_tc_init_nested | 0.1068ms | 70.7227μs | 14.1397 KOps/s | 13.5939 KOps/s | |
test_tc_first_layer_tensor | 4.8043μs | 0.6847μs | 1.4606 MOps/s | 1.4682 MOps/s | |
test_tc_first_layer_nontensor | 40.6310μs | 2.2506μs | 444.3221 KOps/s | 445.4248 KOps/s | |
test_tc_second_layer_tensor | 10.5133μs | 1.3935μs | 717.6019 KOps/s | 735.2682 KOps/s | |
test_tc_second_layer_nontensor | 24.0710μs | 2.9746μs | 336.1808 KOps/s | 338.8458 KOps/s | |
test_unbind | 0.1909s | 9.4946ms | 105.3233 Ops/s | 91.6300 Ops/s | |
test_full_like | 0.6554ms | 0.5756ms | 1.7373 KOps/s | 1.7538 KOps/s | |
test_zeros_like | 0.2753ms | 0.1979ms | 5.0531 KOps/s | 5.0498 KOps/s | |
test_ones_like | 0.2323ms | 0.1978ms | 5.0567 KOps/s | 5.0555 KOps/s | |
test_clone | 0.4552ms | 0.4146ms | 2.4120 KOps/s | 2.4104 KOps/s | |
test_squeeze | 36.1510μs | 9.6750μs | 103.3588 KOps/s | 100.4966 KOps/s | |
test_unsqueeze | 0.2460ms | 75.2998μs | 13.2802 KOps/s | 12.9466 KOps/s | |
test_split | 0.4227ms | 0.1555ms | 6.4327 KOps/s | 6.3061 KOps/s | |
test_permute | 0.2292ms | 0.1787ms | 5.5958 KOps/s | 5.4530 KOps/s | |
test_stack | 1.2574ms | 0.8571ms | 1.1668 KOps/s | 1.1621 KOps/s | |
test_cat | 1.2619ms | 1.2311ms | 812.2899 Ops/s | 812.0360 Ops/s |
vmoens added a commit that referenced this pull request on Oct 3, 2024
ghstack-source-id: a04b27dca1e37fda18c80247f06852a490dc4f5c Pull Request resolved: #1023
vmoens added a commit that referenced this pull request on Oct 3, 2024
ghstack-source-id: 6f6bb7a0c01e10918089bfc34053137b9cda1bb4 Pull Request resolved: #1023
vmoens added a commit that referenced this pull request on Oct 4, 2024
ghstack-source-id: 110d6909c8ded38985ae357fbf3b9b3e7a675bc4 Pull Request resolved: #1023
vmoens added a commit that referenced this pull request on Oct 4, 2024
ghstack-source-id: 9ef95a13bd47c69983d715594e8d7a4e0828493d Pull Request resolved: #1023
vmoens added a commit that referenced this pull request on Oct 4, 2024
ghstack-source-id: d148776d43eb42889ae42a871e42672b24376846 Pull Request resolved: #1023
vmoens added a commit that referenced this pull request on Oct 4, 2024
ghstack-source-id: f3d8ed774074f34cea6939a36c3c69277abfd96c Pull Request resolved: #1023
vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: 2dbccbcbf52eafca9d230cb86b26669d89c96a53 Pull Request resolved: #1023
Labels: CLA Signed
Stack from ghstack (oldest at bottom):