[Feature] Make tensordict not incompatible with torch.compile #629
Open
vmoens wants to merge 6 commits into main from disable_compile_get_set
facebook-github-bot added the CLA Signed label on Jan 18, 2024
vmoens added the enhancement and Refactor labels on Jan 18, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.9624ms | 27.7038μs | 36.0962 KOps/s | 57.0677 KOps/s | |
test_plain_set_stack_nested | 4.8919ms | 0.1887ms | 5.2989 KOps/s | 6.7445 KOps/s | |
test_plain_set_nested_inplace | 7.2839ms | 35.0713μs | 28.5133 KOps/s | 49.4704 KOps/s | |
test_plain_set_stack_nested_inplace | 0.4255ms | 0.2028ms | 4.9298 KOps/s | 5.3305 KOps/s | |
test_items | 39.7840μs | 2.6627μs | 375.5537 KOps/s | 391.3444 KOps/s | |
test_items_nested | 0.4524ms | 0.2713ms | 3.6863 KOps/s | 3.7077 KOps/s | |
test_items_nested_locked | 0.8873ms | 0.2742ms | 3.6464 KOps/s | 3.6917 KOps/s | |
test_items_nested_leaf | 0.3207ms | 0.1671ms | 5.9859 KOps/s | 6.0085 KOps/s | |
test_items_stack_nested | 2.1264ms | 1.3592ms | 735.7403 Ops/s | 743.7110 Ops/s | |
test_items_stack_nested_leaf | 1.5982ms | 1.2197ms | 819.8906 Ops/s | 832.9116 Ops/s | |
test_items_stack_nested_locked | 1.5737ms | 0.8880ms | 1.1262 KOps/s | 1.1282 KOps/s | |
test_keys | 45.4350μs | 3.9877μs | 250.7681 KOps/s | 230.5345 KOps/s | |
test_keys_nested | 1.6000ms | 0.1518ms | 6.5883 KOps/s | 6.6549 KOps/s | |
test_keys_nested_locked | 0.3124ms | 0.1561ms | 6.4080 KOps/s | 6.5560 KOps/s | |
test_keys_nested_leaf | 0.2964ms | 0.1332ms | 7.5066 KOps/s | 7.6654 KOps/s | |
test_keys_stack_nested | 2.2205ms | 1.7202ms | 581.3283 Ops/s | 778.3391 Ops/s | |
test_keys_stack_nested_leaf | 1.9988ms | 1.7096ms | 584.9332 Ops/s | 781.3187 Ops/s | |
test_keys_stack_nested_locked | 1.5874ms | 1.1823ms | 845.8211 Ops/s | 1.1666 KOps/s | |
test_values | 11.1107μs | 1.1553μs | 865.5752 KOps/s | 848.0163 KOps/s | |
test_values_nested | 0.1005ms | 52.3577μs | 19.0994 KOps/s | 19.1742 KOps/s | |
test_values_nested_locked | 0.1160ms | 52.4049μs | 19.0822 KOps/s | 19.1326 KOps/s | |
test_values_nested_leaf | 0.1312ms | 46.9922μs | 21.2801 KOps/s | 21.2326 KOps/s | |
test_values_stack_nested | 1.7360ms | 1.0465ms | 955.5801 Ops/s | 962.2342 Ops/s | |
test_values_stack_nested_leaf | 1.2081ms | 1.0343ms | 966.8352 Ops/s | 851.6466 Ops/s | |
test_values_stack_nested_locked | 1.0860ms | 0.6152ms | 1.6255 KOps/s | 1.6216 KOps/s | |
test_membership | 41.4580μs | 1.3472μs | 742.2952 KOps/s | 740.5575 KOps/s | |
test_membership_nested | 37.2100μs | 3.5322μs | 283.1068 KOps/s | 289.6690 KOps/s | |
test_membership_nested_leaf | 50.8450μs | 3.5707μs | 280.0580 KOps/s | 286.1696 KOps/s | |
test_membership_stacked_nested | 37.5500μs | 12.0509μs | 82.9811 KOps/s | 82.7993 KOps/s | |
test_membership_stacked_nested_leaf | 48.9720μs | 11.9517μs | 83.6703 KOps/s | 83.4408 KOps/s | |
test_membership_nested_last | 49.1220μs | 12.1961μs | 81.9932 KOps/s | 149.3905 KOps/s | |
test_membership_nested_leaf_last | 69.5100μs | 12.1688μs | 82.1771 KOps/s | 148.5243 KOps/s | |
test_membership_stacked_nested_last | 0.4390ms | 0.3001ms | 3.3319 KOps/s | 5.5994 KOps/s | |
test_membership_stacked_nested_leaf_last | 61.8060μs | 19.5629μs | 51.1172 KOps/s | 69.6782 KOps/s | |
test_nested_getleaf | 57.7890μs | 15.8113μs | 63.2461 KOps/s | 85.4221 KOps/s | |
test_nested_get | 56.6760μs | 15.2527μs | 65.5621 KOps/s | 93.1050 KOps/s | |
test_stacked_getleaf | 0.8113ms | 0.4260ms | 2.3473 KOps/s | 2.4712 KOps/s | |
test_stacked_get | 1.5193ms | 0.4024ms | 2.4848 KOps/s | 2.7141 KOps/s | |
test_nested_getitemleaf | 47.8800μs | 17.4118μs | 57.4324 KOps/s | 81.6028 KOps/s | |
test_nested_getitem | 69.6400μs | 16.8466μs | 59.3593 KOps/s | 85.5372 KOps/s | |
test_stacked_getitemleaf | 0.7077ms | 0.4120ms | 2.4269 KOps/s | 2.4606 KOps/s | |
test_stacked_getitem | 0.5207ms | 0.3769ms | 2.6531 KOps/s | 2.6654 KOps/s | |
test_lock_nested | 2.4800ms | 0.7109ms | 1.4067 KOps/s | 3.0126 KOps/s | |
test_lock_stack_nested | 94.3379ms | 9.5755ms | 104.4329 Ops/s | 162.7397 Ops/s | |
test_unlock_nested | 76.9607ms | 0.7832ms | 1.2769 KOps/s | 3.0029 KOps/s | |
test_unlock_stack_nested | 96.5545ms | 9.7599ms | 102.4605 Ops/s | 160.2095 Ops/s | |
test_flatten_speed | 0.8307ms | 0.4852ms | 2.0609 KOps/s | 2.7229 KOps/s | |
test_unflatten_speed | 5.7637ms | 0.9402ms | 1.0636 KOps/s | 2.1539 KOps/s | |
test_common_ops | 7.8030ms | 1.6750ms | 597.0204 Ops/s | 1.4523 KOps/s | |
test_creation | 3.7145ms | 7.4896μs | 133.5192 KOps/s | 538.3542 KOps/s | |
test_creation_empty | 3.0530ms | 29.7440μs | 33.6202 KOps/s | 90.3874 KOps/s | |
test_creation_nested_1 | 88.5660μs | 37.8311μs | 26.4333 KOps/s | 73.2492 KOps/s | |
test_creation_nested_2 | 87.5440μs | 46.9494μs | 21.2995 KOps/s | 58.5265 KOps/s | |
test_clone | 97.1720μs | 25.8067μs | 38.7497 KOps/s | 79.0003 KOps/s | |
test_getitem[int] | 69.4900μs | 28.1394μs | 35.5373 KOps/s | 85.9274 KOps/s | |
test_getitem[slice_int] | 0.1089ms | 44.8612μs | 22.2910 KOps/s | 32.6422 KOps/s | |
test_getitem[range] | 0.2658ms | 70.3316μs | 14.2184 KOps/s | 18.6310 KOps/s | |
test_getitem[tuple] | 77.7360μs | 38.0278μs | 26.2965 KOps/s | 51.0508 KOps/s | |
test_getitem[list] | 0.1849ms | 61.6060μs | 16.2322 KOps/s | 27.9649 KOps/s | |
test_setitem_dim[int] | 0.1118ms | 36.0163μs | 27.7652 KOps/s | 31.2554 KOps/s | |
test_setitem_dim[slice_int] | 0.1005ms | 62.4641μs | 16.0092 KOps/s | 17.0424 KOps/s | |
test_setitem_dim[range] | 0.1391ms | 83.2274μs | 12.0153 KOps/s | 12.6266 KOps/s | |
test_setitem_dim[tuple] | 82.5140μs | 51.4259μs | 19.4454 KOps/s | 21.0705 KOps/s | |
test_setitem | 0.1110ms | 38.4544μs | 26.0048 KOps/s | 51.2988 KOps/s | |
test_set | 0.1018ms | 42.5675μs | 23.4921 KOps/s | 52.7104 KOps/s | |
test_set_shared | 4.4933ms | 0.1749ms | 5.7184 KOps/s | 7.0290 KOps/s | |
test_update | 4.1515ms | 37.8643μs | 26.4101 KOps/s | 33.4717 KOps/s | |
test_update_nested | 0.1173ms | 51.3947μs | 19.4573 KOps/s | 32.6822 KOps/s | |
test_set_nested | 0.1100ms | 39.9198μs | 25.0502 KOps/s | 47.7715 KOps/s | |
test_set_nested_new | 0.1222ms | 56.2474μs | 17.7786 KOps/s | 40.0586 KOps/s | |
test_select | 0.1975ms | 87.0000μs | 11.4943 KOps/s | 26.9100 KOps/s | |
test_select_nested | 0.2986ms | 0.1594ms | 6.2732 KOps/s | 17.4199 KOps/s | |
test_exclude_nested | 0.3584ms | 0.2231ms | 4.4824 KOps/s | 8.5904 KOps/s | |
test_empty[True] | 0.6211ms | 0.5152ms | 1.9411 KOps/s | 2.4676 KOps/s | |
test_empty[False] | 58.6830μs | 5.9061μs | 169.3155 KOps/s | 966.9026 KOps/s | |
test_unbind_speed | 0.6811ms | 0.5834ms | 1.7140 KOps/s | 4.1296 KOps/s | |
test_unbind_speed_stack0 | 86.7416ms | 6.6785ms | 149.7347 Ops/s | 327.7127 Ops/s | |
test_unbind_speed_stack1 | 36.7190μs | 1.9928μs | 501.7968 KOps/s | 502.3872 KOps/s | |
test_split | 85.5221ms | 2.7732ms | 360.5945 Ops/s | 599.1784 Ops/s | |
test_chunk | 2.7161ms | 2.4786ms | 403.4611 Ops/s | 645.0124 Ops/s | |
test_creation[device0] | 3.7239ms | 0.1077ms | 9.2814 KOps/s | 9.8038 KOps/s | |
test_creation_from_tensor | 0.2622ms | 82.2736μs | 12.1546 KOps/s | 11.8515 KOps/s | |
test_add_one[memmap_tensor0] | 0.1553ms | 5.6606μs | 176.6589 KOps/s | 116.9611 KOps/s | |
test_contiguous[memmap_tensor0] | 21.8910μs | 0.6355μs | 1.5735 MOps/s | 1.3583 MOps/s | |
test_stack[memmap_tensor0] | 63.7100μs | 3.7438μs | 267.1051 KOps/s | 244.7345 KOps/s | |
test_memmaptd_index | 1.0136ms | 0.2624ms | 3.8110 KOps/s | 3.9751 KOps/s | |
test_memmaptd_index_astensor | 0.7541ms | 0.3375ms | 2.9626 KOps/s | 2.9627 KOps/s | |
test_memmaptd_index_op | 1.2319ms | 0.6792ms | 1.4723 KOps/s | 1.3749 KOps/s | |
test_serialize_model | 0.2308s | 0.1567s | 6.3835 Ops/s | 6.3008 Ops/s | |
test_serialize_model_pickle | 0.4705s | 0.3722s | 2.6864 Ops/s | 2.5960 Ops/s | |
test_serialize_weights | 0.2232s | 0.1547s | 6.4648 Ops/s | 8.0144 Ops/s | |
test_serialize_weights_returnearly | 0.2005s | 0.1587s | 6.3008 Ops/s | 7.2512 Ops/s | |
test_serialize_weights_pickle | 0.5443s | 0.4275s | 2.3394 Ops/s | 1.1985 Ops/s | |
test_serialize_weights_filesystem | 0.2001s | 0.1492s | 6.7039 Ops/s | 9.3415 Ops/s | |
test_serialize_model_filesystem | 0.1603s | 0.1402s | 7.1351 Ops/s | 7.7424 Ops/s | |
test_reshape_pytree | 0.1456ms | 20.4039μs | 49.0102 KOps/s | 45.9310 KOps/s | |
test_reshape_td | 0.1212ms | 55.4909μs | 18.0210 KOps/s | 32.8221 KOps/s | |
test_view_pytree | 64.4930μs | 20.8608μs | 47.9367 KOps/s | 46.7769 KOps/s | |
test_view_td | 81.6270ms | 11.4419μs | 87.3984 KOps/s | 86.3245 KOps/s | |
test_unbind_pytree | 3.2863ms | 24.5896μs | 40.6676 KOps/s | 41.1459 KOps/s | |
test_unbind_td | 5.6980ms | 0.1076ms | 9.2940 KOps/s | 28.5040 KOps/s | |
test_split_pytree | 0.1713ms | 23.9277μs | 41.7926 KOps/s | 42.3314 KOps/s | |
test_split_td | 0.3543ms | 79.3469μs | 12.6029 KOps/s | 25.6395 KOps/s | |
test_add_pytree | 97.3920μs | 29.5815μs | 33.8050 KOps/s | 34.1063 KOps/s | |
test_add_td | 0.2568ms | 91.4332μs | 10.9369 KOps/s | 18.6799 KOps/s | |
test_distributed | 0.2583ms | 0.1001ms | 9.9931 KOps/s | 9.9880 KOps/s | |
test_tdmodule | 0.1860ms | 37.8707μs | 26.4056 KOps/s | 43.0295 KOps/s | |
test_tdmodule_dispatch | 0.2354ms | 83.1036μs | 12.0332 KOps/s | 21.6407 KOps/s | |
test_tdseq | 0.1258ms | 40.5654μs | 24.6516 KOps/s | 38.8925 KOps/s | |
test_tdseq_dispatch | 0.4679ms | 88.4263μs | 11.3088 KOps/s | 20.6500 KOps/s | |
test_instantiation_functorch | 2.1114ms | 1.3206ms | 757.2514 Ops/s | 765.5532 Ops/s | |
test_instantiation_td | 1.5931ms | 1.0745ms | 930.6414 Ops/s | 991.8016 Ops/s | |
test_exec_functorch | 0.2405ms | 0.1631ms | 6.1323 KOps/s | 6.4271 KOps/s | |
test_exec_functional_call | 0.3406ms | 0.1554ms | 6.4364 KOps/s | 6.8642 KOps/s | |
test_exec_td | 0.3133ms | 0.1511ms | 6.6165 KOps/s | 7.0415 KOps/s | |
test_exec_td_decorator | 1.1510ms | 0.3290ms | 3.0397 KOps/s | 5.7056 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.4440ms | 0.9690ms | 1.0320 KOps/s | 1.1258 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8025ms | 0.5364ms | 1.8643 KOps/s | 2.1271 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.9616ms | 0.7788ms | 1.2840 KOps/s | 1.2894 KOps/s | |
test_vmap_mlp_speed[False-False] | 2.4685ms | 0.4057ms | 2.4646 KOps/s | 2.5928 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.6295ms | 1.9648ms | 508.9462 Ops/s | 640.4225 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8805ms | 0.6907ms | 1.4478 KOps/s | 1.9432 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.1900ms | 1.5798ms | 632.9859 Ops/s | 770.5046 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6166ms | 0.5061ms | 1.9759 KOps/s | 2.5368 KOps/s | |
test_to_module_speed[True] | 4.2223ms | 2.8899ms | 346.0289 Ops/s | 738.3985 Ops/s | |
test_to_module_speed[False] | 2.9729ms | 2.8410ms | 351.9886 Ops/s | 902.2590 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1263ms | 14.7147μs | 67.9592 KOps/s | 78.7848 KOps/s | |
test_plain_set_stack_nested | 0.1435ms | 0.1243ms | 8.0473 KOps/s | 8.4014 KOps/s | |
test_plain_set_nested_inplace | 41.4320μs | 16.1901μs | 61.7661 KOps/s | 71.9761 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2005ms | 0.1525ms | 6.5563 KOps/s | 6.7538 KOps/s | |
test_items | 20.3310μs | 4.7343μs | 211.2237 KOps/s | 212.1939 KOps/s | |
test_items_nested | 0.3919ms | 0.3366ms | 2.9706 KOps/s | 2.9250 KOps/s | |
test_items_nested_locked | 0.4186ms | 0.3389ms | 2.9511 KOps/s | 2.8995 KOps/s | |
test_items_nested_leaf | 0.2414ms | 0.1996ms | 5.0104 KOps/s | 4.9575 KOps/s | |
test_items_stack_nested | 1.3775ms | 1.3274ms | 753.3418 Ops/s | 751.5390 Ops/s | |
test_items_stack_nested_leaf | 1.2056ms | 1.1658ms | 857.8055 Ops/s | 853.3090 Ops/s | |
test_items_stack_nested_locked | 0.9581ms | 0.9132ms | 1.0951 KOps/s | 1.0874 KOps/s | |
test_keys | 24.6320μs | 4.5489μs | 219.8310 KOps/s | 219.2486 KOps/s | |
test_keys_nested | 1.9141ms | 94.0694μs | 10.6304 KOps/s | 10.6077 KOps/s | |
test_keys_nested_locked | 0.1317ms | 97.6320μs | 10.2425 KOps/s | 10.2018 KOps/s | |
test_keys_nested_leaf | 0.2030ms | 78.0215μs | 12.8170 KOps/s | 12.8437 KOps/s | |
test_keys_stack_nested | 1.3328ms | 1.2802ms | 781.1039 Ops/s | 862.2872 Ops/s | |
test_keys_stack_nested_leaf | 1.3673ms | 1.2667ms | 789.4677 Ops/s | 873.0781 Ops/s | |
test_keys_stack_nested_locked | 1.0189ms | 0.8290ms | 1.2063 KOps/s | 1.3666 KOps/s | |
test_values | 8.7237μs | 1.8979μs | 526.8897 KOps/s | 523.7966 KOps/s | |
test_values_nested | 85.0640μs | 44.7182μs | 22.3623 KOps/s | 21.9914 KOps/s | |
test_values_nested_locked | 69.1930μs | 46.9928μs | 21.2798 KOps/s | 20.9743 KOps/s | |
test_values_nested_leaf | 56.6630μs | 39.3643μs | 25.4037 KOps/s | 25.0336 KOps/s | |
test_values_stack_nested | 1.0172ms | 0.9737ms | 1.0270 KOps/s | 1.0247 KOps/s | |
test_values_stack_nested_leaf | 1.0314ms | 0.9708ms | 1.0301 KOps/s | 1.0332 KOps/s | |
test_values_stack_nested_locked | 0.6305ms | 0.5835ms | 1.7138 KOps/s | 1.7019 KOps/s | |
test_membership | 8.2804μs | 0.9322μs | 1.0727 MOps/s | 1.0526 MOps/s | |
test_membership_nested | 35.0820μs | 2.8686μs | 348.6037 KOps/s | 348.2273 KOps/s | |
test_membership_nested_leaf | 18.8010μs | 2.8488μs | 351.0243 KOps/s | 348.5938 KOps/s | |
test_membership_stacked_nested | 60.2830μs | 11.1415μs | 89.7547 KOps/s | 87.5673 KOps/s | |
test_membership_stacked_nested_leaf | 44.6820μs | 11.2261μs | 89.0782 KOps/s | 88.8085 KOps/s | |
test_membership_nested_last | 35.0420μs | 6.6873μs | 149.5363 KOps/s | 188.0393 KOps/s | |
test_membership_nested_leaf_last | 26.6510μs | 6.6491μs | 150.3972 KOps/s | 187.6984 KOps/s | |
test_membership_stacked_nested_last | 0.8528ms | 0.1891ms | 5.2884 KOps/s | 6.3520 KOps/s | |
test_membership_stacked_nested_leaf_last | 40.2920μs | 14.4104μs | 69.3941 KOps/s | 76.2392 KOps/s | |
test_nested_getleaf | 45.2220μs | 9.8231μs | 101.8013 KOps/s | 118.4363 KOps/s | |
test_nested_get | 31.5110μs | 9.4046μs | 106.3308 KOps/s | 125.7140 KOps/s | |
test_stacked_getleaf | 0.5336ms | 0.3364ms | 2.9723 KOps/s | 3.0332 KOps/s | |
test_stacked_get | 0.3396ms | 0.2992ms | 3.3424 KOps/s | 3.3760 KOps/s | |
test_nested_getitemleaf | 26.8620μs | 11.2773μs | 88.6740 KOps/s | 101.9730 KOps/s | |
test_nested_getitem | 26.2010μs | 10.8069μs | 92.5334 KOps/s | 107.0889 KOps/s | |
test_stacked_getitemleaf | 0.3735ms | 0.3337ms | 2.9968 KOps/s | 2.9996 KOps/s | |
test_stacked_getitem | 0.3619ms | 0.2984ms | 3.3508 KOps/s | 3.3642 KOps/s | |
test_lock_nested | 2.5156ms | 0.4659ms | 2.1466 KOps/s | 2.2571 KOps/s | |
test_lock_stack_nested | 0.1255s | 8.2715ms | 120.8977 Ops/s | 138.9049 Ops/s | |
test_unlock_nested | 0.8605ms | 0.4658ms | 2.1470 KOps/s | 2.8141 KOps/s | |
test_unlock_stack_nested | 0.1230s | 8.3088ms | 120.3548 Ops/s | 138.6609 Ops/s | |
test_flatten_speed | 0.3761ms | 0.2924ms | 3.4205 KOps/s | 3.8942 KOps/s | |
test_unflatten_speed | 0.4918ms | 0.4601ms | 2.1736 KOps/s | 2.7735 KOps/s | |
test_common_ops | 1.2059ms | 0.7671ms | 1.3036 KOps/s | 1.7324 KOps/s | |
test_creation | 15.5010μs | 2.9183μs | 342.6632 KOps/s | 636.9403 KOps/s | |
test_creation_empty | 31.5320μs | 11.8583μs | 84.3293 KOps/s | 153.3222 KOps/s | |
test_creation_nested_1 | 87.6850μs | 15.4947μs | 64.5382 KOps/s | 121.2998 KOps/s | |
test_creation_nested_2 | 36.0920μs | 19.2767μs | 51.8762 KOps/s | 93.8628 KOps/s | |
test_clone | 51.6420μs | 18.3249μs | 54.5706 KOps/s | 70.3135 KOps/s | |
test_getitem[int] | 32.3020μs | 15.8893μs | 62.9356 KOps/s | 93.9130 KOps/s | |
test_getitem[slice_int] | 62.3430μs | 29.2411μs | 34.1984 KOps/s | 46.6872 KOps/s | |
test_getitem[range] | 0.1097ms | 50.4532μs | 19.8204 KOps/s | 24.3878 KOps/s | |
test_getitem[tuple] | 50.2120μs | 25.8424μs | 38.6961 KOps/s | 53.7044 KOps/s | |
test_getitem[list] | 0.1283ms | 47.6073μs | 21.0052 KOps/s | 27.1916 KOps/s | |
test_setitem_dim[int] | 43.9620μs | 27.5148μs | 36.3440 KOps/s | 39.8890 KOps/s | |
test_setitem_dim[slice_int] | 71.2140μs | 48.5076μs | 20.6153 KOps/s | 21.7259 KOps/s | |
test_setitem_dim[range] | 87.7250μs | 68.0283μs | 14.6998 KOps/s | 15.1242 KOps/s | |
test_setitem_dim[tuple] | 58.5840μs | 41.8933μs | 23.8702 KOps/s | 25.4816 KOps/s | |
test_setitem | 55.6630μs | 22.8832μs | 43.7002 KOps/s | 55.2844 KOps/s | |
test_set | 59.2330μs | 24.0336μs | 41.6084 KOps/s | 57.3862 KOps/s | |
test_set_shared | 2.7017ms | 0.1114ms | 8.9754 KOps/s | 9.6544 KOps/s | |
test_update | 65.2230μs | 22.9028μs | 43.6629 KOps/s | 52.8258 KOps/s | |
test_update_nested | 75.0540μs | 32.2267μs | 31.0301 KOps/s | 38.7974 KOps/s | |
test_set_nested | 63.8130μs | 24.1263μs | 41.4485 KOps/s | 53.1756 KOps/s | |
test_set_nested_new | 69.8540μs | 31.3771μs | 31.8703 KOps/s | 46.8744 KOps/s | |
test_select | 0.1408ms | 51.3143μs | 19.4877 KOps/s | 29.5633 KOps/s | |
test_select_nested | 0.1102ms | 84.8400μs | 11.7869 KOps/s | 19.0239 KOps/s | |
test_exclude_nested | 1.0801ms | 0.1488ms | 6.7221 KOps/s | 8.9772 KOps/s | |
test_empty[True] | 0.4858ms | 0.4200ms | 2.3808 KOps/s | 2.6244 KOps/s | |
test_empty[False] | 10.7755μs | 2.5061μs | 399.0204 KOps/s | 1.1588 MOps/s | |
test_to | 84.0040μs | 61.6872μs | 16.2108 KOps/s | 17.5085 KOps/s | |
test_to_nonblocking | 76.5240μs | 43.0093μs | 23.2508 KOps/s | 27.9278 KOps/s | |
test_unbind_speed | 0.4658ms | 0.3765ms | 2.6561 KOps/s | 3.6875 KOps/s | |
test_unbind_speed_stack0 | 0.1160s | 5.4816ms | 182.4271 Ops/s | 280.7590 Ops/s | |
test_unbind_speed_stack1 | 16.5310μs | 1.8274μs | 547.2356 KOps/s | 551.6424 KOps/s | |
test_split | 2.5196ms | 1.8994ms | 526.4772 Ops/s | 552.6727 Ops/s | |
test_chunk | 0.1097s | 2.1139ms | 473.0547 Ops/s | 581.4345 Ops/s | |
test_creation[device0] | 0.1467ms | 73.8029μs | 13.5496 KOps/s | 13.5857 KOps/s | |
test_creation_from_tensor | 0.1984ms | 55.0737μs | 18.1575 KOps/s | 18.4950 KOps/s | |
test_add_one[memmap_tensor0] | 0.2837ms | 7.5673μs | 132.1474 KOps/s | 133.4053 KOps/s | |
test_contiguous[memmap_tensor0] | 26.4410μs | 0.6591μs | 1.5172 MOps/s | 1.5100 MOps/s | |
test_stack[memmap_tensor0] | 46.8120μs | 4.7862μs | 208.9327 KOps/s | 206.4735 KOps/s | |
test_memmaptd_index | 0.9982ms | 0.2737ms | 3.6542 KOps/s | 3.7436 KOps/s | |
test_memmaptd_index_astensor | 0.6161ms | 0.3321ms | 3.0109 KOps/s | 3.1012 KOps/s | |
test_memmaptd_index_op | 0.9580ms | 0.6416ms | 1.5586 KOps/s | 1.6356 KOps/s | |
test_serialize_model | 0.2003s | 0.1042s | 9.5940 Ops/s | 9.1770 Ops/s | |
test_serialize_model_pickle | 1.3520s | 1.2364s | 0.8088 Ops/s | 0.8077 Ops/s | |
test_serialize_weights | 0.1966s | 99.6231ms | 10.0378 Ops/s | 10.7331 Ops/s | |
test_serialize_weights_returnearly | 0.3060s | 74.6917ms | 13.3884 Ops/s | 13.2852 Ops/s | |
test_serialize_weights_pickle | 1.4157s | 1.2454s | 0.8030 Ops/s | 0.8029 Ops/s | |
test_reshape_pytree | 0.1643ms | 25.8797μs | 38.6404 KOps/s | 39.6402 KOps/s | |
test_reshape_td | 0.1861ms | 37.7917μs | 26.4609 KOps/s | 33.0647 KOps/s | |
test_view_pytree | 0.1172ms | 24.6338μs | 40.5946 KOps/s | 40.6030 KOps/s | |
test_view_td | 0.5117ms | 6.8395μs | 146.2101 KOps/s | 80.2544 KOps/s | |
test_unbind_pytree | 77.8140μs | 30.7977μs | 32.4700 KOps/s | 32.8183 KOps/s | |
test_unbind_td | 0.3567ms | 57.2645μs | 17.4628 KOps/s | 24.6070 KOps/s | |
test_split_pytree | 59.2030μs | 28.8265μs | 34.6904 KOps/s | 34.9333 KOps/s | |
test_split_td | 0.1439ms | 53.1638μs | 18.8098 KOps/s | 25.3942 KOps/s | |
test_add_pytree | 65.4040μs | 38.1026μs | 26.2450 KOps/s | 26.0797 KOps/s | |
test_add_td | 0.1455ms | 59.3009μs | 16.8632 KOps/s | 20.1829 KOps/s | |
test_distributed | 2.7618ms | 72.4127μs | 13.8097 KOps/s | 11.2975 KOps/s | |
test_tdmodule | 0.1214ms | 21.2544μs | 47.0490 KOps/s | 60.2772 KOps/s | |
test_tdmodule_dispatch | 0.2410ms | 47.6678μs | 20.9785 KOps/s | 29.9459 KOps/s | |
test_tdseq | 39.9120μs | 24.0091μs | 41.6509 KOps/s | 52.2607 KOps/s | |
test_tdseq_dispatch | 67.4330μs | 48.9563μs | 20.4264 KOps/s | 27.7832 KOps/s | |
test_instantiation_functorch | 1.7846ms | 1.6650ms | 600.6180 Ops/s | 605.4883 Ops/s | |
test_instantiation_td | 1.7283ms | 1.1795ms | 847.7846 Ops/s | 873.5983 Ops/s | |
test_exec_functorch | 0.2351ms | 0.1671ms | 5.9827 KOps/s | 6.2237 KOps/s | |
test_exec_functional_call | 0.2497ms | 0.1632ms | 6.1288 KOps/s | 6.2451 KOps/s | |
test_exec_td | 0.2289ms | 0.1586ms | 6.3063 KOps/s | 6.4035 KOps/s | |
test_exec_td_decorator | 0.3500ms | 0.2370ms | 4.2198 KOps/s | 5.4939 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1815ms | 1.0638ms | 940.0241 Ops/s | 959.8104 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.7249ms | 0.6207ms | 1.6111 KOps/s | 1.6624 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.7832ms | 0.9620ms | 1.0395 KOps/s | 1.0284 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7138ms | 0.5426ms | 1.8431 KOps/s | 1.7870 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.0934ms | 1.9374ms | 516.1489 Ops/s | 551.7953 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8003ms | 0.6959ms | 1.4370 KOps/s | 1.5810 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.1018ms | 1.6772ms | 596.2402 Ops/s | 601.6582 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7127ms | 0.5783ms | 1.7292 KOps/s | 1.7467 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.7279ms | 12.3499ms | 80.9726 Ops/s | 78.9750 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.4336ms | 8.1660ms | 122.4595 Ops/s | 120.5806 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.6124ms | 12.2129ms | 81.8804 Ops/s | 81.0988 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.4355ms | 8.1005ms | 123.4498 Ops/s | 122.8750 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 0.1958s | 71.7096ms | 13.9451 Ops/s | 16.0113 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.2362ms | 19.8779ms | 50.3070 Ops/s | 51.9889 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 57.2127ms | 56.7394ms | 17.6244 Ops/s | 18.4176 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.1594ms | 19.6446ms | 50.9046 Ops/s | 52.9211 Ops/s | |
test_to_module_speed[True] | 1.7050ms | 1.5850ms | 630.9057 Ops/s | 996.8444 Ops/s | |
test_to_module_speed[False] | 2.9613ms | 1.5574ms | 642.0801 Ops/s | 1.0264 KOps/s |
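
The Change column is empty in both tables above. Below is a minimal sketch of how it could be filled in from the two Ops columns, assuming "Change" means the relative throughput difference between this branch and repo HEAD (that definition is an assumption, as are the helper names `parse_ops` and `relative_change`). The sample rows are copied from the first table.

```python
# Multipliers for the throughput units that appear in the benchmark tables.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '36.0962 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Relative change of this branch vs. repo HEAD (assumed definition)."""
    head = parse_ops(ops_head)
    return (parse_ops(ops) - head) / head


# (Ops, Ops on Repo HEAD) pairs taken verbatim from the first table.
rows = {
    "test_plain_set_nested": ("36.0962 KOps/s", "57.0677 KOps/s"),
    "test_keys_stack_nested_locked": ("845.8211 Ops/s", "1.1666 KOps/s"),
    "test_keys": ("250.7681 KOps/s", "230.5345 KOps/s"),
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops, ops_head):+.1%}")
```

Normalising the units first matters for rows such as test_keys_stack_nested_locked, where the branch is reported in Ops/s and HEAD in KOps/s. A negative value means the branch is slower than HEAD.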
I'd suggest filing issues for the problems with repros; they're probably just PT2 bugs.
# Conflicts:
#	tensordict/_td.py
#	tensordict/base.py
#	tensordict/nn/common.py
#	tensordict/nn/utils.py
#	test/test_nn.py
Description
The goal of this PR is to make tensordict not incompatible with torch.compile, i.e. to remove breaking points by letting torch.compile know that these functions should be ignored. This way, we will be able to use tensordict with torch.compile and speed things up at a later stage.
We also deprecate the old functional API by default. It can be reinstated via the _set_auto_make_functional(True) decorator / cm.
torch.compile crashes somewhere in the @dispatch wrapper, but since this isn't intended for performance I think having the option to disable it isn't a bad idea, so I created a cm for that too and made it private for now.
There are still a bunch of errors to fix:
with cudagraphs, even deactivating compile around the get operations mixes fake and real tensors and results in an error. I guess that registering get and set in [RFC] Tensordict integration pytorch#112441 will solve a great deal of bugs in one shot (other key-based tensordict operations almost always rely on these two methods).
with inductor, some modules (e.g. building distributions in tensordict.nn.ProbabilisticTensorDictModule) result in some cryptic errors.
cc @ezyang
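
For readers unfamiliar with the "decorator / cm" pattern mentioned in the description, here is a minimal sketch of a flag setter that works both ways, built on the standard library's `contextlib.ContextDecorator`. It is not the tensordict implementation: the names `set_auto_make_functional_sketch`, `auto_make_functional` and `_STATE` are hypothetical and only illustrate the idea.

```python
from contextlib import ContextDecorator

# Hypothetical process-wide flag; the real tensordict internals may differ.
_STATE = {"auto_make_functional": False}


def auto_make_functional() -> bool:
    """Read the current value of the (hypothetical) flag."""
    return _STATE["auto_make_functional"]


class set_auto_make_functional_sketch(ContextDecorator):
    """Set the flag temporarily; usable as `with ...:` or as `@...`."""

    def __init__(self, mode: bool) -> None:
        self.mode = mode
        # A stack of previous values keeps nested or recursive use safe,
        # since a decorator reuses the same instance on every call.
        self._previous = []

    def __enter__(self):
        self._previous.append(_STATE["auto_make_functional"])
        _STATE["auto_make_functional"] = self.mode
        return self

    def __exit__(self, exc_type, exc, tb):
        # Restore the value that was active before entering.
        _STATE["auto_make_functional"] = self._previous.pop()
        return False  # never swallow exceptions


# Context-manager form.
with set_auto_make_functional_sketch(True):
    inside = auto_make_functional()


# Decorator form.
@set_auto_make_functional_sketch(True)
def uses_old_functional_api() -> bool:
    return auto_make_functional()
```

In both forms the previous value is restored on exit, so the default (old functional API off) stays in effect everywhere else.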