@vmoens
Collaborator

@vmoens vmoens commented Mar 10, 2025

[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented Mar 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2844

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 1 Unrelated Failure

As of commit 587e72c with merge base 6e40548:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the "CLA Signed" label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) Mar 10, 2025
@github-actions

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}31$. Worsened: $\large\color{#d91a1a}14$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4969s 0.4915s 2.0347 Ops/s 1.8561 Ops/s $\textbf{\color{#35bf28}+9.62\%}$
test_transformed 1.0550s 0.9776s 1.0229 Ops/s 0.9592 Ops/s $\textbf{\color{#35bf28}+6.64\%}$
test_serial 1.5980s 1.4970s 0.6680 Ops/s 0.6354 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_parallel 1.3839s 1.3040s 0.7669 Ops/s 0.7328 Ops/s $\color{#35bf28}+4.65\%$
test_step_mdp_speed[True-True-True-True-True] 0.1961ms 30.0882μs 33.2356 KOps/s 34.0582 KOps/s $\color{#d91a1a}-2.42\%$
test_step_mdp_speed[True-True-True-True-False] 47.3480μs 17.9154μs 55.8179 KOps/s 56.6163 KOps/s $\color{#d91a1a}-1.41\%$
test_step_mdp_speed[True-True-True-False-True] 48.8810μs 16.9329μs 59.0566 KOps/s 59.9434 KOps/s $\color{#d91a1a}-1.48\%$
test_step_mdp_speed[True-True-True-False-False] 37.9110μs 10.0864μs 99.1437 KOps/s 101.3249 KOps/s $\color{#d91a1a}-2.15\%$
test_step_mdp_speed[True-True-False-True-True] 68.6380μs 31.9934μs 31.2565 KOps/s 31.7701 KOps/s $\color{#d91a1a}-1.62\%$
test_step_mdp_speed[True-True-False-True-False] 59.7320μs 19.7228μs 50.7028 KOps/s 51.3186 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[True-True-False-False-True] 48.1500μs 18.8409μs 53.0759 KOps/s 53.9916 KOps/s $\color{#d91a1a}-1.70\%$
test_step_mdp_speed[True-True-False-False-False] 40.2460μs 11.8979μs 84.0485 KOps/s 84.7491 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[True-False-True-True-True] 94.5950μs 33.9864μs 29.4235 KOps/s 29.8436 KOps/s $\color{#d91a1a}-1.41\%$
test_step_mdp_speed[True-False-True-True-False] 0.5857ms 21.8512μs 45.7641 KOps/s 46.4583 KOps/s $\color{#d91a1a}-1.49\%$
test_step_mdp_speed[True-False-True-False-True] 0.1428ms 18.8890μs 52.9409 KOps/s 53.6743 KOps/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[True-False-True-False-False] 49.5920μs 11.9158μs 83.9224 KOps/s 85.3970 KOps/s $\color{#d91a1a}-1.73\%$
test_step_mdp_speed[True-False-False-True-True] 0.1429ms 35.5025μs 28.1670 KOps/s 28.2453 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[True-False-False-True-False] 61.9850μs 23.3727μs 42.7849 KOps/s 43.0992 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[True-False-False-False-True] 59.3010μs 20.5125μs 48.7507 KOps/s 48.8531 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[True-False-False-False-False] 71.2050μs 13.6079μs 73.4866 KOps/s 73.9749 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[False-True-True-True-True] 0.1144ms 34.1268μs 29.3025 KOps/s 29.7569 KOps/s $\color{#d91a1a}-1.53\%$
test_step_mdp_speed[False-True-True-True-False] 54.0410μs 21.8174μs 45.8350 KOps/s 46.5232 KOps/s $\color{#d91a1a}-1.48\%$
test_step_mdp_speed[False-True-True-False-True] 57.4070μs 21.7605μs 45.9548 KOps/s 46.0400 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[False-True-True-False-False] 51.9570μs 13.3041μs 75.1647 KOps/s 76.1540 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[False-True-False-True-True] 73.8980μs 36.0999μs 27.7009 KOps/s 28.2324 KOps/s $\color{#d91a1a}-1.88\%$
test_step_mdp_speed[False-True-False-True-False] 2.4900ms 23.7061μs 42.1833 KOps/s 42.7423 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[False-True-False-False-True] 71.4840μs 23.4273μs 42.6853 KOps/s 43.2859 KOps/s $\color{#d91a1a}-1.39\%$
test_step_mdp_speed[False-True-False-False-False] 38.3720μs 15.1169μs 66.1512 KOps/s 67.5713 KOps/s $\color{#d91a1a}-2.10\%$
test_step_mdp_speed[False-False-True-True-True] 84.2780μs 38.0604μs 26.2740 KOps/s 27.0566 KOps/s $\color{#d91a1a}-2.89\%$
test_step_mdp_speed[False-False-True-True-False] 66.2940μs 25.4741μs 39.2556 KOps/s 40.2757 KOps/s $\color{#d91a1a}-2.53\%$
test_step_mdp_speed[False-False-True-False-True] 74.1970μs 23.0791μs 43.3292 KOps/s 43.7123 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[False-False-True-False-False] 0.5906ms 15.2600μs 65.5310 KOps/s 67.9455 KOps/s $\color{#d91a1a}-3.55\%$
test_step_mdp_speed[False-False-False-True-True] 82.9550μs 39.5133μs 25.3079 KOps/s 26.0677 KOps/s $\color{#d91a1a}-2.91\%$
test_step_mdp_speed[False-False-False-True-False] 75.6410μs 27.3200μs 36.6033 KOps/s 37.4531 KOps/s $\color{#d91a1a}-2.27\%$
test_step_mdp_speed[False-False-False-False-True] 62.4370μs 24.9717μs 40.0453 KOps/s 40.4685 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[False-False-False-False-False] 65.8430μs 16.8329μs 59.4076 KOps/s 60.4734 KOps/s $\color{#d91a1a}-1.76\%$
test_values[generalized_advantage_estimate-True-True] 9.9997ms 9.8447ms 101.5779 Ops/s 103.8448 Ops/s $\color{#d91a1a}-2.18\%$
test_values[vec_generalized_advantage_estimate-True-True] 28.2076ms 25.7516ms 38.8325 Ops/s 38.3582 Ops/s $\color{#35bf28}+1.24\%$
test_values[td0_return_estimate-False-False] 0.2466ms 0.1788ms 5.5921 KOps/s 5.1337 KOps/s $\textbf{\color{#35bf28}+8.93\%}$
test_values[td1_return_estimate-False-False] 26.9504ms 23.8903ms 41.8580 Ops/s 41.0118 Ops/s $\color{#35bf28}+2.06\%$
test_values[vec_td1_return_estimate-False-False] 28.3171ms 25.5772ms 39.0973 Ops/s 38.0755 Ops/s $\color{#35bf28}+2.68\%$
test_values[td_lambda_return_estimate-True-False] 37.1031ms 34.5681ms 28.9284 Ops/s 28.2937 Ops/s $\color{#35bf28}+2.24\%$
test_values[vec_td_lambda_return_estimate-True-False] 33.1520ms 25.7939ms 38.7689 Ops/s 37.9890 Ops/s $\color{#35bf28}+2.05\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.8674ms 8.4303ms 118.6198 Ops/s 117.7571 Ops/s $\color{#35bf28}+0.73\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2773ms 1.9824ms 504.4358 Ops/s 513.0257 Ops/s $\color{#d91a1a}-1.67\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4411ms 0.3681ms 2.7166 KOps/s 2.7721 KOps/s $\color{#d91a1a}-2.00\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 45.1528ms 43.2877ms 23.1013 Ops/s 21.9683 Ops/s $\textbf{\color{#35bf28}+5.16\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.4572ms 3.4991ms 285.7913 Ops/s 280.3695 Ops/s $\color{#35bf28}+1.93\%$
test_dqn_speed[False-None] 6.0991ms 1.3857ms 721.6794 Ops/s 702.5694 Ops/s $\color{#35bf28}+2.72\%$
test_dqn_speed[False-backward] 2.0181ms 1.8736ms 533.7393 Ops/s 528.3143 Ops/s $\color{#35bf28}+1.03\%$
test_dqn_speed[True-None] 0.8199ms 0.5554ms 1.8005 KOps/s 1.7715 KOps/s $\color{#35bf28}+1.64\%$
test_dqn_speed[True-backward] 1.0352ms 0.9773ms 1.0233 KOps/s 844.3408 Ops/s $\textbf{\color{#35bf28}+21.19\%}$
test_dqn_speed[reduce-overhead-None] 0.7789ms 0.5589ms 1.7891 KOps/s 1.7731 KOps/s $\color{#35bf28}+0.90\%$
test_dqn_speed[reduce-overhead-backward] 1.0093ms 0.9594ms 1.0423 KOps/s 993.5444 Ops/s $\color{#35bf28}+4.91\%$
test_ddpg_speed[False-None] 3.6040ms 2.8530ms 350.5072 Ops/s 340.9856 Ops/s $\color{#35bf28}+2.79\%$
test_ddpg_speed[False-backward] 5.0642ms 4.1365ms 241.7504 Ops/s 245.0793 Ops/s $\color{#d91a1a}-1.36\%$
test_ddpg_speed[True-None] 1.9300ms 1.4450ms 692.0558 Ops/s 684.9920 Ops/s $\color{#35bf28}+1.03\%$
test_ddpg_speed[True-backward] 2.3946ms 2.3104ms 432.8250 Ops/s 346.7068 Ops/s $\textbf{\color{#35bf28}+24.84\%}$
test_ddpg_speed[reduce-overhead-None] 2.0093ms 1.4363ms 696.2299 Ops/s 696.5889 Ops/s $\color{#d91a1a}-0.05\%$
test_ddpg_speed[reduce-overhead-backward] 2.5259ms 2.3301ms 429.1695 Ops/s 370.1388 Ops/s $\textbf{\color{#35bf28}+15.95\%}$
test_sac_speed[False-None] 9.0947ms 7.8529ms 127.3412 Ops/s 114.6599 Ops/s $\textbf{\color{#35bf28}+11.06\%}$
test_sac_speed[False-backward] 10.8703ms 10.5476ms 94.8085 Ops/s 82.8237 Ops/s $\textbf{\color{#35bf28}+14.47\%}$
test_sac_speed[True-None] 3.2325ms 2.5648ms 389.8971 Ops/s 317.8548 Ops/s $\textbf{\color{#35bf28}+22.67\%}$
test_sac_speed[True-backward] 4.7919ms 4.3506ms 229.8550 Ops/s 197.3001 Ops/s $\textbf{\color{#35bf28}+16.50\%}$
test_sac_speed[reduce-overhead-None] 2.8703ms 2.5999ms 384.6299 Ops/s 353.3490 Ops/s $\textbf{\color{#35bf28}+8.85\%}$
test_sac_speed[reduce-overhead-backward] 5.1058ms 4.9186ms 203.3097 Ops/s 221.5307 Ops/s $\textbf{\color{#d91a1a}-8.23\%}$
test_redq_speed[False-None] 14.4289ms 13.5461ms 73.8219 Ops/s 64.9129 Ops/s $\textbf{\color{#35bf28}+13.72\%}$
test_redq_speed[False-backward] 28.2061ms 23.1839ms 43.1334 Ops/s 39.9869 Ops/s $\textbf{\color{#35bf28}+7.87\%}$
test_redq_speed[True-None] 7.6448ms 6.9673ms 143.5277 Ops/s 123.7923 Ops/s $\textbf{\color{#35bf28}+15.94\%}$
test_redq_speed[True-backward] 14.7853ms 14.1503ms 70.6700 Ops/s 61.9649 Ops/s $\textbf{\color{#35bf28}+14.05\%}$
test_redq_speed[reduce-overhead-None] 7.9111ms 7.1744ms 139.3850 Ops/s 120.4449 Ops/s $\textbf{\color{#35bf28}+15.73\%}$
test_redq_speed[reduce-overhead-backward] 14.5172ms 14.0601ms 71.1233 Ops/s 60.2739 Ops/s $\textbf{\color{#35bf28}+18.00\%}$
test_redq_deprec_speed[False-None] 14.9577ms 12.8490ms 77.8273 Ops/s 66.1001 Ops/s $\textbf{\color{#35bf28}+17.74\%}$
test_redq_deprec_speed[False-backward] 20.4577ms 18.5880ms 53.7980 Ops/s 45.9021 Ops/s $\textbf{\color{#35bf28}+17.20\%}$
test_redq_deprec_speed[True-None] 5.7134ms 5.2249ms 191.3922 Ops/s 150.9415 Ops/s $\textbf{\color{#35bf28}+26.80\%}$
test_redq_deprec_speed[True-backward] 11.5141ms 10.4955ms 95.2792 Ops/s 82.4478 Ops/s $\textbf{\color{#35bf28}+15.56\%}$
test_redq_deprec_speed[reduce-overhead-None] 5.5128ms 5.3186ms 188.0187 Ops/s 147.0850 Ops/s $\textbf{\color{#35bf28}+27.83\%}$
test_redq_deprec_speed[reduce-overhead-backward] 11.4926ms 10.8897ms 91.8296 Ops/s 82.8924 Ops/s $\textbf{\color{#35bf28}+10.78\%}$
test_td3_speed[False-None] 9.3293ms 8.1188ms 123.1716 Ops/s 110.3211 Ops/s $\textbf{\color{#35bf28}+11.65\%}$
test_td3_speed[False-backward] 12.6806ms 10.6106ms 94.2450 Ops/s 87.0166 Ops/s $\textbf{\color{#35bf28}+8.31\%}$
test_td3_speed[True-None] 2.5673ms 2.2584ms 442.7925 Ops/s 433.3677 Ops/s $\color{#35bf28}+2.17\%$
test_td3_speed[True-backward] 4.1922ms 3.9275ms 254.6117 Ops/s 242.3935 Ops/s $\textbf{\color{#35bf28}+5.04\%}$
test_td3_speed[reduce-overhead-None] 2.5359ms 2.2903ms 436.6210 Ops/s 431.5932 Ops/s $\color{#35bf28}+1.16\%$
test_td3_speed[reduce-overhead-backward] 4.5833ms 4.3006ms 232.5280 Ops/s 247.1785 Ops/s $\textbf{\color{#d91a1a}-5.93\%}$
test_cql_speed[False-None] 39.4290ms 36.5770ms 27.3396 Ops/s 27.7118 Ops/s $\color{#d91a1a}-1.34\%$
test_cql_speed[False-backward] 54.6726ms 48.5418ms 20.6008 Ops/s 21.1083 Ops/s $\color{#d91a1a}-2.40\%$
test_cql_speed[True-None] 27.3853ms 23.1298ms 43.2343 Ops/s 45.3672 Ops/s $\color{#d91a1a}-4.70\%$
test_cql_speed[True-backward] 30.4306ms 29.4459ms 33.9606 Ops/s 33.8065 Ops/s $\color{#35bf28}+0.46\%$
test_cql_speed[reduce-overhead-None] 23.7776ms 22.8119ms 43.8369 Ops/s 45.0232 Ops/s $\color{#d91a1a}-2.63\%$
test_cql_speed[reduce-overhead-backward] 31.1282ms 29.9743ms 33.3620 Ops/s 34.4905 Ops/s $\color{#d91a1a}-3.27\%$
test_a2c_speed[False-None] 8.6645ms 7.2901ms 137.1720 Ops/s 140.6455 Ops/s $\color{#d91a1a}-2.47\%$
test_a2c_speed[False-backward] 15.1157ms 14.7157ms 67.9546 Ops/s 70.4483 Ops/s $\color{#d91a1a}-3.54\%$
test_a2c_speed[True-None] 5.3597ms 4.8056ms 208.0922 Ops/s 214.9205 Ops/s $\color{#d91a1a}-3.18\%$
test_a2c_speed[True-backward] 11.8425ms 11.3888ms 87.8055 Ops/s 89.5423 Ops/s $\color{#d91a1a}-1.94\%$
test_a2c_speed[reduce-overhead-None] 5.5903ms 4.7551ms 210.2996 Ops/s 215.7667 Ops/s $\color{#d91a1a}-2.53\%$
test_a2c_speed[reduce-overhead-backward] 12.2650ms 11.7140ms 85.3682 Ops/s 90.1445 Ops/s $\textbf{\color{#d91a1a}-5.30\%}$
test_ppo_speed[False-None] 9.0136ms 7.6250ms 131.1472 Ops/s 133.1708 Ops/s $\color{#d91a1a}-1.52\%$
test_ppo_speed[False-backward] 16.0521ms 15.0720ms 66.3483 Ops/s 69.1684 Ops/s $\color{#d91a1a}-4.08\%$
test_ppo_speed[True-None] 6.0521ms 5.2708ms 189.7244 Ops/s 197.5737 Ops/s $\color{#d91a1a}-3.97\%$
test_ppo_speed[True-backward] 12.5248ms 11.8612ms 84.3085 Ops/s 91.7903 Ops/s $\textbf{\color{#d91a1a}-8.15\%}$
test_ppo_speed[reduce-overhead-None] 7.9145ms 5.2770ms 189.5002 Ops/s 199.7602 Ops/s $\textbf{\color{#d91a1a}-5.14\%}$
test_ppo_speed[reduce-overhead-backward] 11.9547ms 11.6976ms 85.4879 Ops/s 90.1640 Ops/s $\textbf{\color{#d91a1a}-5.19\%}$
test_reinforce_speed[False-None] 8.1301ms 6.7304ms 148.5786 Ops/s 153.7415 Ops/s $\color{#d91a1a}-3.36\%$
test_reinforce_speed[False-backward] 10.7828ms 10.2130ms 97.9142 Ops/s 102.9890 Ops/s $\color{#d91a1a}-4.93\%$
test_reinforce_speed[True-None] 4.9073ms 4.4385ms 225.3004 Ops/s 241.0646 Ops/s $\textbf{\color{#d91a1a}-6.54\%}$
test_reinforce_speed[True-backward] 11.4601ms 10.6284ms 94.0876 Ops/s 97.1945 Ops/s $\color{#d91a1a}-3.20\%$
test_reinforce_speed[reduce-overhead-None] 5.1824ms 4.1976ms 238.2331 Ops/s 243.0606 Ops/s $\color{#d91a1a}-1.99\%$
test_reinforce_speed[reduce-overhead-backward] 10.9162ms 10.2414ms 97.6426 Ops/s 91.4473 Ops/s $\textbf{\color{#35bf28}+6.77\%}$
test_iql_speed[False-None] 34.0413ms 32.6607ms 30.6178 Ops/s 29.8729 Ops/s $\color{#35bf28}+2.49\%$
test_iql_speed[False-backward] 51.9278ms 45.8324ms 21.8186 Ops/s 21.4424 Ops/s $\color{#35bf28}+1.75\%$
test_iql_speed[True-None] 17.2680ms 16.2737ms 61.4490 Ops/s 61.5420 Ops/s $\color{#d91a1a}-0.15\%$
test_iql_speed[True-backward] 33.6925ms 28.4658ms 35.1299 Ops/s 36.5095 Ops/s $\color{#d91a1a}-3.78\%$
test_iql_speed[reduce-overhead-None] 19.1115ms 16.3477ms 61.1708 Ops/s 60.9115 Ops/s $\color{#35bf28}+0.43\%$
test_iql_speed[reduce-overhead-backward] 29.6341ms 27.9019ms 35.8398 Ops/s 36.9814 Ops/s $\color{#d91a1a}-3.09\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.7303ms 5.1207ms 195.2854 Ops/s 206.2073 Ops/s $\textbf{\color{#d91a1a}-5.30\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.5516ms 0.5521ms 1.8113 KOps/s 1.8342 KOps/s $\color{#d91a1a}-1.25\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8083ms 0.5150ms 1.9417 KOps/s 1.9273 KOps/s $\color{#35bf28}+0.75\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.4229ms 4.8645ms 205.5693 Ops/s 217.3721 Ops/s $\textbf{\color{#d91a1a}-5.43\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.6177ms 0.5360ms 1.8658 KOps/s 1.9202 KOps/s $\color{#d91a1a}-2.83\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8985ms 0.5127ms 1.9503 KOps/s 1.8999 KOps/s $\color{#35bf28}+2.65\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.0261ms 1.6867ms 592.8618 Ops/s 595.2479 Ops/s $\color{#d91a1a}-0.40\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3628ms 1.6111ms 620.6802 Ops/s 626.2261 Ops/s $\color{#d91a1a}-0.89\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.4495ms 4.9968ms 200.1276 Ops/s 206.2347 Ops/s $\color{#d91a1a}-2.96\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2315ms 0.6853ms 1.4592 KOps/s 1.4868 KOps/s $\color{#d91a1a}-1.86\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.1988ms 0.6549ms 1.5268 KOps/s 1.5530 KOps/s $\color{#d91a1a}-1.68\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8764ms 4.8110ms 207.8586 Ops/s 217.9450 Ops/s $\color{#d91a1a}-4.63\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.5143ms 0.5412ms 1.8479 KOps/s 1.8534 KOps/s $\color{#d91a1a}-0.30\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8935ms 0.5210ms 1.9194 KOps/s 1.9743 KOps/s $\color{#d91a1a}-2.78\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.4937ms 4.7805ms 209.1844 Ops/s 213.1299 Ops/s $\color{#d91a1a}-1.85\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8694ms 0.5359ms 1.8660 KOps/s 1.9072 KOps/s $\color{#d91a1a}-2.16\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7559ms 0.5112ms 1.9564 KOps/s 1.9403 KOps/s $\color{#35bf28}+0.83\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.1342ms 4.8910ms 204.4566 Ops/s 216.4886 Ops/s $\textbf{\color{#d91a1a}-5.56\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.4193ms 0.6851ms 1.4597 KOps/s 1.5053 KOps/s $\color{#d91a1a}-3.03\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0990ms 0.6575ms 1.5210 KOps/s 1.5444 KOps/s $\color{#d91a1a}-1.52\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.0100ms 4.4563ms 224.4000 Ops/s 246.1668 Ops/s $\textbf{\color{#d91a1a}-8.84\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.1815ms 2.3628ms 423.2320 Ops/s 449.8263 Ops/s $\textbf{\color{#d91a1a}-5.91\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.5921ms 1.3447ms 743.6799 Ops/s 785.1319 Ops/s $\textbf{\color{#d91a1a}-5.28\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4891s 14.1109ms 70.8672 Ops/s 250.0819 Ops/s $\textbf{\color{#d91a1a}-71.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.0265ms 2.3181ms 431.3790 Ops/s 34.8893 Ops/s $\textbf{\color{#35bf28}+1136.42\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.7833ms 1.2507ms 799.5490 Ops/s 719.7329 Ops/s $\textbf{\color{#35bf28}+11.09\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.3321ms 4.6838ms 213.5029 Ops/s 222.4715 Ops/s $\color{#d91a1a}-4.03\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.2336ms 2.5716ms 388.8589 Ops/s 392.4660 Ops/s $\color{#d91a1a}-0.92\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.4086ms 1.5483ms 645.8768 Ops/s 658.1825 Ops/s $\color{#d91a1a}-1.87\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.2903ms 11.8380ms 84.4738 Ops/s 85.7244 Ops/s $\color{#d91a1a}-1.46\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.2371ms 14.2767ms 70.0444 Ops/s 72.0407 Ops/s $\color{#d91a1a}-2.77\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.3652ms 20.8427ms 47.9784 Ops/s 47.6652 Ops/s $\color{#35bf28}+0.66\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.4444ms 14.5914ms 68.5335 Ops/s 67.9461 Ops/s $\color{#35bf28}+0.86\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 22.7005ms 20.9312ms 47.7755 Ops/s 48.3741 Ops/s $\color{#d91a1a}-1.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.8909ms 15.7359ms 63.5490 Ops/s 65.6882 Ops/s $\color{#d91a1a}-3.26\%$
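
The `Change` column above appears to be the relative difference between `Ops` and `Ops on Repo HEAD`. The Improved and Worsened totals match the number of bolded rows, whose change is at least 5% in magnitude. Neither rule is stated in the comment itself; both are inferred from the rows shown. A minimal Python sketch under those assumptions (the helper names `parse_ops`, `percent_change`, and `classify` are hypothetical, not part of the benchmark tooling):

```python
# Unit multipliers for the throughput strings that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text: str) -> float:
    """Convert a string such as '1.0233 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput change of the PR against repo HEAD, in percent."""
    new, old = parse_ops(ops_pr), parse_ops(ops_head)
    return (new - old) / old * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change. The 5% threshold is an assumption inferred from the table."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "unchanged"


# Rows copied from the CPU table above: (name, Ops, Ops on Repo HEAD).
rows = [
    ("test_simple", "2.0347 Ops/s", "1.8561 Ops/s"),
    ("test_parallel", "0.7669 Ops/s", "0.7328 Ops/s"),
    ("test_dqn_speed[True-backward]", "1.0233 KOps/s", "844.3408 Ops/s"),
    ("test_sac_speed[reduce-overhead-backward]", "203.3097 Ops/s", "221.5307 Ops/s"),
]

for name, ops_pr, ops_head in rows:
    change = percent_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table prints rounded throughput values, a recomputed percentage can differ from the reported one in the last digit. Units have to be normalized before comparing, since some rows mix `KOps/s` and `Ops/s`.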

@vmoens vmoens added the "CI" label (Has to do with CI setup, e.g. wheels & builds, tests...) Mar 10, 2025
@github-actions

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}28$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.8824s 0.7947s 1.2583 Ops/s 1.2378 Ops/s $\color{#35bf28}+1.66\%$
test_transformed 1.4674s 1.3762s 0.7266 Ops/s 0.6779 Ops/s $\textbf{\color{#35bf28}+7.19\%}$
test_serial 2.3405s 2.2493s 0.4446 Ops/s 0.4215 Ops/s $\textbf{\color{#35bf28}+5.48\%}$
test_parallel 1.9689s 1.8765s 0.5329 Ops/s 0.5367 Ops/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[True-True-True-True-True] 0.3110ms 38.5373μs 25.9489 KOps/s 25.8543 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[True-True-True-True-False] 99.5110μs 22.5162μs 44.4126 KOps/s 44.1296 KOps/s $\color{#35bf28}+0.64\%$
test_step_mdp_speed[True-True-True-False-True] 55.2100μs 21.3196μs 46.9051 KOps/s 45.7711 KOps/s $\color{#35bf28}+2.48\%$
test_step_mdp_speed[True-True-True-False-False] 42.9510μs 12.4447μs 80.3558 KOps/s 77.9175 KOps/s $\color{#35bf28}+3.13\%$
test_step_mdp_speed[True-True-False-True-True] 0.1725ms 41.2053μs 24.2687 KOps/s 23.9343 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[True-True-False-True-False] 61.7410μs 24.7175μs 40.4571 KOps/s 39.1938 KOps/s $\color{#35bf28}+3.22\%$
test_step_mdp_speed[True-True-False-False-True] 93.7110μs 23.5179μs 42.5207 KOps/s 41.1878 KOps/s $\color{#35bf28}+3.24\%$
test_step_mdp_speed[True-True-False-False-False] 50.3310μs 14.6988μs 68.0328 KOps/s 66.3066 KOps/s $\color{#35bf28}+2.60\%$
test_step_mdp_speed[True-False-True-True-True] 0.1221ms 43.3799μs 23.0522 KOps/s 22.7449 KOps/s $\color{#35bf28}+1.35\%$
test_step_mdp_speed[True-False-True-True-False] 0.1709ms 27.0053μs 37.0297 KOps/s 36.0231 KOps/s $\color{#35bf28}+2.79\%$
test_step_mdp_speed[True-False-True-False-True] 60.3910μs 23.5054μs 42.5434 KOps/s 41.2020 KOps/s $\color{#35bf28}+3.26\%$
test_step_mdp_speed[True-False-True-False-False] 0.1346ms 14.7621μs 67.7412 KOps/s 66.2307 KOps/s $\color{#35bf28}+2.28\%$
test_step_mdp_speed[True-False-False-True-True] 95.2010μs 45.1218μs 22.1622 KOps/s 21.6278 KOps/s $\color{#35bf28}+2.47\%$
test_step_mdp_speed[True-False-False-True-False] 0.1058ms 29.1241μs 34.3358 KOps/s 33.8052 KOps/s $\color{#35bf28}+1.57\%$
test_step_mdp_speed[True-False-False-False-True] 80.9110μs 25.8990μs 38.6115 KOps/s 37.5603 KOps/s $\color{#35bf28}+2.80\%$
test_step_mdp_speed[True-False-False-False-False] 0.1738ms 16.9572μs 58.9718 KOps/s 57.5376 KOps/s $\color{#35bf28}+2.49\%$
test_step_mdp_speed[False-True-True-True-True] 0.1455ms 43.4652μs 23.0069 KOps/s 22.6976 KOps/s $\color{#35bf28}+1.36\%$
test_step_mdp_speed[False-True-True-True-False] 0.1839ms 27.1374μs 36.8496 KOps/s 36.0876 KOps/s $\color{#35bf28}+2.11\%$
test_step_mdp_speed[False-True-True-False-True] 2.6947ms 28.0050μs 35.7079 KOps/s 35.4953 KOps/s $\color{#35bf28}+0.60\%$
test_step_mdp_speed[False-True-True-False-False] 0.1821ms 16.6295μs 60.1341 KOps/s 60.1862 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-True-False-True-True] 0.2281ms 45.4698μs 21.9926 KOps/s 21.8369 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[False-True-False-True-False] 0.2054ms 28.8830μs 34.6225 KOps/s 33.4026 KOps/s $\color{#35bf28}+3.65\%$
test_step_mdp_speed[False-True-False-False-True] 0.1925ms 29.5061μs 33.8913 KOps/s 32.9962 KOps/s $\color{#35bf28}+2.71\%$
test_step_mdp_speed[False-True-False-False-False] 0.1887ms 18.5878μs 53.7987 KOps/s 52.3317 KOps/s $\color{#35bf28}+2.80\%$
test_step_mdp_speed[False-False-True-True-True] 0.1743ms 47.1249μs 21.2202 KOps/s 20.3389 KOps/s $\color{#35bf28}+4.33\%$
test_step_mdp_speed[False-False-True-True-False] 69.3410μs 31.5484μs 31.6973 KOps/s 31.0052 KOps/s $\color{#35bf28}+2.23\%$
test_step_mdp_speed[False-False-True-False-True] 0.1576ms 29.6080μs 33.7746 KOps/s 32.7379 KOps/s $\color{#35bf28}+3.17\%$
test_step_mdp_speed[False-False-True-False-False] 56.6410μs 18.5529μs 53.9000 KOps/s 52.0765 KOps/s $\color{#35bf28}+3.50\%$
test_step_mdp_speed[False-False-False-True-True] 0.2333ms 49.1560μs 20.3434 KOps/s 19.8478 KOps/s $\color{#35bf28}+2.50\%$
test_step_mdp_speed[False-False-False-True-False] 0.2350ms 33.5912μs 29.7697 KOps/s 28.8588 KOps/s $\color{#35bf28}+3.16\%$
test_step_mdp_speed[False-False-False-False-True] 61.8810μs 31.5972μs 31.6484 KOps/s 30.6473 KOps/s $\color{#35bf28}+3.27\%$
test_step_mdp_speed[False-False-False-False-False] 53.5710μs 20.9919μs 47.6373 KOps/s 46.4831 KOps/s $\color{#35bf28}+2.48\%$
test_values[generalized_advantage_estimate-True-True] 25.2090ms 23.9915ms 41.6814 Ops/s 40.2401 Ops/s $\color{#35bf28}+3.58\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1062s 3.0190ms 331.2375 Ops/s 354.4359 Ops/s $\textbf{\color{#d91a1a}-6.55\%}$
test_values[td0_return_estimate-False-False] 0.1013ms 77.0984μs 12.9704 KOps/s 12.7093 KOps/s $\color{#35bf28}+2.05\%$
test_values[td1_return_estimate-False-False] 53.7874ms 53.1580ms 18.8118 Ops/s 18.2694 Ops/s $\color{#35bf28}+2.97\%$
test_values[vec_td1_return_estimate-False-False] 1.3189ms 1.0720ms 932.8270 Ops/s 912.1123 Ops/s $\color{#35bf28}+2.27\%$
test_values[td_lambda_return_estimate-True-False] 85.5878ms 84.4642ms 11.8393 Ops/s 11.4843 Ops/s $\color{#35bf28}+3.09\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4070ms 1.0753ms 929.9328 Ops/s 926.9583 Ops/s $\color{#35bf28}+0.32\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.0537ms 23.7226ms 42.1539 Ops/s 41.0633 Ops/s $\color{#35bf28}+2.66\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0254ms 0.7423ms 1.3471 KOps/s 1.3220 KOps/s $\color{#35bf28}+1.89\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8145ms 0.6514ms 1.5351 KOps/s 1.5172 KOps/s $\color{#35bf28}+1.18\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6830ms 1.4698ms 680.3874 Ops/s 674.3058 Ops/s $\color{#35bf28}+0.90\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8168ms 0.6625ms 1.5094 KOps/s 1.4838 KOps/s $\color{#35bf28}+1.72\%$
test_dqn_speed[False-None] 1.7573ms 1.4520ms 688.7056 Ops/s 652.2208 Ops/s $\textbf{\color{#35bf28}+5.59\%}$
test_dqn_speed[False-backward] 2.2077ms 2.0763ms 481.6158 Ops/s 464.6893 Ops/s $\color{#35bf28}+3.64\%$
test_dqn_speed[True-None] 0.7106ms 0.5332ms 1.8755 KOps/s 1.7978 KOps/s $\color{#35bf28}+4.32\%$
test_dqn_speed[True-backward] 1.1999ms 1.1039ms 905.8474 Ops/s 792.5436 Ops/s $\textbf{\color{#35bf28}+14.30\%}$
test_dqn_speed[reduce-overhead-None] 0.8132ms 0.5850ms 1.7094 KOps/s 1.7051 KOps/s $\color{#35bf28}+0.25\%$
test_dqn_speed[reduce-overhead-backward] 0.9898ms 0.9367ms 1.0676 KOps/s 925.1608 Ops/s $\textbf{\color{#35bf28}+15.40\%}$
test_ddpg_speed[False-None] 3.1677ms 2.8297ms 353.3964 Ops/s 348.7157 Ops/s $\color{#35bf28}+1.34\%$
test_ddpg_speed[False-backward] 4.4744ms 4.0914ms 244.4173 Ops/s 235.1497 Ops/s $\color{#35bf28}+3.94\%$
test_ddpg_speed[True-None] 1.4967ms 1.3013ms 768.4631 Ops/s 751.3188 Ops/s $\color{#35bf28}+2.28\%$
test_ddpg_speed[True-backward] 2.5381ms 2.3970ms 417.1857 Ops/s 408.5854 Ops/s $\color{#35bf28}+2.10\%$
test_ddpg_speed[reduce-overhead-None] 1.4591ms 1.3106ms 763.0007 Ops/s 741.2315 Ops/s $\color{#35bf28}+2.94\%$
test_ddpg_speed[reduce-overhead-backward] 2.0421ms 1.8587ms 538.0165 Ops/s 524.5842 Ops/s $\color{#35bf28}+2.56\%$
test_sac_speed[False-None] 8.1956ms 7.8064ms 128.1001 Ops/s 122.5351 Ops/s $\color{#35bf28}+4.54\%$
test_sac_speed[False-backward] 11.2502ms 10.6817ms 93.6181 Ops/s 89.5882 Ops/s $\color{#35bf28}+4.50\%$
test_sac_speed[True-None] 2.0232ms 1.7983ms 556.0912 Ops/s 512.3966 Ops/s $\textbf{\color{#35bf28}+8.53\%}$
test_sac_speed[True-backward] 3.9647ms 3.5231ms 283.8443 Ops/s 262.0167 Ops/s $\textbf{\color{#35bf28}+8.33\%}$
test_sac_speed[reduce-overhead-None] 20.7554ms 11.6209ms 86.0519 Ops/s 85.6929 Ops/s $\color{#35bf28}+0.42\%$
test_sac_speed[reduce-overhead-backward] 1.7084ms 1.5772ms 634.0514 Ops/s 573.8666 Ops/s $\textbf{\color{#35bf28}+10.49\%}$
test_redq_speed[False-None] 7.8825ms 7.3961ms 135.2066 Ops/s 129.7302 Ops/s $\color{#35bf28}+4.22\%$
test_redq_speed[False-backward] 11.7649ms 11.2881ms 88.5892 Ops/s 83.9673 Ops/s $\textbf{\color{#35bf28}+5.50\%}$
test_redq_speed[True-None] 2.4352ms 2.2731ms 439.9305 Ops/s 424.3942 Ops/s $\color{#35bf28}+3.66\%$
test_redq_speed[True-backward] 4.3292ms 4.1812ms 239.1673 Ops/s 245.0017 Ops/s $\color{#d91a1a}-2.38\%$
test_redq_speed[reduce-overhead-None] 2.4961ms 2.3040ms 434.0299 Ops/s 423.9262 Ops/s $\color{#35bf28}+2.38\%$
test_redq_speed[reduce-overhead-backward] 4.3596ms 4.1389ms 241.6104 Ops/s 243.9321 Ops/s $\color{#d91a1a}-0.95\%$
test_redq_deprec_speed[False-None] 9.6597ms 9.1196ms 109.6543 Ops/s 109.7131 Ops/s $\color{#d91a1a}-0.05\%$
test_redq_deprec_speed[False-backward] 12.7282ms 12.0990ms 82.6515 Ops/s 82.4180 Ops/s $\color{#35bf28}+0.28\%$
test_redq_deprec_speed[True-None] 2.8107ms 2.5715ms 388.8710 Ops/s 375.2279 Ops/s $\color{#35bf28}+3.64\%$
test_redq_deprec_speed[True-backward] 4.8824ms 4.4314ms 225.6635 Ops/s 226.1894 Ops/s $\color{#d91a1a}-0.23\%$
test_redq_deprec_speed[reduce-overhead-None] 2.9412ms 2.6127ms 382.7514 Ops/s 380.0548 Ops/s $\color{#35bf28}+0.71\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.7041ms 4.3996ms 227.2951 Ops/s 228.4678 Ops/s $\color{#d91a1a}-0.51\%$
test_td3_speed[False-None] 7.7906ms 7.7376ms 129.2390 Ops/s 124.5762 Ops/s $\color{#35bf28}+3.74\%$
test_td3_speed[False-backward] 10.8313ms 10.3362ms 96.7478 Ops/s 96.5950 Ops/s $\color{#35bf28}+0.16\%$
test_td3_speed[True-None] 1.6259ms 1.5933ms 627.6418 Ops/s 613.8950 Ops/s $\color{#35bf28}+2.24\%$
test_td3_speed[True-backward] 3.9762ms 3.3402ms 299.3848 Ops/s 311.3395 Ops/s $\color{#d91a1a}-3.84\%$
test_td3_speed[reduce-overhead-None] 75.0144ms 25.1543ms 39.7547 Ops/s 39.8420 Ops/s $\color{#d91a1a}-0.22\%$
test_td3_speed[reduce-overhead-backward] 1.3945ms 1.3072ms 764.9883 Ops/s 756.0156 Ops/s $\color{#35bf28}+1.19\%$
test_cql_speed[False-None] 16.7981ms 16.1684ms 61.8489 Ops/s 59.2597 Ops/s $\color{#35bf28}+4.37\%$
test_cql_speed[False-backward] 21.7987ms 21.3459ms 46.8475 Ops/s 45.0872 Ops/s $\color{#35bf28}+3.90\%$
test_cql_speed[True-None] 3.3231ms 3.1576ms 316.6988 Ops/s 308.4208 Ops/s $\color{#35bf28}+2.68\%$
test_cql_speed[True-backward] 5.6330ms 5.3726ms 186.1287 Ops/s 181.5050 Ops/s $\color{#35bf28}+2.55\%$
test_cql_speed[reduce-overhead-None] 0.5940s 15.9851ms 62.5581 Ops/s 77.6590 Ops/s $\textbf{\color{#d91a1a}-19.45\%}$
test_cql_speed[reduce-overhead-backward] 2.1036ms 1.9558ms 511.2986 Ops/s 509.9643 Ops/s $\color{#35bf28}+0.26\%$
test_a2c_speed[False-None] 3.2606ms 3.0580ms 327.0101 Ops/s 310.7921 Ops/s $\textbf{\color{#35bf28}+5.22\%}$
test_a2c_speed[False-backward] 6.8367ms 6.2043ms 161.1796 Ops/s 155.0493 Ops/s $\color{#35bf28}+3.95\%$
test_a2c_speed[True-None] 1.5017ms 1.3099ms 763.4401 Ops/s 735.9648 Ops/s $\color{#35bf28}+3.73\%$
test_a2c_speed[True-backward] 3.1871ms 3.0115ms 332.0565 Ops/s 318.4276 Ops/s $\color{#35bf28}+4.28\%$
test_a2c_speed[reduce-overhead-None] 15.4746ms 8.7634ms 114.1103 Ops/s 111.9677 Ops/s $\color{#35bf28}+1.91\%$
test_a2c_speed[reduce-overhead-backward] 1.7143ms 1.5723ms 636.0209 Ops/s 606.8698 Ops/s $\color{#35bf28}+4.80\%$
test_ppo_speed[False-None] 3.7289ms 3.5542ms 281.3545 Ops/s 260.1643 Ops/s $\textbf{\color{#35bf28}+8.14\%}$
test_ppo_speed[False-backward] 7.6829ms 6.9064ms 144.7931 Ops/s 138.4782 Ops/s $\color{#35bf28}+4.56\%$
test_ppo_speed[True-None] 1.5459ms 1.3759ms 726.8054 Ops/s 696.2959 Ops/s $\color{#35bf28}+4.38\%$
test_ppo_speed[True-backward] 3.3336ms 3.1833ms 314.1413 Ops/s 303.8700 Ops/s $\color{#35bf28}+3.38\%$
test_ppo_speed[reduce-overhead-None] 1.1103ms 0.9354ms 1.0691 KOps/s 1.0480 KOps/s $\color{#35bf28}+2.01\%$
test_ppo_speed[reduce-overhead-backward] 1.5368ms 1.3905ms 719.1416 Ops/s 618.9391 Ops/s $\textbf{\color{#35bf28}+16.19\%}$
test_reinforce_speed[False-None] 2.4178ms 2.1964ms 455.2971 Ops/s 425.2103 Ops/s $\textbf{\color{#35bf28}+7.08\%}$
test_reinforce_speed[False-backward] 3.6766ms 3.2181ms 310.7398 Ops/s 287.8369 Ops/s $\textbf{\color{#35bf28}+7.96\%}$
test_reinforce_speed[True-None] 1.5025ms 1.2483ms 801.0867 Ops/s 774.4212 Ops/s $\color{#35bf28}+3.44\%$
test_reinforce_speed[True-backward] 3.0372ms 2.8820ms 346.9813 Ops/s 325.8582 Ops/s $\textbf{\color{#35bf28}+6.48\%}$
test_reinforce_speed[reduce-overhead-None] 16.9125ms 9.4851ms 105.4290 Ops/s 98.7109 Ops/s $\textbf{\color{#35bf28}+6.81\%}$
test_reinforce_speed[reduce-overhead-backward] 1.5225ms 1.4625ms 683.7693 Ops/s 603.2428 Ops/s $\textbf{\color{#35bf28}+13.35\%}$
test_iql_speed[False-None] 9.4117ms 8.9583ms 111.6288 Ops/s 106.7023 Ops/s $\color{#35bf28}+4.62\%$
test_iql_speed[False-backward] 13.1057ms 12.5781ms 79.5035 Ops/s 74.3321 Ops/s $\textbf{\color{#35bf28}+6.96\%}$
test_iql_speed[True-None] 2.4217ms 2.1464ms 465.8933 Ops/s 444.3379 Ops/s $\color{#35bf28}+4.85\%$
test_iql_speed[True-backward] 4.9779ms 4.6700ms 214.1322 Ops/s 196.5683 Ops/s $\textbf{\color{#35bf28}+8.94\%}$
test_iql_speed[reduce-overhead-None] 0.5278s 12.8658ms 77.7256 Ops/s 90.1260 Ops/s $\textbf{\color{#d91a1a}-13.76\%}$
test_iql_speed[reduce-overhead-backward] 2.0026ms 1.8616ms 537.1584 Ops/s 481.0149 Ops/s $\textbf{\color{#35bf28}+11.67\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.3235ms 6.0251ms 165.9722 Ops/s 164.3065 Ops/s $\color{#35bf28}+1.01\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7977ms 0.2732ms 3.6607 KOps/s 2.8187 KOps/s $\textbf{\color{#35bf28}+29.87\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5254ms 0.2995ms 3.3392 KOps/s 2.9662 KOps/s $\textbf{\color{#35bf28}+12.58\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2752ms 5.7224ms 174.7515 Ops/s 171.4722 Ops/s $\color{#35bf28}+1.91\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7907ms 0.3562ms 2.8071 KOps/s 3.5511 KOps/s $\textbf{\color{#d91a1a}-20.95\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5738ms 0.3396ms 2.9447 KOps/s 3.0608 KOps/s $\color{#d91a1a}-3.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6896ms 1.3580ms 736.3657 Ops/s 693.8917 Ops/s $\textbf{\color{#35bf28}+6.12\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5613ms 1.3032ms 767.3488 Ops/s 743.7979 Ops/s $\color{#35bf28}+3.17\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1383ms 5.8747ms 170.2206 Ops/s 165.6833 Ops/s $\color{#35bf28}+2.74\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0195ms 0.4430ms 2.2573 KOps/s 2.3733 KOps/s $\color{#d91a1a}-4.89\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7918ms 0.4396ms 2.2748 KOps/s 2.4826 KOps/s $\textbf{\color{#d91a1a}-8.37\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 9.3955ms 5.8055ms 172.2506 Ops/s 170.4100 Ops/s $\color{#35bf28}+1.08\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1630ms 0.3146ms 3.1782 KOps/s 3.3473 KOps/s $\textbf{\color{#d91a1a}-5.05\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5215ms 0.2941ms 3.4001 KOps/s 2.7944 KOps/s $\textbf{\color{#35bf28}+21.68\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9966ms 5.7096ms 175.1426 Ops/s 171.9979 Ops/s $\color{#35bf28}+1.83\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0043ms 0.3068ms 3.2594 KOps/s 3.7015 KOps/s $\textbf{\color{#d91a1a}-11.95\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6960ms 0.3166ms 3.1586 KOps/s 4.0428 KOps/s $\textbf{\color{#d91a1a}-21.87\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0687ms 5.8891ms 169.8048 Ops/s 166.1015 Ops/s $\color{#35bf28}+2.23\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1876ms 0.4104ms 2.4369 KOps/s 2.3598 KOps/s $\color{#35bf28}+3.27\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5854ms 0.3899ms 2.5647 KOps/s 2.3278 KOps/s $\textbf{\color{#35bf28}+10.18\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9634ms 5.3968ms 185.2959 Ops/s 179.7391 Ops/s $\color{#35bf28}+3.09\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.8599ms 1.9362ms 516.4656 Ops/s 434.2661 Ops/s $\textbf{\color{#35bf28}+18.93\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.6054ms 1.2200ms 819.6939 Ops/s 837.9839 Ops/s $\color{#d91a1a}-2.18\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.7827ms 5.5109ms 181.4585 Ops/s 179.1497 Ops/s $\color{#35bf28}+1.29\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.3582ms 2.0277ms 493.1642 Ops/s 423.5486 Ops/s $\textbf{\color{#35bf28}+16.44\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.5505ms 1.2366ms 808.6511 Ops/s 829.9937 Ops/s $\color{#d91a1a}-2.57\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5198s 15.9589ms 62.6609 Ops/s 29.5766 Ops/s $\textbf{\color{#35bf28}+111.86\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.3916ms 2.2195ms 450.5452 Ops/s 440.0974 Ops/s $\color{#35bf28}+2.37\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.2703ms 1.3580ms 736.3842 Ops/s 735.7536 Ops/s $\color{#35bf28}+0.09\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4082ms 13.2028ms 75.7413 Ops/s 73.7847 Ops/s $\color{#35bf28}+2.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.9237ms 17.1115ms 58.4401 Ops/s 58.6341 Ops/s $\color{#d91a1a}-0.33\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.6650ms 18.2065ms 54.9254 Ops/s 54.5266 Ops/s $\color{#35bf28}+0.73\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.0426ms 17.3415ms 57.6650 Ops/s 56.9436 Ops/s $\color{#35bf28}+1.27\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.6673ms 17.9951ms 55.5706 Ops/s 54.7980 Ops/s $\color{#35bf28}+1.41\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.7542ms 18.1176ms 55.1948 Ops/s 53.9711 Ops/s $\color{#35bf28}+2.27\%$

@vmoens vmoens closed this May 14, 2025
@vmoens vmoens deleted the gh/vmoens/111/head branch May 14, 2025 09:12

Labels

CI: Has to do with CI setup (e.g. wheels & builds, tests...)
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.


3 participants