Commit 0a2cebd
add CUDA 12.9 unit test (#3592)
Summary:
Pull Request resolved: #3592
# context
* previously CUDA 12.9 wasn't added to the unit tests due to fbgemm compatibility
* now that fbgemm has the support, we added it to our test suite.
* for PRs we only run CUDA 12.9 with Python 3.13
# issue fix
* the previous torchrec GitHub workflows for GPU unit tests were failing due to missing A100 support in fbgemm
```
>>> a=torch.empty(3, device='cuda').int()
>>> torch.ops.fbgemm.permute_2D_sparse_data(a,b,a)
Traceback (most recent call last):
File "<python-input-22>", line 1, in <module>
torch.ops.fbgemm.permute_2D_sparse_data(a,b,a)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
File "/home/hhy/.conda/envs/ci/lib/python3.13/site-packages/torch/_ops.py", line 1237, in __call__
return self._op(*args, **kwargs)
~~~~~~~~^^^^^^^^^^^^^^^^^
File "/home/hhy/.conda/envs/ci/lib/python3.13/site-packages/torch/_library/autograd.py", line 112, in autograd_impl
result = forward_no_grad(*args, Metadata(keyset, keyword_only_args))
File "/home/hhy/.conda/envs/ci/lib/python3.13/site-packages/torch/_library/autograd.py", line 41, in forward_no_grad
result = op.redispatch(keyset & _C._after_autograd_keyset, *args, **kwargs)
File "/home/hhy/.conda/envs/ci/lib/python3.13/site-packages/torch/_ops.py", line 822, in redispatch
return self._handle.redispatch_boxed(keyset, *args, **kwargs) # type: ignore[return-value]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: [/__w/FBGEMM/FBGEMM/pytorch/FBGEMM/fbgemm_gpu/src/sparse_ops/sparse_permute_2d.cu(113:98)] [(permute_2D_lengths_kernel<index_t>)] CUDA Error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```
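The "no kernel image is available" error means the installed binary does not ship kernels compiled for the GPU's compute architecture. A minimal sketch for checking which architecture you are running on (assuming torch is installed and a GPU is visible; note that `torch.cuda.get_arch_list()` reports torch's own build, not fbgemm-gpu's):
```
import torch

# An A100 reports compute capability (8, 0), i.e. sm80.
major, minor = torch.cuda.get_device_capability()
print(f"device: {torch.cuda.get_device_name()}, compute capability: sm{major}{minor}")

# Architectures the installed torch build ships kernels for (torch itself, not fbgemm-gpu).
print("torch arch list:", torch.cuda.get_arch_list())
```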
* env setup
```
$ conda create -yn fbgemm python=3.13
$ conda activate fbgemm
$ pip install torch --index-url https://download.pytorch.org/whl/nightly/cu128
$ pip install fbgemm-gpu --index-url https://download.pytorch.org/whl/nightly/cu128
$ python -i
```
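Optionally, a quick sanity check of the freshly created environment before running the repro (a rough sketch; the exact version strings depend on the nightly that was pulled):
```
import importlib.metadata
import torch

print("torch:", torch.__version__)            # expected: a cu128 nightly build
print("CUDA runtime:", torch.version.cuda)    # expected: 12.8
print("GPU visible:", torch.cuda.is_available())
print("fbgemm-gpu:", importlib.metadata.version("fbgemm-gpu"))
```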
* python cli
```
>>> import torch
>>> torch.ops.import_module("fbgemm_gpu.sparse_ops")
>>> a=torch.empty(3, device='cuda').int()
>>> b=torch.empty((3,3), device='cuda').long()
>>> torch.ops.fbgemm.permute_2D_sparse_data(a,b,a)
```
* resolved
> the issue is that you're running on an A100 and fbgemm removed sm80 from nightly earlier; it was recently added back. So if you uninstall fbgemm-gpu and re-install it from today's release, it should fix your issue.
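After re-installing fbgemm-gpu from the latest nightly (which restores the sm80/A100 kernels), the call from the repro should succeed. A minimal sketch with well-formed inputs, assuming the usual (permute, lengths, values) argument order for `torch.ops.fbgemm.permute_2D_sparse_data`:
```
import torch

torch.ops.import_module("fbgemm_gpu.sparse_ops")

# Reorder T=2 features across B=2 batches; values length must equal lengths.sum().
permute = torch.tensor([1, 0], device="cuda", dtype=torch.int32)
lengths = torch.tensor([[1, 2], [3, 4]], device="cuda", dtype=torch.int64)
values = torch.arange(int(lengths.sum()), device="cuda", dtype=torch.int64)

# With sm80 kernels present this returns permuted lengths/values instead of raising
# the "no kernel image is available" CUDA error.
print(torch.ops.fbgemm.permute_2D_sparse_data(permute, lengths, values))
```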
Reviewed By: aporialiao
Differential Revision: D88400551
fbshipit-source-id: 6da61d3841edcb3dcb51bbb8156822337be1ca78