Description
What happened
I tried to open a Zarr store with xarray that I had created with CDO (and therefore the underlying netcdf-c library). xarray failed to open it with:
File .venv/lib/python3.12/site-packages/numcodecs/shuffle.py:31, in Shuffle._prepare_arrays(self, buf, out)
28 else:
29 out = ensure_contiguous_ndarray(out)
---> 31 if self.elementsize <= 1:
32 out.view(buf.dtype)[: len(buf)] = buf[:] # no shuffling needed
33 return buf, out
TypeError: '<=' not supported between instances of 'str' and 'int'
Minimal example
- Create sample dataset
import xarray as xr
import numpy as np

example_ds = xr.DataArray(
    np.random.rand(2, 100, 100) * 10,
    dims=["time", "lat", "lon"],
    coords={"time": np.arange(2), "lat": np.arange(100), "lon": np.arange(100)},
    name="data",
)
example_ds.to_zarr("example_ds.zarr", mode="w")
- Compress with cdo
cdo -z zip copy file://<full_path_to>/example_ds.zarr#mode=xarray file:///<full_path_to>/example_ds_cdo.zarr#mode=xarray
- Try loading with xarray
import xarray as xr

xr.open_zarr("example_ds_cdo.zarr")
The cause
Upon further inspection, the .zarray had the following JSON:
{ "zarr_format": 2,
"shape": [760],
"dtype": "<f8",
"chunks": [760],
"fill_value": null,
"order": "C",
"compressor": {
"id": "zlib",
"level": "1"
},
"filters": [{"id": "shuffle", "elementsize": "0"}]}
The elementsize filter parameter is serialized as a string ("0"), while numcodecs expects an integer.
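The failure can be reproduced without xarray or numcodecs. The minimal sketch below parses the filter entry from the .zarray above with Python's json module and then applies the same comparison that appears at line 31 of shuffle.py in the traceback. The variable names are illustrative only.

```python
import json

# Filter entry exactly as written in the .zarray shown above.
filter_spec = json.loads('{"id": "shuffle", "elementsize": "0"}')
elementsize = filter_spec["elementsize"]

# JSON keeps the quoting, so the value arrives as a Python str, not an int.
print(type(elementsize).__name__)

# Same comparison as `if self.elementsize <= 1:` in the traceback.
try:
    elementsize <= 1
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The error message captured here should match the TypeError reported by xarray.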
Potential solution
If I am not mistaken, the serialization happens at netcdf-c/plugins/NCZhdf5filters.c, line 90 in 947035a:
snprintf(json,sizeof(json),"{\"id\": \"%s\", \"elementsize\": \"%u\"}",NCZ_shuffle_codec.codecid,typesize);
The format string should presumably be changed to
"{\"id\": \"%s\", \"elementsize\": %u}"
to be compatible with numcodecs.
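To illustrate the difference between the two format strings, here is a small Python sketch that fills both with the same values and parses the result. It relies on Python's % operator accepting %u as an alias for %d. The values of codec_id and typesize are placeholders, not taken from netcdf-c.

```python
import json

codec_id = "shuffle"  # stands in for NCZ_shuffle_codec.codecid
typesize = 8          # hypothetical value, e.g. for a "<f8" array

# The two C format strings, unescaped for Python.
current_fmt = '{"id": "%s", "elementsize": "%u"}'   # quoted value, as at line 90
proposed_fmt = '{"id": "%s", "elementsize": %u}'    # bare value, as suggested

current = json.loads(current_fmt % (codec_id, typesize))
proposed = json.loads(proposed_fmt % (codec_id, typesize))

# The quoted form parses to a str, the bare form to an int.
print(type(current["elementsize"]).__name__, type(proposed["elementsize"]).__name__)
```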
Workaround
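When the .zarray is modified to {..."filters": [{"id": "shuffle", "elementsize": 0}]}, xarray opens the store without error, but the loaded values are incorrect. This is possibly because an elementsize of 0 makes numcodecs take the "no shuffling needed" branch shown in the traceback above, so the data is never unshuffled.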
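For stores with many arrays, the manual edit could be scripted. The following is only a sketch using the standard library: the helper names (coerce_codec, patch_zarray, patch_store) and the NUMERIC_KEYS set are hypothetical, and it assumes a Zarr v2 directory store where only "elementsize" and "level" are affected, as in the .zarray shown above. It fixes the types only and does not address the incorrect values.

```python
import json
from pathlib import Path

# Assumption: these are the only string-typed numeric parameters (as in this report).
NUMERIC_KEYS = {"elementsize", "level"}


def coerce_codec(codec):
    """Return a copy of a codec dict with digit-only strings under NUMERIC_KEYS as ints."""
    if not isinstance(codec, dict):
        return codec
    fixed = dict(codec)
    for key in NUMERIC_KEYS & fixed.keys():
        value = fixed[key]
        # Only non-negative digit strings are converted; anything else is left as-is.
        if isinstance(value, str) and value.isdigit():
            fixed[key] = int(value)
    return fixed


def patch_zarray(path):
    """Rewrite one .zarray file in place; return True if anything changed."""
    path = Path(path)
    meta = json.loads(path.read_text())
    patched = dict(meta)
    if isinstance(meta.get("compressor"), dict):
        patched["compressor"] = coerce_codec(meta["compressor"])
    if isinstance(meta.get("filters"), list):
        patched["filters"] = [coerce_codec(f) for f in meta["filters"]]
    if patched == meta:
        return False
    path.write_text(json.dumps(patched, indent=4))
    return True


def patch_store(store):
    """Patch every .zarray below a directory store; return the paths that changed."""
    return [str(p) for p in Path(store).rglob(".zarray") if patch_zarray(p)]
```

A call such as patch_store("example_ds_cdo.zarr") would then rewrite the affected files. A consolidated .zmetadata file, if present, is not touched by this sketch.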
Environment
netCDF 4.9.3-rc1
xarray: '2025.10.1'
numcodecs: '0.16.3'