@slevang slevang commented Nov 20, 2025

Implements this proposal: when an explicit string engine is passed and it exists in the standard list, the list_engines call is bypassed. This can shave up to a few seconds off the first open_dataset call, depending on the size of your environment.

The only side effect I can see is that it changes the error reported for missing engines. Now we get:

ImportError: The zarr package is required for working with Zarr stores but could not be imported. Please install it with your package manager (e.g. conda or pip).

Instead of:

ValueError: unrecognized engine 'zarr' must be one of your download engines: ['netcdf4', 'h5netcdf', 'scipy', 'cfgrib', 'gini', 'rasterio', 'store']. To install additional dependencies, see:
https://docs.xarray.dev/en/stable/user-guide/io.html 
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html

This also doesn't help for non-standard engines, e.g. open_dataset(..., engine="cfgrib"), but if speed is crucial you can pass the backend object itself in that case.
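For reference, passing the backend directly could look roughly like the sketch below. It assumes cfgrib is installed and that its xarray plugin class is importable as cfgrib.xarray_plugin.CfGribBackend; the helper name open_grib_fast is made up for illustration. The imports are kept inside the function so the snippet can be loaded even where cfgrib is absent.

```python
def open_grib_fast(path):
    """Open a GRIB file by handing xarray the backend class instead of a string.

    Hypothetical helper: open_dataset accepts a BackendEntrypoint subclass as
    `engine`, so the engine name does not have to be looked up.
    """
    import xarray as xr
    # Assumed import path for cfgrib's xarray backend class.
    from cfgrib.xarray_plugin import CfGribBackend

    return xr.open_dataset(path, engine=CfGribBackend)
```

Usage would be something like ds = open_grib_fast("forecast.grib"), where the file name is a placeholder.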


Development

Successfully merging this pull request may close these issues.

First invocation of open_dataset takes 3 seconds due to backend entrypoint discovery being slow
