-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Coerce IndexVariable to Variable when assigning to data variables or coordinates
#10909
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
IndexVariable to Variable when assigning to data variables or coordinates
IndexVariable to Variable when assigning to data variables or coordinatesIndexVariable to Variable when assigning to data variables or coordinates
af095e7 to
3370345
Compare
shoyer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thank you!
|
|
||
| def _assert_variable_invariants(var: Variable, name: Hashable = None): | ||
| def _assert_variable_invariants( | ||
| var: Variable | Any, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check now happens within this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with https://github.com/pydata/xarray/pull/10909/files#r2528970158, it may be better to keep the IndexVariable vs Variable check at the level of Dataset and/or DataArray invariants.
Actually, the right place should probably be _assert_indexes_invariants_checks, where we could add the check that non-index variables are not instances of IndexVariable.
| stacked.coords["variable"].drop_vars(["z", "variable", "y"]), | ||
| expected_stacked_variable, | ||
| stacked["variable"].variable.to_base_variable(), | ||
| expected_stacked_variable.variable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the only test change that was needed. Basically stacked["variable"].variable is an IndexVariable because it is represents a coord that is part of a MultiIndex. But you can't know that if you only have access to the DataArray with no coords. So I just switched it to be comparing variable and to coerce to the Variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah multi-indexes are where IndexVariable really bites us
e.g. #8887
Co-authored-by: Illviljan <[email protected]>
| for k, v in da._coords.items(): | ||
| _assert_variable_invariants(v, k) | ||
| _assert_variable_invariants( | ||
| v, k, check_default_indexes=check_default_indexes, is_index=k in da._indexes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| v, k, check_default_indexes=check_default_indexes, is_index=k in da._indexes | |
| v, k, is_index=k in da._indexes |
"check default indexes" doesn't mean anything for Variable, which is only data + named dims + attrs
| def _assert_variable_invariants( | ||
| var: Variable | Any, | ||
| name: Hashable = None, | ||
| check_default_indexes: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| check_default_indexes: bool = True, |
benbovy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jsignell!
Just added two small comments.
|
|
||
| def _assert_variable_invariants(var: Variable, name: Hashable = None): | ||
| def _assert_variable_invariants( | ||
| var: Variable | Any, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with https://github.com/pydata/xarray/pull/10909/files#r2528970158, it may be better to keep the IndexVariable vs Variable check at the level of Dataset and/or DataArray invariants.
Actually, the right place should probably be _assert_indexes_invariants_checks, where we could add the check that non-index variables are not instances of IndexVariable.
| def test_setitem_uses_base_variable_class_even_for_index_variables(self) -> None: | ||
| ds = Dataset(coords={"x": [1, 2, 3]}) | ||
| ds["y"] = ds["x"] | ||
| _assert_internal_invariants(ds, check_default_indexes=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def test_setitem_uses_base_variable_class_even_for_index_variables(self) -> None: | |
| ds = Dataset(coords={"x": [1, 2, 3]}) | |
| ds["y"] = ds["x"] | |
| _assert_internal_invariants(ds, check_default_indexes=True) | |
| def test_setitem_uses_base_variable_class_even_for_index_variables(self) -> None: | |
| ds = Dataset(coords={"x": [1, 2, 3]}) | |
| ds["y"] = ds["x"] | |
| # explicit check | |
| assert isintance(ds["x"].variable, IndexVariable) | |
| assert not isinstance(ds["y"].variable, IndexVariable) | |
| # test internal invariant checks when comparing the datasets | |
| expected = Dataset(coords={"x": [1, 2, 3], "y": ("x", [1, 2, 3])}) | |
| assert_identical(ds, expected) |
Nit suggestion.
whats-new.rstNew functions/methods are listed inapi.rst