-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ty] implement TypedDict structural assignment
#21467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Diagnostic diff on typing conformance testsChanges were detected when running ty on typing conformance tests--- old-output.txt 2025-11-19 21:06:40.475992259 +0000
+++ new-output.txt 2025-11-19 21:06:43.942997809 +0000
@@ -954,11 +954,15 @@
typeddicts_extra_items.py:285:43: error[invalid-key] Unknown key "language" for TypedDict `ExtraMovie`: Unknown key "language"
typeddicts_extra_items.py:293:44: error[invalid-key] Unknown key "year" for TypedDict `ClosedMovie`: Unknown key "year"
typeddicts_extra_items.py:299:54: error[invalid-key] Unknown key "summary" for TypedDict `MovieExtraStr`: Unknown key "summary"
+typeddicts_extra_items.py:300:34: error[invalid-assignment] Object of type `MovieExtraStr` is not assignable to `Mapping[str, str]`
typeddicts_extra_items.py:302:54: error[invalid-key] Unknown key "year" for TypedDict `MovieExtraInt`: Unknown key "year"
+typeddicts_extra_items.py:303:34: error[invalid-assignment] Object of type `MovieExtraInt` is not assignable to `Mapping[str, int]`
+typeddicts_extra_items.py:304:44: error[invalid-assignment] Object of type `MovieExtraInt` is not assignable to `Mapping[str, int | str]`
typeddicts_extra_items.py:310:5: error[type-assertion-failure] Type `list[tuple[str, int | str]]` does not match asserted type `list[tuple[str, object]]`
typeddicts_extra_items.py:311:5: error[type-assertion-failure] Type `list[int | str]` does not match asserted type `list[object]`
-typeddicts_extra_items.py:327:5: error[unresolved-attribute] Object of type `IntDict` has no attribute `clear`
+typeddicts_extra_items.py:326:25: error[invalid-assignment] Object of type `IntDict` is not assignable to `dict[str, int]`
typeddicts_extra_items.py:329:52: error[invalid-key] Unknown key "bar" for TypedDict `IntDictWithNum` - did you mean "num"?
+typeddicts_extra_items.py:330:32: error[invalid-assignment] Object of type `IntDictWithNum` is not assignable to `dict[str, int]`
typeddicts_extra_items.py:337:1: error[unresolved-attribute] Object of type `IntDictWithNum` has no attribute `clear`
typeddicts_extra_items.py:339:1: error[type-assertion-failure] Type `tuple[str, int]` does not match asserted type `Unknown`
typeddicts_extra_items.py:339:13: error[unresolved-attribute] Object of type `IntDictWithNum` has no attribute `popitem`
@@ -980,12 +984,26 @@
typeddicts_readonly.py:51:4: error[invalid-assignment] Cannot assign to key "year" on TypedDict `Movie1`: key is marked read-only
typeddicts_readonly.py:60:4: error[invalid-assignment] Cannot assign to key "title" on TypedDict `Movie2`: key is marked read-only
typeddicts_readonly.py:61:4: error[invalid-assignment] Cannot assign to key "year" on TypedDict `Movie2`: key is marked read-only
+typeddicts_readonly_consistency.py:37:14: error[invalid-assignment] Object of type `A1` is not assignable to `B1`
+typeddicts_readonly_consistency.py:38:14: error[invalid-assignment] Object of type `C1` is not assignable to `B1`
+typeddicts_readonly_consistency.py:40:14: error[invalid-assignment] Object of type `A1` is not assignable to `C1`
+typeddicts_readonly_consistency.py:81:14: error[invalid-assignment] Object of type `A2` is not assignable to `B2`
+typeddicts_readonly_consistency.py:82:14: error[invalid-assignment] Object of type `C2` is not assignable to `B2`
+typeddicts_readonly_consistency.py:84:14: error[invalid-assignment] Object of type `A2` is not assignable to `C2`
+typeddicts_readonly_consistency.py:85:14: error[invalid-assignment] Object of type `B2` is not assignable to `C2`
typeddicts_readonly_inheritance.py:36:4: error[invalid-assignment] Cannot assign to key "name" on TypedDict `Album2`: key is marked read-only
typeddicts_readonly_inheritance.py:65:19: error[missing-typed-dict-key] Missing required key 'name' in TypedDict `RequiredName` constructor
typeddicts_readonly_inheritance.py:82:14: error[invalid-assignment] Invalid assignment to key "ident" with declared type `str` on TypedDict `User`: value of type `Literal[3]`
typeddicts_readonly_inheritance.py:83:15: error[invalid-argument-type] Invalid argument to key "ident" with declared type `str` on TypedDict `User`: value of type `Literal[3]`
typeddicts_readonly_inheritance.py:84:5: error[missing-typed-dict-key] Missing required key 'ident' in TypedDict `User` constructor
+typeddicts_type_consistency.py:21:10: error[invalid-assignment] Object of type `B1` is not assignable to `A1`
+typeddicts_type_consistency.py:38:10: error[invalid-assignment] Object of type `B2` is not assignable to `A2`
+typeddicts_type_consistency.py:65:6: error[invalid-assignment] Object of type `A3` is not assignable to `B3`
typeddicts_type_consistency.py:69:21: error[invalid-key] Unknown key "y" for TypedDict `A3`: Unknown key "y"
+typeddicts_type_consistency.py:76:22: error[invalid-assignment] Object of type `B3` is not assignable to `dict[str, int]`
+typeddicts_type_consistency.py:77:25: error[invalid-assignment] Object of type `B3` is not assignable to `dict[str, object]`
+typeddicts_type_consistency.py:78:22: error[invalid-assignment] Object of type `B3` is not assignable to `dict[Any, Any]`
+typeddicts_type_consistency.py:82:25: error[invalid-assignment] Object of type `B3` is not assignable to `Mapping[str, int]`
typeddicts_type_consistency.py:101:14: error[invalid-assignment] Object of type `Unknown | None` is not assignable to `str`
typeddicts_type_consistency.py:126:56: error[invalid-argument-type] Invalid argument to key "inner_key" with declared type `str` on TypedDict `Inner1`: value of type `Literal[1]`
typeddicts_usage.py:23:7: error[invalid-key] Unknown key "director" for TypedDict `Movie`: Unknown key "director"
@@ -993,5 +1011,5 @@
typeddicts_usage.py:28:17: error[missing-typed-dict-key] Missing required key 'name' in TypedDict `Movie` constructor
typeddicts_usage.py:28:18: error[invalid-key] Unknown key "title" for TypedDict `Movie`: Unknown key "title"
typeddicts_usage.py:40:24: error[invalid-type-form] The special form `typing.TypedDict` is not allowed in type expressions. Did you mean to use a concrete TypedDict or `collections.abc.Mapping[str, object]` instead?
-Found 995 diagnostics
+Found 1013 diagnostics
WARN A fatal error occurred while checking some files. Not all project files were analyzed. See the diagnostics list above for details.
|
|
|
||
| class Person(TypedDict): | ||
| name: str | ||
| phone_number: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test wants to look at the diagnostics we print for TypedDicts in a union, but previously Person was a subtype of Animal, so this PR made the union disappear. Adding phone_number breaks the subtyping relationship, so the union remains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually if we end up going with "don't generally simplify Unions of TypedDicts", I can put this test back the way it was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better for the test to pass whichever way we go on that, so this change seems good to me :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I expect if we land more comprehensive cycle-panic avoidance, we might want to go back to simplifying typed dicts in unions.
|
| if self_item_field.is_required() { | ||
| // A required field can't be assigned to a not-required, mutable field | ||
| // in the target, because `del` is allowed on the target field. | ||
| return ConstraintSet::from(false); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This rule was present in historical versions of the typing spec, but it's missing from the current version. It is mentioned and tested in the conformance suite. After this lands I'll open a PR to the typing docs upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
invalid-argument-type |
94 | 0 | 1 |
no-matching-overload |
23 | 0 | 0 |
unused-ignore-comment |
0 | 23 | 0 |
invalid-assignment |
7 | 6 | 1 |
invalid-return-type |
1 | 3 | 1 |
invalid-key |
0 | 4 | 0 |
type-assertion-failure |
0 | 2 | 0 |
unsupported-operator |
2 | 0 | 0 |
| Total | 127 | 38 | 3 |
|
It seems pretty common in the ecosystem report to want to pass a |
If there are cases like this showing up a lot in the ecosystem, can we just verify quick that we aren't being more strict than other type checkers? If not, then these are just true positives (at least as far as the specified type system is concerned) in the ecosystem that we are catching, great. If so, maybe other type checkers are implementing some kind of pragmatic compromise that we should at least consider. |
AlexWaygood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!! I haven't looked through the whole PR yet, but spotted one significant thing
It looks like there are a few new false positives on the typing conformance suite file I guess this is just because we don't support Other than the lines listed above, all other new diagnostics on the conformance suite look like true positives to me -- great job! |
|
Unfortunately it looks like there is a huge slowdown on the pydantic benchmark on this branch, which is causing it to timeout in CI currently. On my macbook locally, the benchmark completes in 2.161s on I suggest tackling some of the correctness issues I pointed out in #21467 (comment) and #21467 (comment) before looking at this too much. There's always a chance that fixing one/both of those will "miraculously" fix the regression 😆 |
Ah, this turned out to be exactly the issue that @AlexWaygood caught above: I was being too strict and making |
Yes I think all of the new errors-that-shouldn't-be-there are cases where |
b2d6f35 to
bc09e93
Compare
Alas, no. Digging into it. |
|
I don't have an answer yet, but an example of a specific file that's giving us trouble (which might be infecting a lot of other files, not sure) is |
|
Something you could try would be to add Salsa caching to The cost of doing this would be an increased memory usage, but that might be worth it to avoid a 30x execution time increase and benchmarks that no longer complete in CI... worth experimenting with, anyway. |
bc09e93 to
df183be
Compare
| KnownClass::Object.to_instance(db).when_assignable_to( | ||
| db, | ||
| target_item_field.declared_ty, | ||
| inferable, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slightly more efficient, and less verbose:
| KnownClass::Object.to_instance(db).when_assignable_to( | |
| db, | |
| target_item_field.declared_ty, | |
| inferable, | |
| ) | |
| Type::object().when_assignable_to( | |
| db, | |
| target_item_field.declared_ty, | |
| inferable, | |
| ) |
also, is really correct that we should use when_assignable_to even if relation == TypeRelation::Subtyping? Should it be this instead?
| KnownClass::Object.to_instance(db).when_assignable_to( | |
| db, | |
| target_item_field.declared_ty, | |
| inferable, | |
| ) | |
| Type::object().has_relation_to_impl( | |
| db, | |
| target_item_field.declared_ty, | |
| inferable, | |
| relation, | |
| relation_visitor, | |
| disjointness_visitor, | |
| ) |
(no tests fail if I make that change, so it looks like we might be missing some test coverage here either way ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird: committing the first suggestion resolves the whole comment, even though the second suggestion is pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's an example of something I can do in Python that demands the "subtyping" relation, as opposed to the "assignability" relation? (Besides ty_extensions.is_subtype_of I guess.) I get that they're going to be the same for fully-static types, so I know part of the answer is going to involve Any or similar, but I'm afraid I've gotten this far without actually knowing what subtyping per se is for :-D
(Edit: I went ahead and asked this in chat.)
|
Hmm, that last change fixed my "minimal repro" but not the actual Pydantic check. Time to minimize again? |
|
New minimal repro. Seems sensitive to the order of appearance of different types within the union too. from typing import Union, TypedDict
class Foo1(TypedDict):
x1: MyUnion
class Foo2(TypedDict):
x2: MyUnion
class Foo3(Foo2):
x3: MyUnion
class Foo4(TypedDict):
x4: MyUnion
MyUnion = Foo1 | Foo2 | Foo3 | Foo4Presumably some combinatorial explosion of calls that's dodging cycle detection? Digging into it... |
If the error is "too many cycle iterations", then the problem is not dodging cycle detection. That error occurs when Salsa has detected the cycle, attempted to resolve it via fixpoint iteration, but then fixpoint iteration doesn't converge to a stable value. Often this is because we try to eagerly build an ever-more-deeply-recursively-nested type on each iteration, or because iteration flip-flops indefinitely between one value and another. I haven't looked at this case in detail yet, but my suspicion is that the problem is the union simplification. That is, when we try to build the union type for |
|
The way I try to debug those issues is to add a |
I can't reproduce a panic on your branch for this snippet, but I can if I add EDIT: ah, and I was explicitly passing |
|
I tried out this change locally: Patchdiff --git a/crates/ty_python_semantic/src/types/class.rs b/crates/ty_python_semantic/src/types/class.rs
index 17b5fe7625..109a44e159 100644
--- a/crates/ty_python_semantic/src/types/class.rs
+++ b/crates/ty_python_semantic/src/types/class.rs
@@ -126,16 +126,6 @@ fn try_metaclass_cycle_initial<'db>(
})
}
-fn fields_cycle_initial<'db>(
- _db: &'db dyn Db,
- _id: salsa::Id,
- _self: ClassLiteral<'db>,
- _specialization: Option<Specialization<'db>>,
- _field_policy: CodeGeneratorKind<'db>,
-) -> FxIndexMap<Name, Field<'db>> {
- FxIndexMap::default()
-}
-
/// A category of classes with code generation capabilities (with synthesized methods).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, salsa::Update, get_size2::GetSize)]
pub(crate) enum CodeGeneratorKind<'db> {
@@ -2339,7 +2329,7 @@ impl<'db> ClassLiteral<'db> {
// Use the alias name if provided, otherwise use the field name
let parameter_name =
- Name::new(alias.map(|alias| &**alias).unwrap_or(&**field_name));
+ Name::new(alias.map(|alias| &**alias).unwrap_or(&*field_name));
let mut parameter = if is_kw_only {
Parameter::keyword_only(parameter_name)
@@ -2603,8 +2593,8 @@ impl<'db> ClassLiteral<'db> {
)))
}
(CodeGeneratorKind::TypedDict, "get") => {
- let overloads = self
- .fields(db, specialization, field_policy)
+ let fields = self.fields(db, specialization, field_policy);
+ let overloads = fields
.iter()
.flat_map(|(name, field)| {
let key_type =
@@ -2834,7 +2824,6 @@ impl<'db> ClassLiteral<'db> {
/// Returns a list of all annotated attributes defined in this class, or any of its superclasses.
///
/// See [`ClassLiteral::own_fields`] for more details.
- #[salsa::tracked(returns(ref), cycle_initial=fields_cycle_initial, heap_size=get_size2::GetSize::get_heap_size)]
pub(crate) fn fields(
self,
db: &'db dyn Db,
diff --git a/crates/ty_python_semantic/src/types/diagnostic.rs b/crates/ty_python_semantic/src/types/diagnostic.rs
index ed8b1be1ab..1542da24ac 100644
--- a/crates/ty_python_semantic/src/types/diagnostic.rs
+++ b/crates/ty_python_semantic/src/types/diagnostic.rs
@@ -35,7 +35,7 @@ use ruff_python_ast::parenthesize::parentheses_iterator;
use ruff_python_ast::{self as ast, AnyNodeRef, Identifier};
use ruff_python_trivia::CommentRanges;
use ruff_text_size::{Ranged, TextRange};
-use rustc_hash::FxHashSet;
+use rustc_hash::{FxHashMap, FxHashSet};
use std::fmt::Formatter;
/// Registers all known type check lints.
@@ -3190,7 +3190,7 @@ pub(crate) fn report_invalid_key_on_typed_dict<'db>(
typed_dict_ty: Type<'db>,
full_object_ty: Option<Type<'db>>,
key_ty: Type<'db>,
- items: &FxIndexMap<Name, Field<'db>>,
+ items: &FxHashMap<Name, Field<'db>>,
) {
let db = context.db();
if let Some(builder) = context.report_lint(&INVALID_KEY, key_node) {
diff --git a/crates/ty_python_semantic/src/types/infer/builder.rs b/crates/ty_python_semantic/src/types/infer/builder.rs
index 85c645d37a..90bbb33ea9 100644
--- a/crates/ty_python_semantic/src/types/infer/builder.rs
+++ b/crates/ty_python_semantic/src/types/infer/builder.rs
@@ -919,9 +919,9 @@ impl<'db, 'ast> TypeInferenceBuilder<'db, 'ast> {
CodeGeneratorKind::from_class(self.db(), class, None)
{
let specialization = None;
+ let fields = class.fields(self.db(), specialization, field_policy);
- let kw_only_sentinel_fields: Vec<_> = class
- .fields(self.db(), specialization, field_policy)
+ let kw_only_sentinel_fields: Vec<_> = fields
.iter()
.filter_map(|(name, field)| {
field.is_kw_only_sentinel(self.db()).then_some(name)
diff --git a/crates/ty_python_semantic/src/types/typed_dict.rs b/crates/ty_python_semantic/src/types/typed_dict.rs
index c093dbef25..8e94a4fb1f 100644
--- a/crates/ty_python_semantic/src/types/typed_dict.rs
+++ b/crates/ty_python_semantic/src/types/typed_dict.rs
@@ -4,6 +4,7 @@ use ruff_db::parsed::parsed_module;
use ruff_python_ast::Arguments;
use ruff_python_ast::{self as ast, AnyNodeRef, StmtClassDef, name::Name};
use ruff_text_size::Ranged;
+use rustc_hash::FxHashMap;
use super::class::{ClassType, CodeGeneratorKind, Field};
use super::context::InferContext;
@@ -12,10 +13,10 @@ use super::diagnostic::{
report_missing_typed_dict_key,
};
use super::{ApplyTypeMappingVisitor, Type, TypeMapping, visitor};
+use crate::Db;
use crate::types::constraints::ConstraintSet;
use crate::types::generics::InferableTypeVars;
use crate::types::{HasRelationToVisitor, IsDisjointVisitor, TypeContext, TypeRelation};
-use crate::{Db, FxIndexMap};
use ordermap::OrderSet;
@@ -56,9 +57,25 @@ impl<'db> TypedDictType<'db> {
self.defining_class
}
- pub(crate) fn items(self, db: &'db dyn Db) -> &'db FxIndexMap<Name, Field<'db>> {
- let (class_literal, specialization) = self.defining_class.class_literal(db);
- class_literal.fields(db, specialization, CodeGeneratorKind::TypedDict)
+ pub(crate) fn items(self, db: &'db dyn Db) -> &'db FxHashMap<Name, Field<'db>> {
+ #[salsa::tracked(returns(ref), cycle_initial=items_cycle_initial, heap_size=get_size2::GetSize::get_heap_size)]
+ fn items_inner<'db>(db: &'db dyn Db, class: ClassType<'db>) -> FxHashMap<Name, Field<'db>> {
+ let (class_literal, specialization) = class.class_literal(db);
+ class_literal
+ .fields(db, specialization, CodeGeneratorKind::TypedDict)
+ .into_iter()
+ .collect()
+ }
+
+ fn items_cycle_initial<'db>(
+ _db: &'db dyn Db,
+ _id: salsa::Id,
+ _class: ClassType<'db>,
+ ) -> FxHashMap<Name, Field<'db>> {
+ FxHashMap::default()
+ }
+
+ items_inner(db, self.defining_class)
}The idea of the change is to make it easier for the cycle to converge by returning an With that patch applied, we still panic on the repro from #21467 (comment) (providing you pass New query stacktraceNote that (with or without my patch above), when invoking Pydantic query stacktrace |
|
This patch gets us passing on both minimal repros so far (#21467 (comment) and #21467 (comment)), and seems probably worth doing on its own merits, since it'll make cycles rarer and avoid us having to engage in a full structual check for many Patchdiff --git a/crates/ty_python_semantic/src/types/class.rs b/crates/ty_python_semantic/src/types/class.rs
index 17b5fe7625..78beca70d1 100644
--- a/crates/ty_python_semantic/src/types/class.rs
+++ b/crates/ty_python_semantic/src/types/class.rs
@@ -566,7 +566,9 @@ impl<'db> ClassType<'db> {
},
// Protocol and Generic are not represented by a ClassType.
- ClassBase::Protocol | ClassBase::Generic => ConstraintSet::from(false),
+ ClassBase::Protocol | ClassBase::Generic | ClassBase::TypedDict => {
+ ConstraintSet::from(false)
+ }
ClassBase::Class(base) => match (base, other) {
(ClassType::NonGeneric(base), ClassType::NonGeneric(other)) => {
@@ -589,11 +591,6 @@ impl<'db> ClassType<'db> {
ConstraintSet::from(false)
}
},
-
- ClassBase::TypedDict => {
- // TODO: Implement subclassing and assignability for TypedDicts.
- ConstraintSet::from(true)
- }
}
})
}
diff --git a/crates/ty_python_semantic/src/types/infer/builder.rs b/crates/ty_python_semantic/src/types/infer/builder.rs
index 85c645d37a..449e770aff 100644
--- a/crates/ty_python_semantic/src/types/infer/builder.rs
+++ b/crates/ty_python_semantic/src/types/infer/builder.rs
@@ -8007,25 +8007,26 @@ impl<'db, 'ast> TypeInferenceBuilder<'db, 'ast> {
// the `try_call` path below.
// TODO: it should be possible to move these special cases into the `try_call_constructor`
// path instead, or even remove some entirely once we support overloads fully.
- let has_special_cased_constructor = matches!(
- class.known(self.db()),
- Some(
- KnownClass::Bool
- | KnownClass::Str
- | KnownClass::Type
- | KnownClass::Object
- | KnownClass::Property
- | KnownClass::Super
- | KnownClass::TypeAliasType
- | KnownClass::Deprecated
- )
- ) || (
- // Constructor calls to `tuple` and subclasses of `tuple` are handled in `Type::Bindings`,
- // but constructor calls to `tuple[int]`, `tuple[int, ...]`, `tuple[int, *tuple[str, ...]]` (etc.)
- // are handled by the default constructor-call logic (we synthesize a `__new__` method for them
- // in `ClassType::own_class_member()`).
- class.is_known(self.db(), KnownClass::Tuple) && !class.is_generic()
- );
+ let has_special_cased_constructor =
+ matches!(
+ class.known(self.db()),
+ Some(
+ KnownClass::Bool
+ | KnownClass::Str
+ | KnownClass::Type
+ | KnownClass::Object
+ | KnownClass::Property
+ | KnownClass::Super
+ | KnownClass::TypeAliasType
+ | KnownClass::Deprecated
+ )
+ ) || (
+ // Constructor calls to `tuple` and subclasses of `tuple` are handled in `Type::Bindings`,
+ // but constructor calls to `tuple[int]`, `tuple[int, ...]`, `tuple[int, *tuple[str, ...]]` (etc.)
+ // are handled by the default constructor-call logic (we synthesize a `__new__` method for them
+ // in `ClassType::own_class_member()`).
+ class.is_known(self.db(), KnownClass::Tuple) && !class.is_generic()
+ ) || CodeGeneratorKind::TypedDict.matches(self.db(), class.class_literal(self.db()).0, class.class_literal(self.db()).1);
// temporary special-casing for all subclasses of `enum.Enum`
// until we support the functional syntax for creating enum classes
diff --git a/crates/ty_python_semantic/src/types/typed_dict.rs b/crates/ty_python_semantic/src/types/typed_dict.rs
index c093dbef25..bafe798425 100644
--- a/crates/ty_python_semantic/src/types/typed_dict.rs
+++ b/crates/ty_python_semantic/src/types/typed_dict.rs
@@ -90,6 +90,16 @@ impl<'db> TypedDictType<'db> {
relation_visitor: &HasRelationToVisitor<'db>,
disjointness_visitor: &IsDisjointVisitor<'db>,
) -> ConstraintSet<'db> {
+ // First do a quick nominal check that (if it succeeds) means that we can avoid
+ // materializing the full `TypeDict` schema for either `self` or `target`.
+ // This should be cheaper in many cases, and also helps us avoid some cycles.
+ if self
+ .defining_class
+ .is_subclass_of(db, target.defining_class)
+ {
+ return ConstraintSet::from(true);
+ }
+
let self_items = self.items(db);
let target_items = target.items(db);
// Many rules violations short-circuit with "never", but asking whether one field isWe still panic when checking pydantic, even with that patch applied, however, so @oconnor663 will have to find a new minimal repro if he applies the patch ;) |
Co-authored-by: Alex Waygood <[email protected]>
Co-authored-by: Alex Waygood <[email protected]>
Co-authored-by: Alex Waygood <[email protected]>
Yes I should've said I'm just running all of these snippets out of |
|
Here's a new minimized repro (running as a standalone script out of from typing import TypedDict
class Foo1(TypedDict):
x1: MyUnion
class Foo2(TypedDict):
x2: MyUnion
class Foo3(Foo2):
pass
class Foo4(Foo2):
x3: MyUnion
class Foo5(TypedDict):
x5: MyUnion
MyUnion = Foo1 | Foo3 | Foo4 | Foo5It feels "pretty much the same" as the last one. What I'm noticing as I minimize these, is that there are a lot of "paths" I can take on the way down. I'll get a different result if I start ripping things out from the top vs from the bottom vs in the middle. Eventually I rip out something that removes the panic, so I put that thing back and roll the dice again on where to delete more stuff. So the end result is 1) one of many local minima I could wind up at and 2) maximally sensitive to the order of elements within the union. Which means that minor tweaks in how things are evaluated are likely to "fix" one of these minimal examples by coincidence without really fixing the underlying issue. I'm going to try to follow @carljm and @MichaReiser's advice above and play around with logging this some more, to see if I can get some more intuition about what's going on. |
|
Making some progress with printouts. (Finally figured out that a "cycle recovery function" is Notable that the union keeps collapsing down to a single TypedDict type. More staring required... |
|
One thing that we could consider here that might help is to just never consider two named |
|
I'll try that. Separately, I've been refining the printout above, and now I see that sometimes my |
|
Also Alex mentioned at one point that he hasn't been able to get |
d2f1d58 to
c4c95ef
Compare
fd7ac18 to
17e3bf0
Compare
|
I think I'm seeing CI runs for later commits getting cancelled in favor of earlier commits, which doesn't make sense to me. Known problem? My fault for force-pushing something? In any case, I've manually restarted https://github.com/astral-sh/ruff/actions/runs/19516380284/job/55870895275, and I think that's the one to watch. |
|
Ok, everything is passing, except that CodSpeed reports a ~10x slowdown on the Pydantic benchmark. I'm going to try to isolate what parts are costing us the most time, but at the same time I need feedback on whether this is "obviously a performance bug" or "maybe reasonable given how convoluted their unions are"? |
|
The magnitude of the regression is surprising to me. I definitely think we should explore how we can be more efficient here, but given that no other project regresses more than 1%, it's clearly a factor of the size and ubiquity of the typed dicts in pydantic; I wouldn't necessarily be opposed to going ahead and landing this and doing more optimization as a follow-up. |
carljm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work!
| class Amnesiac(TypedDict, total=False): | ||
| name: ReadOnly[str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also explicitly test the total=False but mutable field case?
| `B` can have any assignable type. But if the item in `A` is mutable, the item type in `B` must match | ||
| exactly. The required and not-required cases are different codepaths, so we need test all the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"match exactly" isn't a well defined term. "Equivalent" is, but I think that's not actually the right term here either; I think the right term here is "consistent". Which is the same thing as "mutually assignable".
The types int and Any, for example, do not "match exactly" (at least not as I would intuitively define that term; it isn't defined in the typing spec), nor are they equivalent (as defined in the typing spec), but they are mutually assignable, and they are "consistent" as defined in the spec (Any can materialize to int).
And this code does not have type errors:
from typing import Any, TypedDict
class A(TypedDict):
x: Any
class B(TypedDict):
x: int
def _(b: B) -> A:
return b
def _(a: A) -> B:
return aThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you got this all correct (and described well) in the comments in the actual implementation, so I think this is just tightening up the terminology here, and maybe adding some tests as mentioned below.
| # Not assignable. | ||
| # error: [invalid-assignment] "Object of type `Person` is not assignable to `Mapping[str, int]`" | ||
| _: Mapping[str, int] = alice | ||
| # TODO: Could be assignable once we support `closed=True` and/or `extra_items`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clearly we do want to support closed and extra_items at some point, but I don't think there is any TODO right at this spot. A TODO implies that our results on this code should be different, given better support. Person does not have closed=True or extra_items, so nothing we add support for in the future should change this error in this particular case.
If we added a test using a TypedDict that actually defines closed or extra_items and tested its assignability to a non-universal Mapping type (and got the wrong result for now), that would provide occasion for a TODO comment.
This seems like a good spot for an explanatory comment, but not for a TODO comment.
| # TODO: Could be assignable once we support `closed=True` and/or `extra_items`. | |
| # `Person` does not have `closed=True` or `extra_items`, so it may have additional keys with values | |
| # of unknown type, therefore it can't be assigned to a `Mapping` with value type smaller than `object`. |
| permutations: | ||
|
|
||
| ```py | ||
| from typing import Any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We import Any here, but it doesn't seem to be used in this block of tests.
I think we should add some tests with typeddicts with Any typed field(s), showing how that impacts subtyping and assignability. Assignability should be forgiving (as shown in the comment above); subtyping should be strict.
| reveal_type(e | e) # revealed: Employee | ||
|
|
||
| # TODO: Should be `Person` once we support subtyping for TypedDicts | ||
| # EXPERIMENT: Simplification of TypedDicts in Unions is disabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably keep this as a TODO:
| # EXPERIMENT: Simplification of TypedDicts in Unions is disabled. | |
| # TODO: Should be `Person`; simplifying TypedDicts in Unions is pending better cycle handling |
| // For mutable fields in the target, the relation needs to apply both | ||
| // ways, or else mutating the target could violate the structural | ||
| // invariants of self. For fully-static types, this is "equivalence". | ||
| // For gradual types, it depends on the relation, but mutual | ||
| // assignability is "consistency". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent comment!
| // Subtyping between `TypedDict`s follows the algorithm described at: | ||
| // https://typing.python.org/en/latest/spec/typeddict.html#subtyping-between-typeddict-types | ||
| pub(super) fn has_relation_to_impl( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really clear and well-documented implementation of some fairly hairy logic!
AlexWaygood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!!
Since Carl already reviewed, I haven't done an exhaustive review of the semantics here vis-a-vis the spec -- just a few notes from me.
I think we should definitely look at improving perf here as a (post-beta?) followup. Aside from anything else, it's frustrating that the benchmarks CI job now takes 5 minutes longer than it used to 😆 But I do agree that we should land this now and move on; there's too much else to do before the beta.
I don't think you have any tests currently for generic typeddicts. It looks like everything works correctly on your branch, but it would be great to test it explicitly; something like this?
from typing import TypedDict
from ty_extensions import static_assert, is_assignable_to, is_subtype_of
class F[T](TypedDict):
x: T
class G[T](TypedDict):
x: T
static_assert(is_assignable_to(F, G))
def x[T](a: T) -> T:
static_assert(is_subtype_of(F[T], G[T]))
return a
static_assert(is_subtype_of(F[int], G[int]))
static_assert(not is_assignable_to(F[int], G[str]))Some tests that use the legacy syntax (multiple-inheriting from TypedDict and Generic[T]) would be great as well.
I ran the property tests on this branch and didn't see any issues. Though I don't think we include any TypedDict types in the property tests right now, so that's not a massive surprise 😆. We should probably open a followup issue for that -- again, that should probably be done post-beta.
| def _(o1: Outer1, o2: Outer2): | ||
| _: Outer1 = o2 | ||
| _: Outer2 = o1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd personally find these more readable if we used the static_assert(is_assignable_to(...)) pattern here
| def _(o1: Outer1, o2: Outer2, o3: Outer3, o4: Outer4): | ||
| _: Outer1 = o3 | ||
| _: Outer1 = o4 | ||
|
|
||
| _: Outer2 = o3 | ||
| _: Outer2 = o4 | ||
|
|
||
| _: Outer3 = o1 | ||
| _: Outer3 = o2 | ||
| _: Outer3 = o3 | ||
| _: Outer3 = o4 | ||
|
|
||
| _: Outer4 = o1 | ||
| _: Outer4 = o2 | ||
| _: Outer4 = o3 | ||
| _: Outer4 = o4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
| /// | ||
| /// See [`ClassLiteral::own_fields`] for more details. | ||
| #[salsa::tracked(returns(ref), heap_size=get_size2::GetSize::get_heap_size)] | ||
| #[salsa::tracked(returns(ref), cycle_initial=fields_cycle_initial, heap_size=get_size2::GetSize::get_heap_size)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: quite a long line
| #[salsa::tracked(returns(ref), cycle_initial=fields_cycle_initial, heap_size=get_size2::GetSize::get_heap_size)] | |
| #[salsa::tracked( | |
| returns(ref), | |
| cycle_initial=fields_cycle_initial, | |
| heap_size=get_size2::GetSize::get_heap_size) | |
| ] |
| relation_visitor: &HasRelationToVisitor<'db>, | ||
| disjointness_visitor: &IsDisjointVisitor<'db>, | ||
| ) -> ConstraintSet<'db> { | ||
| // First do a quick nominal check that (if it succeeds) means that we can avoid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just talked with Alex about the 10x performance regression on pydantic here, caused by the large union of TypedDicts (many of which are probably not subtypes of each other). When we build that union, we need to perform O(n²) subtyping checks, so it's understandable that we see a huge regression now that we actually do nontrivial work in those checks. Other than the nominal check here (which is probably not super fast, actually), are there any other short circuit paths where we could return false early? The most common case might be something like the following where two "normal" TypedDicts (no extra_items or similar) are simply incompatible because their key names are not compatible:
class A(TypedDict):
key_a: int
class B(TypedDict):
key_b: intIs there something we can do by just looking at the names, without ever looking at the value types?
Closes astral-sh/ty#1387.