[red-knot] Restructure attribute-access and descriptor-protocol test suites. (#16664)

## Summary

This is a pure restructuring of the `attributes.md` and
`descriptor_protocol.md` test suites. They have grown organically and I
didn't want to make major structural changes in my recent PR to keep the
diff clean.
This commit is contained in:
David Peter 2025-03-12 09:52:21 +01:00 committed by GitHub
parent 3228545598
commit 11b5cbcd2f
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 342 additions and 331 deletions

View file

@ -808,7 +808,7 @@ def _(flag1: bool, flag2: bool):
reveal_type(C5.attr1) # revealed: Unknown | Literal["metaclass value", "class value"]
```
## Union of attributes
## Unions of attributes
If the (meta)class is a union type or if the attribute on the (meta) class has a union type, we
infer those union types accordingly:
@ -854,43 +854,6 @@ def _(flag: bool):
reveal_type(C4.x) # revealed: Unknown | Literal[7, 8]
```
## Inherited class attributes
### Basic
```py
class A:
X = "foo"
class B(A): ...
class C(B): ...
reveal_type(C.X) # revealed: Unknown | Literal["foo"]
```
### Multiple inheritance
```py
class O: ...
class F(O):
X = 56
class E(O):
X = 42
class D(O): ...
class C(D, F): ...
class B(E, D): ...
class A(B, C): ...
# revealed: tuple[Literal[A], Literal[B], Literal[E], Literal[C], Literal[D], Literal[F], Literal[O], Literal[object]]
reveal_type(A.__mro__)
# `E` is earlier in the MRO than `F`, so we should use the type of `E.X`
reveal_type(A.X) # revealed: Unknown | Literal[42]
```
## Unions with possibly unbound paths
### Definite boundness within a class
@ -1028,7 +991,58 @@ def _(flag: bool):
reveal_type(Foo().x) # revealed: Unknown | Literal[1]
```
### Attribute access on `Any`
### Unions with all paths unbound
If the symbol is unbound in all elements of the union, we detect that:
```py
def _(flag: bool):
class C1: ...
class C2: ...
C = C1 if flag else C2
# error: [unresolved-attribute] "Type `Literal[C1, C2]` has no attribute `x`"
reveal_type(C.x) # revealed: Unknown
```
## Inherited class attributes
### Basic
```py
class A:
X = "foo"
class B(A): ...
class C(B): ...
reveal_type(C.X) # revealed: Unknown | Literal["foo"]
```
### Multiple inheritance
```py
class O: ...
class F(O):
X = 56
class E(O):
X = 42
class D(O): ...
class C(D, F): ...
class B(E, D): ...
class A(B, C): ...
# revealed: tuple[Literal[A], Literal[B], Literal[E], Literal[C], Literal[D], Literal[F], Literal[O], Literal[object]]
reveal_type(A.__mro__)
# `E` is earlier in the MRO than `F`, so we should use the type of `E.X`
reveal_type(A.X) # revealed: Unknown | Literal[42]
```
## Attribute access on `Any`
The union of the set of types that `Any` could materialise to is equivalent to `object`. It follows
from this that attribute access on `Any` resolves to `Any` if the attribute does not exist on
@ -1059,20 +1073,6 @@ reveal_type(C.__mro__) # revealed: tuple[Literal[C], Literal[B], Any, Literal[A
reveal_type(C.x) # revealed: Literal[1] & Any
```
### Unions with all paths unbound
If the symbol is unbound in all elements of the union, we detect that:
```py
def _(flag: bool):
class C1: ...
class C2: ...
C = C1 if flag else C2
# error: [unresolved-attribute] "Type `Literal[C1, C2]` has no attribute `x`"
reveal_type(C.x) # revealed: Unknown
```
## Objects of all types have a `__class__` method
The type of `x.__class__` is the same as `x`'s meta-type. `x.__class__` is always the same value as
@ -1125,6 +1125,8 @@ reveal_type(Foo.__class__) # revealed: Literal[type]
## Module attributes
### Basic
`mod.py`:
```py
@ -1159,7 +1161,7 @@ for mod.global_symbol in IntIterable():
pass
```
## Nested attributes
### Nested module attributes
`outer/__init__.py`:

View file

@ -6,7 +6,9 @@ A descriptor is an attribute value that has one of the methods in the descriptor
methods are `__get__()`, `__set__()`, and `__delete__()`. If any of those methods are defined for an
attribute, it is said to be a descriptor.
## Basic example
## Basic properties
### Example
An introductory example, modeled after a [simple example] in the primer on descriptors, involving a
descriptor that returns a constant value:
@ -41,7 +43,7 @@ c.ten = 11
C.ten = 11
```
## Different types for `__get__` and `__set__`
### Different types for `__get__` and `__set__`
The return type of `__get__` and the value type of `__set__` can be different:
@ -76,7 +78,7 @@ c.flexible_int = None # not okay
reveal_type(c.flexible_int) # revealed: int | None
```
## Data and non-data descriptors
### Data and non-data descriptors
Descriptors that define `__set__` or `__delete__` are called *data descriptors*. An example of a
data descriptor is a `property` with a setter and/or a deleter. Descriptors that only define
@ -131,6 +133,73 @@ reveal_type(C.non_data_descriptor) # revealed: Unknown | Literal["non-data"]
C.data_descriptor = "something else" # This is okay
```
### Partial fall back
Our implementation of the descriptor protocol takes into account that symbols can be possibly
unbound. In those cases, we fall back to lower precedence steps of the descriptor protocol and union
all possible results accordingly. We start by defining a data and a non-data descriptor:
```py
from typing import Literal
class DataDescriptor:
def __get__(self, instance: object, owner: type | None = None) -> Literal["data"]:
return "data"
def __set__(self, instance: object, value: int) -> None:
pass
class NonDataDescriptor:
def __get__(self, instance: object, owner: type | None = None) -> Literal["non-data"]:
return "non-data"
```
Then, we demonstrate that we fall back to an instance attribute if a data descriptor is possibly
unbound:
```py
def f1(flag: bool):
class C1:
if flag:
attr = DataDescriptor()
def f(self):
self.attr = "normal"
reveal_type(C1().attr) # revealed: Unknown | Literal["data", "normal"]
```
We never treat implicit instance attributes as definitely bound, so we fall back to the non-data
descriptor here:
```py
def f2(flag: bool):
class C2:
def f(self):
self.attr = "normal"
attr = NonDataDescriptor()
reveal_type(C2().attr) # revealed: Unknown | Literal["non-data", "normal"]
```
### Descriptors only work when used as class variables
Descriptors only work when used as class variables. When put in instances, they have no effect.
```py
from typing import Literal
class Ten:
def __get__(self, instance: object, owner: type | None = None) -> Literal[10]:
return 10
class C:
def __init__(self):
self.ten: Ten = Ten()
reveal_type(C().ten) # revealed: Ten
```
## Descriptor protocol for class objects
When attributes are accessed on a class object, the following [precedence chain] is used:
@ -306,56 +375,64 @@ def _(flag: bool):
reveal_type(C7.union_of_class_data_descriptor_and_attribute) # revealed: Literal["data", 2]
```
## Partial fall back
## Descriptors distinguishing between class and instance access
Our implementation of the descriptor protocol takes into account that symbols can be possibly
unbound. In those cases, we fall back to lower precedence steps of the descriptor protocol and union
all possible results accordingly. We start by defining a data and a non-data descriptor:
Overloads can be used to distinguish between when a descriptor is accessed on a class object and
when it is accessed on an instance. A real-world example of this is the `__get__` method on
`types.FunctionType`.
```py
from typing import Literal
from typing_extensions import Literal, LiteralString, overload
class DataDescriptor:
def __get__(self, instance: object, owner: type | None = None) -> Literal["data"]:
return "data"
class Descriptor:
@overload
def __get__(self, instance: None, owner: type, /) -> Literal["called on class object"]: ...
@overload
def __get__(self, instance: object, owner: type | None = None, /) -> Literal["called on instance"]: ...
def __get__(self, instance, owner=None, /) -> LiteralString:
if instance:
return "called on instance"
else:
return "called on class object"
def __set__(self, instance: object, value: int) -> None:
pass
class C:
d: Descriptor = Descriptor()
class NonDataDescriptor:
def __get__(self, instance: object, owner: type | None = None) -> Literal["non-data"]:
return "non-data"
# TODO: should be `Literal["called on class object"]
reveal_type(C.d) # revealed: LiteralString
# TODO: should be `Literal["called on instance"]
reveal_type(C().d) # revealed: LiteralString
```
Then, we demonstrate that we fall back to an instance attribute if a data descriptor is possibly
unbound:
## Descriptor protocol for dunder methods
Dunder methods are always looked up on the meta-type. There is no instance fallback. This means that
an implicit dunder call on an instance-like object will not only look up the dunder method on the
class object, without considering instance attributes. And an implicit dunder call on a class object
will look up the dunder method on the metaclass, without considering class attributes.
```py
def f1(flag: bool):
class C1:
if flag:
attr = DataDescriptor()
class SomeCallable:
def __call__(self, x: int) -> str:
return "a"
def f(self):
self.attr = "normal"
class Descriptor:
def __get__(self, instance: object, owner: type | None = None) -> SomeCallable:
return SomeCallable()
reveal_type(C1().attr) # revealed: Unknown | Literal["data", "normal"]
class B:
__call__: Descriptor = Descriptor()
b_instance = B()
reveal_type(b_instance(1)) # revealed: str
b_instance("bla") # error: [invalid-argument-type]
```
We never treat implicit instance attributes as definitely bound, so we fall back to the non-data
descriptor here:
## Special descriptors
```py
def f2(flag: bool):
class C2:
def f(self):
self.attr = "normal"
attr = NonDataDescriptor()
reveal_type(C2().attr) # revealed: Unknown | Literal["non-data", "normal"]
```
## Built-in `property` descriptor
### Built-in `property` descriptor
The built-in `property` decorator creates a descriptor. The names for attribute reads/writes are
determined by the return type of the `name` method and the parameter type of the setter,
@ -396,7 +473,7 @@ c.name = None
c.name = 42
```
## Built-in `classmethod` descriptor
### Built-in `classmethod` descriptor
Similarly to `property`, `classmethod` decorator creates an implicit descriptor that binds the first
argument to the class instead of the instance.
@ -423,244 +500,7 @@ reveal_type(C.get_name()) # revealed: str
reveal_type(C("42").get_name()) # revealed: str
```
## Descriptors only work when used as class variables
From the descriptor guide:
> Descriptors only work when used as class variables. When put in instances, they have no effect.
```py
from typing import Literal
class Ten:
def __get__(self, instance: object, owner: type | None = None) -> Literal[10]:
return 10
class C:
def __init__(self):
self.ten: Ten = Ten()
reveal_type(C().ten) # revealed: Ten
```
## Descriptors distinguishing between class and instance access
Overloads can be used to distinguish between when a descriptor is accessed on a class object and
when it is accessed on an instance. A real-world example of this is the `__get__` method on
`types.FunctionType`.
```py
from typing_extensions import Literal, LiteralString, overload
class Descriptor:
@overload
def __get__(self, instance: None, owner: type, /) -> Literal["called on class object"]: ...
@overload
def __get__(self, instance: object, owner: type | None = None, /) -> Literal["called on instance"]: ...
def __get__(self, instance, owner=None, /) -> LiteralString:
if instance:
return "called on instance"
else:
return "called on class object"
class C:
d: Descriptor = Descriptor()
# TODO: should be `Literal["called on class object"]
reveal_type(C.d) # revealed: LiteralString
# TODO: should be `Literal["called on instance"]
reveal_type(C().d) # revealed: LiteralString
```
## Undeclared descriptor arguments
If a descriptor attribute is not declared, we union with `Unknown`, just like for regular
attributes, since that attribute could be overwritten externally. Even a data descriptor with a
`__set__` method can be overwritten when accessed through a class object.
```py
class Descriptor:
def __get__(self, instance: object, owner: type | None = None) -> int:
return 1
def __set__(self, instance: object, value: int) -> None:
pass
class C:
descriptor = Descriptor()
C.descriptor = "something else"
# This could also be `Literal["something else"]` if we support narrowing of attribute types based on assignments
reveal_type(C.descriptor) # revealed: Unknown | int
```
## `__get__` is called with correct arguments
```py
from __future__ import annotations
class TailoredForClassObjectAccess:
def __get__(self, instance: None, owner: type[C]) -> int:
return 1
class TailoredForInstanceAccess:
def __get__(self, instance: C, owner: type[C] | None = None) -> str:
return "a"
class TailoredForMetaclassAccess:
def __get__(self, instance: type[C], owner: type[Meta]) -> bytes:
return b"a"
class Meta(type):
metaclass_access: TailoredForMetaclassAccess = TailoredForMetaclassAccess()
class C(metaclass=Meta):
class_object_access: TailoredForClassObjectAccess = TailoredForClassObjectAccess()
instance_access: TailoredForInstanceAccess = TailoredForInstanceAccess()
reveal_type(C.class_object_access) # revealed: int
reveal_type(C().instance_access) # revealed: str
reveal_type(C.metaclass_access) # revealed: bytes
# TODO: These should emit a diagnostic
reveal_type(C().class_object_access) # revealed: TailoredForClassObjectAccess
reveal_type(C.instance_access) # revealed: TailoredForInstanceAccess
```
## Descriptors with incorrect `__get__` signature
```py
class Descriptor:
# `__get__` method with missing parameters:
def __get__(self) -> int:
return 1
class C:
descriptor: Descriptor = Descriptor()
# TODO: This should be an error
reveal_type(C.descriptor) # revealed: Descriptor
# TODO: This should be an error
reveal_type(C().descriptor) # revealed: Descriptor
```
## Possibly unbound descriptor attributes
```py
class DataDescriptor:
def __get__(self, instance: object, owner: type | None = None) -> int:
return 1
def __set__(self, instance: int, value) -> None:
pass
class NonDataDescriptor:
def __get__(self, instance: object, owner: type | None = None) -> int:
return 1
def _(flag: bool):
class PossiblyUnbound:
if flag:
non_data: NonDataDescriptor = NonDataDescriptor()
data: DataDescriptor = DataDescriptor()
# error: [possibly-unbound-attribute] "Attribute `non_data` on type `Literal[PossiblyUnbound]` is possibly unbound"
reveal_type(PossiblyUnbound.non_data) # revealed: int
# error: [possibly-unbound-attribute] "Attribute `non_data` on type `PossiblyUnbound` is possibly unbound"
reveal_type(PossiblyUnbound().non_data) # revealed: int
# error: [possibly-unbound-attribute] "Attribute `data` on type `Literal[PossiblyUnbound]` is possibly unbound"
reveal_type(PossiblyUnbound.data) # revealed: int
# error: [possibly-unbound-attribute] "Attribute `data` on type `PossiblyUnbound` is possibly unbound"
reveal_type(PossiblyUnbound().data) # revealed: int
```
## Possibly-unbound `__get__` method
```py
def _(flag: bool):
class MaybeDescriptor:
if flag:
def __get__(self, instance: object, owner: type | None = None) -> int:
return 1
class C:
descriptor: MaybeDescriptor = MaybeDescriptor()
reveal_type(C.descriptor) # revealed: int | MaybeDescriptor
reveal_type(C().descriptor) # revealed: int | MaybeDescriptor
```
## Descriptors with non-function `__get__` callables that are descriptors themselves
The descriptor protocol is recursive, i.e. looking up `__get__` can involve triggering the
descriptor protocol on the callable's `__call__` method:
```py
from __future__ import annotations
class ReturnedCallable2:
def __call__(self, descriptor: Descriptor1, instance: None, owner: type[C]) -> int:
return 1
class ReturnedCallable1:
def __call__(self, descriptor: Descriptor2, instance: Callable1, owner: type[Callable1]) -> ReturnedCallable2:
return ReturnedCallable2()
class Callable3:
def __call__(self, descriptor: Descriptor3, instance: Callable2, owner: type[Callable2]) -> ReturnedCallable1:
return ReturnedCallable1()
class Descriptor3:
__get__: Callable3 = Callable3()
class Callable2:
__call__: Descriptor3 = Descriptor3()
class Descriptor2:
__get__: Callable2 = Callable2()
class Callable1:
__call__: Descriptor2 = Descriptor2()
class Descriptor1:
__get__: Callable1 = Callable1()
class C:
d: Descriptor1 = Descriptor1()
reveal_type(C.d) # revealed: int
```
## Dunder methods
Dunder methods are looked up on the meta-type, but we still need to invoke the descriptor protocol:
```py
class SomeCallable:
def __call__(self, x: int) -> str:
return "a"
class Descriptor:
def __get__(self, instance: object, owner: type | None = None) -> SomeCallable:
return SomeCallable()
class B:
__call__: Descriptor = Descriptor()
b_instance = B()
reveal_type(b_instance(1)) # revealed: str
b_instance("bla") # error: [invalid-argument-type]
```
## Functions as descriptors
### Functions as descriptors
Functions are descriptors because they implement a `__get__` method. This is crucial in making sure
that method calls work as expected. See [this test suite](./call/methods.md) for more information.
@ -737,6 +577,175 @@ wrapper_descriptor(f, None, f)
wrapper_descriptor(f, None, type(f), "one too many")
```
## Error handling and edge cases
### `__get__` is called with correct arguments
This test makes sure that we call `__get__` with the right argument types for various scenarios:
```py
from __future__ import annotations
class TailoredForClassObjectAccess:
def __get__(self, instance: None, owner: type[C]) -> int:
return 1
class TailoredForInstanceAccess:
def __get__(self, instance: C, owner: type[C] | None = None) -> str:
return "a"
class TailoredForMetaclassAccess:
def __get__(self, instance: type[C], owner: type[Meta]) -> bytes:
return b"a"
class Meta(type):
metaclass_access: TailoredForMetaclassAccess = TailoredForMetaclassAccess()
class C(metaclass=Meta):
class_object_access: TailoredForClassObjectAccess = TailoredForClassObjectAccess()
instance_access: TailoredForInstanceAccess = TailoredForInstanceAccess()
reveal_type(C.class_object_access) # revealed: int
reveal_type(C().instance_access) # revealed: str
reveal_type(C.metaclass_access) # revealed: bytes
# TODO: These should emit a diagnostic
reveal_type(C().class_object_access) # revealed: TailoredForClassObjectAccess
reveal_type(C.instance_access) # revealed: TailoredForInstanceAccess
```
### Descriptors with incorrect `__get__` signature
```py
class Descriptor:
# `__get__` method with missing parameters:
def __get__(self) -> int:
return 1
class C:
descriptor: Descriptor = Descriptor()
# TODO: This should be an error
reveal_type(C.descriptor) # revealed: Descriptor
# TODO: This should be an error
reveal_type(C().descriptor) # revealed: Descriptor
```
### Undeclared descriptor arguments
If a descriptor attribute is not declared, we union with `Unknown`, just like for regular
attributes, since that attribute could be overwritten externally. Even a data descriptor with a
`__set__` method can be overwritten when accessed through a class object.
```py
class Descriptor:
def __get__(self, instance: object, owner: type | None = None) -> int:
return 1
def __set__(self, instance: object, value: int) -> None:
pass
class C:
descriptor = Descriptor()
C.descriptor = "something else"
# This could also be `Literal["something else"]` if we support narrowing of attribute types based on assignments
reveal_type(C.descriptor) # revealed: Unknown | int
```
### Possibly unbound descriptor attributes
```py
class DataDescriptor:
def __get__(self, instance: object, owner: type | None = None) -> int:
return 1
def __set__(self, instance: int, value) -> None:
pass
class NonDataDescriptor:
def __get__(self, instance: object, owner: type | None = None) -> int:
return 1
def _(flag: bool):
class PossiblyUnbound:
if flag:
non_data: NonDataDescriptor = NonDataDescriptor()
data: DataDescriptor = DataDescriptor()
# error: [possibly-unbound-attribute] "Attribute `non_data` on type `Literal[PossiblyUnbound]` is possibly unbound"
reveal_type(PossiblyUnbound.non_data) # revealed: int
# error: [possibly-unbound-attribute] "Attribute `non_data` on type `PossiblyUnbound` is possibly unbound"
reveal_type(PossiblyUnbound().non_data) # revealed: int
# error: [possibly-unbound-attribute] "Attribute `data` on type `Literal[PossiblyUnbound]` is possibly unbound"
reveal_type(PossiblyUnbound.data) # revealed: int
# error: [possibly-unbound-attribute] "Attribute `data` on type `PossiblyUnbound` is possibly unbound"
reveal_type(PossiblyUnbound().data) # revealed: int
```
### Possibly-unbound `__get__` method
```py
def _(flag: bool):
class MaybeDescriptor:
if flag:
def __get__(self, instance: object, owner: type | None = None) -> int:
return 1
class C:
descriptor: MaybeDescriptor = MaybeDescriptor()
reveal_type(C.descriptor) # revealed: int | MaybeDescriptor
reveal_type(C().descriptor) # revealed: int | MaybeDescriptor
```
### Descriptors with non-function `__get__` callables that are descriptors themselves
The descriptor protocol is recursive, i.e. looking up `__get__` can involve triggering the
descriptor protocol on the callable's `__call__` method:
```py
from __future__ import annotations
class ReturnedCallable2:
def __call__(self, descriptor: Descriptor1, instance: None, owner: type[C]) -> int:
return 1
class ReturnedCallable1:
def __call__(self, descriptor: Descriptor2, instance: Callable1, owner: type[Callable1]) -> ReturnedCallable2:
return ReturnedCallable2()
class Callable3:
def __call__(self, descriptor: Descriptor3, instance: Callable2, owner: type[Callable2]) -> ReturnedCallable1:
return ReturnedCallable1()
class Descriptor3:
__get__: Callable3 = Callable3()
class Callable2:
__call__: Descriptor3 = Descriptor3()
class Descriptor2:
__get__: Callable2 = Callable2()
class Callable1:
__call__: Descriptor2 = Descriptor2()
class Descriptor1:
__get__: Callable1 = Callable1()
class C:
d: Descriptor1 = Descriptor1()
reveal_type(C.d) # revealed: int
```
[descriptors]: https://docs.python.org/3/howto/descriptor.html
[precedence chain]: https://github.com/python/cpython/blob/3.13/Objects/typeobject.c#L5393-L5481
[simple example]: https://docs.python.org/3/howto/descriptor.html#simple-example-a-descriptor-that-returns-a-constant