mirror of
https://github.com/astral-sh/ruff.git
synced 2025-10-02 06:42:02 +00:00

## Summary
This PR adds support for parsing and inferring types within string
annotations.
### Implementation (attempt 1)
This is preserved in
6217f48924
.
The implementation here would separate the inference of string
annotations in the deferred query. This requires the following:
* Two ways of evaluating the deferred definitions - lazily and eagerly.
* An eager evaluation occurs right outside the definition query which in
this case would be in `binding_ty` and `declaration_ty`.
* A lazy evaluation occurs on demand like using the
`definition_expression_ty` to determine the function return type and
class bases.
* The above point means that when trying to get the binding type for a
variable in an annotated assignment, the definition query won't include
the type. So, it'll require going through the deferred query to get the
type.
This has the following limitations:
* Nested string annotations, although not necessarily a useful feature,
is difficult to implement unless we convert the implementation in an
infinite loop
* Partial string annotations require complex layout because inferring
the types for stringified and non-stringified parts of the annotation
are done in separate queries. This means we need to maintain additional
information
### Implementation (attempt 2)
This is the final diff in this PR.
The implementation here does the complete inference of string annotation
in the same definition query by maintaining certain state while trying
to infer different parts of an expression and take decisions
accordingly. These are:
* Allow names that are part of a string annotation to not exists in the
symbol table. For example, in `x: "Foo"`, if the "Foo" symbol is not
defined then it won't exists in the symbol table even though it's being
used. This is an invariant which is being allowed only for symbols in a
string annotation.
* Similarly, lookup name is updated to do the same and if the symbol
doesn't exists, then it's not bounded.
* Store the final type of a string annotation on the string expression
itself and not for any of the sub-expressions that are created after
parsing. This is because those sub-expressions won't exists in the
semantic index.
Design document:
https://www.notion.so/astral-sh/String-Annotations-12148797e1ca801197a9f146641e5b71?pvs=4
Closes: #13796
## Test Plan
* Add various test cases in our markdown framework
* Run `red_knot` on LibCST (contains a lot of string annotations,
specifically
https://github.com/Instagram/LibCST/blob/main/libcst/matchers/_matcher_base.py),
FastAPI (good amount of annotated code including `typing.Literal`) and
compare against the `main` branch output
3.3 KiB
3.3 KiB
String annotations
Simple
def f() -> "int":
return 1
reveal_type(f()) # revealed: int
Nested
def f() -> "'int'":
return 1
reveal_type(f()) # revealed: int
Type expression
def f1() -> "int | str":
return 1
def f2() -> "tuple[int, str]":
return 1
reveal_type(f1()) # revealed: int | str
reveal_type(f2()) # revealed: tuple[int, str]
Partial
def f() -> tuple[int, "str"]:
return 1
reveal_type(f()) # revealed: tuple[int, str]
Deferred
def f() -> "Foo":
return Foo()
class Foo:
pass
reveal_type(f()) # revealed: Foo
Deferred (undefined)
# error: [unresolved-reference]
def f() -> "Foo":
pass
reveal_type(f()) # revealed: Unknown
Partial deferred
def f() -> int | "Foo":
return 1
class Foo:
pass
reveal_type(f()) # revealed: int | Foo
typing.Literal
from typing import Literal
def f1() -> Literal["Foo", "Bar"]:
return "Foo"
def f2() -> 'Literal["Foo", "Bar"]':
return "Foo"
class Foo:
pass
reveal_type(f1()) # revealed: Literal["Foo", "Bar"]
reveal_type(f2()) # revealed: Literal["Foo", "Bar"]
Various string kinds
# error: [annotation-raw-string] "Type expressions cannot use raw string literal"
def f1() -> r"int":
return 1
# error: [annotation-f-string] "Type expressions cannot use f-strings"
def f2() -> f"int":
return 1
# error: [annotation-byte-string] "Type expressions cannot use bytes literal"
def f3() -> b"int":
return 1
def f4() -> "int":
return 1
# error: [annotation-implicit-concat] "Type expressions cannot span multiple string literals"
def f5() -> "in" "t":
return 1
# error: [annotation-escape-character] "Type expressions cannot contain escape characters"
def f6() -> "\N{LATIN SMALL LETTER I}nt":
return 1
# error: [annotation-escape-character] "Type expressions cannot contain escape characters"
def f7() -> "\x69nt":
return 1
def f8() -> """int""":
return 1
# error: [annotation-byte-string] "Type expressions cannot use bytes literal"
def f9() -> "b'int'":
return 1
reveal_type(f1()) # revealed: Unknown
reveal_type(f2()) # revealed: Unknown
reveal_type(f3()) # revealed: Unknown
reveal_type(f4()) # revealed: int
reveal_type(f5()) # revealed: Unknown
reveal_type(f6()) # revealed: Unknown
reveal_type(f7()) # revealed: Unknown
reveal_type(f8()) # revealed: int
reveal_type(f9()) # revealed: Unknown
Various string kinds in typing.Literal
from typing import Literal
def f() -> Literal["a", r"b", b"c", "d" "e", "\N{LATIN SMALL LETTER F}", "\x67", """h"""]:
return "normal"
reveal_type(f()) # revealed: Literal["a", "b", "de", "f", "g", "h"] | Literal[b"c"]
Class variables
MyType = int
class Aliases:
MyType = str
forward: "MyType"
not_forward: MyType
reveal_type(Aliases.forward) # revealed: str
reveal_type(Aliases.not_forward) # revealed: str
Annotated assignment
a: "int" = 1
b: "'int'" = 1
c: "Foo"
# error: [invalid-assignment] "Object of type `Literal[1]` is not assignable to `Foo`"
d: "Foo" = 1
class Foo:
pass
c = Foo()
reveal_type(a) # revealed: Literal[1]
reveal_type(b) # revealed: Literal[1]
reveal_type(c) # revealed: Foo
reveal_type(d) # revealed: Foo
Parameter
TODO: Add tests once parameter inference is supported