mirror of https://github.com/astral-sh/ruff.git synced 2025-10-02 06:42:02 +00:00

Dhruv Manilawala 9ec690b8f8

[red-knot] Add support for string annotations (#14151)

## Summary

This PR adds support for parsing and inferring types within string
annotations.

### Implementation (attempt 1)

This is preserved in
6217f48924.

The implementation here would separate out the inference of string
annotations into the deferred query. This requires the following:
* Two ways of evaluating the deferred definitions: lazily and eagerly.
* An eager evaluation occurs right outside the definition query, which
in this case would be in `binding_ty` and `declaration_ty`.
* A lazy evaluation occurs on demand, for example when
`definition_expression_ty` is used to determine the function return type
and class bases.
* The above point means that when trying to get the binding type for a
variable in an annotated assignment, the definition query won't include
the type, so the type has to be fetched through the deferred query.

This has the following limitations:
* Nested string annotations, although not necessarily a useful feature,
are difficult to implement unless we convert the implementation into an
infinite loop.
* Partial string annotations require a complex layout because the types
for the stringified and non-stringified parts of the annotation are
inferred in separate queries. This means we need to maintain additional
information.
### Implementation (attempt 2)

This is the final diff in this PR.

The implementation here does the complete inference of a string
annotation in the same definition query by maintaining certain state
while inferring the different parts of an expression and making
decisions accordingly. These are:
* Allow names that are part of a string annotation to not exist in the
symbol table. For example, in `x: "Foo"`, if the `Foo` symbol is not
defined then it won't exist in the symbol table even though it's being
used. This relaxes an invariant, and the relaxation is allowed only for
symbols in a string annotation.
* Similarly, name lookup is updated to do the same: if the symbol
doesn't exist, then it's unbound.
* Store the final type of a string annotation on the string expression
itself and not for any of the sub-expressions that are created after
parsing. This is because those sub-expressions won't exist in the
semantic index.
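The three points above can be sketched in a few lines of Python. This is
a toy model built on the standard `ast` module, not the Rust
implementation in this PR: `infer_annotation`, `infer_source`, the
`symbols` dict (standing in for the symbol table) and the `recorded` dict
(standing in for per-expression type storage) are all hypothetical names,
and types are represented as plain display strings.

```python
import ast

UNKNOWN = "Unknown"  # placeholder for a type that could not be resolved


def infer_annotation(expr, symbols, recorded, in_string=False):
    """Return a display string for the type named by the annotation `expr`."""
    if isinstance(expr, ast.Constant) and isinstance(expr.value, str):
        # Parse the string contents and infer them in the same call chain,
        # remembering that everything below came from a string annotation.
        inner = ast.parse(expr.value, mode="eval").body
        ty = infer_annotation(inner, symbols, recorded, in_string=True)
    elif isinstance(expr, ast.Name):
        # A name missing from the table is tolerated and treated as unbound.
        ty = symbols.get(expr.id, UNKNOWN)
    elif isinstance(expr, ast.BinOp) and isinstance(expr.op, ast.BitOr):
        left = infer_annotation(expr.left, symbols, recorded, in_string)
        right = infer_annotation(expr.right, symbols, recorded, in_string)
        ty = f"{left} | {right}"
    else:
        ty = UNKNOWN
    if not in_string:
        # Only expressions present in the original source get a stored type;
        # nodes created by parsing a string are skipped.
        recorded[expr] = ty
    return ty


def infer_source(source, symbols):
    """Parse an annotation from source text and infer it."""
    tree = ast.parse(source, mode="eval").body
    recorded = {}
    return infer_annotation(tree, symbols, recorded), recorded
```

With `symbols = {"int": "int", "Foo": "Foo"}`, calling
`infer_source('int | "Foo"', symbols)` yields `"int | Foo"` and records
three entries: the union, the `int` name and the `"Foo"` string
expression, but not the `Foo` name parsed out of the string. The sketch
omits diagnostics; the tests in this PR expect an `unresolved-reference`
error for an undefined name.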

Design document:
https://www.notion.so/astral-sh/String-Annotations-12148797e1ca801197a9f146641e5b71?pvs=4

Closes: #13796 

## Test Plan

* Add various test cases in our markdown framework
* Run `red_knot` on LibCST (contains a lot of string annotations,
specifically
https://github.com/Instagram/LibCST/blob/main/libcst/matchers/_matcher_base.py),
FastAPI (good amount of annotated code including `typing.Literal`) and
compare against the `main` branch output

2024-11-15 04:10:18 +00:00


String annotations

Simple

def f() -> "int":
    return 1

reveal_type(f())  # revealed: int

Nested

def f() -> "'int'":
    return 1

reveal_type(f())  # revealed: int

Type expression

def f1() -> "int | str":
    return 1

def f2() -> "tuple[int, str]":
    return 1

reveal_type(f1())  # revealed: int | str
reveal_type(f2())  # revealed: tuple[int, str]

Partial

def f() -> tuple[int, "str"]:
    return 1

reveal_type(f())  # revealed: tuple[int, str]

Deferred

def f() -> "Foo":
    return Foo()

class Foo:
    pass

reveal_type(f())  # revealed: Foo

Deferred (undefined)

# error: [unresolved-reference]
def f() -> "Foo":
    pass

reveal_type(f())  # revealed: Unknown

Partial deferred

def f() -> int | "Foo":
    return 1

class Foo:
    pass

reveal_type(f())  # revealed: int | Foo

`typing.Literal`

from typing import Literal

def f1() -> Literal["Foo", "Bar"]:
    return "Foo"

def f2() -> 'Literal["Foo", "Bar"]':
    return "Foo"

class Foo:
    pass

reveal_type(f1())  # revealed: Literal["Foo", "Bar"]
reveal_type(f2())  # revealed: Literal["Foo", "Bar"]

Various string kinds

# error: [annotation-raw-string] "Type expressions cannot use raw string literal"
def f1() -> r"int":
    return 1

# error: [annotation-f-string] "Type expressions cannot use f-strings"
def f2() -> f"int":
    return 1

# error: [annotation-byte-string] "Type expressions cannot use bytes literal"
def f3() -> b"int":
    return 1

def f4() -> "int":
    return 1

# error: [annotation-implicit-concat] "Type expressions cannot span multiple string literals"
def f5() -> "in" "t":
    return 1

# error: [annotation-escape-character] "Type expressions cannot contain escape characters"
def f6() -> "\N{LATIN SMALL LETTER I}nt":
    return 1

# error: [annotation-escape-character] "Type expressions cannot contain escape characters"
def f7() -> "\x69nt":
    return 1

def f8() -> """int""":
    return 1

# error: [annotation-byte-string] "Type expressions cannot use bytes literal"
def f9() -> "b'int'":
    return 1

reveal_type(f1())  # revealed: Unknown
reveal_type(f2())  # revealed: Unknown
reveal_type(f3())  # revealed: Unknown
reveal_type(f4())  # revealed: int
reveal_type(f5())  # revealed: Unknown
reveal_type(f6())  # revealed: Unknown
reveal_type(f7())  # revealed: Unknown
reveal_type(f8())  # revealed: int
reveal_type(f9())  # revealed: Unknown

Various string kinds in `typing.Literal`

from typing import Literal

def f() -> Literal["a", r"b", b"c", "d" "e", "\N{LATIN SMALL LETTER F}", "\x67", """h"""]:
    return "normal"

reveal_type(f())  # revealed: Literal["a", "b", "de", "f", "g", "h"] | Literal[b"c"]

Class variables

MyType = int

class Aliases:
    MyType = str

    forward: "MyType"
    not_forward: MyType

reveal_type(Aliases.forward)  # revealed: str
reveal_type(Aliases.not_forward)  # revealed: str

Annotated assignment

a: "int" = 1
b: "'int'" = 1
c: "Foo"
# error: [invalid-assignment] "Object of type `Literal[1]` is not assignable to `Foo`"
d: "Foo" = 1

class Foo:
    pass

c = Foo()

reveal_type(a)  # revealed: Literal[1]
reveal_type(b)  # revealed: Literal[1]
reveal_type(c)  # revealed: Foo
reveal_type(d)  # revealed: Foo

Parameter

TODO: Add tests once parameter inference is supported
