Lexer should consider BOM for the start offset (#11732)

## Summary

This PR fixes a bug where the lexer didn't account for the BOM when
computing the start offset.

fixes: #11731
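
To illustrate the idea behind the fix, here is a minimal sketch of how a start offset can be adjusted when the source begins with a BOM. The function name `effective_start` is hypothetical and is not part of the lexer; the actual change in `lexer.rs` may be structured differently.

```rust
/// The byte order mark. Encoded as UTF-8 it occupies three bytes (EF BB BF).
const BOM: char = '\u{feff}';

/// Hypothetical helper (not the lexer's API): returns the byte offset at
/// which lexing should begin. If the source starts with a BOM, the start
/// is never allowed to fall inside it.
fn effective_start(source: &str, requested: usize) -> usize {
    let bom_len = if source.starts_with(BOM) {
        BOM.len_utf8()
    } else {
        0
    };
    // A requested offset past the BOM is kept as-is; an offset of 0 (or one
    // inside the BOM) is moved to the first byte after it.
    requested.max(bom_len)
}
```

Under this sketch, lexing `"\u{feff}x = 1"` from offset 0 would begin at byte 3, which matches the first token range in the snapshot for the plain BOM case.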

## Test Plan

Add multiple lexer test cases whose source contains a BOM character and
verify the snapshots.
Dhruv Manilawala 2024-06-04 14:15:46 +05:30 committed by GitHub
parent 3b19df04d7
commit 2567e14b7a
4 changed files with 117 additions and 15 deletions

@@ -0,0 +1,29 @@
---
source: crates/ruff_python_parser/src/lexer.rs
expression: lex_source(source)
---
## Tokens
```
[
    (
        Name(
            "x",
        ),
        3..4,
    ),
    (
        Equal,
        5..6,
    ),
    (
        Int(
            1,
        ),
        7..8,
    ),
    (
        Newline,
        8..8,
    ),
]
```
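
The ranges in this snapshot are byte ranges, so the first token starts at 3 because the BOM takes three bytes in UTF-8. The sketch below reproduces that arithmetic. It assumes the test source is `"\u{feff}x = 1"`, which is consistent with the ranges above but is not shown on this page, and `byte_range` is a hypothetical helper rather than anything from the lexer.

```rust
use std::ops::Range;

/// Hypothetical helper: byte range of the first occurrence of `needle`
/// in `source`, or `None` if it does not occur.
fn byte_range(source: &str, needle: &str) -> Option<Range<usize>> {
    source
        .find(needle)
        .map(|start| start..start + needle.len())
}
```

With the assumed source, `byte_range(source, "x")` gives `3..4`, `"="` gives `5..6`, and `"1"` gives `7..8`, the same ranges the snapshot records.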

@@ -0,0 +1,29 @@
---
source: crates/ruff_python_parser/src/lexer.rs
expression: "lex_source_with_offset(source, TextSize::new(7))"
---
## Tokens
```
[
    (
        Name(
            "y",
        ),
        7..8,
    ),
    (
        Plus,
        9..10,
    ),
    (
        Name(
            "z",
        ),
        11..12,
    ),
    (
        Newline,
        12..12,
    ),
]
```

@@ -0,0 +1,19 @@
---
source: crates/ruff_python_parser/src/lexer.rs
expression: "lex_source_with_offset(source, TextSize::new(11))"
---
## Tokens
```
[
    (
        Name(
            "z",
        ),
        11..12,
    ),
    (
        Newline,
        12..12,
    ),
]
```
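
Both offset snapshots are consistent with a source such as `"\u{feff}x + y + z"`, where the BOM shifts every token by three bytes. That source is an assumption inferred from the ranges and is not shown on this page. The sketch below uses a hypothetical helper, `tail_from`, to show what text remains when lexing starts at a given byte offset, and why an offset inside the BOM is not a valid starting point.

```rust
/// Hypothetical helper (not the lexer's API): the text that remains when
/// starting at byte `offset`. Returns `None` if `offset` is out of bounds
/// or falls in the middle of a multi-byte character such as the BOM.
fn tail_from(source: &str, offset: usize) -> Option<&str> {
    source.get(offset..)
}
```

With the assumed source, offset 7 leaves `y + z` and offset 11 leaves `z`, matching the first tokens in the two snapshots, while offsets 1 and 2 yield `None` because they split the BOM.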