Lexer should consider BOM for the start offset (#11732)

## Summary

This PR fixes a bug where the lexer did not account for the BOM when computing the start offset.

fixes: #11731

## Test Plan

Add multiple test cases that involve a BOM character in the source for the lexer and verify the snapshots.
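The offset arithmetic behind the fix can be illustrated with a minimal sketch. This is not Ruff's implementation: `effective_start` and `chunk_ranges` are hypothetical helpers, the "lexer" only splits on whitespace, and the sketch assumes the start offset is an absolute byte offset into the source, so it already counts the three UTF-8 bytes of the BOM.

```rust
/// The Unicode byte order mark; it occupies 3 bytes when encoded as UTF-8.
const BOM: char = '\u{feff}';

/// Hypothetical helper (not Ruff's API): the byte offset at which lexing starts.
/// `start_offset` is assumed to be an absolute byte offset into `source`,
/// counted from the very first byte, so it already includes any BOM.
fn effective_start(source: &str, start_offset: usize) -> usize {
    if start_offset == 0 && source.starts_with(BOM) {
        // Lexing from the top of the file: step over the BOM.
        BOM.len_utf8()
    } else {
        // A non-zero offset is used as is; skipping the BOM on top of it
        // would shift every reported range by three bytes.
        start_offset
    }
}

/// Hypothetical stand-in for a lexer: byte ranges of whitespace-separated
/// chunks found at or after the effective start.
/// Panics if the start does not fall on a `char` boundary.
fn chunk_ranges(source: &str, start_offset: usize) -> Vec<(usize, usize)> {
    let start = effective_start(source, start_offset);
    let mut ranges: Vec<(usize, usize)> = Vec::new();
    let mut chunk_start: Option<usize> = None;
    for (i, ch) in source[start..].char_indices() {
        // Convert the index within the slice back to an absolute byte offset.
        let pos = start + i;
        if ch.is_whitespace() {
            if let Some(s) = chunk_start.take() {
                ranges.push((s, pos));
            }
        } else if chunk_start.is_none() {
            chunk_start = Some(pos);
        }
    }
    // Close the final chunk if the source does not end in whitespace.
    if let Some(s) = chunk_start {
        ranges.push((s, source.len()));
    }
    ranges
}
```

Under these assumptions, calling `chunk_ranges("\u{feff}x = 1", 0)` reports the first chunk at bytes 3 to 4 rather than 0 to 1, and `chunk_ranges("\u{feff}x + y + z", 7)` starts at the `y`. The input strings here are guesses chosen to be consistent with the ranges in the snapshots; the snapshot files do not show the test sources.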
2025-09-29 05:15:12 +00:00 · 2024-06-04 14:15:46 +05:30 · 2024-06-04 14:15:46 +05:30 · 2567e14b7a
commit 2567e14b7a
parent 3b19df04d7
4 changed files with 117 additions and 15 deletions
--- a/crates/ruff_python_parser/src/snapshots/ruff_python_parser__lexer__tests__bom.snap
+++ b/crates/ruff_python_parser/src/snapshots/ruff_python_parser__lexer__tests__bom.snap
@@ -0,0 +1,29 @@
+---
+source: crates/ruff_python_parser/src/lexer.rs
+expression: lex_source(source)
+---
+## Tokens
+```
+[
+    (
+        Name(
+            "x",
+        ),
+        3..4,
+    ),
+    (
+        Equal,
+        5..6,
+    ),
+    (
+        Int(
+            1,
+        ),
+        7..8,
+    ),
+    (
+        Newline,
+        8..8,
+    ),
+]
+```
--- a/crates/ruff_python_parser/src/snapshots/ruff_python_parser__lexer__tests__bom_with_offset.snap
+++ b/crates/ruff_python_parser/src/snapshots/ruff_python_parser__lexer__tests__bom_with_offset.snap
@@ -0,0 +1,29 @@
+---
+source: crates/ruff_python_parser/src/lexer.rs
+expression: "lex_source_with_offset(source, TextSize::new(7))"
+---
+## Tokens
+```
+[
+    (
+        Name(
+            "y",
+        ),
+        7..8,
+    ),
+    (
+        Plus,
+        9..10,
+    ),
+    (
+        Name(
+            "z",
+        ),
+        11..12,
+    ),
+    (
+        Newline,
+        12..12,
+    ),
+]
+```
--- a/crates/ruff_python_parser/src/snapshots/ruff_python_parser__lexer__tests__bom_with_offset_edge.snap
+++ b/crates/ruff_python_parser/src/snapshots/ruff_python_parser__lexer__tests__bom_with_offset_edge.snap
@@ -0,0 +1,19 @@
+---
+source: crates/ruff_python_parser/src/lexer.rs
+expression: "lex_source_with_offset(source, TextSize::new(11))"
+---
+## Tokens
+```
+[
+    (
+        Name(
+            "z",
+        ),
+        11..12,
+    ),
+    (
+        Newline,
+        12..12,
+    ),
+]
+```