Implement an iterator for universal newlines (#3454)

# Summary

We need to support CR line endings (as opposed to LF and CRLF line endings, which are already supported). They're rare, but they do appear in Python code, and we tend to panic on any file that uses them.

Our `Locator` abstraction now supports CR line endings. However, Rust's `str::lines` implementation does _not_: it splits only on LF and CRLF.

This PR adds a `UniversalNewlineIterator` implementation that respects all of CR, LF, and CRLF line endings, and plugs it into most of the `.lines()` call sites.
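
To make the behavior concrete, here is a minimal sketch of an iterator that treats CR, LF, and CRLF as line terminators. It is an illustration under stated assumptions, not the code in this PR: the names `SketchNewlines` and `NewlinesExt` are hypothetical, and only the method name `universal_newlines` is taken from the call sites in the diff.

```rust
/// Hypothetical stand-in for a universal-newline iterator (not the PR's code).
struct SketchNewlines<'a> {
    rest: &'a str,
}

impl<'a> Iterator for SketchNewlines<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        let rest = self.rest;
        if rest.is_empty() {
            return None;
        }
        // CR and LF are single ASCII bytes, so these byte offsets are
        // always valid `str` boundaries.
        match rest.find(|c: char| c == '\r' || c == '\n') {
            Some(end) => {
                // Treat CRLF as one terminator; a lone CR or LF is one byte.
                let terminator_len = if rest[end..].starts_with("\r\n") { 2 } else { 1 };
                self.rest = &rest[end + terminator_len..];
                Some(&rest[..end])
            }
            None => {
                // Final line with no trailing terminator.
                self.rest = "";
                Some(rest)
            }
        }
    }
}

/// Hypothetical extension trait so call sites can swap `.lines()` for
/// `.universal_newlines()`.
trait NewlinesExt {
    fn universal_newlines(&self) -> SketchNewlines<'_>;
}

impl NewlinesExt for str {
    fn universal_newlines(&self) -> SketchNewlines<'_> {
        SketchNewlines { rest: self }
    }
}
```

With this sketch, `"a = 1\rb = 2".universal_newlines()` yields two lines, whereas `"a = 1\rb = 2".lines()` yields the whole string as a single line because the standard library does not split on a bare CR.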

As an alternative design, it could be nice if we could leverage `Locator` for this. We've already computed all of the line endings, so we could probably iterate much more efficiently?
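
As a rough sketch of that alternative, the line boundaries could be computed once and then reused for every lookup. The `LineIndex` type below is hypothetical and makes no claim about how `Locator` stores its data; it only illustrates the idea of precomputed line ranges.

```rust
/// Hypothetical index of line boundaries, computed once per source text.
struct LineIndex<'a> {
    text: &'a str,
    /// Half-open byte ranges of each line, with terminators excluded.
    ranges: Vec<(usize, usize)>,
}

impl<'a> LineIndex<'a> {
    fn new(text: &'a str) -> Self {
        let bytes = text.as_bytes();
        let mut ranges = Vec::new();
        let mut start = 0;
        let mut i = 0;
        while i < bytes.len() {
            match bytes[i] {
                b'\n' => {
                    ranges.push((start, i));
                    i += 1;
                    start = i;
                }
                b'\r' => {
                    ranges.push((start, i));
                    // Skip the LF of a CRLF pair along with the CR.
                    i += if bytes.get(i + 1) == Some(&b'\n') { 2 } else { 1 };
                    start = i;
                }
                _ => i += 1,
            }
        }
        // Final line with no trailing terminator.
        if start < bytes.len() {
            ranges.push((start, bytes.len()));
        }
        Self { text, ranges }
    }

    /// Number of lines in the indexed text.
    fn len(&self) -> usize {
        self.ranges.len()
    }

    /// Constant-time lookup of a line by its zero-based index.
    fn line(&self, index: usize) -> Option<&'a str> {
        let (start, end) = *self.ranges.get(index)?;
        Some(&self.text[start..end])
    }
}
```

The trade-off is an up-front scan and a `Vec` allocation in exchange for cheap repeated access, which only pays off if the same source is traversed by line more than once.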

# Test Plan

Largely relying on automated testing, but I also ran this over some known failure cases, like #3404.
Charlie Marsh 2023-03-13 00:01:29 -04:00 committed by GitHub
parent 2a4d6ab3b2
commit c2750a59ab
GPG key ID: 4AEE18F83AFDEB23
35 changed files with 325 additions and 126 deletions

@@ -1,5 +1,6 @@
 use rustpython_parser::ast::Location;
+use ruff_python_ast::newlines::StrExt;
 use ruff_python_ast::source_code::Locator;
 use ruff_python_ast::types::Range;
@@ -96,7 +97,11 @@ pub fn expand_indented_block(
     // Compound statement: from the colon to the end of the block.
     let mut offset = 0;
-    for (index, line) in contents[end_index..].lines().skip(1).enumerate() {
+    for (index, line) in contents[end_index..]
+        .universal_newlines()
+        .skip(1)
+        .enumerate()
+    {
         if line.is_empty() {
             continue;
         }