Merge branch 'trunk' of github.com:rtfeldman/roc into list_keepIf

Chad Stearns 2020-09-05 00:47:21 -04:00
commit 5bd88c8901
42 changed files with 1818 additions and 965 deletions

View file

@ -7,14 +7,36 @@ To build the compiler, you need a particular version of LLVM installed on your s
To see which version of LLVM you need, take a look at `Cargo.toml`, in particular the `branch` section of the `inkwell` dependency. It should have something like `llvmX-Y` where X and Y are the major and minor revisions of LLVM you need.
For Ubuntu, I used the `Automatic installation script` at [apt.llvm.org](https://apt.llvm.org) - but there are plenty of alternative options at http://releases.llvm.org/download.html
For Ubuntu and Debian, you can use the `Automatic installation script` at [apt.llvm.org](https://apt.llvm.org):
```
sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
```
### Troubleshooting LLVM installation on Linux
For macOS, you can run `brew install llvm` (but before you do so, check the version with `brew info llvm`--if it's 10.0.1, you may need to install a slightly older version. See below for details.)
There are also plenty of alternative options at http://releases.llvm.org/download.html
## Troubleshooting
Create an issue if you run into problems not listed here.
That will help us improve this document for everyone who reads it in the future!
### LLVM installation on Linux
On some Linux systems we've seen the error "failed to run custom build command for x11".
On Ubuntu, running `sudo apt-get install cmake libx11-dev` fixed this.
### Troubleshooting LLVM installation on Windows
### LLVM installation on macOS
It looks like LLVM 10.0.1 [has some issues with libxml2 on macOS](https://discourse.brew.sh/t/llvm-config-10-0-1-advertise-libxml2-tbd-as-system-libs/8593). You can install the older 10.0.0_3 by doing
```
$ brew install https://raw.githubusercontent.com/Homebrew/homebrew-core/6616d50fb0b24dbe30f5e975210bdad63257f517/Formula/llvm.rb
# "pinning" ensures that homebrew doesn't update it automatically
$ brew pin llvm
```
### LLVM installation on Windows
Installing LLVM's prebuilt binaries doesn't seem to be enough for the `llvm-sys` crate that Roc depends on, so I had to build LLVM from source
on Windows. After lots of help from [**@IanMacKenzie**](https://github.com/IanMacKenzie) (thank you, Ian!), here's what worked for me:
@ -27,6 +49,7 @@ on Windows. After lots of help from [**@IanMacKenzie**](https://github.com/IanMa
6. Once that completed, I ran `nmake` to build LLVM. (This took about 2 hours on my laptop.)
7. Finally, I set an environment variable `LLVM_SYS_100_PREFIX` to point to the `build` directory where I ran the `cmake` command.
Once all that was done, `cargo` ran successfully for Roc!
## Use LLD for the linker

View file

@ -1,6 +1,18 @@
# Not ready to be shared yet!
Roc is a language for building reliable applications on top of fast platforms.
Roc is a language to help anyone create delightful software.
Here's [a short talk](https://youtu.be/ZnYa99QoznE?t=4790) introducing it at a meetup.
## Getting started
1. [Install Rust](https://rustup.rs/)
2. [Build from source](BUILDING_FROM_SOURCE.md)
3. In a terminal, run this from the root folder:
```
cargo run repl
```
4. Check out [these tests](https://github.com/rtfeldman/roc/blob/trunk/cli/tests/repl_eval.rs) for examples of using the REPL
## Applications and Platforms

View file

@ -36,8 +36,7 @@ use std::str::from_utf8_unchecked;
use target_lexicon::Triple;
pub const WELCOME_MESSAGE: &str = "\n The rockin \u{001b}[36mroc repl\u{001b}[0m\n\u{001b}[35m────────────────────────\u{001b}[0m\n\n";
pub const INSTRUCTIONS: &str =
"Enter an expression, or :help for a list of commands, or :exit to exit.\n";
pub const INSTRUCTIONS: &str = "Enter an expression, or :help, or :exit.\n";
pub const PROMPT: &str = "\n\u{001b}[36m»\u{001b}[0m ";
pub const ELLIPSIS: &str = "\u{001b}[36m…\u{001b}[0m ";
@ -70,7 +69,9 @@ pub fn main() -> io::Result<()> {
.expect("there was no next line")
.expect("the line could not be read");
match line.trim() {
let line = line.trim();
match line.to_lowercase().as_str() {
":help" => {
println!("Use :exit to exit.");
}
@ -100,7 +101,7 @@ pub fn main() -> io::Result<()> {
":exit" => {
break;
}
line => {
_ => {
let result = if pending_src.is_empty() {
print_output(line)
} else {

View file

@ -6,7 +6,7 @@ use roc_module::ident::{Lowercase, TagName};
use roc_module::operator::CalledVia;
use roc_module::symbol::{Interns, ModuleId, Symbol};
use roc_mono::layout::{Builtin, Layout};
use roc_parse::ast::{AssignedField, Expr};
use roc_parse::ast::{AssignedField, Expr, StrLiteral};
use roc_region::all::{Located, Region};
use roc_types::subs::{Content, FlatType, Subs, Variable};
use roc_types::types::RecordField;
@ -90,7 +90,7 @@ fn jit_to_ast_help<'a>(
execution_engine,
main_fn_name,
&'static str,
|string: &'static str| { Expr::Str(env.arena.alloc(string)) }
|string: &'static str| { str_slice_to_ast(env.arena, env.arena.alloc(string)) }
),
Layout::Builtin(Builtin::EmptyList) => {
jit_map!(execution_engine, main_fn_name, &'static str, |_| {
@ -168,11 +168,11 @@ fn ptr_to_ast<'a>(
list_to_ast(env, ptr, len, elem_layout, content)
}
Layout::Builtin(Builtin::EmptyStr) => Expr::Str(""),
Layout::Builtin(Builtin::EmptyStr) => Expr::Str(StrLiteral::PlainLine("")),
Layout::Builtin(Builtin::Str) => {
let arena_str = unsafe { *(ptr as *const &'static str) };
Expr::Str(arena_str)
str_slice_to_ast(env.arena, arena_str)
}
Layout::Struct(field_layouts) => match content {
Content::Structure(FlatType::Record(fields, _)) => {
@ -405,3 +405,14 @@ fn i64_to_ast(arena: &Bump, num: i64) -> Expr<'_> {
fn f64_to_ast(arena: &Bump, num: f64) -> Expr<'_> {
Expr::Num(arena.alloc(format!("{}", num)))
}
fn str_slice_to_ast<'a>(_arena: &'a Bump, string: &'a str) -> Expr<'a> {
if string.contains('\n') {
todo!(
"this string contains newlines, so render it as a multiline string: {:?}",
Expr::Str(StrLiteral::PlainLine(string))
);
} else {
Expr::Str(StrLiteral::PlainLine(string))
}
}

View file

@ -19,8 +19,7 @@ pub struct Out {
// TODO get these from roc_cli::repl instead, after figuring out why
// `extern crate roc_cli;` doesn't work.
const WELCOME_MESSAGE: &str = "\n The rockin \u{001b}[36mroc repl\u{001b}[0m\n\u{001b}[35m────────────────────────\u{001b}[0m\n\n";
const INSTRUCTIONS: &str =
"Enter an expression, or :help for a list of commands, or :exit to exit.\n";
const INSTRUCTIONS: &str = "Enter an expression, or :help, or :exit.\n";
const PROMPT: &str = "\n\u{001b}[36m»\u{001b}[0m ";
pub fn path_to_roc_binary() -> PathBuf {

View file

@ -232,6 +232,12 @@ mod repl_eval {
);
}
// #[test]
// fn multiline_string() {
// // If a string contains newlines, format it as a multiline string in the output
// expect_success(r#""\n\nhi!\n\n""#, "\"\"\"\n\nhi!\n\n\"\"\"");
// }
// TODO uncomment this once https://github.com/rtfeldman/roc/issues/295 is done
//
// #[test]

View file

@ -215,6 +215,23 @@ mapOrCancel : List before, (before -> Result after err) -> Result (List after) e
## >>> List.mapOks [ "", "a", "bc", "", "d", "ef", "" ]
mapOks : List before, (before -> Result after *) -> List after
## Returns a list with the element at the given index having been transformed by
## the given function.
##
## For a version of this which gives you more control over when to perform
## the transformation, see #List.updater
##
## ## Performance notes
##
## In particular when updating nested collections, this is potentially much more
## efficient than using #List.get to obtain the element, transforming it,
## and then putting it back in the same place.
update : List elem, Len, (elem -> elem) -> List elem
## A more flexible version of #List.update, which returns an "updater" function
## that lets you delay performing the update until later.
updater : List elem, Len -> { elem, new : elem -> List elem }
## If all the elements in the list are #Ok, return a new list containing the
## contents of those #Ok tags. If any elements are #Err, return #Err.
allOks : List (Result ok err) -> Result (List ok) err

View file

@ -775,9 +775,35 @@ div = \numerator, denominator ->
##
## >>> Float.pi
## >>> |> Float.mod 2.0
mod : Float a, Float a -> Result Float DivByZero
mod : Float a, Float a -> Result (Float a) [ DivByZero ]*
tryMod : Float a, Float a -> Result (Float a) [ DivByZero ]*
## Raises a #Float to the power of another #Float.
##
##
## For an #Int alternative to this function, see #Num.raise.
pow : Float a, Float a -> Float a
## Raises an integer to the power of another, by multiplying the integer by
## itself the given number of times.
##
## This process is known as [exponentiation by squaring](https://en.wikipedia.org/wiki/Exponentiation_by_squaring).
##
## For a #Float alternative to this function, which supports negative exponents,
## see #Num.exp.
##
## >>> Num.exp 5 0
##
## >>> Num.exp 5 1
##
## >>> Num.exp 5 2
##
## >>> Num.exp 5 6
##
## ## Performance Notes
##
## Be careful! Even though this function takes only a #U8, it is very easy to
## overflow.
expBySquaring : Int a, U8 -> Int a
## Return the reciprocal of a #Float - that is, divides `1.0` by the given number.
##
@ -786,7 +812,9 @@ tryMod : Float a, Float a -> Result (Float a) [ DivByZero ]*
## For a version that does not crash, use #tryRecip
recip : Float a -> Result (Float a) [ DivByZero ]*
## NOTE: Need to come up with a suffix alternative to the "try" prefix.
## This should be like (for example) recipTry so that it's more discoverable
## in documentation and editor autocomplete when you type "recip"
tryRecip : Float a -> Result (Float a) [ DivByZero ]*
## Return an approximation of the absolute value of the square root of the #Float.

View file

@ -9,19 +9,20 @@ use crate::num::{
use crate::pattern::{canonicalize_pattern, Pattern};
use crate::procedure::References;
use crate::scope::Scope;
use inlinable_string::InlinableString;
use roc_collections::all::{ImSet, MutMap, MutSet, SendMap};
use roc_module::ident::{Lowercase, TagName};
use roc_module::low_level::LowLevel;
use roc_module::operator::CalledVia;
use roc_module::symbol::Symbol;
use roc_parse::ast;
use roc_parse::ast::{self, EscapedChar, StrLiteral};
use roc_parse::pattern::PatternType::*;
use roc_problem::can::{PrecedenceProblem, Problem, RuntimeError};
use roc_region::all::{Located, Region};
use roc_types::subs::{VarStore, Variable};
use roc_types::types::Alias;
use std::fmt::Debug;
use std::i64;
use std::{char, i64, u32};
#[derive(Clone, Default, Debug, PartialEq)]
pub struct Output {
@ -55,8 +56,7 @@ pub enum Expr {
// Int and Float store a variable to generate better error messages
Int(Variable, i64),
Float(Variable, f64),
Str(Box<str>),
BlockStr(Box<str>),
Str(InlinableString),
List {
list_var: Variable, // required for uniqueness of the list
elem_var: Variable,
@ -247,12 +247,7 @@ pub fn canonicalize_expr<'a>(
)
}
}
ast::Expr::Str(string) => (Str((*string).into()), Output::default()),
ast::Expr::BlockStr(lines) => {
let joined = lines.iter().copied().collect::<Vec<&str>>().join("\n");
(BlockStr(joined.into()), Output::default())
}
ast::Expr::Str(literal) => flatten_str_literal(env, var_store, scope, literal),
ast::Expr::List(loc_elems) => {
if loc_elems.is_empty() {
(
@ -1045,8 +1040,7 @@ pub fn inline_calls(var_store: &mut VarStore, scope: &mut Scope, expr: Expr) ->
other @ Num(_, _)
| other @ Int(_, _)
| other @ Float(_, _)
| other @ Str(_)
| other @ BlockStr(_)
| other @ Str { .. }
| other @ RuntimeError(_)
| other @ EmptyRecord
| other @ Accessor { .. }
@ -1323,3 +1317,170 @@ pub fn inline_calls(var_store: &mut VarStore, scope: &mut Scope, expr: Expr) ->
}
}
}
fn flatten_str_literal<'a>(
env: &mut Env<'a>,
var_store: &mut VarStore,
scope: &mut Scope,
literal: &StrLiteral<'a>,
) -> (Expr, Output) {
use ast::StrLiteral::*;
match literal {
PlainLine(str_slice) => (Expr::Str((*str_slice).into()), Output::default()),
Line(segments) => flatten_str_lines(env, var_store, scope, &[segments]),
Block(lines) => flatten_str_lines(env, var_store, scope, lines),
}
}
fn is_valid_interpolation(expr: &ast::Expr<'_>) -> bool {
match expr {
ast::Expr::Var { .. } => true,
ast::Expr::Access(sub_expr, _) => is_valid_interpolation(sub_expr),
_ => false,
}
}
enum StrSegment {
Interpolation(Located<Expr>),
Plaintext(InlinableString),
}
fn flatten_str_lines<'a>(
env: &mut Env<'a>,
var_store: &mut VarStore,
scope: &mut Scope,
lines: &[&[ast::StrSegment<'a>]],
) -> (Expr, Output) {
use ast::StrSegment::*;
let mut buf = String::new();
let mut segments = Vec::new();
let mut output = Output::default();
for line in lines {
for segment in line.iter() {
match segment {
Plaintext(string) => {
buf.push_str(string);
}
Unicode(loc_hex_digits) => match u32::from_str_radix(loc_hex_digits.value, 16) {
Ok(code_pt) => match char::from_u32(code_pt) {
Some(ch) => {
buf.push(ch);
}
None => {
env.problem(Problem::InvalidUnicodeCodePoint(loc_hex_digits.region));
return (
Expr::RuntimeError(RuntimeError::InvalidUnicodeCodePoint(
loc_hex_digits.region,
)),
output,
);
}
},
Err(_) => {
env.problem(Problem::InvalidHexadecimal(loc_hex_digits.region));
return (
Expr::RuntimeError(RuntimeError::InvalidHexadecimal(
loc_hex_digits.region,
)),
output,
);
}
},
Interpolated(loc_expr) => {
if is_valid_interpolation(loc_expr.value) {
// Interpolations desugar to Str.concat calls
output.references.calls.insert(Symbol::STR_CONCAT);
if !buf.is_empty() {
segments.push(StrSegment::Plaintext(buf.into()));
buf = String::new();
}
let (loc_expr, new_output) = canonicalize_expr(
env,
var_store,
scope,
loc_expr.region,
loc_expr.value,
);
output.union(new_output);
segments.push(StrSegment::Interpolation(loc_expr));
} else {
env.problem(Problem::InvalidInterpolation(loc_expr.region));
return (
Expr::RuntimeError(RuntimeError::InvalidInterpolation(loc_expr.region)),
output,
);
}
}
EscapedChar(escaped) => buf.push(unescape_char(escaped)),
}
}
}
if !buf.is_empty() {
segments.push(StrSegment::Plaintext(buf.into()));
}
(desugar_str_segments(var_store, segments), output)
}
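The `Unicode` branch above decodes the hex digits of a `\u(...)` escape in two steps: parse the digits as a base-16 `u32`, then check that the result is a valid Unicode scalar value. A minimal standalone sketch of that logic follows; the `UnicodeEscapeError` type and `decode_unicode_escape` name are hypothetical stand-ins for this illustration, not part of the commit, which reports `Problem::InvalidHexadecimal` and `Problem::InvalidUnicodeCodePoint` instead.

```rust
/// Hypothetical error type for this sketch only. The commit itself reports
/// `Problem::InvalidHexadecimal` / `Problem::InvalidUnicodeCodePoint`.
#[derive(Debug, PartialEq)]
enum UnicodeEscapeError {
    InvalidHexadecimal,
    InvalidCodePoint,
}

/// Decode the digits found between the parens of a `\u(...)` escape.
fn decode_unicode_escape(hex_digits: &str) -> Result<char, UnicodeEscapeError> {
    // Step 1: the digits must parse as a base-16 number.
    let code_pt = u32::from_str_radix(hex_digits, 16)
        .map_err(|_| UnicodeEscapeError::InvalidHexadecimal)?;

    // Step 2: the number must be a valid Unicode scalar value
    // (at most 10FFFF, and not a surrogate).
    char::from_u32(code_pt).ok_or(UnicodeEscapeError::InvalidCodePoint)
}
```

Under these assumptions, `"00A0"` decodes to a non-breaking space, `"110000"` fails the code point check, and `"zz"` fails the hexadecimal check.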
/// Resolve string interpolations by desugaring a sequence of StrSegments
/// into nested calls to Str.concat
fn desugar_str_segments(var_store: &mut VarStore, segments: Vec<StrSegment>) -> Expr {
use StrSegment::*;
let mut iter = segments.into_iter().rev();
let mut loc_expr = match iter.next() {
Some(Plaintext(string)) => Located::new(0, 0, 0, 0, Expr::Str(string)),
Some(Interpolation(loc_expr)) => loc_expr,
None => {
// No segments? Empty string!
Located::new(0, 0, 0, 0, Expr::Str("".into()))
}
};
for seg in iter {
let loc_new_expr = match seg {
Plaintext(string) => Located::new(0, 0, 0, 0, Expr::Str(string)),
Interpolation(loc_interpolated_expr) => loc_interpolated_expr,
};
let fn_expr = Located::new(0, 0, 0, 0, Expr::Var(Symbol::STR_CONCAT));
let expr = Expr::Call(
Box::new((var_store.fresh(), fn_expr, var_store.fresh())),
vec![
(var_store.fresh(), loc_new_expr),
(var_store.fresh(), loc_expr),
],
CalledVia::Space,
);
loc_expr = Located::new(0, 0, 0, 0, expr);
}
loc_expr.value
}
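The right-to-left fold in `desugar_str_segments` can be easier to follow on a toy AST. The sketch below is an illustration only: `Segment`, `Expr`, and `Concat` are simplified, hypothetical types (the real code builds `Expr::Call` nodes on `Symbol::STR_CONCAT` with fresh type variables), but the iteration order and nesting should match.

```rust
/// Simplified stand-in for the canonical `StrSegment` (hypothetical).
#[derive(Debug, PartialEq)]
enum Segment {
    Plaintext(String),
    Interpolation(String), // just a variable name in this sketch
}

/// Simplified stand-in for the canonical `Expr` (hypothetical).
#[derive(Debug, PartialEq)]
enum Expr {
    Str(String),
    Var(String),
    Concat(Box<Expr>, Box<Expr>),
}

fn segment_to_expr(seg: Segment) -> Expr {
    match seg {
        Segment::Plaintext(s) => Expr::Str(s),
        Segment::Interpolation(name) => Expr::Var(name),
    }
}

/// Fold segments from the right, so the result nests as
/// `concat(first, concat(second, ... last))`.
fn desugar(segments: Vec<Segment>) -> Expr {
    let mut iter = segments.into_iter().rev();

    // The last segment seeds the accumulator; no segments means "".
    let mut acc = match iter.next() {
        Some(seg) => segment_to_expr(seg),
        None => Expr::Str(String::new()),
    };

    // Each earlier segment becomes the left argument of a new concat.
    for seg in iter {
        acc = Expr::Concat(Box::new(segment_to_expr(seg)), Box::new(acc));
    }

    acc
}
```

With these toy types, the segments of `"Hi, \(name)!"` desugar to `Concat(Str("Hi, "), Concat(Var("name"), Str("!")))`.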
/// Returns the char that the escape sequence represents (e.g. '\n' for `\n`)
pub fn unescape_char(escaped: &EscapedChar) -> char {
use EscapedChar::*;
match escaped {
Backslash => '\\',
Quote => '"',
CarriageReturn => '\r',
Tab => '\t',
Newline => '\n',
}
}

View file

@ -68,8 +68,6 @@ pub fn desugar_expr<'a>(arena: &'a Bump, loc_expr: &'a Located<Expr<'a>>) -> &'a
| Nested(NonBase10Int { .. })
| Str(_)
| Nested(Str(_))
| BlockStr(_)
| Nested(BlockStr(_))
| AccessorFunction(_)
| Nested(AccessorFunction(_))
| Var { .. }

View file

@ -1,10 +1,10 @@
use crate::env::Env;
use crate::expr::{canonicalize_expr, Expr, Output};
use crate::expr::{canonicalize_expr, unescape_char, Expr, Output};
use crate::num::{finish_parsing_base, finish_parsing_float, finish_parsing_int};
use crate::scope::Scope;
use roc_module::ident::{Ident, Lowercase, TagName};
use roc_module::symbol::Symbol;
use roc_parse::ast;
use roc_parse::ast::{self, StrLiteral, StrSegment};
use roc_parse::pattern::PatternType;
use roc_problem::can::{MalformedPatternProblem, Problem, RuntimeError};
use roc_region::all::{Located, Region};
@ -230,16 +230,8 @@ pub fn canonicalize_pattern<'a>(
ptype => unsupported_pattern(env, ptype, region),
},
StrLiteral(string) => match pattern_type {
WhenBranch => {
// TODO report whether string was malformed
Pattern::StrLiteral((*string).into())
}
ptype => unsupported_pattern(env, ptype, region),
},
BlockStrLiteral(_lines) => match pattern_type {
WhenBranch => todo!("TODO block string literal pattern"),
StrLiteral(literal) => match pattern_type {
WhenBranch => flatten_str_literal(literal),
ptype => unsupported_pattern(env, ptype, region),
},
@ -473,3 +465,38 @@ fn add_bindings_from_patterns(
| UnsupportedPattern(_) => (),
}
}
fn flatten_str_literal(literal: &StrLiteral<'_>) -> Pattern {
use ast::StrLiteral::*;
match literal {
PlainLine(str_slice) => Pattern::StrLiteral((*str_slice).into()),
Line(segments) => flatten_str_lines(&[segments]),
Block(lines) => flatten_str_lines(lines),
}
}
fn flatten_str_lines(lines: &[&[StrSegment<'_>]]) -> Pattern {
use StrSegment::*;
let mut buf = String::new();
for line in lines {
for segment in line.iter() {
match segment {
Plaintext(string) => {
buf.push_str(string);
}
Unicode(loc_digits) => {
todo!("parse unicode digits {:?}", loc_digits);
}
Interpolated(loc_expr) => {
return Pattern::UnsupportedPattern(loc_expr.region);
}
EscapedChar(escaped) => buf.push(unescape_char(escaped)),
}
}
}
Pattern::StrLiteral(buf.into())
}
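Unlike the expression version, the pattern version of `flatten_str_lines` has nothing to desugar: it concatenates every segment into one buffer and gives up as soon as it sees an interpolation. A rough standalone sketch of that behavior, using a hypothetical `Segment` type and returning `Option<String>` in place of `Pattern` (with `None` standing in for `Pattern::UnsupportedPattern`):

```rust
/// Simplified stand-in for `ast::StrSegment` (hypothetical).
#[derive(Debug, PartialEq)]
enum Segment<'a> {
    Plaintext(&'a str),
    Escaped(char),         // already unescaped, e.g. '\n'
    Interpolated(&'a str), // name of the interpolated variable
}

/// Flatten the lines of a string literal used as a pattern.
/// Returns `None` if any segment is an interpolation, since a
/// pattern has to be a constant string.
fn flatten_pattern(lines: &[&[Segment<'_>]]) -> Option<String> {
    let mut buf = String::new();

    for line in lines {
        for seg in line.iter() {
            match seg {
                Segment::Plaintext(s) => buf.push_str(s),
                Segment::Escaped(ch) => buf.push(*ch),
                Segment::Interpolated(_) => return None,
            }
        }
    }

    Some(buf)
}
```

This sketch omits the `Unicode` segment, which the commit leaves as a `todo!` for patterns.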

View file

@ -27,7 +27,7 @@ pub fn parse_with<'a>(arena: &'a Bump, input: &'a str) -> Result<ast::Expr<'a>,
#[allow(dead_code)]
pub fn parse_loc_with<'a>(arena: &'a Bump, input: &'a str) -> Result<Located<ast::Expr<'a>>, Fail> {
let state = State::new(input.as_bytes(), Attempting::Module);
let state = State::new(input.trim().as_bytes(), Attempting::Module);
let parser = space0_before(loc(roc_parse::expr::expr(0)), 0);
let answer = parser.parse(&arena, state);

View file

@ -69,6 +69,10 @@ mod test_can {
}
}
fn expr_str(contents: &str) -> Expr {
Expr::Str(contents.into())
}
// NUMBER LITERALS
#[test]
@ -1179,161 +1183,61 @@ mod test_can {
//}
//
//
//// STRING LITERALS
// STRING LITERALS
//
// #[test]
// fn string_with_valid_unicode_escapes() {
// expect_parsed_str("x\u{00A0}x", r#""x\u{00A0}x""#);
// expect_parsed_str("x\u{101010}x", r#""x\u{101010}x""#);
// }
#[test]
fn string_with_valid_unicode_escapes() {
assert_can(r#""x\u(00A0)x""#, expr_str("x\u{00A0}x"));
assert_can(r#""x\u(101010)x""#, expr_str("x\u{101010}x"));
}
// #[test]
// fn string_with_too_large_unicode_escape() {
// // Should be too big - max size should be 10FFFF.
// // (Rust has this restriction. I assume it's a good idea.)
// assert_malformed_str(
// r#""abc\u{110000}def""#,
// vec![Located::new(0, 7, 0, 12, Problem::UnicodeCodePointTooLarge)],
// );
// }
// #[test]
// fn string_with_too_large_unicode_escape() {
// // Should be too big - max size should be 10FFFF.
// // (Rust has this restriction. I assume it's a good idea.)
// assert_malformed_str(
// r#""abc\u{110000}def""#,
// vec![Located::new(0, 7, 0, 12, Problem::UnicodeCodePointTooLarge)],
// );
// }
// #[test]
// fn string_with_no_unicode_digits() {
// // No digits specified
// assert_malformed_str(
// r#""blah\u{}foo""#,
// vec![Located::new(0, 5, 0, 8, Problem::NoUnicodeDigits)],
// );
// }
// #[test]
// fn string_with_no_unicode_digits() {
// // No digits specified
// assert_malformed_str(
// r#""blah\u{}foo""#,
// vec![Located::new(0, 5, 0, 8, Problem::NoUnicodeDigits)],
// );
// }
// #[test]
// fn string_with_no_unicode_opening_brace() {
// // No opening curly brace. It can't be sure if the closing brace
// // was intended to be a closing brace for the unicode escape, so
// // report that there were no digits specified.
// assert_malformed_str(
// r#""abc\u00A0}def""#,
// vec![Located::new(0, 4, 0, 5, Problem::NoUnicodeDigits)],
// );
// }
// #[test]
// fn string_with_no_unicode_opening_brace() {
// // No opening curly brace. It can't be sure if the closing brace
// // was intended to be a closing brace for the unicode escape, so
// // report that there were no digits specified.
// assert_malformed_str(
// r#""abc\u00A0}def""#,
// vec![Located::new(0, 4, 0, 5, Problem::NoUnicodeDigits)],
// );
// }
// #[test]
// fn string_with_no_unicode_closing_brace() {
// // No closing curly brace
// assert_malformed_str(
// r#""blah\u{stuff""#,
// vec![Located::new(0, 5, 0, 12, Problem::MalformedEscapedUnicode)],
// );
// }
// #[test]
// fn string_with_no_unicode_closing_brace() {
// // No closing curly brace
// assert_malformed_str(
// r#""blah\u{stuff""#,
// vec![Located::new(0, 5, 0, 12, Problem::MalformedEscapedUnicode)],
// );
// }
// #[test]
// fn string_with_no_unicode_braces() {
// // No curly braces
// assert_malformed_str(
// r#""zzzz\uzzzzz""#,
// vec![Located::new(0, 5, 0, 6, Problem::NoUnicodeDigits)],
// );
// }
// #[test]
// fn string_with_interpolation_at_start() {
// let input = indoc!(
// r#"
// "\(abc)defg"
// "#
// );
// let (args, ret) = (vec![("", Located::new(0, 2, 0, 4, Var("abc")))], "defg");
// let arena = Bump::new();
// let actual = parse_with(&arena, input);
// assert_eq!(
// Ok(InterpolatedStr(&(arena.alloc_slice_clone(&args), ret))),
// actual
// );
// }
// #[test]
// fn string_with_interpolation_at_end() {
// let input = indoc!(
// r#"
// "abcd\(efg)"
// "#
// );
// let (args, ret) = (vec![("abcd", Located::new(0, 6, 0, 8, Var("efg")))], "");
// let arena = Bump::new();
// let actual = parse_with(&arena, input);
// assert_eq!(
// Ok(InterpolatedStr(&(arena.alloc_slice_clone(&args), ret))),
// actual
// );
// }
// #[test]
// fn string_with_interpolation_in_middle() {
// let input = indoc!(
// r#"
// "abc\(defg)hij"
// "#
// );
// let (args, ret) = (vec![("abc", Located::new(0, 5, 0, 8, Var("defg")))], "hij");
// let arena = Bump::new();
// let actual = parse_with(&arena, input);
// assert_eq!(
// Ok(InterpolatedStr(&(arena.alloc_slice_clone(&args), ret))),
// actual
// );
// }
// #[test]
// fn string_with_two_interpolations_in_middle() {
// let input = indoc!(
// r#"
// "abc\(defg)hi\(jkl)mn"
// "#
// );
// let (args, ret) = (
// vec![
// ("abc", Located::new(0, 5, 0, 8, Var("defg"))),
// ("hi", Located::new(0, 14, 0, 16, Var("jkl"))),
// ],
// "mn",
// );
// let arena = Bump::new();
// let actual = parse_with(&arena, input);
// assert_eq!(
// Ok(InterpolatedStr(&(arena.alloc_slice_clone(&args), ret))),
// actual
// );
// }
// #[test]
// fn string_with_four_interpolations() {
// let input = indoc!(
// r#"
// "\(abc)def\(ghi)jkl\(mno)pqrs\(tuv)"
// "#
// );
// let (args, ret) = (
// vec![
// ("", Located::new(0, 2, 0, 4, Var("abc"))),
// ("def", Located::new(0, 11, 0, 13, Var("ghi"))),
// ("jkl", Located::new(0, 20, 0, 22, Var("mno"))),
// ("pqrs", Located::new(0, 30, 0, 32, Var("tuv"))),
// ],
// "",
// );
// let arena = Bump::new();
// let actual = parse_with(&arena, input);
// assert_eq!(
// Ok(InterpolatedStr(&(arena.alloc_slice_clone(&args), ret))),
// actual
// );
// }
// #[test]
// fn string_with_no_unicode_braces() {
// // No curly braces
// assert_malformed_str(
// r#""zzzz\uzzzzz""#,
// vec![Located::new(0, 5, 0, 6, Problem::NoUnicodeDigits)],
// );
// }
// #[test]
// fn string_with_escaped_interpolation() {
@ -1341,13 +1245,12 @@ mod test_can {
// // This should NOT be string interpolation, because of the \\
// indoc!(
// r#"
// "abcd\\(efg)hij"
// "#
// "abcd\\(efg)hij"
// "#
// ),
// Str(r#"abcd\(efg)hij"#.into()),
// );
// }
//
// #[test]
// fn string_without_escape() {
@ -1384,4 +1287,6 @@ mod test_can {
// TODO test hex/oct/binary conversion to numbers
//
// TODO test for \t \r and \n in string literals *outside* unicode escape sequence!
//
// TODO test for multiline block string literals in pattern matches
}

View file

@ -199,7 +199,7 @@ pub fn constrain_expr(
exists(vars, And(cons))
}
Str(_) | BlockStr(_) => Eq(str_type(), expected, Category::Str, region),
Str(_) => Eq(str_type(), expected, Category::Str, region),
List {
elem_var,
loc_elems,

View file

@ -503,7 +503,7 @@ pub fn constrain_expr(
]),
)
}
BlockStr(_) | Str(_) => {
Str(_) => {
let uniq_type = var_store.fresh();
let inferred = str_type(Bool::variable(uniq_type));

View file

@ -6,6 +6,7 @@ use crate::spaces::{
};
use bumpalo::collections::{String, Vec};
use roc_module::operator::{self, BinOp};
use roc_parse::ast::StrSegment;
use roc_parse::ast::{AssignedField, Base, CommentOrNewline, Expr, Pattern, WhenBranch};
use roc_region::all::Located;
@ -28,7 +29,6 @@ impl<'a> Formattable<'a> for Expr<'a> {
Float(_)
| Num(_)
| NonBase10Int { .. }
| Str(_)
| Access(_, _)
| AccessorFunction(_)
| Var { .. }
@ -42,7 +42,20 @@ impl<'a> Formattable<'a> for Expr<'a> {
List(elems) => elems.iter().any(|loc_expr| loc_expr.is_multiline()),
BlockStr(lines) => lines.len() > 1,
Str(literal) => {
use roc_parse::ast::StrLiteral::*;
match literal {
PlainLine(_) | Line(_) => {
// If this had any newlines, it'd have parsed as Block.
false
}
Block(lines) => {
// Block strings don't *have* to be multiline!
lines.len() > 1
}
}
}
Apply(loc_expr, args, _) => {
loc_expr.is_multiline() || args.iter().any(|loc_arg| loc_arg.is_multiline())
}
@ -112,9 +125,53 @@ impl<'a> Formattable<'a> for Expr<'a> {
sub_expr.format_with_options(buf, Parens::NotNeeded, Newlines::Yes, indent);
buf.push(')');
}
Str(string) => {
Str(literal) => {
use roc_parse::ast::StrLiteral::*;
buf.push('"');
buf.push_str(string);
match literal {
PlainLine(string) => {
buf.push_str(string);
}
Line(segments) => {
for seg in segments.iter() {
format_str_segment(seg, buf, 0)
}
}
Block(lines) => {
buf.push_str("\"\"");
if lines.len() > 1 {
// Since we have multiple lines, format this with
// the `"""` symbols on their own lines, and the
newline(buf, indent);
for segments in lines.iter() {
for seg in segments.iter() {
format_str_segment(seg, buf, indent);
}
newline(buf, indent);
}
} else {
// This is a single-line block string, for example:
//
// """Whee, "quotes" inside quotes!"""
// This loop will run either 0 or 1 times.
for segments in lines.iter() {
for seg in segments.iter() {
format_str_segment(seg, buf, indent);
}
// Don't print a newline here, because we either
// just printed 1 or 0 lines.
}
}
buf.push_str("\"\"");
}
}
buf.push('"');
}
Var { module_name, ident } => {
@ -152,13 +209,6 @@ impl<'a> Formattable<'a> for Expr<'a> {
buf.push(')');
}
}
BlockStr(lines) => {
buf.push_str("\"\"\"");
for line in lines.iter() {
buf.push_str(line);
}
buf.push_str("\"\"\"");
}
Num(string) | Float(string) | GlobalTag(string) | PrivateTag(string) => {
buf.push_str(string)
}
@ -252,6 +302,36 @@ impl<'a> Formattable<'a> for Expr<'a> {
}
}
fn format_str_segment<'a>(seg: &StrSegment<'a>, buf: &mut String<'a>, indent: u16) {
use StrSegment::*;
match seg {
Plaintext(string) => {
buf.push_str(string);
}
Unicode(loc_str) => {
buf.push_str("\\u(");
buf.push_str(loc_str.value); // e.g. "00A0" in "\u(00A0)"
buf.push(')');
}
EscapedChar(escaped) => {
buf.push('\\');
buf.push(escaped.to_parsed_char());
}
Interpolated(loc_expr) => {
buf.push_str("\\(");
// e.g. (name) in "Hi, \(name)!"
loc_expr.value.format_with_options(
buf,
Parens::NotNeeded, // We already printed parens!
Newlines::No, // Interpolations can never have newlines
indent,
);
buf.push(')');
}
}
}
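The formatter and the canonicalizer treat escapes differently: `format_str_segment` writes the escape back as source text (a backslash plus `to_parsed_char`), while `unescape_char` produces the character the escape stands for. The sketch below puts both mappings side by side on a local copy of the enum; `format_escape` and the `unescape` method are hypothetical names used only for this illustration.

```rust
/// Local copy of the escape enum, for illustration only.
#[derive(Copy, Clone, Debug, PartialEq)]
enum EscapedChar {
    Newline,
    Tab,
    Quote,
    Backslash,
    CarriageReturn,
}

impl EscapedChar {
    /// The char that followed the backslash in the source text.
    fn to_parsed_char(self) -> char {
        match self {
            EscapedChar::Backslash => '\\',
            EscapedChar::Quote => '"',
            EscapedChar::CarriageReturn => 'r',
            EscapedChar::Tab => 't',
            EscapedChar::Newline => 'n',
        }
    }

    /// The char the escape sequence represents at runtime.
    fn unescape(self) -> char {
        match self {
            EscapedChar::Backslash => '\\',
            EscapedChar::Quote => '"',
            EscapedChar::CarriageReturn => '\r',
            EscapedChar::Tab => '\t',
            EscapedChar::Newline => '\n',
        }
    }
}

/// Render an escape the way the formatter does: backslash + source char.
fn format_escape(escaped: EscapedChar) -> String {
    let mut buf = String::new();
    buf.push('\\');
    buf.push(escaped.to_parsed_char());
    buf
}
```

Keeping the two mappings separate is what lets a formatted string round-trip to the same source text instead of having its escapes expanded.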
fn fmt_bin_op<'a>(
buf: &mut String<'a>,
loc_left_side: &'a Located<Expr<'a>>,

View file

@ -37,7 +37,6 @@ impl<'a> Formattable<'a> for Pattern<'a> {
| Pattern::NonBase10Literal { .. }
| Pattern::FloatLiteral(_)
| Pattern::StrLiteral(_)
| Pattern::BlockStrLiteral(_)
| Pattern::Underscore
| Pattern::Malformed(_)
| Pattern::QualifiedIdentifier { .. } => false,
@ -126,11 +125,8 @@ impl<'a> Formattable<'a> for Pattern<'a> {
buf.push_str(string);
}
FloatLiteral(string) => buf.push_str(string),
StrLiteral(string) => buf.push_str(string),
BlockStrLiteral(lines) => {
for line in *lines {
buf.push_str(line)
}
StrLiteral(literal) => {
todo!("Format string literal: {:?}", literal);
}
Underscore => buf.push('_'),

View file

@ -20,7 +20,7 @@ mod test_fmt {
use roc_parse::parser::{Fail, Parser, State};
fn parse_with<'a>(arena: &'a Bump, input: &'a str) -> Result<Expr<'a>, Fail> {
let state = State::new(input.as_bytes(), Attempting::Module);
let state = State::new(input.trim().as_bytes(), Attempting::Module);
let parser = space0_before(loc!(roc_parse::expr::expr(0)), 0);
let answer = parser.parse(&arena, state);
@ -192,7 +192,7 @@ mod test_fmt {
fn escaped_unicode_string() {
expr_formats_same(indoc!(
r#"
"unicode: \u{A00A}!"
"unicode: \u(A00A)!"
"#
));
}
@ -206,47 +206,47 @@ mod test_fmt {
));
}
#[test]
fn empty_block_string() {
expr_formats_same(indoc!(
r#"
""""""
"#
));
}
// #[test]
// fn empty_block_string() {
// expr_formats_same(indoc!(
// r#"
// """"""
// "#
// ));
// }
#[test]
fn basic_block_string() {
expr_formats_same(indoc!(
r#"
"""blah"""
"#
));
}
// #[test]
// fn basic_block_string() {
// expr_formats_same(indoc!(
// r#"
// """blah"""
// "#
// ));
// }
#[test]
fn newlines_block_string() {
expr_formats_same(indoc!(
r#"
"""blah
spam
foo"""
"#
));
}
// #[test]
// fn newlines_block_string() {
// expr_formats_same(indoc!(
// r#"
// """blah
// spam
// foo"""
// "#
// ));
// }
#[test]
fn quotes_block_string() {
expr_formats_same(indoc!(
r#"
"""
// #[test]
// fn quotes_block_string() {
// expr_formats_same(indoc!(
// r#"
// """
"" \""" ""\"
// "" \""" ""\"
"""
"#
));
}
// """
// "#
// ));
// }
#[test]
fn zero() {

View file

@ -87,7 +87,7 @@ pub fn infer_expr(
}
pub fn parse_loc_with<'a>(arena: &'a Bump, input: &'a str) -> Result<Located<ast::Expr<'a>>, Fail> {
let state = State::new(input.as_bytes(), Attempting::Module);
let state = State::new(input.trim().as_bytes(), Attempting::Module);
let parser = space0_before(loc(roc_parse::expr::expr(0)), 0);
let answer = parser.parse(&arena, state);

View file

@ -69,7 +69,7 @@ pub fn parse_with<'a>(arena: &'a Bump, input: &'a str) -> Result<ast::Expr<'a>,
#[allow(dead_code)]
pub fn parse_loc_with<'a>(arena: &'a Bump, input: &'a str) -> Result<Located<ast::Expr<'a>>, Fail> {
let state = State::new(input.as_bytes(), Attempting::Module);
let state = State::new(input.trim().as_bytes(), Attempting::Module);
let parser = space0_before(loc(roc_parse::expr::expr(0)), 0);
let answer = parser.parse(&arena, state);

View file

@ -496,11 +496,11 @@ pub fn lowlevel_borrow_signature(arena: &Bump, op: LowLevel) -> &[bool] {
ListSet => arena.alloc_slice_copy(&[owned, irrelevant, irrelevant]),
ListSetInPlace => arena.alloc_slice_copy(&[owned, irrelevant, irrelevant]),
ListGetUnsafe => arena.alloc_slice_copy(&[borrowed, irrelevant]),
ListConcat | StrConcat => arena.alloc_slice_copy(&[owned, borrowed]),
ListSingle => arena.alloc_slice_copy(&[irrelevant]),
ListRepeat => arena.alloc_slice_copy(&[irrelevant, irrelevant]),
ListReverse => arena.alloc_slice_copy(&[owned]),
ListConcat | StrConcat => arena.alloc_slice_copy(&[irrelevant, irrelevant]),
ListAppend => arena.alloc_slice_copy(&[owned, owned]),
ListPrepend => arena.alloc_slice_copy(&[owned, owned]),
ListJoin => arena.alloc_slice_copy(&[irrelevant]),

View file

@ -585,6 +585,7 @@ pub enum Stmt<'a> {
Jump(JoinPointId, &'a [Symbol]),
RuntimeError(&'a str),
}
#[derive(Clone, Debug, PartialEq)]
pub enum Literal<'a> {
// Literals
@ -1242,7 +1243,7 @@ pub fn with_hole<'a>(
hole,
),
Str(string) | BlockStr(string) => Stmt::Let(
Str(string) => Stmt::Let(
assigned,
Expr::Literal(Literal::Str(arena.alloc(string))),
Layout::Builtin(Builtin::Str),

View file

@ -311,7 +311,7 @@ fn layout_from_flat_type<'a>(
// Num.Num should only ever have 1 argument, e.g. Num.Num Int.Integer
debug_assert_eq!(args.len(), 1);
let var = args.get(0).unwrap();
let var = args.first().unwrap();
let content = subs.get_without_compacting(*var).content;
layout_from_num_content(content)

View file

@ -84,6 +84,46 @@ pub struct WhenPattern<'a> {
pub guard: Option<Loc<Expr<'a>>>,
}
#[derive(Clone, Debug, PartialEq)]
pub enum StrSegment<'a> {
Plaintext(&'a str), // e.g. "foo"
Unicode(Loc<&'a str>), // e.g. "00A0" in "\u(00A0)"
EscapedChar(EscapedChar), // e.g. '\n' in "Hello!\n"
Interpolated(Loc<&'a Expr<'a>>), // e.g. (name) in "Hi, \(name)!"
}
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum EscapedChar {
Newline, // \n
Tab, // \t
Quote, // \"
Backslash, // \\
CarriageReturn, // \r
}
impl EscapedChar {
/// Returns the char that followed the backslash in the source (e.g. 'n' for `\n`)
pub fn to_parsed_char(&self) -> char {
use EscapedChar::*;
match self {
Backslash => '\\',
Quote => '"',
CarriageReturn => 'r',
Tab => 't',
Newline => 'n',
}
}
}
#[derive(Clone, Debug, PartialEq)]
pub enum StrLiteral<'a> {
/// The most common case: a plain string with no escapes or interpolations
PlainLine(&'a str),
Line(&'a [StrSegment<'a>]),
Block(&'a [&'a [StrSegment<'a>]]),
}
/// A parsed expression. This uses lifetimes extensively for two reasons:
///
/// 1. It uses Bump::alloc for all allocations, which returns a reference.
@ -105,8 +145,7 @@ pub enum Expr<'a> {
},
// String Literals
Str(&'a str),
BlockStr(&'a [&'a str]),
Str(StrLiteral<'a>), // may contain escapes and interpolations
/// Look up exactly one field on a record, e.g. (expr).foo.
Access(&'a Expr<'a>, &'a str),
/// e.g. `.foo`
@ -336,8 +375,7 @@ pub enum Pattern<'a> {
is_negative: bool,
},
FloatLiteral(&'a str),
StrLiteral(&'a str),
BlockStrLiteral(&'a [&'a str]),
StrLiteral(StrLiteral<'a>),
Underscore,
// Space
@ -455,7 +493,6 @@ impl<'a> Pattern<'a> {
) => string_x == string_y && base_x == base_y && is_negative_x == is_negative_y,
(FloatLiteral(x), FloatLiteral(y)) => x == y,
(StrLiteral(x), StrLiteral(y)) => x == y,
(BlockStrLiteral(x), BlockStrLiteral(y)) => x == y,
(Underscore, Underscore) => true,
// Space
@ -584,7 +621,7 @@ impl<'a> Spaceable<'a> for Def<'a> {
pub enum Attempting {
List,
Keyword,
StringLiteral,
StrLiteral,
RecordLiteral,
RecordFieldLabel,
InterpolatedString,
@ -596,6 +633,7 @@ pub enum Attempting {
Module,
Record,
Identifier,
HexDigit,
ConcreteType,
TypeVariable,
WhenCondition,
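
The `StrSegment` variants in this file keep the raw pieces of a string literal, and `to_parsed_char` returns the letter that followed the backslash (`'n'`), not the character it stands for. As a rough illustration of how such segments could later be flattened into a single string, here is a minimal sketch. It is not the compiler's canonicalization code: `resolve_segments` and `to_resolved_char` are hypothetical names, and the types are simplified stand-ins (no `Loc` regions, no `Interpolated` variant) so the example depends only on `std`.

```rust
/// Simplified stand-in for the parser's `EscapedChar` (an assumption, not the real type).
#[derive(Copy, Clone, Debug, PartialEq)]
enum EscapedChar {
    Newline,
    Tab,
    Quote,
    Backslash,
    CarriageReturn,
}

impl EscapedChar {
    /// Hypothetical helper: the character this escape stands for once resolved.
    fn to_resolved_char(self) -> char {
        match self {
            EscapedChar::Newline => '\n',
            EscapedChar::Tab => '\t',
            EscapedChar::Quote => '"',
            EscapedChar::Backslash => '\\',
            EscapedChar::CarriageReturn => '\r',
        }
    }
}

/// Simplified stand-in for `StrSegment`, without regions or interpolation.
#[derive(Clone, Debug, PartialEq)]
enum StrSegment<'a> {
    Plaintext(&'a str),
    Unicode(&'a str), // hex digits only, e.g. "00A0"
    EscapedChar(EscapedChar),
}

/// Hypothetical helper: flatten segments into one `String`.
/// Returns `None` when a unicode escape is not valid hexadecimal
/// or does not name a valid Unicode scalar value.
fn resolve_segments(segments: &[StrSegment<'_>]) -> Option<String> {
    let mut out = String::new();
    for segment in segments {
        match segment {
            StrSegment::Plaintext(text) => out.push_str(text),
            StrSegment::EscapedChar(esc) => out.push(esc.to_resolved_char()),
            StrSegment::Unicode(hex) => {
                let code_point = u32::from_str_radix(hex, 16).ok()?;
                out.push(char::from_u32(code_point)?);
            }
        }
    }
    Some(out)
}
```

The two `None` cases loosely correspond to the `InvalidHexadecimal` and `InvalidUnicodeCodePoint` problems this commit introduces, although the sketch does not distinguish between them.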


@ -300,12 +300,8 @@ fn expr_to_pattern<'a>(arena: &'a Bump, expr: &Expr<'a>) -> Result<Pattern<'a>,
base: *base,
is_negative: *is_negative,
}),
Expr::Str(string) => Ok(Pattern::StrLiteral(string)),
Expr::MalformedIdent(string) => Ok(Pattern::Malformed(string)),
// These would not have parsed as patterns
Expr::BlockStr(_)
| Expr::AccessorFunction(_)
Expr::AccessorFunction(_)
| Expr::Access(_, _)
| Expr::List(_)
| Expr::Closure(_, _)
@ -322,6 +318,9 @@ fn expr_to_pattern<'a>(arena: &'a Bump, expr: &Expr<'a>) -> Result<Pattern<'a>,
attempting: Attempting::Def,
reason: FailReason::InvalidPattern,
}),
Expr::Str(string) => Ok(Pattern::StrLiteral(string.clone())),
Expr::MalformedIdent(string) => Ok(Pattern::Malformed(string)),
}
}
@ -580,11 +579,7 @@ fn annotation_or_alias<'a>(
QualifiedIdentifier { .. } => {
panic!("TODO gracefully handle trying to annotate a qualified identifier, e.g. `Foo.bar : ...`");
}
NumLiteral(_)
| NonBase10Literal { .. }
| FloatLiteral(_)
| StrLiteral(_)
| BlockStrLiteral(_) => {
NumLiteral(_) | NonBase10Literal { .. } | FloatLiteral(_) | StrLiteral(_) => {
panic!("TODO gracefully handle trying to annotate a literal");
}
Underscore => {
@ -916,10 +911,7 @@ fn number_pattern<'a>() -> impl Parser<'a, Pattern<'a>> {
}
fn string_pattern<'a>() -> impl Parser<'a, Pattern<'a>> {
map!(crate::string_literal::parse(), |result| match result {
crate::string_literal::StringLiteral::Line(string) => Pattern::StrLiteral(string),
crate::string_literal::StringLiteral::Block(lines) => Pattern::BlockStrLiteral(lines),
})
map!(crate::string_literal::parse(), Pattern::StrLiteral)
}
fn underscore_pattern<'a>() -> impl Parser<'a, Pattern<'a>> {
@ -1789,8 +1781,5 @@ pub fn global_tag<'a>() -> impl Parser<'a, &'a str> {
}
pub fn string_literal<'a>() -> impl Parser<'a, Expr<'a>> {
map!(crate::string_literal::parse(), |result| match result {
crate::string_literal::StringLiteral::Line(string) => Expr::Str(string),
crate::string_literal::StringLiteral::Block(lines) => Expr::BlockStr(lines),
})
map!(crate::string_literal::parse(), Expr::Str)
}


@ -445,6 +445,29 @@ pub fn ascii_char<'a>(expected: char) -> impl Parser<'a, ()> {
}
}
/// One or more ASCII hex digits. (Useful when parsing unicode escape codes,
/// which must consist entirely of ASCII hex digits.)
pub fn ascii_hex_digits<'a>() -> impl Parser<'a, &'a str> {
move |arena, state: State<'a>| {
let mut buf = bumpalo::collections::String::new_in(arena);
for &byte in state.bytes.iter() {
if (byte as char).is_ascii_hexdigit() {
buf.push(byte as char);
} else if buf.is_empty() {
// We didn't find any hex digits!
return Err(unexpected(0, state, Attempting::HexDigit));
} else {
let state = state.advance_without_indenting(buf.len())?;
return Ok((buf.into_bump_str(), state));
}
}
Err(unexpected_eof(0, Attempting::HexDigit, state))
}
}
/// A single UTF-8-encoded char. This will both parse *and* validate that the
/// char is valid UTF-8, but it will *not* advance the state.
pub fn peek_utf8_char<'a>(state: &State<'a>) -> Result<(char, usize), FailReason> {
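
To make the behavior of `ascii_hex_digits` easier to picture outside the parser combinator machinery, here is a minimal `std`-only sketch of the same scanning idea. `take_hex_digits` is a hypothetical name, and it works on `&str` instead of `State`. One deliberate difference, stated as an assumption: the sketch accepts input that ends while still inside the digits, whereas the parser above reports an unexpected end of file in that case.

```rust
/// Hypothetical helper: split `input` into its leading run of ASCII hex
/// digits and the remainder. Returns `None` when `input` does not start
/// with a hex digit.
fn take_hex_digits(input: &str) -> Option<(&str, &str)> {
    // Index of the first byte that is not an ASCII hex digit,
    // or the full length if every byte is one.
    let end = input
        .bytes()
        .position(|byte| !byte.is_ascii_hexdigit())
        .unwrap_or(input.len());

    if end == 0 {
        // We didn't find any hex digits.
        None
    } else {
        // Every byte before `end` is ASCII, so `end` is a char boundary.
        Some(input.split_at(end))
    }
}
```

Returning the remainder alongside the digits plays the role that the advanced `State` plays in the real parser.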


@ -1,90 +1,242 @@
use crate::ast::Attempting;
use crate::parser::{parse_utf8, unexpected, unexpected_eof, ParseResult, Parser, State};
use crate::ast::{Attempting, EscapedChar, StrLiteral, StrSegment};
use crate::expr;
use crate::parser::{
allocated, ascii_char, ascii_hex_digits, loc, parse_utf8, unexpected, unexpected_eof,
ParseResult, Parser, State,
};
use bumpalo::collections::vec::Vec;
use bumpalo::Bump;
pub enum StringLiteral<'a> {
Line(&'a str),
Block(&'a [&'a str]),
}
pub fn parse<'a>() -> impl Parser<'a, StrLiteral<'a>> {
use StrLiteral::*;
pub fn parse<'a>() -> impl Parser<'a, StringLiteral<'a>> {
move |arena: &'a Bump, state: State<'a>| {
move |arena: &'a Bump, mut state: State<'a>| {
let mut bytes = state.bytes.iter();
// String literals must start with a quote.
// If this doesn't, it must not be a string literal!
match bytes.next() {
Some(&byte) => {
if byte != b'"' {
return Err(unexpected(0, state, Attempting::StringLiteral));
return Err(unexpected(0, state, Attempting::StrLiteral));
}
}
None => {
return Err(unexpected_eof(0, Attempting::StringLiteral, state));
return Err(unexpected_eof(0, Attempting::StrLiteral, state));
}
}
// Advance past the opening quotation mark.
state = state.advance_without_indenting(1)?;
// At the parsing stage we keep the entire raw string, because the formatter
// needs the raw string. (For example, so it can "remember" whether you
// wrote \u(...) or the actual unicode character itself.)
//
// Later, in canonicalization, we'll do things like resolving
// unicode escapes and string interpolation.
//
// Since we're keeping the entire raw string, all we need to track is
// how many characters we've parsed. So far, that's 1 (the opening `"`).
let mut parsed_chars = 1;
let mut prev_byte = b'"';
let mut segment_parsed_bytes = 0;
let mut segments = Vec::new_in(arena);
while let Some(&byte) = bytes.next() {
parsed_chars += 1;
macro_rules! escaped_char {
($ch:expr) => {
// Record the escaped char.
segments.push(StrSegment::EscapedChar($ch));
// Potentially end the string (unless this is an escaped `"`!)
if byte == b'"' && prev_byte != b'\\' {
let (string, state) = if parsed_chars == 2 {
match bytes.next() {
Some(byte) if *byte == b'"' => {
// If the first three chars were all `"`, then this
// literal begins with `"""` and is a block string.
return parse_block_string(arena, state, &mut bytes);
}
_ => ("", state.advance_without_indenting(2)?),
}
} else {
// Start at 1 so we omit the opening `"`.
// Subtract 1 from parsed_chars so we omit the closing `"`.
let string_bytes = &state.bytes[1..(parsed_chars - 1)];
// Advance past the segment we just added
state = state.advance_without_indenting(segment_parsed_bytes)?;
// Reset the segment
segment_parsed_bytes = 0;
};
}
macro_rules! end_segment {
($transform:expr) => {
// Don't push anything if the string would be empty.
if segment_parsed_bytes > 1 {
// This function is always called after we just parsed
// something which signalled that we should end the
// current segment - so use segment_parsed_bytes - 1 here,
// to exclude that char we just parsed.
let string_bytes = &state.bytes[0..(segment_parsed_bytes - 1)];
match parse_utf8(string_bytes) {
Ok(string) => (string, state.advance_without_indenting(parsed_chars)?),
Ok(string) => {
state = state.advance_without_indenting(string.len())?;
segments.push($transform(string));
}
Err(reason) => {
return state.fail(reason);
}
}
};
}
return Ok((StringLiteral::Line(string), state));
} else if byte == b'\n' {
// This is a single-line string, which cannot have newlines!
// Treat this as an unclosed string literal, and consume
// all remaining chars. This will mask all other errors, but
// it should make it easiest to debug; the file will be a giant
// error starting from where the open quote appeared.
return Err(unexpected(
state.bytes.len() - 1,
state,
Attempting::StringLiteral,
));
} else {
prev_byte = byte;
// Depending on where this macro is used, in some
// places this is unused.
#[allow(unused_assignments)]
{
// This function is always called after we just parsed
// something which signalled that we should end the
// current segment.
segment_parsed_bytes = 1;
}
};
}
while let Some(&byte) = bytes.next() {
// This is for the byte we just grabbed from the iterator.
segment_parsed_bytes += 1;
match byte {
b'"' => {
// This is the end of the string!
if segment_parsed_bytes == 1 && segments.is_empty() {
match bytes.next() {
Some(b'"') => {
// If the very first three chars were all `"`,
// then this literal begins with `"""`
// and is a block string.
return parse_block_string(arena, state, &mut bytes);
}
_ => {
// Advance 1 for the close quote
return Ok((PlainLine(""), state.advance_without_indenting(1)?));
}
}
} else {
end_segment!(StrSegment::Plaintext);
let expr = if segments.len() == 1 {
// We had exactly one segment, so this is a candidate
// to be StrLiteral::PlainLine
match segments.pop().unwrap() {
StrSegment::Plaintext(string) => StrLiteral::PlainLine(string),
other => {
let vec = bumpalo::vec![in arena; other];
StrLiteral::Line(vec.into_bump_slice())
}
}
} else {
Line(segments.into_bump_slice())
};
// Advance the state 1 to account for the closing `"`
return Ok((expr, state.advance_without_indenting(1)?));
};
}
b'\n' => {
// This is a single-line string, which cannot have newlines!
// Treat this as an unclosed string literal, and consume
// all remaining chars. This will mask all other errors, but
// it should make it easiest to debug; the file will be a giant
// error starting from where the open quote appeared.
return Err(unexpected(
state.bytes.len() - 1,
state,
Attempting::StrLiteral,
));
}
b'\\' => {
// We're about to begin an escaped segment of some sort!
//
// Record the current segment so we can begin a new one.
// End it right before the `\` char we just parsed.
end_segment!(StrSegment::Plaintext);
// This is for the byte we're about to parse.
segment_parsed_bytes += 1;
// This is the start of a new escape. Look at the next byte
// to figure out what type of escape it is.
match bytes.next() {
Some(b'(') => {
// Advance past the `\(` before using the expr parser
state = state.advance_without_indenting(2)?;
let original_byte_count = state.bytes.len();
// This is an interpolated variable.
// Parse an arbitrary expression, then give a
// canonicalization error if that expression variant
// is not allowed inside a string interpolation.
let (loc_expr, new_state) =
skip_second!(loc(allocated(expr::expr(0))), ascii_char(')'))
.parse(arena, state)?;
// Advance the iterator past the expr we just parsed.
for _ in 0..(original_byte_count - new_state.bytes.len()) {
bytes.next();
}
segments.push(StrSegment::Interpolated(loc_expr));
// Reset the segment
segment_parsed_bytes = 0;
state = new_state;
}
Some(b'u') => {
// Advance past the `\u` before parsing the hex digits
state = state.advance_without_indenting(2)?;
let original_byte_count = state.bytes.len();
// Parse the hex digits, surrounded by parens, then
// give a canonicalization error if the digits form
// an invalid unicode code point.
let (loc_digits, new_state) =
between!(ascii_char('('), loc(ascii_hex_digits()), ascii_char(')'))
.parse(arena, state)?;
// Advance the iterator past the parenthesized digits we just parsed.
for _ in 0..(original_byte_count - new_state.bytes.len()) {
bytes.next();
}
segments.push(StrSegment::Unicode(loc_digits));
// Reset the segment
segment_parsed_bytes = 0;
state = new_state;
}
Some(b'\\') => {
escaped_char!(EscapedChar::Backslash);
}
Some(b'"') => {
escaped_char!(EscapedChar::Quote);
}
Some(b'r') => {
escaped_char!(EscapedChar::CarriageReturn);
}
Some(b't') => {
escaped_char!(EscapedChar::Tab);
}
Some(b'n') => {
escaped_char!(EscapedChar::Newline);
}
_ => {
// Invalid escape! A backslash must be followed
// by either an open paren or else one of the
// escapable characters (\n, \t, \", \\, etc)
return Err(unexpected(
state.bytes.len() - 1,
state,
Attempting::StrLiteral,
));
}
}
}
_ => {
// All other characters need no special handling.
}
}
}
// We ran out of characters before finding a closed quote
Err(unexpected_eof(
parsed_chars,
Attempting::StringLiteral,
state.bytes.len(),
Attempting::StrLiteral,
state.clone(),
))
}
@ -94,7 +246,7 @@ fn parse_block_string<'a, I>(
arena: &'a Bump,
state: State<'a>,
bytes: &mut I,
) -> ParseResult<'a, StringLiteral<'a>>
) -> ParseResult<'a, StrLiteral<'a>>
where
I: Iterator<Item = &'a u8>,
{
@ -112,42 +264,47 @@ where
parsed_chars += 1;
// Potentially end the string (unless this is an escaped `"`!)
if *byte == b'"' && prev_byte != b'\\' {
if quotes_seen == 2 {
// three consecutive quotes, end string
match byte {
b'"' if prev_byte != b'\\' => {
if quotes_seen == 2 {
// three consecutive quotes, end string
// Subtract 3 from parsed_chars so we omit the closing `"""`.
let line_bytes = &state.bytes[line_start..(parsed_chars - 3)];
// Subtract 3 from parsed_chars so we omit the closing `"""`.
let line_bytes = &state.bytes[line_start..(parsed_chars - 3)];
return match parse_utf8(line_bytes) {
return match parse_utf8(line_bytes) {
Ok(line) => {
// state = state.advance_without_indenting(parsed_chars)?;
// lines.push(line);
// Ok((StrLiteral::Block(lines.into_bump_slice()), state))
todo!("TODO parse this line in a block string: {:?}", line);
}
Err(reason) => state.fail(reason),
};
}
quotes_seen += 1;
}
b'\n' => {
// note this includes the newline
let line_bytes = &state.bytes[line_start..parsed_chars];
match parse_utf8(line_bytes) {
Ok(line) => {
let state = state.advance_without_indenting(parsed_chars)?;
lines.push(line);
Ok((StringLiteral::Block(arena.alloc(lines)), state))
quotes_seen = 0;
line_start = parsed_chars;
}
Err(reason) => {
return state.fail(reason);
}
Err(reason) => state.fail(reason),
};
}
quotes_seen += 1;
} else if *byte == b'\n' {
// note this includes the newline
let line_bytes = &state.bytes[line_start..parsed_chars];
match parse_utf8(line_bytes) {
Ok(line) => {
lines.push(line);
quotes_seen = 0;
line_start = parsed_chars;
}
Err(reason) => {
return state.fail(reason);
}
}
} else {
quotes_seen = 0;
_ => {
quotes_seen = 0;
}
}
prev_byte = *byte;
@ -156,8 +313,8 @@ where
// We ran out of characters before finding 3 closing quotes
Err(unexpected_eof(
parsed_chars,
// TODO custom BlockStringLiteral?
Attempting::StringLiteral,
// TODO custom BlockStrLiteral?
Attempting::StrLiteral,
state,
))
}
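
The new `parse` function splits a single-line string into plaintext and escape segments as it walks the bytes. The following sketch shows that splitting step in isolation, under several simplifying assumptions: it takes the body of the literal without the surrounding quotes, it handles only the five backslash escapes (no `\(...)` interpolation or `\u(...)`), and `Segment` and `split_segments` are hypothetical names, not items in `roc_parse`.

```rust
/// Simplified stand-in for `StrSegment` (an assumption, not the real type).
#[derive(Debug, PartialEq)]
enum Segment<'a> {
    Plaintext(&'a str),
    /// The char that followed the backslash, e.g. 'n' for `\n`.
    Escaped(char),
}

/// Hypothetical helper: split the body of a string literal into segments.
/// On an invalid escape, returns the byte index of the offending backslash.
fn split_segments(body: &str) -> Result<Vec<Segment<'_>>, usize> {
    let bytes = body.as_bytes();
    let mut segments = Vec::new();
    let mut segment_start = 0;
    let mut index = 0;

    while index < bytes.len() {
        if bytes[index] == b'\\' {
            // End the current plaintext segment right before the backslash,
            // but don't push anything if it would be empty.
            if segment_start < index {
                segments.push(Segment::Plaintext(&body[segment_start..index]));
            }

            // Look at the next byte to see which escape this is.
            match bytes.get(index + 1).copied() {
                Some(escaped) if matches!(escaped, b'\\' | b'"' | b'n' | b't' | b'r') => {
                    segments.push(Segment::Escaped(escaped as char));
                }
                _ => return Err(index),
            }

            // Skip the backslash and the escaped byte, then start a new segment.
            index += 2;
            segment_start = index;
        } else {
            index += 1;
        }
    }

    // Push whatever plaintext remains after the last escape.
    if segment_start < bytes.len() {
        segments.push(Segment::Plaintext(&body[segment_start..]));
    }

    Ok(segments)
}
```

Tracking a `segment_start` index is a simplification of the `segment_parsed_bytes` counter in the real parser, which also has to keep `State` in sync for error reporting.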


@ -13,7 +13,7 @@ pub fn parse_with<'a>(arena: &'a Bump, input: &'a str) -> Result<ast::Expr<'a>,
#[allow(dead_code)]
pub fn parse_loc_with<'a>(arena: &'a Bump, input: &'a str) -> Result<Located<ast::Expr<'a>>, Fail> {
let state = State::new(input.as_bytes(), Attempting::Module);
let state = State::new(input.trim().as_bytes(), Attempting::Module);
let parser = space0_before(loc(roc_parse::expr::expr(0)), 0);
let answer = parser.parse(&arena, state);


@ -24,8 +24,11 @@ mod test_parse {
use roc_parse::ast::CommentOrNewline::*;
use roc_parse::ast::Expr::{self, *};
use roc_parse::ast::Pattern::{self, *};
use roc_parse::ast::StrLiteral::*;
use roc_parse::ast::StrSegment::*;
use roc_parse::ast::{
Attempting, Def, InterfaceHeader, Spaceable, Tag, TypeAnnotation, WhenBranch,
self, Attempting, Def, EscapedChar, InterfaceHeader, Spaceable, Tag, TypeAnnotation,
WhenBranch,
};
use roc_parse::header::ModuleName;
use roc_parse::module::{interface_header, module_defs};
@ -35,7 +38,7 @@ mod test_parse {
fn assert_parses_to<'a>(input: &'a str, expected_expr: Expr<'a>) {
let arena = Bump::new();
let actual = parse_with(&arena, input);
let actual = parse_with(&arena, input.trim());
assert_eq!(Ok(expected_expr), actual);
}
@ -48,10 +51,44 @@ mod test_parse {
assert_eq!(Err(expected_fail), actual);
}
fn assert_segments<E: Fn(&Bump) -> Vec<'_, ast::StrSegment<'_>>>(input: &str, to_expected: E) {
let arena = Bump::new();
let actual = parse_with(&arena, arena.alloc(input));
let expected_slice = to_expected(&arena).into_bump_slice();
let expected_expr = Expr::Str(Line(expected_slice));
assert_eq!(Ok(expected_expr), actual);
}
fn parses_with_escaped_char<
I: Fn(&str) -> String,
E: Fn(EscapedChar, &Bump) -> Vec<'_, ast::StrSegment<'_>>,
>(
to_input: I,
to_expected: E,
) {
let arena = Bump::new();
// Try parsing with each of the escaped chars Roc supports
for (string, escaped) in &[
("\\\\", EscapedChar::Backslash),
("\\n", EscapedChar::Newline),
("\\r", EscapedChar::CarriageReturn),
("\\t", EscapedChar::Tab),
("\\\"", EscapedChar::Quote),
] {
let actual = parse_with(&arena, arena.alloc(to_input(string)));
let expected_slice = to_expected(*escaped, &arena).into_bump_slice();
let expected_expr = Expr::Str(Line(expected_slice));
assert_eq!(Ok(expected_expr), actual);
}
}
// STRING LITERALS
fn expect_parsed_str(input: &str, expected: &str) {
assert_parses_to(expected, Str(input.into()));
assert_parses_to(expected, Expr::Str(PlainLine(input)));
}
#[test]
@ -59,10 +96,10 @@ mod test_parse {
assert_parses_to(
indoc!(
r#"
""
""
"#
),
Str(""),
Str(PlainLine("")),
);
}
@ -71,10 +108,10 @@ mod test_parse {
assert_parses_to(
indoc!(
r#"
"x"
"x"
"#
),
Str("x".into()),
Expr::Str(PlainLine("x".into())),
);
}
@ -83,10 +120,10 @@ mod test_parse {
assert_parses_to(
indoc!(
r#"
"foo"
"foo"
"#
),
Str("foo".into()),
Expr::Str(PlainLine("foo".into())),
);
}
@ -101,19 +138,155 @@ mod test_parse {
expect_parsed_str("123 abc 456 def", r#""123 abc 456 def""#);
}
// BACKSLASH ESCAPES
#[test]
fn string_with_special_escapes() {
expect_parsed_str(r#"x\\x"#, r#""x\\x""#);
expect_parsed_str(r#"x\"x"#, r#""x\"x""#);
expect_parsed_str(r#"x\tx"#, r#""x\tx""#);
expect_parsed_str(r#"x\rx"#, r#""x\rx""#);
expect_parsed_str(r#"x\nx"#, r#""x\nx""#);
fn string_with_escaped_char_at_end() {
parses_with_escaped_char(
|esc| format!(r#""abcd{}""#, esc),
|esc, arena| bumpalo::vec![in arena; Plaintext("abcd"), EscapedChar(esc)],
);
}
#[test]
fn string_with_single_quote() {
// This should NOT be escaped in a string.
expect_parsed_str("x'x", r#""x'x""#);
fn string_with_escaped_char_in_front() {
parses_with_escaped_char(
|esc| format!(r#""{}abcd""#, esc),
|esc, arena| bumpalo::vec![in arena; EscapedChar(esc), Plaintext("abcd")],
);
}
#[test]
fn string_with_escaped_char_in_middle() {
parses_with_escaped_char(
|esc| format!(r#""ab{}cd""#, esc),
|esc, arena| bumpalo::vec![in arena; Plaintext("ab"), EscapedChar(esc), Plaintext("cd")],
);
}
#[test]
fn string_with_multiple_escaped_chars() {
parses_with_escaped_char(
|esc| format!(r#""{}abc{}de{}fghi{}""#, esc, esc, esc, esc),
|esc, arena| bumpalo::vec![in arena; EscapedChar(esc), Plaintext("abc"), EscapedChar(esc), Plaintext("de"), EscapedChar(esc), Plaintext("fghi"), EscapedChar(esc)],
);
}
// UNICODE ESCAPES
#[test]
fn unicode_escape_in_middle() {
assert_segments(r#""Hi, \u(123)!""#, |arena| {
bumpalo::vec![in arena;
Plaintext("Hi, "),
Unicode(Located::new(0, 0, 8, 11, "123")),
Plaintext("!")
]
});
}
#[test]
fn unicode_escape_in_front() {
assert_segments(r#""\u(1234) is a unicode char""#, |arena| {
bumpalo::vec![in arena;
Unicode(Located::new(0, 0, 4, 8, "1234")),
Plaintext(" is a unicode char")
]
});
}
#[test]
fn unicode_escape_in_back() {
assert_segments(r#""this is unicode: \u(1)""#, |arena| {
bumpalo::vec![in arena;
Plaintext("this is unicode: "),
Unicode(Located::new(0, 0, 21, 22, "1"))
]
});
}
#[test]
fn unicode_escape_multiple() {
assert_segments(r#""\u(a1) this is \u(2Bcd) unicode \u(ef97)""#, |arena| {
bumpalo::vec![in arena;
Unicode(Located::new(0, 0, 4, 6, "a1")),
Plaintext(" this is "),
Unicode(Located::new(0, 0, 19, 23, "2Bcd")),
Plaintext(" unicode "),
Unicode(Located::new(0, 0, 36, 40, "ef97"))
]
});
}
// INTERPOLATION
#[test]
fn string_with_interpolation_in_middle() {
assert_segments(r#""Hi, \(name)!""#, |arena| {
let expr = arena.alloc(Var {
module_name: "",
ident: "name",
});
bumpalo::vec![in arena;
Plaintext("Hi, "),
Interpolated(Located::new(0, 0, 7, 11, expr)),
Plaintext("!")
]
});
}
#[test]
fn string_with_interpolation_in_front() {
assert_segments(r#""\(name), hi!""#, |arena| {
let expr = arena.alloc(Var {
module_name: "",
ident: "name",
});
bumpalo::vec![in arena;
Interpolated(Located::new(0, 0, 3, 7, expr)),
Plaintext(", hi!")
]
});
}
#[test]
fn string_with_interpolation_in_back() {
assert_segments(r#""Hello \(name)""#, |arena| {
let expr = arena.alloc(Var {
module_name: "",
ident: "name",
});
bumpalo::vec![in arena;
Plaintext("Hello "),
Interpolated(Located::new(0, 0, 9, 13, expr))
]
});
}
#[test]
fn string_with_multiple_interpolations() {
assert_segments(r#""Hi, \(name)! How is \(project) going?""#, |arena| {
let expr1 = arena.alloc(Var {
module_name: "",
ident: "name",
});
let expr2 = arena.alloc(Var {
module_name: "",
ident: "project",
});
bumpalo::vec![in arena;
Plaintext("Hi, "),
Interpolated(Located::new(0, 0, 7, 11, expr1)),
Plaintext("! How is "),
Interpolated(Located::new(0, 0, 23, 30, expr2)),
Plaintext(" going?")
]
});
}
#[test]
@ -460,7 +633,7 @@ mod test_parse {
}
#[test]
fn comment_with_unicode() {
fn comment_with_non_ascii() {
let arena = Bump::new();
let spaced_int = arena
.alloc(Num("3"))
@ -1859,19 +2032,23 @@ mod test_parse {
fn two_branch_when() {
let arena = Bump::new();
let newlines = bumpalo::vec![in &arena; Newline];
let pattern1 =
Pattern::SpaceBefore(arena.alloc(StrLiteral("blah")), newlines.into_bump_slice());
let loc_pattern1 = Located::new(1, 1, 1, 7, pattern1);
let pattern1 = Pattern::SpaceBefore(
arena.alloc(StrLiteral(PlainLine(""))),
newlines.into_bump_slice(),
);
let loc_pattern1 = Located::new(1, 1, 1, 3, pattern1);
let expr1 = Num("1");
let loc_expr1 = Located::new(1, 1, 11, 12, expr1);
let loc_expr1 = Located::new(1, 1, 7, 8, expr1);
let branch1 = &*arena.alloc(WhenBranch {
patterns: bumpalo::vec![in &arena;loc_pattern1],
value: loc_expr1,
guard: None,
});
let newlines = bumpalo::vec![in &arena; Newline];
let pattern2 =
Pattern::SpaceBefore(arena.alloc(StrLiteral("mise")), newlines.into_bump_slice());
let pattern2 = Pattern::SpaceBefore(
arena.alloc(StrLiteral(PlainLine("mise"))),
newlines.into_bump_slice(),
);
let loc_pattern2 = Located::new(2, 2, 1, 7, pattern2);
let expr2 = Num("2");
let loc_expr2 = Located::new(2, 2, 11, 12, expr2);
@ -1891,9 +2068,9 @@ mod test_parse {
&arena,
indoc!(
r#"
when x is
"blah" -> 1
"mise" -> 2
when x is
"" -> 1
"mise" -> 2
"#
),
);
@ -2003,9 +2180,11 @@ mod test_parse {
fn when_with_alternative_patterns() {
let arena = Bump::new();
let newlines = bumpalo::vec![in &arena; Newline];
let pattern1 =
Pattern::SpaceBefore(arena.alloc(StrLiteral("blah")), newlines.into_bump_slice());
let pattern1_alt = StrLiteral("blop");
let pattern1 = Pattern::SpaceBefore(
arena.alloc(StrLiteral(PlainLine("blah"))),
newlines.into_bump_slice(),
);
let pattern1_alt = StrLiteral(PlainLine("blop"));
let loc_pattern1 = Located::new(1, 1, 1, 7, pattern1);
let loc_pattern1_alt = Located::new(1, 1, 10, 16, pattern1_alt);
let expr1 = Num("1");
@ -2016,11 +2195,15 @@ mod test_parse {
guard: None,
});
let newlines = bumpalo::vec![in &arena; Newline];
let pattern2 =
Pattern::SpaceBefore(arena.alloc(StrLiteral("foo")), newlines.into_bump_slice());
let pattern2 = Pattern::SpaceBefore(
arena.alloc(StrLiteral(PlainLine("foo"))),
newlines.into_bump_slice(),
);
let newlines = bumpalo::vec![in &arena; Newline];
let pattern2_alt =
Pattern::SpaceBefore(arena.alloc(StrLiteral("bar")), newlines.into_bump_slice());
let pattern2_alt = Pattern::SpaceBefore(
arena.alloc(StrLiteral(PlainLine("bar"))),
newlines.into_bump_slice(),
);
let loc_pattern2 = Located::new(2, 2, 1, 6, pattern2);
let loc_pattern2_alt = Located::new(3, 3, 1, 6, pattern2_alt);
let expr2 = Num("2");
@ -2133,14 +2316,14 @@ mod test_parse {
let def2 = SpaceAfter(
arena.alloc(Body(
arena.alloc(Located::new(2, 2, 0, 3, pattern2)),
arena.alloc(Located::new(2, 2, 6, 10, Str("hi"))),
arena.alloc(Located::new(2, 2, 6, 10, Str(PlainLine("hi")))),
)),
newlines2.into_bump_slice(),
);
let def3 = SpaceAfter(
arena.alloc(Body(
arena.alloc(Located::new(3, 3, 0, 3, pattern3)),
arena.alloc(Located::new(3, 3, 6, 13, Str("stuff"))),
arena.alloc(Located::new(3, 3, 6, 13, Str(PlainLine("stuff")))),
)),
newlines3.into_bump_slice(),
);
@ -2426,12 +2609,10 @@ mod test_parse {
// )
// "#
// ),
// Str(""),
// Str(PlainLine("")),
// );
// }
// TODO test for \t \r and \n in string literals *outside* unicode escape sequence!
//
// TODO test for non-ASCII variables
//
// TODO verify that when a string literal contains a newline before the


@ -55,6 +55,9 @@ pub enum Problem {
alias_name: Symbol,
region: Region,
},
InvalidInterpolation(Region),
InvalidHexadecimal(Region),
InvalidUnicodeCodePoint(Region),
}
#[derive(Clone, Debug, PartialEq)]
@ -125,6 +128,10 @@ pub enum RuntimeError {
NonExhaustivePattern,
InvalidInterpolation(Region),
InvalidHexadecimal(Region),
InvalidUnicodeCodePoint(Region),
/// When the author specifies a type annotation but no implementation
NoImplementation,
}


@ -144,7 +144,7 @@ pub fn can_problem<'b>(
alloc.region(variable_region),
alloc.reflow("Roc does not allow unused type parameters!"),
// TODO add link to this guide section
alloc.hint().append(alloc.reflow(
alloc.tip().append(alloc.reflow(
"If you want an unused type parameter (a so-called \"phantom type\"), \
read the guide section on phantom data.",
)),
@ -262,6 +262,24 @@ pub fn can_problem<'b>(
alloc.reflow(" can occur in this position."),
]),
]),
Problem::InvalidHexadecimal(region) => {
todo!(
"TODO report an invalid hexadecimal number in a \\u(...) code point at region {:?}",
region
);
}
Problem::InvalidUnicodeCodePoint(region) => {
todo!(
"TODO report an invalid \\u(...) code point at region {:?}",
region
);
}
Problem::InvalidInterpolation(region) => {
todo!(
"TODO report an invalid string interpolation at region {:?}",
region
);
}
Problem::RuntimeError(runtime_error) => pretty_runtime_error(alloc, runtime_error),
};
@ -309,7 +327,7 @@ fn pretty_runtime_error<'b>(
" value is defined directly in terms of itself, causing an infinite loop.",
))
// TODO "are you trying to mutate a variable?
// TODO hint?
// TODO tip?
} else {
alloc.stack(vec![
alloc
@ -334,7 +352,7 @@ fn pretty_runtime_error<'b>(
.map(|s| alloc.symbol_unqualified(s))
.collect::<Vec<_>>(),
),
// TODO hint?
// TODO tip?
])
}
}
@ -353,12 +371,12 @@ fn pretty_runtime_error<'b>(
QualifiedIdentifier => " qualified ",
};
let hint = match problem {
let tip = match problem {
MalformedInt | MalformedFloat | MalformedBase(_) => alloc
.hint()
.tip()
.append(alloc.reflow("Learn more about number literals at TODO")),
Unknown => alloc.nil(),
QualifiedIdentifier => alloc.hint().append(
QualifiedIdentifier => alloc.tip().append(
alloc.reflow("In patterns, only private and global tags can be qualified"),
),
};
@ -370,7 +388,7 @@ fn pretty_runtime_error<'b>(
alloc.reflow("pattern is malformed:"),
]),
alloc.region(region),
hint,
tip,
])
}
RuntimeError::UnsupportedPattern(_) => {
@ -392,8 +410,8 @@ fn pretty_runtime_error<'b>(
RuntimeError::MalformedClosure(_) => todo!(""),
RuntimeError::InvalidFloat(sign @ FloatErrorKind::PositiveInfinity, region, _raw_str)
| RuntimeError::InvalidFloat(sign @ FloatErrorKind::NegativeInfinity, region, _raw_str) => {
let hint = alloc
.hint()
let tip = alloc
.tip()
.append(alloc.reflow("Learn more about number literals at TODO"));
let big_or_small = if let FloatErrorKind::PositiveInfinity = sign {
@ -415,12 +433,12 @@ fn pretty_runtime_error<'b>(
alloc.reflow(" and "),
alloc.text(format!("{:e}", f64::MAX)),
]),
hint,
tip,
])
}
RuntimeError::InvalidFloat(FloatErrorKind::Error, region, _raw_str) => {
let hint = alloc
.hint()
let tip = alloc
.tip()
.append(alloc.reflow("Learn more about number literals at TODO"));
alloc.stack(vec![
@ -431,7 +449,7 @@ fn pretty_runtime_error<'b>(
alloc.concat(vec![
alloc.reflow("Floating point literals can only contain the digits 0-9, or use scientific notation 10e4"),
]),
hint,
tip,
])
}
RuntimeError::InvalidInt(error @ IntErrorKind::InvalidDigit, base, region, _raw_str)
@ -471,8 +489,8 @@ fn pretty_runtime_error<'b>(
Binary => "0 and 1",
};
let hint = alloc
.hint()
let tip = alloc
.tip()
.append(alloc.reflow("Learn more about number literals at TODO"));
alloc.stack(vec![
@ -490,7 +508,7 @@ fn pretty_runtime_error<'b>(
alloc.text(charset),
alloc.text("."),
]),
hint,
tip,
])
}
RuntimeError::InvalidInt(error_kind @ IntErrorKind::Underflow, _base, region, _raw_str)
@ -501,8 +519,8 @@ fn pretty_runtime_error<'b>(
"big"
};
let hint = alloc
.hint()
let tip = alloc
.tip()
.append(alloc.reflow("Learn more about number literals at TODO"));
alloc.stack(vec![
@ -513,7 +531,7 @@ fn pretty_runtime_error<'b>(
]),
alloc.region(region),
alloc.reflow("Roc uses signed 64-bit integers, allowing values between -9_223_372_036_854_775_808 and 9_223_372_036_854_775_807."),
hint,
tip,
])
}
RuntimeError::InvalidRecordUpdate { region } => alloc.stack(vec![
@ -524,6 +542,24 @@ fn pretty_runtime_error<'b>(
alloc.region(region),
alloc.reflow("Only variables can be updated with record update syntax."),
]),
RuntimeError::InvalidHexadecimal(region) => {
todo!(
"TODO runtime error for an invalid hexadecimal number in a \\u(...) code point at region {:?}",
region
);
}
RuntimeError::InvalidUnicodeCodePoint(region) => {
todo!(
"TODO runtime error for an invalid \\u(...) code point at region {:?}",
region
);
}
RuntimeError::InvalidInterpolation(region) => {
todo!(
"TODO runtime error for an invalid string interpolation at region {:?}",
region
);
}
RuntimeError::NoImplementation => todo!("no implementation, unreachable"),
RuntimeError::NonExhaustivePattern => {
unreachable!("not currently reported (but can blow up at runtime)")


@ -498,6 +498,14 @@ fn to_expr_report<'b>(
Reason::ElemInList { index } => {
let ith = index.ordinal();
// Don't say "the previous elements all have the type" if
// there was only 1 previous element!
let prev_elems_msg = if index.to_zero_based() == 1 {
"However, the 1st element has the type:"
} else {
"However, the preceding elements in the list all have the type:"
};
report_mismatch(
alloc,
filename,
@ -506,13 +514,10 @@ fn to_expr_report<'b>(
expected_type,
region,
Some(expr_region),
alloc.string(format!(
"The {} element of this list does not match all the previous elements:",
ith
)),
alloc.string(format!("The {} element is", ith)),
alloc.reflow("But all the previous elements in the list have type:"),
Some(alloc.reflow("I need all elements of a list to have the same type!")),
alloc.reflow("This list contains elements with different types:"),
alloc.string(format!("Its {} element is", ith)),
alloc.reflow(prev_elems_msg),
Some(alloc.reflow("I need every element in a list to have the same type!")),
)
}
Reason::RecordUpdateValue(field) => report_mismatch(
@ -776,7 +781,7 @@ fn to_expr_report<'b>(
unreachable!("I don't think these can be reached")
}
Reason::InterpolatedStringVar => {
Reason::StrInterpolation => {
unimplemented!("string interpolation is not implemented yet")
}
@ -819,7 +824,7 @@ fn type_comparison<'b>(
lines.push(alloc.concat(context_hints));
}
lines.extend(problems_to_hint(alloc, comparison.problems));
lines.extend(problems_to_tip(alloc, comparison.problems));
alloc.stack(lines)
}
@ -835,7 +840,7 @@ fn lone_type<'b>(
let mut lines = vec![i_am_seeing, comparison.actual, further_details];
lines.extend(problems_to_hint(alloc, comparison.problems));
lines.extend(problems_to_tip(alloc, comparison.problems));
alloc.stack(lines)
}
@ -870,6 +875,10 @@ fn add_category<'b>(
Int => alloc.concat(vec![this_is, alloc.text(" an integer of type:")]),
Float => alloc.concat(vec![this_is, alloc.text(" a float of type:")]),
Str => alloc.concat(vec![this_is, alloc.text(" a string of type:")]),
StrInterpolation => alloc.concat(vec![
this_is,
alloc.text(" a value in a string interpolation, which was of type:"),
]),
Lambda => alloc.concat(vec![this_is, alloc.text(" an anonymous function of type:")]),
@ -1081,7 +1090,7 @@ fn pattern_type_comparision<'b>(
comparison.expected,
];
lines.extend(problems_to_hint(alloc, comparison.problems));
lines.extend(problems_to_tip(alloc, comparison.problems));
lines.extend(reason_hints);
alloc.stack(lines)
@ -1156,7 +1165,7 @@ pub enum Problem {
OptionalRequiredMismatch(Lowercase),
}
fn problems_to_hint<'b>(
fn problems_to_tip<'b>(
alloc: &'b RocDocAllocator<'b>,
mut problems: Vec<Problem>,
) -> Option<RocDocBuilder<'b>> {
@ -2261,27 +2270,24 @@ fn type_problem_to_pretty<'b>(
let found = alloc.text(typo_str).annotate(Annotation::Typo);
let suggestion = alloc.text(nearest_str).annotate(Annotation::TypoSuggestion);
let hint1 = alloc
.hint()
let tip1 = alloc
.tip()
.append(alloc.reflow("Seems like a record field typo. Maybe "))
.append(found)
.append(alloc.reflow(" should be "))
.append(suggestion)
.append(alloc.text("?"));
let hint2 = alloc.hint().append(alloc.reflow(ADD_ANNOTATIONS));
let tip2 = alloc.tip().append(alloc.reflow(ADD_ANNOTATIONS));
hint1
.append(alloc.line())
.append(alloc.line())
.append(hint2)
tip1.append(alloc.line()).append(alloc.line()).append(tip2)
}
}
}
FieldsMissing(missing) => match missing.split_last() {
None => alloc.nil(),
Some((f1, [])) => alloc
.hint()
.tip()
.append(alloc.reflow("Looks like the "))
.append(f1.as_str().to_owned())
.append(alloc.reflow(" field is missing.")),
@ -2289,7 +2295,7 @@ fn type_problem_to_pretty<'b>(
let separator = alloc.reflow(", ");
alloc
.hint()
.tip()
.append(alloc.reflow("Looks like the "))
.append(
alloc.intersperse(init.iter().map(|v| v.as_str().to_owned()), separator),
@ -2315,20 +2321,17 @@ fn type_problem_to_pretty<'b>(
let found = alloc.text(typo_str).annotate(Annotation::Typo);
let suggestion = alloc.text(nearest_str).annotate(Annotation::TypoSuggestion);
let hint1 = alloc
.hint()
let tip1 = alloc
.tip()
.append(alloc.reflow("Seems like a tag typo. Maybe "))
.append(found)
.append(" should be ")
.append(suggestion)
.append(alloc.text("?"));
let hint2 = alloc.hint().append(alloc.reflow(ADD_ANNOTATIONS));
let tip2 = alloc.tip().append(alloc.reflow(ADD_ANNOTATIONS));
hint1
.append(alloc.line())
.append(alloc.line())
.append(hint2)
tip1.append(alloc.line()).append(alloc.line()).append(tip2)
}
}
}
@ -2345,7 +2348,7 @@ fn type_problem_to_pretty<'b>(
)
};
alloc.hint().append(line)
alloc.tip().append(line)
}
BadRigidVar(x, tipe) => {
@ -2353,7 +2356,7 @@ fn type_problem_to_pretty<'b>(
let bad_rigid_var = |name: Lowercase, a_thing| {
alloc
.hint()
.tip()
.append(alloc.reflow("The type annotation uses the type variable "))
.append(alloc.type_variable(name))
.append(alloc.reflow(" to say that this definition can produce any type of value. But in the body I see that it will only produce "))
@ -2365,7 +2368,7 @@ fn type_problem_to_pretty<'b>(
let line = r#" as separate type variables. Your code seems to be saying they are the same though. Maybe they should be the same your type annotation? Maybe your code uses them in a weird way?"#;
alloc
.hint()
.tip()
.append(alloc.reflow("Your type annotation uses "))
.append(alloc.type_variable(a))
.append(alloc.reflow(" and "))
@ -2392,12 +2395,12 @@ fn type_problem_to_pretty<'b>(
Boolean(_) => bad_rigid_var(x, alloc.reflow("a uniqueness attribute value")),
}
}
IntFloat => alloc.hint().append(alloc.concat(vec![
alloc.reflow("Convert between "),
IntFloat => alloc.tip().append(alloc.concat(vec![
alloc.reflow("You can convert between "),
alloc.type_str("Int"),
alloc.reflow(" and "),
alloc.type_str("Float"),
alloc.reflow(" with "),
alloc.reflow(" using functions like "),
alloc.symbol_qualified(Symbol::NUM_TO_FLOAT),
alloc.reflow(" and "),
alloc.symbol_qualified(Symbol::NUM_ROUND),
@ -2407,26 +2410,26 @@ fn type_problem_to_pretty<'b>(
TagsMissing(missing) => match missing.split_last() {
None => alloc.nil(),
Some((f1, [])) => {
let hint1 = alloc
.hint()
let tip1 = alloc
.tip()
.append(alloc.reflow("Looks like a closed tag union does not have the "))
.append(alloc.tag_name(f1.clone()))
.append(alloc.reflow(" tag."));
let hint2 = alloc.hint().append(alloc.reflow(
let tip2 = alloc.tip().append(alloc.reflow(
"Closed tag unions can't grow, \
because that might change the size in memory. \
Can you use an open tag union?",
));
alloc.stack(vec![hint1, hint2])
alloc.stack(vec![tip1, tip2])
}
Some((last, init)) => {
let separator = alloc.reflow(", ");
let hint1 = alloc
.hint()
let tip1 = alloc
.tip()
.append(alloc.reflow("Looks like a closed tag union does not have the "))
.append(
alloc
@ -2436,16 +2439,16 @@ fn type_problem_to_pretty<'b>(
.append(alloc.tag_name(last.clone()))
.append(alloc.reflow(" tags."));
let hint2 = alloc.hint().append(alloc.reflow(
let tip2 = alloc.tip().append(alloc.reflow(
"Closed tag unions can't grow, \
because that might change the size in memory. \
Can you use an open tag union?",
));
alloc.stack(vec![hint1, hint2])
alloc.stack(vec![tip1, tip2])
}
},
OptionalRequiredMismatch(field) => alloc.hint().append(alloc.concat(vec![
OptionalRequiredMismatch(field) => alloc.tip().append(alloc.concat(vec![
alloc.reflow("To extract the "),
alloc.record_field(field),
alloc.reflow(


@ -20,7 +20,12 @@ const CYCLE_LN: &str = ["| ", "│ "][!IS_WINDOWS as usize];
const CYCLE_MID: &str = ["| |", "│ ↓"][!IS_WINDOWS as usize];
const CYCLE_END: &str = ["+-<---+", "└─────┘"][!IS_WINDOWS as usize];
const GUTTER_BAR: &str = "";
const GUTTER_BAR: &str = "";
const ERROR_UNDERLINE: &str = "^";
/// The number of monospace spaces the gutter bar takes up.
/// (This is not necessarily the same as GUTTER_BAR.len()!)
const GUTTER_BAR_WIDTH: usize = 1;
pub fn cycle<'b>(
alloc: &'b RocDocAllocator<'b>,
@ -91,12 +96,15 @@ impl<'b> Report<'b> {
self.doc
} else {
let header = format!(
"-- {} {}",
"── {} {}",
self.title,
"-".repeat(80 - (self.title.len() + 4))
"─".repeat(80 - (self.title.len() + 4))
);
alloc.stack(vec![alloc.text(header), self.doc])
alloc.stack(vec![
alloc.text(header).annotate(Annotation::Header),
self.doc,
])
}
}
}
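
The header computation above pads the title out to 80 columns. A hedged sketch of the same idea as a standalone function is shown here; `report_header` and its `width` parameter are hypothetical, and the fill character is assumed to be the same box-drawing dash that appears in the new `"── {} {}"` prefix. Unlike the original expression, this version counts characters rather than bytes and saturates instead of subtracting directly, because `80 - (title.len() + 4)` on `usize` can underflow when the title is longer than 76 bytes.

```rust
/// Builds a report header padded to `width` columns.
/// Hypothetical standalone version of the `format!` call in the hunk above.
fn report_header(title: &str, width: usize) -> String {
    // The "── " prefix takes 3 columns and the space after the title takes 1.
    // Counting chars matches the column count for ASCII titles; it is only an
    // approximation for wide or combining characters.
    let used = title.chars().count() + 4;
    // saturating_sub yields 0 instead of underflowing for very long titles.
    let fill = "─".repeat(width.saturating_sub(used));
    format!("── {} {}", title, fill)
}
```

For a title such as `TYPE MISMATCH` and a width of 80, the result is exactly 80 characters wide; an overlong title is emitted without any fill rather than panicking.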
@ -110,6 +118,7 @@ pub struct Palette<'a> {
pub alias: &'a str,
pub error: &'a str,
pub line_number: &'a str,
pub header: &'a str,
pub gutter_bar: &'a str,
pub module_name: &'a str,
pub binop: &'a str,
@ -126,7 +135,8 @@ pub const DEFAULT_PALETTE: Palette = Palette {
alias: YELLOW_CODE,
error: RED_CODE,
line_number: CYAN_CODE,
gutter_bar: MAGENTA_CODE,
header: CYAN_CODE,
gutter_bar: CYAN_CODE,
module_name: GREEN_CODE,
binop: GREEN_CODE,
typo: YELLOW_CODE,
@ -317,10 +327,11 @@ impl<'a> RocDocAllocator<'a> {
content.annotate(Annotation::TypeBlock).indent(4)
}
pub fn hint(&'a self) -> DocBuilder<'a, Self, Annotation> {
self.text("Hint:")
pub fn tip(&'a self) -> DocBuilder<'a, Self, Annotation> {
self.text("Tip")
.annotate(Annotation::Tip)
.append(":")
.append(self.softline())
.annotate(Annotation::Hint)
}
pub fn region_all_the_things(
@ -389,13 +400,16 @@ impl<'a> RocDocAllocator<'a> {
let overlapping = sub_region2.start_col < sub_region1.end_col;
let highlight = if overlapping {
self.text("^".repeat((sub_region2.end_col - sub_region1.start_col) as usize))
self.text(
ERROR_UNDERLINE.repeat((sub_region2.end_col - sub_region1.start_col) as usize),
)
} else {
let highlight1 = "^".repeat((sub_region1.end_col - sub_region1.start_col) as usize);
let highlight1 =
ERROR_UNDERLINE.repeat((sub_region1.end_col - sub_region1.start_col) as usize);
let highlight2 = if sub_region1 == sub_region2 {
"".repeat(0)
} else {
"^".repeat((sub_region2.end_col - sub_region2.start_col) as usize)
ERROR_UNDERLINE.repeat((sub_region2.end_col - sub_region2.start_col) as usize)
};
let inbetween = " "
.repeat((sub_region2.start_col.saturating_sub(sub_region1.end_col)) as usize);
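
To make the overlap handling above easier to follow, here is a simplified sketch of how the underline for two sub-regions on the same line can be built. The `Span` struct and `underline_two` function are hypothetical stand-ins for the compiler's `Region` type and the surrounding method, and the final concatenation order (first underline, gap, second underline) is an assumption, since that part is not visible in this hunk.

```rust
const ERROR_UNDERLINE: &str = "^";

/// Simplified stand-in for a region on one line: columns [start_col, end_col).
#[derive(Clone, Copy, PartialEq)]
struct Span {
    start_col: u16,
    end_col: u16,
}

/// Builds the underline text for two spans, assuming `b` does not start before `a`.
fn underline_two(a: Span, b: Span) -> String {
    let overlapping = b.start_col < a.end_col;

    if overlapping {
        // One continuous underline from the start of `a` to the end of `b`.
        ERROR_UNDERLINE.repeat((b.end_col - a.start_col) as usize)
    } else {
        let first = ERROR_UNDERLINE.repeat((a.end_col - a.start_col) as usize);
        let second = if a == b {
            String::new()
        } else {
            ERROR_UNDERLINE.repeat((b.end_col - b.start_col) as usize)
        };
        // Spaces between the two underlines; saturating_sub guards against
        // `b` starting before `a` ends.
        let gap = " ".repeat(b.start_col.saturating_sub(a.end_col) as usize);
        format!("{}{}{}", first, gap, second)
    }
}
```

Keeping the underline character in one constant, as the commit does with `ERROR_UNDERLINE`, means each of these call sites changes together if the character is ever swapped.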
@ -407,8 +421,9 @@ impl<'a> RocDocAllocator<'a> {
let highlight_line = self
.line()
.append(self.text(" ".repeat(max_line_number_length)))
.append(self.text(GUTTER_BAR).annotate(Annotation::GutterBar))
// Omit the gutter bar when we know there are no further
// line numbers to be printed after this!
.append(self.text(" ".repeat(max_line_number_length + GUTTER_BAR_WIDTH)))
.append(if sub_region1.is_empty() && sub_region2.is_empty() {
self.nil()
} else {
@ -490,11 +505,13 @@ impl<'a> RocDocAllocator<'a> {
}
if error_highlight_line {
let highlight_text = "^".repeat((sub_region.end_col - sub_region.start_col) as usize);
let highlight_text =
ERROR_UNDERLINE.repeat((sub_region.end_col - sub_region.start_col) as usize);
let highlight_line = self
.line()
.append(self.text(" ".repeat(max_line_number_length)))
.append(self.text(GUTTER_BAR).annotate(Annotation::GutterBar))
// Omit the gutter bar when we know there are no further
// line numbers to be printed after this!
.append(self.text(" ".repeat(max_line_number_length + GUTTER_BAR_WIDTH)))
.append(if highlight_text.is_empty() {
self.nil()
} else {
@ -575,7 +592,8 @@ pub enum Annotation {
Module,
Typo,
TypoSuggestion,
Hint,
Tip,
Header,
}
/// Render with minimal formatting
@ -718,7 +736,7 @@ where
Emphasized => {
self.write_str(BOLD_CODE)?;
}
Url | Hint => {
Url | Tip => {
self.write_str(UNDERLINE_CODE)?;
}
PlainText => {
@ -745,6 +763,9 @@ where
Error => {
self.write_str(self.palette.error)?;
}
Header => {
self.write_str(self.palette.header)?;
}
LineNumber => {
self.write_str(self.palette.line_number)?;
}
@ -773,8 +794,8 @@ where
None => {}
Some(annotation) => match annotation {
Emphasized | Url | TypeVariable | Alias | Symbol | BinOp | Error | GutterBar
| Typo | TypoSuggestion | Structure | CodeBlock | PlainText | LineNumber | Hint
| Module => {
| Typo | TypoSuggestion | Structure | CodeBlock | PlainText | LineNumber | Tip
| Module | Header => {
self.write_str(RESET_CODE)?;
}

File diff suppressed because it is too large


@ -93,7 +93,7 @@ pub fn parse_with<'a>(arena: &'a Bump, input: &'a str) -> Result<ast::Expr<'a>,
#[allow(dead_code)]
pub fn parse_loc_with<'a>(arena: &'a Bump, input: &'a str) -> Result<Located<ast::Expr<'a>>, Fail> {
let state = State::new(input.as_bytes(), Attempting::Module);
let state = State::new(input.trim().as_bytes(), Attempting::Module);
let parser = space0_before(loc(roc_parse::expr::expr(0)), 0);
let answer = parser.parse(&arena, state);


@ -277,21 +277,53 @@ mod solve_expr {
);
}
// // INTERPOLATED STRING
// INTERPOLATED STRING
// #[test]
// fn infer_interpolated_string() {
// infer_eq(
// indoc!(
// r#"
// whatItIs = "great"
#[test]
fn infer_interpolated_string() {
infer_eq(
indoc!(
r#"
whatItIs = "great"
// "type inference is \(whatItIs)!"
// "#
// ),
// "Str",
// );
// }
"type inference is \(whatItIs)!"
"#
),
"Str",
);
}
#[test]
fn infer_interpolated_var() {
infer_eq(
indoc!(
r#"
whatItIs = "great"
str = "type inference is \(whatItIs)!"
whatItIs
"#
),
"Str",
);
}
#[test]
fn infer_interpolated_field() {
infer_eq(
indoc!(
r#"
rec = { whatItIs: "great" }
str = "type inference is \(rec.whatItIs)!"
rec
"#
),
"{ whatItIs : Str }",
);
}
// LIST MISMATCH


@ -151,10 +151,9 @@ impl Variable {
pub const EMPTY_TAG_UNION: Variable = Variable(2);
// Builtins
const BOOL_ENUM: Variable = Variable(3);
pub const BOOL: Variable = Variable(4);
pub const LIST_GET: Variable = Variable(5);
pub const BOOL: Variable = Variable(4); // Used in `if` conditions
pub const NUM_RESERVED_VARS: usize = 6;
pub const NUM_RESERVED_VARS: usize = 5;
const FIRST_USER_SPACE_VAR: Variable = Variable(Self::NUM_RESERVED_VARS as u32);
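
Removing `LIST_GET` frees index 5, so the reserved count drops from 6 to 5 and user-space variables now start at 5. The sketch below illustrates that relationship with a stripped-down `Variable` type; the `is_reserved` helper is hypothetical and not part of the diff, and the meaning of indices 0 and 1 is not shown here, so they are treated simply as reserved.

```rust
/// Stripped-down stand-in for the compiler's `Variable` newtype.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Variable(u32);

impl Variable {
    // Values taken from the new side of the hunk above.
    const BOOL: Variable = Variable(4);
    const NUM_RESERVED_VARS: usize = 5;
    const FIRST_USER_SPACE_VAR: Variable = Variable(Self::NUM_RESERVED_VARS as u32);

    /// Hypothetical helper: true for indices below the reserved count.
    fn is_reserved(self) -> bool {
        (self.0 as usize) < Self::NUM_RESERVED_VARS
    }
}
```

Because `FIRST_USER_SPACE_VAR` is derived from `NUM_RESERVED_VARS`, the two cannot drift apart when a builtin variable is added or removed.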


@ -904,7 +904,7 @@ pub enum Reason {
FloatLiteral,
IntLiteral,
NumLiteral,
InterpolatedStringVar,
StrInterpolation,
WhenBranch {
index: Index,
},
@ -930,6 +930,7 @@ pub enum Category {
TagApply(TagName),
Lambda,
Uniqueness,
StrInterpolation,
// storing variables in the ast
Storage,


@ -787,8 +787,7 @@ pub fn annotate_usage(expr: &Expr, usage: &mut VarUsage) {
| Num(_, _)
| Int(_, _)
| Float(_, _)
| Str(_)
| BlockStr(_)
| Str { .. }
| EmptyRecord
| Accessor { .. }
| RunLowLevel { .. } => {}


@ -93,7 +93,7 @@ pub fn parse_with<'a>(arena: &'a Bump, input: &'a str) -> Result<ast::Expr<'a>,
#[allow(dead_code)]
pub fn parse_loc_with<'a>(arena: &'a Bump, input: &'a str) -> Result<Located<ast::Expr<'a>>, Fail> {
let state = State::new(input.as_bytes(), Attempting::Module);
let state = State::new(input.trim().as_bytes(), Attempting::Module);
let parser = space0_before(loc(roc_parse::expr::expr(0)), 0);
let answer = parser.parse(&arena, state);

65
editor/editor-ideas.md Normal file

@ -0,0 +1,65 @@
(For background, [this talk](https://youtu.be/ZnYa99QoznE?t=4790) has an overview of the design goals for the editor.)
# Editor Ideas
Here are some ideas and interesting resources for the editor. Feel free to make a PR to add more!
## Sources of Potential Inspiration
These are potentially inspirational resources for the editor's design.
### Package-specific editor integrations
(Or possibly module-specific integrations, type-specific integrations, etc.)
* [What FP can learn from Smalltalk](https://youtu.be/baxtyeFVn3w) by [Aditya Siram](https://github.com/deech)
* [Moldable development](https://youtu.be/Pot9GnHFOVU) by [Tudor Gîrba](https://github.com/girba)
* [Unity game engine](https://unity.com/)
* Scripts can expose values as text inputs, sliders, checkboxes, etc., or even generate custom graphical inputs
* Drag-n-drop game objects and components into script interfaces
### Live Interactivity
* [Up and Down the Ladder of Abstraction](http://worrydream.com/LadderOfAbstraction/) by [Bret Victor](http://worrydream.com/)
* [7 Bret Victor talks](https://www.youtube.com/watch?v=PUv66718DII&list=PLS4RYH2XfpAmswi1WDU6lwwggruEZrlPH)
* [Against the Current](https://youtu.be/WT2CMS0MxJ0) by [Chris Granger](https://github.com/ibdknox/)
* [Sketch-n-Sketch: Interactive SVG Programming with Direct Manipulation](https://youtu.be/YuGVC8VqXz0) by [Ravi Chugh](http://people.cs.uchicago.edu/~rchugh/)
* [Xi](https://xi-editor.io/) modern text editor with concurrent editing (related to [Druid](https://github.com/linebender/druid))
* [Self](https://selflanguage.org/) programming language
### Structured Editing
* [Deuce](http://ravichugh.github.io/sketch-n-sketch/) (videos on the right) by [Ravi Chugh](http://people.cs.uchicago.edu/~rchugh/) and others
* [Fructure: A Structured Editing Engine in Racket](https://youtu.be/CnbVCNIh1NA) by Andrew Blinn
* [Hazel: A Live FP Environment with Typed Holes](https://youtu.be/UkDSL0U9ndQ) by [Cyrus Omar](https://web.eecs.umich.edu/~comar/)
* [Dark Demo](https://youtu.be/QgimI2SnpTQ) by [Ellen Chisa](https://twitter.com/ellenchisa)
* [Introduction to JetBrains MPS](https://youtu.be/JoyzxjgVlQw) by [Kolja Dummann](https://www.youtube.com/channel/UCq_mWDvKdXYJJzBmXkci17w)
* [Eve](http://witheve.com/)
* code editor as prose writer
* live preview
* possible inspiration for live interactivity as well
* [Unreal Engine 4](https://www.unrealengine.com/en-US/)
* [Blueprints](https://docs.unrealengine.com/en-US/Engine/Blueprints/index.html) visual scripting (not suggesting visual scripting for Roc)
### Non-Code Related Inspiration
* [Scrivener](https://www.literatureandlatte.com/scrivener/overview) writing app for novelists, screenwriters, and more
* Word processors (Word, Google Docs, etc)
* Comments that are parallel to the text of the document.
* Comments can act as discussions and not just statements.
* Easy tooling around adding tables and other stylised text
* Excel and Google Sheets
* Not sure, maybe something they do well that we (code editors) could learn from
## General Thoughts/Ideas
Thoughts and ideas possibly taken from above inspirations or separate.
* ACCESSIBILITY!!!
* From Google Docs' comments, adding tests in a similar manner, where they exist in the same "document" but parallel to the code being written
* Makes sense for unit tests, keeps the test close to the source
* Doesn't necessarily make sense for integration or e2e testing
* Maybe easier to manually trigger a test related to exactly what code you're writing
* "Error mode" where the editor jumps you to the next error
* Similar in theory to diff tools that jump you to the next merge conflict
* dependency recommendation

32
name-and-logo.md Normal file

@ -0,0 +1,32 @@
<img width="185" alt="The Roc logo, an origami bird" src="https://user-images.githubusercontent.com/1094080/92188927-e61ebd00-ee2b-11ea-97ef-2fc88e0094b0.png">
# Name and Logo
The Roc programming language is named after [a mythical bird](https://en.wikipedia.org/wiki/Roc_(mythology)).
That's why the logo is a bird. It's specifically an [*origami* bird](https://youtu.be/9gni1t1k1uY) as a homage
to [Elm](https://elm-lang.org/)'s tangram logo.
Roc is a direct descendant of Elm. The languages are similar, but not the same.
[Origami](https://en.wikipedia.org/wiki/Origami) likewise has similarities to [tangrams](https://en.wikipedia.org/wiki/Tangram), although they are not the same.
Both involve making a surprising variety of things
from simple primitives. [*Folds*](https://en.wikipedia.org/wiki/Fold_(higher-order_function))
are also common in functional programming.
The logo was made by tracing triangles onto a photo of a physical origami bird.
It's made of triangles because triangles are a foundational primitive in
computer graphics.
The name was chosen because it makes for a three-letter file extension, it means
something fantastical, and it has incredible potential for puns.
# Different Ways to Spell Roc
* **Roc** - traditional
* **roc** - low-key
* **ROC** - [YELLING](https://package.elm-lang.org/packages/elm/core/latest/String#toUpper)
* **Röc** - [metal 🤘](https://en.wikipedia.org/wiki/Metal_umlaut)
# Fun Facts
Roc translates to 鹏 in Chinese, [which means](https://www.mdbg.net/chinese/dictionary?page=worddict&wdrst=0&wdqb=%E9%B9%8F) "a large fabulous bird."