diff --git a/AUTHORS b/AUTHORS index 00f072e430..58357ab2bd 100644 --- a/AUTHORS +++ b/AUTHORS @@ -64,3 +64,4 @@ Jan Van Bruggen Mats Sigge <> Drew Lazzeri Tom Dohrmann +Elijah Schow diff --git a/FAQ.md b/FAQ.md new file mode 100644 index 0000000000..a80cc87ccc --- /dev/null +++ b/FAQ.md @@ -0,0 +1,277 @@ +# Frequently Asked Questions + +## Why doesn't Roc have higher-kinded polymorphism or arbitrary-rank types? + +_Since this is a FAQ answer, I'm going to assume familiarity with higher-kinded types and higher-rank types instead of including a primer on them._ + +A valuable aspect of Roc's type system is that it has [principal](https://en.wikipedia.org/wiki/Principal_type) +type inference. This means that: + +* At compile time, Roc can correctly infer the types for every expression in a program, even if you don't annotate any of the types. +* This inference always infers the most general type possible; you couldn't possibly add a valid type annotation that would make the type more flexible than the one that Roc would infer if you deleted the annotation. + +It's been proven that any type system which supports either [higher-kinded polymorphism](https://www.cl.cam.ac.uk/~jdy22/papers/lightweight-higher-kinded-polymorphism.pdf) or [arbitrary-rank types](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/putting.pdf) cannot have +principal type inference. With either of those features in the language, there will be situations where the compiler +reports an error that can only be fixed by the programmer adding a type annotation. This also means there would be +situations where the editor would not be able to reliably tell you the type of part of your program, unlike today +where it can accurately tell you the type of anything, even if you have no type annotations in your entire code base. + +### Arbitrary-rank types + +Unlike arbitrary-rank (aka "Rank-N") types, both Rank-1 and Rank-2 type systems are compatible with principal +type inference. 
Roc currently uses Rank-1 types, and the benefits of Rank-N over Rank-2 don't seem worth +sacrificing principal type inference to attain, so let's focus on the trade-offs between Rank-1 and Rank-2. + +Supporting Rank-2 types in Roc has been discussed before, but it has several important downsides: + +* It would increase the complexity of the language. +* It would make some compiler error messages more confusing (e.g. they might mention `forall` because that was the most general type that could be inferred, even if that wasn't helpful or related to the actual problem). +* It would substantially increase the complexity of the type checker, which would necessarily slow it down. + +No implementation of Rank-2 types can remove any of these downsides. Thus far, we've been able to come up +with sufficiently nice APIs that only require Rank-1 types, and we haven't seen a really compelling use case +where the gap between the Rank-2 and Rank-1 designs was big enough to justify switching to Rank-2. + +Since I prefer Roc being simpler and having a faster compiler with nicer error messages, my hope is that Roc +will never get Rank-2 types. However, it may turn out that in the future we learn about currently-unknown +upsides that somehow outweigh these downsides, so I'm open to considering the possibility - while rooting against it. + +### Higher-kinded polymorphism + +I want to be really clear about this one: the explicit plan is that Roc will never support higher-kinded polymorphism. + +On the technical side, the reasons for this are ordinary: I understand the practical benefits and +drawbacks of HKP, and I think the drawbacks outweigh the benefits when it comes to Roc. (Those who come to a +different conclusion may think HKP's drawbacks would be less of a big deal in Roc than I do. That's reasonable; +we programmers often weigh the same trade-offs differently.) 
To be clear, I think this is true in the specific context of +Roc; there are plenty of other languages where HKP seems like a great fit. For example, it's hard to imagine Haskell +without it. Similarly, I think lifetime annotations are a great fit for Rust, but don't think they'd be right +for Roc either. + +I also think it's important to consider the cultural implications of deciding whether or not to support HKP. +To illustrate what I mean, imagine this conversation: + +**Programmer 1:** "How do you feel about higher-kinded polymorphism?" + +**Programmer 2:** "I have no idea what that is." + +**Programmer 1:** "Okay, how do you feel about monads?" + +**Programmer 2:** "OH NO." + +I've had several variations of this conversation: I'm talking about higher-kinded types, +another programmer asks what that means, I give monads as an example, and their reaction is strongly negative. +I've also had plenty of conversations with programmers who love HKP and vigorously advocate for its addition +to languages they use which don't have it. Feelings about HKP seem strongly divided, maybe more so +than any other type system feature besides static vs. dynamic typing. + +It's impossible for a programming language to be neutral on this. If the language doesn't support HKP, nobody can +implement a Monad typeclass (or equivalent) in any way that can be expected to catch on. Advocacy to add HKP to the +language will inevitably follow. If the language does support HKP, one or more alternate standard libraries built +around monads will inevitably follow, along with corresponding cultural changes. (See Scala for example.) +Culturally, to support HKP is to take a side, and to decline to support it is also to take a side. + +Given this, language designers have three options: + +* Have HKP and have Monad in the standard library. Embrace them and build a culture and ecosystem around them. +* Have HKP and don't have Monad in the standard library. 
An alternate standard library built around monads will inevitably emerge, and both the community and ecosystem will divide themselves along pro-monad and anti-monad lines. +* Don't have HKP; build a culture and ecosystem around other things. + +Considering that these are the only three options, I think the best choice for Roc—not only on a technical +level, but on a cultural level as well—is to make it clear that the plan is for Roc never to support HKP. +I hope this clarity can save a lot of community members' time that would otherwise be spent on advocacy or +arguing between the two sides of the divide. Again, I think it's completely reasonable for anyone to have a +different preference, but given that language designers can only choose one of these options, I'm confident +I've made the right choice for Roc by designing it never to have higher-kinded polymorphism. + +## Why does Roc's syntax and standard library differ from Elm's? + +Roc is a direct descendant of [Elm](https://elm-lang.org/). However, there are some differences between the two languages. + +Syntactic differences are among these. This is a feature, not a bug; if Roc had identical syntax to Elm, then it's +predictable that people would write code that was designed to work in both languages - and would then rely on +that being true, for example by making a package which advertised "Works in both Elm and Roc!" This in turn +would mean that later if either language were to change its syntax in a way that didn't make sense for the other, +the result would be broken code and sadness. + +So why does Roc have the specific syntax changes it does? Here are some brief explanations: + +* `#` instead of `--` for comments - this allows [hashbang](https://senthilnayagan.medium.com/shebang-hashbang-10966b8f28a8)s to work without needing special syntax. That isn't a use case Elm supports, but it is one Roc is designed to support. 
+* `{}` instead of `()` for the unit type - Elm has both, and they can both be used as a unit type. Since `{}` has other uses in the type system, but `()` doesn't, I consider it redundant and took it out. +* No tuples - I wanted to try simplifying the language and seeing how much we'd miss them. Anything that could be represented as a tuple can be represented with either a record or a single-tag union instead (e.g. `Pair x y = ...`), so is it really necessary to have a third syntax for representing a [product type](https://en.wikipedia.org/wiki/Product_type)? +* `when`...`is` instead of `case`...`of` - I predict it will be easier for beginners to pick up, because usually the way I explain `case`...`of` to beginners is by saying the words "when" and "is" out loud - e.g. "when `color` is `Red`, it runs this first branch; when `color` is `Blue`, it runs this other branch..." +* `:` instead of `=` for record field names: I like `=` being reserved for definitions, and `:` is the most popular alternative. +* Backpassing syntax - since Roc is designed to be used for use cases like command-line apps, shell scripts, and servers, I expect chained effects to come up a lot more often than they do in Elm. I think backpassing is nice for those use cases, similarly to how `do` notation is nice for them in Haskell. +* Tag unions instead of Elm's custom types (aka algebraic data types). This isn't just a syntactic change; tag unions are mainly in Roc because they can facilitate errors being accumulated across chained effects, which (as noted a moment ago) I expect to be a lot more common in Roc than in Elm. If you have tag unions, you don't really need a separate language feature for algebraic data types, since closed tag unions essentially work the same way - aside from not giving you a way to selectively expose variants or define phantom types. Roc's opaque types language feature covers those use cases instead. +* No `<|` operator. 
In Elm, I almost exclusively found myself wanting to use this in conjunction with anonymous functions (e.g. `foo <| \bar -> ...`) or conditionals (e.g. `foo <| if bar then ...`). In Roc you can do both of these without the `<|`. That means the main remaining use for `<|` is to reduce parentheses, but I tend to think `|>` is better at that (or else the parens are fine), so after the other syntactic changes, I considered `<|` an unnecessary stylistic alternative to `|>` or parens. +* `:` instead of `type alias` - I like to avoid reserved keywords for terms that are desirable in userspace, so that people don't have to name things `typ` because `type` is a reserved keyword, or `clazz` because `class` is reserved. (I couldn't think of satisfactory alternatives for `as`, `when`, `is`, or `if` other than different reserved keywords. I could see an argument for `then`—and maybe even `is`—being replaced with a `->` or `=>` or something, but I don't anticipate missing either of those words much in userspace. `then` is used in JavaScript promises, but I think there are several better names for that function.) +* No underscores in variable names - I've seen Elm beginners reflexively use `snake_case` over `camelCase` and then need to un-learn the habit after the compiler accepted it. I'd rather have the compiler give feedback that this isn't the way to do it in Roc, and suggest a camelCase alternative. I've also seen underscores used for lazy naming, e.g. `foo` and then `foo_`. If lazy naming is the goal, `foo2` is just as concise as `foo_`, but `foo3` is more concise than `foo__`. So in a way, removing `_` is a forcing function for improved laziness. (Of course, more descriptive naming would be even better.) +* Trailing commas - I've seen people walk away (in some cases physically!) from Elm as soon as they saw the leading commas in collection literals. 
While I think they've made a mistake by not pushing past this aesthetic preference to give the language a chance, I also would prefer not to put them in a position to make such a mistake in the first place. Secondarily, while I'm personally fine with either style, between the two I prefer the look of trailing commas. +* The `!` unary prefix operator. I didn't want to have a `Basics` module (more on that in a moment), and without `Basics`, this would either need to be called fully-qualified (`Bool.not`) or else a module import of `Bool.{ not }` would be necessary. Both seemed less nice than supporting the `!` prefix that's common to so many widely-used languages, especially when we already have a unary prefix operator of `-` for negation (e.g. `-x`). +* `!=` for the inequality operator (instead of Elm's `/=`) - this one pairs more naturally with the `!` prefix operator and is also very common in other languages. + +Roc also has a different standard library from Elm. Some of the differences come down to platforms and applications (e.g. having `Task` in Roc's standard library wouldn't make sense), but others do not. Here are some brief explanations: + +* No `Basics` module. I wanted to have a simple rule of "all modules in the standard library are imported by default, and so are their exposed types," and that's it. Given that I wanted the comparison operators (e.g. `<`) to work only on numbers, it ended up that having `Num` and `Bool` modules meant that almost nothing would be left for a `Basics` equivalent in Roc except `identity` and `Never`. The Roc type `[]` (empty tag union) is equivalent to `Never`, so that wasn't necessary, and I generally think that `identity` is a good concept but a sign of an incomplete API whenever its use comes up in practice. For example, instead of calling `|> List.filterMap identity` I'd rather have access to a more self-descriptive function like `|> List.dropNothings`. 
With `Num` and `Bool`, and without `identity` and `Never`, there was nothing left in `Basics`. +* `Str` instead of `String` - after using the `str` type in Rust, I realized I had no issue whatsoever with the more concise name, especially since it was used in so many places (similar to `Msg` and `Cmd` in Elm) - so I decided to save a couple of letters. +* No function composition operators - I stopped using these in Elm so long ago that at one point I forgot they were in the language! See the FAQ entry on currying for details about why. +* No `Maybe`. If a function returns a potential error, I prefer `Result` with an error type that uses a no-payload tag to describe what went wrong. (For example, `List.first : List a -> Result a [ ListWasEmpty ]*` instead of `List.first : List a -> Maybe a`.) This is not only more self-descriptive, it also composes better with operations that have multiple ways to fail. Optional record fields can be handled using the explicit Optional Record Field language feature. To describe something that's neither an operation that can fail nor an optional field, I prefer using a more descriptive tag - e.g. for a nullable JSON decoder, instead of `nullable : Decoder a -> Decoder (Maybe a)`, making a self-documenting API like `nullable : Decoder a -> Decoder [ Null, NonNull a ]`. Joël's legendary [talk about Maybe](https://youtu.be/43eM4kNbb6c) is great, but the fact that a whole talk about such a simple type can be so useful speaks to how easy the type is to misuse. Imagine a 20-minute talk about `Result` - could it be anywhere near as helpful? On a historical note, it's conceivable that the creation of `Maybe` predated `Result`, and `Maybe` might have been thought of as a substitute for null pointers—as opposed to something that emerged organically based on specific motivating use cases after `Result` already existed. +* No `Char`. What most people think of as a "character" is a rendered glyph. 
However, rendered glyphs are made up of [grapheme clusters](https://stackoverflow.com/a/27331885), which consist of a variable number of Unicode code points - and there's no upper bound on how many code points there can be in a single cluster. In a world of emoji, I think this makes `Char` error-prone and it's better to have `Str` be the only first-class unit. For convenience when working with Unicode code points (e.g. for performance-critical tasks like parsing), the single-quote syntax is sugar for the corresponding `U32` code point - for example, writing `'鹏'` is exactly the same as writing `40527`. Like Rust, you get a compiler error if you put something in single quotes that's not a valid [Unicode scalar value](http://www.unicode.org/glossary/#unicode_scalar_value). +* No `Debug.log` - the editor can do a better job at this, or you can write `expect x != x` to see what `x` is when the expectation fails. Using the editor means your code doesn't change, and using `expect` gives a natural reminder to remove the debugging code before shipping: the build will fail. +* No `Debug.todo` - instead you can write a type annotation with no implementation below it; the type checker will treat it normally, but attempting to use the value will cause a runtime exception. This is a feature I've often wanted in Elm, because I like prototyping APIs by writing out the types only, but then when I want the compiler to type-check them for me, I end up having to add `Debug.todo` in various places. + +## Why aren't Roc functions curried by default? + +Although technically any language with first-class functions makes it possible to curry +any function (e.g. I can manually curry a Roc function `\x, y, z ->` by writing `\x -> \y -> \z ->` instead), +typically what people mean when they say Roc isn't a curried language is that Roc functions aren't curried +by default. 
For the rest of this section, I'll use "currying" as a shorthand for "functions that are curried +by default" for the sake of brevity. + +As I see it, currying has one major upside and four major downsides. The upside: + +* It makes function calls more concise in some cases. + +The downsides: + +* It makes the `|>` operator more error-prone in some cases. +* It makes higher-order function calls need more parentheses in some cases. +* It significantly increases the language's learning curve. +* It facilitates pointfree function composition. (More on why this is listed as a downside later.) + +There's also a downside that it would make runtime performance of compiled programs worse by default, +but I assume it would be possible to optimize that away at the cost of slightly longer compile times. + +I consider the one upside (conciseness in some places) extremely minor, and have almost never missed it in Roc. +Here are some more details about the downsides as I see them. + +### Currying and the `|>` operator + +In Roc, this code produces `"Hello, World!"`: + +```elm +"Hello, World" + |> Str.concat "!" +``` + +This is because Roc's `|>` operator uses the expression before the `|>` as the *first* argument to the function +after it. For functions where both arguments have the same type, but it's obvious which argument goes where (e.g. +`Str.concat "Hello, " "World!"`, `List.concat [ 1, 2 ] [ 3, 4 ]`), this works out well. Another example would +be `|> Num.sub 1`, which subtracts 1 from whatever came before the `|>`. + +For this reason, "pipeline-friendliness" in Roc means that the first argument to each function is typically +the one that's most likely to be built up using a pipeline. 
For example, `List.map`: + +```elm +numbers + |> List.map Num.abs +``` + +This argument ordering convention also often makes it possible to pass anonymous functions to higher-order +functions without needing parentheses, like so: + +```elm +List.map numbers \num -> Num.abs (num - 1) +``` + +(If the arguments were reversed, this would be `List.map (\num -> Num.abs (num - 1)) numbers` and the +extra parentheses would be required.) + +Neither of these benefits is compatible with the argument ordering currying encourages. Currying encourages +`List.map` to take the `List` as its second argument instead of the first, so that you can partially apply it +like `(List.map Num.abs)`; if Roc introduced currying but kept the order of `List.map` the same way it is today, +then partially applying `List.map` (e.g. `(List.map numbers)`) would be much less useful than if the arguments +were swapped - but that in turn would make it less useful with `|>` and would require parentheses when passing +it an anonymous function. + +This is a fundamental design tension. One argument order works well with `|>` (at least the way it works in Roc +today) and with passing anonymous functions to higher-order functions, and the other works well with currying. +It's impossible to have both. + +Of note, one possible design is to have currying while also having `|>` pass the *last* argument instead of the first. +This is what Elm does, and it makes pipeline-friendliness and curry-friendliness the same thing. However, it also +means that either `|> Str.concat "!"` would add the `"!"` to the front of the string, or else `Str.concat`'s +arguments would have to be flipped - meaning that `Str.concat "Hello, World" "!"` would evaluate to `"!Hello, World"`. + +The only way to have `Str.concat` work the way it does in Roc today (where both pipelines and non-pipeline calling +do what you'd want them to) is to order function arguments in a way that is not conducive to currying. 
This design +tension only exists if there's currying in the language; without it, you can order arguments for pipeline-friendliness +without concern. + +### Currying and learning curve + +Prior to designing Roc, I taught a lot of beginner [Elm](https://elm-lang.org/) workshops. Sometimes at +conferences, sometimes for [Frontend Masters](https://frontendmasters.com/courses/intro-elm/), +sometimes for free at local coding bootcamps or meetup groups. +In total I've spent well over 100 hours standing in front of a class, introducing the students to their +first pure functional programming language. + +Here was my experience teaching currying: + +* The only way to avoid teaching it is to refuse to explain why multi-argument functions have multiple `->`s in them. (If you don't explain it, at least one student will ask about it - and many if not all of the others will wonder.) +* Teaching currying properly takes a solid chunk of time, because it requires explaining partial application, explaining how curried functions facilitate partial application, how function signatures accurately reflect that they're curried, and going through examples for all of these. +* Even after doing all this, and iterating on my approach each time to try to explain it more effectively than I had the previous time, I'd estimate that under 50% of the class ended up actually understanding currying. I consistently heard that in practice it only "clicked" for most people after spending significantly more time writing code with it. + +This is not the end of the world, especially because it's easy enough to think "okay, I still don't totally get this +even after that explanation, but I can remember that function arguments are separated by `->` in this language +and maybe I'll understand the rest later." (Which they almost always do, if they stick with the language.) 
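To make the "multiple `->`s" and partial-application points above concrete outside the classroom, here is a minimal sketch in Python - chosen only because it makes the desugaring explicit; the function names are hypothetical, and this is not how Roc itself works:

```python
# An ordinary three-argument function, analogous to Roc's `\x, y, z -> ...`.
def add3(x, y, z):
    return x + y + z

# The manually curried form, analogous to writing `\x -> \y -> \z -> ...`.
# In a curried-by-default language, every function behaves like this one,
# which is why its type signature shows multiple `->`s.
def add3_curried(x):
    return lambda y: lambda z: x + y + z

# Partial application falls out of the curried form for free:
add_to_ten = add3_curried(4)(6)  # a function still waiting for `z`

print(add3(1, 2, 3))          # 6
print(add3_curried(1)(2)(3))  # 6
print(add_to_ten(5))          # 15
```

Walking through roughly this much material - the two equivalent forms, plus what partial application buys you - is the "solid chunk of time" the bullet list above refers to.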
+Clearly currying doesn't preclude a language from being easy to learn, because Elm has currying, and Elm's learning +curve is famously gentle. + +That said, beginners who feel confused while learning the language are less likely to continue with it. +And however easy Roc would be to learn if it had currying, the language is certainly easier to learn without it. + +### Pointfree function composition + +[Pointfree function composition](https://en.wikipedia.org/wiki/Tacit_programming) is where you define +a new function by composing together two existing functions without naming intermediate arguments. +Here's an example: + +```elm +reverseSort : List elem -> List elem +reverseSort = compose List.reverse List.sort + +compose : (a -> b), (c -> a) -> (c -> b) +compose = \f, g -> \x -> f (g x) +``` + +Here's how I would instead write this: + +```elm +reverseSort : List elem -> List elem +reverseSort = \list -> List.reverse (List.sort list) +``` + +I've consistently found that I can more quickly and accurately understand function definitions that use +named arguments, even though the code is longer. I suspect this is because I'm faster at reading than I am at +desugaring, and whenever I read the top version I end up needing to mentally desugar it into the bottom version. +In more complex examples (this is among the tamest pointfree function composition examples I've seen), I make +a mistake in my mental desugaring, and misunderstand what the function is doing - which can cause bugs. + +I assumed I would get faster and more accurate at this over time. However, by now it's been about a decade +since I first learned about the technique, and I'm still slower and less accurate at reading code that uses +pointfree function composition (including if I wrote it - but even more so if I didn't) than code written +with named arguments. 
I've asked a lot of other programmers about their experiences with pointfree function +composition over the years, and the overwhelming majority of responses have been consistent with my experience. + +As such, my opinion about pointfree function composition has gotten less and less nuanced over time. I've now moved +past "it's the right tool for the job, sometimes" to concluding it's best thought of as an antipattern. This is +because I realized how much time I was spending evaluating on a case-by-case basis whether it might be the +right fit for a given situation. The time spent on this analysis alone vastly outweighed the sum of all the +benefits I got in the rare cases where I concluded it was a fit. So I've found the way to get the most out of +pointfree function composition is to never even think about using it; every other strategy leads to a worse outcome. + +Currying facilitates the antipattern of pointfree function composition, which I view as a downside of currying. + +Stacking up all these downsides of currying against the one upside of making certain function calls more concise, +I concluded that it would be a mistake to have it in Roc. + +## Is there syntax highlighting for Vim/Emacs/VS Code or an LSP? + +Not currently. Although they will presumably exist someday, while Roc is in the early days there's actually a conscious +effort to focus on the Roc Editor *instead of* adding Roc support to other editors - specifically in order to give the Roc +Editor the best possible chance at kickstarting a virtuous cycle of plugin authorship. + +This is an unusual approach, but there are more details in [this 2021 interview](https://youtu.be/ITrDd6-PbvY?t=212). + +In the meantime, using CoffeeScript syntax highlighting for .roc files turns out to work surprisingly well! 
diff --git a/README.md b/README.md index 777f2bc0d9..6197a8a6ac 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ The [tutorial](TUTORIAL.md) is the best place to learn about how to use the lang There's also a folder of [examples](https://github.com/rtfeldman/roc/tree/trunk/examples) - the [CLI example](https://github.com/rtfeldman/roc/tree/trunk/examples/cli) in particular is a reasonable starting point to build on. -[Roc Zulip chat](https://roc.zulipchat.com) is the best place to ask questions and get help! It's also where we discuss [ideas](https://roc.zulipchat.com/#narrow/stream/304641-ideas) for the language. If you want to get involved in contributing to the language, Zulip is also a great place to ask about good first projects. +If you have a specific question, the [FAQ](FAQ.md) might have an answer, although [Roc Zulip chat](https://roc.zulipchat.com) is overall the best place to ask questions and get help! It's also where we discuss [ideas](https://roc.zulipchat.com/#narrow/stream/304641-ideas) for the language. If you want to get involved in contributing to the language, Zulip is also a great place to ask about good first projects. 
## State of Roc diff --git a/cli/src/build.rs b/cli/src/build.rs index 4786df2e65..98563beedb 100644 --- a/cli/src/build.rs +++ b/cli/src/build.rs @@ -204,7 +204,11 @@ pub fn build_file<'a>( buf.push_str("Code Generation"); buf.push('\n'); - report_timing(buf, "Generate LLVM IR", code_gen_timing.code_gen); + report_timing( + buf, + "Generate Assembly from Mono IR", + code_gen_timing.code_gen, + ); report_timing(buf, "Emit .o file", code_gen_timing.emit_o_file); let compilation_end = compilation_start.elapsed().unwrap(); diff --git a/cli/tests/cli_run.rs b/cli/tests/cli_run.rs index a4a105349b..45a32eecd9 100644 --- a/cli/tests/cli_run.rs +++ b/cli/tests/cli_run.rs @@ -69,6 +69,17 @@ mod cli_run { assert_multiline_str_eq!(err, expected.into()); } + fn check_format_check_as_expected(file: &Path, expects_success_exit_code: bool) { + let flags = &["--check"]; + let out = run_roc(&[&["format", &file.to_str().unwrap()], &flags[..]].concat()); + + if expects_success_exit_code { + assert!(out.status.success()); + } else { + assert!(!out.status.success()); + } + } + fn check_output_with_stdin( file: &Path, stdin: &[&str], @@ -863,6 +874,16 @@ mod cli_run { ), ); } + + #[test] + fn format_check_good() { + check_format_check_as_expected(&fixture_file("format", "Formatted.roc"), true); + } + + #[test] + fn format_check_reformatting_needed() { + check_format_check_as_expected(&fixture_file("format", "NotFormatted.roc"), false); + } } #[allow(dead_code)] diff --git a/cli/tests/fixtures/format/Formatted.roc b/cli/tests/fixtures/format/Formatted.roc new file mode 100644 index 0000000000..b62c494e66 --- /dev/null +++ b/cli/tests/fixtures/format/Formatted.roc @@ -0,0 +1,6 @@ +app "formatted" + packages { pf: "platform" } imports [] + provides [ main ] to pf + +main : Str +main = Dep1.value1 {} diff --git a/cli/tests/fixtures/format/NotFormatted.roc b/cli/tests/fixtures/format/NotFormatted.roc new file mode 100644 index 0000000000..df12071466 --- /dev/null +++ 
b/cli/tests/fixtures/format/NotFormatted.roc @@ -0,0 +1,6 @@ +app "formatted" + packages { pf: "platform" } + provides [ main ] to pf + +main : Str +main = Dep1.value1 {} diff --git a/compiler/build/src/program.rs b/compiler/build/src/program.rs index 252a75c050..933e8f9e7a 100644 --- a/compiler/build/src/program.rs +++ b/compiler/build/src/program.rs @@ -7,7 +7,7 @@ use roc_module::symbol::{Interns, ModuleId}; use roc_mono::ir::OptLevel; use roc_region::all::LineInfo; use std::path::{Path, PathBuf}; -use std::time::Duration; +use std::time::{Duration, SystemTime}; use roc_collections::all::MutMap; #[cfg(feature = "target-wasm32")] @@ -230,7 +230,6 @@ pub fn gen_from_mono_module_llvm( use inkwell::context::Context; use inkwell::module::Linkage; use inkwell::targets::{CodeModel, FileType, RelocMode}; - use std::time::SystemTime; let code_gen_start = SystemTime::now(); @@ -486,6 +485,7 @@ fn gen_from_mono_module_dev_wasm32( loaded: MonomorphizedModule, app_o_file: &Path, ) -> CodeGenTiming { + let code_gen_start = SystemTime::now(); let MonomorphizedModule { module_id, procedures, @@ -519,9 +519,17 @@ fn gen_from_mono_module_dev_wasm32( procedures, ); + let code_gen = code_gen_start.elapsed().unwrap(); + let emit_o_file_start = SystemTime::now(); + std::fs::write(&app_o_file, &bytes).expect("failed to write object to file"); - CodeGenTiming::default() + let emit_o_file = emit_o_file_start.elapsed().unwrap(); + + CodeGenTiming { + code_gen, + emit_o_file, + } } fn gen_from_mono_module_dev_assembly( @@ -530,6 +538,8 @@ fn gen_from_mono_module_dev_assembly( target: &target_lexicon::Triple, app_o_file: &Path, ) -> CodeGenTiming { + let code_gen_start = SystemTime::now(); + let lazy_literals = true; let generate_allocators = false; // provided by the platform @@ -551,10 +561,18 @@ fn gen_from_mono_module_dev_assembly( let module_object = roc_gen_dev::build_module(&env, &mut interns, target, procedures); + let code_gen = code_gen_start.elapsed().unwrap(); + let 
emit_o_file_start = SystemTime::now(); + let module_out = module_object .write() .expect("failed to build output object"); std::fs::write(&app_o_file, module_out).expect("failed to write object to file"); - CodeGenTiming::default() + let emit_o_file = emit_o_file_start.elapsed().unwrap(); + + CodeGenTiming { + code_gen, + emit_o_file, + } } diff --git a/compiler/gen_dev/src/generic64/aarch64.rs b/compiler/gen_dev/src/generic64/aarch64.rs index 591c799052..302db9b596 100644 --- a/compiler/gen_dev/src/generic64/aarch64.rs +++ b/compiler/gen_dev/src/generic64/aarch64.rs @@ -1,8 +1,7 @@ -use crate::generic64::{Assembler, CallConv, RegTrait, SymbolStorage}; +use crate::generic64::{storage::StorageManager, Assembler, CallConv, RegTrait}; use crate::Relocation; use bumpalo::collections::Vec; use packed_struct::prelude::*; -use roc_collections::all::MutMap; use roc_error_macros::internal_error; use roc_module::symbol::Symbol; use roc_mono::layout::Layout; @@ -75,7 +74,7 @@ pub struct AArch64Call {} const STACK_ALIGNMENT: u8 = 16; -impl CallConv for AArch64Call { +impl CallConv for AArch64Call { const BASE_PTR_REG: AArch64GeneralReg = AArch64GeneralReg::FP; const STACK_PTR_REG: AArch64GeneralReg = AArch64GeneralReg::ZRSP; @@ -160,13 +159,14 @@ impl CallConv for AArch64Call { #[inline(always)] fn setup_stack( buf: &mut Vec<'_, u8>, - saved_regs: &[AArch64GeneralReg], + saved_general_regs: &[AArch64GeneralReg], + saved_float_regs: &[AArch64FloatReg], requested_stack_size: i32, fn_call_stack_size: i32, ) -> i32 { // Full size is upcast to i64 to make sure we don't overflow here. let full_stack_size = match requested_stack_size - .checked_add(8 * saved_regs.len() as i32 + 8) // The extra 8 is space to store the frame pointer. + .checked_add(8 * (saved_general_regs.len() + saved_float_regs.len()) as i32 + 8) // The extra 8 is space to store the frame pointer. 
.and_then(|size| size.checked_add(fn_call_stack_size)) { Some(size) => size, @@ -204,10 +204,14 @@ impl CallConv for AArch64Call { AArch64Assembler::mov_stack32_reg64(buf, offset, AArch64GeneralReg::FP); offset = aligned_stack_size - fn_call_stack_size; - for reg in saved_regs { + for reg in saved_general_regs { offset -= 8; AArch64Assembler::mov_base32_reg64(buf, offset, *reg); } + for reg in saved_float_regs { + offset -= 8; + AArch64Assembler::mov_base32_freg64(buf, offset, *reg); + } aligned_stack_size } else { 0 @@ -220,7 +224,8 @@ impl CallConv for AArch64Call { #[inline(always)] fn cleanup_stack( buf: &mut Vec<'_, u8>, - saved_regs: &[AArch64GeneralReg], + saved_general_regs: &[AArch64GeneralReg], + saved_float_regs: &[AArch64FloatReg], aligned_stack_size: i32, fn_call_stack_size: i32, ) { @@ -233,10 +238,14 @@ impl CallConv for AArch64Call { AArch64Assembler::mov_reg64_stack32(buf, AArch64GeneralReg::FP, offset); offset = aligned_stack_size - fn_call_stack_size; - for reg in saved_regs { + for reg in saved_general_regs { offset -= 8; AArch64Assembler::mov_reg64_base32(buf, *reg, offset); } + for reg in saved_float_regs { + offset -= 8; + AArch64Assembler::mov_freg64_base32(buf, *reg, offset); + } AArch64Assembler::add_reg64_reg64_imm32( buf, AArch64GeneralReg::ZRSP, @@ -249,37 +258,64 @@ impl CallConv for AArch64Call { #[inline(always)] fn load_args<'a>( _buf: &mut Vec<'a, u8>, - _symbol_map: &mut MutMap>, + _storage_manager: &mut StorageManager< + 'a, + AArch64GeneralReg, + AArch64FloatReg, + AArch64Assembler, + AArch64Call, + >, _args: &'a [(Layout<'a>, Symbol)], _ret_layout: &Layout<'a>, - mut _stack_size: u32, - ) -> u32 { + ) { todo!("Loading args for AArch64"); } #[inline(always)] fn store_args<'a>( _buf: &mut Vec<'a, u8>, - _symbol_map: &MutMap>, + _storage_manager: &mut StorageManager< + 'a, + AArch64GeneralReg, + AArch64FloatReg, + AArch64Assembler, + AArch64Call, + >, _args: &'a [Symbol], _arg_layouts: &[Layout<'a>], _ret_layout: &Layout<'a>, - ) 
-> u32 { + ) { todo!("Storing args for AArch64"); } - fn return_struct<'a>( + fn return_complex_symbol<'a>( _buf: &mut Vec<'a, u8>, - _struct_offset: i32, - _struct_size: u32, - _field_layouts: &[Layout<'a>], - _ret_reg: Option, + _storage_manager: &mut StorageManager< + 'a, + AArch64GeneralReg, + AArch64FloatReg, + AArch64Assembler, + AArch64Call, + >, + _sym: &Symbol, + _layout: &Layout<'a>, ) { - todo!("Returning structs for AArch64"); + todo!("Returning complex symbols for AArch64"); } - fn returns_via_arg_pointer(_ret_layout: &Layout) -> bool { - todo!("Returning via arg pointer for AArch64"); + fn load_returned_complex_symbol<'a>( + _buf: &mut Vec<'a, u8>, + _storage_manager: &mut StorageManager< + 'a, + AArch64GeneralReg, + AArch64FloatReg, + AArch64Assembler, + AArch64Call, + >, + _sym: &Symbol, + _layout: &Layout<'a>, + ) { + todo!("Loading returned complex symbols for AArch64"); } } diff --git a/compiler/gen_dev/src/generic64/mod.rs b/compiler/gen_dev/src/generic64/mod.rs index f30e98edc3..6639d27d6f 100644 --- a/compiler/gen_dev/src/generic64/mod.rs +++ b/compiler/gen_dev/src/generic64/mod.rs @@ -1,7 +1,7 @@ -use crate::{Backend, Env, Relocation}; +use crate::{single_register_floats, single_register_integers, Backend, Env, Relocation}; use bumpalo::collections::Vec; use roc_builtins::bitcode::{FloatWidth, IntWidth}; -use roc_collections::all::{MutMap, MutSet}; +use roc_collections::all::MutMap; use roc_error_macros::internal_error; use roc_module::symbol::{Interns, Symbol}; use roc_mono::code_gen_help::CodeGenHelp; @@ -10,12 +10,15 @@ use roc_mono::layout::{Builtin, Layout}; use roc_target::TargetInfo; use std::marker::PhantomData; -pub mod aarch64; -pub mod x86_64; +pub(crate) mod aarch64; +pub(crate) mod storage; +pub(crate) mod x86_64; -const TARGET_INFO: TargetInfo = TargetInfo::default_x86_64(); +use storage::StorageManager; -pub trait CallConv { +pub trait CallConv>: + Sized +{ const BASE_PTR_REG: GeneralReg; const STACK_PTR_REG: GeneralReg; @@ 
-43,50 +46,55 @@ pub trait CallConv { fn setup_stack<'a>( buf: &mut Vec<'a, u8>, general_saved_regs: &[GeneralReg], + float_saved_regs: &[FloatReg], requested_stack_size: i32, fn_call_stack_size: i32, ) -> i32; fn cleanup_stack<'a>( buf: &mut Vec<'a, u8>, general_saved_regs: &[GeneralReg], + float_saved_regs: &[FloatReg], aligned_stack_size: i32, fn_call_stack_size: i32, ); - // load_args updates the symbol map to know where every arg is stored. - // It returns the total stack space after loading the args. + /// load_args updates the storage manager to know where every arg is stored. fn load_args<'a>( buf: &mut Vec<'a, u8>, - symbol_map: &mut MutMap>, + storage_manager: &mut StorageManager<'a, GeneralReg, FloatReg, ASM, Self>, args: &'a [(Layout<'a>, Symbol)], // ret_layout is needed because if it is a complex type, we pass a pointer as the first arg. ret_layout: &Layout<'a>, - stack_size: u32, - ) -> u32; + ); - // store_args stores the args in registers and on the stack for function calling. - // It returns the amount of stack space needed to temporarily store the args. + /// store_args stores the args in registers and on the stack for function calling. + /// It also updates the amount of temporary stack space needed in the storage manager. fn store_args<'a>( buf: &mut Vec<'a, u8>, - symbol_map: &MutMap>, + storage_manager: &mut StorageManager<'a, GeneralReg, FloatReg, ASM, Self>, args: &'a [Symbol], arg_layouts: &[Layout<'a>], // ret_layout is needed because if it is a complex type, we pass a pointer as the first arg. ret_layout: &Layout<'a>, - ) -> u32; - - // return_struct returns a struct currently on the stack at `struct_offset`. - // It does so using registers and stack as necessary. - fn return_struct<'a>( - buf: &mut Vec<'a, u8>, - struct_offset: i32, - struct_size: u32, - field_layouts: &[Layout<'a>], - ret_reg: Option, ); - // returns true if the layout should be returned via an argument pointer. 
-    fn returns_via_arg_pointer(ret_layout: &Layout) -> bool;
+    /// return_complex_symbol returns the specified complex/non-primitive symbol.
+    /// It uses the layout to determine how the data should be returned.
+    fn return_complex_symbol<'a>(
+        buf: &mut Vec<'a, u8>,
+        storage_manager: &mut StorageManager<'a, GeneralReg, FloatReg, ASM, Self>,
+        sym: &Symbol,
+        layout: &Layout<'a>,
+    );
+
+    /// load_returned_complex_symbol loads a complex symbol that was returned from a function call.
+    /// It uses the layout to determine how the data should be loaded into the symbol.
+    fn load_returned_complex_symbol<'a>(
+        buf: &mut Vec<'a, u8>,
+        storage_manager: &mut StorageManager<'a, GeneralReg, FloatReg, ASM, Self>,
+        sym: &Symbol,
+        layout: &Layout<'a>,
+    );
 }

 /// Assembler contains calls to the backend assembly generator.
@@ -95,7 +103,7 @@ pub trait CallConv {
 /// Thus, some backends will need to use multiple instructions to perform a single one of these calls.
 /// Generally, I prefer explicit sources, as opposed to dst being one of the sources. Ex: `x = x + y` would be `add x, x, y` instead of `add x, y`.
 /// dst should always come before sources.
-pub trait Assembler {
+pub trait Assembler: Sized {
     fn abs_reg64_reg64(buf: &mut Vec<'_, u8>, dst: GeneralReg, src: GeneralReg);
     fn abs_freg64_freg64(
         buf: &mut Vec<'_, u8>,
@@ -120,16 +128,16 @@ pub trait Assembler {
     fn call(buf: &mut Vec<'_, u8>, relocs: &mut Vec<'_, Relocation>, fn_name: String);

-    // Jumps by an offset of offset bytes unconditionally.
-    // It should always generate the same number of bytes to enable replacement if offset changes.
-    // It returns the base offset to calculate the jump from (generally the instruction after the jump).
+    /// Jumps by an offset of offset bytes unconditionally.
+    /// It should always generate the same number of bytes to enable replacement if offset changes.
+    /// It returns the base offset to calculate the jump from (generally the instruction after the jump).
fn jmp_imm32(buf: &mut Vec<'_, u8>, offset: i32) -> usize; fn tail_call(buf: &mut Vec<'_, u8>) -> u64; - // Jumps by an offset of offset bytes if reg is not equal to imm. - // It should always generate the same number of bytes to enable replacement if offset changes. - // It returns the base offset to calculate the jump from (generally the instruction after the jump). + /// Jumps by an offset of offset bytes if reg is not equal to imm. + /// It should always generate the same number of bytes to enable replacement if offset changes. + /// It returns the base offset to calculate the jump from (generally the instruction after the jump). fn jne_reg64_imm64_imm32( buf: &mut Vec<'_, u8>, reg: GeneralReg, @@ -219,30 +227,7 @@ pub trait Assembler { fn ret(buf: &mut Vec<'_, u8>); } -#[derive(Clone, Debug, PartialEq)] -pub enum SymbolStorage { - GeneralReg(GeneralReg), - FloatReg(FloatReg), - Base { - offset: i32, - size: u32, - owned: bool, - }, - BaseAndGeneralReg { - reg: GeneralReg, - offset: i32, - size: u32, - owned: bool, - }, - BaseAndFloatReg { - reg: FloatReg, - offset: i32, - size: u32, - owned: bool, - }, -} - -pub trait RegTrait: Copy + Eq + std::hash::Hash + std::fmt::Debug + 'static { +pub trait RegTrait: Copy + PartialEq + Eq + std::hash::Hash + std::fmt::Debug + 'static { fn value(&self) -> u8; } @@ -251,8 +236,10 @@ pub struct Backend64Bit< GeneralReg: RegTrait, FloatReg: RegTrait, ASM: Assembler, - CC: CallConv, + CC: CallConv, > { + // TODO: A number of the uses of MutMap could probably be some form of linear mutmap + // They are likely to be small enough that it is faster to use a vec and linearly scan it or keep it sorted and binary search. 
phantom_asm: PhantomData, phantom_cc: PhantomData, env: &'a Env<'a>, @@ -268,29 +255,10 @@ pub struct Backend64Bit< layout_map: MutMap>, free_map: MutMap<*const Stmt<'a>, Vec<'a, Symbol>>, - symbol_storage_map: MutMap>, literal_map: MutMap, *const Layout<'a>)>, join_map: MutMap, - // This should probably be smarter than a vec. - // There are certain registers we should always use first. With pushing and popping, this could get mixed. - general_free_regs: Vec<'a, GeneralReg>, - float_free_regs: Vec<'a, FloatReg>, - - // The last major thing we need is a way to decide what reg to free when all of them are full. - // Theoretically we want a basic lru cache for the currently loaded symbols. - // For now just a vec of used registers and the symbols they contain. - general_used_regs: Vec<'a, (GeneralReg, Symbol)>, - float_used_regs: Vec<'a, (FloatReg, Symbol)>, - - // used callee saved regs must be tracked for pushing and popping at the beginning/end of the function. - general_used_callee_saved_regs: MutSet, - float_used_callee_saved_regs: MutSet, - - free_stack_chunks: Vec<'a, (i32, u32)>, - stack_size: u32, - // The amount of stack space needed to pass args for function calling. - fn_call_stack_size: u32, + storage_manager: StorageManager<'a, GeneralReg, FloatReg, ASM, CC>, } /// new creates a new backend that will output to the specific Object. 
@@ -299,9 +267,10 @@ pub fn new_backend_64bit< GeneralReg: RegTrait, FloatReg: RegTrait, ASM: Assembler, - CC: CallConv, + CC: CallConv, >( env: &'a Env, + target_info: TargetInfo, interns: &'a mut Interns, ) -> Backend64Bit<'a, GeneralReg, FloatReg, ASM, CC> { Backend64Bit { @@ -309,7 +278,7 @@ pub fn new_backend_64bit< phantom_cc: PhantomData, env, interns, - helper_proc_gen: CodeGenHelp::new(env.arena, TARGET_INFO, env.module_id), + helper_proc_gen: CodeGenHelp::new(env.arena, target_info, env.module_id), helper_proc_symbols: bumpalo::vec![in env.arena], proc_name: None, is_self_recursive: None, @@ -318,18 +287,9 @@ pub fn new_backend_64bit< last_seen_map: MutMap::default(), layout_map: MutMap::default(), free_map: MutMap::default(), - symbol_storage_map: MutMap::default(), literal_map: MutMap::default(), join_map: MutMap::default(), - general_free_regs: bumpalo::vec![in env.arena], - general_used_regs: bumpalo::vec![in env.arena], - general_used_callee_saved_regs: MutSet::default(), - float_free_regs: bumpalo::vec![in env.arena], - float_used_regs: bumpalo::vec![in env.arena], - float_used_callee_saved_regs: MutSet::default(), - free_stack_chunks: bumpalo::vec![in env.arena], - stack_size: 0, - fn_call_stack_size: 0, + storage_manager: storage::new_storage_manager(env, target_info), } } @@ -338,7 +298,7 @@ impl< GeneralReg: RegTrait, FloatReg: RegTrait, ASM: Assembler, - CC: CallConv, + CC: CallConv, > Backend<'a> for Backend64Bit<'a, GeneralReg, FloatReg, ASM, CC> { fn env(&self) -> &Env<'a> { @@ -363,26 +323,13 @@ impl< fn reset(&mut self, name: String, is_self_recursive: SelfRecursive) { self.proc_name = Some(name); self.is_self_recursive = Some(is_self_recursive); - self.stack_size = 0; - self.free_stack_chunks.clear(); - self.fn_call_stack_size = 0; self.last_seen_map.clear(); self.layout_map.clear(); self.join_map.clear(); self.free_map.clear(); - self.symbol_storage_map.clear(); self.buf.clear(); - self.general_used_callee_saved_regs.clear(); - 
self.general_free_regs.clear(); - self.general_used_regs.clear(); - self.general_free_regs - .extend_from_slice(CC::GENERAL_DEFAULT_FREE_REGS); - self.float_used_callee_saved_regs.clear(); - self.float_free_regs.clear(); - self.float_used_regs.clear(); - self.float_free_regs - .extend_from_slice(CC::FLOAT_DEFAULT_FREE_REGS); self.helper_proc_symbols.clear(); + self.storage_manager.reset(); } fn literal_map(&mut self) -> &mut MutMap, *const Layout<'a>)> { @@ -409,13 +356,14 @@ impl< let mut out = bumpalo::vec![in self.env.arena]; // Setup stack. - let mut used_regs = bumpalo::vec![in self.env.arena]; - used_regs.extend(&self.general_used_callee_saved_regs); + let used_general_regs = self.storage_manager.general_used_callee_saved_regs(); + let used_float_regs = self.storage_manager.float_used_callee_saved_regs(); let aligned_stack_size = CC::setup_stack( &mut out, - &used_regs, - self.stack_size as i32, - self.fn_call_stack_size as i32, + &used_general_regs, + &used_float_regs, + self.storage_manager.stack_size() as i32, + self.storage_manager.fn_call_stack_size() as i32, ); let setup_offset = out.len(); @@ -466,9 +414,10 @@ impl< // Cleanup stack. CC::cleanup_stack( &mut out, - &used_regs, + &used_general_regs, + &used_float_regs, aligned_stack_size, - self.fn_call_stack_size as i32, + self.storage_manager.fn_call_stack_size() as i32, ); ASM::ret(&mut out); @@ -498,27 +447,7 @@ impl< } fn load_args(&mut self, args: &'a [(Layout<'a>, Symbol)], ret_layout: &Layout<'a>) { - self.stack_size = CC::load_args( - &mut self.buf, - &mut self.symbol_storage_map, - args, - ret_layout, - self.stack_size, - ); - // Update used and free regs. - for (sym, storage) in &self.symbol_storage_map { - match storage { - SymbolStorage::GeneralReg(reg) | SymbolStorage::BaseAndGeneralReg { reg, .. } => { - self.general_free_regs.retain(|r| *r != *reg); - self.general_used_regs.push((*reg, *sym)); - } - SymbolStorage::FloatReg(reg) | SymbolStorage::BaseAndFloatReg { reg, .. 
} => { - self.float_free_regs.retain(|r| *r != *reg); - self.float_used_regs.push((*reg, *sym)); - } - SymbolStorage::Base { .. } => {} - } - } + CC::load_args(&mut self.buf, &mut self.storage_manager, args, ret_layout); } /// Used for generating wrappers for malloc/realloc/free @@ -543,53 +472,39 @@ impl< } } // Save used caller saved regs. - self.push_used_caller_saved_regs_to_stack(); + self.storage_manager + .push_used_caller_saved_regs_to_stack(&mut self.buf); // Put values in param regs or on top of the stack. - let tmp_stack_size = CC::store_args( + CC::store_args( &mut self.buf, - &self.symbol_storage_map, + &mut self.storage_manager, args, arg_layouts, ret_layout, ); - self.fn_call_stack_size = std::cmp::max(self.fn_call_stack_size, tmp_stack_size); // Call function and generate reloc. ASM::call(&mut self.buf, &mut self.relocs, fn_name); // move return value to dst. match ret_layout { - Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64) | Builtin::Bool) => { - let dst_reg = self.claim_general_reg(dst); + single_register_integers!() => { + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); ASM::mov_reg64_reg64(&mut self.buf, dst_reg, CC::GENERAL_RETURN_REGS[0]); } - Layout::Builtin(Builtin::Float(FloatWidth::F64)) => { - let dst_reg = self.claim_float_reg(dst); + single_register_floats!() => { + let dst_reg = self.storage_manager.claim_float_reg(&mut self.buf, dst); ASM::mov_freg64_freg64(&mut self.buf, dst_reg, CC::FLOAT_RETURN_REGS[0]); } - Layout::Builtin(Builtin::Str) => { - if CC::returns_via_arg_pointer(ret_layout) { - // This will happen on windows, return via pointer here. 
- todo!("FnCall: Returning strings via pointer"); - } else { - let offset = self.claim_stack_size(16); - self.symbol_storage_map.insert( - *dst, - SymbolStorage::Base { - offset, - size: 16, - owned: true, - }, - ); - ASM::mov_base32_reg64(&mut self.buf, offset, CC::GENERAL_RETURN_REGS[0]); - ASM::mov_base32_reg64(&mut self.buf, offset + 8, CC::GENERAL_RETURN_REGS[1]); - } + _ => { + CC::load_returned_complex_symbol( + &mut self.buf, + &mut self.storage_manager, + dst, + ret_layout, + ); } - Layout::Struct([]) => { - // Nothing needs to be done to load a returned empty struct. - } - x => todo!("FnCall: receiving return type, {:?}", x), } } @@ -604,7 +519,9 @@ impl< // Switches are a little complex due to keeping track of jumps. // In general I am trying to not have to loop over things multiple times or waste memory. // The basic plan is to make jumps to nowhere and then correct them once we know the correct address. - let cond_reg = self.load_to_general_reg(cond_symbol); + let cond_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, cond_symbol); let mut ret_jumps = bumpalo::vec![in self.env.arena]; let mut tmp = bumpalo::vec![in self.env.arena]; @@ -663,74 +580,19 @@ impl< remainder: &'a Stmt<'a>, ret_layout: &Layout<'a>, ) { + // Ensure all the joinpoint parameters have storage locations. + // On jumps to the joinpoint, we will overwrite those locations as a way to "pass parameters" to the joinpoint. + self.storage_manager + .setup_joinpoint(&mut self.buf, id, parameters); + // Create jump to remaining. let jmp_location = self.buf.len(); let start_offset = ASM::jmp_imm32(&mut self.buf, 0x1234_5678); - // This section can essentially be seen as a sub function within the main function. - // Thus we build using a new backend with some minor extra synchronization. 
- { - let mut sub_backend = - new_backend_64bit::(self.env, self.interns); - sub_backend.reset( - self.proc_name.as_ref().unwrap().clone(), - self.is_self_recursive.as_ref().unwrap().clone(), - ); - // Sync static maps of important information. - sub_backend.last_seen_map = self.last_seen_map.clone(); - sub_backend.layout_map = self.layout_map.clone(); - sub_backend.free_map = self.free_map.clone(); + // Build all statements in body. + self.join_map.insert(*id, self.buf.len() as u64); + self.build_stmt(body, ret_layout); - // Setup join point. - sub_backend.join_map.insert(*id, 0); - self.join_map.insert(*id, self.buf.len() as u64); - - // Sync stack size so the "sub function" doesn't mess up our stack. - sub_backend.stack_size = self.stack_size; - sub_backend.fn_call_stack_size = self.fn_call_stack_size; - - // Load params as if they were args. - let mut args = bumpalo::vec![in self.env.arena]; - for param in parameters { - args.push((param.layout, param.symbol)); - } - sub_backend.load_args(args.into_bump_slice(), ret_layout); - - // Build all statements in body. - sub_backend.build_stmt(body, ret_layout); - - // Merge the "sub function" into the main function. - let sub_func_offset = self.buf.len() as u64; - self.buf.extend_from_slice(&sub_backend.buf); - // Update stack based on how much was used by the sub function. - self.stack_size = sub_backend.stack_size; - self.fn_call_stack_size = sub_backend.fn_call_stack_size; - // Relocations must be shifted to be merged correctly. 
- self.relocs - .extend(sub_backend.relocs.into_iter().map(|reloc| match reloc { - Relocation::LocalData { offset, data } => Relocation::LocalData { - offset: offset + sub_func_offset, - data, - }, - Relocation::LinkedData { offset, name } => Relocation::LinkedData { - offset: offset + sub_func_offset, - name, - }, - Relocation::LinkedFunction { offset, name } => Relocation::LinkedFunction { - offset: offset + sub_func_offset, - name, - }, - Relocation::JmpToReturn { - inst_loc, - inst_size, - offset, - } => Relocation::JmpToReturn { - inst_loc: inst_loc + sub_func_offset, - inst_size, - offset: offset + sub_func_offset, - }, - })); - } // Overwrite the original jump with the correct offset. let mut tmp = bumpalo::vec![in self.env.arena]; self.update_jmp_imm32_offset( @@ -749,20 +611,10 @@ impl< id: &JoinPointId, args: &'a [Symbol], arg_layouts: &[Layout<'a>], - ret_layout: &Layout<'a>, + _ret_layout: &Layout<'a>, ) { - // Treat this like a function call, but with a jump instead of a call instruction at the end. 
- - self.push_used_caller_saved_regs_to_stack(); - - let tmp_stack_size = CC::store_args( - &mut self.buf, - &self.symbol_storage_map, - args, - arg_layouts, - ret_layout, - ); - self.fn_call_stack_size = std::cmp::max(self.fn_call_stack_size, tmp_stack_size); + self.storage_manager + .setup_jump(&mut self.buf, id, args, arg_layouts); let jmp_location = self.buf.len(); let start_offset = ASM::jmp_imm32(&mut self.buf, 0x1234_5678); @@ -784,13 +636,13 @@ impl< fn build_num_abs(&mut self, dst: &Symbol, src: &Symbol, layout: &Layout<'a>) { match layout { Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let dst_reg = self.claim_general_reg(dst); - let src_reg = self.load_to_general_reg(src); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src_reg = self.storage_manager.load_to_general_reg(&mut self.buf, src); ASM::abs_reg64_reg64(&mut self.buf, dst_reg, src_reg); } Layout::Builtin(Builtin::Float(FloatWidth::F64)) => { - let dst_reg = self.claim_float_reg(dst); - let src_reg = self.load_to_float_reg(src); + let dst_reg = self.storage_manager.claim_float_reg(&mut self.buf, dst); + let src_reg = self.storage_manager.load_to_float_reg(&mut self.buf, src); ASM::abs_freg64_freg64(&mut self.buf, &mut self.relocs, dst_reg, src_reg); } x => todo!("NumAbs: layout, {:?}", x), @@ -800,15 +652,19 @@ impl< fn build_num_add(&mut self, dst: &Symbol, src1: &Symbol, src2: &Symbol, layout: &Layout<'a>) { match layout { Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let dst_reg = self.claim_general_reg(dst); - let src1_reg = self.load_to_general_reg(src1); - let src2_reg = self.load_to_general_reg(src2); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src1_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src1); + let src2_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src2); ASM::add_reg64_reg64_reg64(&mut self.buf, dst_reg, src1_reg, 
src2_reg); } Layout::Builtin(Builtin::Float(FloatWidth::F64)) => { - let dst_reg = self.claim_float_reg(dst); - let src1_reg = self.load_to_float_reg(src1); - let src2_reg = self.load_to_float_reg(src2); + let dst_reg = self.storage_manager.claim_float_reg(&mut self.buf, dst); + let src1_reg = self.storage_manager.load_to_float_reg(&mut self.buf, src1); + let src2_reg = self.storage_manager.load_to_float_reg(&mut self.buf, src2); ASM::add_freg64_freg64_freg64(&mut self.buf, dst_reg, src1_reg, src2_reg); } x => todo!("NumAdd: layout, {:?}", x), @@ -818,9 +674,13 @@ impl< fn build_num_mul(&mut self, dst: &Symbol, src1: &Symbol, src2: &Symbol, layout: &Layout<'a>) { match layout { Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let dst_reg = self.claim_general_reg(dst); - let src1_reg = self.load_to_general_reg(src1); - let src2_reg = self.load_to_general_reg(src2); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src1_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src1); + let src2_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src2); ASM::imul_reg64_reg64_reg64(&mut self.buf, dst_reg, src1_reg, src2_reg); } x => todo!("NumMul: layout, {:?}", x), @@ -830,8 +690,8 @@ impl< fn build_num_neg(&mut self, dst: &Symbol, src: &Symbol, layout: &Layout<'a>) { match layout { Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let dst_reg = self.claim_general_reg(dst); - let src_reg = self.load_to_general_reg(src); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src_reg = self.storage_manager.load_to_general_reg(&mut self.buf, src); ASM::neg_reg64_reg64(&mut self.buf, dst_reg, src_reg); } x => todo!("NumNeg: layout, {:?}", x), @@ -841,9 +701,13 @@ impl< fn build_num_sub(&mut self, dst: &Symbol, src1: &Symbol, src2: &Symbol, layout: &Layout<'a>) { match layout { Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let 
dst_reg = self.claim_general_reg(dst); - let src1_reg = self.load_to_general_reg(src1); - let src2_reg = self.load_to_general_reg(src2); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src1_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src1); + let src2_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src2); ASM::sub_reg64_reg64_reg64(&mut self.buf, dst_reg, src1_reg, src2_reg); } x => todo!("NumSub: layout, {:?}", x), @@ -853,9 +717,13 @@ impl< fn build_eq(&mut self, dst: &Symbol, src1: &Symbol, src2: &Symbol, arg_layout: &Layout<'a>) { match arg_layout { Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let dst_reg = self.claim_general_reg(dst); - let src1_reg = self.load_to_general_reg(src1); - let src2_reg = self.load_to_general_reg(src2); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src1_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src1); + let src2_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src2); ASM::eq_reg64_reg64_reg64(&mut self.buf, dst_reg, src1_reg, src2_reg); } x => todo!("NumEq: layout, {:?}", x), @@ -865,9 +733,13 @@ impl< fn build_neq(&mut self, dst: &Symbol, src1: &Symbol, src2: &Symbol, arg_layout: &Layout<'a>) { match arg_layout { Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let dst_reg = self.claim_general_reg(dst); - let src1_reg = self.load_to_general_reg(src1); - let src2_reg = self.load_to_general_reg(src2); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src1_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src1); + let src2_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src2); ASM::neq_reg64_reg64_reg64(&mut self.buf, dst_reg, src1_reg, src2_reg); } x => todo!("NumNeq: layout, {:?}", x), @@ -883,9 +755,13 @@ impl< ) { match arg_layout { 
Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let dst_reg = self.claim_general_reg(dst); - let src1_reg = self.load_to_general_reg(src1); - let src2_reg = self.load_to_general_reg(src2); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src1_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src1); + let src2_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src2); ASM::lt_reg64_reg64_reg64(&mut self.buf, dst_reg, src1_reg, src2_reg); } x => todo!("NumLt: layout, {:?}", x), @@ -899,48 +775,48 @@ impl< arg_layout: &Layout<'a>, ret_layout: &Layout<'a>, ) { - let dst_reg = self.claim_float_reg(dst); + let dst_reg = self.storage_manager.claim_float_reg(&mut self.buf, dst); match (arg_layout, ret_layout) { ( Layout::Builtin(Builtin::Int(IntWidth::I32 | IntWidth::I64)), Layout::Builtin(Builtin::Float(FloatWidth::F64)), ) => { - let src_reg = self.load_to_general_reg(src); + let src_reg = self.storage_manager.load_to_general_reg(&mut self.buf, src); ASM::to_float_freg64_reg64(&mut self.buf, dst_reg, src_reg); } ( Layout::Builtin(Builtin::Int(IntWidth::I32 | IntWidth::I64)), Layout::Builtin(Builtin::Float(FloatWidth::F32)), ) => { - let src_reg = self.load_to_general_reg(src); + let src_reg = self.storage_manager.load_to_general_reg(&mut self.buf, src); ASM::to_float_freg32_reg64(&mut self.buf, dst_reg, src_reg); } ( Layout::Builtin(Builtin::Float(FloatWidth::F64)), Layout::Builtin(Builtin::Float(FloatWidth::F32)), ) => { - let src_reg = self.load_to_float_reg(src); + let src_reg = self.storage_manager.load_to_float_reg(&mut self.buf, src); ASM::to_float_freg32_freg64(&mut self.buf, dst_reg, src_reg); } ( Layout::Builtin(Builtin::Float(FloatWidth::F32)), Layout::Builtin(Builtin::Float(FloatWidth::F64)), ) => { - let src_reg = self.load_to_float_reg(src); + let src_reg = self.storage_manager.load_to_float_reg(&mut self.buf, src); ASM::to_float_freg64_freg32(&mut self.buf, dst_reg, 
src_reg); } ( Layout::Builtin(Builtin::Float(FloatWidth::F64)), Layout::Builtin(Builtin::Float(FloatWidth::F64)), ) => { - let src_reg = self.load_to_float_reg(src); + let src_reg = self.storage_manager.load_to_float_reg(&mut self.buf, src); ASM::mov_freg64_freg64(&mut self.buf, dst_reg, src_reg); } ( Layout::Builtin(Builtin::Float(FloatWidth::F32)), Layout::Builtin(Builtin::Float(FloatWidth::F32)), ) => { - let src_reg = self.load_to_float_reg(src); + let src_reg = self.storage_manager.load_to_float_reg(&mut self.buf, src); ASM::mov_freg64_freg64(&mut self.buf, dst_reg, src_reg); } (a, r) => todo!("NumToFloat: layout, arg {:?}, ret {:?}", a, r), @@ -956,9 +832,13 @@ impl< ) { match arg_layout { Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let dst_reg = self.claim_general_reg(dst); - let src1_reg = self.load_to_general_reg(src1); - let src2_reg = self.load_to_general_reg(src2); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src1_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src1); + let src2_reg = self + .storage_manager + .load_to_general_reg(&mut self.buf, src2); ASM::gte_reg64_reg64_reg64(&mut self.buf, dst_reg, src1_reg, src2_reg); } x => todo!("NumGte: layout, {:?}", x), @@ -969,55 +849,14 @@ impl< // We may not strictly need an instruction here. // What's important is to load the value, and for src and dest to have different Layouts. 
// This is used for pointer math in refcounting and for pointer equality - let dst_reg = self.claim_general_reg(dst); - let src_reg = self.load_to_general_reg(src); + let dst_reg = self.storage_manager.claim_general_reg(&mut self.buf, dst); + let src_reg = self.storage_manager.load_to_general_reg(&mut self.buf, src); ASM::mov_reg64_reg64(&mut self.buf, dst_reg, src_reg); } fn create_struct(&mut self, sym: &Symbol, layout: &Layout<'a>, fields: &'a [Symbol]) { - let struct_size = layout.stack_size(TARGET_INFO); - - if let Layout::Struct(field_layouts) = layout { - if struct_size > 0 { - let offset = self.claim_stack_size(struct_size); - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset, - size: struct_size, - owned: true, - }, - ); - - let mut current_offset = offset; - for (field, field_layout) in fields.iter().zip(field_layouts.iter()) { - self.copy_symbol_to_stack_offset(current_offset, field, field_layout); - let field_size = field_layout.stack_size(TARGET_INFO); - current_offset += field_size as i32; - } - } else { - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset: 0, - size: 0, - owned: false, - }, - ); - } - } else { - // This is a single element struct. Just copy the single field to the stack. - let offset = self.claim_stack_size(struct_size); - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset, - size: struct_size, - owned: true, - }, - ); - self.copy_symbol_to_stack_offset(offset, &fields[0], layout); - } + self.storage_manager + .create_struct(&mut self.buf, sym, layout, fields); } fn load_struct_at_index( @@ -1027,23 +866,8 @@ impl< index: u64, field_layouts: &'a [Layout<'a>], ) { - if let Some(SymbolStorage::Base { offset, .. 
}) = self.symbol_storage_map.get(structure) { - let mut data_offset = *offset; - for i in 0..index { - let field_size = field_layouts[i as usize].stack_size(TARGET_INFO); - data_offset += field_size as i32; - } - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset: data_offset, - size: field_layouts[index as usize].stack_size(TARGET_INFO), - owned: false, - }, - ); - } else { - internal_error!("unknown struct: {:?}", structure); - } + self.storage_manager + .load_field_at_index(sym, structure, index, field_layouts); } fn load_literal(&mut self, sym: &Symbol, layout: &Layout<'a>, lit: &Literal<'a>) { @@ -1061,201 +885,76 @@ impl< | IntWidth::I64, )), ) => { - let reg = self.claim_general_reg(sym); + let reg = self.storage_manager.claim_general_reg(&mut self.buf, sym); let val = *x; ASM::mov_reg64_imm64(&mut self.buf, reg, val as i64); } (Literal::Float(x), Layout::Builtin(Builtin::Float(FloatWidth::F64))) => { - let reg = self.claim_float_reg(sym); + let reg = self.storage_manager.claim_float_reg(&mut self.buf, sym); let val = *x; ASM::mov_freg64_imm64(&mut self.buf, &mut self.relocs, reg, val); } (Literal::Float(x), Layout::Builtin(Builtin::Float(FloatWidth::F32))) => { - let reg = self.claim_float_reg(sym); + let reg = self.storage_manager.claim_float_reg(&mut self.buf, sym); let val = *x as f32; ASM::mov_freg32_imm32(&mut self.buf, &mut self.relocs, reg, val); } (Literal::Str(x), Layout::Builtin(Builtin::Str)) if x.len() < 16 => { // Load small string. 
- let reg = self.get_tmp_general_reg(); + self.storage_manager.with_tmp_general_reg( + &mut self.buf, + |storage_manager, buf, reg| { + let base_offset = storage_manager.claim_stack_area(sym, 16); + let mut bytes = [0; 16]; + bytes[..x.len()].copy_from_slice(x.as_bytes()); + bytes[15] = (x.len() as u8) | 0b1000_0000; - let offset = self.claim_stack_size(16); - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset, - size: 16, - owned: true, + let mut num_bytes = [0; 8]; + num_bytes.copy_from_slice(&bytes[..8]); + let num = i64::from_ne_bytes(num_bytes); + ASM::mov_reg64_imm64(buf, reg, num); + ASM::mov_base32_reg64(buf, base_offset, reg); + + num_bytes.copy_from_slice(&bytes[8..]); + let num = i64::from_ne_bytes(num_bytes); + ASM::mov_reg64_imm64(buf, reg, num); + ASM::mov_base32_reg64(buf, base_offset + 8, reg); }, ); - let mut bytes = [0; 16]; - bytes[..x.len()].copy_from_slice(x.as_bytes()); - bytes[15] = (x.len() as u8) | 0b1000_0000; - - let mut num_bytes = [0; 8]; - num_bytes.copy_from_slice(&bytes[..8]); - let num = i64::from_ne_bytes(num_bytes); - ASM::mov_reg64_imm64(&mut self.buf, reg, num); - ASM::mov_base32_reg64(&mut self.buf, offset, reg); - - num_bytes.copy_from_slice(&bytes[8..]); - let num = i64::from_ne_bytes(num_bytes); - ASM::mov_reg64_imm64(&mut self.buf, reg, num); - ASM::mov_base32_reg64(&mut self.buf, offset + 8, reg); } x => todo!("loading literal, {:?}", x), } } fn free_symbol(&mut self, sym: &Symbol) { - match self.symbol_storage_map.remove(sym) { - Some( - SymbolStorage::Base { - offset, - size, - owned: true, - } - | SymbolStorage::BaseAndGeneralReg { - offset, - size, - owned: true, - .. - } - | SymbolStorage::BaseAndFloatReg { - offset, - size, - owned: true, - .. - }, - ) => { - let loc = (offset, size); - // Note: this position current points to the offset following the specified location. - // If loc was inserted at this position, it would shift the data at this position over by 1. 
- let pos = self - .free_stack_chunks - .binary_search(&loc) - .unwrap_or_else(|e| e); - - // Check for overlap with previous and next free chunk. - let merge_with_prev = if pos > 0 { - if let Some((prev_offset, prev_size)) = self.free_stack_chunks.get(pos - 1) { - let prev_end = *prev_offset + *prev_size as i32; - if prev_end > offset { - internal_error!("Double free? A previously freed stack location overlaps with the currently freed stack location."); - } - prev_end == offset - } else { - false - } - } else { - false - }; - let merge_with_next = if let Some((next_offset, _)) = - self.free_stack_chunks.get(pos) - { - let current_end = offset + size as i32; - if current_end > *next_offset { - internal_error!("Double free? A previously freed stack location overlaps with the currently freed stack location."); - } - current_end == *next_offset - } else { - false - }; - - match (merge_with_prev, merge_with_next) { - (true, true) => { - let (prev_offset, prev_size) = self.free_stack_chunks[pos - 1]; - let (_, next_size) = self.free_stack_chunks[pos]; - self.free_stack_chunks[pos - 1] = - (prev_offset, prev_size + size + next_size); - self.free_stack_chunks.remove(pos); - } - (true, false) => { - let (prev_offset, prev_size) = self.free_stack_chunks[pos - 1]; - self.free_stack_chunks[pos - 1] = (prev_offset, prev_size + size); - } - (false, true) => { - let (_, next_size) = self.free_stack_chunks[pos]; - self.free_stack_chunks[pos] = (offset, next_size + size); - } - (false, false) => self.free_stack_chunks.insert(pos, loc), - } - } - Some(_) | None => {} - } - for i in 0..self.general_used_regs.len() { - let (reg, saved_sym) = self.general_used_regs[i]; - if saved_sym == *sym { - self.general_free_regs.push(reg); - self.general_used_regs.remove(i); - break; - } - } - for i in 0..self.float_used_regs.len() { - let (reg, saved_sym) = self.float_used_regs[i]; - if saved_sym == *sym { - self.float_free_regs.push(reg); - self.float_used_regs.remove(i); - break; - } - } + 
self.join_map.remove(&JoinPointId(*sym)); + self.storage_manager.free_symbol(sym); } fn return_symbol(&mut self, sym: &Symbol, layout: &Layout<'a>) { - let val = self.symbol_storage_map.get(sym); - match val { - Some(SymbolStorage::GeneralReg(reg)) if *reg == CC::GENERAL_RETURN_REGS[0] => {} - Some(SymbolStorage::GeneralReg(reg)) => { - // If it fits in a general purpose register, just copy it over to. - // Technically this can be optimized to produce shorter instructions if less than 64bits. - ASM::mov_reg64_reg64(&mut self.buf, CC::GENERAL_RETURN_REGS[0], *reg); - } - Some(SymbolStorage::FloatReg(reg)) if *reg == CC::FLOAT_RETURN_REGS[0] => {} - Some(SymbolStorage::FloatReg(reg)) => { - ASM::mov_freg64_freg64(&mut self.buf, CC::FLOAT_RETURN_REGS[0], *reg); - } - Some(SymbolStorage::Base { offset, size, .. }) => match layout { - Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - ASM::mov_reg64_base32(&mut self.buf, CC::GENERAL_RETURN_REGS[0], *offset); + if self.storage_manager.is_stored_primitive(sym) { + // Just load it to the correct type of reg as a stand alone value. + match layout { + single_register_integers!() => { + self.storage_manager.load_to_specified_general_reg( + &mut self.buf, + sym, + CC::GENERAL_RETURN_REGS[0], + ); } - Layout::Builtin(Builtin::Float(FloatWidth::F64)) => { - ASM::mov_freg64_base32(&mut self.buf, CC::FLOAT_RETURN_REGS[0], *offset); + single_register_floats!() => { + self.storage_manager.load_to_specified_float_reg( + &mut self.buf, + sym, + CC::FLOAT_RETURN_REGS[0], + ); } - Layout::Builtin(Builtin::Str) => { - if self.symbol_storage_map.contains_key(&Symbol::RET_POINTER) { - // This will happen on windows, return via pointer here. 
- todo!("Returning strings via pointer"); - } else { - ASM::mov_reg64_base32(&mut self.buf, CC::GENERAL_RETURN_REGS[0], *offset); - ASM::mov_reg64_base32( - &mut self.buf, - CC::GENERAL_RETURN_REGS[1], - *offset + 8, - ); - } + _ => { + internal_error!("All primitive valuse should fit in a single register"); } - Layout::Struct(field_layouts) => { - let (offset, size) = (*offset, *size); - // Nothing to do for empty struct - if size > 0 { - let ret_reg = if self.symbol_storage_map.contains_key(&Symbol::RET_POINTER) - { - Some(self.load_to_general_reg(&Symbol::RET_POINTER)) - } else { - None - }; - CC::return_struct(&mut self.buf, offset, size, field_layouts, ret_reg); - } - } - x => todo!("returning symbol with layout, {:?}", x), - }, - Some(x) => todo!("returning symbol storage, {:?}", x), - None if layout == &Layout::Struct(&[]) => { - // Empty struct is not defined and does nothing. - } - None => { - internal_error!("Unknown return symbol: {:?}", sym); } + } else { + CC::return_complex_symbol(&mut self.buf, &mut self.storage_manager, sym, layout) } let inst_loc = self.buf.len() as u64; let offset = ASM::jmp_imm32(&mut self.buf, 0x1234_5678) as u64; @@ -1274,343 +973,10 @@ impl< FloatReg: RegTrait, GeneralReg: RegTrait, ASM: Assembler, - CC: CallConv, + CC: CallConv, > Backend64Bit<'a, GeneralReg, FloatReg, ASM, CC> { - fn get_tmp_general_reg(&mut self) -> GeneralReg { - if !self.general_free_regs.is_empty() { - let free_reg = *self - .general_free_regs - .get(self.general_free_regs.len() - 1) - .unwrap(); - if CC::general_callee_saved(&free_reg) { - self.general_used_callee_saved_regs.insert(free_reg); - } - free_reg - } else if !self.general_used_regs.is_empty() { - let (reg, sym) = self.general_used_regs.remove(0); - self.free_to_stack(&sym); - reg - } else { - internal_error!("completely out of general purpose registers"); - } - } - - fn claim_general_reg(&mut self, sym: &Symbol) -> GeneralReg { - let reg = if !self.general_free_regs.is_empty() { - let 
free_reg = self.general_free_regs.pop().unwrap(); - if CC::general_callee_saved(&free_reg) { - self.general_used_callee_saved_regs.insert(free_reg); - } - free_reg - } else if !self.general_used_regs.is_empty() { - let (reg, sym) = self.general_used_regs.remove(0); - self.free_to_stack(&sym); - reg - } else { - internal_error!("completely out of general purpose registers"); - }; - - self.general_used_regs.push((reg, *sym)); - self.symbol_storage_map - .insert(*sym, SymbolStorage::GeneralReg(reg)); - reg - } - - fn claim_float_reg(&mut self, sym: &Symbol) -> FloatReg { - let reg = if !self.float_free_regs.is_empty() { - let free_reg = self.float_free_regs.pop().unwrap(); - if CC::float_callee_saved(&free_reg) { - self.float_used_callee_saved_regs.insert(free_reg); - } - free_reg - } else if !self.float_used_regs.is_empty() { - let (reg, sym) = self.float_used_regs.remove(0); - self.free_to_stack(&sym); - reg - } else { - internal_error!("completely out of floating point registers"); - }; - - self.float_used_regs.push((reg, *sym)); - self.symbol_storage_map - .insert(*sym, SymbolStorage::FloatReg(reg)); - reg - } - - fn load_to_general_reg(&mut self, sym: &Symbol) -> GeneralReg { - let val = self.symbol_storage_map.remove(sym); - match val { - Some(SymbolStorage::GeneralReg(reg)) => { - self.symbol_storage_map - .insert(*sym, SymbolStorage::GeneralReg(reg)); - reg - } - Some(SymbolStorage::Base { - offset, - size, - owned, - }) => { - let reg = self.claim_general_reg(sym); - self.symbol_storage_map.insert( - *sym, - SymbolStorage::BaseAndGeneralReg { - reg, - offset, - size, - owned, - }, - ); - ASM::mov_reg64_base32(&mut self.buf, reg, offset as i32); - reg - } - Some(SymbolStorage::BaseAndGeneralReg { - reg, - offset, - size, - owned, - }) => { - self.symbol_storage_map.insert( - *sym, - SymbolStorage::BaseAndGeneralReg { - reg, - offset, - size, - owned, - }, - ); - reg - } - Some(SymbolStorage::FloatReg(_)) | Some(SymbolStorage::BaseAndFloatReg { .. 
}) => { - internal_error!("Cannot load floating point symbol into GeneralReg") - } - None => internal_error!("Unknown symbol: {}", sym), - } - } - - fn load_to_float_reg(&mut self, sym: &Symbol) -> FloatReg { - let val = self.symbol_storage_map.remove(sym); - match val { - Some(SymbolStorage::FloatReg(reg)) => { - self.symbol_storage_map - .insert(*sym, SymbolStorage::FloatReg(reg)); - reg - } - Some(SymbolStorage::Base { - offset, - size, - owned, - }) => { - let reg = self.claim_float_reg(sym); - self.symbol_storage_map.insert( - *sym, - SymbolStorage::BaseAndFloatReg { - reg, - offset, - size, - owned, - }, - ); - ASM::mov_freg64_base32(&mut self.buf, reg, offset as i32); - reg - } - Some(SymbolStorage::BaseAndFloatReg { - reg, - offset, - size, - owned, - }) => { - self.symbol_storage_map.insert( - *sym, - SymbolStorage::BaseAndFloatReg { - reg, - offset, - size, - owned, - }, - ); - reg - } - Some(SymbolStorage::GeneralReg(_)) | Some(SymbolStorage::BaseAndGeneralReg { .. }) => { - internal_error!("Cannot load integer symbol into FloatReg") - } - None => internal_error!("Unknown symbol: {}", sym), - } - } - - fn free_to_stack(&mut self, sym: &Symbol) { - let val = self.symbol_storage_map.remove(sym); - match val { - Some(SymbolStorage::GeneralReg(reg)) => { - let offset = self.claim_stack_size(8); - // For base addressing, use the negative offset - 8. - ASM::mov_base32_reg64(&mut self.buf, offset, reg); - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset, - size: 8, - owned: true, - }, - ); - } - Some(SymbolStorage::FloatReg(reg)) => { - let offset = self.claim_stack_size(8); - // For base addressing, use the negative offset. 
- ASM::mov_base32_freg64(&mut self.buf, offset, reg); - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset, - size: 8, - owned: true, - }, - ); - } - Some(SymbolStorage::Base { - offset, - size, - owned, - }) => { - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset, - size, - owned, - }, - ); - } - Some(SymbolStorage::BaseAndGeneralReg { - offset, - size, - owned, - .. - }) => { - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset, - size, - owned, - }, - ); - } - Some(SymbolStorage::BaseAndFloatReg { - offset, - size, - owned, - .. - }) => { - self.symbol_storage_map.insert( - *sym, - SymbolStorage::Base { - offset, - size, - owned, - }, - ); - } - None => internal_error!("Unknown symbol: {}", sym), - } - } - - /// claim_stack_size claims `amount` bytes from the stack. - /// This may be free space in the stack or result in increasing the stack size. - /// It returns base pointer relative offset of the new data. - fn claim_stack_size(&mut self, amount: u32) -> i32 { - debug_assert!(amount > 0); - if let Some(fitting_chunk) = self - .free_stack_chunks - .iter() - .enumerate() - .filter(|(_, (_, size))| *size >= amount) - .min_by_key(|(_, (_, size))| size) - { - let (pos, (offset, size)) = fitting_chunk; - let (offset, size) = (*offset, *size); - if size == amount { - self.free_stack_chunks.remove(pos); - offset - } else { - let (prev_offset, prev_size) = self.free_stack_chunks[pos]; - self.free_stack_chunks[pos] = (prev_offset + amount as i32, prev_size - amount); - prev_offset - } - } else if let Some(new_size) = self.stack_size.checked_add(amount) { - // Since stack size is u32, but the max offset is i32, if we pass i32 max, we have overflowed. 
- if new_size > i32::MAX as u32 { - internal_error!("Ran out of stack space"); - } else { - self.stack_size = new_size; - -(self.stack_size as i32) - } - } else { - internal_error!("Ran out of stack space"); - } - } - - fn copy_symbol_to_stack_offset(&mut self, to_offset: i32, sym: &Symbol, layout: &Layout<'a>) { - match layout { - Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { - let reg = self.load_to_general_reg(sym); - ASM::mov_base32_reg64(&mut self.buf, to_offset, reg); - } - Layout::Builtin(Builtin::Float(FloatWidth::F64)) => { - let reg = self.load_to_float_reg(sym); - ASM::mov_base32_freg64(&mut self.buf, to_offset, reg); - } - Layout::Struct(_) if layout.safe_to_memcpy() => { - let tmp_reg = self.get_tmp_general_reg(); - if let Some(SymbolStorage::Base { - offset: from_offset, - size, - .. - }) = self.symbol_storage_map.get(sym) - { - debug_assert_eq!( - *size, - layout.stack_size(TARGET_INFO), - "expected struct to have same size as data being stored in it" - ); - for i in 0..layout.stack_size(TARGET_INFO) as i32 { - ASM::mov_reg64_base32(&mut self.buf, tmp_reg, from_offset + i); - ASM::mov_base32_reg64(&mut self.buf, to_offset + i, tmp_reg); - } - } else { - internal_error!("unknown struct: {:?}", sym); - } - } - x => todo!("copying data to the stack with layout, {:?}", x), - } - } - - fn push_used_caller_saved_regs_to_stack(&mut self) { - let old_general_used_regs = std::mem::replace( - &mut self.general_used_regs, - bumpalo::vec![in self.env.arena], - ); - for (reg, saved_sym) in old_general_used_regs.into_iter() { - if CC::general_caller_saved(&reg) { - self.general_free_regs.push(reg); - self.free_to_stack(&saved_sym); - } else { - self.general_used_regs.push((reg, saved_sym)); - } - } - let old_float_used_regs = - std::mem::replace(&mut self.float_used_regs, bumpalo::vec![in self.env.arena]); - for (reg, saved_sym) in old_float_used_regs.into_iter() { - if CC::float_caller_saved(&reg) { - self.float_free_regs.push(reg); -
self.free_to_stack(&saved_sym); - } else { - self.float_used_regs.push((reg, saved_sym)); - } - } - } - - // Updates a jump instruction to a new offset and returns the number of bytes written. + /// Updates a jump instruction to a new offset and returns the number of bytes written. fn update_jmp_imm32_offset( &mut self, tmp: &mut Vec<'a, u8>, @@ -1654,7 +1020,7 @@ macro_rules! single_register_floats { } #[macro_export] -macro_rules! single_register_builtins { +macro_rules! single_register_layouts { () => { single_register_integers!() | single_register_floats!() }; diff --git a/compiler/gen_dev/src/generic64/storage.rs b/compiler/gen_dev/src/generic64/storage.rs new file mode 100644 index 0000000000..93c3734073 --- /dev/null +++ b/compiler/gen_dev/src/generic64/storage.rs @@ -0,0 +1,1084 @@ +use crate::{ + generic64::{Assembler, CallConv, RegTrait}, + single_register_floats, single_register_integers, single_register_layouts, Env, +}; +use bumpalo::collections::Vec; +use roc_builtins::bitcode::{FloatWidth, IntWidth}; +use roc_collections::all::{MutMap, MutSet}; +use roc_error_macros::internal_error; +use roc_module::symbol::Symbol; +use roc_mono::{ + ir::{JoinPointId, Param}, + layout::{Builtin, Layout}, +}; +use roc_target::TargetInfo; +use std::cmp::max; +use std::marker::PhantomData; +use std::rc::Rc; + +use RegStorage::*; +use StackStorage::*; +use Storage::*; + +#[derive(Copy, Clone, Debug, PartialEq, Eq)] +enum RegStorage<GeneralReg: RegTrait, FloatReg: RegTrait> { + General(GeneralReg), + Float(FloatReg), +} + +#[derive(Copy, Clone, Debug, PartialEq, Eq)] +enum StackStorage<GeneralReg: RegTrait, FloatReg: RegTrait> { + /// Primitives are 8 bytes or less. They generally live in registers but can be moved to the stack. + /// Their data, when on the stack, must always be 8 byte aligned and will be moved as a block. + /// They are never part of a struct, union, or more complex value. + /// The rest of the bytes should be the sign extension due to how these are loaded. + Primitive { + // Offset from the base pointer in bytes.
+ base_offset: i32, + // Optional register also holding the value. + reg: Option<RegStorage<GeneralReg, FloatReg>>, + }, + /// Referenced Primitives are primitives within complex structures. + /// They have no guarantees about the bits around them and cannot simply be loaded as an 8 byte value. + /// For example, a U8 in a struct must be loaded as a single byte and sign extended. + /// If it was loaded as an 8 byte value, a bunch of garbage data would be loaded with the U8. + /// After loading, they should just be stored in a register, removing the reference. + ReferencedPrimitive { + // Offset from the base pointer in bytes. + base_offset: i32, + // Size on the stack in bytes. + size: u32, + }, + /// Complex data (lists, unions, structs, str) stored on the stack. + /// Note, this is also used for referencing a value within a struct/union. + /// It has no alignment guarantees. + /// When a primitive value is being loaded from this, it should be moved into a register. + /// To start, the primitive can just be loaded as a ReferencedPrimitive. + Complex { + // Offset from the base pointer in bytes. + base_offset: i32, + // Size on the stack in bytes. + size: u32, + // TODO: investigate if storing a reg here for special values is worth it. + // For example, the ptr in list.get/list.set + // Instead, it would probably be better to change the incoming IR to load the pointer once and then use it multiple times. + }, +} + +#[derive(Copy, Clone, Debug, PartialEq, Eq)] +enum Storage<GeneralReg: RegTrait, FloatReg: RegTrait> { + Reg(RegStorage<GeneralReg, FloatReg>), + Stack(StackStorage<GeneralReg, FloatReg>), + NoData, +} + +pub struct StorageManager< + 'a, + GeneralReg: RegTrait, + FloatReg: RegTrait, + ASM: Assembler<GeneralReg, FloatReg>, + CC: CallConv<GeneralReg, FloatReg, ASM>, +> { + phantom_cc: PhantomData<CC>, + phantom_asm: PhantomData<ASM>, + env: &'a Env<'a>, + target_info: TargetInfo, + // Data about where each symbol is stored. + symbol_storage_map: MutMap<Symbol, Storage<GeneralReg, FloatReg>>, + + // A map from symbol to its owning allocation. + // This is only used for complex data on the stack and its references.
+ // In the case that subdata is still referenced from an overall structure, + // we can't free the entire structure until the subdata is no longer needed. + // If a symbol has only one reference, we can free it. + allocation_map: MutMap<Symbol, Rc<(i32, u32)>>, + + // The storage for parameters of a join point. + // When jumping to the join point, the parameters should be set up to match this. + join_param_map: MutMap<JoinPointId, Vec<'a, Param<'a>>>, + + // This should probably be smarter than a vec. + // There are certain registers we should always use first. With pushing and popping, this could get mixed. + general_free_regs: Vec<'a, GeneralReg>, + float_free_regs: Vec<'a, FloatReg>, + + // The last major thing we need is a way to decide what reg to free when all of them are full. + // Theoretically we want a basic LRU cache for the currently loaded symbols. + // For now just a vec of used registers and the symbols they contain. + general_used_regs: Vec<'a, (GeneralReg, Symbol)>, + float_used_regs: Vec<'a, (FloatReg, Symbol)>, + + // TODO: it probably would be faster to make these a list that linearly scans rather than hashing. + // Used callee-saved regs must be tracked for pushing and popping at the beginning/end of the function. + general_used_callee_saved_regs: MutSet<GeneralReg>, + float_used_callee_saved_regs: MutSet<FloatReg>, + + free_stack_chunks: Vec<'a, (i32, u32)>, + stack_size: u32, + + // The amount of extra stack space needed to pass args for function calling.
+ fn_call_stack_size: u32, +} + +pub fn new_storage_manager< + 'a, + GeneralReg: RegTrait, + FloatReg: RegTrait, + ASM: Assembler<GeneralReg, FloatReg>, + CC: CallConv<GeneralReg, FloatReg, ASM>, +>( + env: &'a Env<'a>, + target_info: TargetInfo, +) -> StorageManager<'a, GeneralReg, FloatReg, ASM, CC> { + StorageManager { + phantom_asm: PhantomData, + phantom_cc: PhantomData, + env, + target_info, + symbol_storage_map: MutMap::default(), + allocation_map: MutMap::default(), + join_param_map: MutMap::default(), + general_free_regs: bumpalo::vec![in env.arena], + general_used_regs: bumpalo::vec![in env.arena], + general_used_callee_saved_regs: MutSet::default(), + float_free_regs: bumpalo::vec![in env.arena], + float_used_regs: bumpalo::vec![in env.arena], + float_used_callee_saved_regs: MutSet::default(), + free_stack_chunks: bumpalo::vec![in env.arena], + stack_size: 0, + fn_call_stack_size: 0, + } +} + +impl< + 'a, + FloatReg: RegTrait, + GeneralReg: RegTrait, + ASM: Assembler<GeneralReg, FloatReg>, + CC: CallConv<GeneralReg, FloatReg, ASM>, + > StorageManager<'a, GeneralReg, FloatReg, ASM, CC> +{ + pub fn reset(&mut self) { + self.symbol_storage_map.clear(); + self.allocation_map.clear(); + self.join_param_map.clear(); + self.general_used_callee_saved_regs.clear(); + self.general_free_regs.clear(); + self.general_used_regs.clear(); + self.general_free_regs + .extend_from_slice(CC::GENERAL_DEFAULT_FREE_REGS); + self.float_used_callee_saved_regs.clear(); + self.float_free_regs.clear(); + self.float_used_regs.clear(); + self.float_free_regs + .extend_from_slice(CC::FLOAT_DEFAULT_FREE_REGS); + self.free_stack_chunks.clear(); + self.stack_size = 0; + self.fn_call_stack_size = 0; + } + + pub fn stack_size(&self) -> u32 { + self.stack_size + } + + pub fn fn_call_stack_size(&self) -> u32 { + self.fn_call_stack_size + } + + pub fn general_used_callee_saved_regs(&self) -> Vec<'a, GeneralReg> { + let mut used_regs = bumpalo::vec![in self.env.arena]; + used_regs.extend(&self.general_used_callee_saved_regs); + used_regs + } + + pub fn float_used_callee_saved_regs(&self) ->
Vec<'a, FloatReg> { + let mut used_regs = bumpalo::vec![in self.env.arena]; + used_regs.extend(&self.float_used_callee_saved_regs); + used_regs + } + + /// Returns true if the symbol is storing a primitive value. + pub fn is_stored_primitive(&self, sym: &Symbol) -> bool { + matches!( + self.get_storage_for_sym(sym), + Reg(_) | Stack(Primitive { .. } | ReferencedPrimitive { .. }) + ) + } + + /// Get a general register from the free list. + /// Will free data to the stack if necessary to get the register. + fn get_general_reg(&mut self, buf: &mut Vec<'a, u8>) -> GeneralReg { + if let Some(reg) = self.general_free_regs.pop() { + if CC::general_callee_saved(&reg) { + self.general_used_callee_saved_regs.insert(reg); + } + reg + } else if !self.general_used_regs.is_empty() { + let (reg, sym) = self.general_used_regs.remove(0); + self.free_to_stack(buf, &sym, General(reg)); + reg + } else { + internal_error!("completely out of general purpose registers"); + } + } + + /// Get a float register from the free list. + /// Will free data to the stack if necessary to get the register. + fn get_float_reg(&mut self, buf: &mut Vec<'a, u8>) -> FloatReg { + if let Some(reg) = self.float_free_regs.pop() { + if CC::float_callee_saved(&reg) { + self.float_used_callee_saved_regs.insert(reg); + } + reg + } else if !self.float_used_regs.is_empty() { + let (reg, sym) = self.float_used_regs.remove(0); + self.free_to_stack(buf, &sym, Float(reg)); + reg + } else { + internal_error!("completely out of floating point registers"); + } + } + + /// Claims a general reg for a specific symbol. + /// The symbol should not already have storage. + pub fn claim_general_reg(&mut self, buf: &mut Vec<'a, u8>, sym: &Symbol) -> GeneralReg { + debug_assert_eq!(self.symbol_storage_map.get(sym), None); + let reg = self.get_general_reg(buf); + self.general_used_regs.push((reg, *sym)); + self.symbol_storage_map.insert(*sym, Reg(General(reg))); + reg + } + + /// Claims a float reg for a specific symbol.
+ /// The symbol should not already have storage. + pub fn claim_float_reg(&mut self, buf: &mut Vec<'a, u8>, sym: &Symbol) -> FloatReg { + debug_assert_eq!(self.symbol_storage_map.get(sym), None); + let reg = self.get_float_reg(buf); + self.float_used_regs.push((reg, *sym)); + self.symbol_storage_map.insert(*sym, Reg(Float(reg))); + reg + } + + /// This claims a temporary general register and enables it to be used in the passed-in function. + /// Temporary registers are not safe across call instructions. + pub fn with_tmp_general_reg<F: FnOnce(&mut Self, &mut Vec<'a, u8>, GeneralReg)>( + &mut self, + buf: &mut Vec<'a, u8>, + callback: F, + ) { + let reg = self.get_general_reg(buf); + callback(self, buf, reg); + self.general_free_regs.push(reg); + } + + #[allow(dead_code)] + /// This claims a temporary float register and enables it to be used in the passed-in function. + /// Temporary registers are not safe across call instructions. + pub fn with_tmp_float_reg<F: FnOnce(&mut Self, &mut Vec<'a, u8>, FloatReg)>( + &mut self, + buf: &mut Vec<'a, u8>, + callback: F, + ) { + let reg = self.get_float_reg(buf); + callback(self, buf, reg); + self.float_free_regs.push(reg); + } + + /// Loads a symbol into a general reg and returns that register. + /// The symbol must already be stored somewhere. + /// Will fail on values stored in float regs. + /// Will fail for values that don't fit in a single register. + pub fn load_to_general_reg(&mut self, buf: &mut Vec<'a, u8>, sym: &Symbol) -> GeneralReg { + let storage = self.remove_storage_for_sym(sym); + match storage { + Reg(General(reg)) + | Stack(Primitive { + reg: Some(General(reg)), + .. + }) => { + self.symbol_storage_map.insert(*sym, storage); + reg + } + Reg(Float(_)) + | Stack(Primitive { + reg: Some(Float(_)), + ..
+ }) => { + internal_error!("Cannot load floating point symbol into GeneralReg: {}", sym) + } + Stack(Primitive { + reg: None, + base_offset, + }) => { + debug_assert_eq!(base_offset % 8, 0); + let reg = self.get_general_reg(buf); + ASM::mov_reg64_base32(buf, reg, base_offset); + self.general_used_regs.push((reg, *sym)); + self.symbol_storage_map.insert( + *sym, + Stack(Primitive { + base_offset, + reg: Some(General(reg)), + }), + ); + reg + } + Stack(ReferencedPrimitive { base_offset, size }) + if base_offset % 8 == 0 && size == 8 => + { + // The primitive is aligned and the data is exactly 8 bytes, treat it like regular stack. + let reg = self.get_general_reg(buf); + ASM::mov_reg64_base32(buf, reg, base_offset); + self.general_used_regs.push((reg, *sym)); + self.symbol_storage_map.insert(*sym, Reg(General(reg))); + self.free_reference(sym); + reg + } + Stack(ReferencedPrimitive { .. }) => { + todo!("loading referenced primitives") + } + Stack(Complex { .. }) => { + internal_error!("Cannot load large values into general registers: {}", sym) + } + NoData => { + internal_error!("Cannot load no data into general registers: {}", sym) + } + } + } + + /// Loads a symbol into a float reg and returns that register. + /// The symbol must already be stored somewhere. + /// Will fail on values stored in general regs. + /// Will fail for values that don't fit in a single register. + pub fn load_to_float_reg(&mut self, buf: &mut Vec<'a, u8>, sym: &Symbol) -> FloatReg { + let storage = self.remove_storage_for_sym(sym); + match storage { + Reg(Float(reg)) + | Stack(Primitive { + reg: Some(Float(reg)), + .. + }) => { + self.symbol_storage_map.insert(*sym, storage); + reg + } + Reg(General(_)) + | Stack(Primitive { + reg: Some(General(_)), + .. 
+ }) => { + internal_error!("Cannot load general symbol into FloatReg: {}", sym) + } + Stack(Primitive { + reg: None, + base_offset, + }) => { + debug_assert_eq!(base_offset % 8, 0); + let reg = self.get_float_reg(buf); + ASM::mov_freg64_base32(buf, reg, base_offset); + self.float_used_regs.push((reg, *sym)); + self.symbol_storage_map.insert( + *sym, + Stack(Primitive { + base_offset, + reg: Some(Float(reg)), + }), + ); + reg + } + Stack(ReferencedPrimitive { base_offset, size }) + if base_offset % 8 == 0 && size == 8 => + { + // The primitive is aligned and the data is exactly 8 bytes, treat it like regular stack. + let reg = self.get_float_reg(buf); + ASM::mov_freg64_base32(buf, reg, base_offset); + self.float_used_regs.push((reg, *sym)); + self.symbol_storage_map.insert(*sym, Reg(Float(reg))); + self.free_reference(sym); + reg + } + Stack(ReferencedPrimitive { .. }) => { + todo!("loading referenced primitives") + } + Stack(Complex { .. }) => { + internal_error!("Cannot load large values into float registers: {}", sym) + } + NoData => { + internal_error!("Cannot load no data into general registers: {}", sym) + } + } + } + + /// Loads the symbol to the specified register. + /// It will fail if the symbol is stored in a float register. + /// This is only made to be used in special cases where exact regs are needed (function args and returns). + /// It will not try to free the register first. + /// This will not track the symbol change (it makes no assumptions about the new reg). + pub fn load_to_specified_general_reg( + &self, + buf: &mut Vec<'a, u8>, + sym: &Symbol, + reg: GeneralReg, + ) { + match self.get_storage_for_sym(sym) { + Reg(General(old_reg)) + | Stack(Primitive { + reg: Some(General(old_reg)), + .. + }) => { + if *old_reg == reg { + return; + } + ASM::mov_reg64_reg64(buf, reg, *old_reg); + } + Reg(Float(_)) + | Stack(Primitive { + reg: Some(Float(_)), + .. 
+ }) => { + internal_error!("Cannot load floating point symbol into GeneralReg: {}", sym) + } + Stack(Primitive { + reg: None, + base_offset, + }) => { + debug_assert_eq!(base_offset % 8, 0); + ASM::mov_reg64_base32(buf, reg, *base_offset); + } + Stack(ReferencedPrimitive { base_offset, size }) + if base_offset % 8 == 0 && *size == 8 => + { + // The primitive is aligned and the data is exactly 8 bytes, treat it like regular stack. + ASM::mov_reg64_base32(buf, reg, *base_offset); + } + Stack(ReferencedPrimitive { .. }) => { + todo!("loading referenced primitives") + } + Stack(Complex { .. }) => { + internal_error!("Cannot load large values into general registers: {}", sym) + } + NoData => { + internal_error!("Cannot load no data into general registers: {}", sym) + } + } + } + + /// Loads the symbol to the specified register. + /// It will fail if the symbol is stored in a general register. + /// This is only made to be used in special cases where exact regs are needed (function args and returns). + /// It will not try to free the register first. + /// This will not track the symbol change (it makes no assumptions about the new reg). + pub fn load_to_specified_float_reg(&self, buf: &mut Vec<'a, u8>, sym: &Symbol, reg: FloatReg) { + match self.get_storage_for_sym(sym) { + Reg(Float(old_reg)) + | Stack(Primitive { + reg: Some(Float(old_reg)), + .. + }) => { + if *old_reg == reg { + return; + } + ASM::mov_freg64_freg64(buf, reg, *old_reg); + } + Reg(General(_)) + | Stack(Primitive { + reg: Some(General(_)), + .. + }) => { + internal_error!("Cannot load general symbol into FloatReg: {}", sym) + } + Stack(Primitive { + reg: None, + base_offset, + }) => { + debug_assert_eq!(base_offset % 8, 0); + ASM::mov_freg64_base32(buf, reg, *base_offset); + } + Stack(ReferencedPrimitive { base_offset, size }) + if base_offset % 8 == 0 && *size == 8 => + { + // The primitive is aligned and the data is exactly 8 bytes, treat it like regular stack. 
+                ASM::mov_freg64_base32(buf, reg, *base_offset);
+            }
+            Stack(ReferencedPrimitive { .. }) => {
+                todo!("loading referenced primitives")
+            }
+            Stack(Complex { .. }) => {
+                internal_error!("Cannot load large values into float registers: {}", sym)
+            }
+            NoData => {
+                internal_error!("Cannot load no data into float registers: {}", sym)
+            }
+        }
+    }
+
+    /// Loads a field from a struct or tag union.
+    /// This is lazy by default. It will not copy anything around.
+    pub fn load_field_at_index(
+        &mut self,
+        sym: &Symbol,
+        structure: &Symbol,
+        index: u64,
+        field_layouts: &'a [Layout<'a>],
+    ) {
+        debug_assert!(index < field_layouts.len() as u64);
+        // This must be removed and reinserted for ownership and mutability reasons.
+        let owned_data = if let Some(owned_data) = self.allocation_map.remove(structure) {
+            owned_data
+        } else {
+            internal_error!("Unknown symbol: {}", structure);
+        };
+        self.allocation_map
+            .insert(*structure, Rc::clone(&owned_data));
+        match self.get_storage_for_sym(structure) {
+            Stack(Complex { base_offset, size }) => {
+                let (base_offset, size) = (*base_offset, *size);
+                let mut data_offset = base_offset;
+                for layout in field_layouts.iter().take(index as usize) {
+                    let field_size = layout.stack_size(self.target_info);
+                    data_offset += field_size as i32;
+                }
+                debug_assert!(data_offset < base_offset + size as i32);
+                self.allocation_map.insert(*sym, owned_data);
+                let layout = field_layouts[index as usize];
+                let size = layout.stack_size(self.target_info);
+                self.symbol_storage_map.insert(
+                    *sym,
+                    Stack(if is_primitive(&layout) {
+                        ReferencedPrimitive {
+                            base_offset: data_offset,
+                            size,
+                        }
+                    } else {
+                        Complex {
+                            base_offset: data_offset,
+                            size,
+                        }
+                    }),
+                );
+            }
+            storage => {
+                internal_error!(
+                    "Cannot load field from data with storage type: {:?}",
+                    storage
+                );
+            }
+        }
+    }
+
+    /// Creates a struct on the stack, moving the data in fields into the struct.
+    pub fn create_struct(
+        &mut self,
+        buf: &mut Vec<'a, u8>,
+        sym: &Symbol,
+        layout: &Layout<'a>,
+        fields: &'a [Symbol],
+    ) {
+        let struct_size = layout.stack_size(self.target_info);
+        if struct_size == 0 {
+            self.symbol_storage_map.insert(*sym, NoData);
+            return;
+        }
+        let base_offset = self.claim_stack_area(sym, struct_size);
+
+        if let Layout::Struct(field_layouts) = layout {
+            let mut current_offset = base_offset;
+            for (field, field_layout) in fields.iter().zip(field_layouts.iter()) {
+                self.copy_symbol_to_stack_offset(buf, current_offset, field, field_layout);
+                let field_size = field_layout.stack_size(self.target_info);
+                current_offset += field_size as i32;
+            }
+        } else {
+            // This is a single-element struct. Just copy the single field to the stack.
+            debug_assert_eq!(fields.len(), 1);
+            self.copy_symbol_to_stack_offset(buf, base_offset, &fields[0], layout);
+        }
+    }
+
+    /// Copies a symbol to the specified stack offset. This is used for things like filling structs.
+    /// The offset is not guaranteed to be perfectly aligned; it follows Roc's alignment plan.
+    /// This means that, for example, two I32s might be back to back on the stack.
+    /// Always interact with the stack using aligned 64-bit movement.
+ fn copy_symbol_to_stack_offset( + &mut self, + buf: &mut Vec<'a, u8>, + to_offset: i32, + sym: &Symbol, + layout: &Layout<'a>, + ) { + match layout { + Layout::Builtin(Builtin::Int(IntWidth::I64 | IntWidth::U64)) => { + debug_assert_eq!(to_offset % 8, 0); + let reg = self.load_to_general_reg(buf, sym); + ASM::mov_base32_reg64(buf, to_offset, reg); + } + Layout::Builtin(Builtin::Float(FloatWidth::F64)) => { + debug_assert_eq!(to_offset % 8, 0); + let reg = self.load_to_float_reg(buf, sym); + ASM::mov_base32_freg64(buf, to_offset, reg); + } + // Layout::Struct(_) if layout.safe_to_memcpy() => { + // // self.storage_manager.with_tmp_float_reg(&mut self.buf, |buf, storage, ) + // // if let Some(SymbolStorage::Base { + // // offset: from_offset, + // // size, + // // .. + // // }) = self.symbol_storage_map.get(sym) + // // { + // // debug_assert_eq!( + // // *size, + // // layout.stack_size(self.target_info), + // // "expected struct to have same size as data being stored in it" + // // ); + // // for i in 0..layout.stack_size(self.target_info) as i32 { + // // ASM::mov_reg64_base32(&mut self.buf, tmp_reg, from_offset + i); + // // ASM::mov_base32_reg64(&mut self.buf, to_offset + i, tmp_reg); + // // } + // todo!() + // } else { + // internal_error!("unknown struct: {:?}", sym); + // } + // } + x => todo!("copying data to the stack with layout, {:?}", x), + } + } + + /// Ensures that a register is free. If it is not free, data will be moved to make it free. 
+    fn ensure_reg_free(
+        &mut self,
+        buf: &mut Vec<'a, u8>,
+        wanted_reg: RegStorage<GeneralReg, FloatReg>,
+    ) {
+        match wanted_reg {
+            General(reg) => {
+                if self.general_free_regs.contains(&reg) {
+                    return;
+                }
+                match self
+                    .general_used_regs
+                    .iter()
+                    .position(|(used_reg, _sym)| reg == *used_reg)
+                {
+                    Some(position) => {
+                        let (used_reg, sym) = self.general_used_regs.remove(position);
+                        self.free_to_stack(buf, &sym, wanted_reg);
+                        self.general_free_regs.push(used_reg);
+                    }
+                    None => {
+                        internal_error!("wanted register ({:?}) is not used or free", wanted_reg);
+                    }
+                }
+            }
+            Float(reg) => {
+                if self.float_free_regs.contains(&reg) {
+                    return;
+                }
+                match self
+                    .float_used_regs
+                    .iter()
+                    .position(|(used_reg, _sym)| reg == *used_reg)
+                {
+                    Some(position) => {
+                        let (used_reg, sym) = self.float_used_regs.remove(position);
+                        self.free_to_stack(buf, &sym, wanted_reg);
+                        self.float_free_regs.push(used_reg);
+                    }
+                    None => {
+                        internal_error!("wanted register ({:?}) is not used or free", wanted_reg);
+                    }
+                }
+            }
+        }
+    }
+
+    /// Frees `wanted_reg`, which is currently owned by `sym`, by making sure the value is loaded on the stack.
+    /// Note: used and free regs are expected to be updated outside of this function.
+    fn free_to_stack(
+        &mut self,
+        buf: &mut Vec<'a, u8>,
+        sym: &Symbol,
+        wanted_reg: RegStorage<GeneralReg, FloatReg>,
+    ) {
+        match self.remove_storage_for_sym(sym) {
+            Reg(reg_storage) => {
+                debug_assert_eq!(reg_storage, wanted_reg);
+                let base_offset = self.claim_stack_size(8);
+                match reg_storage {
+                    General(reg) => ASM::mov_base32_reg64(buf, base_offset, reg),
+                    Float(reg) => ASM::mov_base32_freg64(buf, base_offset, reg),
+                }
+                self.symbol_storage_map.insert(
+                    *sym,
+                    Stack(Primitive {
+                        base_offset,
+                        reg: None,
+                    }),
+                );
+            }
+            Stack(Primitive {
+                reg: Some(reg_storage),
+                base_offset,
+            }) => {
+                debug_assert_eq!(reg_storage, wanted_reg);
+                self.symbol_storage_map.insert(
+                    *sym,
+                    Stack(Primitive {
+                        base_offset,
+                        reg: None,
+                    }),
+                );
+            }
+            NoData
+            | Stack(Complex { ..
} | Primitive { reg: None, .. } | ReferencedPrimitive { .. }) => {
+                internal_error!("Cannot free reg from symbol without a reg: {}", sym)
+            }
+        }
+    }
+
+    /// Gets the stack offset and size of the specified symbol.
+    /// The symbol must already be stored on the stack.
+    pub fn stack_offset_and_size(&self, sym: &Symbol) -> (i32, u32) {
+        match self.get_storage_for_sym(sym) {
+            Stack(Primitive { base_offset, .. }) => (*base_offset, 8),
+            Stack(ReferencedPrimitive { base_offset, size } | Complex { base_offset, size }) => {
+                (*base_offset, *size)
+            }
+            storage => {
+                internal_error!(
+                    "Data not on the stack for sym ({}) with storage ({:?})",
+                    sym,
+                    storage
+                )
+            }
+        }
+    }
+
+    /// Specifies that a symbol is loaded in the specified general register.
+    pub fn general_reg_arg(&mut self, sym: &Symbol, reg: GeneralReg) {
+        self.symbol_storage_map.insert(*sym, Reg(General(reg)));
+        self.general_free_regs.retain(|r| *r != reg);
+        self.general_used_regs.push((reg, *sym));
+    }
+
+    /// Specifies that a symbol is loaded in the specified float register.
+    pub fn float_reg_arg(&mut self, sym: &Symbol, reg: FloatReg) {
+        self.symbol_storage_map.insert(*sym, Reg(Float(reg)));
+        self.float_free_regs.retain(|r| *r != reg);
+        self.float_used_regs.push((reg, *sym));
+    }
+
+    /// Specifies that a primitive is loaded at the specified base offset.
+    pub fn primitive_stack_arg(&mut self, sym: &Symbol, base_offset: i32) {
+        self.symbol_storage_map.insert(
+            *sym,
+            Stack(Primitive {
+                base_offset,
+                reg: None,
+            }),
+        );
+    }
+
+    /// Loads the arg pointer symbol into the specified general reg.
+    pub fn ret_pointer_arg(&mut self, reg: GeneralReg) {
+        self.symbol_storage_map
+            .insert(Symbol::RET_POINTER, Reg(General(reg)));
+    }
+
+    /// Updates the function call stack size to the max of its current value and the size needed for this call.
+    pub fn update_fn_call_stack_size(&mut self, tmp_size: u32) {
+        self.fn_call_stack_size = max(self.fn_call_stack_size, tmp_size);
+    }
+
+    /// Sets up a join point.
+    /// To do this, each of the join point's params is given a storage location.
+    /// Then those locations are stored.
+    /// Later jumps to the join point can overwrite the stored locations to pass parameters.
+    pub fn setup_joinpoint(
+        &mut self,
+        buf: &mut Vec<'a, u8>,
+        id: &JoinPointId,
+        params: &'a [Param<'a>],
+    ) {
+        let mut param_storage = bumpalo::vec![in self.env.arena];
+        param_storage.reserve(params.len());
+        for Param {
+            symbol,
+            borrow,
+            layout,
+        } in params
+        {
+            if *borrow {
+                // These probably need to be passed by pointer/reference?
+                // Otherwise, we probably need to copy back to the param at the end of the joinpoint.
+                todo!("joinpoints with borrowed parameters");
+            }
+            // Claim a location for every join point parameter to be loaded at.
+            match layout {
+                single_register_integers!() => {
+                    self.claim_general_reg(buf, symbol);
+                }
+                single_register_floats!() => {
+                    self.claim_float_reg(buf, symbol);
+                }
+                _ => {
+                    let stack_size = layout.stack_size(self.target_info);
+                    if stack_size == 0 {
+                        self.symbol_storage_map.insert(*symbol, NoData);
+                    } else {
+                        self.claim_stack_area(symbol, stack_size);
+                    }
+                }
+            }
+            param_storage.push(*self.get_storage_for_sym(symbol));
+        }
+        self.join_param_map.insert(*id, param_storage);
+    }
+
+    /// Sets up a jump by loading the parameters for the join point.
+    /// This enables the jump to correctly pass arguments to the join point.
+    pub fn setup_jump(
+        &mut self,
+        buf: &mut Vec<'a, u8>,
+        id: &JoinPointId,
+        args: &'a [Symbol],
+        arg_layouts: &[Layout<'a>],
+    ) {
+        // TODO: `remove` was used here and for `current_storage` to deal with the borrow checker.
+        // See if we can do this better.
+ let param_storage = match self.join_param_map.remove(id) { + Some(storages) => storages, + None => internal_error!("Jump: unknown point specified to jump to: {:?}", id), + }; + for ((sym, layout), wanted_storage) in + args.iter().zip(arg_layouts).zip(param_storage.iter()) + { + // Note: it is possible that the storage we want to move to is in use by one of the args we want to pass. + if self.get_storage_for_sym(sym) == wanted_storage { + continue; + } + match wanted_storage { + Reg(General(reg)) => { + // Ensure the reg is free, if not free it. + self.ensure_reg_free(buf, General(*reg)); + // Copy the value over to the reg. + self.load_to_specified_general_reg(buf, sym, *reg) + } + Reg(Float(reg)) => { + // Ensure the reg is free, if not free it. + self.ensure_reg_free(buf, Float(*reg)); + // Copy the value over to the reg. + self.load_to_specified_float_reg(buf, sym, *reg) + } + Stack(ReferencedPrimitive { base_offset, .. } | Complex { base_offset, .. }) => { + // TODO: This might be better not to call. + // Maybe we want a more memcpy like method to directly get called here. + // That would also be capable of asserting the size. + // Maybe copy stack to stack or something. + self.copy_symbol_to_stack_offset(buf, *base_offset, sym, layout); + } + NoData => {} + Stack(Primitive { .. }) => { + internal_error!("Primitive stack storage is not allowed for jumping") + } + } + } + self.join_param_map.insert(*id, param_storage); + } + + /// claim_stack_area is the public wrapper around claim_stack_size. + /// It also deals with updating symbol storage. + /// It returns the base offset of the stack area. + /// It should only be used for complex data and not primitives. 
+    pub fn claim_stack_area(&mut self, sym: &Symbol, size: u32) -> i32 {
+        let base_offset = self.claim_stack_size(size);
+        self.symbol_storage_map
+            .insert(*sym, Stack(Complex { base_offset, size }));
+        self.allocation_map
+            .insert(*sym, Rc::new((base_offset, size)));
+        base_offset
+    }
+
+    /// claim_stack_size claims `amount` bytes from the stack, aligned to 8.
+    /// This may use free space in the stack or result in increasing the stack size.
+    /// It returns the base-pointer-relative offset of the new data.
+    fn claim_stack_size(&mut self, amount: u32) -> i32 {
+        debug_assert!(amount > 0);
+        // Round the value up to 8-byte alignment.
+        let amount = if amount % 8 != 0 {
+            amount + 8 - (amount % 8)
+        } else {
+            amount
+        };
+        if let Some(fitting_chunk) = self
+            .free_stack_chunks
+            .iter()
+            .enumerate()
+            .filter(|(_, (_, size))| *size >= amount)
+            .min_by_key(|(_, (_, size))| size)
+        {
+            let (pos, (offset, size)) = fitting_chunk;
+            let (offset, size) = (*offset, *size);
+            if size == amount {
+                self.free_stack_chunks.remove(pos);
+                offset
+            } else {
+                let (prev_offset, prev_size) = self.free_stack_chunks[pos];
+                self.free_stack_chunks[pos] = (prev_offset + amount as i32, prev_size - amount);
+                prev_offset
+            }
+        } else if let Some(new_size) = self.stack_size.checked_add(amount) {
+            // Since the stack size is u32 but the max offset is i32, if we pass i32::MAX, we have overflowed.
+            if new_size > i32::MAX as u32 {
+                internal_error!("Ran out of stack space");
+            } else {
+                self.stack_size = new_size;
+                -(self.stack_size as i32)
+            }
+        } else {
+            internal_error!("Ran out of stack space");
+        }
+    }
+
+    pub fn free_symbol(&mut self, sym: &Symbol) {
+        if self.join_param_map.remove(&JoinPointId(*sym)).is_some() {
+            // This is a join point and will not be in the storage map.
+            return;
+        }
+        match self.symbol_storage_map.remove(sym) {
+            // Free the stack chunk if this is the last reference to it.
+            Some(Stack(Primitive { base_offset, ..
})) => {
+                self.free_stack_chunk(base_offset, 8);
+            }
+            Some(Stack(Complex { .. } | ReferencedPrimitive { .. })) => {
+                self.free_reference(sym);
+            }
+            _ => {}
+        }
+        for i in 0..self.general_used_regs.len() {
+            let (reg, saved_sym) = self.general_used_regs[i];
+            if saved_sym == *sym {
+                self.general_free_regs.push(reg);
+                self.general_used_regs.remove(i);
+                break;
+            }
+        }
+        for i in 0..self.float_used_regs.len() {
+            let (reg, saved_sym) = self.float_used_regs[i];
+            if saved_sym == *sym {
+                self.float_free_regs.push(reg);
+                self.float_used_regs.remove(i);
+                break;
+            }
+        }
+    }
+
+    /// Frees a reference and releases its allocation if it is no longer used.
+    fn free_reference(&mut self, sym: &Symbol) {
+        let owned_data = if let Some(owned_data) = self.allocation_map.remove(sym) {
+            owned_data
+        } else {
+            internal_error!("Unknown symbol: {:?}", sym);
+        };
+        if Rc::strong_count(&owned_data) == 1 {
+            self.free_stack_chunk(owned_data.0, owned_data.1);
+        }
+    }
+
+    fn free_stack_chunk(&mut self, base_offset: i32, size: u32) {
+        let loc = (base_offset, size);
+        // Note: this position currently points to the offset following the specified location.
+        // If loc were inserted at this position, it would shift the data at this position over by 1.
+        let pos = self
+            .free_stack_chunks
+            .binary_search(&loc)
+            .unwrap_or_else(|e| e);
+
+        // Check for overlap with the previous and next free chunks.
+        let merge_with_prev = if pos > 0 {
+            if let Some((prev_offset, prev_size)) = self.free_stack_chunks.get(pos - 1) {
+                let prev_end = *prev_offset + *prev_size as i32;
+                if prev_end > base_offset {
+                    internal_error!("Double free? A previously freed stack location overlaps with the currently freed stack location.");
+                }
+                prev_end == base_offset
+            } else {
+                false
+            }
+        } else {
+            false
+        };
+        let merge_with_next = if let Some((next_offset, _)) = self.free_stack_chunks.get(pos) {
+            let current_end = base_offset + size as i32;
+            if current_end > *next_offset {
+                internal_error!("Double free? 
A previously freed stack location overlaps with the currently freed stack location.");
+            }
+            current_end == *next_offset
+        } else {
+            false
+        };
+
+        match (merge_with_prev, merge_with_next) {
+            (true, true) => {
+                let (prev_offset, prev_size) = self.free_stack_chunks[pos - 1];
+                let (_, next_size) = self.free_stack_chunks[pos];
+                self.free_stack_chunks[pos - 1] = (prev_offset, prev_size + size + next_size);
+                self.free_stack_chunks.remove(pos);
+            }
+            (true, false) => {
+                let (prev_offset, prev_size) = self.free_stack_chunks[pos - 1];
+                self.free_stack_chunks[pos - 1] = (prev_offset, prev_size + size);
+            }
+            (false, true) => {
+                let (_, next_size) = self.free_stack_chunks[pos];
+                self.free_stack_chunks[pos] = (base_offset, next_size + size);
+            }
+            (false, false) => self.free_stack_chunks.insert(pos, loc),
+        }
+    }
+
+    pub fn push_used_caller_saved_regs_to_stack(&mut self, buf: &mut Vec<'a, u8>) {
+        let old_general_used_regs = std::mem::replace(
+            &mut self.general_used_regs,
+            bumpalo::vec![in self.env.arena],
+        );
+        for (reg, saved_sym) in old_general_used_regs.into_iter() {
+            if CC::general_caller_saved(&reg) {
+                self.general_free_regs.push(reg);
+                self.free_to_stack(buf, &saved_sym, General(reg));
+            } else {
+                self.general_used_regs.push((reg, saved_sym));
+            }
+        }
+        let old_float_used_regs =
+            std::mem::replace(&mut self.float_used_regs, bumpalo::vec![in self.env.arena]);
+        for (reg, saved_sym) in old_float_used_regs.into_iter() {
+            if CC::float_caller_saved(&reg) {
+                self.float_free_regs.push(reg);
+                self.free_to_stack(buf, &saved_sym, Float(reg));
+            } else {
+                self.float_used_regs.push((reg, saved_sym));
+            }
+        }
+    }
+
+    /// Gets a value from storage. The indexed symbol must be defined.
+    fn get_storage_for_sym(&self, sym: &Symbol) -> &Storage<GeneralReg, FloatReg> {
+        if let Some(storage) = self.symbol_storage_map.get(sym) {
+            storage
+        } else {
+            internal_error!("Unknown symbol: {:?}", sym);
+        }
+    }
+
+    /// Removes and returns a value from storage. The indexed symbol must be defined.
+    fn remove_storage_for_sym(&mut self, sym: &Symbol) -> Storage<GeneralReg, FloatReg> {
+        if let Some(storage) = self.symbol_storage_map.remove(sym) {
+            storage
+        } else {
+            internal_error!("Unknown symbol: {:?}", sym);
+        }
+    }
+}
+
+fn is_primitive(layout: &Layout<'_>) -> bool {
+    matches!(layout, single_register_layouts!())
+}
diff --git a/compiler/gen_dev/src/generic64/x86_64.rs b/compiler/gen_dev/src/generic64/x86_64.rs
index 359dd4670f..70ac3f2c47 100644
--- a/compiler/gen_dev/src/generic64/x86_64.rs
+++ b/compiler/gen_dev/src/generic64/x86_64.rs
@@ -1,13 +1,15 @@
-use crate::generic64::{Assembler, CallConv, RegTrait, SymbolStorage, TARGET_INFO};
+use crate::generic64::{storage::StorageManager, Assembler, CallConv, RegTrait};
 use crate::{
-    single_register_builtins, single_register_floats, single_register_integers, Relocation,
+    single_register_floats, single_register_integers, single_register_layouts, Relocation,
 };
 use bumpalo::collections::Vec;
 use roc_builtins::bitcode::{FloatWidth, IntWidth};
-use roc_collections::all::MutMap;
 use roc_error_macros::internal_error;
 use roc_module::symbol::Symbol;
 use roc_mono::layout::{Builtin, Layout};
+use roc_target::TargetInfo;
+
+const TARGET_INFO: TargetInfo = TargetInfo::default_x86_64();
 
 // Not sure exactly how I want to represent registers.
 // If we want max speed, we would likely make them structs that impl the same trait to avoid ifs.
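The `claim_stack_size` logic in `storage.rs` above (round the request up to 8 bytes, then best-fit against the free-chunk list, splitting a too-large chunk or growing the frame) can be sketched as a standalone function. This is an illustrative reconstruction, not the compiler's actual API: the names `Chunk`, `round_up_to_8`, and `claim` are hypothetical, and the real method lives on `StorageManager` and reports overflow with `internal_error!`.

```rust
/// Hypothetical stand-in for a free region of the stack frame: (offset, size).
type Chunk = (i32, u32);

fn round_up_to_8(amount: u32) -> u32 {
    if amount % 8 != 0 {
        amount + 8 - (amount % 8)
    } else {
        amount
    }
}

/// Sketch of the claim strategy: best-fit over free chunks, else grow the frame.
fn claim(free_chunks: &mut Vec<Chunk>, stack_size: &mut u32, amount: u32) -> i32 {
    let amount = round_up_to_8(amount);
    // Best fit: the smallest free chunk that is still large enough.
    let best = free_chunks
        .iter()
        .enumerate()
        .filter(|(_, (_, size))| *size >= amount)
        .min_by_key(|(_, (_, size))| *size)
        .map(|(pos, &chunk)| (pos, chunk));
    match best {
        Some((pos, (offset, size))) if size == amount => {
            // Exact fit: hand out the whole chunk.
            free_chunks.remove(pos);
            offset
        }
        Some((pos, (offset, size))) => {
            // Bigger than needed: hand out the front, keep the tail free.
            free_chunks[pos] = (offset + amount as i32, size - amount);
            offset
        }
        None => {
            // Nothing fits: grow the frame. Offsets are negative,
            // relative to the base pointer.
            *stack_size += amount;
            -(*stack_size as i32)
        }
    }
}

fn main() {
    let mut free: Vec<Chunk> = vec![(-32, 8), (-64, 24)];
    let mut stack_size: u32 = 64;
    // 5 rounds up to 8 and exactly fits the 8-byte chunk at -32.
    assert_eq!(claim(&mut free, &mut stack_size, 5), -32);
    // 16 is carved off the front of the 24-byte chunk at -64.
    assert_eq!(claim(&mut free, &mut stack_size, 16), -64);
    assert_eq!(free, vec![(-48, 8)]);
    // Nothing free fits 24 bytes, so the frame grows from 64 to 88.
    assert_eq!(claim(&mut free, &mut stack_size, 24), -88);
    assert_eq!(stack_size, 88);
}
```

Handing out the front of a split chunk mirrors the real code's `prev_offset` return; the remainder stays in the free list for later claims.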
@@ -67,7 +69,7 @@ pub struct X86_64SystemV {} const STACK_ALIGNMENT: u8 = 16; -impl CallConv for X86_64SystemV { +impl CallConv for X86_64SystemV { const BASE_PTR_REG: X86_64GeneralReg = X86_64GeneralReg::RBP; const STACK_PTR_REG: X86_64GeneralReg = X86_64GeneralReg::RSP; @@ -161,13 +163,15 @@ impl CallConv for X86_64SystemV { #[inline(always)] fn setup_stack<'a>( buf: &mut Vec<'a, u8>, - general_saved_regs: &[X86_64GeneralReg], + saved_general_regs: &[X86_64GeneralReg], + saved_float_regs: &[X86_64FloatReg], requested_stack_size: i32, fn_call_stack_size: i32, ) -> i32 { x86_64_generic_setup_stack( buf, - general_saved_regs, + saved_general_regs, + saved_float_regs, requested_stack_size, fn_call_stack_size, ) @@ -176,13 +180,15 @@ impl CallConv for X86_64SystemV { #[inline(always)] fn cleanup_stack<'a>( buf: &mut Vec<'a, u8>, - general_saved_regs: &[X86_64GeneralReg], + saved_general_regs: &[X86_64GeneralReg], + saved_float_regs: &[X86_64FloatReg], aligned_stack_size: i32, fn_call_stack_size: i32, ) { x86_64_generic_cleanup_stack( buf, - general_saved_regs, + saved_general_regs, + saved_float_regs, aligned_stack_size, fn_call_stack_size, ) @@ -191,271 +197,230 @@ impl CallConv for X86_64SystemV { #[inline(always)] fn load_args<'a>( buf: &mut Vec<'a, u8>, - symbol_map: &mut MutMap>, + storage_manager: &mut StorageManager< + 'a, + X86_64GeneralReg, + X86_64FloatReg, + X86_64Assembler, + X86_64SystemV, + >, args: &'a [(Layout<'a>, Symbol)], ret_layout: &Layout<'a>, - mut stack_size: u32, - ) -> u32 { - let mut arg_offset = Self::SHADOW_SPACE_SIZE as i32 + 8; // 8 is the size of the pushed base pointer. + ) { + let mut arg_offset = Self::SHADOW_SPACE_SIZE as i32 + 16; // 16 is the size of the pushed return address and base pointer. 
let mut general_i = 0; let mut float_i = 0; if X86_64SystemV::returns_via_arg_pointer(ret_layout) { - symbol_map.insert( - Symbol::RET_POINTER, - SymbolStorage::GeneralReg(Self::GENERAL_PARAM_REGS[general_i]), - ); + storage_manager.ret_pointer_arg(Self::GENERAL_PARAM_REGS[0]); general_i += 1; } for (layout, sym) in args.iter() { match layout { single_register_integers!() => { if general_i < Self::GENERAL_PARAM_REGS.len() { - symbol_map.insert( - *sym, - SymbolStorage::GeneralReg(Self::GENERAL_PARAM_REGS[general_i]), - ); + storage_manager.general_reg_arg(sym, Self::GENERAL_PARAM_REGS[general_i]); general_i += 1; } else { + storage_manager.primitive_stack_arg(sym, arg_offset); arg_offset += 8; - symbol_map.insert( - *sym, - SymbolStorage::Base { - offset: arg_offset, - size: 8, - owned: true, - }, - ); } } single_register_floats!() => { if float_i < Self::FLOAT_PARAM_REGS.len() { - symbol_map.insert( - *sym, - SymbolStorage::FloatReg(Self::FLOAT_PARAM_REGS[float_i]), - ); + storage_manager.float_reg_arg(sym, Self::FLOAT_PARAM_REGS[float_i]); float_i += 1; } else { + storage_manager.primitive_stack_arg(sym, arg_offset); arg_offset += 8; - symbol_map.insert( - *sym, - SymbolStorage::Base { - offset: arg_offset, - size: 8, - owned: true, - }, - ); } } - Layout::Builtin(Builtin::Str) => { + Layout::Builtin(Builtin::Str | Builtin::List(_)) => { if general_i + 1 < Self::GENERAL_PARAM_REGS.len() { // Load the value from the param reg into a useable base offset. 
let src1 = Self::GENERAL_PARAM_REGS[general_i]; let src2 = Self::GENERAL_PARAM_REGS[general_i + 1]; - stack_size += 16; - let offset = -(stack_size as i32); - X86_64Assembler::mov_base32_reg64(buf, offset, src1); - X86_64Assembler::mov_base32_reg64(buf, offset + 8, src2); - symbol_map.insert( - *sym, - SymbolStorage::Base { - offset, - size: 16, - owned: true, - }, - ); + let base_offset = storage_manager.claim_stack_area(sym, 16); + X86_64Assembler::mov_base32_reg64(buf, base_offset, src1); + X86_64Assembler::mov_base32_reg64(buf, base_offset + 8, src2); general_i += 2; } else { - todo!("loading strings args on the stack"); + todo!("loading lists and strings args on the stack"); } } - Layout::Struct(&[]) => {} + x if x.stack_size(TARGET_INFO) == 0 => {} x => { todo!("Loading args with layout {:?}", x); } } } - stack_size } #[inline(always)] fn store_args<'a>( buf: &mut Vec<'a, u8>, - symbol_map: &MutMap>, + storage_manager: &mut StorageManager< + 'a, + X86_64GeneralReg, + X86_64FloatReg, + X86_64Assembler, + X86_64SystemV, + >, args: &'a [Symbol], arg_layouts: &[Layout<'a>], ret_layout: &Layout<'a>, - ) -> u32 { - let mut stack_offset = Self::SHADOW_SPACE_SIZE as i32; + ) { + let mut tmp_stack_offset = Self::SHADOW_SPACE_SIZE as i32; + if Self::returns_via_arg_pointer(ret_layout) { + // Save space on the stack for the arg we will return. + storage_manager + .claim_stack_area(&Symbol::RET_POINTER, ret_layout.stack_size(TARGET_INFO)); + todo!("claim first parama reg for the address"); + } let mut general_i = 0; let mut float_i = 0; - // For most return layouts we will do nothing. - // In some cases, we need to put the return address as the first arg. - match ret_layout { - single_register_builtins!() | Layout::Builtin(Builtin::Str) | Layout::Struct([]) => { - // Nothing needs to be done for any of these cases. 
- } - x => { - todo!("receiving return type, {:?}", x); - } - } - for (i, layout) in arg_layouts.iter().enumerate() { + for (sym, layout) in args.iter().zip(arg_layouts.iter()) { match layout { single_register_integers!() => { - let storage = match symbol_map.get(&args[i]) { - Some(storage) => storage, - None => { - internal_error!("function argument does not reference any symbol") - } - }; if general_i < Self::GENERAL_PARAM_REGS.len() { - // Load the value to the param reg. - let dst = Self::GENERAL_PARAM_REGS[general_i]; - match storage { - SymbolStorage::GeneralReg(reg) - | SymbolStorage::BaseAndGeneralReg { reg, .. } => { - X86_64Assembler::mov_reg64_reg64(buf, dst, *reg); - } - SymbolStorage::Base { offset, .. } => { - X86_64Assembler::mov_reg64_base32(buf, dst, *offset); - } - SymbolStorage::FloatReg(_) | SymbolStorage::BaseAndFloatReg { .. } => { - internal_error!("Cannot load floating point symbol into GeneralReg") - } - } + storage_manager.load_to_specified_general_reg( + buf, + sym, + Self::GENERAL_PARAM_REGS[general_i], + ); general_i += 1; } else { - // Load the value to the stack. - match storage { - SymbolStorage::GeneralReg(reg) - | SymbolStorage::BaseAndGeneralReg { reg, .. } => { - X86_64Assembler::mov_stack32_reg64(buf, stack_offset, *reg); - } - SymbolStorage::Base { offset, .. } => { - // Use RAX as a tmp reg because it will be free before function calls. - X86_64Assembler::mov_reg64_base32( - buf, - X86_64GeneralReg::RAX, - *offset, - ); - X86_64Assembler::mov_stack32_reg64( - buf, - stack_offset, - X86_64GeneralReg::RAX, - ); - } - SymbolStorage::FloatReg(_) | SymbolStorage::BaseAndFloatReg { .. } => { - internal_error!("Cannot load floating point symbol into GeneralReg") - } - } - stack_offset += 8; + // Copy to stack using return reg as buffer. 
+ storage_manager.load_to_specified_general_reg( + buf, + sym, + Self::GENERAL_RETURN_REGS[0], + ); + X86_64Assembler::mov_stack32_reg64( + buf, + tmp_stack_offset, + Self::GENERAL_RETURN_REGS[0], + ); + tmp_stack_offset += 8; } } single_register_floats!() => { - let storage = match symbol_map.get(&args[i]) { - Some(storage) => storage, - None => { - internal_error!("function argument does not reference any symbol") - } - }; if float_i < Self::FLOAT_PARAM_REGS.len() { - // Load the value to the param reg. - let dst = Self::FLOAT_PARAM_REGS[float_i]; - match storage { - SymbolStorage::FloatReg(reg) - | SymbolStorage::BaseAndFloatReg { reg, .. } => { - X86_64Assembler::mov_freg64_freg64(buf, dst, *reg); - } - SymbolStorage::Base { offset, .. } => { - X86_64Assembler::mov_freg64_base32(buf, dst, *offset); - } - SymbolStorage::GeneralReg(_) - | SymbolStorage::BaseAndGeneralReg { .. } => { - internal_error!("Cannot load general symbol into FloatReg") - } - } + storage_manager.load_to_specified_float_reg( + buf, + sym, + Self::FLOAT_PARAM_REGS[float_i], + ); float_i += 1; } else { - // Load the value to the stack. - match storage { - SymbolStorage::FloatReg(reg) - | SymbolStorage::BaseAndFloatReg { reg, .. } => { - X86_64Assembler::mov_stack32_freg64(buf, stack_offset, *reg); - } - SymbolStorage::Base { offset, .. } => { - // Use XMM0 as a tmp reg because it will be free before function calls. - X86_64Assembler::mov_freg64_base32( - buf, - X86_64FloatReg::XMM0, - *offset, - ); - X86_64Assembler::mov_stack32_freg64( - buf, - stack_offset, - X86_64FloatReg::XMM0, - ); - } - SymbolStorage::GeneralReg(_) - | SymbolStorage::BaseAndGeneralReg { .. } => { - internal_error!("Cannot load general symbol into FloatReg") - } - } - stack_offset += 8; + // Copy to stack using return reg as buffer. 
+ storage_manager.load_to_specified_float_reg( + buf, + sym, + Self::FLOAT_RETURN_REGS[0], + ); + X86_64Assembler::mov_stack32_freg64( + buf, + tmp_stack_offset, + Self::FLOAT_RETURN_REGS[0], + ); + tmp_stack_offset += 8; } } Layout::Builtin(Builtin::Str) => { - let storage = match symbol_map.get(&args[i]) { - Some(storage) => storage, - None => { - internal_error!("function argument does not reference any symbol") - } - }; if general_i + 1 < Self::GENERAL_PARAM_REGS.len() { - // Load the value to the param reg. - let dst1 = Self::GENERAL_PARAM_REGS[general_i]; - let dst2 = Self::GENERAL_PARAM_REGS[general_i + 1]; - match storage { - SymbolStorage::Base { offset, .. } => { - X86_64Assembler::mov_reg64_base32(buf, dst1, *offset); - X86_64Assembler::mov_reg64_base32(buf, dst2, *offset + 8); - } - _ => { - internal_error!( - "Strings only support being loaded from base offsets" - ); - } - } + let (base_offset, _size) = storage_manager.stack_offset_and_size(sym); + debug_assert_eq!(base_offset % 8, 0); + X86_64Assembler::mov_reg64_base32( + buf, + Self::GENERAL_PARAM_REGS[general_i], + base_offset, + ); + X86_64Assembler::mov_reg64_base32( + buf, + Self::GENERAL_PARAM_REGS[general_i + 1], + base_offset + 8, + ); general_i += 2; } else { todo!("calling functions with strings on the stack"); } } - Layout::Struct(&[]) => {} + x if x.stack_size(TARGET_INFO) == 0 => {} x => { todo!("calling with arg type, {:?}", x); } } } - stack_offset as u32 + storage_manager.update_fn_call_stack_size(tmp_stack_offset as u32); } - fn return_struct<'a>( - _buf: &mut Vec<'a, u8>, - _struct_offset: i32, - _struct_size: u32, - _field_layouts: &[Layout<'a>], - _ret_reg: Option, + fn return_complex_symbol<'a>( + buf: &mut Vec<'a, u8>, + storage_manager: &mut StorageManager< + 'a, + X86_64GeneralReg, + X86_64FloatReg, + X86_64Assembler, + X86_64SystemV, + >, + sym: &Symbol, + layout: &Layout<'a>, ) { - todo!("Returning structs for X86_64"); + match layout { + single_register_layouts!() => { + 
internal_error!("single register layouts are not complex symbols"); + } + Layout::Builtin(Builtin::Str | Builtin::List(_)) => { + let (base_offset, _size) = storage_manager.stack_offset_and_size(sym); + debug_assert_eq!(base_offset % 8, 0); + X86_64Assembler::mov_reg64_base32(buf, Self::GENERAL_RETURN_REGS[0], base_offset); + X86_64Assembler::mov_reg64_base32( + buf, + Self::GENERAL_RETURN_REGS[1], + base_offset + 8, + ); + } + x if x.stack_size(TARGET_INFO) == 0 => {} + x => todo!("returning complex type, {:?}", x), + } } + fn load_returned_complex_symbol<'a>( + buf: &mut Vec<'a, u8>, + storage_manager: &mut StorageManager< + 'a, + X86_64GeneralReg, + X86_64FloatReg, + X86_64Assembler, + X86_64SystemV, + >, + sym: &Symbol, + layout: &Layout<'a>, + ) { + match layout { + single_register_layouts!() => { + internal_error!("single register layouts are not complex symbols"); + } + Layout::Builtin(Builtin::Str | Builtin::List(_)) => { + let offset = storage_manager.claim_stack_area(sym, 16); + X86_64Assembler::mov_base32_reg64(buf, offset, Self::GENERAL_RETURN_REGS[0]); + X86_64Assembler::mov_base32_reg64(buf, offset + 8, Self::GENERAL_RETURN_REGS[1]); + } + x if x.stack_size(TARGET_INFO) == 0 => {} + x => todo!("receiving complex return type, {:?}", x), + } + } +} + +impl X86_64SystemV { fn returns_via_arg_pointer(ret_layout: &Layout) -> bool { - // TODO: This may need to be more complex/extended to fully support the calling convention. + // TODO: This will need to be more complex/extended to fully support the calling convention. 
// details here: https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf ret_layout.stack_size(TARGET_INFO) > 16 } } -impl CallConv for X86_64WindowsFastcall { +impl CallConv for X86_64WindowsFastcall { const BASE_PTR_REG: X86_64GeneralReg = X86_64GeneralReg::RBP; const STACK_PTR_REG: X86_64GeneralReg = X86_64GeneralReg::RSP; @@ -553,225 +518,202 @@ impl CallConv for X86_64WindowsFastcall { #[inline(always)] fn setup_stack<'a>( buf: &mut Vec<'a, u8>, - saved_regs: &[X86_64GeneralReg], + saved_general_regs: &[X86_64GeneralReg], + saved_float_regs: &[X86_64FloatReg], requested_stack_size: i32, fn_call_stack_size: i32, ) -> i32 { - x86_64_generic_setup_stack(buf, saved_regs, requested_stack_size, fn_call_stack_size) + x86_64_generic_setup_stack( + buf, + saved_general_regs, + saved_float_regs, + requested_stack_size, + fn_call_stack_size, + ) } #[inline(always)] fn cleanup_stack<'a>( buf: &mut Vec<'a, u8>, - saved_regs: &[X86_64GeneralReg], + saved_general_regs: &[X86_64GeneralReg], + saved_float_regs: &[X86_64FloatReg], aligned_stack_size: i32, fn_call_stack_size: i32, ) { - x86_64_generic_cleanup_stack(buf, saved_regs, aligned_stack_size, fn_call_stack_size) + x86_64_generic_cleanup_stack( + buf, + saved_general_regs, + saved_float_regs, + aligned_stack_size, + fn_call_stack_size, + ) } #[inline(always)] fn load_args<'a>( _buf: &mut Vec<'a, u8>, - symbol_map: &mut MutMap>, + storage_manager: &mut StorageManager< + 'a, + X86_64GeneralReg, + X86_64FloatReg, + X86_64Assembler, + X86_64WindowsFastcall, + >, args: &'a [(Layout<'a>, Symbol)], ret_layout: &Layout<'a>, - stack_size: u32, - ) -> u32 { - let mut arg_offset = Self::SHADOW_SPACE_SIZE as i32 + 8; // 8 is the size of the pushed base pointer. + ) { + let mut arg_offset = Self::SHADOW_SPACE_SIZE as i32 + 16; // 16 is the size of the pushed return address and base pointer. 
        let mut i = 0;
        if X86_64WindowsFastcall::returns_via_arg_pointer(ret_layout) {
-            symbol_map.insert(
-                Symbol::RET_POINTER,
-                SymbolStorage::GeneralReg(Self::GENERAL_PARAM_REGS[i]),
-            );
+            storage_manager.ret_pointer_arg(Self::GENERAL_PARAM_REGS[i]);
            i += 1;
        }
        for (layout, sym) in args.iter() {
            if i < Self::GENERAL_PARAM_REGS.len() {
                match layout {
                    single_register_integers!() => {
-                        symbol_map
-                            .insert(*sym, SymbolStorage::GeneralReg(Self::GENERAL_PARAM_REGS[i]));
+                        storage_manager.general_reg_arg(sym, Self::GENERAL_PARAM_REGS[i]);
                        i += 1;
                    }
                    single_register_floats!() => {
-                        symbol_map.insert(*sym, SymbolStorage::FloatReg(Self::FLOAT_PARAM_REGS[i]));
+                        storage_manager.float_reg_arg(sym, Self::FLOAT_PARAM_REGS[i]);
                        i += 1;
                    }
                    Layout::Builtin(Builtin::Str) => {
                        // I think this just needs to be passed on the stack, so not a huge deal.
                        todo!("Passing str args with Windows fast call");
                    }
-                    Layout::Struct(&[]) => {}
+                    x if x.stack_size(TARGET_INFO) == 0 => {}
                    x => {
                        todo!("Loading args with layout {:?}", x);
                    }
                }
            } else {
-                arg_offset += match layout {
-                    single_register_builtins!() => 8,
+                match layout {
+                    single_register_layouts!() => {
+                        storage_manager.primitive_stack_arg(sym, arg_offset);
+                        arg_offset += 8;
+                    }
                    x => {
                        todo!("Loading args with layout {:?}", x);
                    }
                };
-                symbol_map.insert(
-                    *sym,
-                    SymbolStorage::Base {
-                        offset: arg_offset,
-                        size: 8,
-                        owned: true,
-                    },
-                );
            }
        }
-        stack_size
    }

    #[inline(always)]
    fn store_args<'a>(
        buf: &mut Vec<'a, u8>,
-        symbol_map: &MutMap<Symbol, SymbolStorage<X86_64GeneralReg, X86_64FloatReg>>,
+        storage_manager: &mut StorageManager<
+            'a,
+            X86_64GeneralReg,
+            X86_64FloatReg,
+            X86_64Assembler,
+            X86_64WindowsFastcall,
+        >,
        args: &'a [Symbol],
        arg_layouts: &[Layout<'a>],
        ret_layout: &Layout<'a>,
-    ) -> u32 {
-        let mut stack_offset = Self::SHADOW_SPACE_SIZE as i32;
-        // For most return layouts we will do nothing.
-        // In some cases, we need to put the return address as the first arg.
-        match ret_layout {
-            single_register_builtins!() | Layout::Struct([]) => {
-                // Nothing needs to be done for any of these cases.
-            }
-            x => {
-                todo!("receiving return type, {:?}", x);
-            }
+    ) {
+        let mut tmp_stack_offset = Self::SHADOW_SPACE_SIZE as i32;
+        if Self::returns_via_arg_pointer(ret_layout) {
+            // Save space on the stack for the arg we will return.
+            storage_manager
+                .claim_stack_area(&Symbol::RET_POINTER, ret_layout.stack_size(TARGET_INFO));
+            todo!("claim first param reg for the address");
        }
-        for (i, layout) in arg_layouts.iter().enumerate() {
+        for (i, (sym, layout)) in args.iter().zip(arg_layouts.iter()).enumerate() {
            match layout {
                single_register_integers!() => {
-                    let storage = match symbol_map.get(&args[i]) {
-                        Some(storage) => storage,
-                        None => {
-                            internal_error!("function argument does not reference any symbol")
-                        }
-                    };
                    if i < Self::GENERAL_PARAM_REGS.len() {
-                        // Load the value to the param reg.
-                        let dst = Self::GENERAL_PARAM_REGS[i];
-                        match storage {
-                            SymbolStorage::GeneralReg(reg)
-                            | SymbolStorage::BaseAndGeneralReg { reg, .. } => {
-                                X86_64Assembler::mov_reg64_reg64(buf, dst, *reg);
-                            }
-                            SymbolStorage::Base { offset, .. } => {
-                                X86_64Assembler::mov_reg64_base32(buf, dst, *offset);
-                            }
-                            SymbolStorage::FloatReg(_) | SymbolStorage::BaseAndFloatReg { .. } => {
-                                internal_error!("Cannot load floating point symbol into GeneralReg")
-                            }
-                        }
+                        storage_manager.load_to_specified_general_reg(
+                            buf,
+                            sym,
+                            Self::GENERAL_PARAM_REGS[i],
+                        );
                    } else {
-                        // Load the value to the stack.
-                        match storage {
-                            SymbolStorage::GeneralReg(reg)
-                            | SymbolStorage::BaseAndGeneralReg { reg, .. } => {
-                                X86_64Assembler::mov_stack32_reg64(buf, stack_offset, *reg);
-                            }
-                            SymbolStorage::Base { offset, .. } => {
-                                // Use RAX as a tmp reg because it will be free before function calls.
-                                X86_64Assembler::mov_reg64_base32(
-                                    buf,
-                                    X86_64GeneralReg::RAX,
-                                    *offset,
-                                );
-                                X86_64Assembler::mov_stack32_reg64(
-                                    buf,
-                                    stack_offset,
-                                    X86_64GeneralReg::RAX,
-                                );
-                            }
-                            SymbolStorage::FloatReg(_) | SymbolStorage::BaseAndFloatReg { .. } => {
-                                internal_error!("Cannot load floating point symbol into GeneralReg")
-                            }
-                        }
-                        stack_offset += 8;
+                        // Copy to stack using return reg as buffer.
+                        storage_manager.load_to_specified_general_reg(
+                            buf,
+                            sym,
+                            Self::GENERAL_RETURN_REGS[0],
+                        );
+                        X86_64Assembler::mov_stack32_reg64(
+                            buf,
+                            tmp_stack_offset,
+                            Self::GENERAL_RETURN_REGS[0],
+                        );
+                        tmp_stack_offset += 8;
                    }
                }
                single_register_floats!() => {
-                    let storage = match symbol_map.get(&args[i]) {
-                        Some(storage) => storage,
-                        None => {
-                            internal_error!("function argument does not reference any symbol")
-                        }
-                    };
                    if i < Self::FLOAT_PARAM_REGS.len() {
-                        // Load the value to the param reg.
-                        let dst = Self::FLOAT_PARAM_REGS[i];
-                        match storage {
-                            SymbolStorage::FloatReg(reg)
-                            | SymbolStorage::BaseAndFloatReg { reg, .. } => {
-                                X86_64Assembler::mov_freg64_freg64(buf, dst, *reg);
-                            }
-                            SymbolStorage::Base { offset, .. } => {
-                                X86_64Assembler::mov_freg64_base32(buf, dst, *offset);
-                            }
-                            SymbolStorage::GeneralReg(_)
-                            | SymbolStorage::BaseAndGeneralReg { .. } => {
-                                internal_error!("Cannot load general symbol into FloatReg")
-                            }
-                        }
+                        storage_manager.load_to_specified_float_reg(
+                            buf,
+                            sym,
+                            Self::FLOAT_PARAM_REGS[i],
+                        );
                    } else {
-                        // Load the value to the stack.
-                        match storage {
-                            SymbolStorage::FloatReg(reg)
-                            | SymbolStorage::BaseAndFloatReg { reg, .. } => {
-                                X86_64Assembler::mov_stack32_freg64(buf, stack_offset, *reg);
-                            }
-                            SymbolStorage::Base { offset, .. } => {
-                                // Use XMM0 as a tmp reg because it will be free before function calls.
-                                X86_64Assembler::mov_freg64_base32(
-                                    buf,
-                                    X86_64FloatReg::XMM0,
-                                    *offset,
-                                );
-                                X86_64Assembler::mov_stack32_freg64(
-                                    buf,
-                                    stack_offset,
-                                    X86_64FloatReg::XMM0,
-                                );
-                            }
-                            SymbolStorage::GeneralReg(_)
-                            | SymbolStorage::BaseAndGeneralReg { .. } => {
-                                internal_error!("Cannot load general symbol into FloatReg")
-                            }
-                        }
-                        stack_offset += 8;
+                        // Copy to stack using return reg as buffer.
+                        storage_manager.load_to_specified_float_reg(
+                            buf,
+                            sym,
+                            Self::FLOAT_RETURN_REGS[0],
+                        );
+                        X86_64Assembler::mov_stack32_freg64(
+                            buf,
+                            tmp_stack_offset,
+                            Self::FLOAT_RETURN_REGS[0],
+                        );
+                        tmp_stack_offset += 8;
                    }
                }
                Layout::Builtin(Builtin::Str) => {
                    // I think this just needs to be passed on the stack, so not a huge deal.
                    todo!("Passing str args with Windows fast call");
                }
-                Layout::Struct(&[]) => {}
+                x if x.stack_size(TARGET_INFO) == 0 => {}
                x => {
                    todo!("calling with arg type, {:?}", x);
                }
            }
        }
-        stack_offset as u32
+        storage_manager.update_fn_call_stack_size(tmp_stack_offset as u32);
    }

-    fn return_struct<'a>(
+    fn return_complex_symbol<'a>(
        _buf: &mut Vec<'a, u8>,
-        _struct_offset: i32,
-        _struct_size: u32,
-        _field_layouts: &[Layout<'a>],
-        _ret_reg: Option<X86_64GeneralReg>,
+        _storage_manager: &mut StorageManager<
+            'a,
+            X86_64GeneralReg,
+            X86_64FloatReg,
+            X86_64Assembler,
+            X86_64WindowsFastcall,
+        >,
+        _sym: &Symbol,
+        _layout: &Layout<'a>,
    ) {
-        todo!("Returning structs for X86_64WindowsFastCall");
+        todo!("Returning complex symbols for X86_64");
    }

+    fn load_returned_complex_symbol<'a>(
+        _buf: &mut Vec<'a, u8>,
+        _storage_manager: &mut StorageManager<
+            'a,
+            X86_64GeneralReg,
+            X86_64FloatReg,
+            X86_64Assembler,
+            X86_64WindowsFastcall,
+        >,
+        _sym: &Symbol,
+        _layout: &Layout<'a>,
+    ) {
+        todo!("Loading returned complex symbols for X86_64");
+    }
+}
+
+impl X86_64WindowsFastcall {
    fn returns_via_arg_pointer(ret_layout: &Layout) -> bool {
        // TODO: This is not fully correct; there are some exceptions for "vector" types.
        // details here: https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160#return-values
@@ -782,7 +724,8 @@ impl CallConv<X86_64GeneralReg, X86_64FloatReg> for X86_64WindowsFastcall {
#[inline(always)]
fn x86_64_generic_setup_stack<'a>(
    buf: &mut Vec<'a, u8>,
-    saved_regs: &[X86_64GeneralReg],
+    saved_general_regs: &[X86_64GeneralReg],
+    saved_float_regs: &[X86_64FloatReg],
    requested_stack_size: i32,
    fn_call_stack_size: i32,
) -> i32 {
@@ -790,7 +733,7 @@ fn x86_64_generic_setup_stack<'a>(
    X86_64Assembler::mov_reg64_reg64(buf, X86_64GeneralReg::RBP, X86_64GeneralReg::RSP);

    let full_stack_size = match requested_stack_size
-        .checked_add(8 * saved_regs.len() as i32)
+        .checked_add(8 * (saved_general_regs.len() + saved_float_regs.len()) as i32)
        .and_then(|size| size.checked_add(fn_call_stack_size))
    {
        Some(size) => size,
@@ -817,10 +760,14 @@ fn x86_64_generic_setup_stack<'a>(
            // Put values at the top of the stack to avoid conflicts with previously saved variables.
            let mut offset = aligned_stack_size - fn_call_stack_size;
-            for reg in saved_regs {
+            for reg in saved_general_regs {
                X86_64Assembler::mov_base32_reg64(buf, -offset, *reg);
                offset -= 8;
            }
+            for reg in saved_float_regs {
+                X86_64Assembler::mov_base32_freg64(buf, -offset, *reg);
+                offset -= 8;
+            }
            aligned_stack_size
        } else {
            0
@@ -834,16 +781,21 @@ fn x86_64_generic_setup_stack<'a>(
#[allow(clippy::unnecessary_wraps)]
fn x86_64_generic_cleanup_stack<'a>(
    buf: &mut Vec<'a, u8>,
-    saved_regs: &[X86_64GeneralReg],
+    saved_general_regs: &[X86_64GeneralReg],
+    saved_float_regs: &[X86_64FloatReg],
    aligned_stack_size: i32,
    fn_call_stack_size: i32,
) {
    if aligned_stack_size > 0 {
        let mut offset = aligned_stack_size - fn_call_stack_size;
-        for reg in saved_regs {
+        for reg in saved_general_regs {
            X86_64Assembler::mov_reg64_base32(buf, *reg, -offset);
            offset -= 8;
        }
+        for reg in saved_float_regs {
+            X86_64Assembler::mov_freg64_base32(buf, *reg, -offset);
+            offset -= 8;
+        }
        X86_64Assembler::add_reg64_reg64_imm32(
            buf,
            X86_64GeneralReg::RSP,
@@ -1429,6 +1381,9 @@ fn mov_reg64_base64_offset32(
/// `MOVSD xmm1,xmm2` -> Move scalar double-precision floating-point value from xmm2 to xmm1 register.
#[inline(always)]
fn movsd_freg64_freg64(buf: &mut Vec<'_, u8>, dst: X86_64FloatReg, src: X86_64FloatReg) {
+    if dst == src {
+        return;
+    }
    let dst_high = dst as u8 > 7;
    let dst_mod = dst as u8 % 8;
    let src_high = src as u8 > 7;
@@ -2161,10 +2116,7 @@ mod tests {
        let arena = bumpalo::Bump::new();
        let mut buf = bumpalo::vec![in &arena];
        for ((dst, src), expected) in &[
-            (
-                (X86_64FloatReg::XMM0, X86_64FloatReg::XMM0),
-                vec![0xF2, 0x0F, 0x10, 0xC0],
-            ),
+            ((X86_64FloatReg::XMM0, X86_64FloatReg::XMM0), vec![]),
            (
                (X86_64FloatReg::XMM0, X86_64FloatReg::XMM15),
                vec![0xF2, 0x41, 0x0F, 0x10, 0xC7],
            ),
            (
                (X86_64FloatReg::XMM15, X86_64FloatReg::XMM0),
                vec![0xF2, 0x44, 0x0F, 0x10, 0xF8],
            ),
-            (
-                (X86_64FloatReg::XMM15, X86_64FloatReg::XMM15),
-                vec![0xF2, 0x45, 0x0F, 0x10, 0xFF],
-            ),
+            ((X86_64FloatReg::XMM15, X86_64FloatReg::XMM15), vec![]),
        ] {
            buf.clear();
            movsd_freg64_freg64(&mut buf, *dst, *src);
diff --git a/compiler/gen_dev/src/lib.rs b/compiler/gen_dev/src/lib.rs
index af382ee988..2971d30ddb 100644
--- a/compiler/gen_dev/src/lib.rs
+++ b/compiler/gen_dev/src/lib.rs
@@ -694,16 +694,8 @@ trait Backend<'a> {
    fn free_symbol(&mut self, sym: &Symbol);

    /// set_last_seen sets the statement a symbol was last seen in.
-    fn set_last_seen(
-        &mut self,
-        sym: Symbol,
-        stmt: &Stmt<'a>,
-        owning_symbol: &MutMap<Symbol, Symbol>,
-    ) {
+    fn set_last_seen(&mut self, sym: Symbol, stmt: &Stmt<'a>) {
        self.last_seen_map().insert(sym, stmt);
-        if let Some(parent) = owning_symbol.get(&sym) {
-            self.last_seen_map().insert(*parent, stmt);
-        }
    }

    /// last_seen_map gets the map from symbol to when it is last seen in the function.
@@ -749,45 +741,39 @@ trait Backend<'a> {
    /// scan_ast runs through the ast and fills the last seen map.
    /// This must iterate through the ast in the same way that build_stmt does, i.e. then before else.
    fn scan_ast(&mut self, stmt: &Stmt<'a>) {
-        // This keeps track of symbols that depend on other symbols.
-        // The main case of this is data in structures and tagged unions.
-        // This data must extend the lifetime of the original structure or tagged union.
-        // For arrays the loading is always done through low levels and does not depend on the underlying array's lifetime.
-        let mut owning_symbol: MutMap<Symbol, Symbol> = MutMap::default();
+        // Join map keeps track of join point parameters so that we can keep them around while they still might be jumped to.
+        let mut join_map: MutMap<JoinPointId, &'a [Param<'a>]> = MutMap::default();
        match stmt {
            Stmt::Let(sym, expr, _, following) => {
-                self.set_last_seen(*sym, stmt, &owning_symbol);
+                self.set_last_seen(*sym, stmt);
                match expr {
                    Expr::Literal(_) => {}
-                    Expr::Call(call) => self.scan_ast_call(call, stmt, &owning_symbol),
+                    Expr::Call(call) => self.scan_ast_call(call, stmt),
                    Expr::Tag { arguments, .. } => {
                        for sym in *arguments {
-                            self.set_last_seen(*sym, stmt, &owning_symbol);
+                            self.set_last_seen(*sym, stmt);
                        }
                    }
                    Expr::Struct(syms) => {
                        for sym in *syms {
-                            self.set_last_seen(*sym, stmt, &owning_symbol);
+                            self.set_last_seen(*sym, stmt);
                        }
                    }
                    Expr::StructAtIndex { structure, .. } => {
-                        self.set_last_seen(*structure, stmt, &owning_symbol);
-                        owning_symbol.insert(*sym, *structure);
+                        self.set_last_seen(*structure, stmt);
                    }
                    Expr::GetTagId { structure, .. } => {
-                        self.set_last_seen(*structure, stmt, &owning_symbol);
-                        owning_symbol.insert(*sym, *structure);
+                        self.set_last_seen(*structure, stmt);
                    }
                    Expr::UnionAtIndex { structure, .. } => {
-                        self.set_last_seen(*structure, stmt, &owning_symbol);
-                        owning_symbol.insert(*sym, *structure);
+                        self.set_last_seen(*structure, stmt);
                    }
                    Expr::Array { elems, ..
                    } => {
                        for elem in *elems {
                            if let ListLiteralElement::Symbol(sym) = elem {
-                                self.set_last_seen(*sym, stmt, &owning_symbol);
+                                self.set_last_seen(*sym, stmt);
                            }
                        }
                    }
@@ -797,22 +783,22 @@ trait Backend<'a> {
                        tag_name,
                        ..
                    } => {
-                        self.set_last_seen(*symbol, stmt, &owning_symbol);
+                        self.set_last_seen(*symbol, stmt);
                        match tag_name {
                            TagName::Closure(sym) => {
-                                self.set_last_seen(*sym, stmt, &owning_symbol);
+                                self.set_last_seen(*sym, stmt);
                            }
                            TagName::Private(sym) => {
-                                self.set_last_seen(*sym, stmt, &owning_symbol);
+                                self.set_last_seen(*sym, stmt);
                            }
                            TagName::Global(_) => {}
                        }
                        for sym in *arguments {
-                            self.set_last_seen(*sym, stmt, &owning_symbol);
+                            self.set_last_seen(*sym, stmt);
                        }
                    }
                    Expr::Reset { symbol, .. } => {
-                        self.set_last_seen(*symbol, stmt, &owning_symbol);
+                        self.set_last_seen(*symbol, stmt);
                    }
                    Expr::EmptyArray => {}
                    Expr::RuntimeErrorFunction(_) => {}
@@ -826,56 +812,59 @@ trait Backend<'a> {
                default_branch,
                ..
            } => {
-                self.set_last_seen(*cond_symbol, stmt, &owning_symbol);
+                self.set_last_seen(*cond_symbol, stmt);
                for (_, _, branch) in *branches {
                    self.scan_ast(branch);
                }
                self.scan_ast(default_branch.1);
            }
            Stmt::Ret(sym) => {
-                self.set_last_seen(*sym, stmt, &owning_symbol);
+                self.set_last_seen(*sym, stmt);
            }
            Stmt::Refcounting(modify, following) => {
                let sym = modify.get_symbol();
-                self.set_last_seen(sym, stmt, &owning_symbol);
+                self.set_last_seen(sym, stmt);
                self.scan_ast(following);
            }
            Stmt::Join {
                parameters,
                body: continuation,
                remainder,
+                id,
                ..
            } => {
+                join_map.insert(*id, parameters);
                for param in *parameters {
-                    self.set_last_seen(param.symbol, stmt, &owning_symbol);
+                    self.set_last_seen(param.symbol, stmt);
                }
                self.scan_ast(continuation);
                self.scan_ast(remainder);
            }
            Stmt::Jump(JoinPointId(sym), symbols) => {
-                self.set_last_seen(*sym, stmt, &owning_symbol);
+                if let Some(parameters) = join_map.get(&JoinPointId(*sym)) {
+                    // Keep the parameters around. They will be overwritten when jumping.
+                    for param in *parameters {
+                        self.set_last_seen(param.symbol, stmt);
+                    }
+                }
+                self.set_last_seen(*sym, stmt);
                for sym in *symbols {
-                    self.set_last_seen(*sym, stmt, &owning_symbol);
+                    self.set_last_seen(*sym, stmt);
                }
            }
            Stmt::RuntimeError(_) => {}
        }
    }

-    fn scan_ast_call(
-        &mut self,
-        call: &roc_mono::ir::Call,
-        stmt: &roc_mono::ir::Stmt<'a>,
-        owning_symbol: &MutMap<Symbol, Symbol>,
-    ) {
+    fn scan_ast_call(&mut self, call: &roc_mono::ir::Call, stmt: &roc_mono::ir::Stmt<'a>) {
        let roc_mono::ir::Call {
            call_type,
            arguments,
        } = call;

        for sym in *arguments {
-            self.set_last_seen(*sym, stmt, owning_symbol);
+            self.set_last_seen(*sym, stmt);
        }

        match call_type {
diff --git a/compiler/gen_dev/src/object_builder.rs b/compiler/gen_dev/src/object_builder.rs
index d8675c880d..eec99ad63f 100644
--- a/compiler/gen_dev/src/object_builder.rs
+++ b/compiler/gen_dev/src/object_builder.rs
@@ -13,6 +13,7 @@ use roc_module::symbol;
use roc_module::symbol::Interns;
use roc_mono::ir::{Proc, ProcLayout};
use roc_mono::layout::LayoutIds;
+use roc_target::TargetInfo;
use target_lexicon::{Architecture as TargetArch, BinaryFormat as TargetBF, Triple};

// This is used by some code below which is currently commented out.
@@ -38,7 +39,7 @@ pub fn build_module<'a>(
                x86_64::X86_64FloatReg,
                x86_64::X86_64Assembler,
                x86_64::X86_64SystemV,
-            >(env, interns);
+            >(env, TargetInfo::default_x86_64(), interns);
            build_object(
                procedures,
                backend,
@@ -55,7 +56,7 @@ pub fn build_module<'a>(
                x86_64::X86_64FloatReg,
                x86_64::X86_64Assembler,
                x86_64::X86_64SystemV,
-            >(env, interns);
+            >(env, TargetInfo::default_x86_64(), interns);
            build_object(
                procedures,
                backend,
@@ -76,7 +77,7 @@ pub fn build_module<'a>(
                aarch64::AArch64FloatReg,
                aarch64::AArch64Assembler,
                aarch64::AArch64Call,
-            >(env, interns);
+            >(env, TargetInfo::default_aarch64(), interns);
            build_object(
                procedures,
                backend,
@@ -93,7 +94,7 @@ pub fn build_module<'a>(
                aarch64::AArch64FloatReg,
                aarch64::AArch64Assembler,
                aarch64::AArch64Call,
-            >(env, interns);
+            >(env, TargetInfo::default_aarch64(), interns);
            build_object(
                procedures,
                backend,
diff --git a/compiler/roc_target/src/lib.rs b/compiler/roc_target/src/lib.rs
index 68269a3849..5174d4082e 100644
--- a/compiler/roc_target/src/lib.rs
+++ b/compiler/roc_target/src/lib.rs
@@ -12,6 +12,12 @@ impl TargetInfo {
        self.architecture.ptr_width()
    }

+    pub const fn default_aarch64() -> Self {
+        TargetInfo {
+            architecture: Architecture::Aarch64,
+        }
+    }
+
    pub const fn default_x86_64() -> Self {
        TargetInfo {
            architecture: Architecture::X86_64,
diff --git a/www/public/index.html b/www/public/index.html
index 832c81ef38..8b6d782d41 100644
--- a/www/public/index.html
+++ b/www/public/index.html
@@ -8,7 +8,17 @@
    The Roc Programming Language
-
+
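A note on the `movsd_freg64_freg64` change in the x86_64 assembler above: a register-to-itself `MOVSD` is a no-op, so eliding it emits zero bytes, which is why the XMM0→XMM0 and XMM15→XMM15 test vectors became empty while cross-register moves kept their encodings. The following is a standalone sketch of that encoding logic, not the Roc implementation; it uses plain `u8` register indices instead of `X86_64FloatReg` and a `std::vec::Vec` instead of the bumpalo `Vec`, but it reproduces the test bytes shown in the diff:

```rust
/// Sketch of `MOVSD xmm_dst, xmm_src` encoding (F2 [REX] 0F 10 /r),
/// skipping the instruction entirely when dst == src, as the patch does.
fn movsd_freg64_freg64(buf: &mut Vec<u8>, dst: u8, src: u8) {
    if dst == src {
        return; // moving a register to itself is a no-op: emit nothing
    }
    let dst_high = dst > 7; // XMM8..XMM15 need a REX extension bit
    let src_high = src > 7;
    buf.push(0xF2); // scalar double-precision prefix
    if dst_high || src_high {
        // REX: 0x40 | R (extends ModRM.reg = dst) | B (extends ModRM.rm = src)
        buf.push(0x40 | ((dst_high as u8) << 2) | (src_high as u8));
    }
    buf.extend_from_slice(&[0x0F, 0x10]); // MOVSD opcode
    // ModRM: mod=11 (register direct), reg=dst%8, rm=src%8
    buf.push(0xC0 | ((dst % 8) << 3) | (src % 8));
}
```

For example, XMM0←XMM15 encodes as `F2 41 0F 10 C7` and XMM15←XMM0 as `F2 44 0F 10 F8`, matching the remaining vectors in the updated test, while both self-moves now produce an empty buffer.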