Merge pull request #5120 from dankeyy/jvm-interop

JVM Interop
Brendan Hansknecht 2023-03-22 20:02:05 +00:00 committed by GitHub
commit 913f2e7ae2
7 changed files with 719 additions and 0 deletions

7
examples/jvm-interop/.gitignore vendored Normal file

@ -0,0 +1,7 @@
*.log
*.class
*.o
*.so
*.jar
libhello*
*.h


@ -0,0 +1,210 @@
# JVM interop
This is a demo for calling Roc code from Java, and some other JVM languages.
## Prerequisites
The following was tested on NixOS, with `openjdk 17.0.5` and `clang 13.0.1`, but should work with most recent versions of those (jdk>=10) on most modern Linux and macOS systems.\
You're welcome to test on your machine and tell me (via [Zulip](https://roc.zulipchat.com/#narrow/pm-with/583319-dank)) if you run into any issues or limitations.
## Goal
We'll make a few examples, showing basic data type conversions and function calls between Roc and Java (and later some other JVM languages):
- A string formatter.
- A function that multiplies an array by a scalar.
- A factorial function that, for the sake of demonstration, throws a RuntimeException for negative integers.
This will be done with the help of [Java Native Interface](https://docs.oracle.com/javase/8/docs/technotes/guides/jni/).
We will be using C to bridge between Java and Roc.
## Structure
At the time of writing, this is the bare-bones tree of the jvm-interop example:
``` console
.
├── impl.roc
├── platform.roc
├── bridge.c
└── javaSource
    └── Demo.java
```
impl.roc is the application where we actually implement our native Roc functions.\
platform.roc, as the name suggests, contains the platform logic; there isn't much of it here, as it mostly just exposes functions to the host, bridge.c.\
bridge.c is the JNI bridge: it is the host that implements the functions Roc requires (e.g. roc_alloc) and the JNI functions that bridge Roc and Java (type conversions between the languages, the needed JVM boilerplate, etc.).
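
To make the "type conversions" part more concrete, here is a minimal sketch of the list layout that bridge.c builds when it copies a Java `int[]` into a Roc `List I32`: one refcount word, followed directly by the elements. The names `ListI32Sketch`, `list_from_ints`, `list_refcount` and `list_free` are hypothetical and exist only for this illustration; plain `malloc` stands in for `roc_alloc`, and no JNI headers are needed to try it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the three-word list struct used in bridge.c (hypothetical name). */
struct ListI32Sketch {
    int32_t *elements; /* points just past the refcount word */
    size_t len;
    size_t capacity;
};

/* Same sentinel bridge.c uses for "exactly one reference". */
static const ptrdiff_t SKETCH_REFCOUNT_ONE = PTRDIFF_MIN;

/* Copy `len` ints into a fresh buffer laid out as [refcount][elements...]. */
struct ListI32Sketch list_from_ints(const int32_t *src, size_t len)
{
    struct ListI32Sketch list = {NULL, 0, 0};
    if (len == 0) {
        return list; /* empty lists carry no allocation */
    }
    /* The size is in bytes: one refcount word plus len 4-byte elements. */
    ptrdiff_t *block = malloc(sizeof(ptrdiff_t) + len * sizeof(int32_t));
    if (block == NULL) {
        return list;
    }
    block[0] = SKETCH_REFCOUNT_ONE;
    list.elements = (int32_t *)(block + 1);
    memcpy(list.elements, src, len * sizeof(int32_t));
    list.len = len;
    list.capacity = len;
    return list;
}

/* Read the refcount word stored immediately before the first element. */
ptrdiff_t list_refcount(struct ListI32Sketch list)
{
    return ((ptrdiff_t *)list.elements)[-1];
}

/* Release the whole block, starting at the refcount word. */
void list_free(struct ListI32Sketch list)
{
    if (list.elements != NULL) {
        free(((ptrdiff_t *)list.elements) - 1);
    }
}
```

The point to take away is that the pointer stored in the struct is not the pointer returned by the allocator, which is why the `decref` helper in bridge.c steps back one word before freeing.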
For each native Roc function in the application (impl.roc), there is a corresponding `Java_javaSource_Demo_FUNC` C function that handles the behind-the-scenes work: type conversions between the languages, turning Roc panics into Java exceptions, and all the other glue code.
Just so you know what to expect, our Roc functions look like this:
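
The `Java_javaSource_Demo_FUNC` names are not arbitrary: the JVM derives the C symbol it looks up from the package, class and method name. The sketch below reproduces that naming rule for the simple case; `jni_symbol_name` and its helpers are hypothetical names for this illustration, and the code assumes names made only of ASCII letters, digits, `.` and `_` (it ignores overloaded methods and non-ASCII characters, which the JNI rules escape differently).

```c
#include <stddef.h>
#include <string.h>

/* Append `text` to `out` at *pos, keeping `out` NUL-terminated. */
static int append_raw(char *out, size_t out_size, size_t *pos, const char *text)
{
    size_t n = strlen(text);
    if (*pos + n + 1 > out_size) {
        return -1; /* not enough room for the text plus the terminator */
    }
    memcpy(out + *pos, text, n);
    *pos += n;
    out[*pos] = '\0';
    return 0;
}

/* Append `name`, turning package separators into '_' and '_' into "_1". */
static int append_mangled(char *out, size_t out_size, size_t *pos, const char *name)
{
    for (; *name != '\0'; name++) {
        char single[2] = {*name, '\0'};
        const char *piece = single;
        if (*name == '.' || *name == '/') {
            piece = "_";
        } else if (*name == '_') {
            piece = "_1";
        }
        if (append_raw(out, out_size, pos, piece) != 0) {
            return -1;
        }
    }
    return 0;
}

/*
 * Build "Java_<class>_<method>" for a fully qualified class name.
 * Returns 0 on success, -1 if `out` is too small.
 */
int jni_symbol_name(const char *qualified_class, const char *method,
                    char *out, size_t out_size)
{
    size_t pos = 0;
    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';
    if (append_raw(out, out_size, &pos, "Java_") != 0) return -1;
    if (append_mangled(out, out_size, &pos, qualified_class) != 0) return -1;
    if (append_raw(out, out_size, &pos, "_") != 0) return -1;
    return append_mangled(out, out_size, &pos, method);
}
```

For example, calling it with `"javaSource.Demo"` and `"sayHello"` produces `Java_javaSource_Demo_sayHello`. In practice you don't compute these by hand, since `javac -h` writes the prototypes into the generated header.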
``` coffee
interpolateString : Str -> Str
interpolateString = \name ->
    "Hello from Roc \(name)!!!🤘🤘🤘"

mulArrByScalar : List I32, I32 -> List I32
mulArrByScalar = \arr, scalar ->
    List.map arr \x -> x * scalar

factorial : I64 -> I64
factorial = \n ->
    if n < 0 then
        # while we get the chance, exemplify a roc panic in an interop
        crash "No negatives here!!!"
    else if n == 0 then
        1
    else
        n * (factorial (n - 1))
```
Nothing too crazy. Again, do note how we crash if n < 0; we'll see how this plays out from the Java side.
Now let's take a quick look at the Java side of things:
``` java
package javaSource;
import java.util.Arrays;
public class Demo {
static {
System.loadLibrary("interop");
}
public static native String sayHello(String name);
public static native int[] mulArrByScalar(int[] arr, int scalar);
public static native long factorial(long n) throws RuntimeException;
public static void main(String[] args) {
// string demo
System.out.println(sayHello("Brendan") + "\n");
// array demo
int[] arr = {10, 20, 30, 40};
int x = 3;
System.out.println(Arrays.toString(arr) +
" multiplied by " + x +
" results in " + Arrays.toString(mulArrByScalar(arr, x)) +
"\n");
// number + panic demo
long n = 5;
System.out.println("Factorial of " + n + " is " + factorial(n));
}
}
```
First we load our library - "interop", which is a shared library (`.so` file) that our Roc+C code compiles to.\
Then, we declare our native functions with suitable types and a `throws` clause.\
Finally in main we test it out with some inputs.
## See it in action
##### For brevity's sake we'll run the build script and omit some of its (intentionally) verbose output:
```console
[nix-shell:~/dev/roc/examples/jvm-interop]$ ./build.sh && java javaSource.Demo
Hello from Roc Brendan!!!🤘🤘🤘
[10, 20, 30, 40] multiplied by 3 results in [30, 60, 90, 120]
Factorial of 5 is 120
```
That's pretty cool!\
Let's also see what happens if in the code above we define n to be -1:
``` console
[nix-shell:~/dev/roc/examples/jvm-interop]$ ./build.sh && java javaSource.Demo
Hello from Roc Brendan!!!🤘🤘🤘
[10, 20, 30, 40] multiplied by 3 results in [30, 60, 90, 120]
Exception in thread "main" java.lang.RuntimeException: No negatives here!!!
at javaSource.Demo.factorial(Native Method)
at javaSource.Demo.main(Demo.java:36)
```
And as we expected, it runs the first two examples fine and throws a RuntimeException on the third.
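
How does a Roc `crash` end up as a Java exception? bridge.c does it with `setjmp`/`longjmp`: the JNI function sets a jump point before calling into Roc, and `roc_panic` stores the message and jumps back to it. The following is a minimal sketch of just that control flow, with no JVM or Roc involved; every name in it (`sketch_panic`, `sketch_factorial`, `try_factorial`, `last_error`) is hypothetical, and an error code stands in for the `ThrowNew` call.

```c
#include <setjmp.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MSG_MAX 256

static jmp_buf sketch_recover_point;
static char sketch_last_error[SKETCH_MSG_MAX];

/* Stand-in for roc_panic: record the message, then unwind to the setjmp site. */
static void sketch_panic(const char *msg)
{
    strncpy(sketch_last_error, msg, SKETCH_MSG_MAX - 1);
    sketch_last_error[SKETCH_MSG_MAX - 1] = '\0';
    longjmp(sketch_recover_point, 1);
}

/* Stand-in for the Roc factorial: panics on negative input. */
static int64_t sketch_factorial(int64_t n)
{
    if (n < 0) {
        sketch_panic("No negatives here!!!");
    }
    return n == 0 ? 1 : n * sketch_factorial(n - 1);
}

/* Returns 0 and stores the result in *out, or -1 if the callee panicked. */
int try_factorial(int64_t n, int64_t *out)
{
    if (setjmp(sketch_recover_point) != 0) {
        /* We arrive here via longjmp; bridge.c throws the Java exception at this point. */
        return -1;
    }
    *out = sketch_factorial(n);
    return 0;
}

/* Message recorded by the most recent panic. */
const char *last_error(void)
{
    return sketch_last_error;
}
```

With this sketch, `try_factorial(5, &r)` returns 0 with `r == 120`, while `try_factorial(-1, &r)` returns -1 and leaves the panic message available through `last_error()`. Note that a single global jump buffer, as used here and in bridge.c, is not safe if several threads call into the library at once.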
Since we're talking JVM Bytecode, we can pretty much call our native function from any language that speaks JVM Bytecode.
Note: The JNI code depends on a dynamic lib, containing our native implementation, that now resides in our working directory.\
So in the following examples, we'll make sure that our working directory is in LD_LIBRARY_PATH.\
Generally speaking, you'd probably put your dynamic library somewhere that's already on your library path, for convenience's sake.\
So first, we run:
```console
[nix-shell:~/dev/roc/examples/jvm-interop]$ export LD_LIBRARY_PATH=$(pwd):$LD_LIBRARY_PATH
```
Now, let's try Kotlin!
```console
[nix-shell:~/dev/roc/examples/jvm-interop]$ kotlin
Welcome to Kotlin version 1.7.20 (JRE 17.0.5+8-nixos)
Type :help for help, :quit for quit
>>> import javaSource.Demo
>>> Demo.sayHello("Kotlin Users")
res1: kotlin.String = Hello from Roc Kotlin Users!!!🤘🤘🤘
>>> Demo.mulArrByScalar(intArrayOf(10, 20, 30, 40), 101).contentToString()
res2: kotlin.String = [1010, 2020, 3030, 4040]
>>> Demo.factorial(10)
res3: kotlin.Long = 3628800
```
And it just works, out of the box!
Now let's do Scala
```console
[nix-shell:~/dev/roc/examples/jvm-interop]$ scala
Welcome to Scala 2.13.10 (OpenJDK 64-Bit Server VM, Java 17.0.5).
Type in expressions for evaluation. Or try :help.
scala> import javaSource.Demo
import javaSource.Demo
scala> Demo.sayHello("Scala Users")
val res0: String = Hello from Roc Scala Users!!!🤘🤘🤘
scala> Demo.mulArrByScalar(Array(10, 20, 30, 40), 1001)
val res1: Array[Int] = Array(10010, 20020, 30030, 40040)
scala> Demo.factorial(-2023)
java.lang.RuntimeException: No negatives here!!!
at javaSource.Demo.factorial(Native Method)
... 32 elided
```
And it also works beautifully.
Last one - Clojure
Do note that in Clojure you need to add a `-Sdeps '{:paths ["."]}'` flag to add the working directory to paths.
``` console
[nix-shell:~/dev/roc/examples/jvm-interop]$ clj -Sdeps '{:paths ["."]}'
Clojure 1.11.1
user=> (import 'javaSource.Demo)
javaSource.Demo
user=> (Demo/sayHello "Clojure Users")
"Hello from Roc Clojure Users!!!🤘🤘🤘"
user=> (seq (Demo/mulArrByScalar (int-array [10 20 30]) 9)) ; seq to pretty-print
(90 180 270)
user=> (Demo/factorial 15)
1307674368000
```
Test it out on your favorite JVM lang!\
And again, if anything doesn't go according to plan, tell me via the Zulip link above and we'll figure it out.
## Notes on building
The process is basically the following:
1. Build our application and platform .roc files into an object file with `roc build impl.roc --no-link`.
2. Generate a C header file (for bridge.c) with `javac -h`.
3. Bundle up the C bridge together with our object file into a shared object.
And that's it: use that shared object from your JVM language. Note that every JVM language has its own way of declaring native methods, so you may want to look that up, or do as in the demo: declare them in Java and use the binding from anywhere.
I suggest reading the build script (build.sh) and adjusting according to your setup.


@ -0,0 +1,382 @@
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <stdalign.h>
#include <stdint.h>
#include <setjmp.h>
#include <jni.h>
#include "javaSource_Demo.h"
JavaVM* vm;
#define ERR_MSG_MAX_SIZE 256
jmp_buf exception_buffer;
// a byte buffer (not an array of pointers); the extra byte keeps it NUL-terminated
char err_msg[ERR_MSG_MAX_SIZE + 1] = {0};
jint JNI_OnLoad(JavaVM *loadedVM, void *reserved)
{
// https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html
vm = loadedVM;
return JNI_VERSION_1_2;
}
void *roc_alloc(size_t size, unsigned int alignment)
{
return malloc(size);
}
void *roc_realloc(void *ptr, size_t new_size, size_t old_size,
unsigned int alignment)
{
return realloc(ptr, new_size);
}
void roc_dealloc(void *ptr, unsigned int alignment)
{
free(ptr);
}
void *roc_memcpy(void *dest, const void *src, size_t n)
{
return memcpy(dest, src, n);
}
void *roc_memset(void *str, int c, size_t n)
{
return memset(str, c, n);
}
// Reference counting
// If the refcount is set to this, that means the allocation is
// stored in readonly memory in the binary, and we must not
// attempt to increment or decrement it; if we do, we'll segfault!
const ssize_t REFCOUNT_READONLY = 0;
const ssize_t REFCOUNT_ONE = (ssize_t)PTRDIFF_MIN;
const size_t MASK = (size_t)PTRDIFF_MIN;
// Increment reference count, given a pointer to the first element in a collection.
// We don't need to check for overflow because in order to overflow a usize worth of refcounts,
// you'd need to somehow have more pointers in memory than the OS's virtual address space can hold.
void incref(uint8_t* bytes, uint32_t alignment)
{
ssize_t *refcount_ptr = ((ssize_t *)bytes) - 1;
ssize_t refcount = *refcount_ptr;
if (refcount != REFCOUNT_READONLY) {
*refcount_ptr = refcount + 1;
}
}
// Decrement reference count, given a pointer to the first element in a collection.
// Then call roc_dealloc if nothing is referencing this collection anymore.
void decref(uint8_t* bytes, uint32_t alignment)
{
if (bytes == NULL) {
return;
}
size_t extra_bytes = (sizeof(size_t) >= (size_t)alignment) ? sizeof(size_t) : (size_t)alignment;
ssize_t *refcount_ptr = ((ssize_t *)bytes) - 1;
ssize_t refcount = *refcount_ptr;
if (refcount != REFCOUNT_READONLY) {
*refcount_ptr = refcount - 1;
if (refcount == REFCOUNT_ONE) {
// extra_bytes is a byte count, so step back in bytes rather than in ssize_t elements
void *original_allocation = (void *)((uint8_t *)refcount_ptr - (extra_bytes - sizeof(size_t)));
roc_dealloc(original_allocation, alignment);
}
}
}
struct RocListI32
{
int32_t *bytes;
size_t len;
size_t capacity;
};
struct RocListI32 init_roclist_i32(int32_t *bytes, size_t len)
{
if (len == 0)
{
struct RocListI32 ret = {
.len = 0,
.bytes = NULL,
.capacity = 0,
};
return ret;
}
else
{
size_t refcount_size = sizeof(size_t);
// allocate room for len 4-byte elements (not len bytes) plus the refcount word
ssize_t* data = (ssize_t*)roc_alloc(len * sizeof(int32_t) + refcount_size, alignof(size_t));
data[0] = REFCOUNT_ONE;
int32_t *new_content = (int32_t *)(data + 1);
struct RocListI32 ret;
memcpy(new_content, bytes, len * sizeof(int32_t));
ret.bytes = new_content;
ret.len = len;
ret.capacity = len;
return ret;
}
}
// RocListU8 (List U8)
struct RocListU8
{
uint8_t *bytes;
size_t len;
size_t capacity;
};
struct RocListU8 init_roclist_u8(uint8_t *bytes, size_t len)
{
if (len == 0)
{
struct RocListU8 ret = {
.len = 0,
.bytes = NULL,
.capacity = 0,
};
return ret;
}
else
{
size_t refcount_size = sizeof(size_t);
ssize_t* data = (ssize_t*)roc_alloc(len + refcount_size, alignof(size_t));
data[0] = REFCOUNT_ONE;
uint8_t *new_content = (uint8_t *)(data + 1);
struct RocListU8 ret;
memcpy(new_content, bytes, len * sizeof(uint8_t));
ret.bytes = new_content;
ret.len = len;
ret.capacity = len;
return ret;
}
}
// RocStr
struct RocStr
{
uint8_t *bytes;
size_t len;
size_t capacity;
};
struct RocStr init_rocstr(uint8_t *bytes, size_t len)
{
if (len < sizeof(struct RocStr))
{
// Start out with zeroed memory, so that
// if we end up comparing two small RocStr values
// for equality, we won't risk memory garbage resulting
// in two equal strings appearing unequal.
struct RocStr ret = {
.len = 0,
.bytes = NULL,
.capacity = 0,
};
// Copy the bytes into the stack allocation
memcpy(&ret, bytes, len);
// Record the string's len in the last byte of the stack allocation
((uint8_t *)&ret)[sizeof(struct RocStr) - 1] = (uint8_t)len | 0b10000000;
return ret;
}
else
{
// A large RocStr is the same as a List U8 (aka RocListU8) in memory.
struct RocListU8 roc_bytes = init_roclist_u8(bytes, len);
struct RocStr ret = {
.len = roc_bytes.len,
.bytes = roc_bytes.bytes,
.capacity = roc_bytes.capacity,
};
return ret;
}
}
bool is_small_str(struct RocStr str)
{
return ((ssize_t)str.capacity) < 0;
}
bool is_seamless_str_slice(struct RocStr str)
{
return ((ssize_t)str.len) < 0;
}
bool is_seamless_listi32_slice(struct RocListI32 list)
{
return ((ssize_t)list.capacity) < 0;
}
// Determine the len of the string, taking into
// account the small string optimization
size_t roc_str_len(struct RocStr str)
{
uint8_t *bytes = (uint8_t *)&str;
uint8_t last_byte = bytes[sizeof(str) - 1];
uint8_t last_byte_xored = last_byte ^ 0b10000000;
size_t small_len = (size_t)(last_byte_xored);
size_t big_len = str.len;
// Avoid branch misprediction costs by always
// determining both small_len and big_len,
// so this compiles to a cmov instruction.
if (is_small_str(str))
{
return small_len;
}
else
{
return big_len;
}
}
__attribute__((noreturn)) void roc_panic(struct RocStr *msg, unsigned int tag_id)
{
char* bytes = is_small_str(*msg) ? (char*)msg : (char*)msg->bytes;
const size_t str_len = roc_str_len(*msg);
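// Roc strings are not NUL-terminated, so copy the bytes and terminate explicitly
size_t len = str_len >= ERR_MSG_MAX_SIZE ? ERR_MSG_MAX_SIZE - 1 : str_len;
memcpy((char*)err_msg, bytes, len);
((char*)err_msg)[len] = '\0';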
// Free the underlying allocation if needed.
if (!is_small_str(*msg)) {
if (is_seamless_str_slice(*msg)){
decref((uint8_t *)(msg->capacity << 1), alignof(uint8_t *));
}
else {
decref(msg->bytes, alignof(uint8_t *));
}
}
longjmp(exception_buffer, 1);
}
extern void roc__programForHost_1__InterpolateString_caller(struct RocStr *name, char *closure_data, struct RocStr *ret);
extern void roc__programForHost_1__MulArrByScalar_caller(struct RocListI32 *arr, int32_t *scalar, char *closure_data, struct RocListI32 *ret);
extern void roc__programForHost_1__Factorial_caller(int64_t *scalar, char *closure_data, int64_t *ret);
JNIEXPORT jstring JNICALL Java_javaSource_Demo_sayHello
(JNIEnv *env, jobject thisObj, jstring name)
{
const char *jnameChars = (*env)->GetStringUTFChars(env, name, 0);
// we copy just in case the jvm would try to reclaim that mem
uint8_t *cnameChars = (uint8_t *)strdup(jnameChars);
// byte length of the UTF chars; GetStringLength would count UTF-16 code units instead
size_t nameLength = (size_t) (*env)->GetStringUTFLength(env, name);
(*env)->ReleaseStringUTFChars(env, name, jnameChars);
struct RocStr rocName = init_rocstr(cnameChars, nameLength);
struct RocStr ret = {0};
// Call the Roc function to populate `ret`'s bytes.
roc__programForHost_1__InterpolateString_caller(&rocName, 0, &ret);
jbyte *bytes = (jbyte*)(is_small_str(ret) ? (uint8_t*)&ret : ret.bytes);
// java being java making this a lot harder than it needs to be
// https://stackoverflow.com/questions/32205446/getting-true-utf-8-characters-in-java-jni
// https://docs.oracle.com/javase/1.5.0/docs/guide/jni/spec/types.html#wp16542
// but as i refuse converting those manually to their correct form, we just let the jvm handle the conversion
// by first making a java byte array then converting the byte array to our final jstring
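// roc_str_len also handles small strings, whose length is not stored in ret.len
jsize retLen = (jsize)roc_str_len(ret);
jbyteArray byteArray = (*env)->NewByteArray(env, retLen);
(*env)->SetByteArrayRegion(env, byteArray, 0, retLen, bytes);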
jstring charsetName = (*env)->NewStringUTF(env, "UTF-8");
jclass stringClass = (*env)->FindClass(env, "java/lang/String");
// https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/doc-files/signature.html
jmethodID stringConstructor = (*env)->GetMethodID(env, stringClass, "<init>", "([BLjava/lang/String;)V");
jstring result = (*env)->NewObject(env, stringClass, stringConstructor, byteArray, charsetName);
// cleanup
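// small strings live inline in the struct, so there is no heap allocation to release
if (!is_small_str(ret) && !is_seamless_str_slice(ret)) {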
decref(ret.bytes, alignof(uint8_t *));
}
(*env)->DeleteLocalRef(env, charsetName);
(*env)->DeleteLocalRef(env, byteArray);
free(cnameChars);
return result;
}
JNIEXPORT jintArray JNICALL Java_javaSource_Demo_mulArrByScalar
(JNIEnv *env, jobject thisObj, jintArray arr, jint scalar)
{
// extract data from jvm types
jint* jarr = (*env)->GetIntArrayElements(env, arr, NULL);
jsize len = (*env)->GetArrayLength(env, arr);
// pass data to platform
struct RocListI32 originalArray = init_roclist_i32(jarr, len);
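// no incref here: incref expects a pointer to the first element, not to the struct,
// and init_roclist_i32 already set the refcount to one (sayHello passes rocName the same way)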
struct RocListI32 ret = {0};
roc__programForHost_1__MulArrByScalar_caller(&originalArray, &scalar, 0, &ret);
// create jvm constructs
jintArray multiplied = (*env)->NewIntArray(env, ret.len);
(*env)->SetIntArrayRegion(env, multiplied, 0, ret.len, (jint*) ret.bytes);
// cleanup
(*env)->ReleaseIntArrayElements(env, arr, jarr, 0);
if (is_seamless_listi32_slice(ret)) {
decref((void *)(ret.capacity << 1), alignof(uint8_t *));
}
else {
decref((void *)ret.bytes, alignof(uint8_t *));
}
return multiplied;
}
JNIEXPORT jlong JNICALL Java_javaSource_Demo_factorial
(JNIEnv *env, jobject thisObj, jlong num)
{
int64_t ret;
// can crash - meaning call roc_panic, so we set a jump here
if (setjmp(exception_buffer)) {
// exception was thrown, handle it
jclass exClass = (*env)->FindClass(env, "java/lang/RuntimeException");
const char *msg = (const char *)err_msg;
return (*env)->ThrowNew(env, exClass, msg);
}
else {
int64_t n = (int64_t)num;
roc__programForHost_1__Factorial_caller(&n, 0, &ret);
return ret;
}
}

42
examples/jvm-interop/build.sh Executable file

@ -0,0 +1,42 @@
#!/usr/bin/env bash
# https://vaneyckt.io/posts/safer_bash_scripts_with_set_euxo_pipefail/
set -euxo pipefail
# don't forget to validate that $JAVA_HOME is defined, the following would not work without it!
# set it either globally or here
# export JAVA_HOME=/your/java/installed/dir
# in nixos, to set it globally, i needed to say `programs.java.enable = true;` in `/etc/nixos/configuration.nix`
# if roc is in your path, you could
# roc build impl.roc --no-link
# else, assuming in roc repo and that you ran `cargo run --release`
../../target/release/roc build impl.roc --no-link
# make jvm look here to see libinterop.so
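# (the ":+" form avoids an "unbound variable" error under `set -u` when LD_LIBRARY_PATH is unset)
export LD_LIBRARY_PATH="$(pwd)${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# needs jdk10 +
# "-h ." writes the generated JNI header (javaSource_Demo.h) to the cwd.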
# the "javaSource" directory may seem redundant (why not just a plain java file),
# but this is the way of java packaging
# we could go without it with an "implicit" package, but that would ache later on,
# especially with other JVM langs
javac -h . javaSource/Demo.java
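# for macos: use -I"$JAVA_HOME/include/darwin" instead of the linux include dir,
# and -shared -o libinterop.dylib instead of libinterop.so
# (a comment placed between the continued lines below would end the command early)
clang \
    -g -Wall \
    -fPIC \
    -I"$JAVA_HOME/include" \
    -I"$JAVA_HOME/include/linux" \
    -shared -o libinterop.so \
    rocdemo.o bridge.c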
# then run
java javaSource.Demo


@ -0,0 +1,26 @@
app "rocdemo"
    packages { pf: "platform.roc" }
    imports []
    provides [program] to pf

interpolateString : Str -> Str
interpolateString = \name ->
    "Hello from Roc \(name)!!!🤘🤘🤘"

# jint is i32
mulArrByScalar : List I32, I32 -> List I32
mulArrByScalar = \arr, scalar ->
    List.map arr \x -> x * scalar

# java doesn't have unsigned numbers so we cope with long
# factorial : I64 -> I64
factorial = \n ->
    if n < 0 then
        # while we get the chance, exemplify a roc panic in an interop
        crash "No negatives here!!!"
    else if n == 0 then
        1
    else
        n * (factorial (n - 1))

program = { interpolateString, factorial, mulArrByScalar }


@ -0,0 +1,39 @@
package javaSource;
import java.util.Arrays;
public class Demo {
static {
System.loadLibrary("interop");
}
public static native String sayHello(String name);
public static native int[] mulArrByScalar(int[] arr, int scalar);
public static native long factorial(long n) throws RuntimeException;
public static void main(String[] args) {
// string demo
System.out.println(sayHello("Brendan") + "\n");
// array demo
int[] arr = {10, 20, 30, 40};
int x = 3;
System.out.println(Arrays.toString(arr) +
" multiplied by " + x +
" results in " + Arrays.toString(mulArrByScalar(arr, x)) +
"\n");
// number + panic demo
// This can be implemented more peacefully but for sake of demonstration-
// this will panic from the roc side if n is negative
// and in turn will throw a JVM RuntimeException
long n = -1;
System.out.println("Factorial of " + n + " is " + factorial(n));
}
}


@ -0,0 +1,13 @@
platform "jvm-interop"
    requires {} { program : _ }
    exposes []
    packages {}
    imports []
    provides [programForHost]

programForHost : {
    interpolateString : (Str -> Str) as InterpolateString,
    mulArrByScalar : (List I32, I32 -> List I32) as MulArrByScalar,
    factorial : (I64 -> I64) as Factorial,
}
programForHost = program