OpenJDK project Panama hacks

I had a look at the OpenJDK project panama last week. It's a way to replace using JNI to bind Java to native libraries by using plain old java which through annotations can be bound to the native libraries at runtime inside the jvm.

Actually I had no idea how it worked just that one uses a tool called jextract to convert header files into a jar file which you can then link against. However I wasn't really happy with the way it works (and it sounds like it's being changed for mostly those reasons), so i ended up 'reverse engineering' the tool and writing my own.

I probably should've looked harder for documentation but I didn't find much that specified the details so I just ran jextract on some files and tried to generate the same output.

I created a project page and a repository and dubbed it 'panamaz' in an unimaginative following of tradition.

export.cc

The first part of the problem is getting an accurate dump of the c types and prototypes you want to use. jextract is built using clang, but I've had a tiny bit of experience poking about that and I really didn't like what I saw so I thought I'd try using a gcc plugin.

I didn't look very far but i found a couple of bits of example code and opened up gcc-internals.info and got to work. And it was a lot of work. I can see why ICEs were once so common in gcc - you get no compile time errors for bad type assumptions of the syntax tree, and when you do the wrong thing sometimes it will work anyway. The documentation isn't completely accurate either, which is pretty annoying considering how complex the system is.

But anyway in the end of I have a nice tool that can dump considerable information about c header files. The output format is a better, more usable, canonically-decoding version of 'json', or what one might term 'Perl, son'.

The following simple example:

struct bob {
    int a;
    float b;
};

Produces the descriptor file:

%data = (
'struct:bob' => { name => 'bob', type => 'struct', size => 64, fields => [
        { name => 'a', size => 32, offset => 0, ctype => 'int', type => 'i32',},
        { name => 'b', size => 32, offset => 32, ctype => 'float', type => 'f32',},
]},
# dumped structs:
# bob
);

generate

This is a (mostly!) better-than-i-usually-write bit of perl which takes the 'perl-son', and turns it into a bunch of java files. These are all interfaces which have the correct annotations to be used by the project-panama jdk to perform runtime linking.

It's got a lot of flexible options but the one I like is that you can specify the set of functions and/or types you want and it will recursively drag in all the types they depend on automatically, and only those. At least for most of it, I still haven't done the same for callbacks yet because they're a bit fiddly.

It ignores enums for now because I just ran out of juice but it covers all the other major parts of the language and even some of the lesser parts.

panama jdk

Most of the information is conveyed in signature strings which are passed to the jdk via annotations in interfaces. The simple example above is translated to an interface with the @NativeStruct annotation that includes the two field names and their types in order.

import java.foreign.annotations.*;
import java.foreign.memory.*;
@NativeStruct(value="[i32(a)f32(b)](bob)", resolutionContext={})
public interface Bob extends Struct {
        @NativeGetter(value="a")
        public int getA();
        @NativeSetter(value="a")
        public void setA(int value);
        @NativeGetter(value="b")
        public float getB();
        @NativeSetter(value="b")
        public void setB(float value);
}

Accessors can be named anything (jextract uses the uber-friendly notation of name$set(...) and so forth) but the field they operate on is indicated using the @NativeGetter or @NativeSetter attribute and specifying the field name, which is defined in the @NativeStruct annotation. There's also an address-of annotation for a rather verbose equivalent of the & operator.

I didn't come across a document describing this format (and I didn't look beyond the annotation source code) but I worked it out by seeing what jextract did. Most of it is pretty straightforward.

i32, i64, ...	Signed integer of the given bit width.
u32, u64, ...	Unsigned integer of the given bit width.
f32, f64	IEEE float of the given bit width.
x8, x16, ...	Padding, artibrary number of bits.
${type}	Named compound type
=u64[many of iX, uX, xX]	Bitfields in long (or u32= for int), where X is an arbitrary number not limited to multiples of 8.
u64:	64-bit pointer
u64(name):(argtypes)returntype	Function signature
[type(fieldname)type(fieldname)...](name)	struct name
[type(fieldname)\|type(fieldname)\|...](name)	union name
	... and others

The bitfield one is a bit odd, I'm not sure why one has to specify an internal detail such as the actual word-size used for storage, but I don't use bitfields much. Perhaps it's to supports some alignment/packing __attribute__ in bitfields which is used to map hardware registers, although that seems rather obtuse usage for Java api bindings.

So basically I kept banging at the exporter and the generator until I got something working, then tried to compile it and then fixed those problems. Then tried instantiating the object and fixing the breakage. Then tried another type and so on and kept going until it all worked. And then I rewrote the exporter almost entirely to clean up the code. And then fixed the new bugs. Then created a simple example.

Then created a repository and a project page, then sent an email to the panama-dev list, then wrote a blog post ...

About Me

Tags

OpenJDK project Panama hacks

export.cc

generate

panama jdk