OpenJDK project Panama hacks
I had a look at OpenJDK's Project Panama last week. It aims to replace JNI as the way to bind Java to native libraries: you write plain old Java interfaces which, through annotations, are bound to the native libraries at runtime inside the JVM.
Actually I had no idea how it worked, just that one uses a tool called jextract to convert header files into a jar file which you can then link against. However I wasn't really happy with the way it works (and it sounds like it's being changed for mostly those reasons), so I ended up 'reverse engineering' the tool and writing my own.
I probably should've looked harder for documentation but I didn't find much that specified the details so I just ran jextract on some files and tried to generate the same output.
I created a project page and a repository and dubbed it 'panamaz' in an unimaginative following of tradition.
export.cc
The first part of the problem is getting an accurate dump of the C types and prototypes you want to use. jextract is built using clang, but I've had a tiny bit of experience poking about in that and I really didn't like what I saw, so I thought I'd try using a gcc plugin.
I didn't look very far but I found a couple of bits of example code, opened up gcc-internals.info and got to work. And it was a lot of work. I can see why ICEs (internal compiler errors) were once so common in gcc: you get no compile-time errors for bad type assumptions about the syntax tree, and when you do the wrong thing sometimes it will work anyway. The documentation isn't completely accurate either, which is pretty annoying considering how complex the system is.
But anyway, in the end I have a nice tool that can dump considerable information about C header files. The output format is a better, more usable, canonically-decoding version of 'json', or what one might term 'Perl, son'.
The following simple example:
struct bob { int a; float b; };
Produces the descriptor file:
%data = (
'struct:bob' => { name => 'bob', type => 'struct', size => 64, fields => [
    { name => 'a', size => 32, offset => 0, ctype => 'int', type => 'i32',},
    { name => 'b', size => 32, offset => 32, ctype => 'float', type => 'f32',},
]},
# dumped structs:
# bob
);
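The sizes and offsets in the descriptor are in bits. As a rough sketch of where such numbers come from, the Java below lays out scalar fields under the assumption that each one is aligned to its own size and that the struct is padded to its largest alignment. That holds for the int/float example above on common ABIs, but it is only an illustration: the exporter reads the real layout from gcc rather than computing it, and the `BitLayout` class and its methods are hypothetical names.

```java
import java.util.Arrays;

final class BitLayout {
    /** Rounds value up to the next multiple of align (align must be > 0). */
    static int alignUp(int value, int align) {
        return (value + align - 1) / align * align;
    }

    /**
     * Takes scalar field sizes in bits and returns each field's offset in
     * bits, followed by the total struct size in bits as the last element.
     * Assumption: every scalar is aligned to its own size.
     */
    static int[] layout(int[] sizes) {
        int[] out = new int[sizes.length + 1];
        int offset = 0;
        int maxAlign = 8; // never pad to less than one byte
        for (int i = 0; i < sizes.length; i++) {
            int align = sizes[i];
            offset = alignUp(offset, align);
            out[i] = offset;
            offset += sizes[i];
            maxAlign = Math.max(maxAlign, align);
        }
        out[sizes.length] = alignUp(offset, maxAlign);
        return out;
    }

    public static void main(String[] args) {
        // struct bob { int a; float b; } with 32-bit int and float
        System.out.println(Arrays.toString(layout(new int[] {32, 32})));
    }
}
```

For the two 32-bit fields of bob this gives offsets 0 and 32 and a total size of 64, matching the descriptor.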
generate
This is a (mostly!) better-than-I-usually-write bit of Perl which takes the 'perl-son' and turns it into a bunch of Java files. These are all interfaces which have the correct annotations to be used by the Project Panama JDK to perform runtime linking.
It's got a lot of flexible options but the one I like is that you can specify the set of functions and/or types you want and it will recursively drag in all the types they depend on automatically, and only those. At least for most of it: I still haven't done the same for callbacks because they're a bit fiddly.
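The "drag in everything they depend on, and only that" step amounts to a reachability walk over the type graph. The generator itself is Perl; the following is just a minimal Java sketch of the idea, where the `TypeClosure` class, the `deps` map and the 'func:'/'struct:' key names are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TypeClosure {
    /**
     * Returns the requested roots plus every type reachable from them.
     * deps maps a name to the names it refers to directly.
     */
    static Set<String> closure(Map<String, List<String>> deps, List<String> roots) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>(roots);
        while (!todo.isEmpty()) {
            String name = todo.pop();
            if (!seen.add(name)) {
                continue; // already visited, which also breaks cycles
            }
            for (String dep : deps.getOrDefault(name, List.of())) {
                todo.push(dep);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical dependency graph: one function, three structs.
        Map<String, List<String>> deps = Map.of(
            "func:make_bob", List.of("struct:bob"),
            "struct:bob", List.of("struct:point"),
            "struct:point", List.of(),
            "struct:unused", List.of("struct:point"));
        // struct:unused is not reachable from the root, so it is left out.
        System.out.println(closure(deps, List.of("func:make_bob")));
    }
}
```

The visited set does double duty: it is the result, and it stops self-referential types (a struct holding a pointer to itself, say) from looping forever.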
It ignores enums for now because I just ran out of juice but it covers all the other major parts of the language and even some of the lesser parts.
panama jdk
Most of the information is conveyed in signature strings which are passed to the JDK via annotations on interfaces. The simple example above is translated to an interface with the @NativeStruct annotation that includes the two field names and their types in order.
import java.foreign.annotations.*;
import java.foreign.memory.*;

@NativeStruct(value="[i32(a)f32(b)](bob)", resolutionContext={})
public interface Bob extends Struct{
    @NativeGetter(value="a")
    public int getA();
    @NativeSetter(value="a")
    public void setA(int value);
    @NativeGetter(value="b")
    public float getB();
    @NativeSetter(value="b")
    public void setB(float value);
}
Accessors can be named anything (jextract uses the uber-friendly notation of name$set(...) and so forth) but the field they operate on is indicated using the @NativeGetter or @NativeSetter annotation, which specifies the field name as defined in the @NativeStruct annotation. There's also an address-of annotation for a rather verbose equivalent of the & operator.
I didn't come across a document describing this format (and I didn't look beyond the annotation source code) but I worked it out by seeing what jextract did. Most of it is pretty straightforward.
Signature                                    Meaning
i32, i64, ...                                Signed integer of the given bit width.
u32, u64, ...                                Unsigned integer of the given bit width.
f32, f64                                     IEEE float of the given bit width.
x8, x16, ...                                 Padding, arbitrary number of bits.
${type}                                      Named compound type.
=u64[many of iX, uX, xX]                     Bitfields in long (or u32= for int), where X is an arbitrary number not limited to multiples of 8.
u64:                                         64-bit pointer.
u64(name):(argtypes)returntype               Function signature.
[type(fieldname)type(fieldname)...](name)    struct name
[type(fieldname)|type(fieldname)|...](name)  union name
...                                          and others
The bitfield one is a bit odd. I'm not sure why one has to specify an internal detail such as the actual word size used for storage, but I don't use bitfields much. Perhaps it's to support some alignment/packing __attribute__ in bitfields which is used to map hardware registers, although that seems a rather obscure use for Java API bindings.
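To make the struct and union rows of the table concrete, here is a small Java sketch that assembles such signature strings from a list of fields. It only follows the format as inferred above, not any official specification, and the `StructSignature` class, its `Field` record and its methods are hypothetical helpers rather than part of the Panama JDK or of panamaz.

```java
import java.util.ArrayList;
import java.util.List;

final class StructSignature {
    /** One field: its signature type (e.g. "i32") and its C name. */
    record Field(String type, String name) {}

    /** Joins type(name) entries with the given separator inside [...](name). */
    private static String compound(String name, List<Field> fields, String separator) {
        List<String> parts = new ArrayList<>();
        for (Field f : fields) {
            parts.add(f.type() + "(" + f.name() + ")");
        }
        return "[" + String.join(separator, parts) + "](" + name + ")";
    }

    /** Struct fields are simply concatenated. */
    static String struct(String name, List<Field> fields) {
        return compound(name, fields, "");
    }

    /** Union members are separated by '|' (as inferred from jextract output). */
    static String union(String name, List<Field> fields) {
        return compound(name, fields, "|");
    }

    public static void main(String[] args) {
        List<Field> fields = List.of(new Field("i32", "a"), new Field("f32", "b"));
        System.out.println(struct("bob", fields));
        System.out.println(union("bob", fields));
    }
}
```

For the bob example the struct form yields the same string that appears in the @NativeStruct annotation earlier, "[i32(a)f32(b)](bob)".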
So basically I kept banging at the exporter and the generator until I got something working, then tried to compile it and then fixed those problems. Then tried instantiating the object and fixing the breakage. Then tried another type and so on and kept going until it all worked. And then I rewrote the exporter almost entirely to clean up the code. And then fixed the new bugs. Then created a simple example.
Then created a repository and a project page, then sent an email to the panama-dev list, then wrote a blog post ...