About Me

Michael Zucchi

 B.E. (Comp. Sys. Eng.)

  also known as Zed
  to his mates & enemies!

notzed at gmail >
fosstodon.org/@notzed >

Tags

android (44)
beagle (63)
biographical (104)
blogz (9)
business (1)
code (77)
compilerz (1)
cooking (31)
dez (7)
dusk (31)
esp32 (4)
extensionz (1)
ffts (3)
forth (3)
free software (4)
games (32)
gloat (2)
globalisation (1)
gnu (4)
graphics (16)
gsoc (4)
hacking (459)
haiku (2)
horticulture (10)
house (23)
hsa (6)
humour (7)
imagez (28)
java (231)
java ee (3)
javafx (49)
jjmpeg (81)
junk (3)
kobo (15)
libeze (7)
linux (5)
mediaz (27)
ml (15)
nativez (10)
opencl (120)
os (17)
panamaz (5)
parallella (97)
pdfz (8)
philosophy (26)
picfx (2)
players (1)
playerz (2)
politics (7)
ps3 (12)
puppybits (17)
rants (137)
readerz (8)
rez (1)
socles (36)
termz (3)
videoz (6)
vulkan (3)
wanki (3)
workshop (3)
zcl (4)
zedzone (26)
Thursday, 25 April 2013, 02:15

0k503

This is the 503rd post to this blog.

I was going to do a 'status update' on everything at the 500 mark, but I just can't be bothered right now and it's not that interesting anyway.

Which pretty much sums up everything else at the moment too.

After some intense activity at work and home i'm a little drained so taking it easier for a bit. Going from sprint to winter always seem to trigger a bit of don't-care too.

Tagged biographical.
Sunday, 21 April 2013, 23:40

In perspective

Every day in the USA alone, around 12 pedestrians are killed every day and a further 200 or so injured (information readily found from official sources).

Every day.

And yet those deaths and injuries don't receive wall-to-wall news scaremongering news coverage and demands for more oppressive law enforcement.

Tagged politics, rants.
Sunday, 14 April 2013, 06:16

Map Viewing Tool

I had a couple of hours after waking up too early so I hacked a couple more things into the prototype map viewer - almost making it a useful-if-a-bit-crude tool.

First it displays any 'interesting' tiles - ones with any associated game behaviour - using red squares. Then as it was so simple even though it isn't terribly useful more than once, it does the same for the original dusk game - using blue squares. Then down the left-hand side it shows all the details of the scripts.

For starters it shows that I didn't get the tutorial complete after-all, although one of those squares just has an empty script.

Not a lot more work is required to turn it into a useful if basic game behaviour editor. Just the ability to save and create scripts and a few little lookup and navigation functions.

A couple of days ago after some work on jjjay I had a fleeting thought about how the game might be transformed to use a database backend. With Berkeley DB JE it would be a cinch, I could reliably persist all the live game objects into a single table with very little work indeed - the subclassing stuff is better than I can recall from last time i looked at it but I suspect I just didn't look too closely. I shouldn't need any DTO's. Actually it would probably turn out to be easier than all the text-mode import/export stuff I had to come up with and it could also be used to replace some of the internal indices. Using a db means that a text editor can no longer be the primary development tool so it would also require additional tooling. Something to ferment in the wort a bit longer ...

I had some other fleeing thought about something about the game, but it just flitted away ...

On an unrelated note, ScrollBar doesn't support middle-click. Annoying.

Oh thats right on the flittery thought. The way JavaScript is being used in the scripts is so simple, I wonder if it isn't easier just using Java instead. The scripts would stay in their format but could inserted into a template before being compiled. Might be an option for rarely-changing/performance critical tasks, but might not be so easy for any dynamic content. Definitely not something i'm going to rush into without a lot of experimentation.

Tagged dusk, java, javafx.
Friday, 12 April 2013, 00:48

Virtual tile grid

So I have been busy with quite a lot of other stuff lately and haven't really been looking at duskz much for a while. Partly there are some biggish problems to do with tooling and game-building that i'm not sure how to address off the top of my head, partly my mind is too full of NEON/Android/OpenCL and takes too long to context switch, and partly I just have other distractions.

This long weekend (every weekend for me is a long one now! yay for me!) the weather is too unseasonably nice to be stuck inside and I need to do some work in the yard too, so I probably wont do any more soon either.

However the other evening I had a short look at addressing one of the problems - how to find out where interactive things are set up within the map. Oddly enough looking at coordinate pairs and using command line globbing isn't a very friendly way to navigate through the game ...

To that end I have had the idea of a JavaFX fork of Tiled, just forking Tiled as is, or creating something more task specific ... that idea is still on the back-burner (one gets a little over-excited at times) but I did investigate how JavaFX would go about displaying the map layers in such a tool. Possibly it should just be a 'creator' mode within the game itself, but a separate tool has it's advantages too.

JavaFX scalability

I tried loading the Dusk map into a Pane using ImageViews, and it just exhausted memory after a very long pause. Yes the map is a very impractical 700x700 tiles which is not necessary anymore with the multi-map support, but if it would be nice if it coped.

JavaFX isn't really 'lightweight' as is mentioned in places - just looking at Node would tell you this.

It would be nice if there were Array versions of some of the basic Nodes that could manage state in a much more efficient manner. e.g. an imageview that could draw thousands of images, but tracks the coordinates or other adjustables in arrays of primitives which are more compact and can be processed efficiently. i.e. by using a structure of arrays rather than an array of structures. Such an approach has some pretty big benefits for things like parallelisation and animation too, but might be hard to fit into a general purpose object-oriented api such as JavaFX.

But even for something like this an Array implementation would need to virtualise it's coordinate generation to be efficient - 16 bytes per coordinate is a lot of memory particularly when the coordinate can be calculated faster on the fly from the tile location than it could ever be loaded from memory, even without an object dereference.

But since no such mechanism exists, one is left with ...

Virtualisation

So basically you get to throw out all that nice scene-graph stuff and have to manage the viewable objects yourself. This management of the viewport is kind of the whole point of scene-graphs (or at least a very major part of it), so for a scene-graph implementation to require manual virtualisation is a bit of a bummer. But what can you do eh? Fortunately for 2D the problem is relatively simple but it will fall down completely for 3D.

I started by trying to read through the ListView source to see how it virtualised itself. After half an hour trolling through half a dozen classes spread across a few namespaces I can't say i'm much the wiser. It seems to be based on the layout pass, but how the layout is requested is a mystery ... Although I did gather that unlike Swing (which to be honest was confusing and difficult to understand) which allows Gadgets to render virtually within any ScrollView (as far as i can tell), JavaFX's ListView just does the scrollbar management itself.

I figured that perhaps I could just use a TableView too, but as i'm not familiar with it at this point I thought i'd give it a pass. It didn't quite seem to be the right approach anyway as as far as I can tell the data still needs to be added as fielded rows, whereas my virtual data 'model' is a 1D array of shorts and it would be easier just to access it directly as 2D data.

As I couldn't really work out how the ListView was working I just took a stab with what I could figure out. I do everything in the layoutChildren() function. First I ensure there are enough ImageView objects just to cover the screen, and then update the tile content to match when the location changes. Per-pixel scrolling is achieved by a simple modulo of the location if you really must have it (personally I find it annoying for lists).

    Pane graphics;

    ScrollBar vbar;
    ScrollBar hbar;

    int vcols, vrows;
    double oldw, oldh;
    
    protected void layoutChildren() {
        super.layoutChildren();
        if (oldw != getWidth() || oldh != getHeight()) {
            vcols = (int) (getWidth() / 32);
            vrows = (int) (getHeight() / 32);
            oldw = getWidth();
            oldh = getHeight();

            ObservableList<Node> c = graphics.getChildren();

            c.clear();
            for (int y = 0; y < vrows; y++) {
                for (int x = 0; x < vcols; x++) {
                    ImageView iv = new ImageView();
                                
                    iv.relocate(x * 32, y * 32);
                    c.add(iv);
                }
            }
        
        }
        
        updateMapVisible();
    }

If the size has changed, it creates enough ImageView's to cover the screen (actually it needs to do one more row and column, but you get the idea).

updateMapVisible(), well updates the visible map oddly enough.

    private void updateMapVisible() {
        int y0 = (int) (vbar.getValue() / 32);
        int x0 = (int) (hbar.getValue() / 32);

        // Set per-pixel offset
        graphics.setTranslateX(-((long)hbar.getValue() & 31));
        graphics.setTranslateY(-((long)vbar.getValue() & 31));

        for (int y = 0; y < vrows; y++) {
            int ty = y + y0;
            for (int x = 0; x < vcols; x++) {
                int tx = x + x0;
                ImageView iv = (ImageView) graphics.getChildren().get(x + y * vcols);

                int tileid = map.getTile(0, tx, ty);
                data.updateTile(iv, tileid, tileSize, tileSize);
            }
        }
    }

Initially I just created new ImageViews, but just updating the viewport and/or the image was faster. Obviously updateMapVisible could optimise further by only refreshing the images if the tile origin has changed, but it's not that important.

There is only one extra bit required to make it work - manage the scrollbars so they represent the view size.

    protected void layoutChildren() {

        ... other above ...

        vbar.setMax(map.getRows() * 32);
        hbar.setMax(map.getCols() * 32);
        vbar.setVisibleAmount(getHeight());
        hbar.setVisibleAmount(getWidth());
    }

It's only an investigative bit of prototype code so it doesn't handle layers, but obviously the same is just repeated for each tile or whatever other layer is required.

It's NOT! magic ...

And I gotta say, this whole thing is a hell of a lot simpler to manage than any other virtually scrollable mechanism I've seen. General purpose virtually scrollable containers always seem to get bogged down in how to report the size to the parent container and other (unnecessarily) messy details with scrolling and handle sizes and so on. A complete implementation would require more complication from selection support and so on, but really each bit is as simple as it should be.

One thing I do like about JavaFX is that in general (and so far ...) it doesn't need to rely on any weird 'magic' and hidden knowledge for shit to work. The scenegraph is just a plain old data structure you can modify at whim without having to worry too much about internal details - the only limitation is any modifications to a live graph needs to be on a specific thread (which is trivially MT-enabled using message passing). I've you've ever worked with writing custom gadgets for any other toolkit you're always faced with some weird behaviour that takes a lot of knowledge to work with, and unless you wrote the toolkit you probably will never grok.

Although having said that ...

What I didn't understand is that simply including a ScrollBar inside the view causes requestLayout() to be invoked every time the handle moves. I'm no sure if this is a feature (some 'hidden' magic) or a bug. Well if it is at least it's a fairly sane bit of magic. The visibleAmount stuff above also doesn't really work properly either - as listed it allows the scrollbar to scroll to exactly one page beyond the limit in each direction. If i tried adjusting the Max by the viewport size ... weird shit happened. Such as creating much too big handle which didn't represent the viewable area properly. Not sure on that, but it was late and I was tired and hungry so it might have been a simple arithmetic error or something.

I suspect just using a WritableImage and rendering via a Canvas would be more efficient ... but then you lose all the animation stuff which could come in handy. The approach above will not work well for a wide zoom either as you may end up needing to create an ImageView for every tile anyway which will be super-slow and run out of memory. So to support a very wide zoom you'd be forced to implement a Canvas version. i.e. again something the scene-graph should handle.

I'm still struggling a bit with general layout issues in JavaFX - when and when not to use a Group, how things align and resizing to fit. That's something I just need a lot more time with I guess. The tech demo I wrote about in yesterday's post was one of the first 'real' applications I've created for my customer that uses JavaFX so I will be getting more exposure to it. Even with that I had a hard time getting the Stage to resize once the content changed (based on 'opening a file') - actually I couldn't so I gave up.

Update: Well I worked out the ScrollBar stuff.

When you set VisibleAmount all it does is change the size of the handle - it doesn't change the reported range which still goes from Min to Max. So one has to manually scale the result and take account of the visible amount yourself.

e.g. something like this, which scales the maximum from 0-(max-visible) linearly, where Max was set to the total information width, and VisibleAmount was set to the width, in pixels.

    double getOriginX() {
        return hbar.getValue() * (hbar.getMax() - getWidth()) / hbar.getMax();
    }

TBH it's a bit annoying, and I can't really see a reason one would ever not need to do this when using a ScrollBar as a scroll bar (vs a slider).

Tagged code, dusk, hacking, javafx.
Thursday, 11 April 2013, 11:59

Beware the malloc ...

So one of the earliest performance lessons I learnt was to try to avoid allocating and freeing memory during some processing step. From bss sections, to stack pointer manipulation, to memory pools.

This was particularly important with Amiga code with it's single-threaded fit-first allocator, but also with SunOS and Solaris - it wasn't until a few versions of glibc in, together with faster hardware, that it became less of an issue on GNU systems. And with the JVM many of the allocation scenarios that were still a problem (e.g. many small or short-lived objects) with libc simply vanished (there are others to worry about, but they are easier to deal with ... usually).

It was something I utilised when I wrote zvt to make it quick, unfortunately whomever started maintaining it after I started at Ximian (I wasn't allowed time to work on zvt anymore, and ximian was a bit too busy to keep it as a hobby) didn't understand why the code did that and it was one of the first things to go ... although by then on a GNU system it wasn't so much of an issue.

But even with a super-computer on the desk it's still a fairly major issue with GPU code. Knowing this I always pre-allocate buffers for a given pipeline and let OpenCL virtualise it if required (or more likely, just run out of memory) but today I had a graphic reinforcement of just why this is such a good idea.

After hacking all week I managed to improve the 'kernel time' of a specific high-level algorithm from 50ms to about 6ms. I was pretty damn chuffed, particularly as it also works better at what it's doing.

However when I finally hooked it up to a working tech demo, the performance improvement plummeted to only about 3x - one expects quite a lot of overhead with a first-cut synchronous implementation from c-java-opencl-java-javafx, but that just seemed unreasonable, it seemed like every kernel invocation had nearly 1ms overhead.

Without any way to use the sprofile output at that level (nanosecond timestamps aren't visually rich ...) I added some manual timing and tracked it down to one routine. Turned out that my port of the Apple FFT code was re-allocating temporary work-space whenever the batch-size changed (rather than simply if it grew). Simply those 2 frees and 2 allocations were taking 20ms alone and obviously swamping the processing and other overheads completely.

Whilst at 15ms with about 60% of the time spent in setup, data transfer, and invocation overheads it is still pretty poor, it is acceptable enough for this particular application for a first-cut single-queue synchronous implementation. Actually apart from running much faster, the new routine barely warms up the GPU and the rest of the system remains more responsive. I must try a newer driver to see if it improves anything, i'm still on 12.x.

Bring on HSA ... I really want to see how OpenCL will work with the better architecture of HSA. Maybe the GCN equipped APUs will have enough capability to show where it's headed, if not enough to show where it's going to end.

Why is the interesting hardware always 6-12 months away?

Tagged hacking, opencl.
Wednesday, 10 April 2013, 11:04

industrial fudge

It survives ... nearly 20 years on.

Never did finish and write the 'AGA' version ... but there were some pretty good reasons for that. When it was released some scumbag stole my second floppy drive, the next night I shorted my home-made video cable and blew out the blue signal, and I had only 2 hours sleep over a 96 hour stretch ... so I kinda lost interest for a while. Can't remember ever having slept particularly well ever since.

Hardware was so much simpler (and more fun) back then.

Tagged biographical.
Thursday, 04 April 2013, 10:40

ActionBar, cramped screen-space, etc.

So after a few ui changes I tried jjjay on the tablet ... oh hang on, there's no menu button and no way to bring it up.

That's one ui design idea out the window ...

So basically I am forced to use an actionbar, and seeing how that is the case I decided to get rid of everything else and just use the actionbar for the buttons. Goodbye jog-wheel, it wasn't really doing anything useful on that screen anyway (unless I added a preview window, but that's going to be too slow/take up too much room on a phone). I also moved the scale/scrollbar thing to the bottom of the screen which is much more usable on a phone as otherwise you accidentally hit the action bar or the notification pull-down too easily.

It creates a more uncluttered view so I suppose it was a better idea anyway.

I fixed a few other little things, played with hooking up buttons to focus/action, and experimented with a 'fling' animator, but at this point some bits of the prototype hacked-up code is starting to collapse in upon itself. It's not really worth spending much more time trying to get it to do other things without a good reorganisation and clean up of some fairly sizeable sections.

I need a model for the time scale pan/zoom although I could live without that for the time being. A model for the sequencer is probably more necessary otherwise adding basic operations such as delete get messier and messier. There's still a bunch of basic interface required too; setting project and rendering particulars, and basic clip details such as transition time and effects.

So this is basically where the prototype development ends and the application development begins. 90% of the effort left? Yes probably. 6 days of prototype to 2 months of application development ... hmm, sounds fairly reasonable if a little on the optimistic side, generally it takes 3 writes to get something right.

I guess i'll find out if I keep poking at it ...

Tagged android, hacking, jjmpeg.
Wednesday, 03 April 2013, 06:59

threads, memory, database, service.

I hooked up the video generator to the jjjay interface last night and did some playing around on a phone form factor.

The video reading and composing was pretty slow, so I separated them into other threads and have them prepare frames in advance. With those changes frames are generally ready in Bitmap form before they're needed, and 80% of the main thread is occupied with the output codec. This allowed me to implement a better frame selection scheme that would let me implement some simple frame interpolation for timebase correction. The frame consumer keeps track of two frames at once, each bracketing the current timestamp. Currently it then chooses the lower-but-in-range one. The frame producer just spits out frames into a blocking queue from another thread - before I had some nasty pull logic and nearby-frame cache, but that is the kind of dumb design decision one makes in design-as-you-go prototype code written at some funny hour of the night.

I guess to be practical at higher resolutions hardware encoding will be necessary, but at SD resolution it isn't too bad. VGA @ 25fps encodes around 1-2x realtime on a quad-core phone, depending mostly on the source material. I think that's liveable. 1280x720 x264 at 1Mb/s was about 1/4 realtime. I suppose I should investigate adding libx264 to the build too.

I cleaned up some of the frame copying and so on by copying the AVFrame directly to a Bitmap, it still needs to use some pretty slow software YUV conversion but that's an issue for another day. Memory use exploded once I started decoding frames in other threads which was puzzling because it should've been an improvement from the previous iteration where all codecs were opened for all clips in the whole scene. But maybe I just didn't look at the numbers. I guess when you do the sums 5x HDxRGBA frames adds up pretty quick, so I reduced the buffering.

It was fun to finally get some output from the full interface, and as mentioned in the last post helps expose the usability issues. I had to play a bit with the interface to make the phone fit better, but i'm still not particularly happy with the sequence editor - it's just hard fitting enough information on the screen at a reasonable size in a usable manner. I changed the database schema so that clips are global rather than per project, and create and store clip/video icons along with the data.

So with a bit more consolidation to add a couple of essential features it'll approach an alpha state. Things like the rendering need to be moved to a service as well. And work out the build ... ugh.

At least I have worked out a (slightly hacked up) way to use jjmpeg from another Android project without too much pain. It involves some copying of files but softlinks would probably work too (jjmpeg only builds on a GNU system so I don't think that's a big problem). Essentially jjmpeg-core.jar is copied to the libs directory and libjjmpegNNN.so is referenced as a pre-build library in the projects own jni/Android.mk. If I created an android library project for jjmpeg-core this could mostly be automatic (apart from the make inside jni).

Incidentally as part of this I've used the on-phone camera to take some test shots - really pretty disappointed with the quality. Phone's still have a long way to go in that department. I can forgive a 100$ chinese tablet and it's front-facing camera for worn-out-VHS quality, but a 800$ phone? Using adb logcat on this 'updated, western phone' is also very frustrating - it's full of debug spew from the system and bundled software which makes it hard to filter out the useful stuff - cyanogen on the tablet is far quieter.

Rounding to the nearest day jjjay is about 5 days work so far.

Attached Bitmaps

On an unrelated note, I was working on updating a bitmap from an algorithm in a background thread, and this caused lots of crashes. Given it's 3KLOC of C+Assembly one generally suspects the C ... But it turned out just to be the greyscale to rgba conversion code which I just hacked up in Java for prototype purposes.

Android seems to do OpenGL stuff when you try to load the pixels on an attached Bitmap, and when done from another thread in Java, things get screwed up.

However ... it only seems to care if you do it from Java.

By changing the code to update the Bitmap from the JNI code - which involves a/an locking/unlocking step it all seems good.

Of course as a side-effect simply moving the greyscale to RGBA conversion to a C loop made it run about 10x faster too. Dalvik pretty much sucks for performance.

Update: Well add a couple more hours to the development. I just moved the rendering task to a Service, which was overall easier than I remembered dealing with Services last time. I guess it helps when you have your own code to look at and maybe after doing it enough times you learn what is unnecessary fluff. Took me too long to get the Intent-on-finished working (you can play the result video), I wasn't writing the file to a public location.

I cleaned up the interface a bit and moved "render" to a menu item - it's not something you want to press accidentally.

So yeah it does the rendering in a service, one job at a time via a thread pool, provides a notification thing with a progress bar, and once it's finished you can click on it to play the video. You know, all the mod-cons we've come to expect.

Tagged android, hacking, jjmpeg.
Newer Posts | Older Posts
Copyright (C) 2019 Michael Zucchi, All Rights Reserved. Powered by gcc & me!