About Me

Michael Zucchi

 B.E. (Comp. Sys. Eng.)

  also known as Zed
  to his mates & enemies!

notzed at gmail
fosstodon.org/@notzed


Thursday, 18 August 2011, 02:37


So apparently a lad's been working on getting some OpenCL code into GEGL. What surprises me is just how slow the result is - and how slow GEGL is at doing the super-simple operation of brightness/contrast even with a CPU.

Of course, I'm not sure exactly what is being timed here, so perhaps it's timing a lot more than just the mathematics. Well obviously it has to be: my ageing Pentium-M laptop can do a 1024x1024xRGBA/FLOAT brightness/contrast in about 70ms with simple single-threaded Java code. So 500ms for the same operation using 'optimised sse2' is including a hell of a lot of extra stuff beyond the maths. Curiously, the screenshot of the profiler shows 840 'tiles' have been processed; if they are 128x64 as suggested then that is nearly 7MP (840 x 128 x 64 = 6,881,280 pixels), not 1MP as stated in the post - in that case 500ms isn't so bad (it isn't great either, but at least it's in the same order of magnitude).
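
For reference, here is roughly what "simple single-threaded Java code" for this operation could look like. It is only a sketch: the formula (scale around mid-grey by the contrast, then add the brightness offset) is an assumption about what is being computed rather than code taken from GEGL, and the class and method names are made up for illustration. It also spells out the tile arithmetic from the profiler screenshot.

```java
import java.util.Arrays;

public class BrightnessContrast {
    // Assumed formula: out = (in - 0.5) * contrast + 0.5 + brightness,
    // applied to R, G and B of interleaved RGBA float pixels.
    public static void apply(float[] rgba, float brightness, float contrast) {
        for (int i = 0; i < rgba.length; i += 4) {
            for (int c = 0; c < 3; c++) {
                rgba[i + c] = (rgba[i + c] - 0.5f) * contrast + 0.5f + brightness;
            }
            // rgba[i + 3] is alpha and is left untouched
        }
    }

    // Total pixel count covered by a number of equally sized tiles.
    public static long pixelsInTiles(int tiles, int tileWidth, int tileHeight) {
        return (long) tiles * tileWidth * tileHeight;
    }

    public static void main(String[] args) {
        // One 1024x1024 RGBA/float image, filled with an arbitrary test value.
        float[] image = new float[1024 * 1024 * 4];
        Arrays.fill(image, 0.25f);

        long start = System.nanoTime();
        apply(image, 0.1f, 1.5f);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // The timing depends entirely on the machine and JIT warm-up.
        System.out.println("1024x1024 RGBA float: " + elapsedMs + " ms");
        System.out.println("840 tiles of 128x64: " + pixelsInTiles(840, 128, 64) + " pixels");
    }
}
```

A single cold run like this mostly measures the JIT warming up, so any real comparison would want a few repeats before timing.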

I tried posting this to the forum linked to this phoronix post but for whatever reason it refused to take the post, so i'll post it here instead.

This result is really slow. Like about 100x off if I have the relative performance of that gpu correct. Even the CPU timings look suspect - is GEGL really that slow?

A list of potential bottlenecks:

A list of things which can't be bottlenecks:


In the nvidia profiler, look at the 'gpu time width plot' to see when the gpu is actually doing work. You'll probably see the individual jobs (and memory transfers) take almost no time and it's mostly sitting idle waiting for work from the cpu. That idle time is going to be 99% of the elapsed time, and it is where you will find all the gains at this point.

Don't even bother looking at the graph you posted - memory transfer time will have to be greater than the processing time since the processing is so simple and the gpu memory bandwidth is so much higher than pci speed. All you're doing is confirming that fact. The memory transfer time can mostly be hidden using asynchronous programming techniques anyway, so it is basically irrelevant.

Tagged hacking, opencl.
Wednesday, 17 August 2011, 01:09


So in a bit over 2 years since I turned on the stats, this blog broke the 10K hit barrier in the last few weeks. I guess that's nothing particularly to speak of but for what is mostly a bunch of private rants and technical musings it's not insignificant either.

One particular page has the lion's share of the hits though - and that it continues to do so is interesting in itself. This is the long and rather rambling post about trying to find a Java FFT library and some abuse about visual studio. Although it's clearly the Java FFT that people are searching for to find that page! It shows that someone is doing some scientific programming in Java, which I find interesting. The only thing I really wish Java had for this was a native complex type - doing anything with complex numbers quickly gets ugly, and even worse if you want some speed.
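
To illustrate the complex number complaint: without a native complex type, the usual way to get some speed is to pack real and imaginary parts into a plain float array and do the arithmetic by hand. This is a minimal sketch of an element-wise complex multiply under that assumption; the interleaved layout and the names are for illustration only and aren't from any particular FFT library.

```java
public class ComplexArrays {
    // Element-wise complex multiply, c[k] = a[k] * b[k].
    // Each array holds interleaved (real, imaginary) pairs:
    // index 2k is the real part of element k, index 2k+1 the imaginary part.
    public static void multiply(float[] a, float[] b, float[] c) {
        for (int i = 0; i < a.length; i += 2) {
            float ar = a[i], ai = a[i + 1];
            float br = b[i], bi = b[i + 1];
            c[i]     = ar * br - ai * bi; // real part
            c[i + 1] = ar * bi + ai * br; // imaginary part
        }
    }

    public static void main(String[] args) {
        float[] a = {1, 2};   // 1 + 2i
        float[] b = {3, 4};   // 3 + 4i
        float[] c = new float[2];
        multiply(a, b, c);
        // (1 + 2i)(3 + 4i) = 3 + 4i + 6i - 8 = -5 + 10i
        System.out.println(c[0] + " + " + c[1] + "i");
    }
}
```

What would be a one-line `c = a * b` with a native type turns into four loads, four multiplies and index bookkeeping, and it only gets worse for anything more involved than a multiply.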

Second on the list is the BeagleBoard GS2010 wrap-up post with about half the number of hits. For such a small community there is quite a lot of interest out there. Unfortunately work commitments and other interests have pulled me away from spending time with the Beagleboard, which is a bit of a pity. For the moment all i'm using mine for is for playing internet radio plugged into my stereo. It's sitting boxless on a coffee table next to the amp and my 'user interface' consists of telnetting to it from my workstation and running mplayer on the command line :)

The next few 'high hitters' (if you could call them that) are low-level posts on: SSE optimisations (which basically said they don't make nearly as much difference as vector ops on CELL did), OpenCL Images vs Arrays (which I find rather difficult to understand myself, but i think the gist of it is that you have to write code differently but both perform about the same), and Context Switching on ARM. I would guess the last one may have helped a few students with their assignments ;-) - it doesn't seem to be a topic of general interest.

Onward and upward

Like everyone else I'm pretty useless at predicting the future but I can probably take a rough guess at where my interests will take me in the next few years. I don't have a need for any particular software any-more (beyond what is a yum invocation away), so whatever I work on is only for entertainment (and perhaps a bit for education, but just solving problems for work educates one a great deal).

I think OpenCL will continue to grow - socles is already my most 'hit' google-code project and the only one anyone ever mailed me about (actually someone did mail me about puppybits). It isn't really going anywhere at the moment because I can't really think of anything to use it for myself - I have some vague ideas of a video-something application (mediaz/VideoZ), but there is so much to think about and code before it even gets started. As applications get bigger and more complex, that starting hump is quite a psychological barrier to get over when there are other sources of entertainment competing for my time. Back to socles though - OpenCL is still a bit of a niche, and Java + OpenCL even more-so, so I'm in no rush to expand it until I can find something to use it for.

As an aside, I've noticed a worrying trend on the OpenCL forums - which seem to be more afflicted by this than other forums, although I've seen it before elsewhere and it's probably just because i don't tend to hang around forums a lot these days. And that is this: inexperienced programmers - most likely students, with a very limited command of the English language - posting questions which demonstrate they can't even be bothered to read the manuals (OpenCL has some very good resources available). And even worse, to paraphrase a comment from the BeagleBoard list, the queries generally amount to "I'm lazy, can you please do my homework for me?". Extremely rude and disrespectful, and it really messes up mailing lists and forums.

Puppybits ... well that will probably continue to stay on hold. Unless I take another big break between contracts again and have loads of time to work on it. Every now and then I have a look to see if there are any simple USB host stacks to snarf to help progress it, but nothing's popped up so far. Without USB one is severely constrained. If I ever get the OpenPandora I ordered that might pique my interest in ARM hacking again though. I have a big bunch of 'zedos' work I never committed which I probably should if only so it doesn't get lost from my backups (I `upgraded' my OS a few months ago and lost my development environment for example).

mediaz/ImageZ ... is probably of little use to anyone else, but I will keep poking away at it when I have the inclination. There are a few basic things I need to get sorted out before i'm prepared to drop a jar of it, which I will do at some point. One is the tool overlay mechanism which i'm refining again as I work on a crop-tool. Probably a couple of days work.

jjmpeg ... is already quite useful, although to package it up and polish it off would require a lot more work and time. This is one of those building blocks I needed for the video application I was thinking about, so now it's reached some state of usefulness I can at least entertain the idea of moving forward with that. Also, if I decide to switch to it for some work code I have it would probably get a bit more of a work-out as well - it's something i'm considering since I can't get xuggle to build for windows (without more time than i'm willing to waste) and its ffmpeg libraries are getting a bit out of date. Not to mention tied to 32 bits.

And i'll keep ranting about bits and pieces, cooking, gardening and other shit.

Tagged biographical.
Thursday, 11 August 2011, 12:35

Video/Audio Player

I just checked in a reasonably complete audio/video player example using jjmpeg.

It synchronises the video to the audio if it's there, allows one to seek and pause and so on. The pause function is a bit crap - it keeps running any queued up data from the decoder - but that's only a fraction of a second. It uses a JLabel for output via a BufferedImage, which works well enough if the machine is quick. There are some other problems, but it works reasonably well all things considered. It's using JOAL for audio output.

The code is part of the jjmpegdemos sub-project, and is in the au.notzed.jjmpeg.mediaplayer namespace.

This is the one I mentioned I was working on 2 months ago, and since it was reasonably complete (and I don't think i'll be working on it again for a little while) I thought it was about time I checked it in. I have a swathe of stuff for socles I should probably upload at some point too.

Tagged hacking, java, jjmpeg.
Thursday, 11 August 2011, 03:08


matlabotomisation - vb. To write or modify a matlab or octave script in order to achieve maximum efficiency in processing time, thus rendering the algorithm virtually indecipherable to both mathematicians and software engineers alike.

Yes, i'm back to reading matlab scripts again - an unfortunately common task when dealing with research from computer scientists.

matlab (the language) is a really basic scripting language, with a library of routines that make processing mathematical algorithms possible, but not exactly easy. It isn't something that mirrors the mathematical language very concisely, nor maps easily to procedural languages. If that were its only shortcoming it would be bad enough, but it is also really very slow.

So to get performance out of matlab one has to write code using (multi-dimensional) array types. Writing a loop which generates results one at a time is far too slow, so instead you generate a table of indices and then write a formula that uses these indices to generate all results at once. This can be fairly concise, and it sort of sounds like functional programming or representing mathematics cleanly, but unfortunately it falls well short of this goal and often the code is off generating complex sets of indices, which is easily confused with it actually doing work. So you end up with something that might run reasonably quick (for matlab anyway), but is a real brain-ache trying to understand. It matches neither the mathematics nor the processing steps the cpu takes to form the result.

I prefer when the scientist just gives up and writes simple matlab - for one, it makes my life a lot easier, and as a bonus even a trivial Java conversion will run at least an order of magnitude faster. So it makes me look smarter too!

Tagged rants.
Sunday, 07 August 2011, 04:17

OpenRaster, SPI, etc.

After poking around ImageZ a bit late last night I thought i'd tackle multi-layer reading/writing.

So I wrote a reader and eventually a saver for OpenRaster format. I decided on OpenRaster since it is so simple, and it was pretty much how I was going to write it anyway - only I was going to avoid the XML. Being a zip file makes things simple in Java too. It seems to interoperate well enough so far (since I only have 'normal' blend mode working anyway), although if you save layers in greyscale or 16 bit formats from ImageZ and then load/save them from MyPaint, everything is converted to RGBA 8 bit.
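
To show why the zip container keeps things simple in Java, here is a rough sketch of writing a single-layer OpenRaster file using only java.util.zip and ImageIO. It is not the ImageZ code: the class name, the layer path `data/layer0.png` and the layer name are made up for illustration, and a fully conforming file also wants extras such as a merged image and thumbnail, which are left out here. The one fiddly part is that the `mimetype` entry has to come first and be stored uncompressed.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
import javax.imageio.ImageIO;

public class OraSketch {
    // Write a minimal single-layer OpenRaster container to 'out'.
    public static void write(OutputStream out, BufferedImage layer) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            // First entry: "mimetype", stored (not deflated). A STORED entry
            // needs its size and CRC set before it is added.
            byte[] mime = "image/openraster".getBytes(StandardCharsets.US_ASCII);
            CRC32 crc = new CRC32();
            crc.update(mime);
            ZipEntry first = new ZipEntry("mimetype");
            first.setMethod(ZipEntry.STORED);
            first.setSize(mime.length);
            first.setCompressedSize(mime.length);
            first.setCrc(crc.getValue());
            zip.putNextEntry(first);
            zip.write(mime);
            zip.closeEntry();

            // stack.xml describes the layer stack; the layer path is hypothetical.
            String stack = "<?xml version='1.0' encoding='UTF-8'?>\n"
                    + "<image w=\"" + layer.getWidth() + "\" h=\"" + layer.getHeight() + "\">\n"
                    + "  <stack>\n"
                    + "    <layer name=\"layer0\" src=\"data/layer0.png\" x=\"0\" y=\"0\"/>\n"
                    + "  </stack>\n"
                    + "</image>\n";
            zip.putNextEntry(new ZipEntry("stack.xml"));
            zip.write(stack.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();

            // The layer pixels as a PNG; ImageIO leaves the zip stream open.
            zip.putNextEntry(new ZipEntry("data/layer0.png"));
            ImageIO.write(layer, "png", zip);
            zip.closeEntry();
        }
    }

    // Convenience wrapper for in-memory use.
    public static byte[] toBytes(BufferedImage layer) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try {
            write(bos, layer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] ora = toBytes(new BufferedImage(64, 64, BufferedImage.TYPE_INT_ARGB));
        System.out.println("wrote " + ora.length + " bytes");
    }
}
```

Reading goes the other way with ZipInputStream or ZipFile: pull out stack.xml, parse it, then load each referenced PNG with ImageIO.read.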

I still need a float format though - I started looking into OpenEXR last year - but that was about when I stopped working on ImageZ for a chunk of time too - but I hit some walls with the test images. I can't recall where the issue was now though. This isn't really a high priority.

Today I thought I'd work on writing an ImageReaderSpi for the format as well, since OpenRaster files currently do not display in the open requester. But i got too side-tracked trying to implement meta-data and other features which in hindsight I probably don't need. I might revisit it again later with reduced requirements and see if I can get it working.

Along the way I also played with JAXB XML (de)serialisation which looks pretty nice - as nice as things can get with XML I guess. In general I try to avoid XML as much as possible because I think it's the phlegm, vomit, and anal leakage of devil's spawn, so this was a pleasant surprise. No surprise that it wasn't originally an apache project though ...

Also started work on a crop tool. This is exposing me once again to issues with the tool overlays, so I should probably think about cleaning that up somehow too. I'm using piccolo2d at the moment, but the way I have the tools track the current zoom is a right pig's breakfast.

Tagged imagez, java, mediaz.
Saturday, 06 August 2011, 03:10

mediaz <-- ImageZ

I finally uploaded ImageZ to google code, under a new project mediaz. I'm pre-empting myself somewhat here, but i'm leaving room should I develop some other tools - e.g. if the VideoZ stuff ever goes anywhere.

I didn't get around to cleaning up everything I had intended to, so it's well short of being terribly useful, but that's the way it goes I guess. I didn't really want to spend my Saturday at the computer again, but there's not much else to do - everyone else is out and it's a crappy cold, windy and eventually wet day we're headed for.

Update: I had intended to catch up with a couple of mates for some beer and food this evening but I slept in and then it started pissing down with rain so I ended up stuck inside again (watching Port get totally arse-raped by Collingwood). Then I ended up playing with ImageZ a bit more and realised i'd sold it a bit short - there is quite a lot of functionality there after-all, even if some big and rather important parts are missing. I did a bit of hacking on it as well as some house-keeping on the google code page.

Tagged hacking, imagez, java, mediaz.
Friday, 05 August 2011, 03:18

Nvidia opencl 1.1

Yay, so NVidia finally released an opencl 1.1 spec driver. I guess now I should read up more on opencl 1.1 and see if there's anything I can take advantage of - so far it wasn't even on the radar because of their complete lack of support; and i'm happy enough with 1.0 anyway. I'm not sure this is really enough to restore confidence that OpenCL is a first-class citizen on NVidia hardware - their weekly emails haven't mentioned OpenCL for months. We're headed for AMD hardware anyway, if only to try alternatives.

Speaking of AMD, I thought I might try to create a Java binding for the AMD FFT library - I wouldn't mind evaluating it to see if it could replace my current FFT implementation (the apple one, as ported in the jocl demos tree). Unfortunately it uses some types and interfaces which are tricky to wrap in Java, at least in a way which works independent of the architecture's native size. So for now I might put it on the back-burner. (I looked at gluegen briefly but it had trouble parsing something - and the error messages it gives aren't much help).

Tagged hacking, java, opencl.
Thursday, 04 August 2011, 02:49

Mailing Lists

I just set up some mailing lists for jjmpeg and socles.

I can't tell from google-code if there is much interest in the projects, but it seems a better idea to set up a mailing list than to receive direct emails about them.

These are still slow long-burn projects i'm working on when I feel inspired, and inspiration varies greatly from week to week.

Tagged jjmpeg, socles.
Copyright (C) 2019 Michael Zucchi, All Rights Reserved. Powered by gcc & me!