Blurragain
Poked away a bit more at the selection mask and compositing code, and for fun I added the 10 lines of code required to support the various selection operations based on the modifier keys (i.e. union, intersection, exclusive or, subtraction and replace).
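As a rough illustration of how little code those selection operations need, here is a minimal sketch in Python. It assumes masks are flat lists of floats in 0.0..1.0, and the names `combine` and `mode_for_modifiers` as well as the key bindings are made up for the example; it is not the code from the application.

```python
def combine(old, new, mode):
    """Combine two single-channel masks (values in 0.0..1.0) element by element."""
    ops = {
        "replace": lambda a, b: b,
        "union": lambda a, b: max(a, b),
        "intersection": lambda a, b: min(a, b),
        "subtract": lambda a, b: min(a, 1.0 - b),
        "xor": lambda a, b: abs(a - b),
    }
    op = ops[mode]
    return [op(a, b) for a, b in zip(old, new)]


def mode_for_modifiers(shift=False, ctrl=False, alt=False):
    """Hypothetical key bindings, chosen only for this example."""
    if alt:
        return "xor"
    if shift and ctrl:
        return "intersection"
    if shift:
        return "union"
    if ctrl:
        return "subtract"
    return "replace"


old_mask = [0.0, 0.0, 1.0, 1.0]
new_mask = [0.0, 1.0, 0.0, 1.0]
merged = combine(old_mask, new_mask, mode_for_modifiers(shift=True))
```

Using min, max and absolute difference rather than boolean logic means the same functions keep working on feathered (partially selected) masks.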
I still have some issues with the compositing not working quite right for the tool layer when it is active with a mask in replace mode, but I think that's the only bit left now. I fixed the feathering too: it renders the selection mask to a correctly sized image, blurs it, and then pastes it back into the actual selection mask. For the blurring I extracted the multi-threaded blur code from the blur tool into a reusable object and made it support single-channel data, so both the blur tool and the feathering now use it.
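The render, blur and paste-back sequence for feathering could look something like the sketch below. It is single-threaded, uses a plain box blur in place of the real blur code, and assumes "correctly sized" means the selection rectangle padded by the blur radius. The function names are invented for the example.

```python
def box_blur_1d(row, radius):
    """Box blur of one row; samples outside the row count as 0 (unselected)."""
    n = len(row)
    width = 2 * radius + 1
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append(sum(row[lo:hi]) / width)
    return out


def box_blur(img, radius):
    """Separable box blur: blur every row, then every column."""
    rows = [box_blur_1d(r, radius) for r in img]
    cols = [box_blur_1d(list(c), radius) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def feather(mask, x, y, w, h, radius):
    """Blur the selection rectangle (padded by radius) and paste it back."""
    height, width = len(mask), len(mask[0])
    # Pad the rectangle so the soft edge has room, clamped to the mask bounds.
    x0, y0 = max(0, x - radius), max(0, y - radius)
    x1, y1 = min(width, x + w + radius), min(height, y + h + radius)
    tmp = [row[x0:x1] for row in mask[y0:y1]]   # render to a small image
    blurred = box_blur(tmp, radius)             # blur it
    out = [row[:] for row in mask]              # leave the input untouched
    for j, row in enumerate(blurred):           # paste it back
        out[y0 + j][x0:x1] = row
    return out


mask = [[0.0] * 5 for _ in range(5)]
mask[2][2] = 1.0
soft = feather(mask, 2, 2, 1, 1, 1)
```

Working on a small temporary image keeps the cost proportional to the selection size rather than to the whole mask.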
Enough of the tomatoes, time for the cat
I'm surprised I can even remember cursive, let alone write it with a mouse on my first attempt.
I started poking at cleaning up the application state. Right now it all goes through the toolbox object, which is a singleton. I'm not happy with what I've come up with so far, and I'm actually starting to wonder if I want that type of single toolbox anyway. But I guess I will stick with a single something-or-other object to route the state around to the required parts, i.e. current window, current layer, current tool, etc.
Hmm, with that sorted maybe I can start thinking about different backends.
Optimistic?
I may have been a bit overoptimistic with the desire to stick to RGBA float images. I did the maths and it gets out of hand very fast: a full sized image from my camera is over 40MB in uncompressed RGBA/8 format, and in float that's pushing 200MB. Add another for the tool layer, another for the compositing buffer and a single-channel image for the selection mask, and suddenly you're well on your way to gig country. There's no real reason for the compositing buffer to be so big; it could be as little as a single line (or a tile, since I should make it multi-threaded), but breaking up the tool layer would be 'tricky'. Even just clearing that much data is a bit of a task (although, perhaps surprisingly, it still runs at a reasonably interactive speed), and again I could fairly easily limit the area cleared.
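To put rough numbers on that, here is the arithmetic as a small Python sketch. The 4000x3000 (12 megapixel) frame is an assumed example size, not the actual camera resolution, and the selection mask is assumed to be stored as one float per pixel.

```python
def buffer_bytes(width, height, channels, bytes_per_channel):
    """Size in bytes of an uncompressed, tightly packed image buffer."""
    return width * height * channels * bytes_per_channel


W, H = 4000, 3000  # assumed 12 megapixel frame, for illustration only

rgba8 = buffer_bytes(W, H, 4, 1)       # 8 bits per channel
rgba_float = buffer_bytes(W, H, 4, 4)  # 32-bit float per channel
mask_float = buffer_bytes(W, H, 1, 4)  # single-channel float mask

# Image layer + tool layer + compositing buffer, plus the selection mask.
total = 3 * rgba_float + mask_float

for name, size in [("RGBA/8", rgba8), ("RGBA float", rgba_float),
                   ("mask", mask_float), ("total", total)]:
    print(f"{name:>10}: {size / 1e6:.0f} MB")
```

With these assumed dimensions a single-layer document already needs 624 MB, which is where trimming the compositing buffer to a line or a tile would start to pay off.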
So I guess I will have to keep other data types in mind after all, but I will probably not bother implementing them for the time being. OpenCL images can be stored in various formats but still be read from memory directly as floats, so there I could probably do it relatively transparently, at least for 4-channel images. If I ever get there.