OpenCL SURF
Over the weekend I ported the clsurf code to socles. I also had a bit of a play with a few other things, although they didn't make much difference to the execution time (well, maybe 10% of the kernel time at most). I guess one thing to try will be to change it so that it doesn't need to communicate with the host at all: that should be possible with some 'persistent' kernels and by enforcing a hard limit on the feature point count.
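To make the hard-limit idea a bit more concrete, here is a rough host-side C sketch of a fixed-capacity feature list filled through an atomic counter. It is not code from clsurf or socles: `MAX_FEATURES`, `feature_point`, `feature_list` and the three functions are all made-up names for illustration, and C11 atomics stand in for what would be something like `atomic_inc` on a global counter inside an OpenCL kernel. The point is only that once the capacity is fixed up front, nobody has to read the count back to size a buffer before the next stage runs.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical hard limit on feature points (illustrative value only). */
#define MAX_FEATURES 4

/* Hypothetical feature record, not the layout used by clsurf. */
typedef struct {
    float x, y, scale;
} feature_point;

/* Fixed-capacity list: storage is allocated once and never resized. */
typedef struct {
    atomic_uint count;                  /* append attempts; may exceed MAX_FEATURES */
    feature_point items[MAX_FEATURES];
} feature_list;

/* Start with an empty list. */
static void feature_list_init(feature_list *list) {
    atomic_init(&list->count, 0u);
}

/* Reserve a slot atomically; drop the point once the cap is reached.
 * Returns 1 if the point was stored, 0 if it was discarded. */
static int feature_list_append(feature_list *list, feature_point p) {
    unsigned int slot = atomic_fetch_add(&list->count, 1u);
    if (slot >= MAX_FEATURES)
        return 0;
    list->items[slot] = p;
    return 1;
}

/* Number of valid entries: the attempt counter clamped to the cap. */
static unsigned int feature_list_size(feature_list *list) {
    unsigned int n = atomic_load(&list->count);
    return n < MAX_FEATURES ? n : MAX_FEATURES;
}
```

A later stage would then always be launched for `MAX_FEATURES` work items and simply skip any index at or beyond the clamped size, at the cost of silently dropping features past the cap.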
I put a few more details over on the project page.