Wrong UART
Well I got serial output working - was using the wrong UART (should've read the docs before assuming UART1 - it's UART3). And Das U-Boot leaves it properly configured so the output function is only 3 lines of code.
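For the curious, a polled output routine along these lines is about all it takes once the boot loader has done the setup. This is only a sketch: the UART3 base address, the 16550-style register offsets and the names `uart_putc`/`uart_puts` are my assumptions for illustration, so check them against the TRM before trusting them.

```c
#include <stdint.h>

/* Assumed values: UART3 base address on the OMAP3530 and a 16550-style
 * register layout with 32-bit register spacing. Verify against the TRM. */
#define UART3_BASE  0x49020000u
#define UART_THR    (0x00 / 4)   /* transmit holding register (word index) */
#define UART_LSR    (0x14 / 4)   /* line status register (word index) */
#define LSR_THRE    (1u << 5)    /* transmitter can accept another byte */

/* Busy-wait until the transmitter is ready, then write one character. */
void uart_putc(volatile uint32_t *uart, char c)
{
    while ((uart[UART_LSR] & LSR_THRE) == 0)
        ;
    uart[UART_THR] = (uint8_t)c;
}

/* Write a NUL-terminated string one character at a time. */
void uart_puts(volatile uint32_t *uart, const char *s)
{
    while (*s)
        uart_putc(uart, *s++);
}
```

On the board this would be called as `uart_puts((volatile uint32_t *)UART3_BASE, "hello\n");`. Passing the base in as a pointer keeps the routine easy to try out against a plain array on a workstation.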
Then I hit an issue as soon as I started to use static data - my code wasn't being loaded at the right address (which took me some time to realise, since the code still ran). I don't really know why it works but I fiddled with the Das U-Boot mkimage line and now it's loading everything to the right address - it should be loading at 0x80008000 but I need to tell it to load at 0x80000000 to make it work. *shrug*
I got sick of ripping the SD card out to set up a new 'kernel' image (and a little worried, since the SD slot on the BeagleBoard isn't really designed for repeated use), so I rigged up a rather nasty `runscript` (from minicom) script which uses ymodem to upload the image whenever I reset the machine. While the image is small this makes the testing cycle pretty easy. Just need a serial port thingy for my workstation and then I won't need the laptop anymore either.
Also along the way I wrote up some basic graphics output - just to clear the screen and display text, including a simple printf-type thing. I converted my favourite font 'fixed' to a C file and can now dump text to the framebuffer. Originally I just wanted it to get some debugging output since I couldn't work out what was up with serial, but it's still handy to have.
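Dumping a bitmap font to the framebuffer boils down to something like the sketch below. It assumes a 16-bit-per-pixel framebuffer and an 8x8 glyph stored one byte per row with the most significant bit leftmost; the glyph data and the `draw_glyph` name are made up for illustration and are not the real 'fixed' font.

```c
#include <stdint.h>

#define GLYPH_W 8
#define GLYPH_H 8

/* Hypothetical 8x8 glyph for 'T': one byte per row, MSB = leftmost pixel. */
const uint8_t glyph_T[GLYPH_H] = {
    0xFF, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x00
};

/* Draw one glyph with its top-left corner at pixel (x, y).
 * stride is the framebuffer width in pixels. No clipping is done. */
void draw_glyph(uint16_t *fb, int stride, int x, int y,
                const uint8_t *glyph, uint16_t fg, uint16_t bg)
{
    for (int row = 0; row < GLYPH_H; row++) {
        uint16_t *dst = fb + (y + row) * stride + x;
        for (int col = 0; col < GLYPH_W; col++)
            dst[col] = (glyph[row] & (0x80 >> col)) ? fg : bg;
    }
}
```

A string routine would just call this once per character and advance `x` by `GLYPH_W`, and a printf-style wrapper can sit on top of that.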
And lastly I had a bit of a play with the DMA engine. And after a LOT of mucking about, finally got it filling rectangular regions. I couldn't for the life of me work out why double-indexed DMA just did nothing - in my haste to avoid reading the docs too closely I missed the bit which says exactly how it calculates the next address after a 'frame'. I had the alignment mucked up so it did absolutely nothing. If nothing else, it can be used for a very basic 2D accelerator - 90 degree rotations and flips, rectangular moves/copies, solid colour fills, and even masked sprites (colour keyed). So now my 'random filled box demo' runs a lot faster than it did before - around 10x actually (not that it was particularly optimised or anything).
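To make the 'next address after a frame' bit concrete, here is a small software model of a double-indexed destination walk doing a rectangle fill. It is a sketch based on my reading of the addressing rule (the address advances by element size plus index minus one, using the element index inside a frame and the frame index after the last element of a frame), so treat that rule, the fixed 16-bit element size and the function names as assumptions rather than gospel.

```c
#include <stdint.h>
#include <stddef.h>

#define ES 2   /* element size in bytes (16-bit pixels assumed) */

/* Model of the destination side of a double-indexed transfer.
 * Assumed rule: within a frame  next = cur + ES + (ei - 1),
 *               after a frame   next = cur + ES + (fi - 1).
 * en = elements per frame (rectangle width), fn = frames (rectangle height). */
void dma_fill_model(uint8_t *mem, size_t dst, int en, int fn,
                    int ei, int fi, uint16_t colour)
{
    for (int f = 0; f < fn; f++) {
        for (int e = 0; e < en; e++) {
            mem[dst]     = (uint8_t)(colour & 0xFF);  /* little-endian store */
            mem[dst + 1] = (uint8_t)(colour >> 8);
            dst += ES + ((e == en - 1) ? fi : ei) - 1;
        }
    }
}

/* Frame index that jumps from the end of one rectangle row to the start of
 * the next, given the framebuffer stride in bytes. */
int rect_frame_index(int stride_bytes, int width_px)
{
    return stride_bytes - width_px * ES + 1;
}
```

With an element index of 1 the pixels in a row are contiguous, and the frame index supplies the gap to the next row. If the frame index is off by even one byte, every row after the first lands on a misaligned address.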
I did some research on USB. Man what a mess that is. A few books I 'found' on the topic were useless too, just written from the perspective of Windows driver writers at best (when a book uses 'Visual Basic' for 'portability' you know there's something very very wrong). I found some USB host stacks but they're proprietary, GPL2.x only, or BSD + advertising clause. And in any event they're tightly coupled to a given OS (as they must necessarily be). The Haiku one is at least X11 licensed, and clean and simple, so apart from being C++ it might be the place to start if I decide to hurt myself one day with that pain in the head.
Hmmm, perhaps interrupts next, will have to get my hands dirty with assembly at some point.