About Me

Michael Zucchi

 B.E. (Comp. Sys. Eng.)

  also known as Zed
  to his mates & enemies!

notzed at gmail >
fosstodon.org/@notzed >

Tags

android (44)
beagle (63)
biographical (104)
blogz (9)
business (1)
code (77)
compilerz (1)
cooking (31)
dez (7)
dusk (31)
esp32 (4)
extensionz (1)
ffts (3)
forth (3)
free software (4)
games (32)
gloat (2)
globalisation (1)
gnu (4)
graphics (16)
gsoc (4)
hacking (459)
haiku (2)
horticulture (10)
house (23)
hsa (6)
humour (7)
imagez (28)
java (231)
java ee (3)
javafx (49)
jjmpeg (81)
junk (3)
kobo (15)
libeze (7)
linux (5)
mediaz (27)
ml (15)
nativez (10)
opencl (120)
os (17)
panamaz (5)
parallella (97)
pdfz (8)
philosophy (26)
picfx (2)
players (1)
playerz (2)
politics (7)
ps3 (12)
puppybits (17)
rants (137)
readerz (8)
rez (1)
socles (36)
termz (3)
videoz (6)
vulkan (3)
wanki (3)
workshop (3)
zcl (4)
zedzone (26)
Monday, 18 December 2017, 03:42

`parallel' streams

I had a task which I thought naturally fitted the Java streams stuff so tried it out. Turns out it isn't so hot for this case.

The task is to load a set of data from files, process the data, and collate the results. It's quite cpu intensive so is a good fit for parallelisation on modern cpus. Queuing theory would suggest the most efficient processing pipeline would be to run each processing task on it's own thread rather than trying to break the tasks up internally.

I tried a couple of different approaches:

The result was a bit pants. At best they utilised 2 whole cores and the total execution times were 1.0x, 0.77x, and 0.76x respectively of the serial case. The machine is some intel laptop with 4 HT cores (i.e. 8x threads).

I thought maybe it just wasn't throwing enough threads at it and stalling on the i/o, so I tried a separate flatMap() stage to just load the data.

But that made no difference and basically ran the same as the custom collector implementation.

So I hand-rolled a trivial multi-thread processing graph:

With a few sentinel messages to handle finishing off and cleanup.

Result was all 8x "cores" fully utilised and a running time 0.30x of the serial case.

I didn't record the numbers but I also had a different implementation that parallelised parts of the numerical calculation instead. Also using streams via IntStream.range().parallel() (depending on the problem size). Surprisingly this had much better CPU utilisation (5x cores?) and improved runtime. It's surprising because that is a much finer-grained concurrency with higher overheads and not applied to the full calculation.

I've delved into the stream implementation a bit trying to understand how to implement my own Spliterators and whatnot, and it's an extraordinarily large amount of code for these rather middling results.

Not that it isn't a difficult problem to solve in a general way; the stream "executor" doesn't know that I have tasks and i/o which are slow and with latency compared to many small cpu-bound tasks which it seems to be tuned for.

Still a bit disappointing.

Tagged java.
Sunday, 17 December 2017, 07:54

jjmpeg & stuff

Well for whatever reason I got stuck into redoing jjmpeg and seem to have written most of the code (90%?) after a couple of weekends. It was mostly mandraulic and a bit tedious but somehow surprisingly relaxing and engaging; a short stint of unchallenging work can be a nice change. A couple of features are still missing but the main core is done.

Unfortunately my hope that the ffmpeg api was more bindable didn't really pan out but it isn't really any worse either. Some of the nastiest stuff doesn't really need to be dealt with fortunately.

I transformed most of the getters and setters into a small number of simple macros, and thus that part is only about as much work as the previous implementation despite not needing a separate compilation stage. I split most of the objects into separate files to make them simpler to maintain and added some table-based initialisation helpers to reduce the source lines and code footprint.

It's pretty small - counting `;' there's only 750 lines of C and 471 lines of Java sources. The 0.x version has 800 lines of C and 900 lines of Java, a big portion of which is generated from an 800 line (rather unmaintainable) Perl script. And the biggest reduction is the compiled size, the jar shrank from 274KB to 73KB, with only a modest increase from 55KB to 71KB in the (stripped) shared library size (although the latter doesn't include the dvb or utility classes).

There's still a lot of work to do though, I still need to test anything actually works and port over the i/o classes and enum tables at the least, and a few more things probably. This is the boring stuff so it'll depend on my mood.

Fuck PCs

In other news I finally killed my PC - I tried one more time to play with the BIOS and after a few updates it got so unstable it just crashed during an update and bricked the motherboard. Blah. I discovered I could order a new BIOS rom so i've done that and i'll see if i can recover it, otherwise I might get another mobo if I can still get AM2+ boards here, or just get another machine. I'll probably look into the latter anyway as it's always been a bit of a hassle (despite working flawlessly when it does and it's a very nice small machine.

Tagged java, jjmpeg, rants.
Friday, 08 December 2017, 09:16

jjmpeg?

Well i've had reason to visit jjmpeg again for something and although it's still doing the job, it's a very very long way behind in version support (0.10.x?). I've added a couple of things here and there (recently AVFormatContext.open_input so I could open compressed webcam streams) but i'm not particularly interested in dropping another release.

But ... along the way I started looking into writing a new version that will be up to date with current ffmpeg. It's a pretty slow burner and i'm going to be pretty busy with something (relatively interesting, moderately related) for the next couple of months.

But regardless here are a few what-if's should I continue with the effort.

Does anyone else care?

Tagged java, jjmpeg.
Thursday, 30 November 2017, 21:52

RAM saga

Powered down last night because of an approaching thunderstorm ... half my ram gone again.

disable_mtrr_cleanup did nothing. disable_mtrr_trim would hang the boot.

I noticed the RAM speed was wrong in the BIOS again so i reset it, and lo and behold it all showed up, but only until the next reboot. Back to bhe BIOS and just changed the ram from one speed to another and back again - RAM returns!

Erg.

Tagged junk.
Sunday, 19 November 2017, 13:02

I hate my life.

And I wish I was dead.

Sunday, 19 November 2017, 02:52

I hate peecees

Well I was up till 3am fucking around with this bloody machine.

After verifying the hardware actually works it seems that the whole problem with my RAM not being found is the damn BIOS. I downloaded a bunch of BIOSs intending to try an older one and realised I hadn't installed the latest anyway. So I dropped than in and low and behold the memory came back. Yay.

So now I had that, I thought i'd try and get OpenCL working again. Ok, installed the latest (17.40) amdgpu-pro ... and fuck. Unsupported something or other in the kernel module. Sigh. I discovered that 17.30 did apparently work ... but it took a bit of digging to find a download link for it as AMD doesn't link to older releases (at this point i also found out midori has utterly shithouse network code. sigh). I finally found the right one (that wasn't corrupt), installed it and finally ...

Oh, back to 3.5G ram again. FAAARK.

At this point the power went out for the whole street so I had a shower using my phone's torch and went to bed.

Did some more digging when I got up (I was going to give up and say fuck it, but i've come this far), tried manually adding the ram using memmap, and finally confirmed the problem was the BIOS. So i tried an older one. That worked.

But only for a while, and then that broke as well. So trying the previous one ... groan.

Mabye it's time to cut my losses, it's already 1pm, the sun is out, heading for a lovely 31 degrees.

I also got rocm installed but i don't know if it works on this hardware, although at least the kernel is running ok so far.

Tagged junk, rants.
Saturday, 18 November 2017, 09:58

io scheduler, jfs, ssd

I didn't have much to do today and came across some articles about jfs and io schedulers and thought i'd run a few tests while polishing off a bottle of red (Noons Twleve Bells 2011). Actually i'd been waiting for a so-called "mate" to show up but he decided he was too busy to even let me know until after the day was over. Not like i've been suicidally depressed this week or anything.

The test I chose was to compile open jdk 9 which is a pretty i/o intensive and complex build.

My hardware is a Kaveri A10-7850K APU, "4G" of memory (sigh) on an ITX motherboard, and a Samsung SSD 840 EVO 250GB drive. For each test I blew away the build directory, re-ran configure, set the scheduler "echo foo > /sys/block/sda/queue/scheduler", and flushed the buffer caches "echo 3 > /proc/sys/vm/drop_caches" before running "time make images". I forced all cpu cores to a fixed 3.0Ghz using the performance scheduler.

At first I kept getting ICE's in the default compiler so I installed gcc-4.9 ... and got ICE's again. Inconsistently though, and we all know what ICE's in gcc means (it almost always means broken hardware, i.e. RAM). Sigh. It was afer a suspend-resume so that might've had something to do with it.

Well I rebooted and fudged around in the BIOS intending to play with the RAM clock but noticed I'd set the voltage too low. So what the hell I set it 1.60v and rebooted. And whattyaknow, suddenly I have 8G ram working again (for now), first time in a year. Fucking PCs.

Anyway so I ran the tests with each i/o scheduler on the filesystem and had no ICE's this time (which was dissapointing because it means the hardware isn't likely to be busted afterall, just temperamental). I used the default compiler. As it was a fresh reboot I ran builds from the same 80x24 xterm and the only other luser applications running beyond the panels and window manager were 1 other root exterm and an emacs to record the results.

cfq

real    10m39.440s
user    28m55.256s
sys     2m40.140s

deadline

real    10m36.500s
user    28m44.236s
sys     2m40.788s

noop

real    10m42.683s
user    28m43.960s
sys     2m41.036s

As expected from the articles i'd read, deadline (not the default in ubuntu!) is the best. But it's hardly a big deal. Some articles also suggested using noop for SSD storage but that is "clearly" worse (oddly that it increases sys time if it doesn't do anything). I only ran one test each in the order shown so it's only an illustrative result but TBH I'm just not that interested especially if the resutls are so close anyway. If there was something more significant I might care.

I guess i'll switch to deadline anyway, why not.

No doubt much of the compilation ran from buffer cache - infact this is over 2x faster than when I compiled it on Slackware64 a few days ago - at the time I only had 4G system ram (less the framebuffer) and a few fat apps running. But I also had different bios settings (slower ram speed) and btrfs rather than jfs.

As an aside I remember in the early days of Evolution (2000-2-x) when it took 45 minutes for a complete build on a desktop tower, and the code wasn't even very big at the time. That dell piece of crap had woefully shitful i/o. Header files really kill C compilation performance though.

Still openjdk is pretty big, and i'm using a pretty woeful CPU here as well. Steamroller, lol. Where on earth are the ryzen apu's amd??

notzed@minized:~/hg/jdk9u$ find . -type f \
  -a \( -name '*.java' -o -name '*.cpp' -o -name '*.c' \) \
   | xargs grep \; | wc -l
3063124

(some of the c and c++ is platform specific and not built but it's a good enough measure).

Tagged junk.
Friday, 17 November 2017, 22:57

Midori user style

Well I found a way to make Midori usable for me as a browser-of-text using the user stylesheet thing.

Took some theme and stripped out the crap and came up with this ... it turns all the text readable but leaves most of the rest intact which is an improvement on how firefox rendered it's colour and stye overrides. It just overrode everything which broke a lot of style-sheet driven GUI toolkits amongst other sins.

* {
    color: #000 !important;
    text-shadow: 0 0 0px #000 !important;
    box-shadow: none !important;
    background-color: #777 !important;
    border-color: #000 !important;
    border-top-color: #000 !important;
    border-bottom-color: #000 !important;
    border-left-color: #000 !important;
    border-right-color: #000 !important;
}

div, body {
    background: transparent !important;
}

a, a * {
    color: #002255 !important;
    text-decoration: none !important;
}

a:hover, a:hover *, a:visited:hover, a:visited:hover *, span[onclick]:hover, div[onclick]:hover, [role="link"]:hover, [role="link"]:hover *, [role="button"]:hover *, [role="menuitem"]:hover, [role="menuitem"]:hover *, .link:hover, .link:hover * {
    color: #005522 !important;
    text-decoration: none !important;
}

a:visited, a:visited * {
    color: #550022 !important;
}

I don't know if i'll stick with it yet but it's a contender, its built-in javascript blocker looks a lot better than some shitty plugin which will break every ``upgrade'' too. It doesn't seem to run javascript terribly fast, but that's the language's fault for being so shithouse.

Newer Posts | Older Posts
Copyright (C) 2019 Michael Zucchi, All Rights Reserved. Powered by gcc & me!