Bugz
Fixed a couple of bugs in the blogz code to do with post urls, and
added newer/older links on the individual post page.
I think I can now set up robots.txt so it doesn't over-index.
Should be able to just block '/tag' and '/blog?' I think. So it
just indexes the front page and then anything /post/* it finds
from it. Any crawler should be able to find the whole blog from
one post now that they're linked.
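Something like this should do it (assuming the crawlers treat
Disallow entries as simple path prefixes, which the big ones seem
to):
User-agent: *
Disallow: /tag
Disallow: /blog?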
Well i've put it in and i'll keep an eye on the logs to see what
happens.
I can't add the post title to the prev/next post links yet but
i've been working on a database-based index which will allow it
and some other useful things. I could just hack another index
into the compiled-in indices but I think that path has outlived
its usefulness.
Memory, IPC. Tradeoffs.
I've been playing around with a few ideas for libeze but it's not
gotten anywhere particular yet. I'm getting stuck on the
tradeoffs - even though they don't really matter much.
IPC - Serialisation
It started when I read a rant against serialisation formats; this
I agree with. An internal blob isn't that important but I thought
i'd look at some to see if any were useful.
- XDR
-
This is old and simple. It was written by a smart bunch from
a smart company from a time when smarter decisions were being
made about things like this. The biggest drawback is the
big-endian nature, but I could always fudge it and just pass
native-endian values and call it RDX or something.
I then had the bright idea of being able to decode structures
without any additional memory - merely referencing the packet
data. But this creates a couple of additional areas where it
would diverge: strings would need to include their trailing nul
byte to be useful, and embedded arrays might need specific
alignment.
So really, it's not quite RDX anymore either. OTOH R is
listed as `supporting' XDR but its version is modified as
well.
XDR is also not self-describing but discriminated unions go a
long way toward handling versioning, which is the main reason I
might want such a feature.
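As a sketch of what I mean by the in-place decode (hypothetical
names, and assuming the tweaked format: native-endian lengths,
strings stored with their trailing nul, everything padded to 4
bytes):
#include <stdint.h>
#include <string.h>

/* Hypothetical in-place decoder: 'pos' walks the packet, strings are
   returned as pointers into the packet data itself (no allocation). */
struct ez_decode {
	const uint8_t *data;
	size_t pos, size;
};

/* native-endian 32-bit integer, 4-byte aligned as per XDR */
static int decode_u32(struct ez_decode *d, uint32_t *v) {
	if (d->pos + 4 > d->size)
		return -1;
	memcpy(v, d->data + d->pos, 4);
	d->pos += 4;
	return 0;
}

/* string stored as length (including trailing nul) then the bytes,
   padded to a 4-byte boundary; returns a pointer into the packet */
static int decode_string(struct ez_decode *d, const char **s) {
	uint32_t len;

	if (decode_u32(d, &len) != 0 || d->pos + len > d->size)
		return -1;
	if (len == 0 || d->data[d->pos + len - 1] != 0)
		return -1;
	*s = (const char *)(d->data + d->pos);
	d->pos += (len + 3) & ~3u;
	return 0;
}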
- ASN.1 BER/DER/CER/PER
-
There is actually some stuff I quite like about BER. It's
self-describing and relatively compact.
There's a lot that makes it a pain. A full implementation
needs to implement a lot as it is vastly over-engineered; this
means it is complex and likely to be buggy. REAL numbers are
encoded in a way that needs a lot of fiddling about on either
end.
It's not all wrong, but it's not too right either.
- Protocol buffers
-
I've had the misfortune to have to deal with these for work.
The actual on-wire format isn't too bad. It's not
self-describing but you can safely skip bits you don't
understand. It's relatively compact. It's also a bit
pointless.
The SDK is a complete pile of shit though. The Java
implementation compiles gigantic code that runs fucking slow
(List<Float>???). It moves all of the protocol variance
decisions into the calling code so it's a messy pain to use as
well: sort of negating one of the reasons to use a
self-describing format in the first place.
For work I wrote my own decoder when I could - a couple of
dozen lines of code. But some projects use such complex
structures you end up with a gigantic library to decode it
(supplied by protoc) and then a gigantic library to marshall
its output into something usable (that you write by hand).
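For the simple cases that hand-rolled decoder really is only a
couple of dozen lines - something like this sketch (names made up,
field dispatch left out):
#include <stdint.h>
#include <stddef.h>

/* decode a base-128 varint: 7 bits per byte, LSB group first,
   the high bit of each byte is the continuation flag */
static size_t pb_varint(const uint8_t *p, size_t size, uint64_t *v) {
	uint64_t x = 0;
	size_t i;

	for (i = 0; i < size && i < 10; i++) {
		x |= (uint64_t)(p[i] & 0x7f) << (7 * i);
		if ((p[i] & 0x80) == 0) {
			*v = x;
			return i + 1;
		}
	}
	return 0; /* truncated or overlong */
}

/* walk the fields of one message, skipping anything unrecognised */
static int pb_scan(const uint8_t *p, size_t size) {
	size_t pos = 0;

	while (pos < size) {
		uint64_t key, val;
		size_t n = pb_varint(p + pos, size - pos, &key);

		if (n == 0)
			return -1;
		pos += n;

		switch (key & 7) {	/* wire type; the field number is key >> 3 */
		case 0:	/* varint */
			if ((n = pb_varint(p + pos, size - pos, &val)) == 0)
				return -1;
			pos += n;
			break;
		case 1:	/* fixed 64-bit */
			if (size - pos < 8)
				return -1;
			pos += 8;
			break;
		case 2:	/* length-delimited: strings, bytes, sub-messages */
			if ((n = pb_varint(p + pos, size - pos, &val)) == 0)
				return -1;
			pos += n;
			if (val > size - pos)
				return -1;
			pos += val;
			break;
		case 5:	/* fixed 32-bit */
			if (size - pos < 4)
				return -1;
			pos += 4;
			break;
		default:	/* groups and anything else: bail out */
			return -1;
		}
		/* a real decoder would switch on (key >> 3) here and pick
		   out the fields it actually cares about */
	}
	return 0;
}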
- JSON
-
It's just fucked. Even the same implementation can't decide
how to canonically encode empty or single-element arrays, null
or empty strings, and so on.
It's not very compact and while it's relatively simple to
parse, it's not trivial either.
And then it's clumsy as fuck to use from any non-dynamic
language. Getting even fields you know are there is a pain.
Handling variances is worse.
- CBOR (and half a dozen others)
-
Well this is just binary JSON. It fixes the end-to-end
canonical encodingness and is arguably faster to parse but
it's still going to be clumsy to use and not significantly
more compact except for arrays of small integers.
It also has some questionable design decisions and is already
bloating with a dozen extension types - it seems like it wants
to look as 'valid' as ASN.1/BER but with an even messier
encoding scheme.
The whole "look we have a completely-defined 256 element jump
table" is just a fucking weird thing to have in a
serialisation format RFC. And nothing says simple like 256
right? What on earth is that even about?
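To be fair the initial-byte decode itself is trivial - 3 bits of
major type and 5 bits of additional information - which is what
makes the jump table thing so odd. A sketch:
#include <stdint.h>

/* CBOR initial byte: top 3 bits = major type, low 5 bits = additional info.
   Returns how many argument bytes follow, or -1 for reserved values. */
static int cbor_head(uint8_t b, unsigned *major, unsigned *info) {
	*major = b >> 5;	/* 0..7: uint, negint, bytes, text, array, map, tag, float/simple */
	*info = b & 0x1f;

	if (*info < 24)
		return 0;			/* value embedded directly in the low bits */
	if (*info <= 27)
		return 1 << (*info - 24);	/* 1, 2, 4 or 8 byte argument follows */
	if (*info == 31)
		return 0;			/* indefinite-length marker */
	return -1;				/* 28..30 are reserved */
}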
- SDXF
-
This is another RFC I came across along the way - not sure how
as it just took me 10 minutes to re-find it.
Honestly it's a weird arsed format that looks like someone's
internal stuff they decided to publish. I can't imagine
anyone uses it.
I would note that with some small changes the meta-data I have can
support pretty much all of these formats.
But I pretty much wasted a few hours of my life on this. After all
that fucking around I would be inclined to just dump structures
around if i didn't need strings! I have several desirable
features which can't be covered by a single format anyway,
although two would probably suffice.
e.g. here are some use cases.
- Persistent storage.
- Compact
- Fast decoding
- Versionable
- Objects are good enough to be used directly without having
a DTO
- Complex and nested structures?
- Self describing?
- Local IPC
- Very fast encoding and decoding
- Possibility to decode without auxiliary memory
- Simple and flat structures
- Keep security in mind
- Internet IPC, External interfaces.
- Compact
- Versionable
- Possibility to encode and decode in a streaming fashion
- Simple and unambiguous to help avoid writing-in security issues
- Self describing?
Ok, maybe 3 formats. Although i'm not really interested in the
3rd at the moment.
The Local IPC case is basically a subset of XDR, aside from the
aforementioned tweaks.
The rest? Plenty there but none are a great fit.
Memory
Memory allocation in C is simple enough but managing it can be a
bit of a pain, hence pools.
Another issue is that a given implementation enforces certain
constraints which can waste (significant) memory: for example on
an amd64 platform, malloc() allocates at least 32 bytes for each
allocation even if it's only a 5-byte string (this is the
minimum required to track free blocks). And every allocation also
saves a size_t before the memory block so it knows how big it was
when you call free: when you have many small objects of the same
size this wastes a lot of memory. A more subtle problem is that
memalign() must also save this size, so that if you start
allocating many aligned structures you quickly start to get
holey memory.
Dealing with strings in general is a bit of a pain (and a source
of many security issues) and only the GNU-specific obstack really
provides any good support for it.
So I experimented with a few ideas.
High-level internals of malloc() & free()
Implementations of this pair all work in basically the same way.
malloc will find a block of free memory big enough to hold the
requested size plus space to hold that size. Some area of this
block will be reserved for the allocation and the free block will
either be reduced by the same amount, or removed (if the whole
block is used).
free() will take the saved size and address and mark it as free.
As part of this process it will find out if there are other free
blocks immediately adjacent to this new free block and coalesce
them into a single larger block if so. There are only 4 cases:
isolated free, adjacent before, adjacent after, adjacent before
and after.
The memory in the free block itself is used to hold the
controlling structures required for the two functions. They cease
to exist once the memory is allocated.
Additionally, for finding blocks there are two basic strategies:
- first-fit
-
The free list is stored sorted by address. It is scanned from
the start and the first block with enough memory to fit the
request is used.
- best-fit
-
The free list is sorted by size. The search starts from the
first block which is larger than or equal to the desired size
and that block is used.
Each has its benefits and trade-offs.
first-fit may be slower to allocate but will more effectively
utilise memory - less total memory will be required. I did some
basic and non-exhaustive testing that gave me this result. It can
also be implemented with very simple code and the same list can be
used for performing coalescing.
best-fit can be faster to allocate as you find the correct block
directly. Based on my testing it requires more total memory.
Additional data structures are required to implement the free()
operation.
So finally, these choices and any other indices required
determine the data structures needed to track and find the free
blocks, and thus determine the minimum allocation size.
GNU libc malloc() uses a size_t for the block size, a
double-linked list for the free list, and another size_t at the
end of the free block for fast coalescing. So the minimum size of
a free block (and hence the minimum allocation size) is 32 bytes
on a 64-bit system and 16 bytes on a 32-bit system. I think it
uses
best-fit.
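Roughly speaking a free block looks something like this (a
simplified sketch, not the real glibc structure):
#include <stddef.h>

/* simplified sketch of a glibc-style free block: all of the control
   data lives inside the free memory itself */
struct free_block {
	size_t size;			/* block size, low bits used for flags */
	struct free_block *fd;		/* doubly-linked free list */
	struct free_block *bk;
	/* ... unused remainder of the block ... */
	/* a copy of the size sits at the very end of the block so the
	   following block can find this one's start when coalescing */
};
/* 4 x 8 bytes = 32 byte minimum on a 64-bit system, 4 x 4 = 16 on 32-bit */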
AllocMem()/FreeMem()
The first memory allocator I knew was the one from AmigaOS via
AllocMem(). This only has a single constraint in that the minimum
allocation is 8 bytes. There is no fixed allocation overhead to
store the allocated size as you need to also provide the size to
FreeMem(). Free nodes are stored as a single-linked list.
It takes very little code to implement this and it utilises memory
very well (it can't really be more utilised than 100%).
The drawback is that both allocation and deallocation are O(N)
operations since both have to walk the free list. I guess 'slow'
is relative as they did ok on a 7MHz CPU with a global(!) shared
memory list. But because one knew it was slow you wrote code to
avoid it (SunOS also had a dreadfully slow allocator so it wasn't
unique in this regard).
On memory-constrained systems I think it's still a useful
technique. free() can also be accelerated somewhat by using a
balanced tree, at the cost of bumping up the minimum allocation.
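For reference, a sketch of the same shape of thing (not the
AmigaOS code): an address-sorted single-linked free list,
first-fit allocation, and a free that takes the size and coalesces
with its neighbours. A pool is seeded by calling free_mem() once
over the whole region.
#include <stddef.h>

/* free nodes live inside the free memory itself */
struct mem_node {
	struct mem_node *next;
	size_t size;
};

static struct mem_node head;	/* head.next = first free block, address order */

void *alloc_mem(size_t size) {
	struct mem_node *prev = &head, *n;

	/* round up to one node so any remainder can always hold a node */
	size = (size + sizeof(struct mem_node) - 1) & ~(sizeof(struct mem_node) - 1);
	for (n = head.next; n; prev = n, n = n->next) {
		if (n->size >= size) {
			if (n->size == size) {
				prev->next = n->next;	/* exact fit: unlink */
			} else {
				/* carve from the front, shrink the free node */
				struct mem_node *rest = (struct mem_node *)((char *)n + size);

				rest->next = n->next;
				rest->size = n->size - size;
				prev->next = rest;
			}
			return n;
		}
	}
	return NULL;
}

void free_mem(void *mem, size_t size) {
	struct mem_node *prev = &head, *node = mem, *n;

	size = (size + sizeof(struct mem_node) - 1) & ~(sizeof(struct mem_node) - 1);
	/* keep the list sorted by address */
	for (n = head.next; n && n < node; prev = n, n = n->next)
		;
	node->size = size;
	node->next = n;
	prev->next = node;
	/* coalesce with the following block */
	if (n && (char *)node + node->size == (char *)n) {
		node->size += n->size;
		node->next = n->next;
	}
	/* coalesce with the preceding block */
	if (prev != &head && (char *)prev + prev->size == (char *)node) {
		prev->size += node->size;
		prev->next = node->next;
	}
}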
ez-page-alloc
So the first idea I played with was a page allocator which would
underlie more specific algorithms. The idea would be that it would
require the free size like FreeMem() and so allow pages to be
allocated with no overhead and no wastage.
I didn't really get anywhere with this and then revisited it in a
different way later on (ez-alloc-alloc).
ez-string-alloc
I diverted for a little while trying to work on a string
allocator. It would take some number of pages as an allocation
block. In this case there would be no way to free individual
strings. But they wouldn't require any alignment or size overhead
either.
Strings can be built a character at a time, and large strings are
moved over to their own block.
Allocation can be made very fast by only searching the most recent
block. Or it can be made more compact by searching all blocks.
Or some trade-off in-between.
I think this could be the basis of a good no-free high performance
pool allocator.
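Roughly the sort of thing I mean (hypothetical names, ignoring the
build-a-character-at-a-time interface and the search over older
blocks):
#include <stdlib.h>
#include <string.h>

/* one allocation block; strings are bump-allocated and never freed individually */
struct str_block {
	struct str_block *next;
	size_t used, size;
	char data[];
};

struct str_pool {
	struct str_block *blocks;	/* most recent block first */
};

/* make sure the newest block has room for len bytes, adding a block if not */
static char *str_reserve(struct str_pool *p, size_t len) {
	struct str_block *b = p->blocks;

	if (!b || b->size - b->used < len) {
		size_t bsize = 4096 - sizeof(*b);	/* nominally one page per block */

		if (len > bsize)
			bsize = len;	/* oversized string gets its own block */
		b = malloc(sizeof(*b) + bsize);
		if (!b)
			return NULL;
		b->used = 0;
		b->size = bsize;
		b->next = p->blocks;
		p->blocks = b;
	}
	return b->data + b->used;
}

/* copy a C string into the pool, keeping the trailing nul */
char *str_dup(struct str_pool *p, const char *s) {
	size_t len = strlen(s) + 1;
	char *d = str_reserve(p, len);

	if (d) {
		memcpy(d, s, len);
		p->blocks->used += len;
	}
	return d;
}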
On the way I came across apr_pool and its code is pretty ugly and
I was surprised it didn't support free() of individual allocations,
given how complex it is.
ez-fixed-alloc
I've written multiple memory allocators in the past and one of the
easiest and most useful is an allocator for fixed-sized blocks.
It was particularly important for glib container classes.
Free is trivial as you already know the size and never have to
coalesce free blocks and you can just plop the memory into the
head of a single-linked list.
Allocate is likewise as trivial as a single-linked list.
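The whole thing is only a few lines (a sketch; a real one would
carve its blocks out of larger chunks rather than calling malloc):
#include <stdlib.h>

/* fixed-size allocator: free blocks are threaded through the free memory itself */
struct fixed_pool {
	void *free_list;	/* single-linked list of free blocks */
	size_t size;		/* block size, at least sizeof(void *) */
};

void *fixed_alloc(struct fixed_pool *p) {
	void *m = p->free_list;

	if (m)
		p->free_list = *(void **)m;	/* pop the head */
	else
		m = malloc(p->size);		/* or carve from a larger chunk */
	return m;
}

void fixed_free(struct fixed_pool *p, void *m) {
	*(void **)m = p->free_list;	/* push onto the head */
	p->free_list = m;
}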
However you end up needing a separate allocator for each size,
so you just end up with a lot of wasted space anyway.
Eh I dunno, i'm thinking about whether this one is useful at all.
ez-alloc-alloc
I had a look at an allocator that could support memalign() without
wastage. And basically do away with the need for a special page
allocator.
This is basically a tree-based version of AllocMem()/FreeMem() but
with a very large alignment. Because there is no size overhead
all memory is used and allocating new page-aligned pages doesn't
leave unfillable holes.
Performance problems with the algorithm would be mitigated by
having fewer, larger allocations and leaving the internal
allocations to other more specific algorithms.
Still I dunno, it has many drawbacks in this form but I like the
idea. Probably some sort of external index (radix-tree) would be
better and enforcing N*page sized allocations.
ez-pool-alloc
I tried to write an allocator that supported free but with the
minimum overhead. I got it down to a minimum allocation size of 8
bytes (void *) with no overhead.
Basically I wrote another AllocMem()/FreeMem() but this time much
closer to the original. To support the minimum allocation size of
8 bytes on a 64-bit platform the lsb of the link pointer is used
to indicate whether the size is 8 bytes (no room for size) or
larger (has size).
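In other words something like this (a sketch of just the node
layout and the tag handling, hypothetical names):
#include <stdint.h>
#include <stddef.h>

/* a free node: blocks are 8-byte aligned so bit 0 of the link pointer is
   spare and is used to flag "this block is exactly 8 bytes, no size field" */
struct pool_node {
	uintptr_t next;		/* link to the next free node, plus the flag bit */
	size_t size;		/* only present/valid when the flag bit is clear */
};

static inline struct pool_node *node_next(const struct pool_node *n) {
	return (struct pool_node *)(n->next & ~(uintptr_t)1);
}

static inline size_t node_size(const struct pool_node *n) {
	return (n->next & 1) ? 8 : n->size;
}

static inline void node_set(struct pool_node *n, struct pool_node *next, size_t size) {
	if (size == 8) {
		n->next = (uintptr_t)next | 1;	/* no room to store the size */
	} else {
		n->next = (uintptr_t)next;
		n->size = size;
	}
}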
This still suffers from the same performance problems, although it
can be mitigated by being used only for short-term memory or with
few frees.
The Story So Far. A Big Waste Of Fucking Time.
In the end i'm not really sure I like anything I came up with.
I'll probably change the existing ezelib serialiser, and likely
split the code up so that the description part is separate from
the (multiple) encoding part(s). I'll probably have something
XDR-like and another more compact one, maybe tagged. Each may not
support all possible structures.
The memory stuff was really just fucking around; I don't remotely
need any of this. But I will poke at it some more I guess.
Probably a pageish allocator which manages system memory without
allocation overhead (at point of allocation), then a couple of
allocators which sit atop of it and can be used as pools. A
general purpose one and a high-performance non-freeing one that
also supports incremental string/structure creation.
The general purpose pool might just end up being very similar to
malloc() but with the (rather useful) benefit of being able to be
freed en-mass. I can possibly get the minimum allocation size
down to 16 bytes although I will need to borrow some bits from
ez-tree's ez_node to achieve this. I can possibly also get
first-fit to work efficiently (order tree by size then address?).
I think i'm going to have to save the size in allocations though
(thus 8-byte allocation overhead): it's going to be needed a lot
of the time and I can't do fast coalescing without it. If I have
a particularly fast page allocator i might be able to do something
neat for small fixed-sized allocs and that would also cover the
interesting memalign cases.
Ultimately they would need valgrind instrumentation as well.
Probably back to work next week, should get away from the screen.
Weather's nice and warm but not too hot. Finally getting a few
tomatoes and the sweet-corn is nearly there.
For the PlayerZ
Well i've been stuck in the house for various reasons so i've been
doing a lot of solid work on the audio player software for the
mele box.
It's not feature complete but it all works:
- Removable drives are automatically detected and mounted or
  unmounted if they contain filesystems. They are mounted read-only
  via their UUID. This is implemented only using kernel calls and
  libblkid.
- Detected drives are scanned and media files are indexed into a
  persistent database. Arbitrary trees on the filesystem can also
  be indexed.
- Multiple secondary indices are generated including title and
  author.
- Deleted, changed, or added files are updated properly when they
  are scanned.
- Various relational database constraints are implemented - albeit
  manually.
- The music player supports all the basic features like pause,
  volume, mute, fast-forward, rewind, next and previous file.
- It can be controlled via the Mele airmouse.
- It's somewhat robust to media being removed while in use. I've
  already hit a bug in the kernel so it may be the limiting factor
  here.
- Scalable, space- and time-efficient algorithms are used
  throughout.
- It's about 2KLOC of plain C and a couple of libraries.
- It is already, or is very close to, 'valgrind clean'.
I'm using POSIX message queues for communicating between the
processes with a simple C struct serialisation mechanism. I'm not
using threads as yet; the player just runs as a single thread.
This means it can block while reading from libavformat but it
greatly simplifies the player logic.
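The mechanism is about as simple as IPC gets - roughly this
(hypothetical message struct and queue name; link with -lrt):
#include <mqueue.h>
#include <fcntl.h>
#include <string.h>

/* hypothetical control message, sent to the player as a plain struct */
struct player_msg {
	int cmd;		/* play, pause, next, ... */
	int arg;
	char path[256];
};

int send_cmd(int cmd, int arg, const char *path) {
	struct player_msg m = { .cmd = cmd, .arg = arg };
	mqd_t q = mq_open("/player", O_WRONLY);
	int res;

	if (q == (mqd_t)-1)
		return -1;
	if (path)
		strncpy(m.path, path, sizeof(m.path) - 1);
	res = mq_send(q, (const char *)&m, sizeof(m), 0);
	mq_close(q);
	return res;
}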
At the moment i'm just running it on my 'workstation' so I don't
know how much work - if any - will be required to get it going on
the mele, or how it will handle the processes and so on. For
example, whether the disk indexer will overload the cpu or i/o.
Anyway here is a high-level diagram of the processes and various
bits that make it up. Sorry it's a bit shit, but well, openoffice
draw is a fucking headache to work with.
For it to be feature complete there isn't much left. Some sort of
playlist mechanism and/or other ways to navigate the files beyond
disk+file-path order. Shuffle should be easy. Network streams
would be nice.
With no user interface beyond 2 leds (and given they're in the
same hole that just means 2 colours) it can't get too complicated
without resorting to an additional service for external
manipulation. But that is of course possible.
The things I want to look at are playlists and other ways to
navigate the files (other than file-path order), and network
streams.
Bored as Fuck
Haven't left the house for a full week at this point. I was sick
for a few days but now i'm just miserable for other reasons. This
stuff is keeping me well occupied but yeah it's kind of pointless.
Even my knees fucking hurt.
I'm wondering whether i should just go back to work early since
there's still a couple of weeks before I would otherwise. I don't
think I could face it though.
I'm also a bit pissed off that I think I drowned and killed a
chilli plant i've been growing for weeks just as it started to
kick into gear. We had a couple of hot days and i overwatered it
i think. Actually not much is growing very well this year and I
can't seem to water it properly either. Too much or too little.
Water is so expensive here.
Oh, I also wasted a couple of hours playing
with yacy. The idea of
decentralised internet services certainly appeals. But it was
just slow and provided almost no relevant results to any search I
tried - it just didn't work for me. Solr/Lucene just does a
weighted sub-string index which isn't very sophisticated. It's
also configured for some big server so running it on this little
vps was difficult even after fighting with the complex
configuration system.
I'm not eating much at least so maybe i can lose some of the
weight I keep putting on.
Media Queries
I made a small change to the HTML and CSS to try to render a bit
better on phones. Even though the text is all resizeable and
reflowable, phones rendered it at some massive resolution and then
scaled it down - making the text unreadable and non-reflowable. Sigh.
Anyway, first I added:
<meta name='viewport' content='width=device-width, initial-scale=1'>
But this made the text too big and interfered with the table-like
layout. I tried using a width of 800 which sort of worked but
wasn't very readable either.
So I added a media query and adjusted some of the main sections
and the borders and so on. It's just quick and dirty but it
works better than it did. I've only tested on one phone so others
may not have changed.
/* bloody phones */
@media (max-width: 480px) {
    div#site-menu, .tag-menu {
        float: none;
        width: 100%;
    }
    .post-footer, .post, .post-header {
        margin-right: 1em;
    }
}
On the other hand i'm lucky to get one visitor per week (lots of
crawlers, and plenty of hacking attempts) so who really cares eh?