https, TLS upgrade
Ahah, so it seems things have changed a bit since last I looked
into certificates and certificate authorities - and even then I
was looking into code and email signing certs anyway.
After a short poke around I quickly became aware of the
Let's Encrypt project which
provides automated and free server domain certificates. It can be
automated because you control the server and part of the issuing
process creates temporary server resources that the signer can
cross-check. And all the certs are created locally.
So after a bit of fudging around with
the C-based acme
client and some apache config I got it all turned on and
(compatible) browsers automagically redirecting to the TLS-protected URL.
Yay.
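For reference, the apache side of the challenge is little more than exposing the challenge directory over plain http - something along these lines (this matches the -C directory used below, but it's a sketch rather than the exact config):
# illustrative only: serve the acme-client challenge directory at the well-known path
Alias /.well-known/acme-challenge/ /var/zedzone/acme/
<Directory "/var/zedzone/acme">
	Require all granted
</Directory>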
I didn't want to go with the official Certbot because Python isn't
otherwise installed on this server and I didn't want to drag all
that snot in for no other reason.
Because the acme-client is a little out of date I had to pass it a
few extra parameters to make it create certificates (and had to make
some small porting-related changes to build it against libressl
rather than openssl).
acme-client \
-ahttps://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf \
-C/var/zedzone/acme \
-vNn \
zedzone.au www.zedzone.au code.zedzone.au
Once created, a daily cron job runs it (without the -vNn options),
which requests new certificates if the old ones are within a month
of their expiry date (since the Let's Encrypt certificates only
last for 90 days).
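The crontab entry is along these lines (the time and the apache reload at the end are assumptions, not the exact entry):
# illustrative daily renewal entry
30 3 * * *	acme-client -ahttps://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf -C/var/zedzone/acme zedzone.au www.zedzone.au code.zedzone.au && apachectl graceful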
I then added a https server config:
<VirtualHost www.zedzone.au:443>
	ServerName www.zedzone.au
	...
	SSLEngine on
	SSLCertificateFile /etc/ssl/acme/cert.pem
	SSLCertificateKeyFile /etc/ssl/acme/private/privkey.pem
	SSLCertificateChainFile /etc/ssl/acme/fullchain.pem
	SSLUseStapling on

	Header always set Strict-Transport-Security "max-age=31536000"
	Header always set Content-Security-Policy upgrade-insecure-requests
</VirtualHost>
And finally I added another header to the main (port 80) server which
tells compatible clients to upgrade to https. This can be a bit odd
on the first access but thereafter it does the right thing. I
hope!
<VirtualHost www.zedzone.au:80>
	ServerName www.zedzone.au
	...
	Header always set Content-Security-Policy upgrade-insecure-requests
</VirtualHost>
I didn't want to use a rewrite rule because at the moment I want
to keep both URLs active, but I might change that in the future.
It seems like it might be useful - on the other hand any client
anyone is likely to use will support TLS, won't it?
I've left code.zedzone.au unencrypted for now (even
though it's currently the only part of the site that can be logged
into!) because I need to check things work with virtual
servers on https first and more importantly I'm too hungover to
care this fine yet overcast afternoon!
Update: For what it's worth, the server gets an A+ rating
on the ssllabs SSL Server Test at the time of posting, although
getting the score above B required a few mod_ssl config changes.
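For reference, the sort of directives involved looks something like this (illustrative only, not the exact changes I made):
# drop old protocol versions and weak ciphers; SSLUseStapling also wants a global cache
SSLProtocol		all -SSLv3 -TLSv1 -TLSv1.1
SSLHonorCipherOrder	on
SSLCipherSuite		HIGH:!aNULL:!MD5
SSLStaplingCache	shmcb:/run/ocsp(128000)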
Rabbit Holes All The Way Down
I kept poking around the blog code over the last couple of days.
It just keeps leading to more and more questions.
DBD
Tuesday I mostly spent re-brushing up on the C api for Berkeley DB
and designing the schema to implement my version database using
it. At some point since I last looked foreign key constraints
must have been added so I implemented that - unfortunately unlike
JE they don't support self-referential keys (where a field
references the primary key of the same object) so I will have to
code up a couple of cases for that manually. Actually I'm not
sure I even need fully indexed key constraints as the database is
designed never to have deletions. If I ever get that far I'll do
some benchmarking to evaluate the tradeoffs, or decide how to do
deletions.
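As a rough sketch of the sort of thing I mean - using hypothetical 'branch' and 'version' databases rather than my actual schema - a foreign key constraint in the C API is set up something like this (the self-referential case mentioned above can't be expressed this way, which is the part that needs manual handling):
/* Sketch only: hypothetical branch/version databases, not the real schema. */
#include <string.h>
#include <db.h>

struct version {
	unsigned int id;
	unsigned int branch;	/* references a key in the branch database */
	/* ... */
};

/* Secondary key extractor: index version records by their branch id. */
static int version_branch_key(DB *sdb, const DBT *pkey, const DBT *pdata, DBT *skey) {
	const struct version *v = pdata->data;

	memset(skey, 0, sizeof(*skey));
	skey->data = (void *)&v->branch;
	skey->size = sizeof(v->branch);
	return 0;
}

static int open_version_dbs(DB_ENV *env, DB **branchp, DB **versionp, DB **bybranchp) {
	DB *branch, *version, *bybranch;
	int rc;

	/* Primary databases. */
	if ((rc = db_create(&branch, env, 0)) != 0
	    || (rc = branch->open(branch, NULL, "branch.db", NULL, DB_BTREE, DB_CREATE, 0644)) != 0)
		return rc;
	if ((rc = db_create(&version, env, 0)) != 0
	    || (rc = version->open(version, NULL, "version.db", NULL, DB_BTREE, DB_CREATE, 0644)) != 0)
		return rc;

	/* Secondary index on version.branch, duplicates allowed. */
	if ((rc = db_create(&bybranch, env, 0)) != 0
	    || (rc = bybranch->set_flags(bybranch, DB_DUPSORT)) != 0
	    || (rc = bybranch->open(bybranch, NULL, "version-branch.db", NULL, DB_BTREE, DB_CREATE, 0644)) != 0)
		return rc;
	if ((rc = version->associate(version, NULL, bybranch, version_branch_key, DB_CREATE)) != 0)
		return rc;

	/* Foreign key constraint: every version.branch must exist as a key in
	   branch, and deleting a referenced branch record is rejected. */
	if ((rc = branch->associate_foreign(branch, bybranch, NULL, DB_FOREIGN_ABORT)) != 0)
		return rc;

	*branchp = branch;
	*versionp = version;
	*bybranchp = bybranch;
	return 0;
}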
During the journey I also discovered that at some point Berkeley
DB JE changed licenses again - it had been AGPL3 last time I
looked. Now it's changed to Apache. I wonder if this is another
project soon to be abandoned to the ASF? Anyway it doesn't make
much difference to my Free Software projects (not that I ever got
far enough to publish any) but it'll be handy for work as I've
wanted to use it plenty of times. It's about the only decent NoSQL
DB left these days.
Uploading JavaScript
I pretty much detest JavaScript but I wanted to look at how to
write some sort of web-based editor for writing posts and I don't
really feel like writing yet another MIME parser to handle
multipart/form-data. Well I probably will have to eventually (or
likely dig one of the few I've already written back up) but in the
meantime I investigated direct uploads using XJAX.
Most results from searching turn up JQuery snot but I eventually
found some raw JavaScript using XMLHttpRequest directly. Given
it's only a few lines of code one has to wonder about these
'frameworks'. I digress. I played around a bit, extended my
FastCGI library to support streaming stdin and wrote a basic
REST-like `uploader' that can handle binary blobs directly without
any messy protocol parsing. Yay. And then I fell down another
hole ... how the fuck am I going to do security?
I don't really want to buy an SSL cert for this site but using a
self-signed certificate isn't really any good. Without that
pretty much any auth system is wildly insecure. I started looking
into JavaScript libraries for crypto - some are a little over the
top but there are a few smaller ones that might serve the purpose.
Crypto has a lot of gotchas and one can't be an expert in
everything, so I'm not sure I want to start down what would be a
very long and winding road just to post to a website.
So I'm toying with a few ideas. First, just do nothing and stick to
ssh and emacs for posting. If I ever bother with comments or
feedback they can be anonymous and not require auth. Or instead
of using JavaScript, write a standalone Java editor / operator
console that calls REST services. Or even use an ssh-driven
backend. This has some appeal personally but I'll see. Another is
to use SSL + Digest Auth - that way I let the browsers and server
handle all the complexity and get a mostly ok system. If I
install my own CA on my local browser(s) and enforce client
certificates from the server side, it should be reasonably secure.
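For reference, the client-certificate part of that on the server side is only a couple of mod_ssl directives, something like this (the CA path is made up):
SSLVerifyClient require
SSLVerifyDepth 1
SSLCACertificateFile /etc/ssl/zedzone-ca/ca.pem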
Damn windy road already.
I need a real rest
My sleep has been particularly bad of late. The sleep apnoea is
quite bad and I regularly (mostly) forget to wear the mouth splint
which
doesn't-treat-it-particularly-well-but-it's-better-than-nothing.
At least I remembered last night.
Today I gotta try and do some hours for work though. At the moment
I'm trying to decipher some statistical software written in
matlab, which is about my most favouritist thing in the whole
world. Fuck matlab.
Oh, I also bought some mice. I've got a couple of small 'travel
mouse' mice that I much prefer to the standard fare and although
they used to be easy to find they've become quite scarce around
here. Whatever happened to BenQ anyway? All the local retailers
only have microsoft or logitech or their own badged chinese crap
now. Cordless also seem to have taken over (higher margins one
suspects). I looked everywhere locally and on the usual suspects
online but couldn't find anything decent. Oddly enough
the ThinkPad
one I already have was one of the cheapest, and from the source, so I
ordered a couple to tide me over for the foreseeable future. On a
whim I also added
a wireless
'laser' one, although it's marginally larger.
FastCGI Enabled
Further to the previous post, I did end up porting my blog driver to
my fastcgi implementation.
Benchmarking using `ab' from home it doesn't really make any
difference reading the front page of the blog - if anything it's
actually marginally slower.
Running the benchmark locally though things are quite different.
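For reference, the numbers below came from something like `ab -n 1000 -c 1 http://localhost/blog/' - the exact URL being whatever the local test copy of the front page was.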
Previous standard cgi:
Concurrency Level: 1
Time taken for tests: 13.651 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 37100000 bytes
HTML transferred: 36946000 bytes
Requests per second: 73.25 [#/sec] (mean)
Time per request: 13.651 [ms] (mean)
Time per request: 13.651 [ms] (mean, across all concurrent requests)
Transfer rate: 2654.05 [Kbytes/sec] received
Using fcgi:
Concurrency Level: 1
Time taken for tests: 0.706 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 37062000 bytes
HTML transferred: 36908000 bytes
Requests per second: 1416.39 [#/sec] (mean)
Time per request: 0.706 [ms] (mean)
Time per request: 0.706 [ms] (mean, across all concurrent requests)
Transfer rate: 51264.00 [Kbytes/sec] received
So yeah, only 20x faster. If I up the concurrency level of the
benchmark it gets better but it's hard to tell much from it since
everything is running on the same machine.
Regardless, I made it live.
FastCGI experiments
It's not particularly important - I'm lucky to get more than
one non-bot hit in a given day - but I thought I'd have a look
into FastCGI. If in the future I do use a database backend or
even a Java one it should be an easy way to get some performance
while leveraging the simplicity of CGI and leaving the protocol
stuff to apache.
After a bit of background reading and looking into some 'simple'
implementations I decided to just roll my own. The 'official'
fastcgi.com site is no longer live so I didn't think it worth
playing with the official sdk. The way it handled stdio just
seemed a little odd as well.
With the use of a few GNU libc extensions for stdio (cookie
streams) and memory (obstacks) I put together enough of a partial
(but robust) implementation to serve output-only pages from the
fcgid module in a few hundred lines of code.
This is the public api for it.
struct fcgi_param {
	char *name;
	char *value;
};

struct fcgi {
	// Active during cgi request
	FILE *stdout;
	FILE *stderr;

	// Current request info
	unsigned char rid1, rid0;
	unsigned char flags;
	unsigned char role;

	// Current request params (environment)
	size_t param_length;
	size_t param_size;
	struct fcgi_param *param;
	struct obstack param_stack;

	// Internal buffer stuff
	int fd;
	size_t pos;
	size_t limit;
	size_t buffer_size;
	unsigned char *buffer;
};
typedef int (*fcgi_callback_t)(struct fcgi *, void *);
struct fcgi *fcgi_alloc(void);
void fcgi_free(struct fcgi *cgi);
int fcgi_accept_all(struct fcgi *cgi, fcgi_callback_t cb, void *data);
char *fcgi_getenv(struct fcgi *cgi, const char *name);
I didn't bother to implement concurrent requests, the various
access control roles, or STDIN messages. The first doesn't appear
to be used by mod_fcgid (it handles concurrency itself) and I don't
need the rest (yet at least). As previously stated I used GNU
libc extensions to implement custom stdio streams for stdout and
stderr, although I used a custom 'zero-copy' buffer implementation
for the protocol handling (wherein the calls can access the
internal buffer address rather than having to copy data around).
Converting a CGI program is a little more involved than using the
original SDK because it doesn't hide the i/o behind macros or use
global variables to pass information. Instead via a
context-specific handle it provides stdio compatible FILE handles
and a separate environmental variable lookup function. Of course
it is possible to write a handler callback which can implement
such a solution.
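For instance, where a plain CGI program would call getenv() and printf(), a converted handler looks something like this (a trivial illustration using the API above, not code from the library):
static int handler(struct fcgi *cgi, void *data) {
	/* was: getenv("QUERY_STRING") and printf() in the CGI version */
	char *query = fcgi_getenv(cgi, "QUERY_STRING");

	fprintf(cgi->stdout, "Content-Type: text/plain\n\n");
	fprintf(cgi->stdout, "query='%s'\n", query ? query : "");
	return 0;
}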
The main function of the FastCGI program just allocates the
context, calls accept_all and then free. The callback is invoked
for each request and can access stdout/stderr from the context
using stdio calls as it wishes.
Apache config
Here's the basic apache config snippet I used to hook it into
`/blog' on a server (I did this locally rather than live on this
site though).
ScriptAlias /blog /path/fcgi-test.fcgi
FcgidCmdOptions /path/fcgi-test.fcgi MaxProcesses 1
<Directory "/path">
AllowOverride None
Options +ExecCGI
Require all granted
</Directory>
Custom streams and cookies
Using a GNU extension it is trivial to hook up custom stdio
streams - one gets all the benefits of libc's buffering and
formatting and one only has to write a couple of simple callbacks.
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>
#include <stdio.h>
#include <unistd.h>
#include "fcgi.h"	/* struct fcgi plus the FCGI_* protocol records - header name assumed */

static ssize_t fcgi_write(void *f, const char *buf, size_t size, int type) {
	struct fcgi *cgi = f;
	size_t sent = 0;
	FCGI_Header header = {
		.version = FCGI_VERSION_1,
		.type = type,
		.requestIdB1 = cgi->rid1,
		.requestIdB0 = cgi->rid0
	};

	while (sent < size) {
		size_t left = size - sent;
		ssize_t res;
		struct iovec iov[2];

		// A single FastCGI record can only carry 64K-1 bytes of content
		if (left > 65535)
			left = 65535;
		header.contentLengthB1 = left >> 8;
		header.contentLengthB0 = left & 0xff;

		iov[0].iov_base = &header;
		iov[0].iov_len = sizeof(header);
		iov[1].iov_base = (void *)(buf + sent);
		iov[1].iov_len = left;

		res = writev(cgi->fd, iov, 2);
		if (res < 0)
			return -1;
		sent += left;
	}
	return size;
}
/* A zero-length record of the given type marks the end of that stream. */
static int fcgi_close(void *f, int type) {
	struct fcgi *cgi = f;
	FCGI_Header header = {
		.version = FCGI_VERSION_1,
		.type = type,
		.requestIdB1 = cgi->rid1,
		.requestIdB0 = cgi->rid0
	};

	if (write(cgi->fd, &header, sizeof(header)) < 0)
		return -1;
	return 0;
}
Well perhaps the callbacks are more `straightforward' than simple
in this case. FastCGI has a payload limit of 64K so any larger
writes need to be broken up into parts. I use writev
to write the header and content directly from the library buffer
in a single system call (a pretty insignificant performance
improvement in this case but one nonetheless). I might need to
handle partial writes, but this works so far - and if that does
become necessary the writev approach probably gets too complicated
to bother with.
The actual 'cookie' callbacks just invoke the functions above with
the FCGI channel to write to.
static ssize_t fcgi_stdout_write(void *f, const char *buf, size_t size) {
	return fcgi_write(f, buf, size, FCGI_STDOUT);
}

static int fcgi_stdout_close(void *f) {
	return fcgi_close(f, FCGI_STDOUT);
}

static const cookie_io_functions_t fcgi_stdout = {
	.read = NULL,
	.write = fcgi_stdout_write,
	.seek = NULL,
	.close = fcgi_stdout_close
};
And opening a custom stream is as simple as opening a regular file.
static int fcgi_begin(struct fcgi *cgi) {
	cgi->stdout = fopencookie(cgi, "w", fcgi_stdout);
	...;
	return 0;
}
Example
Here's a basic example that just dumps all the parameters to the
client. It also maintains a count to demonstrate that it's
persistent.
I went with a callback mechanism rather than the polling mechanism
of the original SDK mostly to simplify managing state. Shrug.
#include "fcgi.h"
static int cgi_func(struct fcgi *cgi, void *data) {
	static int count;

	fprintf(cgi->stdout, "Content-Type: text/plain\n\n");
	fprintf(cgi->stdout, "Request %d\n", count++);
	fprintf(cgi->stdout, "Parameters\n");
	for (size_t i = 0; i < cgi->param_length; i++)
		fprintf(cgi->stdout, " %s=%s\n", cgi->param[i].name, cgi->param[i].value);
	return 0;
}

int main(int argc, char **argv) {
	struct fcgi *cgi = fcgi_alloc();

	fcgi_accept_all(cgi, cgi_func, NULL);
	fcgi_free(cgi);
	return 0;
}
Notes
I haven't worked out how to get the CGI script to 'exit' when the
MaxRequestsPerProcess limit has been reached without causing
service pauses. Whether I do nothing or whether I exit and close
the socket at the right time it still pauses the next request for
1-4 seconds.
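(For reference, the limit I mean is configured through mod_fcgid, e.g. the FcgidMaxRequestsPerProcess directive - `FcgidMaxRequestsPerProcess 1000', where the number is just an example.)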
I haven't converted my blog driver to use it yet - maybe later on
tonight if I keep poking at it.
Oh and it is quite fast, even with a trivial C program.
Versioning DB
Well I don't have any code ready yet but between falling asleep
today I did a little more work on a versioned data-store I've been
working on ... for years, a couple of decades in fact.
In its current iteration it utilises just 3 core (simple)
relational(-like) tables and can support both SVN style branches
(lightweight renames) and CVS style branches (even lighter
weight). Like SVN it uses global revisions and transactions, but
like CVS it works on a version tree rather than a path tree - though
both approaches are possible within the same small library.
Together with dez it allows for
compact reverse delta storage.
Originally I started in C but I've been working in Java since - and
I'm looking at back-porting the latest design to C again for some
performance comparisons. It's always used Berkeley DB (JE for
Java) as storage although I did experiment with using a SQL
version in the past.
My renewed interest comes from the goal of eventually running this
site with it as backing storage - for code, documentation, musings.
e.g. the ability to branch a document tree for versioning and yet
have it served live from common storage. This was essentially the
reason I started investigating the project many years ago but
never quite got there. I'm pretty sure I've got a solid schema
but still need to solidify the API around a few basic use-cases
before I move forward.
The last time I touched the code was 2 years ago, and the last time
I did any significant code on it was 3 years ago, so it's
definitely a slow burner project!
Well, more when I have more to say.
FFmpeg 4.0
Just a short post about the latest FFmpeg release. I tried
building jjmpeg 3.0.1 against FFmpeg 4.0 and it compiles cleanly
with no warnings.
So I think it should be good enough to go ... but I realised I
don't actually have anything handy already written to test it
against right now so that's only a guess.
Once I do I'll bump the version and do another release. This is
more or less what I had planned to do today, but I got tags
working on this site instead.
Tags & Styles
Worked a little more on the site.
The big one is that I've added tags back to the pages. Once you
start viewing a tag-based index it sticks to that tag for navigation
until it's cleared or another is chosen. It works in pretty much the
way you'd expect it to.
I've also linked in a stylesheet and started filling it out - but
this is very rudimentary for now and not much more than enough to
make things operate properly.
Powered by gcc and me!
Just for a little background, the blog itself is currently driven
by a small stand-alone C program which is executed as a cgi
script. The parameter processing is quite strict and just fails
with a 4xx series error if anything (external) isn't right. At
the moment it doesn't use any sort of database as such - the post
text is simply a file on disk which is interleaved within
code-generated text. A script generates several indexes in the
form of C code from the filenames and a metadata properties file
which is then compiled into the binary.
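Purely as an illustration of the idea (this is not the actual generated code, just the general shape of it), the generated index is along these lines:
/* Hypothetical shape of the generated index - not the real thing. */
struct post {
	const char *id;		/* hex id used in urls */
	const char *title;
	const char *path;	/* post text file on disk */
	const char *tags;	/* comma separated */
};

static const struct post post_index[] = {
	{ "1a2b3c4d", "Tags & Styles", "posts/1a2b3c4d.txt", "zedzone,www" },
	{ "5e6f7a8b", "FastCGI Enabled", "posts/5e6f7a8b.txt", "zedzone,code" },
	/* ... one entry per post, regenerated by the build script ... */
};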
For now, to create a new post I have a small program which creates
a simple post template and launches emacs directly on the server.
If the file has been edited when emacs exits it then gets moved into
the post directory and a metadata file is created. I then have to
run make install on the binary to update the indices for the new
file and any tags. This can eventually be replaced with a
web-based editor with image uploads and so on if I ever get around
to it.
It's just meant to be a 'quick and dirty' to get it up and
running, but I somewhat like its simplicity. I can't see it
being of any particular interest or use to anyone but I will
eventually publish it as Free Software at some later date.
Ahh stuff it
Got sick of all the snot in the logs so I've just moved ssh to
another port and now DROP all incoming packets on the old ssh port.
Well I'm doing a LOG + DROP for now just out of curiosity, but at
least the failed login attempts have stopped cold.
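For the record the DROP part is just a couple of netfilter rules, something like this (assuming iptables and port 22 for the old port - the real rules aren't much more exciting):
iptables -A INPUT -p tcp --dport 22 -j LOG --log-prefix "ssh-drop: "
iptables -A INPUT -p tcp --dport 22 -j DROP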
I also put up a banner on a-hackers-craic redirecting here. This
site still supports access via the year/month/title.html URLs
that match the ones on blogger (in addition to the hex-id ones); I
was going to try to write some javascript to link or direct each
post to the new one but it just seems like too much work today.