FastCGI Enabled
Further to the previous post, I did end up porting my blog driver to
my FastCGI implementation.
Benchmarking with `ab' from home it doesn't really make any
difference when reading the front page of the blog - if anything it's
actually marginally slower.
Running the benchmark locally, though, things are quite different.
Previous standard cgi:
Concurrency Level: 1
Time taken for tests: 13.651 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 37100000 bytes
HTML transferred: 36946000 bytes
Requests per second: 73.25 [#/sec] (mean)
Time per request: 13.651 [ms] (mean)
Time per request: 13.651 [ms] (mean, across all concurrent requests)
Transfer rate: 2654.05 [Kbytes/sec] received
Using fcgi:
Concurrency Level: 1
Time taken for tests: 0.706 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 37062000 bytes
HTML transferred: 36908000 bytes
Requests per second: 1416.39 [#/sec] (mean)
Time per request: 0.706 [ms] (mean)
Time per request: 0.706 [ms] (mean, across all concurrent requests)
Transfer rate: 51264.00 [Kbytes/sec] received
So yeah, only 20x faster. If I up the concurrency level of the
benchmark it gets better but it's hard to tell much from it since
everything is running on the same machine.
Regardless, I made it live.
FastCGI experiments
It's not particularly important - I'm lucky to get more than
one non-bot hit in a given day - but I thought I'd have a look
into FastCGI. If in the future I do use a database backend or
even a Java one it should be an easy way to get some performance
while leveraging the simplicity of CGI and leaving the protocol
stuff to Apache.
After a bit of background reading and looking into some 'simple'
implementations I decided to just roll my own. The 'official'
fastcgi.com site is no longer live so I didn't think it worth
playing with the official SDK. The way it handled stdio just
seemed a little odd as well.
With the use of a few GNU libc extensions for stdio (cookie
streams) and memory (obstacks) I put together enough of a partial
(but robust) implementation to serve output-only pages from the
fcgid module in a few hundred lines of code.
This is the public API for it.
struct fcgi_param {
    char *name;
    char *value;
};

struct fcgi {
    // Active during cgi request
    FILE *stdout;
    FILE *stderr;

    // Current request info
    unsigned char rid1, rid0;
    unsigned char flags;
    unsigned char role;

    // Current request params (environment)
    size_t param_length;
    size_t param_size;
    struct fcgi_param *param;
    struct obstack param_stack;

    // Internal buffer stuff
    int fd;
    size_t pos;
    size_t limit;
    size_t buffer_size;
    unsigned char *buffer;
};
typedef int (*fcgi_callback_t)(struct fcgi *, void *);
struct fcgi *fcgi_alloc(void);
void fcgi_free(struct fcgi *cgi);
int fcgi_accept_all(struct fcgi *cgi, fcgi_callback_t cb, void *data);
char *fcgi_getenv(struct fcgi *cgi, const char *name);
I didn't bother to implement concurrent requests, the various
access control roles, or STDIN messages. The first doesn't appear
to be used by mod_fcgid (it handles concurrency itself) and I don't
need the rest (yet at least). As previously stated I used GNU
libc extensions to implement custom stdio streams for stdout and
stderr, although I used a custom 'zero-copy' buffer implementation
for the protocol handling (wherein the calls can access the
internal buffer address rather than having to copy data around).
Converting a CGI program is a little more involved than using the
original SDK because it doesn't hide the I/O behind macros or use
global variables to pass information. Instead, via a
context-specific handle, it provides stdio-compatible FILE handles
and a separate environment variable lookup function. Of course
it is possible to write a handler callback which implements such
a compatibility layer.
The main function of a FastCGI program just allocates the
context, calls fcgi_accept_all(), and then frees it. The callback
is invoked for each request and can access stdout/stderr from the
context using stdio calls as it wishes.
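As a rough illustration of that compatibility-callback idea - a sketch only, not part of the library, and legacy_cgi_main(), cgi_out and cgi_getenv() are made-up names standing in for an existing CGI program's entry points - it could look something like this:
#include <stdio.h>

#include "fcgi.h"

/* Hypothetical globals the unchanged CGI code prints to and queries,
   instead of the real stdout and getenv(). */
static struct fcgi *current;
FILE *cgi_out;

char *cgi_getenv(const char *name) {
    return fcgi_getenv(current, name);
}

/* The existing CGI logic, assumed to write to cgi_out. */
extern int legacy_cgi_main(void);

static int compat_callback(struct fcgi *cgi, void *data) {
    current = cgi;
    cgi_out = cgi->stdout;
    return legacy_cgi_main();
}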
Apache config
Here's the basic Apache config snippet I used to hook it into
`/blog' on a server (I did this locally rather than live on this
site though).
ScriptAlias /blog /path/fcgi-test.fcgi
FcgidCmdOptions /path/fcgi-test MaxProcesses 1

<Directory "/path">
    AllowOverride None
    Options +ExecCGI
    Require all granted
</Directory>
Custom streams and cookies
Using a GNU extension it is trivial to hook up custom stdio
streams - one gets all the benefits of libc's buffering and
formatting and one only has to write a couple of simple callbacks.
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>
#include <stdio.h>
#include <unistd.h>

#include "fcgi.h"   /* assumed to provide struct fcgi and the FCGI_Header/FCGI_* protocol definitions */
static ssize_t fcgi_write(void *f, const char *buf, size_t size, int type) {
    struct fcgi *cgi = f;
    size_t sent = 0;
    FCGI_Header header = {
        .version = FCGI_VERSION_1,
        .type = type,
        .requestIdB1 = cgi->rid1,
        .requestIdB0 = cgi->rid0
    };

    /* A FastCGI record carries at most 65535 bytes of content,
       so larger writes are split into multiple records. */
    while (sent < size) {
        size_t left = size - sent;
        ssize_t res;
        struct iovec iov[2];

        if (left > 65535)
            left = 65535;

        header.contentLengthB1 = left >> 8;
        header.contentLengthB0 = left & 0xff;

        iov[0].iov_base = &header;
        iov[0].iov_len = sizeof(header);
        iov[1].iov_base = (void *)(buf + sent);
        iov[1].iov_len = left;

        res = writev(cgi->fd, iov, 2);
        if (res < 0)
            return -1;

        sent += left;
    }
    return size;
}

static int fcgi_close(void *f, int type) {
    struct fcgi *cgi = f;
    FCGI_Header header = {
        .version = FCGI_VERSION_1,
        .type = type,
        .requestIdB1 = cgi->rid1,
        .requestIdB0 = cgi->rid0
    };

    /* A zero-length record terminates the stream. */
    if (write(cgi->fd, &header, sizeof(header)) < 0)
        return -1;
    return 0;
}
Well, perhaps the callbacks are more `straightforward' than simple
in this case. FastCGI has a payload limit of 64K so any larger
writes need to be broken up into parts. I use writev to write the
header and content directly from the library buffer in a single
system call (a pretty insignificant performance improvement in
this case, but one nonetheless). I might eventually need to handle
partial writes, but this works so far - and if it ever does become
necessary, the writev approach probably gets too complicated to
bother with.
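If partial writes ever do need handling, one simpler (if slower) fallback would be to drop writev and loop over plain write() calls until each piece is fully sent - just a sketch of the idea, not something the implementation currently does:
/* Sketch only: write exactly len bytes, retrying on short writes. */
static int write_fully(int fd, const void *buf, size_t len) {
    const char *p = buf;

    while (len > 0) {
        ssize_t res = write(fd, p, len);
        if (res < 0)
            return -1;
        p += res;
        len -= (size_t)res;
    }
    return 0;
}
fcgi_write() could then call it once for the header and once for each content chunk, at the cost of the extra system call the writev version avoids.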
The actual 'cookie' callbacks just invoke the functions above with
the FCGI channel to write to.
static ssize_t fcgi_stdout_write(void *f, const char *buf, size_t size) {
    return fcgi_write(f, buf, size, FCGI_STDOUT);
}

static int fcgi_stdout_close(void *f) {
    return fcgi_close(f, FCGI_STDOUT);
}

static const cookie_io_functions_t fcgi_stdout = {
    .read = NULL,
    .write = fcgi_stdout_write,
    .seek = NULL,
    .close = fcgi_stdout_close
};
And opening a custom stream is as simple as opening a regular file.
static int fcgi_begin(struct fcgi *cgi) {
    cgi->stdout = fopencookie(cgi, "w", fcgi_stdout);
    ...;
    return 0;
}
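The matching teardown isn't shown here; presumably closing the cookie stream is all that's needed, since fclose() flushes any buffered output and then invokes the close callback, which emits the zero-length record ending the stream. Something along these lines (fcgi_end is a hypothetical name, not the library's):
static int fcgi_end(struct fcgi *cgi) {
    /* fclose() flushes buffered output and calls fcgi_stdout_close(),
       which writes the terminating zero-length FCGI_STDOUT record. */
    if (cgi->stdout) {
        fclose(cgi->stdout);
        cgi->stdout = NULL;
    }
    /* ... stderr and the FCGI_END_REQUEST record would follow ... */
    return 0;
}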
Example
Here's a basic example that just dumps all the parameters to the
client. It also maintains a count to demonstrate that it's
persistent.
I went with a callback mechanism rather than the polling mechanism
of the original SDK mostly to simplify managing state. Shrug.
#include "fcgi.h"
static int cgi_func(struct fcgi *cgi, void *data) {
static int count;
fprintf(cgi->stdout, "Content-Type: text/plain\n\n");
fprintf(cgi->stdout, "Request %d\n", count++);
fprintf(cgi->stdout, "Parameters\n");
for (int i=0;i<cgi->param_length;i++)
fprintf(cgi->stdout, " %s=%s\n", cgi->param[i].name, cgi->param[i].value);
return 0;
}
int main(int argc, char **argv) {
struct fcgi * cgi = fcgi_alloc();
fcgi_accept_all(cgi, cgi_func, NULL);
fcgi_free(cgi);
}
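Anything more interesting than dumping parameters would use fcgi_getenv() for specific lookups. A small sketch (not from the library's examples, and whether PATH_INFO is set depends on the server configuration):
static int route_func(struct fcgi *cgi, void *data) {
    char *path = fcgi_getenv(cgi, "PATH_INFO");

    fprintf(cgi->stdout, "Content-Type: text/plain\n\n");
    fprintf(cgi->stdout, "You asked for: %s\n", path ? path : "(nothing)");
    return 0;
}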
Notes
I haven't worked out how to get the CGI script to 'exit' when the
MaxRequestsPerProcess limit has been reached without causing
service pauses. Whether I do nothing or whether I exit and close
the socket at the right time it still pauses the next request for
1-4 seconds.
I haven't converted my blog driver to use it yet - maybe later on
tonight if I keep poking at it.
Oh and it is quite fast, even with a trivial C program.
Versioning DB
Well I don't have any code ready yet but between falling asleep
today I did a little more work on a versioned data-store I've been
working on ... for years, a couple of decades in fact.
In its current iteration it utilises just 3 core (simple)
relational(-like) tables and can support both SVN-style branches
(lightweight renames) and CVS-style branches (even lighter
weight). Like SVN it uses global revisions and transactions, but
like CVS it works on a version tree rather than a path tree; both
approaches are possible within the same small library.
Together with dez it allows for
compact reverse delta storage.
Originally I started in C but I've been working in Java since -
although I'm looking at back-porting the latest design to C again
for some performance comparisons. It has always used Berkeley DB
(JE for Java) as storage, although I did experiment with a SQL
version in the past.
My renewed interest comes from the goal of eventually running this
site with it as backing storage - for code, documentation, musings;
e.g. the ability to branch a document tree for versioning and yet
have it served live from common storage. This was essentially the
reason I started investigating the project many years ago but
never quite got there. I'm pretty sure I've got a solid schema
but I still need to solidify the API around a few basic use-cases
before I move forward.
The last time I touched the code was 2 years ago, and the last time
I did any significant work on it was 3 years ago, so it's
definitely a slow-burner project!
Well, more when I have more to say.
FFmpeg 4.0
Just a short post about the latest FFmpeg release. I tried
building jjmpeg 3.0.1 against FFmpeg 4.0 and it compiles cleanly
with no warnings.
So I think it should be good enough to go ... but I realised I
don't actually have anything handy already written to test it
against right now so that's only a guess.
Once I do I'll bump the version and do another release. This is
more or less what I had planned to do today, but I got tags
working on this site instead.
Tags & Styles
Worked a little more on the site.
The big one is that I've added tags back to the pages. Once you
start viewing a tag-based index it sticks to that navigation until
it's cleared or another tag is chosen. It works in pretty much the
way you'd expect it to.
I've also linked in a stylesheet and started filling it out - but
this is very rudimentary for now and not much more than enough to
make things operate properly.
Powered by gcc and me!
Just for a little background, the blog itself is currently driven
by a small stand-alone C program which is executed as a cgi
script. The parameter processing is quite strict and just fails
with a 4xx series error if anything (external) isn't right. At
the moment it doesn't use any sort of database as such - the post
text is simply a file on disk which is interleaved within
code-generated text. A script generates several indexes in the
form of C code from the filenames and a metadata properties file
which is then compiled into the binary.
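Just to give an idea of the shape of that generated code - purely an illustrative guess here, the real field names and entries aren't published - it amounts to little more than a compiled-in table along these lines:
/* Illustrative only - invented names and values, not the actual generated index. */
struct post_entry {
    const char *id;     /* derived from the filename */
    const char *title;
    const char *date;
    const char *tags;   /* from the metadata properties file */
};

static const struct post_entry post_index[] = {
    { "0x0021", "Tags & Styles", "2018-04-21", "zedzone,www" },
    { "0x0020", "FFmpeg 4.0", "2018-04-21", "jjmpeg" },
    /* ... one entry per post ... */
};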
For now, to create a new post I have a small program which creates
a simple post template and launches emacs directly on the server.
If the file has been edited when emacs exits, it then gets moved
into the post directory and a metadata file is created. I then
have to run make install on the binary to update the indices for the new
file and any tags. This can eventually be replaced with a
web-based editor with image uploads and so on if I ever get around
to it.
It's just meant to be a 'quick and dirty' solution to get it up and
running, but I somewhat like its simplicity. I can't see it
being of any particular interest or use to anyone but I will
eventually publish it as Free Software at some later date.
Ahh stuff it
Got sick of all the snot in the logs so I've just moved ssh to
another port and DROP all incoming ssh packets.
Well I'm doing a LOG + DROP for now just out of curiosity, but at
least the failed login attempts have stopped cold.
I also put up a banner on a-hackers-craic redirecting here. This
site still supports access via the year/month/title.html URLs
that match the ones on Blogger (in addition to the hex-id ones); I
was going to try to write some javascript to link or redirect each
post to the new one but it just seems like too much work today.
NotworkManager and other small things
Had a few problems with system updates lately. One was an upgrade
to my remaining Slackware system that broke a few things. First
it wanted to run LILO after updating the kernel and I said no (I
don't use it); not sure if that step would also have run the grub
setup, but in any case it wasn't run. Fortunately one of the kernels
in grub still existed and booted so it wasn't too hard to fix.
It also broke NetworkManager - or rather, it stopped working
again. It's been a flakey piece of shit forever but I thought it
was finally 'stable' enough to use (despite a few quirks on that
machine like it not automatically reconnecting after waking up).
Well not so!
It simply wouldn't connect anymore. No idea why. I went back to
using rc.inet1.conf and it now works flawlessly - even reconnects
after waking up. I'd already done this (or equivalent) on all my
other machines, and it seems to be with good reason.
Crackers
I knew the internet was pretty slimy these days but actually
setting up a server on the naked internet over the last weekend
was a bit of an eye-opener.
I noticed a massive spike in traffic on the 15th - given that the
only service running at the time was the 'experiment' page, 1GB
seemed a bit off. It was just someone brute-forcing sshd. Since
this server went live on the 26th of March it has
processed over 300 000 failed login attempts; I
imagine (but haven't verified) most of those were on the 15th.
They certainly weren't me.
It's probably just a drop in the ocean compared to all the `real'
traffic but it seems such a waste. Yay for bots.
So I've put a few mitigations in place over the last few days:
- iptables rules to throttle new connections to port 22;
- disabled root login through ssh entirely;
- added a small blacklist using ipset.
I don't really want to have to maintain the last one but I'll see
how it goes.
Anyway it's sort of interesting to see the logins being used -
root is obvious, but hottie, mother and david don't seem too
obvious.
Just for fun, here's the complete list of the usernames and
frequency counts as of a few minutes ago.
1 irc 1 sync 1 syslog 2
2 ! 2 12345678 2 1234qwer 2 123qwe
2 12qwaszx 2 1qazxsw2 2 654321 2 777777
2 aaron 2 abcd1234 2 admin@12 2 admintek
2 admUS 2 adriana 2 aion 2 alexis
2 amanda 2 amit 2 amy 2 andrea
2 angela 2 anthony 2 antiviru 2 ARGENTIN
2 arsenal 2 ashok 2 asshole 2 bananapi
2 bank 2 baseball 2 board 2 bobby
2 bonita 2 botmaste 2 byte 2 bytes
2 cameron 2 carditek 2 carmen 2 carolina
2 centos 2 chat 2 chelsea 2 chicken
2 chris 2 cinema 2 claudia 2 corazon
2 counters 2 crystal 2 cs 2 csgoserv
2 csserver 2 customs 2 cuteako 2 cvs
2 cyber 2 data 2 db1 2 db2inst1
2 december 2 deploy 2 destiny 2 docker
2 download 2 dragon 2 dvd 2 edu
2 educatio 2 elastics 2 family 2 fedora
2 flower 2 forum 2 freedom 2 ftpuser1
2 gabriel 2 games 2 gaming 2 gb
2 ghost 2 gmodserv 2 gnu 2 gnuworld
2 greenday 2 harley 2 hdsf 2 hiitplc
2 home 2 hottie 2 html 2 http
2 hunter 2 idc!@ 2 internet 2 ircd
2 isabel 2 jessica 2 jessie 2 jiamima
2 karen 2 kartel 2 keith 2 kernel
2 kitten 2 kmc 2 laura 2 lauren
2 libuuid 2 liferay 2 linaro 2 linux
2 linuxmin 2 liverpoo 2 logon 2 lovers
2 lpa 2 lucas 2 maganda 2 maggie
2 mail 2 mailman 2 maintain 2 manuel
2 marketin 2 matthew 2 mdb 2 miguel
2 muiehack 2 music 2 musicbot 2 mylove
2 myspace 2 nathan 2 Neuchate 2 Norwood
2 ns 2 ns2 2 nuucp 2 october
2 odroid 2 openssh- 2 openvpn 2 oper
2 oracle2 2 orlando 2 otrs 2 pass
2 passw0rd 2 passwd 2 pc 2 pepper
2 php 2 pictures 2 poohbear 2 portal
2 pretty 2 princess 2 proba 2 proftpd
2 project 2 p@ssw0rd 2 purple 2 q1w2e3r4
2 qazwsx 2 qwe123 2 qwerty 2 radio
2 rangers 2 rdp 2 redis 2 redmine
2 richard 2 root123 2 rootme 2 rsync
2 sakura 2 saw 2 scanner 2 security
2 servercs 2 serverpi 2 services 2 shell
2 sinus123 2 skan 2 skaner 2 snoopy
2 soccer 2 soft 2 software 2 steven
2 sweetie 2 sweety 2 tequiero 2 test123
2 test5 2 test6 2 testftp 2 tim
2 tomcat7 2 transfer 2 tsserver 2 ucpss
2 Untersee 2 upload 2 upport 2 uptime
2 user02 2 veronica 2 victor 2 video
2 virus 2 visitor 2 vnc 2 volumio
2 webconfi 2 webporta 2 webtest 2 Welcome1
2 wmware 2 x 2 xbmc 2 xuelp123
2 zhaowei 2 zxin10 4 50cent 4 666666
4 admin123 4 alan 4 alarm 4 alejandr
4 alpine 4 andy 4 antonio 4 babygirl
4 bamboo 4 bin 4 blankend 4 build
4 carlos 4 control 4 csgo 4 daemon
4 daniela 4 dante 4 database 4 debian-s
4 dev 4 edi 4 fabricio 4 fabrizio
4 forever 4 gian 4 giorgio 4 giovanni
4 hannah 4 hello 4 iloveyou 4 jira
4 justin 4 leonardo 4 marco 4 mine
4 minecraf 4 naruto 4 nas 4 nginx
4 odoo 4 odoo2 4 oracle4 4 packer
4 patricia 4 patrizio 4 paul 4 plex
4 qwer1234 4 rebecca 4 roberto 4 rocco
4 sergio 4 shadow 4 shorty 4 shoutcas
4 staff 4 sysop 4 t7adm 4 test4
4 tsbot 4 vincenzi 4 vitaly 4 web
4 welcome 6 2Wire 6 admin2 6 amber
6 bot 6 camera 6 develope 6 dummy
6 Guest 6 hduser 6 jason 6 max
6 mobile 6 mythtv 6 netman 6 proxy
6 !root 6 Root 6 samba 6 server
6 sinus 6 temp 6 teste 6 training
6 ts3bot 6 ts3sleep 6 ts3user 6 vagrant
6 vps 6 zimeip 7 sys 8 albert
8 alessio 8 alex 8 anna 8 aurora
8 bianca 8 elena 8 enrica 8 ethos
8 hadoop 8 informix 8 lorenco 8 lorenzo
8 lucaluca 8 luigi 8 luka 8 marcel
8 marcello 8 maria 8 marta 8 massimo
8 mattia 8 olivia 8 oracle1 8 pia
8 piero 8 pippo 8 romeo 8 sinusbot
8 suporte 8 t7inst 8 test7 8 testing
8 tommaso 8 ts 8 user3 8 valerio
10 0101 10 admins 10 cpanel 10 danny
10 dbuser 10 gnats 10 john 10 lavander
10 michael 10 miner 10 office 10 oracle3
10 postmast 10 prueba 10 test1 10 test8
10 tplink 10 user2 10 vmuser 12 101
12 123321 12 1502 12 266344 12 3comcso
12 aaa 12 acc 12 adam 12 adfexc
12 Admin 12 ADMN 12 agent 12 alessand
12 am 12 api 12 avahi 12 bill
12 bob 12 Cisco 12 draytek 12 echo
12 engineer 12 enrique 12 fax 12 gopher
12 helpdesk 12 houx 12 installe 12 kodi
12 luca 12 mario 12 mark 12 matteo
12 mike 12 mtch 12 naadmin 12 NAU
12 nt 12 pizza 12 Polycom 12 pos
12 print200 12 PRODDTA 12 PSEAdmin 12 radware
12 rapport 12 rcust 12 router 12 shop
12 steve 12 svin 12 svn 12 Sweex
12 SYSADM 12 SYSDBA 12 target 12 telco
12 telecom 12 ts3serve 12 ubadmin 12 user01
12 USERID 12 username 12 vcr 12 vmadmin
12 VNC 12 volition 12 vt100 12 VTech
12 webadmin 14 1111 14 a 14 demo
14 ftptest 14 info 14 library 14 media
14 midgear 14 superman 14 system 14 www-data
16 angelo 16 cvsuser 16 cyrus 16 donatell
16 dvs 16 firebird 16 oracle5 16 scan
16 supervis 16 vyatta 18 Administ 18 backup
18 ftpadmin 18 git 18 jenkins 18 mtcl
18 raspberr 18 steam 18 teamspea 18 tech
18 ts3 18 User 18 www 20 debian
20 martin 20 sales 20 sshd 20 test9
22 12345 22 oliver 22 setup 22 telecoma
22 test2 24 123456 24 client 24 daniel
24 Operator 24 student 24 sysadm 26 0
26 backuppc 26 vision 28 avis 28 cisco
28 david 28 Manageme 28 mother 28 mysql
28 sysadmin 28 uucp 30 plcmspip 30 public
32 apache 32 master 34 applmgr 34 osmc
34 phion 36 butter 36 squid 38 111111
38 cacti 38 cron 38 nobody 38 user1
38 wp-user 38 zimbra 40 scaner 42 anonymou
42 castis 42 ftp_user 46 123 46 22
46 PlcmSpIp 46 usuario 46 webmaste 50 monitor
54 qhsuppor 54 testuser 60 manager 60 sybase
62 jboss 64 ftp_test 65 service 72 tomcat
76 zabbix 78 administ 78 super 90 default
96 adm 96 nagios 102 1234 112 operator
128 oracle 130 postgres 142 ftp 228 ftpuser
242 support 292 pi 4140 ubuntu 4192 guest
4268 ubnt 4302 test 4434 user 6081 admin
267194 root
Given this I'm not entirely sure it's a great idea to be running
cvstrac - it appears to be unmaintained and so on, but it's only
intended to be a short-term solution anyway.
Weather's too nice to be inside, I've done enough hours for the
week, and a brother is in town, so I think it's beer time!
Update 22/4/18: Thinking about the strange usernames, they
are probably bot-related accounts? Doesn't really matter.
Welcome to the ZedZone
First post on the new blog!
Experimenting with a very rudimentary, partly manual, somewhat
temporary posting mechanism until I can sort something better
out.
Apart from setting this up I've done a little hardening on the
software and the system. I tuned ssh a little bit. I added a
robots.txt to code.zedzone.au to stop indexers creating
potentially infinite references. And I changed the blog
indexing method and some of the URLs for the same reason.
The logs so far show mostly root login attempts via ssh, some
probes looking for various (mostly php) server stuff (which isn't
installed), and the Google bot getting a bit cross-eyed at some of
my url alias/rewriting/cgi mistakes.