FastCGI Enabled
Further to the previous post, I did end up porting my blog driver to
my FastCGI implementation.
Benchmarking with `ab' from home it doesn't really make any
difference when reading the front page of the blog - if anything it's
actually marginally slower.
Running the benchmark locally, though, things are quite different.
Previous standard cgi:
Concurrency Level: 1
Time taken for tests: 13.651 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 37100000 bytes
HTML transferred: 36946000 bytes
Requests per second: 73.25 [#/sec] (mean)
Time per request: 13.651 [ms] (mean)
Time per request: 13.651 [ms] (mean, across all concurrent requests)
Transfer rate: 2654.05 [Kbytes/sec] received
Using fcgi:
Concurrency Level: 1
Time taken for tests: 0.706 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 37062000 bytes
HTML transferred: 36908000 bytes
Requests per second: 1416.39 [#/sec] (mean)
Time per request: 0.706 [ms] (mean)
Time per request: 0.706 [ms] (mean, across all concurrent requests)
Transfer rate: 51264.00 [Kbytes/sec] received
So yeah, only 20x faster. If I up the concurrency level of the
benchmark it gets better but it's hard to tell much from it since
everything is running on the same machine.
Regardless, I made it live.
FastCGI experiments
It's not particularly important - I'm lucky to get more than
one non-bot hit in a given day - but I thought I'd have a look
into FastCGI. If in the future I do use a database backend or
even a Java one it should be an easy way to get some performance
while leveraging the simplicity of CGI and leaving the protocol
stuff to Apache.
After a bit of background reading and looking into some 'simple'
implementations I decided to just roll my own. The 'official'
fastcgi.com site is no longer live so I didn't think it worth
playing with the official SDK. The way it handled stdio just
seemed a little odd as well.
With the use of a few GNU libc extensions for stdio (cookie
streams) and memory (obstacks) I put together enough of a partial
(but robust) implementation to serve output-only pages from the
fcgid module in a few hundred lines of code.
This is the public API for it.
struct fcgi_param {
    char *name;
    char *value;
};

struct fcgi {
    // Active during cgi request
    FILE *stdout;
    FILE *stderr;

    // Current request info
    unsigned char rid1, rid0;
    unsigned char flags;
    unsigned char role;

    // Current request params (environment)
    size_t param_length;
    size_t param_size;
    struct fcgi_param *param;
    struct obstack param_stack;

    // Internal buffer stuff
    int fd;
    size_t pos;
    size_t limit;
    size_t buffer_size;
    unsigned char *buffer;
};
typedef int (*fcgi_callback_t)(struct fcgi *, void *);
struct fcgi *fcgi_alloc(void);
void fcgi_free(struct fcgi *cgi);
int fcgi_accept_all(struct fcgi *cgi, fcgi_callback_t cb, void *data);
char *fcgi_getenv(struct fcgi *cgi, const char *name);
I didn't bother to implement concurrent requests, the various
access control roles, or STDIN messages. The first doesn't appear
to be used by mod_fcgid (it handles concurrency itself) and I don't
need the rest (yet at least). As previously stated I used GNU
libc extensions to implement custom stdio streams for stdout and
stderr, although I used a custom 'zero-copy' buffer implementation
for the protocol handling (wherein the calls can access the
internal buffer address rather than having to copy data around).
Converting a CGI program is a little more involved than using the
original SDK because it doesn't hide the I/O behind macros or use
global variables to pass information. Instead, via a
context-specific handle, it provides stdio-compatible FILE handles
and a separate environment variable lookup function. Of course
it is possible to write a handler callback which implements such
a compatibility layer.
The main function of a FastCGI program just allocates the
context, calls fcgi_accept_all(), and then frees it. The callback
is invoked for each request and can access stdout/stderr from the
context using stdio calls as it wishes.
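As a rough illustration of that compatibility-callback idea - a sketch only, not part of the library, and legacy_cgi_main(), cgi_out and cgi_getenv() are made-up names standing in for an existing CGI program's entry points - it could look something like this:
#include <stdio.h>

#include "fcgi.h"

/* Hypothetical globals the unchanged CGI code prints to and queries,
   instead of the real stdout and getenv(). */
static struct fcgi *current;
FILE *cgi_out;

char *cgi_getenv(const char *name) {
    return fcgi_getenv(current, name);
}

/* The existing CGI logic, assumed to write to cgi_out. */
extern int legacy_cgi_main(void);

static int compat_callback(struct fcgi *cgi, void *data) {
    current = cgi;
    cgi_out = cgi->stdout;
    return legacy_cgi_main();
}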
Apache config
Here's the basic Apache config snippet I used to hook it into
`/blog' on a server (I did this locally rather than live on this
site though).
ScriptAlias /blog /path/fcgi-test.fcgi
FcgidCmdOptions /path/fcgi-test MaxProcesses 1

<Directory "/path">
    AllowOverride None
    Options +ExecCGI
    Require all granted
</Directory>
Custom streams and cookies
Using a GNU extension it is trivial to hook up custom stdio
streams - one gets all the benefits of libc's buffering and
formatting and one only has to write a couple of simple callbacks.
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>
#include <stdio.h>
#include <unistd.h>

#include "fcgi.h"   /* assumed to provide struct fcgi and the FCGI_Header/FCGI_* protocol definitions */
static ssize_t fcgi_write(void *f, const char *buf, size_t size, int type) {
    struct fcgi *cgi = f;
    size_t sent = 0;
    FCGI_Header header = {
        .version = FCGI_VERSION_1,
        .type = type,
        .requestIdB1 = cgi->rid1,
        .requestIdB0 = cgi->rid0
    };

    /* A FastCGI record carries at most 65535 bytes of content,
       so larger writes are split into multiple records. */
    while (sent < size) {
        size_t left = size - sent;
        ssize_t res;
        struct iovec iov[2];

        if (left > 65535)
            left = 65535;

        header.contentLengthB1 = left >> 8;
        header.contentLengthB0 = left & 0xff;

        iov[0].iov_base = &header;
        iov[0].iov_len = sizeof(header);
        iov[1].iov_base = (void *)(buf + sent);
        iov[1].iov_len = left;

        res = writev(cgi->fd, iov, 2);
        if (res < 0)
            return -1;

        sent += left;
    }
    return size;
}

static int fcgi_close(void *f, int type) {
    struct fcgi *cgi = f;
    FCGI_Header header = {
        .version = FCGI_VERSION_1,
        .type = type,
        .requestIdB1 = cgi->rid1,
        .requestIdB0 = cgi->rid0
    };

    /* A zero-length record terminates the stream. */
    if (write(cgi->fd, &header, sizeof(header)) < 0)
        return -1;
    return 0;
}
Well, perhaps the callbacks are more `straightforward' than simple
in this case. FastCGI has a payload limit of 64K so any larger
writes need to be broken up into parts. I use writev to write the
header and content directly from the library buffer in a single
system call (a pretty insignificant performance improvement in
this case, but one nonetheless). I might eventually need to handle
partial writes, but this works so far - and if it ever does become
necessary, the writev approach probably gets too complicated to
bother with.
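If partial writes ever do need handling, one simpler (if slower) fallback would be to drop writev and loop over plain write() calls until each piece is fully sent - just a sketch of the idea, not something the implementation currently does:
/* Sketch only: write exactly len bytes, retrying on short writes. */
static int write_fully(int fd, const void *buf, size_t len) {
    const char *p = buf;

    while (len > 0) {
        ssize_t res = write(fd, p, len);
        if (res < 0)
            return -1;
        p += res;
        len -= (size_t)res;
    }
    return 0;
}
fcgi_write() could then call it once for the header and once for each content chunk, at the cost of the extra system call the writev version avoids.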
The actual 'cookie' callbacks just invoke the functions above with
the FCGI channel to write to.
static ssize_t fcgi_stdout_write(void *f, const char *buf, size_t size) {
    return fcgi_write(f, buf, size, FCGI_STDOUT);
}

static int fcgi_stdout_close(void *f) {
    return fcgi_close(f, FCGI_STDOUT);
}

static const cookie_io_functions_t fcgi_stdout = {
    .read = NULL,
    .write = fcgi_stdout_write,
    .seek = NULL,
    .close = fcgi_stdout_close
};
And opening a custom stream is as simple as opening a regular file.
static int fcgi_begin(struct fcgi *cgi) {
    cgi->stdout = fopencookie(cgi, "w", fcgi_stdout);
    ...;
    return 0;
}
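The matching teardown isn't shown here; presumably closing the cookie stream is all that's needed, since fclose() flushes any buffered output and then invokes the close callback, which emits the zero-length record ending the stream. Something along these lines (fcgi_end is a hypothetical name, not the library's):
static int fcgi_end(struct fcgi *cgi) {
    /* fclose() flushes buffered output and calls fcgi_stdout_close(),
       which writes the terminating zero-length FCGI_STDOUT record. */
    if (cgi->stdout) {
        fclose(cgi->stdout);
        cgi->stdout = NULL;
    }
    /* ... stderr and the FCGI_END_REQUEST record would follow ... */
    return 0;
}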
Example
Here's a basic example that just dumps all the parameters to the
client. It also maintains a count to demonstrate that it's
persistent.
I went with a callback mechanism rather than the polling mechanism
of the original SDK mostly to simplify managing state. Shrug.
#include "fcgi.h"
static int cgi_func(struct fcgi *cgi, void *data) {
static int count;
fprintf(cgi->stdout, "Content-Type: text/plain\n\n");
fprintf(cgi->stdout, "Request %d\n", count++);
fprintf(cgi->stdout, "Parameters\n");
for (int i=0;i<cgi->param_length;i++)
fprintf(cgi->stdout, " %s=%s\n", cgi->param[i].name, cgi->param[i].value);
return 0;
}
int main(int argc, char **argv) {
struct fcgi * cgi = fcgi_alloc();
fcgi_accept_all(cgi, cgi_func, NULL);
fcgi_free(cgi);
}
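Anything more interesting than dumping parameters would use fcgi_getenv() for specific lookups. A small sketch (not from the library's examples, and whether PATH_INFO is set depends on the server configuration):
static int route_func(struct fcgi *cgi, void *data) {
    char *path = fcgi_getenv(cgi, "PATH_INFO");

    fprintf(cgi->stdout, "Content-Type: text/plain\n\n");
    fprintf(cgi->stdout, "You asked for: %s\n", path ? path : "(nothing)");
    return 0;
}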
Notes
I haven't worked out how to get the CGI script to 'exit' when the
MaxRequestsPerProcess limit has been reached without causing
service pauses. Whether I do nothing or whether I exit and close
the socket at the right time it still pauses the next request for
1-4 seconds.
I haven't converted my blog driver to use it yet - maybe later on
tonight if I keep poking at it.
Oh and it is quite fast, even with a trivial C program.
Versioning DB
Well I don't have any code ready yet but between falling asleep
today I did a little more work on a versioned data-store I've been
working on ... for years, a couple of decades in fact.
In its current iteration it utilises just 3 core (simple)
relational(-like) tables and can support both SVN-style branches
(lightweight renames) and CVS-style branches (even lighter
weight). Like SVN it uses global revisions and transactions, but
like CVS it works on a version tree rather than a path tree; both
approaches are possible within the same small library.
Together with dez it allows for
compact reverse delta storage.
Originally I started in C but I've been working in Java since -
although I'm looking at back-porting the latest design to C again
for some performance comparisons. It has always used Berkeley DB
(JE for Java) as storage, although I did experiment with a SQL
version in the past.
My renewed interest comes from the goal of eventually running this
site with it as backing storage - for code, documentation, musings;
e.g. the ability to branch a document tree for versioning and yet
have it served live from common storage. This was essentially the
reason I started investigating the project many years ago but
never quite got there. I'm pretty sure I've got a solid schema
but I still need to solidify the API around a few basic use-cases
before I move forward.
The last time I touched the code was 2 years ago, and the last time
I did any significant work on it was 3 years ago, so it's
definitely a slow-burner project!
Well, more when I have more to say.
FFmpeg 4.0
Just a short post about the latest FFmpeg release. I tried
building jjmpeg 3.0.1 against FFmpeg 4.0 and it compiles cleanly
with no warnings.
So I think it should be good enough to go ... but I realised I
don't actually have anything handy already written to test it
against right now so that's only a guess.
Once I do I'll bump the version and do another release. This is
more or less what I had planned to do today, but I got tags
working on this site instead.
Tags & Styles
Worked a little more on the site.
The big one is that I've added tags back to the pages. Once you
start viewing a tag-based index it sticks to that navigation until
it's cleared or another tag is chosen. It works in pretty much the
way you'd expect it to.
I've also linked in a stylesheet and started filling it out - but
this is very rudimentary for now and not much more than enough to
make things operate properly.
Powered by gcc and me!
Just for a little background, the blog itself is currently driven
by a small stand-alone C program which is executed as a cgi
script. The parameter processing is quite strict and just fails
with a 4xx series error if anything (external) isn't right. At
the moment it doesn't use any sort of database as such - the post
text is simply a file on disk which is interleaved within
code-generated text. A script generates several indexes in the
form of C code from the filenames and a metadata properties file
which is then compiled into the binary.
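Just to give an idea of the shape of that generated code - purely an illustrative guess here, the real field names and entries aren't published - it amounts to little more than a compiled-in table along these lines:
/* Illustrative only - invented names and values, not the actual generated index. */
struct post_entry {
    const char *id;     /* derived from the filename */
    const char *title;
    const char *date;
    const char *tags;   /* from the metadata properties file */
};

static const struct post_entry post_index[] = {
    { "0x0021", "Tags & Styles", "2018-04-21", "zedzone,www" },
    { "0x0020", "FFmpeg 4.0", "2018-04-21", "jjmpeg" },
    /* ... one entry per post ... */
};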
For now, to create a new post I have a small program which creates
a simple post template and launches emacs directly on the server.
If the file has been edited when emacs exits, it then gets moved
into the post directory and a metadata file is created. I then
have to run make install on the binary to update the indices for the new
file and any tags. This can eventually be replaced with a
web-based editor with image uploads and so on if I ever get around
to it.
It's just meant to be a 'quick and dirty' solution to get it up and
running, but I somewhat like its simplicity. I can't see it
being of any particular interest or use to anyone but I will
eventually publish it as Free Software at some later date.
Ahh stuff it
Got sick of all the snot in the logs so I've just moved ssh to
another port and DROP all incoming ssh packets.
Well I'm doing a LOG + DROP for now just out of curiosity, but at
least the failed login attempts have stopped cold.
I also put up a banner on a-hackers-craic redirecting here. This
site still supports access via the year/month/title.html URLs
that match the ones on Blogger (in addition to the hex-id ones); I
was going to try to write some javascript to link or redirect each
post to the new one but it just seems like too much work today.
NotworkManager and other small things
Had a few problems with system updates lately. One was an upgrade
to my remaining Slackware system that broke a few things. First
it wanted to run LILO after updating the kernel and I said no (I
don't use it); not sure if that step would also have run the grub
setup, but in any case it wasn't run. Fortunately one of the kernels
in grub still existed and booted so it wasn't too hard to fix.
It also broke NetworkManager - or rather, it stopped working
again. It's been a flakey piece of shit forever but I thought it
was finally 'stable' enough to use (despite a few quirks on that
machine like it not automatically reconnecting after waking up).
Well not so!
It simply wouldn't connect anymore. No idea why. I went back to
using rc.inet1.conf and it now works flawlessly - even reconnects
after waking up. I'd already done this (or equivalent) on all my
other machines, and it seems to be with good reason.
Crackers
I knew the internet was pretty slimy these days but actually
setting up a server on the naked internet over the last weekend
was a bit of an eye-opener.
I noticed a massive spike in traffic on the 15th - given that the
only service running at the time was the 'experiment' page, 1GB
seemed a bit off. It was just someone brute-forcing sshd. Since
this server went live on the 26th of March it has
processed over 300 000 failed login attempts; I
imagine (but haven't verified) most of those were on the 15th.
They certainly weren't me.
It's probably just a drop in the ocean compared to all the `real'
traffic but it seems such a waste. Yay for bots.
So I've put a few mitigations in place over the last few days:
- iptables rules to throttle new connections to port 22;
- disabled root login through ssh entirely;
- added a small blacklist using ipset.
I don't really want to have to maintain the last one but I'll see
how it goes.
Anyway it's sort of interesting to see the logins being used -
root is obvious, but hottie, mother and david don't seem too
obvious.
Just for fun, here's the complete list of the usernames and
frequency counts as of a few minutes ago.
1 irc 1 sync 1 syslog 2
2 ! 2 12345678 2 1234qwer 2 123qwe
2 12qwaszx 2 1qazxsw2 2 654321 2 777777
2 aaron 2 abcd1234 2 admin@12 2 admintek
2 admUS 2 adriana 2 aion 2 alexis
2 amanda 2 amit 2 amy 2 andrea
2 angela 2 anthony 2 antiviru 2 ARGENTIN
2 arsenal 2 ashok 2 asshole 2 bananapi
2 bank 2 baseball 2 board 2 bobby
2 bonita 2 botmaste 2 byte 2 bytes
2 cameron 2 carditek 2 carmen 2 carolina
2 centos 2 chat 2 chelsea 2 chicken
2 chris 2 cinema 2 claudia 2 corazon
2 counters 2 crystal 2 cs 2 csgoserv
2 csserver 2 customs 2 cuteako 2 cvs
2 cyber 2 data 2 db1 2 db2inst1
2 december 2 deploy 2 destiny 2 docker
2 download 2 dragon 2 dvd 2 edu
2 educatio 2 elastics 2 family 2 fedora
2 flower 2 forum 2 freedom 2 ftpuser1
2 gabriel 2 games 2 gaming 2 gb
2 ghost 2 gmodserv 2 gnu 2 gnuworld
2 greenday 2 harley 2 hdsf 2 hiitplc
2 home 2 hottie 2 html 2 http
2 hunter 2 idc!@ 2 internet 2 ircd
2 isabel 2 jessica 2 jessie 2 jiamima
2 karen 2 kartel 2 keith 2 kernel
2 kitten 2 kmc 2 laura 2 lauren
2 libuuid 2 liferay 2 linaro 2 linux
2 linuxmin 2 liverpoo 2 logon 2 lovers
2 lpa 2 lucas 2 maganda 2 maggie
2 mail 2 mailman 2 maintain 2 manuel
2 marketin 2 matthew 2 mdb 2 miguel
2 muiehack 2 music 2 musicbot 2 mylove
2 myspace 2 nathan 2 Neuchate 2 Norwood
2 ns 2 ns2 2 nuucp 2 october
2 odroid 2 openssh- 2 openvpn 2 oper
2 oracle2 2 orlando 2 otrs 2 pass
2 passw0rd 2 passwd 2 pc 2 pepper
2 php 2 pictures 2 poohbear 2 portal
2 pretty 2 princess 2 proba 2 proftpd
2 project 2 p@ssw0rd 2 purple 2 q1w2e3r4
2 qazwsx 2 qwe123 2 qwerty 2 radio
2 rangers 2 rdp 2 redis 2 redmine
2 richard 2 root123 2 rootme 2 rsync
2 sakura 2 saw 2 scanner 2 security
2 servercs 2 serverpi 2 services 2 shell
2 sinus123 2 skan 2 skaner 2 snoopy
2 soccer 2 soft 2 software 2 steven
2 sweetie 2 sweety 2 tequiero 2 test123
2 test5 2 test6 2 testftp 2 tim
2 tomcat7 2 transfer 2 tsserver 2 ucpss
2 Untersee 2 upload 2 upport 2 uptime
2 user02 2 veronica 2 victor 2 video
2 virus 2 visitor 2 vnc 2 volumio
2 webconfi 2 webporta 2 webtest 2 Welcome1
2 wmware 2 x 2 xbmc 2 xuelp123
2 zhaowei 2 zxin10 4 50cent 4 666666
4 admin123 4 alan 4 alarm 4 alejandr
4 alpine 4 andy 4 antonio 4 babygirl
4 bamboo 4 bin 4 blankend 4 build
4 carlos 4 control 4 csgo 4 daemon
4 daniela 4 dante 4 database 4 debian-s
4 dev 4 edi 4 fabricio 4 fabrizio
4 forever 4 gian 4 giorgio 4 giovanni
4 hannah 4 hello 4 iloveyou 4 jira
4 justin 4 leonardo 4 marco 4 mine
4 minecraf 4 naruto 4 nas 4 nginx
4 odoo 4 odoo2 4 oracle4 4 packer
4 patricia 4 patrizio 4 paul 4 plex
4 qwer1234 4 rebecca 4 roberto 4 rocco
4 sergio 4 shadow 4 shorty 4 shoutcas
4 staff 4 sysop 4 t7adm 4 test4
4 tsbot 4 vincenzi 4 vitaly 4 web
4 welcome 6 2Wire 6 admin2 6 amber
6 bot 6 camera 6 develope 6 dummy
6 Guest 6 hduser 6 jason 6 max
6 mobile 6 mythtv 6 netman 6 proxy
6 !root 6 Root 6 samba 6 server
6 sinus 6 temp 6 teste 6 training
6 ts3bot 6 ts3sleep 6 ts3user 6 vagrant
6 vps 6 zimeip 7 sys 8 albert
8 alessio 8 alex 8 anna 8 aurora
8 bianca 8 elena 8 enrica 8 ethos
8 hadoop 8 informix 8 lorenco 8 lorenzo
8 lucaluca 8 luigi 8 luka 8 marcel
8 marcello 8 maria 8 marta 8 massimo
8 mattia 8 olivia 8 oracle1 8 pia
8 piero 8 pippo 8 romeo 8 sinusbot
8 suporte 8 t7inst 8 test7 8 testing
8 tommaso 8 ts 8 user3 8 valerio
10 0101 10 admins 10 cpanel 10 danny
10 dbuser 10 gnats 10 john 10 lavander
10 michael 10 miner 10 office 10 oracle3
10 postmast 10 prueba 10 test1 10 test8
10 tplink 10 user2 10 vmuser 12 101
12 123321 12 1502 12 266344 12 3comcso
12 aaa 12 acc 12 adam 12 adfexc
12 Admin 12 ADMN 12 agent 12 alessand
12 am 12 api 12 avahi 12 bill
12 bob 12 Cisco 12 draytek 12 echo
12 engineer 12 enrique 12 fax 12 gopher
12 helpdesk 12 houx 12 installe 12 kodi
12 luca 12 mario 12 mark 12 matteo
12 mike 12 mtch 12 naadmin 12 NAU
12 nt 12 pizza 12 Polycom 12 pos
12 print200 12 PRODDTA 12 PSEAdmin 12 radware
12 rapport 12 rcust 12 router 12 shop
12 steve 12 svin 12 svn 12 Sweex
12 SYSADM 12 SYSDBA 12 target 12 telco
12 telecom 12 ts3serve 12 ubadmin 12 user01
12 USERID 12 username 12 vcr 12 vmadmin
12 VNC 12 volition 12 vt100 12 VTech
12 webadmin 14 1111 14 a 14 demo
14 ftptest 14 info 14 library 14 media
14 midgear 14 superman 14 system 14 www-data
16 angelo 16 cvsuser 16 cyrus 16 donatell
16 dvs 16 firebird 16 oracle5 16 scan
16 supervis 16 vyatta 18 Administ 18 backup
18 ftpadmin 18 git 18 jenkins 18 mtcl
18 raspberr 18 steam 18 teamspea 18 tech
18 ts3 18 User 18 www 20 debian
20 martin 20 sales 20 sshd 20 test9
22 12345 22 oliver 22 setup 22 telecoma
22 test2 24 123456 24 client 24 daniel
24 Operator 24 student 24 sysadm 26 0
26 backuppc 26 vision 28 avis 28 cisco
28 david 28 Manageme 28 mother 28 mysql
28 sysadmin 28 uucp 30 plcmspip 30 public
32 apache 32 master 34 applmgr 34 osmc
34 phion 36 butter 36 squid 38 111111
38 cacti 38 cron 38 nobody 38 user1
38 wp-user 38 zimbra 40 scaner 42 anonymou
42 castis 42 ftp_user 46 123 46 22
46 PlcmSpIp 46 usuario 46 webmaste 50 monitor
54 qhsuppor 54 testuser 60 manager 60 sybase
62 jboss 64 ftp_test 65 service 72 tomcat
76 zabbix 78 administ 78 super 90 default
96 adm 96 nagios 102 1234 112 operator
128 oracle 130 postgres 142 ftp 228 ftpuser
242 support 292 pi 4140 ubuntu 4192 guest
4268 ubnt 4302 test 4434 user 6081 admin
267194 root
Given this I'm not entirely sure it's a great idea to be running
cvstrac - it appears to be unmaintained and so on, but it's only
intended to be a short-term solution anyway.
Weather's too nice to be inside, I've done enough hours for the
week, and a brother is in town, so I think it's beer time!
Update 22/4/18: Thinking about the strange usernames, they
are probably bot-related accounts? Doesn't really matter.
Welcome to the ZedZone
First post on the new blog!
Experimenting with a very rudimentary, partly manual, somewhat
temporary posting mechanism until I can sort something better
out.
Apart from setting this up I've done a little hardening on the
software and the system. I tuned ssh a little bit. I added a
robots.txt to code.zedzone.au to stop indexers creating
potentially infinite references. And I changed the blog
indexing method and some of the URLs for the same reason.
The logs so far show mostly root login attempts via ssh, some
probes looking for various (mostly php) server stuff (which isn't
installed), and the Google bot getting a bit cross-eyed at some of
my url alias/rewriting/cgi mistakes.