Home | stroppykitten.com - An occasional sysadmin blog

My First WebExtension - CookieMaster

2018-02-11

I finally completed my first WebExtension today. It's for controlling what domains may set cookies, in the style of the old CookieMonster add-on for Firefox that didn't get ported to WebExtensions.

Modern (2017/2018) CSS + Javascript guides

2018-02-06

Despite being in IT and doing things in and around web development for 20 years, my focus has been on System Administration for a long time. I 'know' HTML, JavaScript, and CSS at a perfunctory level and can do all sorts of fun (but typically ugly) things in them, but modern 'front-end' development has changed a lot over the last 5-10 years, and I haven't had the time (or the inclination, to be frank) to keep up.

Prometheus, Grafana, rates, and statistical kerfluffery

2018-01-22

I love performance graphs and monitoring software; graphs are pretty, and there's nothing quite like the feeling of using a graph to identify precisely the cause of a technical problem. It means, however, that every few years I end up delving deep into some aspect of them to figure out why my graphs don't look 'right'. A few years ago it was Cacti + RRD losing information we thought we were keeping (pro-tip: consolidating to the maximum value of an average seen over a given period, rather than keeping the maximum of all maximums seen, is probably not what you want). This is the story of my latest battle with Prometheus and Grafana; there are quite a few different moving parts involved, and the story is an interesting one. Come on a journey with me (Spoiler: I win in the end).
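
The Cacti + RRD pro-tip is easy to see with a few numbers. The following is only a minimal Python sketch with made-up sample values (the interval layout and the figures are assumptions for illustration, not data from the original incident): a short spike survives if you keep the maximum of the per-interval maximums, but is flattened if you keep the maximum of the per-interval averages.

```python
# Hypothetical raw samples, grouped into four consolidation intervals.
# The second interval contains one short spike (95).
intervals = [
    [10, 12, 11, 13],
    [9, 95, 10, 12],
    [11, 10, 12, 11],
    [10, 11, 10, 12],
]

# First-level consolidation: one average and one maximum per interval.
averages = [sum(samples) / len(samples) for samples in intervals]
maxima = [max(samples) for samples in intervals]

# Second-level consolidation over the whole period.
max_of_averages = max(averages)  # the spike is diluted by its neighbours
max_of_maxima = max(maxima)      # the spike is preserved

print("max of averages:", max_of_averages)
print("max of maxima:  ", max_of_maxima)
```

With these numbers the true peak is 95, but the maximum of the averages reports only 31.5, which is the kind of information loss described above.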

DR for Puppet

2017-01-01

I recently had to set up a DR (Disaster Recovery) capability for our Puppet Master (Puppet 4, Open Source version); until now, we'd run with just a single puppet master in a single geographical location. Certain events brought DR to the forefront of our minds and priority lists, and the task fell to me. Note that I'm not talking about resilient/redundant arrangements of multiple always-live servers as very well documented at https://docs.puppet.com/guides/scaling_multiple_masters.html, but rather a somewhat simpler approach more suited to our current needs. The solution was fairly simple in the end, and worth recording in case it is of value for someone else. The constraints:

What Could Possibly Go Wrong

2016-12-07

At my work we often throw around the phrase WCPGW (What Could Possibly Go Wrong) in response to ill-advised or just plain crazy ideas. It's fun, and lets off some steam, but it occurred to me recently that there's a useful kernel of truth in it. Indeed, a good sysadmin is always asking this question; when designing systems, preparing to make a change, in the heat of an emergency, and in security design and response.

VarnishNCSA LogFormat hackery

2015-09-28

Varnishncsa is the logging component of the Varnish web cache software (https://www.varnish-cache.org/). Sadly it doesn't care for reading a config file, so the only way to configure the format of the log lines is on the command line. The trouble with that is spaces and quotes. Sure, on Debian/Ubuntu you have /etc/default/varnishncsa, and you can set "$DAEMONOPTS" to add -F . But because this log format will typically contain spaces, the quoting is exactingly particular, and as far as I've been able to determine, by the time it's been set through $DAEMONOPTS, there's either no way to get the quoting right, or it's too complex for a simple sysadmin like me to figure out (and according to my web searching, many smarter people have tried that approach and failed).
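
To illustrate why the quoting is so awkward, here is a rough Python sketch of what a shell does when it expands a variable without surrounding quotes. It assumes the init script uses the variable unquoted (not confirmed here), and the format string is a hypothetical placeholder rather than a real configuration. Plain whitespace splitting stands in for the shell's field splitting, where quote characters that arrive via a variable are treated as literal text.

```python
# Hypothetical option string, standing in for what might be put in the
# defaults file. The format value is a placeholder, not a recommended one.
daemon_opts = '-F "%h %l %u %t"'

# Model of unquoted shell expansion: split on whitespace only, with the
# quote characters left in place as ordinary characters.
argv = daemon_opts.split()

# What the author of the option string actually wanted the daemon to see.
intended = ["-F", "%h %l %u %t"]

print("received:", argv)
print("intended:", intended)
```

In this model the daemon receives five separate arguments, two of them with stray quote characters attached, instead of the two that were intended.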

Fun times with random IPSec corruption

2015-01-25

Let me tell you a story of woe, intermittent/random corruption, and confusion.

Things I have learned - Part 5

2013-11-25

Short and sweet today: There is always a point of failure between your redundant, non-single-point-of-failure components. You know, the single cable or switch that connects your VRRP firewalls, which on failure results in two machines that both think they're master. Or the RAID controller that connects to both disks in your RAID-1 mirror, which on failure takes out both disks (or worse, corrupts data on them). Or that little Environmental Monitoring Unit in your lovely big SAN that, on failure, makes the redundant SAN controllers decide that they cannot and should not be serving any delicious data from your racks of redundant disks to your servers over the multi-path multi-switch SAN fabric. That last one? Yeah, saw that in production once. Removing all single points of failure is actually rather hard. I'm not saying you shouldn't try, but when you think you're done, look again. Look in the cracks between your components, and ask yourself what will happen if those cracks widen. It's kinda fun, in a "watch a scary movie" way.

Why Perl programs should always 'use strict'

2013-11-17

Yes, every Perl programmer knows that you should 'use strict', but sometimes it's just easier not to. BUT YOU SHOULD ALWAYS DO IT ANYWAY. I just spent an hour debugging a bit of existing code where I added a bit of fork/waitpid code (copy/pasted from elsewhere) to implement concurrent child processing. And because 'use strict' wasn't on (not my fault, the original code isn't mine), and I didn't add use POSIX ":sys_wait_h"; at the start, the WNOHANG constant wasn't defined. So Perl just said "ok, I'll make that 0". Which means that my waitpid that was supposed to not hang, did indeed hang, so my concurrency code failed miserably to be concurrent. This makes me grumpy.
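
Without 'use strict', Perl treats the undefined bareword WNOHANG as the string "WNOHANG", which becomes 0 in numeric context, and a flags value of 0 asks waitpid to block. The difference is sketched below in Python rather than Perl (so this is an illustration of the behaviour, not the original code); the helper name and the half-second sleep are assumptions made for the example, and it needs a Unix-like system for os.fork.

```python
import os
import time


def spawn_sleeper(seconds):
    """Fork a child that sleeps and exits; return its PID (hypothetical helper)."""
    pid = os.fork()
    if pid == 0:
        time.sleep(seconds)
        os._exit(0)
    return pid


child = spawn_sleeper(0.5)

# With WNOHANG, waitpid returns at once; a PID of 0 means "still running".
polled_pid, _ = os.waitpid(child, os.WNOHANG)

# With flags 0 (what the undefined constant turned into), waitpid blocks
# until the child has exited, which defeats any attempt at concurrency.
start = time.monotonic()
reaped_pid, status = os.waitpid(child, 0)
blocked_for = time.monotonic() - start

print("non-blocking poll returned PID:", polled_pid)
print("blocking wait reaped PID:", reaped_pid, "after", round(blocked_for, 2), "s")
```

The first call comes back immediately while the child is still asleep; the second sits there until the child is done, which is exactly the hang described above.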

Things I have learned - Part 4

2013-11-15

Following on from part 2 of this series is the requirement that e-mail must leave the system it's generated on, unless it's going to a mailbox on the local system which is actively monitored (weekly minimum, preferably every week-day or maybe even several times a day). Typically that place will only be on your actual mail server. In general, system mail must go to a real IMAP/POP/Exchange/LotusNotes/Whatever mailbox server (MDA), where someone's mail client will present it to them.