d.werder

Notes on Systems & Infrastructure

I'm Daniel — a systems engineer based in Zurich. I write about Linux, self-hosting, networking, and the occasional deep-dive into whatever I'm troubleshooting at the moment.

Recent Posts

Migrating from Nginx to Caddy: Was It Worth It?

After running Nginx for years on all my personal servers, I finally gave Caddy a proper try. Here's what I found after three months of running it in production on five different machines.

Monitoring a Homelab with Prometheus and Grafana

My setup for monitoring 12 containers and 3 physical hosts with Prometheus, node_exporter, and Grafana. Includes alert rules that actually make sense.

Hardening SSH: Beyond the Basics

Everyone knows to disable password auth and change the port. Here are the less obvious steps I take on every new server: certificate-based auth, fail2ban tuning, and kernel-level restrictions.

About

I'm Daniel Werder, a systems and infrastructure engineer based in Zurich, Switzerland. I've been working with Linux servers professionally since around 2015, mostly in small-to-medium hosting and SaaS environments.

Outside of work I run a homelab with more machines than I probably need, contribute to a few open source projects when I have time, and slowly work through my ever-growing reading list of technical books.

This blog is a place for me to document things I've figured out, mostly so I don't have to figure them out again. If it helps someone else along the way, even better.

You can reach me at daniel [at] werder [dot] ch.

This is a personal, non-commercial website. No cookies are used for tracking or analytics. Server access logs are retained for 7 days for security purposes and then deleted.

Migrating from Nginx to Caddy: Was It Worth It?

March 19, 2026 · 5 min read

I've been an Nginx user since 2016. It's fast, well-documented, and I know its config syntax like the back of my hand. So when people kept recommending Caddy, I honestly didn't see the point. Automatic HTTPS? I already had certbot scripts. Simpler config? Nginx configs aren't that hard once you've written a few hundred of them.

But curiosity got the better of me, and three months ago I migrated five personal servers over. Here's what I found.

The first thing that struck me was the Caddyfile format. Compare a typical Nginx reverse proxy block:

server {
    listen 443 ssl http2;
    server_name app.example.com;
    ssl_certificate /etc/letsencrypt/live/...;
    ssl_certificate_key /etc/letsencrypt/live/...;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

With the equivalent Caddy config:

app.example.com {
    reverse_proxy 127.0.0.1:3000
}

That's it. TLS is automatic. Headers are set sensibly by default. The certificate renews itself without cron jobs or hooks.

Performance-wise, I haven't noticed any meaningful difference for my workloads. Both handle static files and reverse proxying without breaking a sweat at my traffic levels (which, let's be honest, are modest).

The one area where Nginx still wins is edge cases. Complex rewrite rules, map blocks, conditional logic — Nginx gives you more low-level control. Caddy's approach is more opinionated, which is great until you need something it doesn't have an opinion about.

After three months, I'm keeping Caddy on four of the five servers. The fifth has a complex multi-tenant setup with conditional routing that was easier to express in Nginx. For everything else, the reduced config complexity and zero-touch TLS have been worth the switch.

Would I recommend it? If you're setting up a new server today and don't have years of Nginx muscle memory — yes, absolutely. If you have a working Nginx setup and no pain points, there's no urgent reason to switch.

Practical WireGuard: Setting Up a Multi-Site Mesh

March 4, 2026 · 7 min read

WireGuard has been my go-to VPN protocol for a while now, and for good reason: it's fast, simple, and the codebase is small enough that you can actually audit it. But most tutorials only cover the basic point-to-point setup. Here's how I connected three VPS nodes and my home server into a private mesh.

The goal was straightforward: every node should be able to reach every other node over private IPs, traffic between them should be encrypted, and if one node goes down, the others should still be able to communicate.

First, the addressing scheme. I used 10.100.0.0/24 for the mesh:

# Node 1 (Frankfurt)  - 10.100.0.1
# Node 2 (Helsinki)   - 10.100.0.2
# Node 3 (Zurich VPS) - 10.100.0.3
# Node 4 (Home)       - 10.100.0.4

Each node gets a WireGuard interface with a config that lists all other nodes as peers. The key insight for a mesh (as opposed to a hub-and-spoke) is that every node needs to have every other node in its peer list, and AllowedIPs should only contain that specific peer's address.

For the home server behind NAT, I set PersistentKeepalive = 25 so it maintains the connection through the router. The VPS nodes with public IPs don't need this.
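Putting those pieces together, here's roughly what the Frankfurt node's config looks like under this scheme — keys, endpoints, and the listen port are placeholders, not my actual values:

```ini
# /etc/wireguard/wg0.conf on Node 1 (Frankfurt) — illustrative sketch
[Interface]
Address = 10.100.0.1/24
ListenPort = 51820
PrivateKey = <node1-private-key>

# Node 2 (Helsinki): AllowedIPs holds only that peer's mesh address
[Peer]
PublicKey = <node2-public-key>
Endpoint = node2.example.com:51820
AllowedIPs = 10.100.0.2/32

# Node 3 (Zurich VPS)
[Peer]
PublicKey = <node3-public-key>
Endpoint = node3.example.com:51820
AllowedIPs = 10.100.0.3/32

# Node 4 (Home, behind NAT): no Endpoint here — the home node dials out
# and keeps the NAT mapping alive with PersistentKeepalive on its side
[Peer]
PublicKey = <node4-public-key>
AllowedIPs = 10.100.0.4/32
```

The other three nodes are the same shape with the addresses rotated.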

DNS was the trickier part. I wanted to resolve hostnames like frankfurt.mesh from any node. I ended up running a small CoreDNS instance on one of the VPS nodes that serves a custom zone file, and pointed all nodes' resolv.conf at it through the mesh IP.
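As a sketch, the CoreDNS side is a server block for the mesh zone plus a small zone file — zone name and paths here are illustrative, not verbatim from my setup:

```
# Corefile: serve the mesh zone, forward everything else upstream
mesh {
    file /etc/coredns/db.mesh
}
. {
    forward . 1.1.1.1
}
```

```
; /etc/coredns/db.mesh
$TTL 300
$ORIGIN mesh.
@         IN SOA ns.mesh. admin.mesh. (1 7200 3600 1209600 300)
@         IN NS  ns.mesh.
ns        IN A   10.100.0.1
frankfurt IN A   10.100.0.1
helsinki  IN A   10.100.0.2
zurich    IN A   10.100.0.3
home      IN A   10.100.0.4
```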

The whole setup has been running for about four months now without any issues. WireGuard reconnects automatically after reboots, and the overhead is negligible — I measured roughly 3-5% throughput reduction compared to direct connections.

Monitoring a Homelab with Prometheus and Grafana

February 17, 2026 · 6 min read

I put off setting up proper monitoring for way too long. For years I relied on a combination of htop over SSH and the occasional "does the website load?" check. It worked until it didn't — I lost a drive in a ZFS mirror and didn't notice for two weeks because SMART warnings were going to a log file nobody was reading.

That was the push I needed. Here's what I ended up building.

The stack is standard: Prometheus for metrics collection, node_exporter on every host, cAdvisor for container metrics, and Grafana for dashboards. Everything runs in Docker on a dedicated monitoring VM.

The part most guides skip is alert rules that don't drive you crazy. My approach: only alert on things that require action within 24 hours. Disk above 85%? Alert. CPU spike to 90% for 5 minutes? Not an alert — that's just a busy server doing its job. A container restarting more than 3 times in an hour? Alert. Load average above 4 on a 4-core machine? Only if it persists for 15+ minutes.
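To make that concrete, here's roughly how the disk and load rules from that list translate into Prometheus alerting rules — thresholds are the ones above, the metric names assume a standard node_exporter setup:

```yaml
groups:
  - name: homelab
    rules:
      - alert: DiskUsageHigh
        # Fires when any real filesystem is more than 85% full
        expr: |
          100 * (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
                 / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) > 85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.mountpoint }} on {{ $labels.instance }} above 85%"

      - alert: HighLoadSustained
        # Only fires if load stays above 4 for 15+ minutes (4-core machine)
        expr: node_load1 > 4
        for: 15m
        labels:
          severity: warning
```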

For notifications I use Alertmanager with a webhook to a private Telegram bot. Email notifications went to a folder I never checked. Telegram messages I actually see.
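The Alertmanager side is just a webhook receiver pointing at wherever the bot listens; something along these lines, with the URL as a placeholder:

```yaml
# alertmanager.yml — route everything to the Telegram-bridge webhook
route:
  receiver: telegram
receivers:
  - name: telegram
    webhook_configs:
      - url: http://127.0.0.1:8080/alert   # placeholder: private bot endpoint
```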

Total resource overhead: Prometheus uses about 200MB of RAM and negligible CPU. Grafana another 150MB. node_exporter is invisible. For the visibility it gives me, that's a bargain.

ZFS on Linux in 2026: Still My Filesystem of Choice

January 28, 2026 · 5 min read

Every year or so I re-evaluate whether ZFS is still the right choice for my servers, and every year I come to the same conclusion: nothing else gives me the same combination of data integrity, snapshots, and flexibility.

My current setup across three machines: two mirror vdevs per pool (so four drives per pool), with weekly scrubs and hourly snapshots retained for 30 days. The snapshot rotation is handled by sanoid, which has been rock-solid.
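In sanoid.conf terms, the hourly-for-30-days policy maps onto a template roughly like this — the dataset name is a placeholder, and 30 days of hourlies works out to 720 snapshots:

```ini
# /etc/sanoid/sanoid.conf — illustrative
[tank/data]
    use_template = production

[template_production]
    hourly    = 720   # 30 days of hourly snapshots
    daily     = 0
    monthly   = 0
    autosnap  = yes
    autoprune = yes
```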

The killer feature for me is still zfs send | zfs receive. I replicate snapshots from my primary NAS to a backup server every night over WireGuard. The incremental sends are tiny — usually a few hundred MB — and I get a byte-for-byte copy of the filesystem state on the remote end.
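Schematically, the nightly job is an incremental send between the last replicated snapshot and the newest one, piped over SSH through the mesh — pool, dataset, snapshot, and host names here are placeholders:

```shell
# Send everything between the last common snapshot and the newest one
# to the backup host; -F lets the receiver roll back to match
zfs send -I tank/data@2026-01-27 tank/data@2026-01-28 \
  | ssh backup.mesh zfs receive -F backup/data
```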

Common complaints about ZFS: memory usage and complexity. The memory point is fair — ZFS's ARC will happily use all available RAM for caching, though you can cap it. On my 32GB NAS, I limit it to 16GB. For complexity, I'd argue that once you understand pools, vdevs, and datasets, the day-to-day is actually simpler than managing LVM + ext4 + separate backup tooling.
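Capping the ARC on Linux is a module parameter, set in bytes — 16GB is 17179869184:

```
# /etc/modprobe.d/zfs.conf — cap the ZFS ARC at 16 GiB
options zfs zfs_arc_max=17179869184
```

The same value can be written to /sys/module/zfs/parameters/zfs_arc_max to apply it without a reboot.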

Btrfs has come a long way, and I wouldn't discourage anyone from using it. But for data I care about — family photos, documents, project archives — I stick with ZFS. The track record matters.

Hardening SSH: Beyond the Basics

January 10, 2026 · 6 min read

Disabling password authentication and using key-based login is step one of SSH hardening. Most guides stop there, maybe with a mention of changing the default port. Here's what I do on every new server beyond those basics.

First, I use SSH certificates instead of raw public keys. With a certificate authority (even a self-managed one), you sign user keys and host keys. The server trusts any key signed by your CA, and you trust any host signed by it. No more TOFU (trust on first use) warnings, and revoking access means revoking the certificate — you don't need to hunt down authorized_keys files across machines.

The setup with ssh-keygen:

# Generate a CA key pair (do this once, store securely)
ssh-keygen -t ed25519 -f ca_key -C "my-ssh-ca"

# Sign a user's public key
ssh-keygen -s ca_key -I user@hostname -n root -V +52w id_ed25519.pub

# Sign a host key
ssh-keygen -s ca_key -I hostname -h -n hostname.example.com /etc/ssh/ssh_host_ed25519_key.pub
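Signing alone isn't enough: the server has to be told to trust user certificates from the CA, and clients have to trust host certificates signed by it. The wiring looks roughly like this — paths and the hostname pattern are conventional, not verbatim from my setup:

```
# /etc/ssh/sshd_config on each server:
#   trust user keys signed by the CA, and present the signed host cert
TrustedUserCAKeys /etc/ssh/ca_key.pub
HostCertificate /etc/ssh/ssh_host_ed25519_key-cert.pub

# ~/.ssh/known_hosts on each client:
#   accept any host whose cert is signed by the CA (key is the CA public key)
@cert-authority *.example.com ssh-ed25519 AAAA... my-ssh-ca
```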

Second, I restrict SSH to specific algorithms. In /etc/ssh/sshd_config:

KexAlgorithms curve25519-sha256
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com
MACs hmac-sha2-512-etm@openssh.com

This eliminates older, weaker algorithms. If you only connect from modern clients, there's no reason to support anything else.

Third, fail2ban tuning. The defaults are too lenient in my opinion. I set maxretry = 3 and bantime = 1h for SSH, with a recidive jail that bans repeat offenders for a week. The number of brute-force attempts dropped from hundreds per day to essentially zero after the first week.
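In jail.local form, that tuning looks roughly like this (the recidive jail watches fail2ban's own log for repeat bans; time suffixes like 1h need fail2ban 0.10 or newer):

```ini
# /etc/fail2ban/jail.local — illustrative
[sshd]
enabled  = true
maxretry = 3
bantime  = 1h

[recidive]
enabled  = true
logpath  = /var/log/fail2ban.log
findtime = 1d
maxretry = 3
bantime  = 1w
```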

Finally, kernel-level restrictions: I use sysctl settings to disable IP source routing, ignore ICMP redirects, and enable SYN cookies. These aren't SSH-specific, but they reduce the attack surface of any internet-facing machine.
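The settings in question, as I'd put them in a sysctl drop-in — a conservative baseline, not an exhaustive hardening list:

```
# /etc/sysctl.d/99-hardening.conf
# Disable IP source routing
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
# Ignore ICMP redirects
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv6.conf.all.accept_redirects = 0
# Enable SYN cookies
net.ipv4.tcp_syncookies = 1
```

Apply with `sysctl --system`.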