Saturday. Eleven AM. One terminal.

A week ago I launched an AI chatbot on imarch.dev. A virtual clone that answers visitors’ questions about me, my experience, and my services. FastAPI, Docker, Hetzner Cloud. But it’s not just a wrapper around an API with a massive system prompt. My bio, projects, and tech stack all live in a vector database. For every visitor question, the bot first retrieves relevant fragments via embeddings, then sends only that compact context to Claude Haiku 3.5. The vector DB works as a filter on the context: fewer tokens, faster responses, lower bills. It works. But “works” and “production-ready” are two very different things.
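The retrieve-then-ask flow can be sketched in a few lines. This is a toy illustration, not the code running on imarch.dev: `embed` is a bag-of-words stand-in for a real embedding model, the list scan stands in for the vector database, and `VOCAB`, `FRAGMENTS`, and every function name here are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def embed(text, vocab):
    # Toy stand-in for an embedding model: word counts over a fixed vocabulary.
    words = text.lower().split()
    return [words.count(term) for term in vocab]

def retrieve(question, fragments, vocab, k=2):
    # Rank stored fragments by similarity to the question and keep the top k.
    q = embed(question, vocab)
    ranked = sorted(fragments, key=lambda f: cosine(q, embed(f, vocab)), reverse=True)
    return ranked[:k]

def build_prompt(question, fragments, vocab, k=2):
    # Only the retrieved fragments go to the model, not the whole knowledge base.
    context = "\n".join(retrieve(question, fragments, vocab, k))
    return f"Context:\n{context}\n\nQuestion: {question}"

# Placeholder data, invented for this example.
VOCAB = ["docker", "fastapi", "uptime", "services", "python"]
FRAGMENTS = [
    "The chatbot runs on fastapi inside docker",
    "Consulting services cover architecture and python backends",
    "Past systems ran at high uptime",
]

if __name__ == "__main__":
    print(build_prompt("What services do you offer?", FRAGMENTS, VOCAB, k=1))
```

With `k=1`, only the fragment that mentions services ends up in the prompt; the other two never reach the model, which is where the token savings come from.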

Today I sat down to close the tech debt. In a single day, I went from “works” to “locked down.”

[Image: a tower of concentric shields with an AI brain at its core]

Data Audit

Started with the data. The chatbot’s system prompt and my actual CV were out of sync. The prompt said 99.9% uptime where the CV says 99.92%. Projects were missing. The bot was confidently lying to visitors about facts from my own biography. I synchronized everything by hand, line by line.

Six Holes

Then security. I opened the terminal and started testing the chatbot the same way I’d test someone else’s service. Rate limiting at the Python level? Yes. At the Nginx level? No. Which means during a DDoS, every request still reaches the application. The client IP was read from the X-Forwarded-For header exactly as the client sent it, so send a forged address and you bypass rate limiting entirely. The app container was running as root. Body size was unlimited. No timeout on Claude API calls, so one hung request would block a worker forever.
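The X-Forwarded-For hole is easy to show in isolation. Below is a minimal sketch, not the bot’s actual code: `client_ip_naive`, `client_ip_safe`, and `TRUSTED_PROXIES` are hypothetical names, and it assumes the reverse proxy sits on the same host and writes the address it saw as the last entry of the header.

```python
# Assumption for this sketch: the reverse proxy (Nginx) runs on the same host.
TRUSTED_PROXIES = {"127.0.0.1"}

def client_ip_naive(peer_ip, headers):
    # Vulnerable: believes whatever the client put in the header.
    forwarded = headers.get("x-forwarded-for")
    return forwarded.split(",")[0].strip() if forwarded else peer_ip

def client_ip_safe(peer_ip, headers):
    # Trust the header only when the TCP peer is our own proxy, and then only
    # the last entry, which is the one the proxy wrote itself.
    if peer_ip in TRUSTED_PROXIES:
        forwarded = headers.get("x-forwarded-for")
        if forwarded:
            return forwarded.split(",")[-1].strip()
    return peer_ip

if __name__ == "__main__":
    spoofed = {"x-forwarded-for": "1.2.3.4"}
    # An attacker connecting directly from 203.0.113.7 with a forged header.
    print(client_ip_naive("203.0.113.7", spoofed))  # key used by the rate limiter
    print(client_ip_safe("203.0.113.7", spoofed))
```

With the naive version, an attacker gets a fresh rate-limit bucket for every forged address. The safe version always keys on an address the server observed itself.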

Six holes. None critical. All obvious.

12 Layers of Defense

Fixed in a few hours:

Nginx: rate limit of 5 req/s with a burst of 10, max 5 connections per IP, request body capped at 2 KB
X-Forwarded-For overwritten with the real address
Fail2ban bans IPs for an hour after 20 violations
Docker: non-root user, .dockerignore, concurrency limit
FastAPI: 15-second API timeout, periodic session cleanup, hard cap on memory
SSH: disabled passwords, keys only
Added swap as an OOM safety net
Spending limit on Anthropic
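Two of the application-level items, the API timeout and the concurrency limit, fit in a short asyncio sketch. It is an illustration under assumptions rather than the deployed code: `call_with_guard` is a hypothetical helper, the cap of 4 concurrent calls is invented (the list gives no number), and `fast` and `hung` are stand-ins for real Claude API calls.

```python
import asyncio

API_TIMEOUT_SECONDS = 15   # the timeout from the list above
MAX_CONCURRENT_CALLS = 4   # assumption: the post does not state the actual cap

async def call_with_guard(call, semaphore, timeout=API_TIMEOUT_SECONDS):
    """Run an upstream call with a concurrency cap and a hard timeout."""
    async with semaphore:  # at most MAX_CONCURRENT_CALLS calls in flight
        try:
            return await asyncio.wait_for(call(), timeout=timeout)
        except asyncio.TimeoutError:
            # The hung call is cancelled and the worker is freed.
            return None

async def demo():
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_CALLS)

    async def fast():
        # Stand-in for an API call that returns promptly.
        await asyncio.sleep(0.01)
        return "answer"

    async def hung():
        # Stand-in for an API call that never comes back in time.
        await asyncio.sleep(10)
        return "never"

    ok = await call_with_guard(fast, semaphore)
    timed_out = await call_with_guard(hung, semaphore, timeout=0.05)
    return ok, timed_out

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Returning `None` on timeout is a design choice for the sketch. A real handler would more likely turn it into an error response or a fallback message for the visitor.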

Twelve layers. From Cloudflare at the front to iptables at the back.

A Co-Pilot in the Terminal

Working alongside me in the terminal was Claude Code. Not as autopilot. As a second engineer.

I say “check the security” and it reads configs, finds holes, suggests fixes. I say “apply” and it writes code, commits, deploys, SSHes into the server, and patches nginx. I review every change. But I barely write anything by hand.

This isn’t the future. This is a Saturday.

I don’t know what software development will look like in two years. But today, one architect with an AI co-pilot finished work that would have taken a team of three a week.

And the chatbot no longer lies about uptime.

Want an AI assistant like this on your own site? Or looking to integrate AI into an existing product? Get in touch and let’s figure it out.

Saturday Deploy: 12 Layers of Defense
