
Overwatch and the Download Pipeline

overwatch webhooks prowlarr infrastructure

Rewrote the notification system from scratch, lost all indexers, accidentally broke the VPN with redacted keys, and started NVMe migration.

Some days you build things. Other days you spend twelve hours debugging stuff that should work and doesn’t, and you come out the other side with solutions that are actually better than what you planned. Today was both kinds of day simultaneously.

Why Overwatch Needed a Rewrite

Overwatch is the notification system. It tells you when a download finishes. Version 1 watched the filesystem. When files appeared in the download directory, it sent a Telegram message. Simple. Broken.

The problem: inotify doesn’t work across Docker container mounts on Windows. Docker Desktop uses a virtualized filesystem layer. File events from inside containers don’t propagate to the host. Your download completes, the file appears, and inotify shrugs.

I added a polling fallback. Check the directory every 30 seconds for new files. It worked, technically. But polling is the “we give up” of event-driven architecture. I could feel my professors judging me from across the years.

The real fix was rethinking the approach entirely. Why watch the filesystem when Sonarr and Radarr already know when downloads complete? They have a webhook system. On import, they fire an HTTP request to whatever URL you configure. So Overwatch v2 became a webhook server.

import json
from http.server import BaseHTTPRequestHandler

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON payload Sonarr/Radarr POST on import
        content_length = int(self.headers['Content-Length'])
        body = json.loads(self.rfile.read(content_length))

        event_type = body.get('eventType')
        if event_type == 'Download':
            # Sonarr nests the name under 'series', Radarr under 'movie'
            title = body.get('series', {}).get('title') or \
                    body.get('movie', {}).get('title')
            send_telegram(f"✅ {title} is ready to watch!")

        self.send_response(200)
        self.end_headers()

Configure Sonarr and Radarr to point webhooks at the server. No polling. No filesystem watching. No inotify nonsense. When Sonarr imports an episode, Overwatch knows instantly.
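A quick way to sanity-check the handler without waiting for a real import is to replay its title-lookup logic against trimmed fake events. The payloads below are stand-ins containing only the fields the handler reads; real Sonarr/Radarr webhook bodies carry much more:

```python
import json

def extract_title(body):
    """Mirror the handler's lookup: Sonarr nests the name under
    'series', Radarr under 'movie'."""
    if body.get('eventType') != 'Download':
        return None
    return body.get('series', {}).get('title') or \
           body.get('movie', {}).get('title')

# Trimmed stand-ins for real webhook payloads
sonarr_event = json.loads('{"eventType": "Download", "series": {"title": "Severance"}}')
radarr_event = {"eventType": "Download", "movie": {"title": "Dune"}}
test_event   = {"eventType": "Test"}

print(extract_title(sonarr_event))  # Severance
print(extract_title(radarr_event))  # Dune
print(extract_title(test_event))    # None
```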

There’s a wrinkle though. The download client runs inside the VPN container. All its traffic goes through the WireGuard tunnel. It literally cannot see the host machine’s webhook port from where it sits. Sonarr and Radarr have normal networking so their webhooks work fine. But anything from the download client is stuck.

I ended up adding a relay container that shares the VPN network namespace and can forward requests to the host. It’s a hack. I know it’s a hack. But it works, and the alternative was restructuring the entire network topology.
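The relay idea, as a rough compose fragment. The image, port, and gateway address here are illustrative, not the actual config; `network_mode: "service:gluetun"` is what joins the relay to the VPN container's network namespace, and Gluetun's firewall would also need to permit local traffic (e.g. via its `FIREWALL_OUTBOUND_SUBNETS` setting):

```yaml
webhook-relay:
    image: alpine/socat
    network_mode: "service:gluetun"  # share the VPN container's network stack
    # Listen where the download client expects the webhook endpoint,
    # forward to the host via the Docker bridge gateway
    command: TCP-LISTEN:9000,fork,reuseaddr TCP:172.17.0.1:9000
    depends_on:
      - gluetun
```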

The Prowlarr Disaster

Mid-afternoon, I recreated the Prowlarr container to update its config. Standard operation. Except I hadn’t mapped Prowlarr’s config directory to a persistent volume.

Every single indexer. Gone. All the configured search sources vanished. Prowlarr came up fresh, empty, confused. This is the Docker equivalent of “did you save?” and the answer was no.

Re-added all the indexers manually. Configured FlareSolverr as the proxy for the ones behind Cloudflare. Pointed them back at Sonarr and Radarr. Then triple-checked the volume mount:

prowlarr:
    volumes:
      - ./config/prowlarr:/config  # NEVER FORGET THIS AGAIN

The comment stayed in the compose file. It’s a scar, not documentation.

The VPN Key Incident

Then things got worse. I had a .env file with all the environment variables for the stack. VPN credentials, API keys, service passwords. At some point I’d redacted the WireGuard keys for a commit. Good instinct. Bad execution. The .env file got saved with the redacted values. When I recreated the Gluetun container, it pulled the redacted keys.

Gluetun failed to connect. The download client lost VPN access. The entire download pipeline went dark.

The fix was obvious once I found it. Put the real keys back. But the debugging was not obvious. Gluetun’s error messages when WireGuard keys are wrong are… not helpful. It took thirty minutes of reading container logs to understand that “interface creation failed” meant “your private key is literally the string [REDACTED].”

After this I restructured how env files work. The real .env is gitignored and contains actual values. A separate .env.example gets committed with placeholders. Never touching the real one for a commit again.
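The one-time setup amounts to something like this (the variable name is a placeholder; the real file holds whatever the stack actually needs):

```shell
# Real secrets live only in .env, which git never sees
touch .env                                                  # actual values go here
printf 'WIREGUARD_PRIVATE_KEY=changeme\n' > .env.example    # committed placeholder
echo ".env" >> .gitignore                                   # keep the real file out of commits
```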

Jetson NVMe Migration

While the Surface was being dramatic, I started working on the Jetson’s storage. The Orin Nano came with JetPack on a microSD card. Slow, limited write endurance, bottleneck for everything. I wanted the OS on the NVMe drive.

Cloned the root filesystem with rsync, updated the UUID references in fstab, pointed the boot config to the NVMe partition. The Jetson uses extlinux for booting, not GRUB. Different from what I’m used to on x86 Linux, but straightforward once you find the right config file.

sudo rsync -axHAWX --numeric-ids --info=progress2 / /mnt/nvme/

Reboot. NVMe root. The SD card becomes a backup boot device. Night and day difference in responsiveness.
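For reference, the two edits after the rsync boil down to something like this. The UUID and partition name are placeholders for whatever `blkid` reports on the actual drive:

```conf
# /mnt/nvme/etc/fstab — root now points at the NVMe partition
UUID=aaaa-bbbb-cccc  /  ext4  defaults  0  1

# /boot/extlinux/extlinux.conf — kernel root= argument updated to match
APPEND ${cbootargs} root=/dev/nvme0n1p1 rw rootwait
```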

Writing It All Down

At the end of the day I realized I was carrying too much architecture in my head. Every service, every port, every container relationship, every credential flow. All implicit knowledge. If I forgot over a weekend, reconstructing this would hurt.

So I wrote a complete system document. Every service and its purpose. The Docker network topology. The credential flow through the Secret Broker. The webhook pipeline. Backup strategy. Failure modes.

It’s the kind of documentation you always mean to write but never do because you’re too busy building. Today I wrote it. Future-me will be grateful, or at least less confused.

Three Lessons

Event-driven beats polling. Always. Let the thing that knows about the event tell you, instead of compulsively checking.

Docker volumes are not optional. If you don’t explicitly persist something, it will eventually disappear. Treat every docker-compose up as a potential data-loss event for anything not mounted.

Document while you remember. The best time to write architecture docs is right after you’ve been debugging for hours and every connection is still vivid. The worst time is “later.” There is no later.

Tomorrow the real migration begins. The Jetson is prepped, the NVMe is fast, and the plan is ready. Time to move everything to its new home.