The Boring Stuff (at 1am)

Sentry, SEO, and the 37-minute midnight Docker debugging saga that taught me what .gcloudignore does. Plus the extensibility overhaul where one terminal became plugins, skills, agents, and hooks.

Eyal Harush · 12 min read

After shipping day closed the first-beta epic, I had a decision to make. The game was on TestFlight. Friends and family were playing it. The obvious temptation was to keep piling on features. The less obvious but more correct move was to spend a few days on the invisible work, the boring stuff that separates a prototype from a product.

This post is about those few days. It's also the post where my entire Claude Code workflow transformed from "one terminal, direct chat" into the multi-plugin, multi-skill, multi-agent orchestration that has powered everything since. The Boring Stuff post doubles as the Extensibility Overhaul post. Both happened in the same 48 hours because one led to the other.

The 1am Docker saga

Let's start with the most relatable story. It's 00:28, April 3rd. I'm staring at the marketing website on my laptop. The home page loads fine. The /updates page loads fine. The PNG assets embedded in those pages (logo, hero background, zone preview images) return 404 Not Found.

The website ran fine locally. The website ran fine in dev. The website broke in production the moment the Docker image hit Cloud Run. Static assets that existed in the repo were, according to Cloud Run, not in the container.

Here's the git log of the next 37 minutes:

  • 00:28 Fix Buffer type and add path debugging for Docker PNG serving
  • 00:40 Fix Dockerfile COPY order so public/ PNGs survive standalone overlay
  • 00:48 Debug: add directory listing to PNG asset route
  • 01:05 Fix .gcloudignore excluding PNGs from Cloud Build

Four commits. 37 minutes. The first two were guesses that fixed nothing, the third was a diagnostic, and only the last one was the fix.

The bug path went like this:

  1. First guess: it's a path resolution issue. Next.js in standalone mode sometimes has weird CWD behavior, and maybe the asset paths are resolving relative to the wrong directory. Added debug logging to show the runtime paths. Fix didn't work. Paths were fine.
  2. Second guess: it's a Dockerfile COPY order issue. When Next.js builds in standalone mode, it creates a .next/standalone/ directory that needs to be overlaid with public/ and .next/static/. Maybe the COPY order was wrong and public/ was being overwritten. Fixed the COPY order. Fix didn't work.
  3. Third guess (the turning point): stop guessing. I added a debug endpoint that lists the contents of the asset directory at runtime. Deploy, hit the endpoint, and the PNGs weren't in the directory at all. They weren't in the Docker image. That changed the diagnosis entirely. It wasn't a runtime path problem. It was a build-time "the files aren't even in the container" problem.
  4. Fourth fix (the actual fix): the files weren't in the container because .gcloudignore was excluding them from the Cloud Build context. .gcloudignore is the file that tells gcloud which files not to upload when you submit a build. My .gcloudignore had a line that excluded *.png at some level. I don't remember the exact pattern anymore, but the effect was that every PNG in the repo was silently stripped from the build context before Cloud Build even ran. Docker never saw them, the container never had them, and the static asset route returned 404.

It's 12:30am and my PNGs won't load in production. Every developer reading this has been there. You're tired. Claude keeps suggesting fixes that don't work. You start to wonder if the problem is fundamentally different from what you think it is. Eventually you stop guessing and dump state, and when you dump state the real problem becomes obvious in the next iteration.

The lesson from this 37-minute saga: when guessing isn't working, dump state. List the directory. Print the environment. Log the full payload. Show what's actually there, not what you think should be there. Every debugging session I've ever run that stalled got unstuck by switching from "why isn't this working" to "what does the thing actually look like right now." The two questions sound similar and are very different. The first invites theorizing. The second forces empiricism.
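
As an illustration of "dump state," here is a minimal Python sketch of the idea behind that debug endpoint. It is not the route from the actual site, and the function name and returned fields are made up for this example. It reports the working directory and every file that really exists under an asset directory.

```python
import os
from pathlib import Path


def dump_state(asset_dir: str) -> dict:
    """Report what is actually on disk, not what should be there."""
    root = Path(asset_dir)
    files = []
    if root.is_dir():
        # Walk the whole tree so nested assets show up too.
        files = sorted(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )
    return {
        "cwd": os.getcwd(),          # where the process thinks it is running
        "asset_dir": str(root.resolve()),
        "exists": root.is_dir(),     # False means the directory never made it in
        "files": files,
    }
```

Returning this from a temporary debug route (and removing it afterwards) answers "what is actually there" in a single deploy instead of another round of guessing.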

The other lesson: build context exclusions are insidious. .gitignore, .dockerignore, .gcloudignore: each of them silently drops files, and none of them emit loud error messages when they do. When files "aren't there" in a deployed environment, check the exclusion files early. It's the fastest-to-check and most-easily-overlooked cause.
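
A quick way to check exclusion files early is to ask which files in the repo match their patterns. The sketch below is a deliberately simplified approximation. It does plain glob matching only and ignores negation, anchoring, and directory-only rules, so treat its output as a hint rather than the exact behavior of gcloud, Docker, or git. The function names are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import Path


def load_patterns(ignore_file: str) -> list[str]:
    """Read non-empty, non-comment lines from an ignore file."""
    lines = Path(ignore_file).read_text().splitlines()
    stripped = (ln.strip() for ln in lines)
    return [ln for ln in stripped if ln and not ln.startswith("#")]


def excluded_files(root: str, patterns: list[str]) -> list[str]:
    """Return files under root whose name or relative path matches a pattern.

    Simplification: plain globs only. Real ignore files also support
    negation (!), anchoring, and directory-only patterns.
    """
    base = Path(root)
    hits = []
    for path in base.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(base).as_posix()
        if any(fnmatch(path.name, pat) or fnmatch(rel, pat) for pat in patterns):
            hits.append(rel)
    return sorted(hits)
```

Running `excluded_files(".", load_patterns(".gcloudignore"))` and scanning the result for files the app needs at runtime takes seconds, which is a lot cheaper than 37 minutes.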

Voice FX, styled text, music: the fun layer

Twenty-five minutes after the PNG saga ended at 01:05, at 01:30 the same night, a completely different kind of PR landed:

Add voice FX system, styled combo/hurry-up text, music refresh, and difficulty nerf (#34), 279 files

279 files. This is the fun stuff. The game gets personality in this PR:

  • Voice FX. When you chain a lot of jumps into a combo, a voice yells at you. "Sweet!" "Amazing!" "No way!" "Incredible!" The voice lines escalate with combo size. They're synthetic at this point (I'll replace them with real recorded voices later), but even synthetic voices radically change the feel of the game. A silent arcade climber is a chore. An arcade climber that yells at you when you're doing well is a game.
  • Styled combo/hurry-up text. The floating combo counter and the "hurry-up!" warning when the chase camera catches up both get a real visual treatment. Outlines, gradients, animation. They look like arcade game text now, not like debug overlays.
  • Music refresh. Background music for different zones, with proper crossfade when transitioning between zones.
  • Difficulty nerf. Testers said the game was too hard. Nerfed it. The follow-up PR introduced a whole new skill specifically for this.
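
To make "voice lines escalate with combo size" concrete, here is a small Python sketch of a tiered lookup. The thresholds are invented for illustration and the game's real values and implementation may differ. Only the four lines and their order come from the list above.

```python
from bisect import bisect_right
from typing import Optional

# Hypothetical thresholds: the combo size at which each line unlocks.
COMBO_TIERS = [(3, "Sweet!"), (6, "Amazing!"), (10, "No way!"), (15, "Incredible!")]
_THRESHOLDS = [threshold for threshold, _ in COMBO_TIERS]


def voice_line(combo: int) -> Optional[str]:
    """Return the highest-tier line the combo has reached, or None."""
    # bisect_right counts how many thresholds are <= combo.
    reached = bisect_right(_THRESHOLDS, combo)
    return COMBO_TIERS[reached - 1][1] if reached > 0 else None
```

Keeping the tiers in one sorted table means a tuning pass only touches data, not logic.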

Shipping the PNG fix and the dopamine-layer PR in the same session is the indie dev mood. Production hardening and the fun features happen in the same working session because there's only one of me and I can't batch them separately. The 00:28 commit is me cursing at Cloud Build. The 01:30 commit is me delighted that the game yells at players now. Same person, same night.

Balance boring work with fun work. Don't do only infrastructure or only features. The fun features keep the project feeling alive even when you're deep in the plumbing. If you're only doing plumbing, you'll burn out. If you're only doing features, you'll ship a pretty but fragile product.

The extensibility overhaul

I woke up on April 3rd, had coffee, and realized something. I wasn't pushing Claude Code to its full potential. The tool was starting to feel slow. Responses were longer than they needed to be. The model was making more mistakes than it had a week earlier. Prompts that used to feel magical were feeling ordinary.

What had changed? Not the tool. The tool was the same. What had changed was that my usage had settled into patterns, and the patterns weren't taking advantage of the extensibility features that would have made those patterns dramatically more efficient.

Specifically:

  • I was re-explaining the same project conventions to every new Claude Code session. CLAUDE.md helped, but I was also retyping specific instructions about /ship-style releases, Linear integration, commit message formats, and code review patterns.
  • I was running the same sequences of commands manually. Commit-push-PR-review-merge-changelog-Discord-Linear was an eight-step ritual I was doing from scratch every time instead of encoding it into something reusable.
  • I didn't have any domain-specific skills. Everything Claude did was "generic code assistant with project context." For tasks like difficulty tuning (where I'd explained the tuning rules to Claude multiple times) or level art generation (where I'd described the pipeline multiple times), I should have had skills.
  • I had no plugin setup. Every integration I used (Linear, Discord, Sentry, security guidance) was being rebuilt from scratch in every conversation instead of being installed as a reusable plugin.

All of this pointed in one direction: formalize the workflow into Claude Code's extensibility model.

Claude Code has four extensibility primitives:

  1. Plugins: reusable integrations, often with MCP servers, that provide domain capabilities (Linear, Discord, Sentry, SEO, etc.)
  2. Skills: user-invocable or model-invocable named workflows. /ship, /linear-review, /journal, /difficulty-adjuster. The reusable unit for "I want to do this sequence of steps."
  3. Agents: specialized sub-processes with their own context, dispatched for specific tasks. The checkpoint agent, the changelog agent, the level-generator agent. The reusable unit for "I want to delegate this kind of task."
  4. Hooks: shell commands triggered by Claude Code events (PreToolUse, PostToolUse, SessionEnd, PreCompact). Enforce rules, capture state, or trigger side effects.
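
As a concrete picture of the hook primitive, here is a hedged Python sketch of a PreToolUse check that refuses edits to sensitive files. It assumes, based on my reading of the Claude Code hooks documentation, that the hook receives a JSON payload on stdin with a `tool_input.file_path` field and that exit code 2 blocks the tool call and feeds stderr back to the model. Verify both against the current docs. The protected names and directories are hypothetical.

```python
import json
import sys
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical rules; adjust to whatever counts as sensitive in your repo.
PROTECTED_NAMES = [".env", ".env.*", "*.pem"]
PROTECTED_DIRS = {"secrets"}


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message) for a PreToolUse-style payload."""
    path = PurePosixPath(payload.get("tool_input", {}).get("file_path", ""))
    name_hit = any(fnmatch(path.name, pat) for pat in PROTECTED_NAMES)
    dir_hit = bool(PROTECTED_DIRS & set(path.parts[:-1]))
    if name_hit or dir_hit:
        # Assumption: exit code 2 tells Claude Code to block the tool call.
        return 2, f"Blocked: {path} is a protected file"
    return 0, ""


def run(stream) -> int:
    """Read one JSON payload from stream, report on stderr, return the exit code."""
    code, message = decide(json.load(stream))
    if message:
        print(message, file=sys.stderr)
    return code

# As a hook script, the last line would be: sys.exit(run(sys.stdin))
```

Splitting the decision from the stdin handling keeps the rule testable without launching a Claude Code session.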

On day 18, I committed the extensibility overhaul:

Overhaul Claude Code extensibility: plugins, skills, agents, hooks. 31 files

This commit enabled the full stack for my project. The already-existing /ship skill from shipping day was the proof-of-concept. After this commit, I added:

  • Security guidance plugin, hooks for protecting sensitive files
  • Language server plugins: TypeScript, Python, Swift LSP for proper typechecking during Claude sessions
  • Sentry plugin for crash investigation
  • Frontend design plugin for UI component guidance
  • Discord plugin for posting to the Geo Climber Discord
  • API plugin for the backend admin endpoints
  • Art plugin for the zone art generation pipeline

Plus domain-specific skills (/difficulty-adjuster landed the same day) and specialized agents (level-generator, level-implementer, product-manager, changelog, checkpoint).

The extensibility model is a qualitative change, not a quantitative one. Before this commit, my Claude Code sessions were "smart assistant with project context." After this commit, they were "team of specialists, each with their own scope and tools." The difference is not that things got faster (though they did). The difference is that things that weren't possible before now were. I could dispatch a level generation task to a specialized agent with its own tools and return to my main conversation without losing context. I could run /ship and have it invoke a code review agent in parallel with a CI watcher agent, both reporting back to the main session. I could fire off a /linear-review and have it interactively walk me through stale issues, updating them on my confirmation.

The thesis that came out of this day: when Claude gets slow or sloppy, it's usually a sign you haven't given it enough structure. Fix the structure rather than powering through. The instinct when things get slow is to try harder. The right response is to build tooling.

The second thesis: wait for workflows to stabilize before automating them. Premature automation encodes the wrong patterns. /ship worked as an automation because I'd been running that exact sequence by hand for weeks and it had converged on the same eight steps. If I'd tried to build /ship on day 3, it would have been built around a workflow that didn't exist yet and would have needed rewriting twice.

The difficulty-adjuster skill

One specific example from day 18 deserves its own callout because it shows the extensibility model applied to domain work, not just generic workflows.

PR #36 at 16:43 did two things in one commit: rebalanced the difficulty curve and added the /difficulty-adjuster skill.

Before the skill, every difficulty tuning pass was a multi-step explanation: "Claude, open game-content.json, find the difficultyProfiles section, here's how the timer values work, here's how the platform weight distribution works, here's what I want to change, explain why you're changing each value, don't break the easy/normal/hard separation, remember that floor 50 should take about 2 minutes in normal mode..." I was typing that paragraph every time I wanted to tweak difficulty.

After the skill, it's "Claude, make the game slightly harder on levels 1-3 but not the later ones" and the skill knows the config format, the weight system, the tuning rules, and the invariants I care about.

The skill is ~150 lines of markdown. It codifies everything I'd been explaining about difficulty tuning into a reusable context that every future difficulty-tuning session starts with.
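
An invariant like "don't break the easy/normal/hard separation" can also be checked mechanically after a tuning pass. The Python sketch below is illustrative only. `difficultyProfiles` comes from the description above, but the `timerSeconds` field, the sample values, and the assumption that a larger timer means an easier game are all hypothetical.

```python
# Hypothetical shape for the difficultyProfiles section of game-content.json.
SAMPLE = {
    "difficultyProfiles": {
        "easy": {"timerSeconds": 40},
        "normal": {"timerSeconds": 30},
        "hard": {"timerSeconds": 22},
    }
}


def separation_errors(content: dict, key: str = "timerSeconds") -> list[str]:
    """Check that easy > normal > hard still holds for one tunable value."""
    profiles = content["difficultyProfiles"]
    order = ["easy", "normal", "hard"]
    errors = []
    # Compare each profile with the next, stricter one.
    for looser, tighter in zip(order, order[1:]):
        if profiles[looser][key] <= profiles[tighter][key]:
            errors.append(f"{looser}.{key} must exceed {tighter}.{key}")
    return errors
```

A check like this complements the skill: the markdown carries the judgment, and a small script catches the cases where a tweak quietly collapses two profiles into one.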

Skills aren't just for workflows. They're for encoding domain judgment. Anything I find myself explaining to Claude repeatedly is a skill candidate. The difficulty-adjuster skill is the first domain-specific skill in the project, and since that day I've added five more. Each one removed a recurring friction point and freed me to focus on the decisions that actually need human judgment.

The payoff arrives on day 19

The extensibility overhaul was a huge amount of structural work on day 18. The payoff started arriving on day 19 and hasn't stopped since. The recording system in the next post, an in-progress AI player effort, the blog infrastructure you're reading right now: all of them were built using the workflow machinery that landed on day 18. Brainstorm, plan, subagent-driven implementation, ship. This is the workflow I now use for every non-trivial feature.

Without the extensibility overhaul, the rest of this series wouldn't exist. The recording system would have been a long chaotic chat session instead of a clean feature branch. The AI player design would have been a document I wrote alone instead of a spec that Claude and I iterated on together through the superpowers skills. The blog itself (this blog, the one you're reading) wouldn't have a /journal skill capturing conversations, a /draft-post skill turning journal entries into blog drafts, or the capture hooks that have been running quietly in the background this whole time.

Day 18 is the hinge point of the series. Days 1 through 17 were about building the game. Day 18 was about building the machinery that builds the game. Days 19 through the present are about using that machinery to build faster and more reliably than I could without it.

If you're running any non-trivial project with Claude Code and you haven't invested in the extensibility stack, invest. It's the single highest-leverage use of a few hours you can make. Set up CLAUDE.md properly. Install the plugins that match your stack. Build the first skill for a workflow you run every day. Build the first agent for a task you want to delegate. The compounding returns start the same week.

This is post 11 of 18 in a series about building Geo Climber with Claude Code. The extensibility overhaul is the hinge point of the project. Join the Discord and download Geo Climber on the App Store.

Tags: sentry, seo, docker, production-hardening, extensibility-overhaul, plugins-skills-agents, claude-code