Chesterton’s Wall

Ankil Patel

Human cognition has always been the wall. Are we blocking the view?

Most of what we call software engineering best practices are coping mechanisms for having a 1.5-kilogram brain.

Like other sorts of ancient tradition, our tribal software engineering rituals are mostly the result of latent fundamental limitations. Small functions. Single responsibility. DRY. Code review. These aren't laws of computation; they're accommodations for a neural processor that can barely juggle seven items in working memory, gets tired after a few hours, and can't reliably diff two files without external tooling.

Over decades, we have internalized these rituals so deeply that we’ve forgotten that they’re merely contingent. Constraints created the practices, but the practices have now outlived the constraints. Ask an engineer why functions should be short and you'll get post-hoc reasoning: testability, reusability, maintainability. All true, but all missing the point. Functions are short because we can't hold large things in our heads.

Chesterton warned us not to tear down a fence until we understand why it was built. But he didn't tell us what to do when the cattle are gone and we're still maintaining the fence. Call it Chesterton's Wall: the edifice that persists not because the constraint still binds, but because no one remembers why it was ever needed.

This isn't a new phenomenon. The QWERTY layout, by the usual account, was designed to reduce jamming on mechanical typewriters, and it has long outlived the typebars. We used to hand-optimize assembly because compilers were stupid. Entire careers were built on knowing which instructions were faster on which architectures. Then compilers got better, and that expertise became much rarer. Manual optimization didn't disappear, but it retreated to the edges: high-performance or extremely large-scale systems, hot loops, embedded systems. These are the places where the old constraint (limited compiler intelligence) still bound.

Likewise, we used to ration memory like it was precious, because it was. Data structures were chosen for space efficiency first. Then memory got cheaper, and we started trading space for time. Engineers who'd learned to count bytes found their instincts actively misleading them.

[Figure] As the price of memory goes down, the industry’s capabilities grow and infrastructure transforms to take advantage of the new constraints.

Consider B-trees. Wide, shallow trees dominate disk-based databases because seeks are expensive. We minimize reads by packing many keys into each node and keeping the tree shallow. Red-black trees solve the same problem (ordered key-value lookup) but assume the tree lives in memory, where deeper, narrower trees and more pointer chasing are affordable. Same abstraction, different structure, because the underlying constraint differs.
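The arithmetic behind "wide and shallow" can be sketched in a few lines. This is a rough illustration, not a model of any real storage engine: it assumes a perfectly balanced tree, treats each level as one node read, and uses a hypothetical branching factor of 256 for the disk-oriented tree. The function name `levels_needed` is made up for this example.

```python
def levels_needed(num_keys: int, branching: int) -> int:
    """Smallest number of levels h such that branching**h >= num_keys.

    Idealized: assumes a perfectly balanced tree in which every node
    has exactly `branching` children. Integer arithmetic avoids
    floating-point rounding in the logarithm.
    """
    if num_keys < 1 or branching < 2:
        raise ValueError("need num_keys >= 1 and branching >= 2")
    levels = 0
    capacity = 1
    while capacity < num_keys:
        capacity *= branching
        levels += 1
    return levels


ONE_BILLION = 1_000_000_000

# Hypothetical fanout of 256 for a disk-oriented, B-tree-like layout.
wide = levels_needed(ONE_BILLION, 256)   # 4 levels

# Binary branching, the best case for a balanced binary search tree.
narrow = levels_needed(ONE_BILLION, 2)   # 30 levels

print(f"wide tree: {wide} levels, binary tree: {narrow} levels")
```

With a billion keys, the wide tree is 4 levels deep while the binary tree needs at least 30. A real red-black tree can be deeper than this lower bound, since it is only approximately balanced. When every level costs a disk seek, 4 versus 30 is decisive; when every level is a pointer dereference in memory, the gap is much cheaper to tolerate.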

As memory got cheaper, entire database working sets moved into memory, and suddenly the B-tree ethos started to crack. Consider the whole apparatus for emulating memory with disk: the buffer pool, the page cache, the LRU eviction policies, the careful distinction between sequential and random I/O, the write-ahead log optimized for sequential appends, the checkpoint strategies. A nontrivial chunk of it is compensation for the disk constraint. Databases like SingleStore used skip lists from the beginning; why emulate random access with extra infrastructure when you have actual random access? Much of the disk-based infrastructure wasn't solving a real business problem; it was working around a hardware limitation. Remove the limitation and the infrastructure becomes dead weight. Engineers had to consciously decide how to adapt to new constraints rather than simply inheriting the disk-era default.

The B-tree story is about storage, but a similar pattern is now playing out in compute, in ML infrastructure. GPUs have low memory bandwidth relative to their compute, so you get pipeline bubbles - idle time waiting for data. To compensate, engineers build infrastructure: prefetching, overlapping compute with communication, careful scheduling, memory management layers. As models scale, the memory bandwidth constraint pushes us out of simpler parallelism strategies (data parallel, tensor parallel) and into pipeline parallelism, where the infrastructure burden explodes: microbatch scheduling, stage balancing, gradient synchronization across pipeline stages.
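To give a feel for why microbatch scheduling matters, here is one common idealized estimate of the bubble in a simple fill-and-drain pipeline schedule. It is an assumption brought in for illustration, not something taken from any specific system: with p stages and m equally sized microbatches, each stage is idle for roughly (p - 1) of the (m + p - 1) time slots. The function name and the example numbers are hypothetical.

```python
from fractions import Fraction


def bubble_fraction(stages: int, microbatches: int) -> Fraction:
    """Idealized idle fraction for a fill-and-drain pipeline schedule.

    Assumes every stage takes the same time per microbatch and ignores
    communication cost, so this is an approximation, not a measurement.
    """
    if stages < 1 or microbatches < 1:
        raise ValueError("stages and microbatches must be >= 1")
    return Fraction(stages - 1, microbatches + stages - 1)


# Hypothetical 4-stage pipeline with few vs. many microbatches.
few = bubble_fraction(stages=4, microbatches=4)    # 3/7 of the time idle
many = bubble_fraction(stages=4, microbatches=32)  # 3/35 of the time idle

print(f"4 microbatches:  {float(few):.1%} idle")
print(f"32 microbatches: {float(many):.1%} idle")
```

Under these assumptions, more microbatches shrink the bubble, which is why a pipeline-parallel setup ends up needing a scheduler at all. A single-stage setup has no bubble in this model, which matches the intuition that staying in simpler parallelism modes avoids the machinery.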

TPUs have higher memory bandwidth, which lets us stay in simpler parallelism modes longer. The bubbles shrink. Some of that pipeline parallelism infrastructure exists to compensate for a constraint that doesn't bind as hard on different hardware.

In each of these cases, as a constraint relaxes, a common practice becomes vestigial. Engineers who understand why the original practice existed adapt. Those who don’t end up cargo-culting: retaining rituals with no remaining purpose.

The constraint shifting now is cognition itself. What’s changing, specifically, are the limits on how much context a single "reviewer" can hold, how quickly quality code can be generated, and how exhaustively execution paths can be traced.

Here's what this looks like in practice. Last month, a colleague asked me to stop splitting my PRs. Not because partial PRs were wrong, but because they were confusing the AI code assistant. I'd written scaffolding in one PR - functions, types, structure - with the integration coming in a follow-up. From the AI's perspective, I'd merged in code that nothing called. It kept getting confused about how my change related to a bug it was fixing for another colleague.

I'd staged the work into smaller pieces for good, well-established reasons:

1. Human reviewers have limited attention. They miss bugs in large diffs. They get fatigued. Small PRs respect this constraint.
2. Reverts need isolation. When something breaks at 3am, you want to undo one thing, not untangle a mega-commit. Small PRs make rollback surgical.
3. Organizations need scope control. Small changes are easier to reason about politically. They're easier to approve, easier to ship, and easier to track.

AI is loosening the first constraint. The second is changing slowly. The third hasn't moved. The practice of "small PRs" was never one practice. It was three constraints bundled together because they happened to point in the same direction. Now they don't.

This is what it looks like when bundles decompress. Consider code review:

Catching bugs (cognitive -> AI can help)
Knowledge sharing (organizational -> unchanged)
Accountability and signoff (political/legal -> unchanged)
Mentorship (developmental -> unchanged)

If you think code review is "for catching bugs," you might conclude AI makes it obsolete. If you recognize it's a bundle, you realize only one leg of the stool is changing.
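The bundle view can be written down as data. The sketch below is only an illustration: the `Purpose` class and the `still_binding` helper are hypothetical names, and the entries are the four code review items listed above with their annotations copied as-is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Purpose:
    description: str  # what the practice is for
    constraint: str   # kind of constraint that purpose answers to
    status: str       # the essay's own annotation of how AI affects it


# The code review bundle, transcribed from the list above.
CODE_REVIEW = [
    Purpose("Catching bugs", "cognitive", "AI can help"),
    Purpose("Knowledge sharing", "organizational", "unchanged"),
    Purpose("Accountability and signoff", "political/legal", "unchanged"),
    Purpose("Mentorship", "developmental", "unchanged"),
]


def still_binding(bundle: list[Purpose]) -> list[str]:
    """Return the purposes whose constraint is annotated as unchanged."""
    return [p.description for p in bundle if p.status == "unchanged"]


print(still_binding(CODE_REVIEW))
# ['Knowledge sharing', 'Accountability and signoff', 'Mentorship']
```

Dropping code review because one entry changed would silently drop the other three. Writing the bundle out, in code or on a whiteboard, makes that cost visible before the practice is removed.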

The same unbundling applies to documentation:

Explaining what code does (cognitive -> AI can generate this on demand)
Recording why decisions were made (organizational memory -> still needed, maybe more)
Onboarding new team members (developmental -> mixed)
Contractual/compliance requirements (legal -> changing?)

Some documentation becomes redundant. Some becomes more valuable precisely because AI can't infer intent from code.

And to modularity:

Limiting what humans need to understand at once (cognitive -> shifting)
Limiting blast radius of changes (operational -> unchanged)
Enabling parallel work by different teams (organizational -> unchanged)
Creating boundaries for ownership and accountability (political -> unchanged)

The cognitive case for modularity weakens if AI can hold more context. The other cases don't budge.

So what happens next?

The engineers who navigate this well will look at existing practices and ask: “What limitations do these exist to address? Are those limitations still real? For whom?”

Some practices will survive with new justification. Small functions might persist not because humans need to understand them, but because they're easier for AI to regenerate and modify. Same practice, different reason.

Some practices will become vestigial. Elaborate inline comments explaining what code does? AI can just tell you. The practice might linger through inertia, but the constraint is gone.

Some practices will invert. Consolidation might beat decomposition in contexts where AI needs semantic context more than humans need small diffs. Your instincts may mislead you.

Some practices will conflict with themselves. Small PRs for human reviewers versus large PRs with lots of context for AI assistants. You'll have to pick which constraint you're optimizing for, consciously, instead of assuming they're the same thing.

The point isn't that AI is making best practices obsolete. It's that best practices were always contingent, and we have often forgotten which traditions were predicated on which fundamental constraints.

Every rule in your engineering culture exists because of some constraint - cognitive, operational, organizational, political, legal. Many rules exist because of several constraints bundled together. AI is perturbing the cognitive constraints specifically, unevenly, while leaving many others intact.

If you can't unbundle the practices into their constituent constraints, you'll either cargo-cult things that no longer help, or discard things that were never about cognition in the first place. Both failure modes will be common.

We've always outsourced cognition: to traditions, to institutions, to tools. AI is just the latest substrate. Today's big question is whether we understand what we're outsourcing, and what happens when the substrate becomes more capable than the constraint ever required.

So here we are, surrounded by Chesterton’s Walls. The new skill isn't simply tearing them down. It's knowing which walls are still holding cattle back, and which are just blocking the view.

Ankil Patel is a Member of Technical Staff at Basis.
