Running on Data
The Thinking is the Hard Part

March 19, 2026
11 min read

Ideas have never been the scarce resource in software. Everyone has ideas. The hard part has always been something else: deciding which ideas are worth pursuing, who they’re actually for, and whether the problem you think you’re solving is a problem anyone cares about.

This isn’t a new observation. But something has changed recently that makes it worth revisiting.

Friction used to do the thinking for us

There’s an old joke: how do you stop a developer from writing code? Put a spec in front of them.

It’s funny because it’s true. The tension between building and thinking about what to build has always existed. Developers have always preferred the doing over the defining. That’s not a criticism; it’s just how the work tends to pull people. The interesting stuff is in the building. The spec is the bit you have to get through first.

But here’s what I think we’ve underappreciated: building software used to be hard enough that it created natural checkpoints. Multiple checkpoints, in fact.

If you couldn’t build it yourself, you had to convince someone to fund the building of it. Then you had to convince a developer (or a team of them) that it was worth their time to build it. Each of those conversations forced you to articulate the problem clearly. Each one was an opportunity for someone to ask “but why?” And even if you could build it yourself, the sheer effort involved meant you were more likely to pause and ask whether the thing was actually worth doing.

That friction wasn’t just a barrier. It was quietly doing useful work. It forced evaluation before execution. Not everyone did that evaluation well, but the cost of building at least slowed people down enough to think.

The filter is gone

AI-assisted engineering and vibe coding have dramatically lowered the barrier to building. In many ways, this is a good thing. More people can participate, iteration is faster, and prototyping is more accessible than it’s ever been.

The numbers bear this out. According to a16z and Sensor Tower, new iOS app releases surged 60% year-over-year in December 2025, after three years of being essentially flat (though it’s worth noting some of the underlying data has been debated). Y Combinator reported that 25% of their Winter 2025 batch had codebases that were 95% AI-generated. These aren’t hobbyists tinkering at weekends. These are funded companies shipping AI-generated code to production.

But the side effect is that the natural pause where evaluation happened has been compressed or removed entirely. The incentive to validate an idea before committing to it has weakened, because “committing to it” now costs so much less. People are diving into building because they can, not because they’ve determined that they should.

And here’s the thing: this is easy to do because building is addictive. Software engineering always has been. We’ve all been there. It’s the end of the day, time to go home, and you’re still trying to get your tests to go green. One more change. One more iteration. The jackpot is just around the corner.

AI-assisted engineering has made that loop faster and tighter. Many people have described it as being like a slot machine, and I think that’s exactly right. The cycle between spins is shorter, so the dopamine hits come more frequently, and it becomes even harder to step back and ask the bigger questions.

The initial prototype is easy. Getting something up and running quickly can be useful for proving out an idea. But it’s dangerously easy at that point to just keep going. Keep iterating, keep adding features, keep spinning. The momentum of building carries you past the point where you should have stopped to evaluate. What starts as “let me quickly see if this could work” becomes the product, and the thinking never happens.

In some ways, it’s a bit like what happens when credit is readily available. Banks lend freely, more businesses get started, but a proportion of them only exist because the money was easy, not because the fundamentals were sound. Eventually there’s a correction. The business didn’t fail because the loan was badly structured; it failed because the idea was never viable. AI tooling is cheap credit for software. It lowers the cost of building to near zero, so more things get built. But the ones that survive will still be the ones where someone did the due diligence first.

A quick but important distinction

There’s already a well-covered conversation about the quality, security, and maintainability of AI-generated code. Fast Company called it the “vibe coding hangover.” Veracode found that 45% of AI-generated code contains security vulnerabilities. There’s no shortage of commentary on that front.

That’s a real and valid concern, but it’s not what this article is about.

The problem I’m describing is further upstream. It’s about what happens before anyone writes or prompts a single line of code. You can have a perfectly well-built application that solves a problem nobody has. The code quality is irrelevant if the idea was never properly evaluated.

A METR study found that developers believed they were 20% faster when using AI coding tools, but were actually 19% slower. That gap between perceived and actual effectiveness is quite telling. People feel productive without necessarily being effective. I think the same dynamic applies to product thinking: building something feels like progress, even when no evaluation has taken place.

Not everything needs to be an app

I’ve watched this pattern play out quite a lot recently. Someone discovers they can now build a web front-end, something they’ve never been able to do before, and the novelty is compelling. So they start putting web UIs on everything.

The problem is that the users of these tools are increasingly working with AI assistants, scripting things, automating workflows. For them, a browser-based UI doesn’t remove friction — it adds it. Think of manually clicking through a web form to trigger a deployment pipeline when everyone on the team already has a terminal open and could run a single command. The builder’s experience improved. The user’s experience got worse.

And then there’s the feature creep driven by capability rather than need. “Oh, and it could also send notifications, and maybe we add a dashboard.” But who asked for that? What problem is it solving? What’s the purpose? It’s building for the sake of building, and it shows poor product ownership.

Now, there’s a useful distinction here. Kevin Roose at the New York Times coined the term “software for one” to describe personal vibe-coded tools, little applications built for your own use, and that’s absolutely fine. When you’re both the builder and the user, the evaluation step is naturally embedded. You know whether it solves your problem because you’re living with it every day.

Similarly, throwaway tools and short-lived scripts can absolutely be worth the effort. A quick automation for something tedious in your own workflow, or a prototype you build specifically to learn a new tool or technique. The value there might be the learning, not the output, and that’s a legitimate reason to build.

But even these aren’t completely exempt from thinking. There’s still a question: are you building this with intent — “I want to learn how this works” or “I want to save myself 20 minutes a day” — or are you building because building is the path of least resistance? A learning exercise has value. Spinning the slot machine because it feels productive doesn’t.

The trouble really starts when people apply the same “just build it” mindset to tools intended for other people. That’s where understanding the user’s actual problem is critical, and that’s what gets skipped. This isn’t a criticism of any individual. It’s a pattern playing out everywhere right now. The ability to build has outpaced the discipline to evaluate.

The Mom Test still applies

A colleague recommended Rob Fitzpatrick’s The Mom Test to me recently, and the timing was perfect. It was written for startup founders trying to validate business ideas, but the principles translate directly to anyone building tools for other people.

The core idea is simple: “they own the problem, you own the solution.” You don’t get to decide what the problem is on behalf of your users. You have to go and find out.

A few of Fitzpatrick’s rules of thumb hit particularly hard in this context.

“Some problems don’t actually matter.” Reduced friction means people are now solving unimportant problems faster and with nicer interfaces than ever before. The speed of building can make a trivial problem feel significant.

“If they haven’t looked for ways of solving it already, they’re not going to look for (or buy) yours.” If your users aren’t frustrated by the lack of a UI, a UI probably isn’t the answer. If they’re happily scripting something and it works, that’s a strong signal that a web front-end isn’t solving a felt need. It’s solving the builder’s desire to build a web front-end.

“Watching someone do a task will show you where the problems and inefficiencies really are, not where the customer thinks they are.” That’s the user-centric perspective that gets skipped when building is easy. Actually observing how people work, rather than assuming what they need.

The book’s emphasis on facts and commitments over compliments is relevant too. In an internal tools context, a polite “oh that’s cool” when you demo something is not validation. Someone changing their workflow to actually use your tool is. Fitzpatrick calls friendly non-buyers “a particularly dangerous source of mixed signals,” and I think that applies just as well to internal tooling as it does to startups.

Andrew Ng made a related point when he pushed back on the term “vibe coding” itself, arguing that AI-assisted development should be “a deeply intellectual exercise” rather than just “going with the vibes.” I think he’s right. The thinking is still where the real work happens, and framing it as vibes actively encourages people to skip it.

Will Wilson of the software testing firm Antithesis described a category of AI bugs he calls “evil genies”: the AI doing exactly what you asked for, but not what you actually needed. That’s the evaluation failure in miniature. Faithful execution of an idea that was never properly evaluated.

You don’t need to be building a startup for these principles to matter. Anyone building tools, internal or external, benefits from this mindset.

Even the most optimistic visions of AI-assisted entrepreneurship seem to acknowledge this. Linas Beliūnas’s “One-Person Unicorn” piece describes building a complete AI-powered startup operating system with twelve interconnected skills, and skill number one is idea validation — using Mom Test principles and GO/PIVOT/KILL decision criteria. The architecture itself puts evaluation before everything else. The tools can help you move faster through the process, but they can’t skip it for you.

Building is the easy bit

The democratisation of building is a net positive. More people creating things is good.

And to be clear: using AI to quickly knock up a prototype, a wireframe, a visual that helps you explain and communicate an idea? That’s a brilliant use of the technology. It’s in service of the evaluation, not a replacement for it.

The problem is when the prototype becomes the product and nobody ever circles back to ask whether it should exist. When building becomes a substitute for thinking, not an aid to it.

So the next time you’re about to start building, it’s worth pausing on a few things. Is this a real problem? Does someone actually care about it? How are they solving it today, and does your idea make their life easier, or just yours? And is the delivery mechanism — whether that’s an app, a UI, a script, or an API — dictated by the problem, or by what’s newly exciting to build? Are you optimising the 5% of the task, or the 80%? Where’s the actual return on value?

Tahir Fayyaz, a Staff PM at Databricks, described this evolution well in a recent interview with Bilal Aslam. A year ago, he said, he was just chatting to ChatGPT and telling it to do stuff, and ended up with a huge chat and no results. Now he starts by asking two questions: “What kind of work am I doing that’s repetitive?” and “What’s the work I’ve done before that’s the gold standard of what an LLM should replicate?” He knows what the output should look like before he starts building. That’s product thinking applied to AI tooling, not the other way around.

The skill that’s in short supply right now isn’t the ability to build. It’s the ability to evaluate. The hard part was always the thinking, not the typing.


If any of this resonates, The Mom Test by Rob Fitzpatrick is well worth a read. It was written for startup founders, but the principles apply to anyone building something for other people. Talk about their life instead of your idea. Ask about specifics in the past instead of generics about the future. Talk less and listen more.