The missing step between hype and profit

In February, I picked up a flyer at an anti-AI march in London. I can’t say for sure whether the writers meant to riff on South Park’s underpants gnomes. But if they did, they nailed it: “Step 1: Grow a digital super mind,” it read. “Step 2: ? Step 3: ?”

Produced by Pause AI, an activist group that co-organized the protest, it ended with: “Pause AI until we know what the hell Step 2 is.”

If you don’t remember the 1998 episode, the gnomes steal underpants from dressers at night. Their pitch deck: Phase 1: Collect underpants. Phase 2: ? Phase 3: Profit. It’s become one of the great internet memes, used to satirize everything from startup strategies to Elon Musk’s Mars mission funding plan.

Right now, it captures the state of AI perfectly. Companies have built the tech (Step 1) and promised transformation (Step 3). How they get there is still a giant question mark.

Pause AI thinks Step 2 must involve regulation. But what kind and who enforces it remain open questions. AI boosters, meanwhile, are convinced Step 3 is salvation and tend to gloss over the middle bit. OpenAI’s chief scientist Jakub Pachocki told me a few weeks ago we’re racing toward sunny uplands on the back of an “economically transformative technology.” Everyone knows where they want to go (it’s hazy up there and still some way off), but they’re all taking different routes. Will anyone make it?

For every big claim about the future, there’s a more sober assessment. Consider two recent studies. One from Anthropic predicted which jobs LLMs will affect most. Managers, architects, and media folks should prepare for change; groundskeepers, construction workers, and hospitality staff, not so much. But these predictions are really just guesses based on what kinds of tasks LLMs seem good at, not how they actually perform in the workplace.

Another study from February, by researchers at Mercor (an AI hiring startup), tested several AI agents powered by top-tier models from OpenAI, Anthropic, and Google DeepMind on 480 workplace tasks that human bankers, consultants, and lawyers do regularly. Every single agent failed to complete most of its duties.

Why such wide disagreement? A few factors. First, consider who’s making the claims and why. Anthropic has skin in the game. What’s more, most people telling us something big is about to happen have reached that conclusion largely based on how fast AI coding tools are improving. But not all tasks can be hacked with coding. Other studies have found LLMs are bad at making strategic judgment calls.

When these tools are deployed, they’re not dropped into a cleanroom. They need to work in places contaminated with people and existing workflows. Sometimes adding AI makes things worse. Sure, maybe those workflows need to be torn up and refashioned around the new technology for it to achieve transformative status, but that takes time and guts.

That big hole is right where Step 2 should be. The lack of agreement on exactly what’s about to happen — and how — creates an information vacuum that gets filled by the latest wild claim of the week, evidence be damned. We’re so unmoored from any real understanding of what’s coming that a single social media post can shake markets.

We need fewer guesses and more evidence. That requires transparency from model makers, coordination between researchers and businesses, and new ways to evaluate this technology that tell us what really happens when it’s rolled out in the real world.

The tech industry (and with it the world’s economy) rests on the promise that AI really will be transformative. That is not yet a sure bet. Next time you hear bold claims about the future, remember that most businesses are still figuring out what to do with their underpants.
