Control is the next frontier. Own as much as you can
Last month I wrote an article about how model companies are increasingly moving up the stack, absorbing more of the application layer as they seek to own the user experience and distribution. This article looks at application-layer companies working to flip the script.
Joel Monegro’s seminal 2016 article, Fat Protocols, argued that in the blockchain paradigm value accrues at the protocol layer, unlike the web era, where most of the value was captured by applications. The previous generation of shared protocols (TCP/IP, HTTP, etc.) produced immeasurable value, but most of it got captured and re-aggregated on top at the application layer, largely in the form of data (think Google, Facebook, and so on). Blockchains inverted this: protocols like Bitcoin and Ethereum captured significant value directly, through native tokens, before killer apps were even built.
ChatGPT wrappers no more
For the past year, a lot of great AI companies were dismissed as “just wrappers.” But here’s the thing: many of them were—at the beginning. It was the fastest path to market: wire up GPT-4, layer on some UI, and ship something useful. And if you found distribution early, you bought yourself time to do the real work. Now we’re seeing what that work looks like.
Look at the moves Harvey and Cursor have made in just the past few months. These companies aren’t content being front-end interfaces to someone else’s intelligence. They’re hiring researchers and engineers from the large research labs. They’re training their own task-specific models. They’re rebuilding the stack beneath their product layer, not because it’s flashy, but because they need to own quality, latency, privacy, and cost.

In this market, you don’t raise $900M of fresh funding unless you’re training your own models or looking to acquire a company. These application-layer companies are raising enormous amounts of capital because they realize they need to own more of the stack; being dependent on the research labs is a massive liability.
The best AI startups are evolving from wrappers into full-stack product companies. Founders should take note. This is no longer a game of who can call the OpenAI API in the most elegant way. That phase of the market is maturing. What’s next is defined by control: control over your user experience, your model behavior, your economics, and your edge.
Why Go Deeper?
At first, renting inference from OpenAI is a feature. Over time, it becomes a liability. Here’s why some are making the shift:
1. Platform risk compounds over time and you won’t see it coming
APIs are someone else’s product. That means their roadmap, pricing strategy, and reliability aren’t aligned with yours, and you’re at the mercy of decisions you don’t control. Maybe the API quietly changes how it handles prompts. Maybe costs double overnight, rate limits tighten, or a feature critical to your flow gets deprecated. None of these are hypothetical; they’ve all happened. And when they do, the impact isn’t just technical; it’s existential.
If your product relies on brittle prompt chains, system instructions, or quirks of model behavior, even small changes can break UX, degrade performance, or kill differentiation. Worse, these shifts rarely come with warnings or migration paths.
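To make that concrete, here’s a minimal sketch (assuming an OpenAI-style chat completions API) of the difference between floating on a model alias and pinning a dated snapshot. The alias can silently upgrade under you; the pinned version only changes when you choose to migrate:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Floating alias: "gpt-4o" can silently start pointing at a newer snapshot,
# so prompt behavior your product depends on may change without warning.
risky = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this contract clause."}],
)

# Pinned snapshot: behavior stays fixed until you deliberately move off it.
# (Snapshot name is illustrative; check the provider's current model list.)
pinned = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Summarize this contract clause."}],
)
```

Pinning buys you time, but it doesn’t remove the dependency; deprecation eventually forces the migration anyway.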
2. Costs compound fast
While renting inference is cost-effective early on, usage-based pricing quickly becomes a margin killer as usage scales. For businesses with high query volume or complex multi-step agents, per-token pricing creates unpredictable and unsustainable COGS. Owning your model — or at least fine-tuning and hosting it yourself — brings cost stability, better margins, and clearer pricing power. For AI-native companies, owning the stack is the only path to long-term economic defensibility.
For most startups at scale, OpenAI usage becomes one of the top two line items. And the pricing is outside your control. Training (or even fine-tuning) small task-specific models gives you pricing leverage, and in some cases, lets you run things locally or on cheaper open weights.
Just to put this in perspective: The Information published an article in March 2025 saying Cursor serves 200M+ queries per day. These are very rough assumptions, but if they were running everything on Claude 3.7 Sonnet:

Daily Inference Cost Estimate
- Volume: 200M+ queries/day
- Token usage: ~500 tokens/query (250 input, 250 output) → 100B tokens/day
- Pricing: $3/M input tokens, $15/M output tokens
- Input cost: 50B tokens → $150K/day
- Output cost: 50B tokens → $750K/day
- Total: $900K/day
If Cursor were running solely on Claude 3.7 Sonnet, it would be costing them close to $1M per day. Obviously, they’ve built their own models given how enormous that cost would be.
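Here’s that back-of-envelope math as a quick sketch, with every input treated as a tunable assumption, so you can swap in your own volume and pricing:

```python
# Back-of-envelope daily inference cost. All inputs are rough assumptions
# from the estimate above; adjust them to test your own scenario.
QUERIES_PER_DAY = 200_000_000
INPUT_TOKENS_PER_QUERY = 250
OUTPUT_TOKENS_PER_QUERY = 250
PRICE_PER_M_INPUT = 3.00    # USD per million input tokens
PRICE_PER_M_OUTPUT = 15.00  # USD per million output tokens

input_tokens = QUERIES_PER_DAY * INPUT_TOKENS_PER_QUERY    # 50B tokens/day
output_tokens = QUERIES_PER_DAY * OUTPUT_TOKENS_PER_QUERY  # 50B tokens/day

input_cost = input_tokens / 1_000_000 * PRICE_PER_M_INPUT     # $150K/day
output_cost = output_tokens / 1_000_000 * PRICE_PER_M_OUTPUT  # $750K/day

print(f"Input:  ${input_cost:,.0f}/day")
print(f"Output: ${output_cost:,.0f}/day")
print(f"Total:  ${input_cost + output_cost:,.0f}/day")  # ~$900K/day
```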
Where This Is Going
The best vertical AI companies won’t stay thin. They’ll be thick, opinionated systems. They’ll combine product UX, orchestration, retrieval, inference, and (eventually) custom models. They’ll use OpenAI where it makes sense, but they won’t be at its mercy.
And they’ll win not just because they ship quickly, but because they own what they ship: their UX, their latency, their intelligence, their brand.
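As a toy illustration of that posture, using a frontier API where it makes sense without being at its mercy, here’s a hypothetical routing layer that serves routine queries from a self-hosted fine-tuned model and escalates only the hard ones. The function names, difficulty heuristic, and threshold are all invented for the sketch:

```python
# Hypothetical routing layer: names, signatures, and the difficulty
# heuristic are invented for illustration, not a real library.

def estimate_difficulty(query: str) -> float:
    """Cheap heuristic: longer, multi-step queries score as harder."""
    steps = query.count("?") + query.count(" then ")
    return min(1.0, len(query) / 2000 + 0.2 * steps)

def call_local_model(query: str) -> str:
    """Stand-in for a self-hosted, fine-tuned task model (owned cost)."""
    return f"[local model] {query[:40]}..."

def call_frontier_api(query: str) -> str:
    """Stand-in for a rented frontier model (per-token cost)."""
    return f"[frontier API] {query[:40]}..."

def route(query: str, escalation_threshold: float = 0.7) -> str:
    # Serve the common case on owned infrastructure; pay the API
    # premium only when the query is likely beyond the small model.
    if estimate_difficulty(query) < escalation_threshold:
        return call_local_model(query)
    return call_frontier_api(query)

print(route("Rename this variable across the file."))
print(route("Refactor the auth module, then migrate the tests, then redeploy?"))
```

The economics follow directly from the cost math above: every query the small model handles is one you serve at owned-infrastructure prices instead of per-token rates.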
You don’t need to become a model lab. But you do need to know where your edge lives, and how much of it depends on someone else’s roadmap.
The takeaway: The era of the thin wrapper is ending. The companies that win in this wave won’t just talk to LLMs, they’ll build moats around them, shape them, and use them as tools in a much deeper product stack.
In addition, if you’re interested in joining a startup, whether now or in the future, fill out this form! I’ll be funneling some of the opportunities that come my way through it.