16 Comments
Mark Vickers

Thanks for the cogent reporting on this. I think my biggest question is about the depreciation mismatch portion: that is, H200s purchased today will compete against Rubin chips delivering ~9x the compute within 2-3 years.

In an inference- and energy-constrained system, does this really matter? Even if you're competing against more efficient chips, aren't you still making money if there's ongoing demand for your product?

Seems like the problem only appears if supply catches up with demand. If demand exceeds total available compute, don't even less efficient chips command premium pricing? It seems like, for the financing to break, equilibrium pricing must arrive before the debt matures.

Or am I missing something?

Kenn So

You got it spot on, Mark. The music continues as long as there's a shortage of compute (energy-constrained or otherwise) that allows inefficient compute to command premium pricing. I don't think we'll solve the energy shortage anytime soon - certainly not in 12-18 months - but I'm hopeful that market need drives new tech development that gets us to equilibrium faster.
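A minimal back-of-envelope sketch of that dynamic (every number below is a hypothetical placeholder, not a figure from the post): the financing only breaks if premium pricing fades before the debt is paid down.

```python
# Toy model: how long must compute scarcity (and premium pricing) last for a
# debt-financed GPU to cover what it owes? All numbers are hypothetical.

gpu_cost = 30_000                 # assumed all-in cost per GPU, USD
debt_term_years = 5               # assumed loan term
annual_rate = 0.12                # assumed rate (simple interest, for illustration)

premium_rev_per_year = 18_000     # assumed rental revenue while compute is scarce
equilibrium_rev_per_year = 4_000  # assumed revenue once newer chips are abundant

total_owed = gpu_cost * (1 + annual_rate * debt_term_years)

def revenue_over_term(scarcity_years: int) -> float:
    """Revenue over the loan term if premium pricing lasts `scarcity_years`."""
    return sum(
        premium_rev_per_year if year < scarcity_years else equilibrium_rev_per_year
        for year in range(debt_term_years)
    )

for scarcity_years in range(debt_term_years + 1):
    rev = revenue_over_term(scarcity_years)
    verdict = "covers debt" if rev >= total_owed else "breaks"
    print(f"premium lasts {scarcity_years}y: revenue ${rev:,.0f} vs owed ${total_owed:,.0f} -> {verdict}")
```

Under these made-up numbers the break-even sits around two years of scarcity, which is exactly the timing question Mark raises.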

Michael Fuchs

The flaw in GPU depreciation analyses is that they are based on technical obsolescence, which can be overcome if scarcity pumps up value, as the poster says. But obsolescence is not the real reason older GPUs lose their value fast. The real reason is that AI GPUs are run very hard and therefore at high temperatures. Processors suffer thermal degradation when run under such conditions. After only a few years of overheating, GPUs become useless, not merely outmoded. And you actually have to replace hardware you've had to throw away.

Kenn So

This is a great insight to add to the analysis. Do you have a good resource or data showing how hot these GPUs are run in GPU-centric data centers today, and the thermal degradation under those conditions?

Michael Fuchs

This is a useful writeup. It’s ultimately a vendor pitch (not mine!) for mitigating the problem, but still explains the problem well.

https://www.whaleflux.com/blog/safe-gpu-temperatures-a-guide-for-ai-teams/

atul mori

You may be oversimplifying by commoditizing all demand and all supply into a vanilla teraflop. In the future, with the growth of the inference market and corporate use of AI, these new demands may dictate unique needs. Some may require different security mechanisms, some may require dedicated capacity, and some may require new features that haven't been invented yet. Suppliers may therefore have to evolve to meet the unique needs of the demand side to survive and stay in business. (Notice that not all the retail stores around the corner went out of business because of the lower-cost Amazon retail model.)

Feisal Nanji

Missing the point entirely. AI investment is mostly about who gets to provide compute to the Western world. Only three companies have the legitimate power to furnish tokens to the Western world. It's about tokens, and the volume of tokens has exploded. Microsoft, Google, and AWS are the only capable providers of compute; the rest is small share and therefore noise. Think about a world where only three providers have the heft to provide most compute. Oligopoly confirmed. Oligopolistic rents are confirmed.

All this talk about financing is silly. This is the biggest economic transformation ever. Token volumes will explode as we make entire movies from a single context window.

Kenn So

Very clearly said. It could very well be what you've pointed out - what matters are the hyperscalers, and they can weather anything. I do think there's more nuance to it, since the supply chain financing is more complex and material enough in size - but the hyperscalers should be able to weather blips.

+1 to "entire movies from a single context window"

G88

Thanks for the writeup; however, I feel it's a bit missing the forest for the trees.

What I mean by this is that certain basic but important angles are not raised.

It is mentioned that the revenue is not proportional to the capex. But it should also be mentioned that the revenue generated is inflated by massive subsidies. If OpenAI and Anthropic generate more than 50% of LLM revenues but lose a lot of money doing so, a large share of that revenue and revenue growth is kind of fake, because it is artificial demand spurred by lower-than-natural prices. It should be added that the other providers also run at a loss.

I would also be curious about the supposed productivity growth mentioned in the article: do we know how much is spent on it, and exactly what sort of gain is attributed to it? Even if a process that takes up 1/10,000th of a company's total value-creation workflow is tripled in efficiency, it is not going to make a significant difference, and it won't be worth much for the company to spend on (overall).

So it seems like there is a supposedly massive TAM that might not exist at all.

As a note: with LLMs not being actually intelligent, and prone to providing false but plausible-looking information, are we even sure they actually drive growth that won't be invalidated later as problems down the line come up and compound?

And one more important thing that the writeup partly addresses, but not fully: if LLMs are not the best architecture, or if their algorithmic efficiency increases, the whole super-large compute buildup would be a massive waste of resources.

This is a large gamble that looks more and more like it won't pay off.

Kenn So

Good points raised - I'll respond to a few directly.

On subsidized pricing inflating demand: I'd actually push back on the framing that OpenAI and Anthropic are "running models at a loss." They may be losing money as companies (R&D, headcount, etc.), but serving the models themselves appears to be gross margin positive and improving. We don't have audited US financials, but two Chinese model companies (Zhipu and Minimax) have filed IPO prospectuses showing positive and growing gross margins - and this is in a market with much lower willingness to pay for software. Standard SaaS trajectory: margins improve as revenue scales.

On productivity gains needing to be material: Completely agree. If AI touches 1/10,000th of a workflow, tripling that efficiency is noise. The adoption data I cite (28% of S&P 500 reporting AI-driven revenue/cost impacts in earnings calls) suggests it's hitting more than peripheral processes, but you're right that the magnitude question remains open. ROI has to be there.

On architectural efficiency: Addressed this in the Deepseek section above - it's a real risk worth monitoring. My take: if a new approach is 10x more efficient, demand growth probably absorbs it. If it's 1,000,000x, there'll be a correction. The data centers likely get used eventually; the question is whether the financing structure survives to that point.
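As a rough illustration of why the size of the efficiency jump matters, here is a sketch under an assumed ~3x/year demand growth rate (an assumption for illustration, not a measured figure from the post):

```python
import math

# How many years of demand growth it takes to "absorb" an efficiency jump,
# assuming token demand grows ~3x per year (an assumption, not a measurement).
demand_growth_per_year = 3.0

for efficiency_gain in (10, 1_000, 1_000_000):
    years = math.log(efficiency_gain) / math.log(demand_growth_per_year)
    print(f"{efficiency_gain:>9,}x efficiency gain -> ~{years:.1f} years of growth to absorb")
```

Under that assumption, a 10x gain is absorbed in roughly two years of growth, while a 1,000,000x gain would take over a decade - which is the intuition behind treating one as absorbable and the other as a correction.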

On LLM reliability: I'd resist the binary framing of "not intelligent → provides false information → no value." The value proposition isn't perfection - it's whether the productivity gain net of error-checking still beats the alternative. Current adoption suggests it does for many use cases, though I'd agree it's uneven.

G88

Hi Kenn, it's great to engage in quality discussion.

1/ The data point about the Chinese companies' gross margin is very interesting. I am not too familiar with the Chinese market structure, which is why I have the following thoughts.

Let us say that in the US model, a user-initiated LLM request goes to the model provider. The model provider has a (better or worse, i.e., more or less accurate) token-accounting arrangement with the hardware owner, or has its own hardware.

If they are separate entities, as with OpenAI, the actual gross margin is unknown, because the company is charged an amount that might or might not have a tight connection with the actual cost of serving that request - which includes not only the cost of electricity, but also the interest on loans or the opportunity cost of the cash invested. Because the hyperscalers had a lot of cash, the real gross margin is quite unknown; it is hidden from us and perhaps even from OpenAI.

Given that the US providers have a very large chunk of the market, Chinese providers (who may have their own hardware) either also eat losses from buying NVDA chips (which we can assume because they bought quite a lot through Singapore and elsewhere, and it was not cheaper for them either) and are therefore also structurally subsidizing demand to be competitive, or Chinese LLM models or businesses are way more efficient than Western ones.

It would be interesting to see how their gross margin is calculated, whether it actually reflects the overall economic cost incurred, and what brings their net margins into the negative.

2/ I would put a bit less stock in S&P companies reporting AI-related growth or savings. As you also pointed out, magnitude matters, and there is no evidence pointing to large or even medium impact (and "case studies" like Klarna point in the other direction); reporting it is just a fashionable thing to do, like ESG or diversity policies once were. I have also seen surveys saying adoption has stalled or even fallen, and most companies saying their trials were not successful.

3/ For all these data centers to get used within a timeframe that doesn't bankrupt their owners, a lot of white-collar workers would need to be fired and replaced by LLMs in the next 18 months. Things can change fast, but I don't see this happening at all (whether that would be desirable is a separate discussion).

4/ This leads back to the reliability and modality issues. For reliability: does anyone really know how the productivity gain net of error-checking can be measured? What if errors compound, which introduces a time factor? Who bears liability if wrong information is served? Is the output actually useful, etc.? Very difficult to know. For modality: a human can still do more things than just output text.

Overall, in that respect it seems very unlikely that LLMs can drive enough value in that compressed timeframe, especially given that the next batch of chips and data centers can deliver maybe 9x the compute for the same cost (I don't know if that number is accurate; I just get the impression it's something like that).

It's a bit like choosing between buying 10 baguettes today for 10 bucks, or buying 1 baguette today for 1 buck and 90 baguettes tomorrow for the other 9 bucks. If we are not crazy hungry today (and the evidence is that we are not), we'd be insane to choose the first option.
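A toy version of that trade-off in code - the 9x figure is the rough number quoted above, the budget is arbitrary, and this deliberately ignores the revenue foregone by waiting, which is the scarcity counterargument earlier in the thread:

```python
# Compute acquired under different spending splits, assuming next-gen chips
# deliver ~9x compute per dollar (the rough figure quoted above).

budget = 10.0                      # arbitrary units of capital
compute_per_dollar_today = 1.0     # normalized
compute_per_dollar_next_gen = 9.0  # assumed next-gen efficiency

def total_compute(spend_now: float) -> float:
    """Compute acquired if `spend_now` is spent today and the rest on next-gen."""
    return (spend_now * compute_per_dollar_today
            + (budget - spend_now) * compute_per_dollar_next_gen)

for spend_now in (10.0, 5.0, 1.0):
    print(f"spend {spend_now:>4.0f} now, {budget - spend_now:>4.0f} later "
          f"-> {total_compute(spend_now):>4.0f} units of compute")
```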

James Bryant

Excellent analysis of the current AI cycle. I believe you are correct that credit will seize up first in some of the overleveraged data center operators and GPU cloud providers like Coreweave that have expanded rapidly without the luxury of hyperscaler balance sheets. Coreweave could also provide an example of the loan-document granularity you mentioned in your post - the kind of digging Burry did for the 2008 housing crisis - since it is a public company and there might be more visibility into its financing details (SEC 8-K and 10-Q filings, and any prospectuses). I know this is not all the detail required, but it would be a decent starting point.

It would be interesting to see a full post that does a deep dive into Coreweave to understand more about their financing structures, customer commitments and concentration risks (Microsoft is a very large customer), buildout timeframes, and current hiring plans. You could show a number of scenarios that determine how long Coreweave has until credit and liquidity become an issue, based on a range of assumptions. The one wild card for Coreweave is their $6.3B backstop with Nvidia to buy unsold capacity through 2032, and whether Nvidia would increase it if necessary.

Kenn So

Great idea - a Coreweave deep dive is a candidate for a future post. I looked at Coreweave's 10-K and 10-Qs while researching this article. Like you said, they don't have the details to really understand how lenders thought about Coreweave's GPUs, but I got clarity on how debt-reliant they really are - they need to constantly refinance, and their interest rates are crazy. Early debt was mid-teens, but they were able to bring it down to ~10% in more recent senior notes (there are different structures across the tranches, but that's the gist). If NVIDIA had not backstopped Coreweave, there would have been serious concerns about its solvency. But to be clear, Coreweave is an awesome business; it just has a shaky capital stack.
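For a sense of what that rate difference means, here is an illustrative calculation; the principal below is a made-up placeholder, not Coreweave's actual debt balance:

```python
# Illustrative only: annual interest expense at mid-teens vs ~10% rates.
# The principal is a hypothetical placeholder, not an actual balance.

debt_outstanding = 8_000_000_000  # hypothetical principal, USD

for label, rate in (("early debt, mid-teens", 0.15), ("recent senior notes", 0.10)):
    annual_interest = debt_outstanding * rate
    print(f"{label:>22}: {rate:.0%} -> ${annual_interest / 1e9:.1f}B interest per year")
```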

Elizabeth K. Whitney

Agree with much of this and with Burry. My view from the energy sector highlights physical constraints on power that I don’t think the markets are fully factoring in.

Caleb Wagner

Great analysis. I appreciate that you support your conclusions with hard data.

One tricky part with measuring the demand side is that "AI" can refer to lots of different things. Some of the most widely used machine learning methods (random forests, SVM, clustering, etc.) are not based on deep learning and have much smaller compute requirements. They'll usually run fine on a CPU or (max) 1-8 GPUs. The expected ROI for these methods is fairly well understood, and any increase in their usage is mostly orthogonal to the current GenAI, GPU-centric buildout.

So, for example, the relevance of the statement from Estee Lauder ("AI has driven a 31% increase in ROI from our North American media campaigns") is hard to judge. While they could be using GenAI, it's not clear.

Walmart is more direct: they do say "GenAI", which pretty much guarantees they're partnering with someone like OpenAI or renting GPU clusters directly.

I'll be following your future posts and Burry's as things develop!

Kenn So

Great call out. 100%, classical ML is still a great source of AI ROI (I started my career there). I've updated the slides to have clear GenAI examples.
