AI, data centers, and power economics, with Azeem Azhar

This week I'm joined by Azeem Azhar. We started out intending to do a deep dive into data center power economics, with a particular focus on how AI-fueled demand changes the equations, and ended up visiting everything from the history of infrastructure overinvestment cycles to the future of the energy generation mix.
Ready to save time and close deals faster? Inbound security reviews shouldn’t slow down your team or your sales cycle. Leading companies use SafeBase to eliminate up to 98% of inbound security questionnaires, automate workflows, and accelerate pipeline. Go to safebase.io/podcast
Check is the leading payroll infrastructure provider and pioneer of embedded payroll. Check makes it easy for any SaaS platform to build a payroll business, and already powers 60+ popular platforms. Head to checkhq.com/complex and tell them patio11 sent you.
Timestamps
(00:00) Intro
(00:27) The power economics of data centers
(01:12) Historical infrastructure rollouts
(04:58) The telecoms bubble
(06:22) Unprecedented enterprise spend on AI capabilities
(11:12) Let's have your LLM talk to my LLM
(16:44) Is there a saturation point?
(19:25) Sponsors: SafeBase | Check
(21:55) What’s in a data center?
(24:52) The challenges of data centers
(29:40) Geographical considerations for data centers
(36:53) Energy consumption and future needs
(40:48) Challenges in building transmission lines
(41:35) The solar power learning curve
(43:51) Small modular nuclear reactors
(51:26) Geothermal energy and fracking
(01:01:34) The future of AI and energy systems
(01:12:57) Wrap
Transcript
Patrick: Hideho, everybody. My name is Patrick McKenzie, better known as patio11 on the internets, and I'm here with my buddy Azeem Azhar.
Azeem: It is so great to be here, I love the name of your podcast.
Patrick: Oh, thank you very much. So Azeem runs a newsletter called Exponential View, and we're going to be talking about the power economics, specifically that of data centers today. People might have heard recently in the news that OpenAI et al. are building a multi-billion dollar Stargate facility down in Texas. People might have heard some sort of hand-wavy or more evidence-based calculations that data center usage is going to be tens of percent of all power usage in the near future.
The power economics of data centers
And I think for folks inside the industry, these are eye-popping numbers, but they're somewhat expected numbers. For people who are outside the industry, this is all just a little bit wild. So let's take the very long view. How does this compare to other infrastructure rollouts over the last, say, couple of centuries and then go into the nitty-gritty? But speaking of things that are a couple of centuries old, we've been around this computer thing for a while, haven't we?
Azeem: We certainly have. I still have my first computer. It's a Z80 processor and the computer's called the ZX81, and I got it in 1981. So it's 43 years old, gonna be 44 this year and probably older than many of the listeners.
Patrick: The first modem I ever used was 300 bits per second. No, that is not misspeaking, for those of you who have had cell phones for your entire life. [Patrick notes: Middle age, it is a trip. One of the early markers of onset for me was, in receiving a compliment from the first boss I had had in a very long time, I said “Oh of course I’m good at this. I have been doing it since you were in middle school. … I meant that in every way other than the way it sounded.” … And that funny story is about as old as my son, who wants to get into programming for the Minecraft utility. Middle age, it is a trip.]
Historical infrastructure rollouts
Anyhow, let's talk about slightly more ancient infrastructure. So as we're doing this build out of data centers and the electricity—both generation and transmission apparatus that powers them—I'm sometimes put in mind of other societal-wide infrastructure buildouts. What's one that comes to mind for you?
Azeem: You know, we see these infrastructure buildouts every 50 to 60 years, roughly speaking, and the really, really big ones in the U.S. and in Europe were for the railways and for electrification. Electrification happened just over a hundred years ago, or eighty to a hundred years ago in the U.S. [Patrick notes: The Rural Electrification Act of 1936 is a useful marker for Americans who need one.]
One of the interesting facts around all of these build-outs is just how significant they were. If you look at the build-out of the railways in Great Britain in the 1840s to 1860s, roughly six to seven percent of GDP per annum went into capital investment to build out those rail lines, which in U.S. terms today, given the size of the U.S. economy, would be approaching kind of a trillion, a trillion and a half dollars.
The telecoms bubble
So whenever we see a general purpose technology like this, like artificial intelligence, like the internet, like telecoms and electricity or the railways, infrastructure needs to get built. And the sums of money that go into it are always eye-popping. And so too, by the way, are the dynamics of overinvestment as it's hard to forecast exactly when demand and where demand will arise.
Patrick: Byrne Hobart has a book Boom with Stripe Press that lays out this theory with the co-author, Tobias Huber. [Patrick disclaims: I previously worked for Stripe and, as one of my last job duties, reviewed a pre-publication draft of this. Byrne is a friend of mine.] And essentially he says that infrastructure overinvestment tends to happen with bubble dynamics but that the bubble is actually somewhat positive in that it causes a Schelling point for different people, firms, government entities, et cetera, that all have different pieces of the puzzle that would not coordinate to do a nationwide or even worldwide infrastructural revolution, but for some bubble dynamics.
And then often bubbles pop because, as you mentioned, it is virtually impossible to get these questions right a priori. But the demand sort of backfills over the intervening decades after the popping of the bubble.
I honestly don't know. There's substantial technical uncertainty, et cetera. This might be the bubble that doesn't pop. And it's always dangerous to say "this time is different," [Patrick notes: I apologize to friends with relevant equity interests for tempting fate in this fashion] but what I'm saying is there is some possibility that there is a discontinuity here with previous infrastructural build-outs.
But for people of our generation, who remember the dot-com bust very keenly—people remember the dot-com bust as being about the application layer, about Webvan and Pets.com; we had great dot-com domains back in those days. But by count of money invested, almost all of it went into coast-to-coast fiber and copper rollouts, and that substantially enabled the development of the modern Internet and its use as a platform for e-commerce, et cetera.
Azeem: Yeah, I agree with that. And in fact, the telecoms bubble is really salutary. The total investment in telecoms infrastructure—you described it as kind of coast to coast, but there was stuff happening outside the U.S. as well—was, as I recall, around $600 billion between 1996 and 2001. [Patrick notes: Many ballpark it there.] And US telecoms firms alone took on about $350 to $370 billion of debt in order to do this. And they did it at a time when the telecoms market was growing at about 7 percent per annum.
Unprecedented enterprise spend on AI capabilities
And so what feels really different about this AI market is that firms are growing much, much faster than that. I mean, 7 percent per week is not unheard of right now. And you're getting these early-stage startups that, like Cursor, are getting to $100-$150 million of annual recurring revenue within 18 months or so, which is an absolutely dramatic result. [Patrick notes: Finger to the wind, we historically expected great SaaS businesses to be at about a million dollars of run rate, not revenue, at that point. One million, not $100 million. One assumes Cursor has COGS far in excess of historical SaaS comparables but I can’t imagine the seed investors cry too much about them.]
And in fact, six months ago, we were looking at data that showed that AI-based software startups were growing three times faster than fast-growing SaaS startups in the pre-AI era. And that's compressed even further. And the reason that's different to the telecoms infrastructure buildout is that the revenues are probably ramping faster than the infrastructure growth is ramping. Whereas the reverse was true back in the telecoms bubble, which has been relabeled the dot-com bubble by some historians.
Patrick: Yeah. So I'm also hearing reports of this anecdotally. Don't treat the following as a quote from an informed market participant, but just treat it as a bit of market color on the grapevine.
At about Series C stage for raising investment in Silicon Valley, you continue to be quickly growing, but are sort of reinvesting in processes to support the next 10x over the next couple of years and getting, let's say, a little less cowboy about the operations. I have heard credible people claim positive knowledge of Series C stage companies that are growing as quickly as Y Combinator companies that are double-digit weeks old, with 10% week-over-week growth, etc.
[Patrick notes: One occasionally gets bits of market color on Twitter, in addition to private conversations.]
And not growth in a vanity metric or growth in eyeballs for viewing cat videos, but growth in enterprise spend on AI capabilities. Which, to the extent that one would trust that observation, is mind-blowing. [Patrick notes: Should you trust it? To you, it’s non-specific third hand market color. Should I trust it? It’s not a company that has contractually promised me information rights. Drats. Should I trust my interlocutor? The information sharing game is one with persistent reputations and repeated interactions over decades, and the ecosystem contains some people who have richly earned a reputation for telling fish stories, and some people who have not.]
Azeem: It's absolutely wild. And I've heard similar things as well. So presuming we're not hearing the same rumor sourced from the same person, let's assume that these are coming from different places, that seems to match somewhat with what I hear as well. [Patrick notes: It’s always fun at dinner when you and another person are trying to say “Wait are we both in on the same information that we can’t share directly or are we swapping independent anecdotes?” No, this does not usually get resolved by a protocol involving one-way hashes, though that would be pleasantly cyberpunk.]
And you can see it from other datasets. So there is a platform called OpenRouter which has access to a whole bunch of LLM APIs, and in some of their recent data, when I looked at it, I saw an 8x growth in token usage, as in tokens served by a set of these models over a somewhat less than one-year period. And so we're going from a few hundred million tokens per month to several billion tokens per month.
And of course, while token prices have come down, that is showing the elasticity of demand and the fact that the demand exists and the demand is not being saturated at all. So I sometimes feel that we are the blindfolded men walking around this elephant, and we have to sort of put the full picture of what's happening in the market together. And I don't really see many of the feelers that suggest we're not dealing with something that's big and fast-growing. I mean, typically every conversation I have—sometimes a bit skeptical about them—points to the kind of dynamic that you talked about, which is that the growth is really, really fast. It continues to be fast. And you know, as prices come down, spending goes up because you can basically make the economics work on newer use cases.
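[Patrick notes: For the quantitatively minded, a quick back-of-the-envelope on what that growth implies, with illustrative numbers in the same ballpark Azeem cites—these are my assumptions, not OpenRouter's data:

```python
# Back-of-the-envelope: what does "8x in somewhat less than a year" imply?
# Illustrative numbers only, in the ballpark cited in the conversation.
growth_multiple = 8     # 8x token volume
months = 11             # "somewhat less than one-year period"

monthly_growth = growth_multiple ** (1 / months) - 1
print(f"Implied compound monthly growth: {monthly_growth:.1%}")  # ~20.8%

# If prices fell, say, 4x over the same period while volume rose 8x,
# total spend still roughly doubled -- demand is outrunning price declines.
assumed_price_decline = 4
print(f"Implied change in total spend: {growth_multiple / assumed_price_decline:.1f}x")
```
]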
Patrick: I also think there's a comparative advantage, in understanding how this is going to scale, to having seen the dynamics of SaaS companies at scale—because I think some of the best-informed people with regards to what LLMs can actually do these days are folks that are playing with them every day.
But if your view on an LLM's capability is "I chat with Claude all the time. He seems very emotionally supportive. I've done this sort of song generation in Suno, which is a wonderful experience, by the way"—you're probably not predicting what those capabilities in an API plus a two-to-five year enterprise integration cycle looks like.
Because after that exists, it's not going to be you invoking one a couple hours per day, it will be everybody getting a staggering number of LLM invocations on their behalf every day, most in the background. The right model isn't a person having a conversation with an AI. It's more similar to what happens when you open up the New York Times and several hundred robots conduct an instant auction for your attention on your behalf. And there's an entire ecosystem of firms in adtech that make that happen, and you will never know most of their names.
[Patrick notes: DoubleClick rattles around in people’s memory as the acquisition that became Google AdSense, but few non-specialists would recognize an obscure company like Nexxen. They’re publicly traded and serviced ~200 billion ads. Today.
It seems very likely to me that, someday quite soon, there will be a company running 25 LLM invocations for every man, woman, and child on earth every day. It is far less likely that 25 is the ceiling. I don’t think that most listeners, or myself, will necessarily be able to name them off the top of the head.]
Azeem: Yeah, well, can I add to that? I mean, we already passed that crossing point on web traffic a couple of years ago, where 50 percent of traffic generated is bot traffic doing various things, both out on the open web and within processes inside the enterprise.
There are so many advantages in having LLMs or AI systems talk to each other, fundamentally because they can be much faster than we can. That's one of the things that emerges when you see these sort of optimized, distilled LLMs running at a thousand tokens per second: that is really, really significant in terms of ingesting information, making a decision on that information, and sending a signal back out to the next step in the process at millisecond speed, whereas humans work at minute speed.
And that's at a thousand tokens a second, rather than the five to eight tokens a second where we might sit at the very, very best of times when we're reading something, let alone when we're writing something. That velocity, I think, will in and of itself breed faster velocity. And that's why I love the idea of this being a complex system—and the Complex Systems podcast—because this is ultimately a complex system, right? With all of these feed-forward loops and flywheels.
Let's have your LLM talk to my LLM
Patrick: And let me give people concrete examples of what is going to happen very, very quickly. If you're a connoisseur of the fictional experience of watching rich people talk to each other [Patrick notes: a surprisingly popular subgenre, almost all inferior to Margin Call, a hobby horse of mine], a line that you hear a lot in those movies et cetera is "let's have your people talk to my people," where the two principals are aloof from their own calendar management but there's some implicit team in the background that takes care of that for them.
"Let's have your LLM talk to my LLM" is already happening and will certainly happen in the future in every conceivable way. As an example, I recently got a push notification, email, and paper letter, all from a bank asking me, "Hey, we haven't seen you update your address with us in a while. We need to be on top of your address. So do you have an update for us? If not, just tell us so." And that is totally going to be an LLM-driven conversation in the future. You can optimize out the stamp entirely, optimize out the annoying interaction with customer, and also optimize out probably all of the time on the bank side.
(Although there's probably not all that much human time on the bank side, given the nature of the current process. [Patrick notes: On reflection, having seen these done at financial firms, the process that generated that coordinated outreach may have taken a strike team of dozens of professionals grinding for a quarter or more to tick a box requested by a regulator.])
Azeem: Yeah. Can I give you a really concrete example that I use as well? So, one of the tools that I use is a network of LLMs. So I'll have a single LLM that acts as a kind of orchestrator, and I'll have several other LLMs that act as members of a focus group.
And I will put my question into the orchestrator, and the orchestrator will pass that question—something like "evaluate this idea" or "evaluate this product"—to the underlying agent LLMs; typically I'll use three or four. Each one of them will have a fairly detailed profile or a pen portrait, right? A persona: a marketing manager, or an investor. And the last one might be a recent college grad. And they will iterate and argue between themselves about the merits of this particular product, with a view to coming to some kind of distinct consensus.
And in that virtual focus group where I run three or four of these, they will go back and forth and they will generate tens of thousands of tokens. And then ultimately the orchestrator will respond to me. Now I do that to help me sense-check ideas that I might want to explore or research, or to kind of really push them or to find a very exacting through-line or weakness, set of weaknesses in the idea.
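[Patrick notes: For the programmers in the audience, a minimal sketch of the shape of this pattern. The `call_llm` helper and the personas are my placeholders, not Azeem's actual tooling: one orchestrator fans a question out to persona agents, lets them argue for a few rounds, then synthesizes.

```python
# Minimal sketch of an orchestrator plus a "focus group" of persona LLMs.
# call_llm() is a placeholder for whatever chat-completion API you use.

def call_llm(system_prompt: str, user_content: str) -> str:
    """Placeholder: send a system prompt plus user content to an LLM."""
    raise NotImplementedError("wire up your preferred LLM API here")

PERSONAS = {
    "marketing_manager": "You are a skeptical B2B marketing manager...",
    "investor": "You are a growth-stage investor focused on unit economics...",
    "recent_grad": "You are a recent college grad and heavy consumer-app user...",
}

def focus_group(question: str, rounds: int = 3) -> str:
    transcript: list[str] = []  # shared discussion visible to every persona
    for _ in range(rounds):
        for name, profile in PERSONAS.items():
            reply = call_llm(
                system_prompt=profile,
                user_content=(f"Question under discussion: {question}\n"
                              "Discussion so far:\n" + "\n".join(transcript)),
            )
            transcript.append(f"{name}: {reply}")
    # The orchestrator synthesizes the debate into a consensus view.
    return call_llm(
        system_prompt=("You are the orchestrator. Summarize the panel's "
                       "consensus, key disagreements, and weaknesses found."),
        user_content="\n".join(transcript),
    )
```
]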
And there are companies out there, like Electric Twin in the UK, that are building panels of virtual, synthetic personas built within LLMs. These conversations will happen much more rapidly and at much larger scale. That will let marketers do in silico what we might have had to do quite slowly and much more expensively in vivo with real focus groups.
[Patrick notes: A beautiful analogy which I’ll explain for anyone without a medical research background: in vivo is when you’re doing an experiment inside a living organism (e.g. Stage 2 clinical trials), in vitro (Latin for glass) is when you’re doing it in a test tube (which is cheaper / less risky, but perhaps less predictive, given that you don’t ultimately hope to liberate test tubes from the oppressive yoke of disease), and in silico is, by analogy, a still-cheaper software simulation.]
And I think that those are also examples that are beyond AI agents interacting on fixed processes and process flows where they're much, much more open-ended and actually the amount of resource that could go into those could be quite significant.
Patrick: So stepping back for a moment. Something I've used as part of my writing process for many years is to either have a formal review from other people or, more usually, an informal review where I conjure my mental model of a particular person in my head, read the piece as them, and ask, "Okay, what would this person in the industry think of this piece right now?"
Patrick Collison has described using that for his own writing process, and I likely stole the idea from him. At any rate, one can increasingly do that by telling an LLM "role-play as someone who has a large corpus on the internet." You don't necessarily need them to successfully anticipate more than 80-90 percent of what the person would say. It just functions as a wellspring of ideas to riff off of.
[Patrick notes: Claude is scary good at this. I am an advisor to Stripe, which pays somewhat more than $20 a month for that service, and before anyone writes me an email, they should ask Claude “What is patio11 going to tell me?” and go 3-5 steps into that conversation then write the email. The speed of iteration and the increased “floor” level of the resulting discussion more than make up for the gap between the artificial cognition and the genuine article.
You can do this for almost anyone with a public body of work and you do not need a consulting agreement with them to do it.]
Patrick: I did it over the weekend on something which was extremely professionally significant—significant enough that I was also spending social karma and no small amount of money with a number of external professional advisors. And the fact that the LLM needs no social karma at all, and trivial amounts of money, to step through role-playing as, say, five different potential audiences for a piece was revelatory for me in terms of the quality, speed of iteration, et cetera, of the advice I got.
And if early adopters like us are using this in production—and this was a real thing for me where a lot was riding on the outcome, not a fun test just to try out the new toy—you can imagine what happens when every marketing team in America, every marketing team in many places, is running, as you said, in silico panels.
Is there a saturation point?
Azeem: Let me help frame this as well. When DeepSeek was released at the end of last year—both V3 and R1—and then we had the flurry of excitement in late January, the idea of Jevons Paradox surfaced, right? Satya Nadella from Microsoft invoked it. And the point about Jevons Paradox is that, essentially, if you've got clogged traffic around Austin and you build another freeway, within a few months you'll have more traffic jams, because ultimately there's positive elasticity of demand. You reduce the cost and demand goes up, up until a saturation point.
And one of the questions is where is that saturation point with AI? The truth is, I think we are nowhere near, and by nowhere near, I can't even find the phrase for the size of the fraction of being nowhere near. [Patrick notes: I’ve got a phrase: “We have probably not written one basis point of the software necessary to cover our current needs, and we will discover more needs as we write the next few basis points.” LLMs getting good very quickly since 2020 does not cause me to expect less software, but it is causing me to rapidly reflect on how the craft of writing software will radically change.]
So much of what we do in business, and often in our personal lives, is fundamentally gated by the fact that we don't have enough time to think through to the optimal solution. So we just use a heuristic and we accept a bit of slack and a bit of loss. And essentially we're not going to do that anymore, right? We're going to let these AI agents do tons and tons of thinking, enormous amounts of it. Things we just did out of habit, we will get them to improve, particularly in business.
And so I don't think there's any end to the amount of demand that we could individually generate, either as people at home or in the workplace, for thinking to be done by these machines. And the second lever is that very few people and very few organizations are currently doing this. I mean, I think OpenAI has 125,000 people paying for the pro level of ChatGPT, which is absolutely phenomenal. And if you're in a job that pays $100k a year or more, you should be investing in that straight away, because your ROI will be incredible. [Patrick notes: That is the $200 a month plan.]
And so I think these two elements—how deep each person or organization wants to go, and how many of us are participating—are both going to expand very dramatically. And as prices for intelligence, per token, come down, the breakeven points will rise and will accelerate demand. And so that gets us to the question of: what is the infrastructure that's going to serve all of that?
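[Patrick notes: A toy illustration of the Jevons dynamic with assumed numbers: under a constant-elasticity demand curve, a price cut shrinks total spend only when elasticity is below 1. If demand is as elastic as AI demand currently appears, cheaper tokens mean more total spend, not less.

```python
# Toy Jevons-paradox arithmetic with an assumed constant-elasticity demand
# curve: quantity ~ price**(-elasticity). All numbers are illustrative.
def total_spend(price: float, elasticity: float, k: float = 1.0) -> float:
    quantity = k * price ** (-elasticity)
    return price * quantity

for elasticity in (0.5, 1.0, 2.0):
    before = total_spend(price=1.00, elasticity=elasticity)
    after = total_spend(price=0.50, elasticity=elasticity)  # 50% price cut
    print(f"elasticity={elasticity}: spend changes {after / before:.2f}x")
# 0.5 -> 0.71x (spend falls); 1.0 -> 1.00x (flat); 2.0 -> 2.00x (spend doubles)
```
]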
Patrick: So we'll get to that infrastructure in one second, but to remind people of a famous phrase in the technology industry—there was a point where intelligent people, well steeped in the worldwide demand situation, said that there was a worldwide market for perhaps five computers, and it turns out that we can deploy many more than five. [Patrick notes: Like many very hoary bits of apocrypha, it is in dispute that this was ever said.]
We are, I think, right now in the "five computers days" of LLM usage where we've applied it to the obvious applications. And people who are only seeing the obvious applications fail to appreciate what will happen once it gets injected into just about everything. And I feel pretty confident it's going to happen over the course of the next 10-20 years with new developments on a week by week, month by month basis.
What’s in a data center?
Let's talk about that infrastructure. So, data centers—we have a very diverse listener group for this podcast, and some people are intimately familiar with walking into a data center, some people less so. Let's set the scene first. You walk into a data center, look to your left, look to your right. What do you see?
Azeem: It looks like that scene in the Matrix when Neo and Trinity ask for guns and you're in a white room and just racks and racks of guns show up. And that's what you see in a data center. So you see very large numbers of racks that have within them pizza box sized computers stacked up and cooling systems at the back and power delivery.
And what is—I mean, in a sense, what you see today is not too different in my view to what we saw 20 years ago. In reality, everything is much denser, right? The networking bandwidth interconnects are running a hundred times faster than they used to. The power demand is much, much higher. I mean, I think the most recent H100 racks—the big NVIDIA GPUs that companies like Meta use—will be running at 130 kilowatts per rack. I think about that as 70 kettles.
To put that into context, the first servers that I ran to serve websites back in 1996 ran to 200 watts each, and I had four of them. So I had roughly 800 watts sitting there, including the little ethernet switch that was connecting them to the internet. So we've gone from 200-800 watts to 130 kilowatts a rack, and that's just the power demand. But of course you then have to cool this as well, and you have storage requirements as well. And roughly speaking, about 40 to 45 percent of this goes into actually doing the thinking, right? The flops and the processing. About 40 percent goes into the cooling, and then the rest is networking and redundancy.
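[Patrick notes: To make those round numbers concrete, the arithmetic—the kettle wattage and the energy split are approximations from the conversation:

```python
# Rack power arithmetic, using the round numbers from the conversation.
rack_kw = 130          # a modern NVIDIA GPU rack
kettle_kw = 1.8        # a typical electric kettle draws ~1.8-2 kW
server_1996_w = 200    # one of Azeem's 1996 web servers

print(f"Kettles per rack: {rack_kw / kettle_kw:.0f}")                  # ~72
print(f"1996 servers per rack: {rack_kw * 1000 / server_1996_w:.0f}")  # 650

# Rough split of where the energy goes, per the discussion:
split = {"compute (flops)": 0.42, "cooling": 0.40, "networking/redundancy": 0.18}
for use, share in split.items():
    print(f"{use}: ~{rack_kw * share:.0f} kW of a {rack_kw} kW rack")
```
]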
Patrick: The cooling often surprises people who aren't specialists in this, but it basically comes down to the fact that data centers don't get to cheat the laws of physics. If you pump X amount of energy into them, that energy has to go somewhere. [Patrick notes: First Law of Thermodynamics.]
And what you typically need to do in most locations is active cooling: taking the energy that you've pumped in and removing it to the external environment. In some locations that are very cold at certain points of the year, you can use passive cooling, but we're putting data centers all over the place, including places where that is generally not a viable strategy.
[Patrick notes: In active cooling you’re running a system which uses mechanical, chemical, or electrical means to achieve a higher rate of cooling. In passive cooling, you’re just relying on thermal equilibrium to do your work for you. Passive cooling works a lot better if you have a large temperature differential between your servers and the outside air.]
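[Patrick notes: A worked example of what "the energy has to go somewhere" implies for liquid cooling, using the standard sensible-heat relation Q = ṁ·c_p·ΔT. The 10 °C coolant temperature rise is my assumption; real designs vary.

```python
# How much cooling water does one 130 kW rack need?
# Sensible heat: Q = m_dot * c_p * delta_T (delta_T is an assumption).
rack_heat_w = 130_000   # essentially all electrical input ends up as heat
c_p_water = 4186        # J/(kg*K), specific heat of water
delta_t = 10            # K, assumed coolant temperature rise across the rack

m_dot = rack_heat_w / (c_p_water * delta_t)  # kg/s of water
print(f"Water flow per rack: {m_dot:.1f} kg/s, roughly {m_dot:.1f} L/s")  # ~3.1
```

Multiply by thousands of racks and the plumbing gets impressive quickly.]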
Azeem: Yeah, and so active cooling basically means we have to pump a liquid in, and we have to have a heat exchanger somewhere else on the other side. I think in Colossus, which is Elon Musk's data center, there's a fantastic video that shows the size of the final cooling pipes and they come up to your shoulder. They're pretty phenomenal.
The challenges of data centers
Patrick: So one final bit of color before we go into the recent developments in this: density drives so much of both the economics and the operational concerns of data centers. And as we've heard over the last couple of decades, they're getting more dense per square centimeter—cubic centimeter, I suppose, because height is a material thing. We stack these boxes on top of each other.
Why does density matter for the operator? Because a data center is fundamentally a real estate business, with some value added being the power, cooling, and onsite technical services. But as an example of a counterintuitive consequence of density and intensification: data centers—because they have huge amounts of electricity running through them, operating at temperatures as high as you can sustain without damaging the chips—are not the safest places in the world to be, particularly during some failure modes.
And so when I first got the badge that would allow me into a data center when I was working at a Japanese system integrator back in the day, I had to be given a briefing before I got the badge. Because there is a device on the wall called colloquially a big red button. And there are many genres of big red button in the world. The thing you really want your young engineer to understand before they walk into that room is, is this the big red button that just drops all the power in the room? Or is this the big red button that you have 60 seconds after you press it before every living thing in the room dies?
Azeem: The halon gas, right? Comes out to extinguish an electrical fire. [Patrick notes: Halon won’t kill you but a CO2 fire suppression system absolutely will. You’re unlikely to find one in a server room but this salaryman was expected to visit industrial sites, too, and so got the Every Machine In This Facility Can Kill You lecture.]
Patrick: That's one of the things your onsite safety engineer will be really, really interested in making sure that everyone taking even a guided tour of that room understands.
Anyhow. So, now, you know what it looks like in the data center.
Azeem: I want to add to this point, right? One of the things that you've described is that increasing density also has an impact on the physical real estate asset. Many data centers that exist today—if you've driven down freeways in parts of the U.S., you'll have seen these buildings that look like warehouses, but are not warehouses—turn out to be impossible to upgrade to these modern AI data centers, because they can't sustain the power delivery and the cooling delivery that the new chips require.
So as the chips get more and more dense, they get hotter, they need better cooling, they need more reliable power. And in fact, you need different physical architectures—you physically need new buildings as well. The kind of unintended consequence, I guess, of Moore's law and Huang's law and whatever else has replaced them in making chips more power efficient and more dense in flops per cubic centimeter, is that they need different buildings.
Patrick: Yeah, one of the more mind-blowing things in my career as a systems engineer—systems engineers build combination software and hardware systems; I was definitely more on the software side than the hardware side, and if I never own another server in my life, I will be very satisfied with that—but there existed a data center where the physical weight of the server racks exceeded the load capacity of the floor that the racks were on.
And you can't simply run a command on your terminal to make the floor stronger than the architect designed it to be. And at the point where you are saying, "Okay, we'd like to replace not just the thin metal shell that is on top of the floor, but no, actually, we need the structural floor replaced," then you start thinking, okay, it might be time to build a new building.
Azeem: I think that what you've just described is one of the things that's most misunderstood about the nature of this particular game. Because for the bulk of us, our experience of supercomputers is with things that weigh five ounces, right? It's our cell phones. And they've always gotten smaller and they've largely gotten lighter. That's always been the way they've been sold to us. That's true of laptops as well. When we think about our monitors, they get smaller and smaller on our desks.
And the miniaturization—the packing of more transistors onto every square centimeter or square inch of a die—has the reverse effect on physical architecture. It's a really important notion, and I think it makes things really complex when you start to think about it. We tend to depreciate buildings over a multi-decade period. You don't depreciate computer hardware over that period of time: it used to be four years, and now Alphabet and Amazon have moved that to six years. So you have this difference in tenor on the financing side and the depreciation side; it's much more complex than just upgrading your iPhone every three years.
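[Patrick notes: An illustrative sketch of the tenor mismatch Azeem describes. The dollar figures are made up; the 6-year versus 30-year asset lives are the point.

```python
# Illustrative straight-line depreciation for one hypothetical AI data center.
building_cost, building_life = 500e6, 30    # shell, power, cooling: decades
hardware_cost, hardware_life = 1_000e6, 6   # GPUs/servers: ~6 years per the
                                            # hyperscalers' current schedules

print(f"Building depreciation: ${building_cost / building_life / 1e6:.0f}M/year")
print(f"Hardware depreciation: ${hardware_cost / hardware_life / 1e6:.0f}M/year")
# The hardware can cost more than the building AND wears out 5x faster, so the
# financing has to be structured around two very different asset lives.
```
]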
Geographical considerations for data centers
Patrick: So, both in the historical perspective and in the near future perspective, where did we build data centers in the physical universe?
Azeem: We started—I mean, the very first data centers, allow me to go back to that, tended to be on cheap bits of land reasonably close to where customers were. In the U.S., Reston, Virginia is one example, because Bolt Beranek and Newman—one of the architects of the NSFNET, the precursor to the commercial Internet—was based over there. And I think that was one of the reasons why AOL ended up being there.
In the United Kingdom, there was a place called Docklands, which was very, very cheap light industrial land not too far from the City of London, whose banks were among the first users of high-speed cabling. And so you really thought about that particular dimension—relative proximity to customers, while the land stays quite cheap—and that dynamic has of course continued. And we know that the Northeast bit of the U.S. is really, really big for data centers. But I think there are now considerations around accessibility, fundamentally, to electricity, right? Is there sufficient electricity for the work that we need to get done?
Patrick: To add to the electricity point, some fun color on why we put data centers close to the customers: back in the day, it was more about putting them close to employees / skilled technicians. If your server breaks down, you might need your system administrator to drive out from Chicago to one of the suburbs to reboot it.
But say in the late 90s, network latency was not that huge of a consideration because who in the late 90s was doing anything where you could tell the difference between an 800 millisecond ping time and a two second ping time? [Patrick notes: Exaggeration for effect, fellow older geeks.] Fast forward to today, network latency is a primary consideration for where these go.
And so there are worldwide networks of data centers at the largest firms in capitalism, and also at firms selling to the rest of the economy—well, the largest firms in capitalism also sell to the rest of the economy, but we'll talk about that in a moment—built to optimize essentially for network latency. And then this power constraint is new. Data center usage until recently was probably a single-digit percentage of all national electricity demand in the United States. So no small amount, but we get no small amount of value out of computers, so that's fine. But with the densification, with the notion of entire buildings full of H100s running training and inference, we start to have real constraints: can we physically pump as many electrons through the grid as we need to? Call back to last week's episode.
Azeem: Yeah. But can I also put some context on this as well? Because AI is this technology which I think is incredibly important. It's also turned out to be very divisive in debates, both within the industry and outside the industry. And I can't really remember a technology that triggered such a split in people's perspectives.
Patrick: Oh, can I offer one? Cloud was in, say, the late 2000s to early 2010s, depending on where exactly you were in the world.
There were huge debates within the engineering community, and especially among the ones who were specifically hands-on-the-metal for most of their careers: will big businesses ever consent to use somebody else's server in somebody else's building for their most private customer data?
Azeem: Yes. There used to be this advert by an on-prem company—I forget which—and it said "it's not the cloud, it's someone else's computer" as a way of saying something negative and derogatory about the cloud.
Patrick: I remember working at a Japanese system integrator where we had a multi-year debate with the customer base, which were mostly universities in my department. This would have been in the late 2000s. And the universities would say, we really, really want to have all of our student information in a location that we control, where it will be safe, not in some building somewhere where we have no visibility.
The true engineering fact of the matter was the location they controlled was literally an unlocked broom closet on campus where anyone could walk in under the influence of a hangover or similar and walk out with all the data. And there was a bit of an adoption curve in the Japanese enterprise, but the somewhat stodgier members of the Japanese enterprise did get there a few years after the American enterprise and similar did.
But this was a live issue, as recently as 10-15ish years ago.
Azeem: The power issue has also been a longer-term issue than generative AI and large language models. One of the reasons I think this has become so present in people's minds is that there's a lot of skepticism about the value that AI brings. But before ChatGPT in November 2022—before most people knew this was going to be a thing—we'd already started to see places like Singapore, which hosts a lot of data centers, and Ireland, and a number of cities start to say: we can't provision any more data centers. And those were data centers for pre-AI uses, like moving customer data around and becoming a digital business.
And look at the CapEx of a firm like Microsoft, for example: in 2021, Microsoft was going to spend $20 billion on CapEx. Only three years earlier it was at $10 billion—doubling in three years. And this was well before the OpenAI deal had manifested itself, well before ChatGPT.
So one of the things I think we need to contextualize is that even before this gen AI thing, data center demand was growing really significantly, partly because of the point you made about the cloud, right? Companies want customers' data close to customers, and they're moving all of it off-prem. We are now seeing that accelerate, but it's not, first and foremost, something that is just about AI in my view. AI certainly accelerated it, but it hasn't been the sole spark for it all.
Patrick: And if I can give a shout out to Leopold's paper here, Situational Awareness—these are the sort of things which were obvious to some people back in the day, but they were not obvious to extremely informed planners of electricity demand for metropolitan areas and nations. They weren't obvious to hedge funds that were following the space, et cetera. They were conversations at dinner parties in San Francisco that said, "Hey, we might need a trillion dollars worth of new power build-out in the next couple of years. Trillion with a T, that's kind of wild. Huh. What could we do with that?"
[Patrick notes: I still think Situational Awareness is one of the most interesting striking deviations in form factor that I’ve seen on the Internet, and that is saying something. It is an excellent bid to play in what my friend (and previous guest) Dave Kasten calls the Essay Meta.
On the actual substance of it, the thing I most got out of it was a simple, clear argument that massive amounts of resources were already committed towards CapEx buildouts across a variety of domains, in search of the greatest prize ever. This was something I “knew” but had never really had click for me until I saw it on a single sheet of (virtual) paper.
Situational Awareness clicked for a lot of people, I think.]
Energy consumption and future needs
Azeem: A trillion of anything is a lot, but let's also just look at US overall electricity demand. I mean, electricity—energy in general—is wealth, and energy is prosperity, and energy is health. There are no countries with good outcomes for their people, broadly defined, that don't have high levels of energy consumption per person, whether you're efficient about it or not.
The thing about the US is that, as of 2020-2021, electricity demand was pretty much at the level of 20 years earlier. That is a really important thing to look at. Now, of course, you see energy efficiency, right? The switch to LED light bulbs is tremendous. You see environmental standards emerging and making people think much more about their energy efficiency. It's also good business, because electricity costs money, and if you can produce the same commercial output with less cost of inputs, that's more profit for you.
But at the same time, for it to be flat says something about the collective agreement by power providers to invest in capacity. And this came off the back of essentially a doubling of electricity consumption from 1975 to the point where it suddenly went flat. And at that time, we also started to see the electrification of certain types of passenger car transport, right? The Teslas show up and so on.
So I think that when we start to diagnose this, we have to also go a little bit further back. And I'm not a historian of the U.S. electricity system in great detail. It's kind of odd that it flatlines 20 years ago and we don't start to make those investments, frankly, either in the U.S. or in many parts of Western Europe.
Patrick: I'm also not exactly an energy economist myself. I would say one thing which probably contributes to it is that the U.S. has undergone some structural economic changes over the last several decades, including a substitution away from manufacturing.
Manufacturing output in the United States is as high as it's ever been; manufacturing employment is lower. People sometimes confuse those two. That said, we largely shifted from a manufacturing-focused economy to a services-focused economy, and per dollar of output, services use various resources, including electricity, less intensely.
But I do agree that there was a failure to anticipate future needs. And in the United States, in many places in Western Europe, and in many places near and dear to the hearts of listeners of this podcast, there's been a real reluctance to build things in the physical world. It's almost like we have lost the will, the knowledge, or the capacity to do so in some places, in ways that seem absolutely mind-boggling.
I lived 20 years in Japan. And Japan has many problems, but a refusal to build buildings is not one of them. And then look over the ocean to China. China certainly has not forgotten how to do solar deployments, for example. And I think one of the most crucial things in this moment we find ourselves in is rediscovering the—complex system, ba dum bum—that will allow us to actually build the infrastructure that our future economic needs depend on.
Azeem: Yeah, I absolutely agree with that. I think with China, we see a willingness, a desire at senior levels of government, but also an acceptance amongst people that infrastructure is really valuable. And it's not just solar manufacturing capacity: it's also deployment of solar, at utility scale and on rooftops. It's the rapid build-out of nuclear power stations. It's high-speed rail. It's also transmission.
Challenges in building transmission lines
One of the things that of course is really challenging in the U.S.—a lot of which is to do with market structure and regulation—is building transmission lines. But China has 34 ultra-high voltage transmission lines that are very energy efficient and don't leak a lot of energy over those long distances, totaling tens of thousands of miles. And one of the things that does, especially when you deal with intermittent resources like solar and wind, is that allows you to move the electrons to where they need to be consumed. If it's sunny in one place and they're not being consumed locally, you can move them to where they're needed.
The solar power learning curve
Patrick: We had a discussion about this a few weeks ago with Travis Dauwalter on the changing needs of transmission lines in the United States. And if I can elaborate just slightly more on what you've said, I think one of the most important facts of energy economics has been the extreme performance of the learning curve for solar power versus cost over the course of the last 25 years.
I remember at the time I graduated university, about 2004, it did not look likely that solar was ever going to be economical against coal, for example, absent huge subsidies for social reasons. And it turns out that not only did we continue down the cost curve, we actually bent that curve: the learning accelerated as billions, tens of billions, hundreds of billions of dollars of investment went into solar deployment.
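[Patrick notes: The standard way to model this is Wright's law: cost falls by a fixed fraction with each doubling of cumulative production. A sketch with an assumed 20 percent learning rate, a commonly cited ballpark for solar modules; the starting cost index is arbitrary.

```python
# Wright's law: cost = initial_cost * (cumulative / initial) ** log2(1 - LR)
# The 20% learning rate is an assumed, commonly cited ballpark for solar.
from math import log2

def wright_cost(cumulative_gw: float, initial_gw: float = 1.0,
                initial_cost: float = 100.0, learning_rate: float = 0.20) -> float:
    exponent = log2(1 - learning_rate)   # ~ -0.32 for a 20% learning rate
    return initial_cost * (cumulative_gw / initial_gw) ** exponent

for gw in (1, 10, 100, 1000):
    print(f"cumulative {gw:>4} GW -> cost index {wright_cost(gw):5.1f}")
# Roughly halves per 10x of cumulative deployment: 100 -> 48 -> 23 -> 11.
```
]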
And so the solarification of the energy grid is probably one of the most central aspects of the coming infrastructure wave. But solar is not the only power generation technology that is going to shake things up over the next decade or two. You mentioned China has been doing large-scale nuclear build-outs, which I kind of feel a little bit jealous of. But do you want to say a few words about the hottest—ba dum bum—new nuclear technology that we might be co-locating with data centers in the near future?
Azeem: Well, before we get to the near future, let's just talk about China's electrical capacity. They added about 335 gigawatts of capacity in 2023; it was 29 gigawatts in the U.S. So that's the scale of where we've got to. And I think the learning curves with solar actually start much further back. Back in '73 or '74, there was a James Bond film called "The Man with the Golden Gun," where the British spy has to steal back solar technology. It was so important that they sent the best secret service agent in the world to get it.
And now solar panels are so cheap that in Germany they've actually fallen below the price of fence panels. And you're starting to see people build vertical balcony installations, and fences between themselves and their neighbors, out of solar panels—mounted vertically they don't catch as much sunlight, but it's cheaper and it generates some electricity for you.
Small modular nuclear reactors
I think we're hoping for quite a lot from nuclear, and in particular small modular nuclear reactors. The idea of a small modular reactor is all in the name: it's meant to be small and it's meant to be modular. What's the benefit of that? You tend to see better learning effects when you make more of something, and you get better learning effects when those things are modular—built as products rather than as projects.
And so one reason why solar has had these amazing learning effects is that the panels, whether on my rooftop or in a solar field in Texas, are essentially the same. Of course the implementation is slightly different, because one's on a roof and the other's on flattish ground, mounted in particular ways. Nuclear reactors, by contrast, have been built as bespoke projects. And in many cases, they're sort of N of one, right? So you have to start from the beginning.
Patrick: A multi-decade bespoke engineering process where we get very, very little learning between the nth and nth plus one iteration of it.
Azeem: Right, absolutely. And the idea behind the small modular reactor is that you can build these things in a modular fashion, so you can get learning effects because they are small. You scale out, rather than magnifying the scale of each unit: if you want more capacity, you buy more of them. And frankly, that's what we've done in the computer industry, right? If you need a super powerful computer, you don't go off and get a massive mainframe with huge chips. You go and get 10 H100s and stick them together, or a thousand H100s and stick them together.
Patrick: This also works well with the nature of data center electricity demand, because for a large-scale nuclear power plant that would produce enough electricity for a large fraction of a city, you don't have full control in siting where that plant is. You probably can't justify putting one directly next to the newest data center that you popped up by the freeway. But for a small modular nuclear reactor that might fit in a footprint about the size of a standard shipping container? Sure, put one right next to every data center. Put two if you want.
Azeem: And I think the other thing about the small modular reactors is that theoretically, they are safe by physics, by the laws of physics, as opposed to safe by layers and layers of containment systems and safety systems. So in some sense, they are much more appealing.
I guess the issue around the SMR that we have to recognize is a TRL—technology readiness level—risk, at least in the West. There are some SMR units operational in China and Russia, and I'm not sure how quickly we're going to import them into the U.S. or into Europe. And there are lots of companies building new reference designs, and we are hoping to see them take off. In a sense, if there's a tailwind of demand and capital available, you could potentially scale these out much faster than we have scaled out nuclear plants.
But I think there is a recognition also that you need the electricity provisioning today, which is why we're starting to see gas generation at some of these bigger data centers, whether it's Meta's or xAI's Colossus.
Patrick: So you mentioned TRL there. Can you say a few more words for the benefit of the audience?
Azeem: Yeah, technology readiness level. So it's a kind of standard level of technology readiness that runs from whether something is really at the high-risk scoping stage through to we know exactly how to build it, how to price it, how to implement it, what its total lifecycle looks like. And in a sense, small modular reactors are sort of lower down that scale, probably. And I'm guessing—I'm kind of extemporizing slightly—but certainly in the fours and fives and sixes rather than in the nines and tens, and that creates a certain degree of risk and uncertainty of what the outcome looks like.
Patrick: There's also, of course, a regulatory slash political will issue about nuclear reactors. They were—I'm going to make a terrible pun, but I am a dad, I get to do dad jokes—politically radioactive for a number of years in many Western democracies. And I think we are in a moment, the last couple of years, where we can revisit that: partly through engineering fact, the new technology simply being safer than existing technologies, but partly because for many of the last couple of decades we had good substitutes, or at least acceptable substitutes, for base load demand—liquefied natural gas, et cetera.
And things have changed. One is the climate issue, of course: we would strongly prefer to avoid combusting hydrocarbons. And the geopolitics of energy have changed quite radically over the last 10 years or so, to the point where in much of Europe—where there is a relatively extreme level of political engagement around environmental issues—the choice is between fulfilling all of one's preferences with respect to domestic constituencies that are vociferously anti-nuclear, or being warm during the winter. And when push comes to shove, many of our truest and dearest friends over there will probably choose to be warm during the winter.
Azeem: Yeah. Coming back to small modular reactors, though: Google has this deal with Kairos Power for, I think it's six, maybe seven small modular reactors, and I think 2030 is the delivery time. So we're talking five or six years out. A couple of interesting things about this: given that it's seven, we are going beyond first-of-a-kind, and first-of-a-kind tends to be the really expensive one.
And I did write about this a few months ago, saying this is kind of a Google gift to humanity, because the learning curves—the learning experience—will be shared by all of us, and they're the ones paying the price to bring these things out at a higher cost. But look at the timing: it's six years out, and that's only seven reactors, and these are small reactors. The electricity requirements across the U.S. economy are really enormous, and we have to ask how quickly this can actually, fundamentally scale up.
But what Google did was address a roadblock in this complex system, which is mezzanine financing: not venture-capital-level risk, but nor is it the low-risk, guaranteed return of asset finance. And that has been an issue with a lot of these energy and electrification hard technologies. When you're developing the IP—the intellectual property around which the equity value and the extreme return come—venture capitalists are willing to take that risk.
But venture capital is a really small asset class, and it can't fund infrastructure projects. And once you've built the first one, you still have risk—a lot of deployment risk, and all of the learning effects—before it turns into something steady and stable, like a solar farm or a wind farm, where infrastructure investors come in and ask for very, very steady-state returns with very little chance of the extreme upside that the venture capitalists are after.
So you have this middle period, which is kind of first-of-a-kind financing risk, which has been really, really difficult to address. It's known amongst people in climate tech as one of the valleys of death—there are many valleys of death and it's a struggle to cross it.
Geothermal energy and fracking
Patrick: This is an amazingly fortuitous topic for you to bring up because we didn't plan this in advance, but I've actually spent a good portion of my professional cycles the last two years volunteering with a focused research organization that is attempting to popularize next generation geothermal.
And it is exactly that problem. There are VCs who are willing to write checks against the hopefully defensible IP for power generation using next-generation geothermal, but every time you do an experiment in the field, you need to spend $20 to $60 million to have Halliburton provide you professional services. And what is the thing they're going to do for you? They're going to dig a really deep hole. It's $60 million a hole. Let's go. [Patrick notes: Alright, technically speaking you get a few holes. Austin Vernon discussed the mechanics of fracking, which are quite similar, if you want to hear more about pads, holes, and subsurface engineering.]
As you said, that is challenging in VC land. In a world where all the technology risk has been shaken out, there are virtually unlimited amounts of capital available to do this. So the oil and gas industry in the United States, for example, it's the same people digging functionally the same holes. But can you go to a bank or other sources of capital and get the marginal gas well financed? Absolutely. There are people who do that every day. Give us your engineer's numbers. I'll put it in my spreadsheet. Green. We are a go—you'll have your wire tomorrow.
And so a bit of the last two years for me has been attempting to cheat the valley of death on behalf of this FRO. [Patrick notes: knock on wood, hope I’ve been helpful, and I wish them every success.]
Azeem: Can I ask about your experience there? That's super interesting to me. What is next-generation geothermal, as opposed to plain geothermal?
Patrick: Sure. So, the brief version is that the geothermal that most people think of is places where heat energy from the earth bubbles up so close to the surface that you can physically perceive it in some cases. So hot springs, geysers, et cetera. When you think of the places in the world that are the largest geothermal energy producers currently, you think of places like Iceland, and they have been blessed by nature with the particular subsurface formations that give them abundant access to geothermal.
Most places in the world are not similarly blessed. However, thanks particularly to the fracking boom in the United States, we've gotten much, much better at drilling to depths that were not economically drillable before. If geothermal can only tap energy available near the surface of the earth, you have to be blessed by nature to do it.
If, on the other hand, you can go down, say, six to ten kilometers, then essentially everywhere is blessed by nature: 90-plus percent of the continental United States is a number I've heard thrown around. There is still technology risk, though. Digging the hole, sure, but you need to figure out what generation station you put on top of the well, and what the curve looks like for heat in the immediate vicinity of the well that you have fracked.
Heat is a limited resource when considered from the hyperlocal perspective. [Patrick notes: You can round it to infinite throughout the earth’s crust.]
So you tend to get a trailing off of generation over some timescale, and we're really looking to those next 1, 2, 20, 40, 100 wells to see what those curves look like. And then it's just a numbers game. In one version of physical reality, this is not cost-competitive with other forms of heat or electricity generation. In another version of physical reality, we have free, clean, abundant energy available in large portions of the world. So, ask me in ten years which reality we're living in. I will have a very confident answer in ten years, but I don't currently.
Azeem: And what's the price that you think is cost competitive?
Patrick: Ooh, great question. Not a number that I had cached off the top of my head because I didn't know what we were going to be talking about.
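[Patrick notes: Since I whiffed on the number live, here is a minimal sketch of how the numbers game works. Every input is a hypothetical illustration, not a figure from the episode or from any real project; the point is the shape of the spreadsheet, not the values.]

```python
# Toy geothermal well economics. All inputs are hypothetical illustrations,
# not figures from the episode or from any real project.

def breakeven_year(price_per_mwh: float,
                   drill_cost: float = 20e6,      # cost of the hole, dollars
                   initial_mw: float = 5.0,       # initial net output, MW
                   annual_decline: float = 0.05,  # the "curve" in question
                   opex_per_year: float = 1e6,
                   horizon_years: int = 30):
    """Year in which cumulative net revenue covers the drill cost, or None."""
    hours_per_year = 8760
    cumulative = 0.0
    for year in range(1, horizon_years + 1):
        output_mw = initial_mw * (1 - annual_decline) ** (year - 1)
        cumulative += output_mw * hours_per_year * price_per_mwh - opex_per_year
        if cumulative >= drill_cost:
            return year
    return None

# Two "versions of physical reality" with identical geology, different prices:
print(breakeven_year(price_per_mwh=60))   # None: never pays back the hole
print(breakeven_year(price_per_mwh=100))  # 8: a fundable project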
Interestingly, just to say a few more words on why fracking matters—the oil and gas people love to call it subsurface engineering, and we had a prior episode about it—fundamentally you drill a hole, pump a working liquid down it, and use that pressure to break the rock in the vicinity of the hole. There are a few different technologies to do this; can't go into the entire thing here. Sorry, folks.
But in one version of the system, you pump water or some other working liquid, which filters into the cracks that you've made. [Patrick notes: “enhanced” geothermal systems (EGS), as compared to “engineered” geothermal systems (... also EGS), in one of the great Somebody Should Have Hired A Marketer Years Ago episodes of my career.]
And because the cracks look fractal in nature, the surface area in them is absurd relative to the diameter of the hole, so you can pull heat from the surrounding rocks for a very long time, hopefully. But eventually, the rocks your water touches will have heat drawn from them by the water at a higher rate than heat flows into them from the surrounding rock, and your yield will decline to no longer support the numbers you want at the top of the hole.
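[Patrick notes: For the physically inclined, a hedged sketch of the balance I'm describing, using textbook conduction and a typical value for granite. These are illustrative assumptions, not measurements from any particular well.]

```latex
% Heat pulled out by the circulating working fluid:
\dot{Q}_{\mathrm{out}} = \dot{m}\, c_p\, (T_{\mathrm{out}} - T_{\mathrm{in}})
% Heat conducted back into the cooled fracture zone from surrounding rock
% (Fourier's law; granite conducts at roughly k ~ 3 W/(m.K)):
\dot{Q}_{\mathrm{in}} \approx k\, A\, \frac{\Delta T}{L}
% The enormous fracture surface area A buys you time, but when Q_out
% persistently exceeds Q_in, the local rock cools and output declines.
```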
And yeah, that's long story short. For people who want to learn more about this field, I'll drop some links in the show notes. [Patrick notes: Project InnerSpace curates a roundup of resources; I find this one [PDF link] to be the best single document if you want a primer.]
Azeem: I mean, geothermal is, I think, a really interesting and potentially important technology. And it speaks to the fact that the energy system feels like it's going to continue to be heterogeneous. What happened with computing is that we tend to have these sort of winner-take-alls, although it's a little more heterogeneous than it looks at the surface, because an ARM chip is different from an Intel chip is different from a GPU. But I do see a world of different energy technologies.
Patrick: Definitely, and as we heard in the episode with Travis Dauwalter a few weeks ago, the heterogeneity of power generation makes the grid more stable, because different generation technologies have pretty different physical characteristics. Nuclear, geothermal, et cetera, provide stable baseload power. And then solar seems to be scalable to the moon. Oh man, dad joke number two. But solar is of course only available during particular hours of the day.
Azeem: Well, I think it's worth asking how far you can actually go with solar, because I suspect it's further than most models take it. Here's my case for that: because solar is highly modular, the market expands significantly: homes and small businesses as well as large-scale utility providers can get into solar. We've seen this happen in large parts of the U.S. with community solar, and of course rooftop solar in China is absolutely enormous.
Pakistan is a great example: businesses got sick and tired of the grid failing, so they just went off and bought loads of solar panels so that, for at least 10 hours a day, they could run their business. The cost curves are really, really in their favor. And even though we've had 50 years of learning, it's not clear that panel price declines are going to stop anytime in the next five to ten years, and there are new technologies waiting in the wings.
So even though it's far from perfect, the total system cost is something that's quite dynamic. The other big piece of total system cost will be whatever happens with batteries and other forms of storage, and batteries are really early in their learning effects. The cost of the prestige battery chemistry, lithium-ion, has declined from about $1,200 per kilowatt-hour in 2011-2012 to about $40 at the beginning of 2025, and there's probably still some room to run. And there are cheaper technologies with different physical characteristics, like sodium-ion and even iron-air batteries, in the wings.
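[Patrick notes: The "learning effects" Azeem references are usually modeled with Wright's law: cost falls by a roughly fixed percentage with every doubling of cumulative production. A minimal sketch, with the commonly cited ~20% learning rate as an assumption rather than gospel:]

```python
import math

# Wright's law: cost falls a fixed fraction (the "learning rate") with every
# doubling of cumulative production: C(n) = C0 * (1 - LR)^n after n doublings.
# The 20% rate below is a commonly cited ballpark for solar and lithium-ion,
# not a figure from the episode.

def cost_after_doublings(c0: float, learning_rate: float, n: float) -> float:
    return c0 * (1 - learning_rate) ** n

def doublings_required(c0: float, c1: float, learning_rate: float) -> float:
    return math.log(c0 / c1) / math.log(1 / (1 - learning_rate))

# The $1,200/kWh -> $40/kWh decline quoted above would correspond, at a 20%
# learning rate, to roughly this many doublings of cumulative production:
print(doublings_required(1200, 40, 0.20))  # ~15.2 doublings
```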
And then you have to think about how you manage distribution, because that becomes an important part: as you say, it's sunny in one place and not sunny somewhere else. There are a couple of really interesting projects going on at the moment: one being built in Australia to take power all the way up to Singapore, another in Morocco to take power to the United Kingdom, using subsea high-voltage direct current cables.
Patrick: There are also some interesting second-, third-, maybe even seventh-order effects for some of these technologies. Casey Handmer, a previous podcast guest, has a company, Terraform Industries, that is attempting to do direct capture of carbon dioxide and turn it into hydrocarbons using "alien science"—which is of course heavily energy-intensive, because you can't cheat the laws of thermodynamics. But given huge amounts of solar generation in one part of a nation, if you can hypothetically do local generation of hydrocarbons, then you can ship extremely energy-dense hydrocarbons to wherever in the world you want them and combust them there. The carbon goes back into the atmosphere, you suck it right back down, and you turn it into hydrocarbons again.
Azeem: So Casey is amazing, but let's just go through that cycle again. If we take carbon dioxide out of the atmosphere, push it over the second-law-of-thermodynamics hump, turn it into methane, and combust that methane, we're net zero, right? We've not put any additional CO2 into the atmosphere. And the energy density of gasoline or methane or kerosene is absolutely staggering, and we have a whole load of systems that already know how to use it.
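[Patrick notes: A hedged sketch of that cycle in chemical terms. I'm assuming a Sabatier-style methanation route purely for illustration; Terraform's actual process details are theirs, not mine.]

```latex
% An illustrative version of the loop (not Terraform's spec):
% 1. Solar electricity splits water:
\mathrm{2\,H_2O \longrightarrow 2\,H_2 + O_2} \qquad \text{(electrolysis, energy in)}
% 2. Sabatier methanation consumes the captured CO2:
\mathrm{CO_2 + 4\,H_2 \longrightarrow CH_4 + 2\,H_2O} \qquad \text{(exothermic)}
% 3. Combustion wherever the fuel is shipped:
\mathrm{CH_4 + 2\,O_2 \longrightarrow CO_2 + 2\,H_2O} \qquad \Delta H \approx -890\ \mathrm{kJ/mol}
% Net CO2 across the loop is zero; the energetic cost is paid up front in
% step 1, which is why abundant cheap solar is the enabling input.
```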
I think that's a great example of why solar could end up being, and I think it will be, the dominant supply of first-party energy and electrons into the system. And there are other things you can start to do with the system, like demand response, where you create incentives for people's behavior to change. There are lots of air conditioners and heating and cooling systems in homes in Texas whose behavior is actually managed by the energy provider to respond to minute-by-minute and hour-by-hour changes in electricity demand and pricing.
And that can also extend to how we might shift compute workloads around data centers at different times of day. Not everything needs to be right on the front end. Your Akamai servers serving up live video need to be close to the customer, but not everything needs to be close to the customer the whole time.
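[Patrick notes: The logic of shifting deferrable compute toward cheap power fits in a few lines. A minimal sketch with made-up hourly prices; real schedulers also weigh SLAs, data locality, and carbon intensity.]

```python
# Demand response for deferrable compute: given hypothetical hourly
# electricity prices, run a batch workload in the cheapest hours.

hourly_prices = [  # $/MWh, illustrative shape: cheap solar midday, pricey evening
    45, 42, 40, 38, 37, 39, 44, 52, 48, 35, 28, 22,
    18, 15, 17, 24, 38, 65, 90, 110, 95, 70, 55, 48,
]

hours_needed = 6  # the job needs 6 hours of compute sometime today

cheapest_hours = sorted(range(24), key=lambda h: hourly_prices[h])[:hours_needed]
print(sorted(cheapest_hours))                         # [10, 11, 12, 13, 14, 15]
print(sum(hourly_prices[h] for h in cheapest_hours))  # energy-cost proxy: 124
```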
The future of AI and energy systems
Patrick: I'm not the world expert on this, but we seem to be in land-grab mode at the moment with respect to training new AI models. One can imagine a future in which…
For those of you who aren't familiar, the wonderful AI models we're using these days typically have two phases: a training phase and an inference phase.
The training phase is the months of hard work that OpenAI or Anthropic or another lab puts into producing one of its new numbered model releases. The inference phase is what happens in the few seconds between when you ask a question and when an answer comes back to you. My guess—finger to the wind, without inside knowledge—is that the chips doing training have been running hot essentially 24 hours a day, seven days a week for the last while.
However, one can imagine future iterations of this technology where there is actually a cost-benefit curve associated with it, such that at times when electricity is particularly expensive, you stop training for a while and continue doing inference on an on-demand basis. Or: currently, the dominant public deployment of AI is a user sitting at a keyboard typing commands into a computer, but that will not be the case forever.
It might make sense to provision more intelligence when electricity is cheap, to do these sorts of "offline calculations" on behalf of industry, rather than simply running inference at the same level around the 24-hour clock. Or we could end up in a world where cognition is just so stupidly valuable that why would you ever turn it off to save on electricity bills?
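[Patrick notes: The decision rule I'm gesturing at is a one-liner once you put numbers on it. All three numbers below are hypothetical.]

```python
# Hedged back-of-envelope: pause training when the marginal value of the
# compute falls below the spot electricity cost. All numbers hypothetical.

cluster_power_mw = 20.0   # training cluster draw, MW (hypothetical)
spot_price = 400.0        # $/MWh during a scarcity event (hypothetical)
value_per_hour = 15000.0  # modeled value of an hour of training (hypothetical)

electricity_cost_per_hour = cluster_power_mw * spot_price  # $8,000/hour here
if value_per_hour > electricity_cost_per_hour:
    print("Keep training; the cognition is worth more than the power bill")
else:
    print("Pause training; idle the cluster until prices come back down")
```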
Azeem: Well, what I love about this is that these are a few scenarios, so let's throw out some others. One is algorithmic optimization of the functional level of intelligence we want at a given time. We think about Moore's law being this remarkable thing, right? Sixty percent annual cost declines on a price-performance basis, for decades and decades. But software optimizations can deliver orders of magnitude of improvement instantly, like a fast Fourier transform, or doing something with a Bloom filter rather than mechanically walking through—
Patrick: Bloom filters. That brings me back. [Patrick notes: Many, many years ago, as an undergrad research assistant, I worked on a project which implemented them on an FPGA.]
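[Patrick notes: For the curious, a Bloom filter answers "have I probably seen this item before?" using a few bits per element and a handful of hashes, trading a tunable false-positive rate for enormous savings in memory and lookups: the kind of constant-factor-shattering optimization Azeem is gesturing at. A minimal sketch:]

```python
import hashlib

class BloomFilter:
    """Probabilistic set membership: no false negatives, tunable false positives."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k independent bit positions from salted SHA-256 digests.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

bf = BloomFilter()
bf.add("patio11")
print(bf.might_contain("patio11"))  # True
print(bf.might_contain("azeem"))    # False (with very high probability)
```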
Azeem: Sorry. I'm just an old dude. I can't help it. And so there's one thing that we have to think about, which is like software optimizations. There are also novel architectures. So I invested in probably the world's first reversible computing semiconductor company. So what reversible computing does is it has a different way of processing information.
And the reason why NVIDIA chips give off so much heat is that they're irreversible: they increase entropy, and destroying information shows up as heat loss. If instead you have reversible processes, you can in theory be a couple of orders of magnitude more energy-efficient. It comes at a certain cost in complexity for building up the gates required in a chip.
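[Patrick notes: The physics here is Landauer's principle: erasing a bit of information has a minimum energy cost, and reversible logic sidesteps that cost by not erasing. A sketch of the bound:]

```latex
% Landauer's principle: erasing one bit of information dissipates at least
E_{\min} = k_B T \ln 2 \approx 2.9 \times 10^{-21}\ \mathrm{J} \qquad (T = 300\ \mathrm{K})
% Conventional logic erases bits constantly, and real chips run many orders
% of magnitude above this floor; reversible logic avoids erasure entirely,
% which is where the claimed efficiency headroom lives.
```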
You could see 10, 20, 50x improvements in energy efficiency, which is faster than the wonderful gains in energy efficiency we've seen over the last 30 years in computing, since we started to move toward laptops and then cellular devices. That is all extremely helpful, and the market will surely move to more energy-efficient systems, because electricity will always cost something.
But we still have the specter of Jevons paradox, which is, I think, your observation: we never know how useful cognition will be. As we said earlier in our discussion, I think we're barely scratching the surface: you said we're at the five-computers stage of LLMs. So net-net, I'm sure all of this stuff is going to become orders of magnitude more efficient; digital IQ points per watt will get far, far better than they are today. And boy, are we going to demand a lot of it.
Patrick: Points on which scale, though? IQ is a useful abstraction to bat around casually, but I think we'll have more powerful abstractions in the next while to describe, say, an LLM that is extremely limited with respect to its capabilities but doesn't have to be a super genius to successfully route a package from point A to point B, versus others that are assisting people in doing cutting-edge scientific research.
And with a new model release every six months, the research assistants will be getting shockingly more capable over very compressed timespans.
Azeem: I think that ecology you've described is so important for all of us to understand. You don't need a PhD in negotiation to help you decide how much to pay for the pair of socks in Walmart that you're about to buy. That is the price; you just pay it and you're done.
And I think that will also be true for the way in which we embed intelligence in our system. But we've only really started to sketch that out. I mean, if you think about humanoid robots that are down to a few thousand dollars from Unitree in China—how much intelligence or whatever proxies for common sense do we want?
I would say it's got to be at least at a GPT-4 level. You wouldn't want a robot like that understanding the world as well as GPT-2 did, which was random sentence fragments going off anywhere, and you'd certainly want more controllability. I'm not saying you could just drop one of these GPT-4-class open source models into a Unitree robot and say, "go and look after my kids." I'm saying that's surely the baseline. As you say, you don't necessarily need it to make an Einstein-level discovery while it's loading your dishwasher.
Patrick: Yeah. And these will be subcomponents of larger engineered systems. We've had robots for a very long time. There are many folks, science fiction aficionados, who can quote Asimov's Three Laws of Robotics and talk about a day in which a robot kills someone for the first time. And the systems engineer in me who worked in Central Japan says, "Oh, that day actually happened in the 1970s, in an industrial accident."
But we can do things in factories like saying, "Okay, it is possible that a human-form-factor system might not be sufficiently intelligent to walk in an uncontrolled environment right now. So let's cheat all those assumptions. One, it won't be human form factor; it will be just a grabbing arm. Two, we're going to put it in the middle of a factory where we control everything around it, and put down the yellow hazard tape factories use, marking the physical maximum extent the arm can move. And so you can guarantee the robot system the invariant: there will never be a human skull inside this hazard tape during your operation, and therefore you cannot crush anyone like an egg."
And then we will not be in a stable equilibrium: as the software gets smarter, as the LLMs get smarter and unlock additional fun toys for us on the software side, we'll find new ways to build hardware that takes advantage of those new capabilities, and larger engineered systems to put the smaller hardware systems in, such that they can produce even more value at scale. So, wild time to be alive.
Azeem: It's a wild time to be alive, and what we have to do is fix our mental models of how these technologies emerge, because this intersection between electricity and computing, or AI, is bringing together two very different worlds.
What's happened in the computer industry since the 1960s and 1970s is that we have brought computing closer and closer to the end user: from the mainframe to the time-sharing minicomputer, then personal computers, then smartphones and even smaller smart devices. I'm wearing a smart ring on my hand right now.
And we've also moved into a hybrid environment where there are certain tasks I do locally, like my sleep tracking, and other tasks that need lots and lots of compute and storage, which I push out to the cloud.
The energy system was never like that. The energy system was all mainframes: there's Three Mile Island, there's the Hoover Dam, there's a huge coal-fired power station, and you pipe that power over and we're all consumers. And so all of our mental models are based on those types of ideas.
And I think a really good analogy for how the system changes is what happened with telecoms: the move from the old pre-internet telephony world to what the internet looks like now. We went through a period when the new internet was highly decentralized point-to-point, but all the servers were back in Reston, Virginia, or Telehouse in Docklands, or the Palo Alto Internet Exchange.
And then we started to hybridize. Quite often, when you access a resource that is nominally in another country, it's actually served quite close to you, perhaps only 20 miles away, by a front-end caching system like an Akamai edge server or a Cloudflare server. That is a tiered topology.
And I don't see any reason why that's not what AI infrastructure ends up looking like. The bit that's a real mind shock for people is that's also what the energy system supporting it might end up looking like—with localized generation, vehicle-to-grid, and community batteries part of the mix.
Patrick: Yeah. To analogize directly to the mainframe—love that analogy—the large-scale nuclear plant is a mainframe. Then we had a few decades of the local server room: particularly for industrial uses, co-located, perhaps behind-the-meter, small, typically combustion-based electricity generation. And then we've had rooftop solar over the last decade or so.
I'm somewhat bearish on rooftop solar, but that's neither here nor there. There are at least places in the world where it makes a lot of sense, where grids are less reliable. Like California! I'm teasing you.
An interesting rabbit hole for people to go down: California and Texas both love solar but made very different bets. California thought, "We really, really want rooftop generation. That seems extremely incentive-compatible, it will be green, et cetera, and it won't spoil our beautiful landscapes." And Texas said, "The desert is basically free. We are going all in on utility-scale generation." That experiment was run, the results are in—Texas won by a lot. [Patrick notes: I will accept a minor knuckle-rapping here from California partisans for imprecision in the closing minute, but I think I’m substantively right.]
But there might be a battery in every garage in the very near future. Certainly Elon Musk would love to make that happen. And community scale batteries operating at material scale might very much be a thing in the next couple of years. It'll be wild times.
Wrap
So, I feel like we could continue this discussion for a very long time, but I do want to be respectful of your time and the audience's attention. Where can people find you on the Internet, Azeem?
Azeem: The best way to find me is at Exponential View. You can sign up to my newsletter there.
Patrick: Awesome. Thank you very much for coming on today and for the audience, thank you very much for joining us. Again, we'll be back next week.
Azeem: Thank you.