
From Data Centers to Dyson Spheres: P-1 AI’s Path to Hardware Engineering AGI

Former Airbus CTO Paul Eremenko shares his vision for bringing AI to physical engineering, starting with Archie—an AI agent that works alongside human engineers. P-1 AI is tackling the challenge of generating synthetic training data to teach AI systems about complex physical systems, from data center cooling to aircraft design and beyond. Eremenko explains how Archie breaks down engineering tasks into primitive operations and uses a federated approach combining multiple AI models. The goal is to progress from entry-level engineering capabilities to eventually achieving engineering AGI that can design things humans cannot.

Summary

Former Airbus CTO Paul Eremenko co-founded P-1 AI with the ambitious goal of building engineering AGI for the physical world. In this episode, he describes how his company is developing Archie, an AI engineering agent that can join existing teams to help design complex physical systems—starting with data center cooling systems and progressing toward aircraft and eventually starships.

Synthetic data is foundational to progress: Unlike software, physical engineering domains lack large, accessible datasets. Building a scalable AI solution requires generating massive, high-fidelity synthetic datasets that are physics-based and supply chain-aware. Clever sampling—densely around dominant designs, sparsely around outliers—maximizes learning and model robustness while overcoming real-world data scarcity.
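A minimal sketch of this sampling strategy: draw most samples tightly around a baseline (dominant) design and the rest uniformly across the full design space. The mixture weight, spread, and toy cooling-unit parameters are illustrative assumptions, not P-1 AI's actual method.

```python
import random

def sample_design(baseline, bounds, p_near=0.8, spread=0.05):
    """Draw one hypothetical design vector.

    With probability p_near, sample tightly around the baseline design
    (dense coverage of dominant designs); otherwise sample uniformly
    over the full bounds (sparse coverage of corners and edges).
    """
    if random.random() < p_near:
        return [min(hi, max(lo, random.gauss(b, spread * (hi - lo))))
                for b, (lo, hi) in zip(baseline, bounds)]
    return [random.uniform(lo, hi) for lo, hi in bounds]

# Toy 3-parameter cooling-unit design space: fan diameter (m),
# coil area (m^2), compressor power (kW).
baseline = [0.5, 2.0, 3.0]
bounds = [(0.1, 1.0), (0.5, 8.0), (1.0, 10.0)]
dataset = [sample_design(baseline, bounds) for _ in range(10_000)]
```

Each sampled design would then be simulated to attach a performance vector, yielding one training example.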

Focus on cognitive automation, not tool replacement: The path to product-market fit lies in mimicking the cognitive workflows of human engineers: distilling requirements, synthesizing solutions, evaluating designs, and knowing which specialized tools to use. Rather than supplanting existing design and simulation software, successful AI agents should orchestrate and leverage these tools just as a human would, reducing friction for adoption and integration.

Complex physical systems require a federated AI approach: P-1 uses multiple specialized models—including physics-based surrogate models, geometric reasoners, and “lobotomized” LLMs—orchestrated by a central reasoning system to handle different aspects of engineering analysis and design.

Success requires starting simple and scaling up methodically: P-1 began with residential cooling systems as a proof of concept, progressing to data center cooling systems with ~1,000 parts, and aims to tackle systems an order of magnitude more complex each year—targeting aerospace systems with millions of parts within 3-4 years.

The path to engineering AGI is incremental but transformative: Starting with entry-level engineering capabilities and learning from real-world data and interactions behind corporate firewalls, AI agents can progressively develop more sophisticated abilities, potentially leading to breakthrough designs beyond current human capabilities.

Transcript


Paul Eremenko: When I was asking the question over the last couple years of, like, why isn’t anybody working on AI for building the physical world? The answer was training data, right? Fundamentally, if you want an AI engineer that can help you design an airplane or modify an airplane and you say, “Hey, what happens if I change the wing on an A320 by 10 percent, increase the wing area by 10 percent?” In order to be able to answer that, your model has to be trained on millions of airplane designs, ideally. And there just haven’t been millions of airplanes designed since the Wright brothers. Even if you did magically have access to all of them—which you don’t—and if they were all modeled in a coherent sort of semantically integrated way, which they aren’t, right? But even hypothetically, you would have maybe a thousand designs since the birth of aviation, and so nowhere near enough to train a large model.
Sonya Huang: Today we’re excited to welcome Paul Eremenko, CEO of P-1 AI. Paul was a director at DARPA and the youngest CTO of Airbus at age 35, and now he’s getting to turn his science fiction dreams into reality at P-1 AI. P-1 AI is attempting to build engineering AGI for the physical world. We already have fantastic companies like Anthropic, Cursor, and Devin that are transforming software engineering, but hardware engineering and the physical world–whether it’s data center coolers or airplanes–has yet to be transformed radically by AI. We talked to Paul about the opportunity, the key bottlenecks in gathering data, and how he envisions their agent Archie evolving to help build the physical world around us, from fighter jets to starships.

Progress in AI for physical engineering

Sonya Huang: Paul, thank you so much for joining us today, and we’re delighted to have both you and your Jack Russell terrier-beagle mix, Lee, on the show. Welcome. Let’s start off with—we just had our AI conference, AI Ascent. And, you know, at the conference Jeff Dean was talking about the potential for vibe coding, and how a 24/7 junior software engineer is going to be possible through AI within the next year or so. So it seems like software engineering is really going through this vertical takeoff moment right now. What do you think is happening in the physical world as it pertains to physical engineering?

Paul Eremenko: So not a lot is the short answer. And one of the reasons we founded P-1 AI is because I grew up on hard sci-fi, and I was promised AI that would help us build the physical world, the world around us, and eventually starships and Dyson spheres. And when the kind of deep learning revolution really started to take off, I asked the question of, “Well, who’s building this stuff? Like, who is doing that AI that’s going to help us build the physical world?” And the answer was nobody was working on it, and it really wasn’t even on the agenda of the kind of foundation labs.

And some years later, today, 2025, it still isn’t, right? And so we ask the question of why that is. We can talk about why that is maybe later in the podcast. And we think we have a solution to remedying some of the reasons, some of the challenges, and actually bringing it to market.

So I think—and Jeff, by the way, is—we’re very grateful to have him as an angel investor in the company. And I think coding AI has been a long time coming. One of my co-founders, Susmit Jha, did his PhD in 2011 on program synthesis. So this is not a new technology, but it’s just now, I think, finding that product-market fit, the right packaging and the right business model, the right pricing models. I think with physical AI, we have the benefit of standing on the shoulders of a lot of the coding AI work. So if you can have a programmatic representation of your physical system, you can use some of the program synthesis type techniques to create physical designs. So we’re not—you know, it’s not going to take a decade or 15 years. We think that we can put the technology bricks together this year, and hopefully start finding product-market fit as early as next year.

Pat Grady: Can we—yeah, can we double click on that a little bit? What are those technology bricks? What pieces need to be in place for this to become a reality?

Paul Eremenko: Yeah, so the biggest one—and again, when I was asking the question over the last couple years of, like, why isn’t anybody working on AI for building the physical world? The answer was training data, right? Fundamentally, if you want an AI engineer that can help you design an airplane or modify an airplane and you say, “Hey, what happens if I change the wing on an A320 by 10 percent, increase the wing area by 10 percent?” In order to be able to answer that, your model has to be trained on millions of airplane designs, ideally. And there just haven’t been millions of airplanes designed since the Wright brothers. Even if you did magically have access to all of them—which you don’t—and if they were all modeled in a coherent sort of semantically integrated way, which they aren’t, right? But even hypothetically, you would have maybe a thousand designs since the birth of aviation, and so nowhere near enough to train a large model.

And so the most sort of foundational technology brick for us is creating this training data set: a synthetic data set, physics-based and supply chain-informed, of hypothetical designs in whatever physical product domain. So it could be airplanes, could be something else. And making it large enough and making it interesting enough. So the design space for most physical products is almost infinitely large, right? Like, it’s huge. And so you can’t randomly sample it, you can’t evenly sample it. You have to very cleverly sample it. You want to sample kind of densely around dominant designs, but you want to sample sparsely around the corners and edges of the design space, because that teaches you something. Even if that corner or edge of the design space is not somewhere you would ever want to go, it teaches your model something about why that is, right?

And so creating these data sets for training models, that was sort of the core of our approach. Then, of course, if you just take—if you now have a million airplane designs and a performance vector for each one, and you throw it at an LLM in post-training or even in pre-training, you’re not going to magically get a good engineer. So then there is the question of: What does the model architecture look like? And today we use a federated approach of a bunch of different models—and we can talk more about them—that do different parts of engineering reasoning. And then they’re all orchestrated by kind of an orchestrator-reasoner LLM that also acts as the interface to the user.

Sonya Huang: Actually, can you say more about that? How do you get your models to be capable of doing the physics-based reasoning? And is this stuff done in kind of design software today? Is this stuff inside an engineer’s brain? And how do you kind of put that knowledge into a model?

Pat Grady: And can I add to that the supply chain-informed piece of the equation? How does all that come into play?

What is Archie?

Paul Eremenko: Sure, absolutely. So first let me maybe describe what the product actually is, right? Because I think that’ll help answer part of the question. So we are focused very narrowly in some ways on cognitive automation of what a human engineer does in designing physical systems. And so what does a human engineer do? So humans are very good at taking a bunch of requirements and distilling what are the key design drivers that come out of those requirements, postulating one or more possible solutions that meet those design drivers, doing first order sizing of what does the answer look like roughly, right? And what is the relevant phenomenology in doing that sizing? And by “phenomenology” I mean, like, what are the different physics? Because it’s not just about geometry, right? These are multiphysics systems, so they have electrical and thermal and vibrations and electromagnetic interference. And sometimes those matter, sometimes they don’t, right? And humans are very good at—good engineers are very good at selecting which modalities matter in doing this first order sizing. And is this really going to close and is this really going to be a viable design?

And then humans are very good at knowing what tools there are for detailed design and analysis. What is the range of applicability of those tools and how do you use them? How do you set up the problem for those tools? And that’s exactly what we’re trying to tackle: that cognitive automation. So the first product is called Archie. So if I refer to Archie, that’s not Lee.

Pat Grady: [laughs]

Paul Eremenko: Archie is the agent. And a really important consequence of this focus on cognitive automation is that we are not trying to play at the tools layer. There are existing detailed design and analysis and simulation tools, and we want Archie to know how to use those tools the same way that a human knows how to use them. But we don’t try to replace the tool, we don’t try to make it better, we don’t try to compete with it, we don’t try to supplant it in any way, right? We just learned that they are there and their range of validity.

Pat Grady: Right on topic, just like a human.

Paul Eremenko: That’s right.

Pat Grady: Yeah.

Multiple models for engineering reasoning

Paul Eremenko: So your question was around what the different models are, right? And how do you do the engineering reasoning? And basically, all of the things that I just described: distilling requirements, picking key design drivers, sizing, et cetera, they all simplify to a couple of primitive operations. And the first operation is design evaluation, right? So if you have a particular design, what is the performance of that design? Again, modeling the relevant phenomenology that’s in the design. Another one is design synthesis. So if I have a specified performance or a specified requirements vector, what is the design?

Pat Grady: Mm-hmm.

Paul Eremenko: And a third class is a little more complicated, which is finding errors and infilling inside a design. But basically, any engineering query, any engineering task that a human engineer does reduces to some sequence of these operations. And so what we then have to do is, first of all, have a reasoner-orchestrator that’s good at taking tasking from humans in an organization and decomposing it into the right sequence of primitive operations. And then a set of models, some neural and some that don’t need to be neural, that are actually good at carrying out those operations.
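The reduction Eremenko describes, where any engineering task decomposes into a sequence of primitive operations run by an orchestrator, can be sketched roughly as follows. The toy surrogate physics, the dictionary task format, and all names here are hypothetical stand-ins, not P-1 AI's implementation.

```python
def evaluate(design: dict) -> dict:
    """Design -> performance vector (toy surrogate: cooling capacity
    scales with coil area, electrical power with fan speed)."""
    return {
        "cooling_kw": 1.5 * design["coil_area_m2"],
        "power_kw": 0.002 * design["fan_rpm"],
    }

def synthesize(requirements: dict) -> dict:
    """Requirements -> candidate design, by inverting the toy surrogate.
    (The third primitive class, error-finding/infilling, is omitted.)"""
    return {
        "coil_area_m2": requirements["cooling_kw"] / 1.5,
        "fan_rpm": requirements["power_kw"] / 0.002,
    }

def orchestrate(task: dict) -> dict:
    """The reasoner-orchestrator's job reduced to its skeleton:
    run a plan, i.e. a sequence of primitive operations."""
    ops = {"evaluate": evaluate, "synthesize": synthesize}
    state = task["input"]
    for op in task["plan"]:
        state = ops[op](state)
    return state

# "Find a design meeting 6 kW of cooling at 1.2 kW of power,
# then check what it actually delivers."
result = orchestrate({
    "plan": ["synthesize", "evaluate"],
    "input": {"cooling_kw": 6.0, "power_kw": 1.2},
})
```

In the real system the plan would come from an LLM reasoner and the operations from specialized (often neural) models; the round-trip structure is the point of the sketch.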

Pat Grady: Mm-hmm.

Paul Eremenko: And so some of the things that are behind the orchestrator-reasoner are, for instance, a graph neural network that’s just very good at being a physics-based surrogate model over the performance space. That’s one example. Another one is a geometric reasoner model that allows you to answer questions about relative positioning and packing and interference and things like that. Some of those geometric reasoning operations are very easy to do just algorithmically, like software 1.0 style, right? You don’t need neural capability. Some of the more complex ones you can do with VLMs. I think that there is yet another category of physical reasoning operations that we don’t yet know how to solve, and I think there is a generation of AI models coming, physical world models, that will have better intuition for some of the more complex, higher-order spatial reasoning tasks. And then you have physics reasoning, right? You have sort of your multiphysics reasoning. There are a few different approaches—some of them software 1.0, some of them neural. One example is we have what I call a lobotomized LLM.

Pat Grady: [laughs]

Paul Eremenko: Which is an LLM. It’s no longer good at English, but it is very good at doing programmatic representations of—multiphysics representations of physical system designs and reasoning over those. So that’s kind of a federated assembly of models that are all orchestrated by an LLM reasoner that is also the interface to the user.

What’s ahead for Archie?

Sonya Huang: What is Archie capable of doing today? How does that compare to your average hardware systems engineer today? And what’s ahead for Archie?

Paul Eremenko: Yeah, that’s a great question. So what we’ve done today—so we’re about nine months old as a company, what we did in our pre-seed is basically a toy demo around residential cooling systems, right? So those, like, air conditioning units, those kinds of things. And the reason we chose that is because it’s a fairly multiphysics domain, so you have fluid flows, you have air flows, you have thermal interactions, you have electrical systems, right? So it’s rich, but the number of components in a system is not very large, and a lot of the physics phenomenology is pretty linearizable, right? Like, you can simplify it. So it’s kind of rich enough to be convincing, but not so complex that we’re bogged down in data generation, for instance. Or the supply chain piece, which I want to come back to, getting that right.

And so that demo exists, we’ve put it out publicly, and the question, of course, is: how good is it? Other than a vibe test, right? Where you have a human interact with it and you’re like, “Oh, that’s pretty good,” there isn’t really a good answer today. And so one of the things that we’ve invested quite a bit of energy into is evals for physical engineering AIs. And by the time this airs, I think we’ll have an arXiv paper out that describes our approach to evals. We call it Archie IQ. And the goal is to administer the evals to humans: an entry-level human engineer, an average human engineer, an expert-level human engineer, and to Archie. And for us to have a closed-loop process of improving Archie to move up that IQ scale.

Sonya Huang: Do you think you’ll keep pushing on residential cooling systems and you’ll have a residential cooling system agent that’ll eventually be an airplane design agent, a starship design agent? Is that the right way to think about this, or is this a single agent that you’re building?

Paul Eremenko: No, I think the right way to think about it is that, at least initially, we have to create distinct training data sets for each product domain, for each product vertical.

The map of product verticals to get to AGI

Pat Grady: How do you guys think about that map? Like, you know, if the map starts with the residential cooling systems, how does it progress from there? Like, what does that overall map look like to get to the point of, you know, engineering AGI for the physical world? What’s on that map?

Paul Eremenko: Yeah, so first of all, residential for us was just kind of a toy problem that we chose. Our first market where we plan to deploy with a customer, with a design partner, is actually data center cooling systems. Which are still thermodynamic engines, so they’re not that different from residential HVAC, but they’re an order of magnitude more complex, obviously much larger, and a very interesting market because they’re having trouble coping with demand from data center customers. And we’re at a point where cooling systems are like the long lead items, pacing data center development, which is kind of wild.

And so it is an acute pain point. The delivery of those systems is in many ways limited by engineering bandwidth, by being able to deliver sort of semi-custom solutions to each data center. And so we have a very enthusiastic customer base for that early deployment. And these systems are—you know, these are now on the order of a thousand unique parts in the system.

Pat Grady: Okay.

Paul Eremenko: The physics domains are quite rich, but the physics again are still pretty linearizable. So from a synthetic data generation perspective, it’s a fairly manageable problem, which is why we like it as a first vertical. And then we progress, and I think we progress principally on the basis of synthetic training data, this physics-based synthetic training data complexity. And so our expectation is that we will go roughly an order of magnitude up in product complexity every year.

Pat Grady: Okay.

Paul Eremenko: So the second vertical is probably industrial systems, so things that go into a factory: material handling, industrial robots, mills, lathes, those kinds of things. Then we move into mobility domains, which could be automotive, agriculture, mining equipment, those kinds of automotive and heavy machinery products, and then aerospace and defense.

Pat Grady: Yeah.

Paul Eremenko: But just to give you sort of the order of magnitude progression: data center cooling systems, roughly a thousand unique parts. Airplane, roughly a million unique parts, right? So three orders of magnitude between them. And we think based on sort of our current projections, this is roughly one year for each order of magnitude.

Pat Grady: How much of the data that’s required to train the system comes from the usage of the system, such that the simple use cases start to bootstrap the more complex use cases? And how much of it is fed to the system from some other training data generation technique that you have?

Paul Eremenko: So we think we can train Archie to be at the level of an entry-level engineer. So like college educated, but not particularly savvy in a specific company’s products, or some of the in-depth processes and practices, or a lot of the detailed supply chain cost data. That’s not something you learn in college.

Pat Grady: Yeah.

Paul Eremenko: Right? So we think we can do that just based on non-proprietary synthetic data that we produce, meaning non-proprietary to a customer. And so the goal is get Archie hired as an entry-level engineer, right? Get him in the door. We then have a relationship with the customer, we have a data-sharing agreement and all of those things sorted. And then Archie can start learning on the things behind the firewall, obviously subject to the customer’s acquiescence. But we can then ingest their PLM system, we can ingest all of their model-based tools and models. We can ingest a lot of the real-world performance of that system. Quality escapes, right? There is a bunch of stuff there. And so we think that Archie can move up the expertise scale fairly rapidly from entry level to kind of average to expert engineer on the basis of a lot of that real-world data, and of course improvements in the AI models as well.

Pat Grady: And do you have a definition—when you talk about engineering AGI, we haven’t found sort of a generally agreed upon definition of AGI. What’s your definition of AGI, and how does it fit into the test of someday when you have an engineering AGI, you know, how will you know you have it?

Evals and Bloom’s Taxonomy

Paul Eremenko: Yeah, so back to the evals. We have adopted what’s called Bloom’s Taxonomy, which is a cognitive knowledge taxonomy for human learning developed in the ’50s, and has been applied to LLMs in recent years. We have adapted it kind of to the engineering task. And so the taxonomy is kind of a pyramid, right? The lowest level you have just recall of information. That’s relatively straightforward. Then you have semantic understanding of the design. So in addition to recall, like, what does this part do? Then you have the ability to evaluate a design or a change to a design. So what is the performance impact of changing this component, for instance, or resizing something?

Then there is the ability to find mistakes in a design, right? So this is the error correction and infilling. Then to synthesize a brand new design or a significant change to an existing design. And then kind of the highest, the pinnacle, which we call EAGI—engineering AGI—is reflection, which is some degree of self-awareness: what process did I just use to do the preceding five levels in this hierarchy? What are the limitations of that process? Is there an alternative process? Where could I have gone wrong?

These are the kinds of things that actually most engineers in the field don’t do very well, and that are reserved for kind of the senior levels, the experts or the technical fellows in large industrial companies. And so to us, the pinnacle of human engineering intelligence is certainly that self-awareness of the engineering process and of its limitations.

And then there is a different dimension which is can it generalize across domains without us having to train it on the domain? So I would say those are the two axes, and you could argue that you can accomplish sort of AGI on one axis, AGI on the other axis, or AGI on both axes. Pick your poison. We hope to do both.

Scaling synthetic data

Sonya Huang: What do you think it’s going to take to be able to solve systems of the current order of magnitude of parts complexity, all the way up to airplanes and beyond in terms of the number of parts? Is it simply a matter of scaling laws, and the LLMs will get better? You’ll generate more synthetic data, with more compute and bigger models, and be able to solve these much more complex systems in the future? Or do you think there are research breakthroughs needed to get there?

Paul Eremenko: No research breakthroughs needed. I think we operate squarely in the kind of applied research domain, where we take existing research that the frontier labs are doing and apply it to our very specific problem. I mean, obviously there are limitations in scaling in terms of compute. There’s CPU compute to generate the synthetic data, because that’s a lot of simulation and sampling and things like that. And then there is GPU compute to train and GPU compute for inference, and with all of those today I don’t think we could do a million-part system, right?

Because if you think about it, and maybe to tie back to your question, Pat, about where does the supply chain come in? So how do we create these synthetic data sets? So if you have a million unique parts in a system, in order to compose, to kind of span the design space and create, you know, a very large number of adjacent systems and some faraway systems, you need a catalog of components, a catalog of component models, and some rules by which you can compose those components into systems.

Pat Grady: Mm-hmm.

Paul Eremenko: And your component catalog needs to be a couple orders of magnitude bigger than a typical system design. So if you have a million unique parts in a system, your component catalog maybe needs to be 100 million or a billion parts. And so A) you need to create that component catalog.

Pat Grady: Okay.

Paul Eremenko: Today we do it manually. We are building a lot of automation and a lot of actually AI tools to help us build that component catalog of component models. Then you have to intelligently assemble those components, so it’s not a tornado going through a junkyard and assembling a 747, but you actually have some method for creating it. And then you have to simulate each of those and get a performance vector. That’s the training data set. And so it’s supply chain informed because in theory, all of the components in your catalog either reflect a real component in the supply chain, or you can introduce hypothetical components, right? Because sometimes innovation is not just assembling things that exist, but saying, “Hey, I need a new motor or I need a new compressor, I need a new this, a new that.”

Pat Grady: Yeah.

Paul Eremenko: Right? And so you can introduce new components that don’t exist, but you know what those are and how you plan to get them.

Pat Grady: Yes.

Paul Eremenko: So that’s what we mean by supply chain informed. And physics-based means that the rules of composing those components model all of the relevant modalities of interaction that you care about, the phenomenology of how they interact, and that the designs that are produced are, in fact, realizable designs.
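The pipeline Eremenko outlines (a component catalog, rules for composing components, then simulation of each composed design to get a performance vector) might look like this in miniature. The two-part catalog, the single compatibility rule, and the stand-in simulation are all illustrative assumptions, not P-1 AI's code.

```python
import itertools

# Toy catalog; real entries would reflect actual supply-chain parts
# (or deliberately hypothetical ones), at far larger scale.
CATALOG = {
    "compressor": [{"id": "C1", "flow": 2.0, "power_kw": 3.0},
                   {"id": "C2", "flow": 4.0, "power_kw": 6.5}],
    "condenser":  [{"id": "K1", "max_flow": 2.5, "reject_kw": 5.0},
                   {"id": "K2", "max_flow": 4.5, "reject_kw": 9.0}],
}

def compatible(comp, cond):
    """Composition rule: the condenser must handle the compressor's
    flow. Rules like this are what keep generation from being a
    tornado going through a junkyard."""
    return cond["max_flow"] >= comp["flow"]

def simulate(comp, cond):
    """Stand-in for a physics simulation: return a performance vector."""
    return {"cooling_kw": min(cond["reject_kw"], 2.2 * comp["flow"]),
            "power_kw": comp["power_kw"]}

# Compose every valid pair and attach its simulated performance:
# the resulting (design, performance) pairs are the training data.
dataset = [
    {"design": (c["id"], k["id"]), "performance": simulate(c, k)}
    for c, k in itertools.product(CATALOG["compressor"], CATALOG["condenser"])
    if compatible(c, k)
]
```

At production scale the exhaustive product would be replaced by the clever dense/sparse sampling discussed earlier, and simulation by real multiphysics solvers.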

How this will change engineering orgs

Sonya Huang: I’d love to hear the customer-back perspective. So you’ve been the customer before—notably you were the CTO of Airbus. Can you walk us through, for those of us who haven’t been inside the belly of the beast of an industrial heavyweight, what is the process like to design a new airplane? What are all the engineers at these companies doing, and what does their life look like before and after engineering AGI?

Paul Eremenko: Yeah, it’s a very good question. So I think I gave you a reasonable abstraction of what an engineer does, which is they operate with some set of requirements. They may not be system-level requirements, right? The engineer may be working on a subsystem or an assembly or a widget, but they still have requirements, they still need to pick the key design drivers from those requirements, figure out what are the solutions, do first order sizing and then do the detailed analysis, right? That workflow gets replicated in kind of a fractal way throughout the system and throughout the engineering organization, which is designed to mirror roughly the product that you’re building, right?

And one of the reasons we position Archie as an agent is that he’s fairly autonomous; it’s not an assistant. He’s really designed to augment a team versus helping an individual, right? So we are trying to position Archie as an employee that joins a team. One of our sort of mission statements is, “An Archie on every team in every major industrial company in the world.” And Archie joins the team, and the goal is to sell work, not software, to these companies.

Pat Grady: Yes.

Paul Eremenko: It’s very, very difficult to sell software—engineering software to a company like Airbus. There are hundreds if not thousands of engineering tools in the ecosystem, and they are connected in various intricate, to put it politely, sometimes inelegant kind of glueware ways. And introducing a new tool into that ecosystem is very, very complex. On top of it, the labor budget of these companies is much bigger than the methods and tools sort of software budget. So you want to tackle the labor piece, not the tools piece.

And so Archie is really designed to show up on the team and be a remote engineer. So obviously, there’s no embodiment, but he shows up on Slack or on Teams or whatever collaboration tool you’re using, and you task him as you would a junior engineer who happens to be maybe at an offshore engineering center, and you interact with him that way. So there’s really minimal friction to introducing an Archie into the organization. You don’t need to do anything differently. You don’t need to change your processes. You just have this lower cost entity that shows up. Archie will probably be better at some things, maybe worse at other things, but the goal is to position him as a worker.

Pat Grady: Why Archie? Where did the name come from?

Paul Eremenko: Well, so it’s the letter A. So it allows us to have a Bob and a Charlotte and a Daniel right down the road. Archimedes, architect, right? All of those are, I think, connotations that are relevant to what we’re doing.

Tackling product customization

Sonya Huang: What sorts of problems do you think Archie will be tackling, and how do you expect that changes what the human engineers on the team are doing?

Paul Eremenko: So in the data center application, which is the first one that we expect to pilot this year, we think that probably the most promising, and also the most transferable, use case for Archie as we bring him to other domains is basically product customization: semi-custom work, or what they call “specials” in the data center cooling world. This is taking an existing product platform and customizing it for a specific customer’s use case: to meet architectural requirements, functional requirements, building codes, et cetera. And that tends to be different and fairly bespoke on a case-by-case basis. And that’s where most of the engineering hours go. So that’s the problem that we’re tackling first with Archie.

But that problem translates to other domains pretty well. Airbus, for instance, very seldom does a clean-sheet airplane design, but does a lot of derivatives and a lot of what are called “head of variants,” which are a particular product for an airline, right? With a specific cabin, a specific in-flight configuration, in-flight entertainment configuration, specific cockpit requirements, et cetera. So that’s what most engineers at most industrial companies do: sort of semi-customization.

Pat Grady: If we go to, like, 2030, 2040, some long-term time horizon, and there are millions and millions and millions of Archies—and maybe Bobs and Charlottes and Daniels—out there in the world, and you’ve achieved engineering AGI for the physical world, how will sort of the average person feel the impact of that? Like, how will they notice that their life is different as a result of engineering AGI becoming a thing?

Paul Eremenko: So I think it’s a time horizon question, right? And I am hesitant to predict anything that’s more than, like, three years out, [laughs] especially in these steeply exponential times. But I think in the first instance where Archie shows up on engineering teams and makes the team more productive and maybe helps the team do things more efficiently, one use case that we’ve talked about is if you have an Archie on every team, can the Archies coordinate amongst themselves better than the humans between the teams and sort of speak their own—kind of speak their own shorthand?

Pat Grady: Yeah, yeah.

Paul Eremenko: And do those kinds of things. So that’s really about improving the efficiency and the efficacy of existing engineering organizations. So for the average person, the impact is lower-cost goods and products.

Sonya Huang: So you’re saying I can buy an airplane?

Paul Eremenko: Perhaps. Perhaps.

Pat Grady: [laughs]

Paul Eremenko: I think the really interesting stuff starts when Archie can design things that we can’t. And that’s kind of the super intelligence part where it’s not just about efficiencies of existing organizations or increasing the bandwidth of existing organizations, but really designing the stuff that was promised to us in the sci-fi books.

Pat Grady: Yeah. Yeah.

Paul Eremenko: So the starships and Dyson spheres and matrioshka brains and those kinds of things. So, like, ultimately I’m a dreamer. That’s why I started this company. And that’s the future that I want, and that’s squarely the north star that guides us. But of course, we want to build a pragmatic and profitable business in the meantime.

Error rates vs humans

Sonya Huang: Our partner Konstantine has this term, “the stochastic mindset,” which is if you think about, you know, working with computers in the past, it was predetermined—you ask for this, you get this back. Versus with models, there’s a stochastic element to their nature by definition. How do you think about managing around that in your domain? Because if I think about it, I can vibe code a web app, and it’s okay if it breaks. It’s not great if I vibe code an airplane and it breaks, right? That’s disastrous. And so how do you think about managing around the stochastic nature for the physical world?

Paul Eremenko: Well, humans are pretty stochastic as well.

Sonya Huang: Fair.

Paul Eremenko: So if you have a junior engineer working on a task, they’ll make mistakes, they may not do the right thing, they may not be repeatable. [laughs] So I think the question that we need to quantify, and we expect to quantify in our pilot later this year, is what is the error rate coming out of Archie? And if that error rate is comparable to human engineers, then there are a lot of checks and balances built into the existing engineering organizations to ensure that a mistake that a junior engineer makes doesn’t bring down an airplane.

Pat Grady: Yeah.

Paul Eremenko: Right? So there are layers of review, there are milestones, there are tests—a lot of those layers. And so if Archie has a comparable or better error rate, then it should be a pretty seamless slotting into the existing processes.

Sonya Huang: What does the engineering org of the future look like? Do you think we’ll have one person Airbus equivalents in the future?

Paul Eremenko: So again, I’m reluctant to forecast the future beyond sort of three years out. And I think in the next couple of years, our goal is, again, an Archie on every team. So 10 percent of the workforce is Archies. They do the work that humans maybe find boring, dull, repetitive. And maybe there are additional value adds like inter-Archie coordination and things like that. And then I can imagine a superintelligence where you tell it, “I want you to start building a Dyson sphere,” and it starts building the Dyson sphere. What’s in between? Difficult to forecast.

Pat Grady: Hmm.

Lightning round

Sonya Huang: Okay, lightning round. I’ll go first. What application or application category do you think will break out this year?

Paul Eremenko: So I think we’re getting close to physical AIs. Not in the sense that we’re talking about them, but in the sense of robotics as well as foundation models for ingesting real world sensor data. And I think both of those are actually quite important building blocks to what we’re trying to build. And I think they’re very, very close.

Sonya Huang: Humanoids, yes or no?

Paul Eremenko: Yeah. I think humanoids—yes. Humanoids, yes. On the same basis that we are trying to build an agent that slots into existing teams, I think humanoid robots can slot into existing environments much more easily, even if they’re not sort of the optimal configuration.

Pat Grady: What one piece of content should AI people consume?

Paul Eremenko: I think everybody should read or go reread Asimov’s Robot series.

Pat Grady: Ah, good one!

Paul Eremenko: Because I think the laws of robotics were very carefully thought out, and are a lot of what actually needs to be built somehow very deeply into these models to ensure alignment.

Pat Grady: Very good one.

Sonya Huang: What other startups do you admire?

Paul Eremenko: I think a lot of the work that is being done on models for ingesting physical-world data is kind of unsung, but incredibly important. And the reason—if you don’t mind a slightly longer answer to the question—the reason I think they’re important is, like, look, we don’t know why neural networks work, fundamentally, right? But we have a vague, neuromorphic or anthropomorphic kind of view that, oh, we’re trying to replicate what a human neuron does, and then you do enough of them and you get these wonderful emergent properties.

But then if you take that further and you say, well, how do humans acquire knowledge, like a human baby? The very first thing they do is touch, taste, hearing, eventually vision, then language, then higher-order engineering reasoning, spatial reasoning, those kinds of things that are maybe built on top of language or maybe built on top of some of the other perception and sensory models that they have.

With LLMs, or with deep learning, we’ve replicated the neural structure to some approximation, but then we said, because of data availability, we’re going to go language-first and we’re going to scrape the whole internet, and then we’re going to do video. We’re also going to do imagery, so vision, but we’ve skipped touch, taste, hearing, et cetera, right? And touch, I think, is particularly important for building a sense of perception. And I keep coming back to spatial reasoning and the ability to abstractly think about three-dimensional objects and three-dimensional structures.

And so I’m very bullish on—there are a number of companies. Archetype is a good example, founded by one of my former colleagues at Google, that’s working on a foundation model for ingesting sensor data. And that foundation model has actually demonstrated that it can infer some of the physics underlying that data, right? Which I think is immensely cool. And I think all of those building blocks ultimately may need to be there for the engineering AGI to happen—that just language and vision is not enough.

Pat Grady: All right, last question. What AI app is your personal favorite to use?

Paul Eremenko: The less interesting answer would be, like, ChatGPT and Cursor, which are both there. The perhaps more interesting answer is we just recently, as we were coming out of stealth, we wanted to produce a video that kind of shows that north star vision that we’ve been talking about of ultimately engineering AGI and the path to get there. So we worked with a studio called EyeMix, which is an Israeli-LA kind of thing. They did the Trump Gaza video, if you guys know, that went viral maybe a month or so ago.

Pat Grady: Okay.

Paul Eremenko: And they did a fully AI-generated, kind of two-minute Archie biopic clip, which people can see on our website. It was done in two weeks, and it was done at about, I would say, a 50th of the cost of what a comparable piece of content would have been without AI. But everything—voice, video, music, everything in that short film is completely AI generated using a variety of models, some of which are their own, many of which they stitch together from the ecosystem. But to me, I was—I was absolutely blown away.

Pat Grady: Very cool.

Sonya Huang: Wonderful. Paul, Lee, thank you so much for joining us today to share more about your vision for the future of engineering AGI for the physical world. We’re excited for the day where you bring down the cost of buying an airplane. And in the meantime, excited to see what Archie can do.

Paul Eremenko: It’s our pleasure. Thanks for inviting us.

Sonya Huang: Thank you.

Mentioned in this episode