Machine Learning Street Talk

Google Researcher Shows Life "Emerges From Code" - Blaise Agüera y Arcas


Tuesday, October 21, 2025 · 59m

What You'll Learn

  • DNA is a computer program that can self-replicate, making life a fundamentally computational process
  • Cellular automata and universal Turing machines provide a model for understanding the computational nature of biological systems
  • Recursion and parallelism are key features that allow complex, intelligent systems to emerge from simple computational building blocks
  • The 'purpose' or drive of complex adaptive systems may arise naturally from the underlying computational dynamics, rather than requiring an external source
  • The computational view of life and intelligence challenges traditional notions and opens up new perspectives on the nature of mind and evolution

AI Summary

The episode discusses the idea that life and intelligence are fundamentally computational in nature. The guest, Blaise Agüera y Arcas, explains how John von Neumann's work on cellular automata and universal Turing machines provides a framework for understanding DNA as a computational system that can self-replicate. The conversation also touches on the role of recursion and parallelism in building complex, intelligent systems, as well as the question of where the 'purpose' or drive of such systems comes from.

Key Points

  1. DNA is a computer program that can self-replicate, making life a fundamentally computational process
  2. Cellular automata and universal Turing machines provide a model for understanding the computational nature of biological systems
  3. Recursion and parallelism are key features that allow complex, intelligent systems to emerge from simple computational building blocks
  4. The 'purpose' or drive of complex adaptive systems may arise naturally from the underlying computational dynamics, rather than requiring an external source
  5. The computational view of life and intelligence challenges traditional notions and opens up new perspectives on the nature of mind and evolution

Topics Discussed

#Computational theory of life  #Cellular automata and universal Turing machines  #Recursion and parallelism in complex systems  #Emergence of purpose and drive in computational systems  #Implications for understanding intelligence and evolution

Frequently Asked Questions

What is "Google Researcher Shows Life "Emerges From Code" - Blaise Agüera y Arcas" about?

The episode discusses the idea that life and intelligence are fundamentally computational in nature. The guest, Blaise Agüera y Arcas, explains how John von Neumann's work on cellular automata and universal Turing machines provides a framework for understanding DNA as a computational system that can self-replicate. The conversation also touches on the role of recursion and parallelism in building complex, intelligent systems, as well as the question of where the 'purpose' or drive of such systems comes from.

What topics are discussed in this episode?

This episode covers the following topics: Computational theory of life, Cellular automata and universal Turing machines, Recursion and parallelism in complex systems, Emergence of purpose and drive in computational systems, Implications for understanding intelligence and evolution.

What is key insight #1 from this episode?

DNA is a computer program that can self-replicate, making life a fundamentally computational process

What is key insight #2 from this episode?

Cellular automata and universal Turing machines provide a model for understanding the computational nature of biological systems

What is key insight #3 from this episode?

Recursion and parallelism are key features that allow complex, intelligent systems to emerge from simple computational building blocks

What is key insight #4 from this episode?

The 'purpose' or drive of complex adaptive systems may arise naturally from the underlying computational dynamics, rather than requiring an external source

Who should listen to this episode?

This episode is recommended for anyone interested in Computational theory of life, Cellular automata and universal Turing machines, Recursion and parallelism in complex systems, and those who want to stay updated on the latest developments in AI and technology.

Episode Description

Blaise Agüera y Arcas explores some mind-bending ideas about what intelligence and life really are, and why they might be more similar than we think (filmed at the ALIFE conference, 2025 - https://2025.alife.org/).

Life and intelligence are both fundamentally computational (he says). From the very beginning, living things have been running programs. Your DNA? It's literally a computer program, and the ribosomes in your cells are tiny universal computers building you according to those instructions.

**SPONSOR MESSAGES**
Prolific - Quality data. From real people. For faster breakthroughs.
https://www.prolific.com/?utm_source=mlst
cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy
Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++
Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst
Submit investment deck: https://cyber.fund/contact?utm_source=mlst

Blaise argues that there is more to evolution than random mutations (like most people think). The secret to increasing complexity is *merging*, i.e. when different organisms or systems come together and combine their histories and capabilities.

Blaise describes his "BFF" experiment where random computer code spontaneously evolved into self-replicating programs, showing how purpose and complexity can emerge from pure randomness through computational processes.

https://en.wikipedia.org/wiki/Blaise_Ag%C3%BCera_y_Arcas
https://x.com/blaiseaguera?lang=en

TRANSCRIPT:
https://app.rescript.info/public/share/VX7Gktfr3_wIn4Bj7cl9StPBO1MN4R5lcJ11NE99hLg

TOC:
00:00:00 Introduction - New book "What is Intelligence?"
00:01:45 Life as computation - Von Neumann's insights
00:12:00 BFF experiment - How purpose emerges
00:26:00 Symbiogenesis and evolutionary complexity
00:40:00 Functionalism and consciousness
00:49:45 AI as part of collective human intelligence
00:57:00 Comparing AI and human cognition

REFS:
What is Intelligence? [Blaise Agüera y Arcas]
https://whatisintelligence.antikythera.org/ [Read free online, interactive rich media]
https://mitpress.mit.edu/9780262049955/what-is-intelligence/ [MIT Press]

Large Language Models and Emergence: A Complex Systems Perspective
https://arxiv.org/abs/2506.11135

Our first Noam Chomsky MLST interview
https://www.youtube.com/watch?v=axuGfh4UR9Q

Chance and Necessity [Jacques Monod]
https://monoskop.org/images/9/99/Monod_Jacques_Chance_and_Necessity.pdf

Wonderful Life: The Burgess Shale and the History of Nature [Stephen Jay Gould]
https://www.amazon.co.uk/Wonderful-Life-Burgess-Nature-History/dp/0099273454

The Major Evolutionary Transitions [Eörs Szathmáry, John Maynard Smith]
https://wiki.santafe.edu/images/0/0e/Szathmary.MaynardSmith_1995_Nature.pdf

Don't Sleep, There Are Snakes: Life and Language in the Amazonian Jungle [Dan Everett]
https://www.amazon.com/Dont-Sleep-There-Are-Snakes/dp/0307386120

The Nature of Technology: What It Is and How It Evolves [W. Brian Arthur]
https://www.amazon.com/Nature-Technology-What-How-Evolves-ebook/dp/B002RI9W16/

The MANIAC [Benjamín Labatut]
https://www.amazon.com/MANIAC-Benjam%C3%ADn-Labatut/dp/1782279814

When We Cease to Understand the World [Benjamín Labatut]
https://www.amazon.com/When-We-Cease-Understand-World/dp/1681375664/

The Boys in the Boat [Daniel James Brown]
https://www.amazon.com/Boys-Boat-Americans-Berlin-Olympics/dp/0143125478

How Something Can Be Said About Telling More Than We Can Know [Petter Johansson et al.]
https://www.lucs.lu.se/fileadmin/user_upload/lucs/2011/01/Johansson-et-al.-2006-How-Something-Can-Be-Said-About-Telling-More-Than-We-Can-Know.pdf

If Anyone Builds It, Everyone Dies [Eliezer Yudkowsky, Nate Soares]
https://www.amazon.com/Anyone-Builds-Everyone-Dies-Superhuman/dp/0316595640

The science of cycology
https://link.springer.com/content/pdf/10.3758/bf03195929.pdf

<trunc, see YT desc for more>

Full Transcript

The new book, so it's called What is Intelligence? Thank you for asking. It was just published by MIT Press, you know, three or so weeks ago. So it's very, it's fresh off the presses. There's an online version of it as well that is free, and very rich. It's full of all kinds of rich media. Chapter one of that book is called What is Life. So What is Life is sort of the single to the album and works as a book in its own right. So that book is also for sale from MIT Press. I think I've explained why I think of life as being a subset of intelligence and why the story of artificial life and abiogenesis and so on is relevant to the story of intelligence and what it is. But yeah, the subtitle of the book is Lessons from AI about evolution and minds and something, something. So, you know, it's basically documenting the time since about 2020 when I got really shocked by seeing that these large sequence models seemed to be generally intelligent and just starting to think through the implications of that. What would it mean if we believe our eyes and that is what intelligence is? What does that tell us about ourselves, about the properties of intelligence more broadly, and the whole intellectual journey that that has taken us on in the last few years? MLST is supported by Cyberfund. So I'm Phelim. I'm the co-founder and CEO of Prolific.
And Prolific is a human data infrastructure company. So we make it easy for people developing frontier AI models and running research to get access to trustworthy, high-quality participants for high-quality online data collection. I am at Google. I've been there for about 12 years now. And I am their CTO of Technology and Society and also the founder of a new research group, well, new-ish, we've been around for a couple of years, called Paradigms of Intelligence, or Pi. It's much smaller than the previous organization that I ran at Google Research. It's about 50 people, so enough to do some real damage. And the idea is to really focus on fundamentals of artificial intelligence and go beyond the sort of exploiting of the current models and paradigms that are working well. We believe in those, but we also think that we have to sort of refill the bucket with new insights and new ideas as well. I've just watched your talk, and you said that life and intelligence are the same thing. They are both computational. What do you mean by that? Yeah, this is a surprising claim. I know it sounds a bit odd. But what I mean by that is, well, let's begin with life and why life is computational. So in the middle of the 20th century, John von Neumann, who was one of the founders of computer science, of course, realized that in order for a robot paddling around on a pond, let's suppose the robot is made out of Legos and its job is to make another robot out of loose Legos that it finds floating around in the pond, just like itself. In order to do that, it needs to have instructions inside itself. So you imagine the tape with instructions for how to assemble a me. And the robot would also have to have a machine inside itself that would be able to walk along that tape and follow those instructions to take the loose Legos and put them together into its own form. And it would also have to have a tape copier so that it could endow the offspring with that tape.
And the tape would have to include the instructions for the copier and the assembler, the universal constructor, as he called it. So the cool thing is that he made all of those predictions, if you like, on the basis of pure theory before Watson and Crick and their unacknowledged collaborators had figured out the structure and function of DNA, before we knew how ribosomes worked, which are in fact exactly that universal constructor, before we had discovered DNA polymerase, which is the tape copier. And in fact, all of those things have to exist in order for an organism to not only be able to reproduce itself, but to do so heritably, such that if you make a change in its genome and what's on the tape, then the offspring will also have that change. The kicker is that this universal constructor is a universal Turing machine. In other words, you have to have a computer inside yourself as a cell in order to make another cell. And DNA is, in this very literal sense, a Turing tape. So it's a very profound insight because basically he's saying you cannot be a living organism without literally being a computer, a universal computer. Very interesting. So are you saying that DNA is basically a computer program? DNA is a computer program. Yes. Very cool. Because many folks in the audience would have been inspired by Conway's Game of Life, for example, and you were talking about computational equivalence. So the Game of Life, of course, is Turing complete because it has an expandable memory. Just as DNA has an expandable memory, the grid size could just keep growing. As you were pointing to, when we see these weakly emergent behaviors, they appear very lifelike. But does that in any way downplay the biochemical and thermodynamic realities in the physical world? Of course. I mean, a cellular automaton like the game of life is a lot less complex than the real world. It's in two dimensions rather than three. 
There's no thermal randomness, which turns out to be very important, actually. And the fact that the computation is deterministic is a little different from real life, where those thermal fluctuations mean that there's always a probabilistic element to things. And you do have to extend Turing's original ideas about computation to make a so-called stochastic Turing machine in order to really do a proper job. But what von Neumann was getting at, and I'm glad you bring up cellular automata because they're really a generalization of the Turing machine to the laws of physics, where essentially every pixel on your game board, if you like, is performing a computation. So it has a state, and it's performing some very simple computation to say what's the next state on the basis of the neighbors. So the idea is that those rules that are determining the next state of a particular pixel are the laws of physics of that universe. And the reason that von Neumann came up with this idea of cellular automata is because he wanted a system that would allow one to do computation, but in which the computation is embodied. And what I mean by that is, in a Turing machine there's a tape and there's a head, and then there are the symbols that are written on the tape. But the symbols are not the same stuff as the tape and the head and the table of rules, right? Those things are abstract, and they're separate from the symbols that are written. Whereas in a cellular automaton, the machine can literally print itself. So it's not just a laptop, if you like, but like a laptop and a 3D printer in one that can print another laptop. So embodied computation is computation where the memory is written and read in atoms rather than in bits. And therefore, the machine can make another of itself. David Krakauer said to me, I mean, you know, we can agree that intelligence, he says it's adaptivity, inference, and representation.
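The cellular-automaton picture sketched above (a grid of cells, each applying the same local update rule to its neighbors) can be made concrete with Conway's Game of Life, which the interviewer mentions. The following is a minimal illustrative sketch in Python, not code from the episode:

```python
from collections import Counter

def life_step(cells):
    """One update of Conway's Game of Life on an unbounded grid.

    `cells` is the set of live (row, col) coordinates. Every cell applies
    the same local rule to its 8 neighbours; those rules are the
    'laws of physics' of this tiny universe.
    """
    # Count how many live neighbours each candidate cell has.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in cells
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 live neighbours; survival on 2 or 3.
    return {pos for pos, n in counts.items()
            if n == 3 or (n == 2 and pos in cells)}

# A glider: five cells that rebuild themselves one step diagonally
# every four generations -- structure propagating by pure rule-following.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
state = glider
for _ in range(4):
    state = life_step(state)
# state is now the same glider translated by (+1, +1)
```

Because the live-cell set can grow without bound, the grid serves as the expandable memory that, as noted earlier in the conversation, makes the Game of Life Turing complete.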
Adaptivity is very important. So yes, DNA is adaptive, but it's very slow. And he was saying that the nervous system and the brain and culture is evolution at light speed, because it allows us to kind of overcome the information transfer with successive generations. So does it make sense to think of the program of intelligence at the DNA level, when so much of the adaptivity seems to be happening higher up? A lot of the adaptivity very much happens higher up. So, you know, in humans, we have cultural evolution, which goes, you know, as David says, at light speed relative to genetic evolution. You know, when I say that, you know, DNA is a Turing tape and that, you know, ribosomes are universal computers that construct life, that's really only the ground level. Or maybe it's level one or level two. I mean, there's physics underneath that. But there's a level three, a level four, level five, and so on. There are computers built out of computers built out of computers. The thing about it is that once you have that ground level, then you can build as many floors above that as you like. The point is that the moment you have life, meaning that you have something that can build a copy of itself, you have a general computer, which allows you to do anything. And what that means is that life from the very beginning is computational and can start to compute in parallel. This gets us into symbiogenesis, which I guess we'll cover in a bit. The fact that the ground floor is computational answers the question, you know, why are brains computational? It's because cells were computational from long before there were action potentials and other fast electrical processes allowing us to think. Can you speak to this notion of recursion that you were just pointing to? 
So Karl Friston, for example, he thinks about this division in systems called a Markov blanket, like a statistical independence. And we seem to observe empirically that complex intelligent adaptive systems have nesting, and you were just kind of speaking to having levels upon levels upon levels. What does that buy you? Is it a kind of recursion? How does that improve the sophistication of the intelligence? Yeah, well, two things happen. One of them is things inside things inside things, and the other one is parallelism. In other words, a lot of things at the same level happening at once. And they're both important. They're both important parts of the story. So first of all, when you have a cellular automaton, like von Neumann was imagining, that's already massively parallel computation because every pixel is like a little computer doing its thing. In the same way that in physical space, there can be molecules in a lot of spots, all of which are doing something. You can think of them as operations, as computational operations, perhaps. And they're all happening at once. So, you know, in your body, you have quintillions of ribosomes. And all of those ribosomes are little tiny universal computers working in all of your cells at once. All of your cells are working at once. But also there is nesting because, you know, you are a person. Of course, you're already a part of a society, which in some sense is an intelligence bigger than an individual human. You're made out of cells. Those cells are made out of organelles. Those organelles are made out of proteins. Those proteins are made out of molecules. You know, that sort of nestedness is really important as well. You're not only a lot of computers working together in parallel, but you're also, you know, a system of computers made of computers. One thing I find fascinating that you've thought a lot about, Blaise, is where the purpose comes from.
Folks like Karl Friston, for example, they have a no-nonsense physics interpretation that it's the second law of thermodynamics at the end of the day. Because there's some notion of valence, that we build these complex adaptive systems, and there needs to be something that drives them forward, something that propels them in a certain direction. And your experiments have shown that this kind of falls out of computation. What do you mean by that? Yes. And to be clear, computation and the second law very much work together here. So the experiment that I did a couple of years ago that really sort of got me started with artificial life is called BFF. It's based on a language called Brainfuck, which is where the first two letters come from. I didn't name it that, although I admit I enjoyed that it was called that. But this is a minimal Turing-complete language designed by a grad student. I think he was a grad student in physics, Urban Müller, in the 90s. It's a very minimal language. It has only eight instructions. I only use seven of them. And the basic setup is that I begin with a bunch of tapes of length 64. So just 64-byte-long tapes, like those Turing tapes or von Neumann's tapes. They start off filled with random bytes. There are only seven instructions. So the great majority of those bytes, about 31 out of 32 of them, are no-ops, meaning that they don't code for any instruction at all. So they start off random and very much purposeless. You have a thousand of them in your soup, and the procedure is really simple. It's just plucking two of those tapes at random out of the soup, sticking them end-to-end, so you make one tape that is 128 long, and then running it. And this modification of Brainfuck is self-modifying, meaning that when you run it, it can modify values on that combined tape. And then pulling the tapes back apart and dropping them back in the soup, and that's it. And you just repeat that process.
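The soup procedure described here (random 64-byte tapes; pick two, concatenate, run as self-modifying code, split, repeat) can be sketched in Python. The instruction set below is a simplified Brainfuck-like variant chosen for illustration; the exact BFF opcodes and head semantics differ, so treat this as a sketch of the setup rather than a reimplementation:

```python
import random

TAPE_LEN = 64  # each tape in the soup is 64 bytes, as in the talk

def run(tape, max_steps=256):
    """Execute a combined tape in place. Code and data are the same
    bytes, so a program can rewrite itself and its partner as it runs."""
    ip, head0, head1 = 0, 0, TAPE_LEN  # second head starts on the second tape
    for _ in range(max_steps):
        if not 0 <= ip < len(tape):
            break
        op = chr(tape[ip])
        if op == '>':   head0 = (head0 + 1) % len(tape)
        elif op == '<': head0 = (head0 - 1) % len(tape)
        elif op == '}': head1 = (head1 + 1) % len(tape)
        elif op == '{': head1 = (head1 - 1) % len(tape)
        elif op == '+': tape[head0] = (tape[head0] + 1) % 256
        elif op == '-': tape[head0] = (tape[head0] - 1) % 256
        elif op == '.': tape[head1] = tape[head0]  # copy head0 -> head1
        elif op == ',': tape[head0] = tape[head1]  # copy head1 -> head0
        elif op == '[' and tape[head0] == 0:       # skip past matching ']'
            depth = 1
            while depth and ip + 1 < len(tape):
                ip += 1
                depth += (tape[ip] == ord('[')) - (tape[ip] == ord(']'))
        elif op == ']' and tape[head0] != 0:       # jump back to matching '['
            depth = 1
            while depth and ip > 0:
                ip -= 1
                depth += (tape[ip] == ord(']')) - (tape[ip] == ord('['))
        # every other byte value is a no-op, as in the experiment
        ip += 1

def interact(soup, rng):
    """One interaction: pluck two tapes, run them end-to-end, pull apart."""
    i, j = rng.sample(range(len(soup)), 2)
    combined = soup[i] + soup[j]     # one 128-byte tape
    run(combined)                    # may rewrite any byte of either half
    soup[i], soup[j] = combined[:TAPE_LEN], combined[TAPE_LEN:]

rng = random.Random(0)
soup = [bytearray(rng.randrange(256) for _ in range(TAPE_LEN))
        for _ in range(1000)]        # a thousand random, purposeless tapes
for _ in range(10_000):              # the experiment runs millions of these
    interact(soup, rng)
```

With this simplified opcode set, replicators may or may not emerge in a given run; the point of the sketch is the shape of the loop, not a guarantee of the phase change.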
If you do that a few million times, you start off with nothing much going on. I mean, again, the huge majority of those bytes are not even instructions. There are only an average of two of them or so on each tape. So the likelihood of them doing anything is almost zero. You might once in a while see one byte somewhere change. But after a few million interactions, something apparently magical happens, which is that suddenly the entropy of the soup drops dramatically. So it goes from being incompressible because it's all random bytes to being very highly compressible, and programs emerge on those tapes. And those programs are complex. They take some real effort to reverse engineer. And you can see that they're occurring in a lot of copies. That's why it's compressible. The fact that they're occurring in a lot of copies tells you what the programs are doing. They're reproducing. They're copying themselves. So what's so cool about this experiment is that it really shows you how life emerges from nothing. And the emergence of life is, in some sense, the emergence of purpose. In this case, what is the purpose of one of these programs? Well, it is to reproduce. If you were to mess with one of those bytes, if you were to change it, you would in most cases break the program. And when you break the program, it no longer functions to reproduce. So, you know, something that can break is something that is functional or that has purpose. Absolutely fascinating. So you said there was a phase change, which was quite sudden. Would David Krakauer acknowledge that as being a form of emergence? I think so. Yeah, I mean, I've actually never asked David that question. We disagree on a lot of things AI related, but I think he would acknowledge that this is a phase change and that it is an example of emergence, yes.
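The entropy drop described above can be observed with a general-purpose compressor: random bytes barely compress, while a soup dominated by copies of the same program compresses dramatically. A small illustrative check, with zlib as a stand-in for whatever entropy estimate the actual experiment used:

```python
import random
import zlib

def compression_ratio(soup):
    """Compressed size over raw size for the concatenated soup.

    Random bytes are essentially incompressible (ratio near 1.0).
    Once self-copying programs take over, the soup is full of repeated
    copies of the same tape and the ratio collapses -- a cheap proxy
    for the sudden entropy drop described above.
    """
    blob = b"".join(soup)
    return len(zlib.compress(blob, 9)) / len(blob)

rng = random.Random(0)
random_soup = [bytes(rng.randrange(256) for _ in range(64))
               for _ in range(1000)]        # 1000 random 64-byte tapes
taken_over = [random_soup[0]] * 1000        # 1000 copies of a single tape

print(compression_ratio(random_soup))       # stays near 1.0: incompressible
print(compression_ratio(taken_over))        # tiny fraction: highly compressible
```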
I think he would because he has a bunch of criteria, but one of them is a fundamental coarse graining and reorganization of the micro-substrate such that the new phenomena can be described with a simple new variable. And this seems to match that description. Is it possible that there's some kind of design bias? You know, like when we design machine learning architectures, there's so much information in the architecture. And in this case, there's so much information in the Brainfuck language and the terms and so on. Could that have kind of influenced it to emerge in a certain way? Yes, yes. And the structure of those programs does change depending on the language. We have tried this with other languages. We've tried it with Z80 assembly language, which is the assembly language of these Zilog chips that were invented sometime in the 70s and just got discontinued last year, a very long-running microprocessor architecture. So the phenomenon is very generic. What those programs look like is shaped by the specifics of the language. But the reason that those programs emerge, the reason that they develop purpose, is actually thermodynamic. So, you know, that might seem puzzling because you would think thermodynamics is about things becoming more random. And apparently the exact opposite is happening here. You start with randomness and you get order. And, you know, how could that be? Well, I think the answer was actually well characterized by a chemist, an organic chemist, Addy Pross, at Ben-Gurion University of the Negev in Israel, who is now emeritus. And he did a lot of work on so-called dynamic kinetic stability. The idea being that it's an extension of the second law that says things seek their most stable state, their most stable form. Usually we think about those stabilities as only being fixed points, but those stabilities can be cycles too.
So if something dynamically makes itself, if something forms more copies of itself, that's more stable than something that just settles. It's like the old joke about DNA being the stablest molecule in the universe. Obviously DNA is fragile, but at the same time, if the DNA makes more DNA, then it will be around a long time after granite, which can only erode. In terms of this valence question, does that imply to you that there is a natural drive to survive almost? I mean, for these systems to kind of maintain their existence, assuming that's a primary force, they would need to have a degree of sophistication. They would need to be doing modeling. They would need to be doing sophisticated things. But is that something that just always, you know, is it a convergent property? Yes, it is. In that sense, evolution is the second law at work. Meaning if you have a bunch of things that are not copying themselves in the BFF soup, and you have something that emerges that can copy itself, then that thing that can copy itself will write over the things that can't copy themselves, which means it's more fit or more stable, if you like. And so that is written into the laws of statistics in just the same way that the second law is. It's just the kinetic or the cyclic form of that same law rather than just the steady state. You said in your talk that merging is more important than mutation. Tell me more. Yes. So the usual thing, what we learned in school, was that Darwinian evolution consists of mutation and selection, or what Jacques Monod, the Nobel laureate, called chance and necessity. So, you know, in other words, mutations maybe from cosmic rays or whatever to our DNA sort of throw spaghetti at the wall, and whatever sticks is what remains, whatever doesn't kill us and whatever hopefully makes us stronger. That was my assumption as well, you know, starting out with these BFF experiments.
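The "write over the things that can't copy themselves" dynamic can be illustrated with a toy fixed-size population in which only replicators propagate. This is a sketch of the kinetic argument, not a model from the episode:

```python
import random

# A toy version of "whatever copies itself writes over whatever doesn't":
# a soup of inert tapes plus one replicator. Inert entries never copy
# themselves; the replicator overwrites random slots with copies of itself.
rng = random.Random(42)
POP = 100
pop = [False] * (POP - 1) + [True]   # one replicator among 99 inert tapes

steps = 0
while not all(pop):
    i = rng.randrange(POP)
    if pop[i]:                        # only self-copying things propagate
        pop[rng.randrange(POP)] = True
    steps += 1

# Nothing ever overwrites a slot with an inert tape, so the replicator
# count never decreases and the soup always ends fully taken over.
```

The fixation here is guaranteed by construction; the interesting claim in the conversation is that the same statistical logic, applied to things that cycle rather than settle, is just the kinetic form of the second law.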
And so I had a mutation rate, you know, where a byte could change at random with probability 1 in 10,000 or something, you know, with every interaction. And then I began playing with the mutation rate and found that this emergence of these complex programs occurred even when the mutation rate was turned down to zero, which is really a surprising finding. So it tells you that this emergence of purpose comes about, you know, even without any random changes in the code. It's not explainable in purely Darwinian terms. But the other things that are not explainable in purely Darwinian terms are the emergence of life in the first place; this greatly puzzled Darwin. He thought this problem of abiogenesis or the emergence of life was just impossible to reckon with. You might as well talk about the origin of matter, is how he put it in one of his letters. And the other thing it can't explain is the increases in complexity that occur. You know, why is life now more complex than bacterial life, you know, a billion years after it began on Earth? You know, why do we have human societies now? And if we go back, you know, 100 million years, we had only, you know, things with much simpler brains. We had octopuses, they had pretty complex brains, but anyway. But, you know, the tendency has been toward greater complexity. There are some people who have argued against that. You know, famously, Stephen Jay Gould has said things like, you know, everything on Earth is the same amount evolved. We've all been evolving for, you know, three billion years. It's all equally evolved. I think Gould was wrong when he said this. The reason being symbiogenesis.
When you have a eukaryote formed by a mitochondrion finding itself inside an archaeon and then becoming a eukaryote, that resulting composite organism is more complex than either of the two parts that made it up, in the same way that a spear is more complex than a stick and a stone point. You put two things together; you now have something more complex than the parts. And if this idea that symbiosis or symbiogenesis is an essential part of evolution is correct, then you absolutely get more sophisticated things coming about later in evolution because they're being put together from pre-existing parts. Yeah, I wanted to touch on, because I said the same thing on the show many times, inspired by Kenneth Stanley, actually, that we see this monotonic increase in complexity in evolution. In standard Darwinian evolution, there is no reason for things to become more complex. So in other words, if you just do the spaghetti throwing at the wall thing, then you could get simplifications or complexifications, and there's nothing to favor one over the other a priori. So ordinary Darwinian evolution can either make things simpler or more complex. But symbiogenesis, which is this coming together of parts to make wholes, and these major evolutionary transitions that you just mentioned, this is the theory of Eörs Szathmáry and John Maynard Smith that they published in Nature in 1995. They only had like eight of them in their original paper, and they've since extended it to maybe a dozen.
You know, but things like single cells coming together to make bodies, individuals coming together to make colonies, the emergence of sexual reproduction, the endosymbiosis of chloroplasts and of mitochondria, there are a few others, right? Those are very clearly steps upward in complexity. And the reason that it's trivial to prove that there are steps upward is because if you have A, which is reproducing and can make more of itself, and you have B, which is reproducing and can make more of itself, think of them each as having a tape, right, that says how to make a me. Then when they come together, the result has to both know how to make A and how to make B and how to put them together. And that little extra bit of information, how to put them together, is what makes the whole necessarily more complex than the parts. So that is the latter. And where I go beyond what Maynard Smith and Szathmáry say is that for them, major transitions are a rare and exceptional event. But I think that if you look more closely at the way biology works, that's just the tip of the iceberg. Those are just the really big transitions that involved two large, highly consequential things merging in some way, or many cells, you know, merging into something qualitatively extremely different. But when you look more closely, you see horizontal gene transfer in bacteria all the time. That's also a form of symbiogenesis where, you know, parts of one thing get muddled up in another. You see horizontal gene transfer in eukaryotes like us all the time. Apparently a quarter of the cow genome is this BovB transposon, which has jumped around among lizards and all kinds of other animals as well. Viruses do this all the time. Retroviruses insert big chunks of their genomes into ours. And the big shock when you look at our genome when it was first sequenced in 2001 is that only one and a half percent of that is even our proteins. And what the hell is the rest of it? There's the junk DNA.
Now, we know it's not all junk. A lot of it has regulatory functions and so on. But even so, there's a lot of other stuff in there. And a huge amount of it is retrotransposons and retroviruses that have been endogenized. They serve functional purposes in many cases. The mammalian placenta was made out of a virus related to RSV, which fuses lung cells together and can make babies sick. That fuses together the cells in our placenta to make this blood-blood barrier. Or there's Arc, which is virus-derived. We don't really understand how it works, but it lives in our brains, and we know that if we knock it out in mice, they can't form new memories. And it goes on and on. And especially in the last decade, we find more and more examples of functional instances of bits of genome from one thing ending up in another and changing it. One thing I want to touch on is the importance of the merge operator. We were talking about that earlier, and even Chomsky spoke about this. And you could argue whether the merge operator in language evolution was the Prometheus moment, whether it was phylogenetic or ontogenetic. Because you were just talking about this symbiosis and merging in a physical substrate, but it also happens in the informational substrate: you get these kind of memetic computer programs that ensconce themselves, and maybe language was that, I don't know. But I have a theory about why merge is so important, as opposed to random selection. I think creativity is about grounding; it's about path dependence, basically. Even the retroviruses and all of these things, they actually form a lineage, and I think that if you don't use merge, you lose the lineage. And also, something about the recursive merge operation allows you to build more complex computer programs, by allowing for this kind of reuse and canalization. There's something very natural about that. Yeah, I completely buy everything that you're saying.
I think that's exactly right, except that I dislike Chomsky. So I think he's wrong. He's wrong about language. I'm much more of a fan of Dan Everett. I don't know if you're familiar with his work with the Pirahã. Oh, it's wonderful. So he spent a long time with the Pirahã in Brazil, who are a people whose language does not obey Chomsky's requirements for language. They don't have recursion. They don't have anything like center embedding. They also don't have numbers, and they don't have past and future tenses. Everett wrote a great book some years ago called Don't Sleep, There Are Snakes, which talks both about his experiences among the Pirahã and their language, and also his big fight with Chomsky over this. Chomsky's papers are filled with theory and pseudomath and have no time to give to ethnography or to actually studying any real languages. But anyway, I'm digressing. Putting Chomsky aside, though, what you're saying about merge, or as I would see it, composition, functional composition, I think is absolutely fundamental. It's how all technology is built. W. Brian Arthur has written about this and how technology evolves. You know, it's funny, every technology gets invented a dozen times around the same time, as if everybody's in telepathic communication. And the reason is that every technology has precursors. You can't get a light bulb until you know how to blow glass, how to make a vacuum, how to draw a filament, how to generate an electric current. And when all those things were there and the need for light was there, the light bulb was going to get invented. But it was invented a dozen times by different inventors with different contingent choices. They might be: which kind of filament do you use? Is it prongs, or do you screw in the light bulb? Which way do you screw it in? What's the diameter?
And those decisions, as they get locked in, determine the course of everything after that that incorporates light bulbs. So, in a way, this contingency, these choices about exactly which way those combinations go, is actually what the entire genome, or whatever it is, is made out of. In the case of BFF, the original replicators are really just single instructions that sometimes, randomly, weakly, might copy themselves, one byte that moves here and there. But as those bytes get copied around, sometimes a couple of them end up together, and then they'll copy as a group; they'll do better together. And so the contingent thing, which way they ended up getting copied, was it AB or BA that they stuck together, that's the information that the bigger thing is made out of, that little extra bit. Because in this case you just had single bytes; there was no information there to begin with. So the merger tree ends up being exactly the information that is encoded in the final genome. It's all about the history. Yes, absolutely fascinating. I mean, in a sense I'm surprised you're not a fan of Chomsky, because he was talking about automata and Turing completeness, and he was the ultimate computationalist. And in a sense, what you're describing is Chomsky's ideas just applied lower down the stack. That's right. So in that sense, I think he was correct. But I also think all of those ideas were already there in von Neumann in the 1950s, even in Nils Aall Barricelli, the first artificial life researcher, who worked on one of von Neumann's machines. I think he sort of snagged time on the MANIAC to do some of his first A-life experiments. They're kind of pseudo-documented in Benjamín Labatut's book, The MANIAC. It was really fun.
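That byte-soup dynamic can be caricatured in a few lines (a drastic simplification invented here for illustration; the real BFF system evolves interacting, self-modifying Brainfuck-like programs):

```python
import random
from collections import Counter

random.seed(0)

# Invented toy: a soup of bytes where random short spans get copied over
# random destinations. Whichever contingent orderings happen to get copied
# early come to dominate; copying never creates new byte values, so the
# surviving motifs are exactly the path-dependent "history".
SOUP_SIZE = 256
soup = [random.randrange(256) for _ in range(SOUP_SIZE)]
initial_values = set(soup)

def copy_step(soup, max_span=4):
    """Copy a random short span of the soup over a random destination."""
    length = random.randint(1, max_span)
    src = random.randrange(len(soup) - length)
    dst = random.randrange(len(soup) - length)
    soup[dst:dst + length] = soup[src:src + length]

for _ in range(20_000):
    copy_step(soup)

# Diversity can only shrink under pure copying; a few motifs take over.
print(len(initial_values), "->", len(set(soup)), "distinct byte values")
print(Counter(zip(soup, soup[1:])).most_common(3))
```

The point of the sketch is only the ratchet: overwriting-by-copying loses variety monotonically, so the contingent early copies are frozen into everything downstream.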
Or no, that was in his first book, I think, When We Cease to Understand the World. But anyway, my point is those ideas were there before Chomsky. The thing that Chomsky really pushed, during his reign of terror over linguistics (sorry, I'm being a little bit mean), was the movement in artificial intelligence that we now call GOFAI, good old-fashioned AI, which held that you could formalize what AI is as grammars and programs. Which turned out to be wrong. That turned out to be a false start in AI, and why there were so many AI winters. There seems to be a bit of a tension there, because the GOFAI folks had some very interesting ideas. I mean, I'm a big fan of Fodor and Pylyshyn, for example, and they spoke about this strong compositionality. We have semantics and intentions, and it's possible to build these cognitive representations. But we have the issue that we can't really design them to represent the world in a high-fidelity way, and we have the semantic divergence. And then you're pointing to this very interesting constructive thing, and I think a constructive form of AI and compositionality solves a lot of problems, because of this path dependence problem and this canalization that we're talking about: when you build intelligence brick by brick, you can build artifacts of incredible sophistication. But unfortunately, we can't design the artifacts to do exactly what we want. We can gently steer them in a certain direction. And even with Friston, I feel that even though he's talking about the what of intelligence, prediction and adaptivity, I think the implementation matters. I think adaptivity means structure learning. I think there's something about having a substrate which actually does this form of composition that you're talking about. That seems to be a mechanistic, necessary condition for intelligence. Yes.
I think in many ways, what we're talking about is sort of the tension between analog and digital ways of thinking, or bottom-up and top-down ways of thinking. So for instance, let's talk about how you would recognize a bicycle. In the good old-fashioned AI world, you would say, well, you've got a circle detector and a line detector that will detect the lines that make up the frame of the bike and so on. And you'll hand-write code for all of those things. And of course, the problem is that there are many ways of looking at a bike where you're not going to see the wheels at once, or maybe the bike is of a weird design. There are those funny bikes that have shoes instead of wheels, you know. And when you look at one of those in a gestalt sort of way, you recognize a bike immediately, even if all of the rules are broken, as it were. And that's really important, because when you're looking as an intelligent being at the world, you have to cluster, you have to find regularities in the world whose shapes are not well defined by a set of rules. They're not just carved up by hyperplanes; they're blobby. And so intelligence requires methods that are very neural-net-like, that look more like continuous function approximators. And that's why gradient descent is a good idea, for instance, and learning these things via smooth functions is a good idea. Trying to encode them with rules never worked out well. Now, on the other hand, DNA is discrete, right? There are four symbols, and you order them in a certain way, and that's it. It doesn't mean that there's no randomness in the way proteins are folded and so on. But composition at the level of DNA really does have to do with chopping up programs essentially made of discrete symbols and inserting bits of code and so on. So when you're looking from the bottom up, it's a very, very quantized world.
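The bike-recognition contrast can be caricatured in a toy sketch (entirely my own construction; the feature space, exemplars, and threshold are invented):

```python
import math

# Two ways to decide "is this a bike?" in a made-up 2D feature space:
# a rigid hand-written rule (GOFAI-style hard boundary) versus a smooth,
# exemplar-based similarity score (continuous function approximator).
exemplars = [(1.0, 1.0), (1.1, 0.9)]   # "typical bikes" (invented points)

def rule_is_bike(x, y):
    # Hard carving: membership is a box in feature space.
    return 0.5 <= x <= 1.5 and 0.5 <= y <= 1.5

def smooth_is_bike(x, y, threshold=0.1):
    # Blobby carving: graded Gaussian similarity to known examples.
    score = sum(math.exp(-((x - ex) ** 2 + (y - ey) ** 2))
                for ex, ey in exemplars)
    return score > threshold

weird_bike = (2.5, 1.0)   # "shoes instead of wheels": breaks the rule
print(rule_is_bike(*weird_bike))    # the rule rejects it outright
print(smooth_is_bike(*weird_bike))  # the smooth score still accepts it
```

The rule fails closed on anything outside its box, while the smooth score degrades gracefully, which is the gestalt-recognition point being made above.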
But when you start to look at giant complex things like us from a high level, you have to begin from a more continuous perspective. I think you've hinted that there are natural convergent patterns in computation. I mean, can we sort of get a convex hull of your philosophy? We could try. I mean, I hesitate to say I'm an anything-ist, but probably functionalist comes closest. Functionalist. Yeah. So the reason for that is that, you know, in the old days, in the 19th century, we used to think that, you know, to be alive meant that there was some vital spirit or vital force, you know, that living things have and dead things don't. And as we started to figure out that the laws of chemistry were the same for living things and dead things, and, you know, urea can be synthesized in a test tube and so on, you know, those ideas really went out of fashion. And we went into a very strict materialist kind of perspective, right, where everything is just physics. And, you know, I mean, I was trained as a physicist, you know, I think I believe in physics fully. But I also think that there is more to life in the sense that, you know, if everything is just physics, then you have no way of saying what it means for you or me to be alive. And to understand what that is, what it means to be alive, I think you have to come to grips with the idea of purpose. You have to bring teleology back into the equation. What I mean by that is, you know, a kidney is not just a collection of atoms. It's an organ that performs a function, right? The function is to filter urea. And if you implant an artificial kidney that works on totally different principles, but also filters urea, it's an artificial kidney. You know, it's still meaningful to say that. So that means that there is something about that word kidney that means something that goes beyond the matter that the kidney is made out of. You know, conversely, if I come back from the future and show you an object and you're like, what is that? 
And I tell you it's an artificial kidney. There's nothing about this set of weird carbon nanotubes and so on inside that would say to you "kidney." It's just that, if you happened to implant it in a body and sew it in in the right way, then all of those relationships would show up in the right way for your body to persist. So this idea of things serving functions for other things, and functions only having meaning in the context of yet other functions, there's something ecological about this idea of functions. I think this is really central. A rock on an inanimate planet has no function. If I break it in half, I now have two rocks. But a living thing has function. And the hallmark of function is multiple realizability, just like Turing talked about for Turing machines, because Turing and von Neumann are functionalists. Meaning that if you have a need to make ATP for energy inside your cells, you're going to have multiple pathways for doing it, because sometimes the aerobic way works, sometimes you need the anaerobic way. Whenever you start to have multiple pathways, you know, wings in insects, wings in bats, you know that there is a function in play. The alternative position would be essentialism. Folks like Anil Seth and John Searle think that certain types of material have a certain type of causal graph, so, for example, brains might give rise to consciousness, and if we simulated a brain, it wouldn't have the same causal graph, therefore it would be different. But I would like to, I mean, we'll just park that for the moment. It seems a little bit like you're talking about this like a computer software architecture diagram. And it's like that ship of Theseus type thing, where we can kind of swap things out. And is it still the same thing?
But I think path dependence is very important. So the kidney evolved; it has this rich phylogeny of evolution. And when you replace it with something which came from a different substrate, which has a different provenance, then it's almost like: it is a kidney now, and it works now, but it breaks the ecology. Imagine in an ecology, if I swapped a plant out with an artificial plant, and I kept doing that, it might work now, but doesn't that affect its future trajectory? Yes, it does, but that's exactly what symbiogenesis is all about. Often you will have a repurposing of something that was designed, if you like, by nature. And one of the cool things about the BFF experiment is that it shows you how, if you like, intelligent design can happen without any intelligent designer. But something that was designed for one purpose, or to serve one function, can come back around and serve another function. And yeah, that brings a whole different contingent history with it. That RSV example that I gave, the ability to fuse cell membranes together, came from a virus whose original purpose had nothing to do with building placentas. But it gets incorporated and repurposed, and this is the kind of bricolage that life is made out of. So yeah, I think that that kind of not only replacement, but parallel pathing, etc., doesn't just happen when we make artificial kidneys. It's happening all the time in nature, and it is the very hallmark of life. So yes, I disagree strongly with Anil Seth and with John Searle on this point. The brain of Theseus kind of experiments that you've alluded to, the idea that if you took an emulator or a simulator of a neuron and you plugged it into your brain so that its inputs and outputs are connected to the other neurons, then the other neurons wouldn't know the difference. Well, what if you do that for half of your neurons? For all of them? Will your consciousness get dialed down even if you behave the same way? Of course not.
You know, for me, your consciousness is obviously a function of the functions, of the relationships of all of those things with each other. It doesn't mean that it's so simple as a computer program where you can just, you know, substitute a subroutine for another one. I mean, we've made computers very kind of abstract in that way, you know, and biology is wet and messy. The interfaces are complex and hard. But this same idea of multiple realizability and repurposability is the very stuff of life. What is your position on consciousness? So what is it? What's its purpose? Is it epiphenomenal? Can it be measured, et cetera, et cetera? Yes, great question. So I think that the idea of philosophical zombies, which David Chalmers has talked about, that maybe something could behave just like you or me, but be dead on the inside, not have any experiences, not feel anything, is actually a lot less coherent than it sounds. So I'm a functionalist about consciousness too. And what I mean by that is twofold. One is that I don't think consciousness is some kind of epiphenomenon that, you know, just weirdly, you know, we happen to have for reasons that have nothing to do with our behavior, nor do I think that it is somehow tied to anything about the way we're physically made. I think it is functional. So why do we have it? Well, in my team, Paradigms of Intelligence, we've been doing a lot of work over the last year on multi-agent reinforcement learning. And the reason is that we're very interested in the precondition for symbiogenesis, which is symbiosis, cooperation. You know, when two things or 700 things or whatever start to cooperate closely, you know, that's the beginning of them really fusing together and becoming one thing. And in order for two agents that are intelligent to cooperate, it turns out they have to have theory of mind. They have to model each other. They have to be able to put themselves in the place of the other. 
And we have a whole long theory called mu pi about how that all works. But, you know, and I guess the CliffsNote version of it is that it requires that you do induction over a universe that includes not only the game that we're playing, but that also includes what is happening in your head and what is happening in my head. In other words, you have to have a universe that includes yourself in it and the other in it, and that allows you to generalize over the class of you and me, you know, so that I know, you know, my internal state is happy when I smile. And when I see you smile, I know that you're happy as well on the inside. I can make that inference, you know, in the same way that if I see a bunch of peaches, you know, then I know that they're all the same object and I know what the backside of it will look like and so on. So this ability to do psychological, you know, induction is really important for cooperation and that's why we have it. So, you know, and one of the consequences of that is that we model ourselves and we model our own models of others, models of our models and so on. There's a kind of a strange loop as Douglas Hofstadter would have called it in that. Yes, I love Douglas Hofstadter. So there's this kind of self-modelling and then second-order self-modelling and third-order self-modelling, which could be applied to other agents. And of course, you know, in the real world, we are computationally bounded, right? You know, we can't make sense of all of the complexity. So when we do this modelling of other agents, our modelling is quite cartoonish and it's quite structured. And it only goes up to sixth order as well, at most. Oh, interesting. Interesting. I mean, how does this affect, I mean, we haven't really spoken about agency yet. Presumably you could have a strong agent, which is just doing something quite trivial. 
But when we have this collective intelligence and this information synchrony between agents, how does that affect your ideas of purposeful behavior? I sometimes use the example of rowing to describe what's happening when purposes merge into a single purpose and consciousnesses merge into a single consciousness. There's this term that I learned from Daniel James Brown's book, The Boys in the Boat: swing, which is when the six oarsmen, or eight oarsmen, sorry, all achieve this kind of state where they're in perfect sync with each other. And you know it when you experience it. The boat acquires a soul, as it were. You all feel like you're pulling as one. And boats with that property go a lot faster than boats where people haven't quite achieved that sync. That, I think, is kind of what happens when we think of ourselves as being a self, despite the fact that our brain actually consists of a lot of parts, in the same way as the oarsmen, parts that in some sense began with their own purposes and their own selves and their own models of the other parts of the brain. But this process of subjective symbiogenesis, I guess you could call it, is where all of those wills become one and all of those selves become one self. In hiring, for example, you want folks with high agency, but you also want alignment, which is the potential for this kind of synchrony.
And we often have it. We do a thought experiment on MLST: you can look at a boat, or a flotilla of boats, and you're trying to draw a boundary, and the boundary for the agent should be the minimal description. It should be: where is most of the agency, where is most of the planning and future modeling happening? And usually it's the pilot, it's the driver of the boat. But you're talking about the situation where there is such a synchrony and alignment between the agencies that, almost, the best intentional stance, if you like, is to draw a boundary around all of them. Yeah. I also think that there's not necessarily a single right answer. So in my book, What Is Intelligence?, I talk about a few interesting cases. One of them is, for instance, the conjoined twins Abby and Brittany Hensel, who, I don't know if you've seen them on YouTube or the TV shows. Fascinating case. So these are two people who share one pair of arms and one pair of legs. Each of them controls one arm and one leg. So they're in a sort of three-legged race, two-legged race. They often speak in synchrony. And they play volleyball and sports and stuff. They drive a car. They can write emails, no problem. And they also sometimes have differences of opinion. So they'll sort of come together and apart in a remarkably fluid way. And all of that is done purely with behavioral cross-cuing, as Michael Gazzaniga would call it, meaning their nervous systems are separate: separate brains, separate spinal cords. So in that case, they are able to model each other extremely well, because their entire lives they've been right next to each other. Another interesting case would be split-brain patients of the kind that Gazzaniga spent a lot of his career studying. And those are cases where, in adulthood, the brain is essentially cut in half.
So each hemisphere can only see the left or the right hemifield, controls one arm, one leg. And the most fascinating thing about these split-brain experiments is that from the outside point of view, it is obvious that there are two consciousnesses in there. Each hemisphere is conscious of different things. You can make disjunctions between what shows up in the left and right hemifield, and the left and right hands can be drawing different things, and so on. But if you talk to somebody who's a split-brain patient, they're always like, yeah, I'm still one person. They will never admit that there are two people in there. So is there somebody who is right and somebody who is wrong? No. This is entirely relational. It's a relational description. And the fact is that for them, they're the same person they always were; just occasionally something takes a little more work. Occasionally one hand will be buttoning the shirt while the other hand is unbuttoning it. It's just an inconvenience. There are split-brain-like experiments as well, even just with a normal brain. And I can believe that we are sort of separately conscious in different parts of our brain. You get out of bed in the morning and you must be a slightly different person, but we kind of gloss over that, don't we? Absolutely. We make a narrative. The coolest experiments about this, I think, are the ones from Petter Johansson at Lund University. So he was the one who discovered choice blindness. In these experiments, a subject is, I think the very first one was face choice blindness. So you'd be shown two faces on cards and asked which one is more attractive. And you pick, and every so often, the one that you're handed to then explain why you thought that face was more attractive is the one you didn't pick. So there's a kind of sleight-of-hand trick.
And the cool thing is that very, very few people notice that they're being handed the wrong face, and there is no difference in the fluency or the latency of the description. You have an inner lawyer ready to spring up and justify whatever choice you made, even if it's not the choice you made. And that narrative that you invent then influences your future choices. It's as if we all make up a story about ourselves. And of course, the reason is that we're all split-brain patients in a way. The left-hemisphere interpreter that generates the speech is likely not the same part of the brain that actually did the choosing. And yet all of those parts of your brain are invested in the idea that they're all in the same boat, that it's all one me. So they're all covering for each other, in the same way that in a split-brain patient, if you show "stand up" to the hemisphere that isn't the left-brain interpreter, the person stands up, and you ask them, why did you stand up? And they'll say, oh, I was thirsty. I'm going to the kitchen for a drink of water. Same thing. Artificial intelligence, it's becoming more sophisticated. And there's the social question. And I suppose actually you can think of it as a ship of Theseus for society. So we're going to be having agents embedded in society, and we're going to form a large collective intelligence. Do you worry about that future? I mean, what do you predict is going to happen? Well, there are certainly things that I worry about. I don't want to come across as a Pollyanna. I'm worried about polarization. I'm worried about disinformation. I'm worried about our political and economic systems not necessarily being fit for purpose in the world that we'll all be living in in 20 years. But I'm certainly not concerned about a lot of the kinds of things that I hear Eliezer Yudkowsky talking about, for instance.
And in particular, one of the reasons that I feel very differently is because I feel like human intelligence, in the usual sense that we think of it, is already a collective phenomenon. We're not that smart individually. We're not that much better individually than our primate cousins. It's only because we get together in large societies of millions and billions of people that we can do these amazing things, that we can transplant organs and go to space and so on. Individually, we're just not all that. So for me, AI is actually a part of human intelligence. It's literally already the same thing. I find it very interesting that we only achieved general AI when we began to literally train the models on reams and reams and reams of human language. So AI was human intelligence from the start. Because I suppose the thesis of Eliezer is that it is possible to have artifacts which are dramatically more intelligent than we are. Maybe you think there's some kind of a limit, but do you think in principle that we could build artifacts which are significantly more intelligent? Well, I think that collective humanity is already vastly more intelligent than individual humans, and in many cases operates at very different timescales, for instance. I think these things are already true. Now, in a sense, our biggest difference is about thinking about it as an other versus as already a part of ourselves. What do we even mean by human? There was a wonderful paper from 2006 called "The Science of Cycology." I'm not remembering her name, but she's a psychologist. She asks people to draw bicycles. First to say, do you know how a bicycle works? Everybody says, yeah, of course I know how a bicycle works. Okay, draw one. Nobody can draw it.
Even if it's just looking at a sketch of a bicycle and saying, okay, where does the chain go? Or where are the pedals? Most people don't know. They make some very fundamental error in this. And it's a very funny paper. But the point is we all have these illusions about what our own knowledge, our own capabilities, our own intelligence are. We already have swing, in the sense that we identify what we think of as our intelligence with something that is actually in a bunch of other people and a bunch of other stuff around us. And we do that kind of unconsciously. So for me, there's not really a discontinuity between what's already going on and AI. It's really just more of that. Interesting. I think they would make the argument that you could build a single artifact which is more intelligent than the totality of humans. But just parking that to one side, I spoke with Judith Fan, she's a wonderful professor at Stanford, and she's done studies on drawing, comparing how humans draw to computers, using CLIP models and stuff like that. And she found something fascinating, which is that we have quite an abstract understanding when we make sketches. She was kind of grading it on, you know, progression one, progression two, progression three, and we start very coarse and very abstract, and AI systems start with the edges and the details. And that, to me, indicates that AI models today don't really understand things at a very deep, abstract level like we do, perhaps because we have this compositional synthesis of knowledge that we were alluding to earlier. Do you see that as a gap? There are a few questions, I guess, hidden in there. One of them is: do I think of LLMs, of today's sort of frontier models, as being less than, or different than in some basic way, our brains? What are those gaps? So first of all, I mean, they're obviously very different.
I mean, their architectures are different. They're trained in a very different way. For me, the remarkable thing is actually how convergent a lot of their properties are with those of brains, despite all of that. The fact that you find internal representations in many of them that surprisingly resemble ones that you can measure in human brains: Brain-Score-type measures from Martin Schrimpf and co. Or sensory modalities in humans can be reproduced remarkably well, even by models trained on pure language, which is really remarkable. It speaks to how much is encoded in language, how much of what is encoded in language is a reflection of architectural properties of our brains and umwelts, and how much of that is then reconstructed essentially by those models. Now, the question of what we draw first when we draw a picture, and how that all works: remember that image synthesis models, like CLIP-guided ones or what have you, are working in pixel space to begin with. And diffusion models, by the way, work very differently from various other kinds of models. We now know that you can drive a robot with a transformer. So if you give one of those robots a paintbrush or a pen and you say, now draw, what it will draw is going to be very, very different from what you get from a diffusion model that starts filling in pixels. And for that matter, all of that is different from what happens in your own head when you're visualizing something. So I think a lot of this is not so straightforward to analyze, because of all those differences in the way that the IO and the representation space work. I do think that today's models are highly compositional.
Even with a lot of those original image synthesis models, the fact that you could say "a teddy bear at the bottom of the sea playing with a Speak & Spell," or whatever, and it would do it, tells you that they can compose. Again, are their capabilities like ours? No. There are definitely places where they're better, places where they're worse, places where they have surprising gaps. So it's different, but I wouldn't say that there's a fundamental lack of composition there at all. I think if anything the biggest gap between transformer-based models and what we do is actually narrative memory: being able to form long-term memories and, that way, have a kind of persistence of a self over long periods of time. They don't have that yet.

I'm conflicted. You were pointing to this universal representation hypothesis; I think Chris Olah popularized it with some of his visualization experiments, and it's true the representations are very convergent. But other things lead me to believe that the models produce these kinds of superficial imposters: they give you exactly the right answer, but for the wrong reasons. One of the hints of that is that when you do variations on the input, it's not robust. There's the Turing machine argument as well: these LLMs are finite state automata, but they can access tools which are Turing complete, so perhaps we could say the system is Turing complete. But I don't believe that ChatGPT is effectively searching the space of Turing machine algorithms; it hasn't been trained to do that. Yet it is surprisingly robust on the ARC challenge; it can actually do really well, especially if you do some evolution and some refinement and so on. So it feels like we're knocking on the door, but there's something missing.

I think that in many of those cases, we're not doing a fair human comparison.
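The finite-state-plus-tools point can be illustrated with a toy of my own (not anything from the conversation): a controller with a fixed, finite set of states cannot recognize balanced parentheses on its own, but give it access to an external unbounded counter, the "tool", and it can. A single counter only gets you a counter automaton, not full Turing completeness (that would need, say, two counters or a tape), but the principle is the same: the unbounded memory lives outside the finite controller.

```python
def balanced(s):
    """Finite controller driving an external unbounded counter.
    The controller itself stores only a constant amount of state;
    all unbounded memory lives in the 'tool' (the counter)."""
    counter = 0                      # the external, unbounded tool
    for ch in s:
        if ch == "(":
            counter += 1             # tool call: increment
        elif ch == ")":
            counter -= 1             # tool call: decrement
            if counter < 0:          # tool reports underflow: reject
                return False
    return counter == 0              # accept only if the tool is back at zero

print(balanced("(()())"))
print(balanced("(()"))
```

Balanced parentheses is a classic non-regular language, so no finite state automaton alone can decide it; the external counter is what lifts the system's power, just as tool access lifts an LLM's.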
This is a little bit similar to our illusions about knowing how bicycles work. I hear a lot of people say things like, well, look at this case where we just flip the logic, we change it from "do" to "don't," and then it gets it wrong 30% more often, and so on. My first question is always: have we done the human baseline? It turns out that surprisingly often the human baseline shows the same property. And this doesn't mean that humans are incapable of doing the fully robust, fully general version of these things. If you're a logician, or if you think about it carefully, you can really write down your premises and be super robust to flipping the nots in the way something is formulated. But most of us don't operate that way most of the time, and we're highly susceptible to logical illusions, cognitive illusions, et cetera, which turn out in many cases to be surprisingly similar to the machine case. So I'm kind of unmoved by a lot of those, and I think often we're being a little sloppy about how we do it. It's certainly the case that transformers aren't searching systematically over all possible Turing machines; we don't know how to do that. You have to take shortcuts of various kinds in order to make that whole problem of induction over programs computationally tractable, whether you're a brain or a transformer.

Blaise, thank you so much for joining us today. It's been an honor.

Thank you. Thank you for the really thoughtful conversation.