Machine Learning Street Talk

The Day AI Solves My Puzzles Is The Day I Worry (Prof. Cristopher Moore)

Machine Learning Street Talk (MLST)

Thursday, September 4, 2025 • 1h 34m

Episode Description

<p>We are joined by Cristopher Moore, a professor at the Santa Fe Institute with a diverse background in physics, computer science, and machine learning.</p><p><br></p><p>The conversation begins with Cristopher, who calls himself a &quot;frog&quot; explaining that he prefers to dive deep into specific, concrete problems rather than taking a high-level &quot;bird&#39;s-eye view&quot;. </p><p><br></p><p>They explore why current AI models, like transformers, are so surprisingly effective. Cristopher argues it&#39;s because the real world isn&#39;t random; it&#39;s full of rich structures, patterns, and hierarchies that these models can learn to exploit, even if we don&#39;t fully understand how.</p><p><br></p><p>**SPONSORS**</p><p>Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community!</p><p>---</p><p>cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy.</p><p>Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) 
+ OAI, Anthropic, NVDA,++</p><p>Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst</p><p>Submit investment deck: https://cyber.fund/contact?utm_source=mlst</p><p>***</p><p><br></p><p>Cristopher Moore:</p><p>https://sites.santafe.edu/~moore/</p><p><br></p><p>TOC:</p><p>00:00:00 - Introduction</p><p>00:02:05 - Meet Cristopher Moore: A Frog in the World of Science</p><p>00:05:14 - The Limits of Transformers and Real-World Data</p><p>00:11:19 - Intelligence as Creative Problem-Solving</p><p>00:23:30 - Grounding, Meaning, and Shared Reality</p><p>00:31:09 - The Nature of Creativity and Aesthetics</p><p>00:44:31 - Computational Irreducibility and Universality</p><p>00:53:06 - Turing Completeness, Recursion, and Intelligence</p><p>01:11:26 - The Universe Through a Computational Lens</p><p>01:26:45 - Algorithmic Justice and the Need for Transparency</p><p><br></p><p>TRANSCRIPT: https://app.rescript.info/public/share/VRe2uQSvKZOm0oIBoDsrNwt46OMCqRnShVnUF3qyoFk</p><p><br></p><p>Filmed at DISI (Diverse Intelligences Summer Institute)</p><p>https://disi.org/</p><p><br></p><p>REFS:</p><p>The Nature of Computation [Chris Moore]</p><p>https://nature-of-computation.org/ </p><p><br></p><p>Birds and Frogs [Freeman Dyson]</p><p>https://www.ams.org/notices/200902/rtx090200212p.pdf </p><p><br></p><p>Replica Theory [Parisi et al]</p><p>https://arxiv.org/pdf/1409.2722 </p><p><br></p><p>Janossy pooling [Fabian Fuchs]</p><p>https://fabianfuchsml.github.io/equilibriumaggregation/ </p><p><br></p><p>Cracking the Cryptic [YT channel]</p><p>https://www.youtube.com/c/CrackingTheCryptic</p><p><br></p><p>Sudoku Bench [Sakana]</p><p>https://sakana.ai/sudoku-bench/</p><p><br></p><p>Fractured entangled representations “phylogenetic locking in comment” [Kumar/Stanley]</p><p>https://arxiv.org/pdf/2505.11581 (see our shows on this)</p><p><br></p><p>The War Against Cliché [Martin 
Amis]</p><p>https://www.amazon.com/War-Against-Cliche-Reviews-1971-2000/dp/0375727167</p><p><br></p><p>Rule 110 (CA)</p><p>https://mathworld.wolfram.com/Rule150.html</p><p><br></p><p>Universality in Elementary Cellular Automata [Matthew Cook]</p><p>https://wpmedia.wolfram.com/sites/13/2018/02/15-1-1.pdf </p><p><br></p><p>Small Semi-Weakly Universal Turing Machines [Damien Woods] </p><p>https://tilde.ini.uzh.ch/users/tneary/public_html/WoodsNeary-FI09.pdf </p><p><br></p><p>Computing Machinery and Intelligence [Turing, 1950]</p><p>https://courses.cs.umbc.edu/471/papers/turing.pdf </p><p><br></p><p>Comment on Space Time as a causal set [Moore, 88]</p><p>https://sites.santafe.edu/~moore/comment.pdf </p><p><br></p><p>Recursion Theory on the Reals and Continuous-time Computation [Moore, 96]</p>

Full Transcript

You say, oh, well, if we could solve the halting problem, we could ask it about itself and then halt if it doesn't. And they're like, that's it? That's what Turing is famous for, besides fighting the Nazis? You know, this business of feeding programs to themselves, this was kind of astonishing. Put it past that. Just past. Because you don't want them to bounce off. You know, the real world has all this rich hierarchy of objects and parts of objects. I think what's fascinating is that that real-world structure seems very hard to mathematize. We need more compute. I'm like, oh, that does not sound right in my ears. Are you a bird or a frog? I'm more of a frog. A lot of 20th century mathematics was about soaring above, right? Real-world data is not designed by an adversary to be as tricky as possible. So I'm proud to say eight of my puzzles are in that data set. So I'm waiting to see if AI can solve my puzzles. 
But I hope that we understand that, wow, this is actually really deep and amazing. What's fascinating about the LLM world is that... Quick pause before we kick off with Chris. Human data is shaping the direction of frontier AI, yet there's little visibility about how teams are actually using it. Our sponsor, Prolific, is putting together their first report on human data in AI, and they need volunteers. It just takes a few minutes to fill out, and you'll also get early access to their findings so you can see how you compare. They're just asking about things like evaluation methods and data sourcing approaches, so nothing personally identifiable. Check the link in the description. Much appreciated. And the episode is also brought to you by Cyberfund, which is a thesis-driven investment firm led by founders who built companies from zero to billions. They've sponsored MLST for the next year, so I'm absolutely thrilled to have their support. It's amazing. They're looking for the few out there who are going to define the next decade of AI. And if that's you, they want to talk. So if you ship even faster than Yannick Kilcher used to read machine learning papers before ChatGPT came out, of course, visit cyber.fund to learn more. Back to Chris. I'm Cristopher Moore. You can call me Chris. I'm a professor at the Santa Fe Institute. I'm originally trained in physics, and then I read Gödel, Escher, Bach and got excited by computer science, and then I got into network theory, and then I got into machine learning. Chris, welcome to MLST. It's amazing to have you here. Thank you very much for having me. So you've spent decades of your career looking at impossibility theorems. In a sense, why are you biased towards looking at things which are not possible? I guess this is because after I got my PhD in physics and I moved into theoretical computer science, there's a lot of focus there on proving that things are hard. Right. 
And I guess what I like about computer science is you put on one hat and you look for efficient algorithms for things. And then if you fail to find a good algorithm, you can switch hats and try to prove that the problem is hard. I haven't done very much work in cryptography, just a little bit in post-quantum cryptography. And there, of course, if a problem is hard, maybe you can use it to build a secure cryptosystem. So I like that two-sided nature of computer science and computational complexity theory. You were saying yesterday in your talk that in the 20th century there were many birds, where birds as scientists are, you know, let's have a helicopter view, let's look at things zoomed out all the way. And there were also frogs, who were sort of down in the weeds a little bit. Are you a bird or a frog? I'm more of a frog. So yeah, this comes from Freeman Dyson, and a lot of 20th century mathematics was about soaring above, like you say, and finding grand analogies between things. I really like concrete examples. I like things I can visualize, that I can hold in my hand. I have a lot of desk toys. I'm a very tactile thinker, and it's actually very hard for me to do much abstract thinking. Every time I'm trying to understand a proof or something, I'm constantly touching down and measuring the steps of that proof against my favorite examples to understand why they work, why is it true here, why might it be true elsewhere. And then I also like moving back and forth on the rigor spectrum. So I'm originally a physicist, and I often do numerical experiments, simulations of various things. But I do like proving things when I can. And, you know, it's very nice: if I can prove it, I publish in a math or computer science journal. If I can't prove it, I publish in a physics journal, and I get to publish either way. So it's a good career strategy. Very cool. 
Now, we're in the regime of transformers, which are these huge over-parameterized models that, you know, kind of predict the next token in a sequence of tokens. And it's just so good to have you in the room with me, because it's interesting to think about how they're limited in terms of learning and optimization, but also complexity and computability, perhaps in terms of the classes of automata. From your expert position, how do you think about the limits of these types of models? I mean, most of the work that I'm familiar with is where you can show that something is hard. But as some of your viewers know, traditionally in computer science, when we say a problem is hard, we mean there exist hard examples, and those are cleverly designed by an adversary to be as hard as possible. And then in some interdisciplinary work at the boundary between statistical physics and machine learning and high-dimensional statistics, different people, different names for it, there you can prove that things are hard in the context of really random examples. So, synthetic data which is drawn from some simple probabilistic model. And of course, real-world data is neither of these, right? Real-world data is not designed by an adversary to be as tricky as possible, and it's very far from random. It has all kinds of structure that human intelligence and animal intelligence and artificial intelligence can all exploit. And I think that's why a lot of people in machine learning often feel like, well, you know, proving that something is hard in theory isn't really... you know, I don't care, I'm just going to go solve it anyway. And that's somehow because the real world presents us with examples of these problems where there is so much rich structure to sink your teeth into, whether that's the structure in text, the structure in images, and so on. 
And I think what's fascinating about the LLM transformer world is that I feel like a few years from now, we're going to look back and say, yeah, that architecture works. A lot of architectures work. Almost in some sense, any sufficiently rich architecture will work. What matters is that the world is structured, and any architecture which is capable of capturing some of that structure is going to do well at prediction. And, you know, whether it does well at other things, and the whole debate about whether they understand and so on... you know, I have thoughts, but they're probably thoughts that other people have said just as well as I would, or better. I do think, though, that some of this work on phase transitions is quite interesting. So this is where I've spent the past decade or two, and this is where some ideas from spin glass theory and the theory of disordered materials from physics have met with machine learning. And the idea here is that, just as with a magnet: if you heat a magnet up above a certain critical temperature, it suddenly loses its ability to hold a magnetic field. Below that temperature, the atoms will automatically align and you'll get a nice strong magnetic field. Above that, it just becomes very noisy. And there are similar phase transitions, in fact using a lot of the same ideas from physics, in machine learning. So if you have some ground truth and then some noise process which produces some noisy data, then how much noise you have is a little bit like the temperature. 
If there's too much noise, then there's literally nothing you can do to discover the ground truth, the underlying pattern; it's just no longer present in the data, it has been washed out. If there's very little noise, or if you like, if the signal-to-noise ratio is very high, then it's very easy, and a lot of our favorite algorithms work very quickly: spectral algorithms, PCA, what have you, message-passing algorithms like belief propagation, and so on. Then there can also be these interesting middle ranges where you can find the ground truth if you do an exhaustive search, but we actually believe that there is no efficient algorithm that will succeed in that regime, because you're wandering around in this high-dimensional landscape of possible fits to the data, and the accurate ones are kind of hidden behind what in physics we call an energy barrier. All of our favorite algorithms, whether they're Monte Carlo or gradient descent or message passing, get stuck for an exponential amount of time in a kind of amorphous mush of inaccurate fits to the data, and only if you have the luxury of exhaustive search would you find the accurate fit. So I love this work. I love its interdisciplinary nature. It connects with replica theory and the stuff that Giorgio Parisi recently got the Nobel Prize for. I work with a number of his students and grand-students, so it's a wonderful interdisciplinary community. That said, though, all of this is theory about random problems. And again, real-world problems have structure that can help guide us. And I think what's fascinating is that that real-world structure seems very hard to mathematize. How do we talk about that structure? It's much more than just correlations. You know, the real world has all this rich hierarchy of objects and parts of objects. Ultimately, I feel like what LLMs are going to do, and what transformers are going to do, is help us mathematize that structure. 
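The easy/hard picture Moore describes can be seen in a toy experiment. The following is a minimal sketch of my own (not code from the episode), using the spiked Wigner model as an assumed stand-in for "ground truth plus noise": a hidden rank-one signal is buried in a symmetric random matrix, and PCA either locks onto it or fails depending on the signal-to-noise ratio.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

def spike_overlap(snr):
    """Planted rank-one signal in symmetric noise (spiked Wigner model).

    Returns the squared overlap between the top eigenvector and the
    hidden signal. Above snr ~ 1 PCA locks on; below it, the estimate
    is essentially noise (the BBP phase transition)."""
    x = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)   # hidden unit vector
    g = rng.normal(size=(n, n))
    W = (g + g.T) / np.sqrt(2 * n)                     # Wigner noise matrix
    Y = snr * np.outer(x, x) + W                       # observed data
    _, vecs = np.linalg.eigh(Y)                        # eigenvalues ascending
    v = vecs[:, -1]                                    # top eigenvector
    return float(np.dot(v, x) ** 2)                    # overlap^2 in [0, 1]

weak = np.mean([spike_overlap(0.5) for _ in range(5)])    # below threshold
strong = np.mean([spike_overlap(3.0) for _ in range(5)])  # above threshold
print(f"overlap^2 at snr=0.5: {weak:.3f}, at snr=3.0: {strong:.3f}")
```

The sharp change in the recovered overlap as `snr` crosses the critical value is exactly the kind of transition the replica-theory analysis predicts for random instances.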
I think that ultimately we're going to learn a lot about the world from the fact that they succeed, in addition to learning things about them. Yes. It feels, though, that we do something slightly more sophisticated. I completely agree with what you said about this reification, simplification, abstraction. There's so much more information which we're leaving out in these processes. But we can design the Linux operating system. And it feels that even though these transformers can learn structure, the types of algorithms that they can perform are limited in very, very problematic ways. I follow these debates about how good these things are at coding. I follow Jonathan Blow on Twitter; I love his work on game design. He's very opinionated. I don't really have an informed opinion about this, because the coding that I do tends to be relatively small scale. I don't build large modular things with many interacting parts; I build some code to run some physical system on my laptop. So yeah, from the outside, I mean, I see that there's this debate about, is it really just copying GitHub? And, you know, how much is it really kind of understanding the code the way a human coder does? I guess I also know that people are talking about, or are doing, taking an LLM and giving it a module that it can use the way we use specialized modules. When I have a certain kind of mathematics problem, I fire up Mathematica, right? If I want to know how some function behaves, I graph it and look at it. And I think once LLMs are given these various playgrounds and given the ability to fire them up, to literally doodle and look at it in a two-dimensional way, the way we can with our eyes, as opposed to treating everything as one-dimensional strings of text, I expect that we'll see much more multimodal abilities. 
And I know that this is already happening. It's just so interesting: in the self-attention layer, it's this first-order Janossy pooling, where you just take all of the possible pairs of tokens, stack it N times, and stick an MLP on the end. And just as a sanity test there, it seems to me: how could it possibly learn a deeply factorized, structured representation of problems? Yeah, I agree with you, and yet it seems to work surprisingly well at a lot of things, and we keep moving the goalposts. And we should move the goalposts, right? The interesting area of research is the velocity of the goalposts. And, you know, one thing that I do on the side is I design puzzles. So there's this fantastic YouTube channel called Cracking the Cryptic, where these two puzzle champions from England, Mark and Simon, do pencil puzzles, and many of them are, well, they're online nowadays, many of these are modern variants of Sudoku. And so, you know, people take traditional Sudoku and they invent all these cool new rules. Like there are thermometers, which are paths along which the digits have to increase, or there'll be a box within which you're told what the total of the digits in that box is, or there are additional constraints like cells a knight's move apart have to be different. So there's an AI company, I think Sakana. Oh, yes, I know them, yeah. Yeah. So they worked with these guys to compile a lot of these. And so the goal is, can you get an AI to read these rules in English and then solve some of these Sudokus? Last time I looked, the behavior so far was pitiful. It was like, you know, they had done a couple of four-by-four Sudokus, right? Maybe a six-by-six, I can't remember. And of course, these are puzzles that are designed by humans to have interesting insights about them, and, you know, cool global constraints that cause various kinds of logic which is just not present in traditional Sudoku. 
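The "pairwise pooling" view of self-attention raised in the conversation can be sketched in a few lines of NumPy. This is my own minimal single-head illustration, not code from the episode: scores are computed for every pair of tokens, softmaxed, and used to pool the value vectors, with no hierarchy built in.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(X, Wq, Wk, Wv):
    """One self-attention layer: score every pair of tokens, then pool.

    X is (tokens, dim). The Q @ K.T scores touch all O(T^2) token pairs,
    which is the first-order pairwise-pooling view: any deeper structure
    has to be learned, not wired in."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])    # (T, T) pairwise scores
    return softmax(scores, axis=-1) @ V       # weighted pool over pairs

rng = np.random.default_rng(1)
T, d = 6, 8
X = rng.normal(size=(T, d))                   # toy token embeddings
W = [rng.normal(size=(d, d)) * 0.1 for _ in range(3)]
out = attention_layer(X, *W)
print(out.shape)                              # one pooled vector per token
```

Stacking this block N times with MLPs in between is the whole transformer recipe the hosts are gesturing at.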
And so, so far, the ability of AI to absorb these rules and then use that to do some kind of intelligent search, that hasn't happened yet. Now, I'm sure that it will improve. I think one of the reasons why LLMs do poorly on these things is, again, this basis of one-dimensional text. And at least a year or two ago, when I tried ChatGPT on very simple tasks involving two-dimensional arrays, like, you know, the classic queens problem and things like that, it really couldn't do it. Whereas we have this sensorium, right? We're used to being able to look at a two-dimensional image; our eyes, you know, our pupils can saccade around very easily. One of the reasons why we like Sudoku is it's very easy to scan a row, scan a column, and scan a little three-by-three box. So it fits with how we can address that data structure, if you will. And that lets us do, I think, much more directed kinds of search. I mean, the last thing you would want to do is translate it into a big Boolean satisfiability problem and then use your favorite Boolean satisfiability solver. You could do that, but that's certainly not what Mark and Simon do on their YouTube channel. They sort of sit there and think about the rules and derive from them some heuristic or some high-level logical constraint, and then use that. And I don't know, for me, that's a really interesting benchmark, and I'll be very excited and a little annoyed if AIs start solving those problems. I'm proud to say eight of my puzzles are in that data set, and so I'm waiting to see if AI can solve my puzzles. 
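The Boolean satisfiability translation Moore says a human solver would never use is entirely mechanical. As a hypothetical illustration of my own (not from the episode), here is the CNF encoding for a 4×4 Sudoku: one Boolean variable per (row, column, digit) triple, with clauses forcing exactly one digit per cell and no repeated digit in any row, column, or 2×2 box.

```python
from itertools import combinations, product

N, B = 4, 2   # 4x4 grid with 2x2 boxes

def var(r, c, v):
    """DIMACS-style positive literal for 'cell (r, c) holds digit v'."""
    return r * N * N + c * N + v + 1

clauses = []

# Each cell holds at least one digit, and at most one.
for r, c in product(range(N), repeat=2):
    clauses.append([var(r, c, v) for v in range(N)])
    for v, w in combinations(range(N), 2):
        clauses.append([-var(r, c, v), -var(r, c, w)])

def all_diff(cells):
    """No digit appears twice among the given cells."""
    for (r1, c1), (r2, c2) in combinations(cells, 2):
        for v in range(N):
            clauses.append([-var(r1, c1, v), -var(r2, c2, v)])

for i in range(N):
    all_diff([(i, c) for c in range(N)])   # row i
    all_diff([(r, i) for r in range(N)])   # column i
for br, bc in product(range(B), repeat=2): # each 2x2 box
    all_diff([(br * B + r, bc * B + c) for r, c in product(range(B), repeat=2)])

print(len(clauses))   # 400 clauses even for the tiny 4x4 case
```

Even at 4×4 the encoding balloons to hundreds of clauses, which is Moore's point: the SAT route flattens away exactly the 2D structure that makes the puzzle scannable for a human.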
Yes, I suppose the paradox is that even though they are kind of compute-restricted... You made a wonderful observation yesterday that we have these hard problems, and the art is transforming them into simpler problems with heuristics. So in a sense, the intelligence is about doing more with less. It's about making hard problems simple. And if only it were possible just to make that transformation, then the language models would be able to do it. But what kind of intelligent process do you need? I mean, what goes through your mind when you come up with these creative flashes of insight? Yeah, one thing I like about the puzzle design community, and there are like 10,000 people on this Discord sort of built up around this channel, is that they talk a lot, not typically in a formal mathematical way, but they talk a lot about the art and science of designing insights and then finding insights. And so acting not as an adversary, but as a challenging but ultimately compassionate teacher who's trying to create fun insights for the solver to have. Because I think the sensation you want to have as a puzzle solver, whether it's a wooden puzzle or, you know, fitting little tiles into some tray, or a Sudoku, is at first this vertiginous sense when you look at it, like, oh my god, I'm in this exponentially large search space. A priori, the last thing I want to have to do is exhaustive search. It's boring, humans are bad at it, that's the last thing anyone wants. And so you want to sort of feel that sense of looking over this vast, forbidding landscape, and then you see an insight and you start realizing things. I think one thing which is really fascinating is that humans are quite good at designing, on the fly, different kinds of partial knowledge or partial solutions to a problem. 
So, you know, if you go back to the days of good old-fashioned AI, where people were devising different branching rules for backtracking search, Davis-Putnam search, certainly there were a lot of clever ideas about: if you have some big Boolean problem, which variable should you try setting first? And people came up with sensible heuristics like: if a variable occurs in many different constraints, we should set it first, because that way, whichever way we set it, we'll satisfy a bunch of constraints and make a bunch of other constraints more upset, and that will narrow the search space, and that's great. Okay. But humans do something richer than that. So, like, imagine you're solving one of these wooden puzzles where you have tiles with different shapes. Pentominoes are my favorite. You're trying to fit them together. Humans will very fluidly switch from asking which piece can fit here to where can this piece go, which are two different kinds of variables. In these modern Sudoku variants, powered partly by these really awesome apps, traditionally Sudoku fans had invented these different kinds of pencil marks, one of which means the thing here is either a two or a seven, which is one kind of partial knowledge, and another is the three in this box is either here or there or there, which is a different kind of partial knowledge. Now people are inventing new kinds of partial knowledge, like: these two cells, I don't know what they are, but they have to be the same, so I'll color them both blue and then figure out what their numerical value is later. Or: these three cells, I don't know what they are, but they all have to be different. And so, I don't know, this to me is a really interesting frontier for AI, where you take the problem and you invent on the fly what kind of variable you should use to address the problem, right? Which is very different from being told: here are the variables, here are the constraints. 
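One fixed form of the partial knowledge Moore describes, candidate sets plus a branch-on-the-most-constrained-cell rule, is easy to sketch. This is my own illustration for a 4×4 Sudoku, not code discussed in the episode: the candidate sets play the role of pencil marks, and the `min` implements the classic minimum-remaining-values heuristic.

```python
from itertools import product

N, B = 4, 2   # 4x4 grid with 2x2 boxes

def peers(r, c):
    """Cells sharing a row, column, or 2x2 box with (r, c)."""
    ps = {(r, j) for j in range(N)} | {(i, c) for i in range(N)}
    br, bc = r - r % B, c - c % B
    ps |= {(br + i, bc + j) for i, j in product(range(B), repeat=2)}
    ps.discard((r, c))
    return ps

def solve(grid):
    """Backtracking with 'pencil marks': each empty cell keeps a set of
    candidates, and we always branch on the cell with the fewest."""
    empties = [(r, c) for r, c in product(range(N), repeat=2) if grid[r][c] == 0]
    if not empties:
        return grid
    marks = {(r, c): {v for v in range(1, N + 1)
                      if all(grid[i][j] != v for i, j in peers(r, c))}
             for r, c in empties}
    r, c = min(marks, key=lambda cell: len(marks[cell]))  # most constrained
    for v in marks[(r, c)]:
        grid[r][c] = v
        if solve(grid):
            return grid
        grid[r][c] = 0   # undo and try the next candidate
    return None          # dead end: backtrack

puzzle = [[1, 0, 0, 0],
          [0, 0, 3, 0],
          [0, 4, 0, 0],
          [0, 0, 0, 2]]
solution = solve(puzzle)
print(solution)
```

This is still a fixed vocabulary of partial knowledge decided in advance; Moore's point is that humans invent new kinds of marks mid-solve, which no line of this sketch does.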
There's already a lot of interesting questions there. But here it's more like: okay, you know, fit these things in. You formalize the problem. You mathematize the problem. And then make some progress, and maybe even fluidly jump from one mathematization to another during the solving process. And to me this is a lot like science. When you're doing mathematical modeling, right, in many cases the challenge, if you're working with a social scientist or a biologist or whatever, is that 90% of the work is the mathematization, figuring out what kind of mathematical structure could fit here. And often, once you do that, it's relatively easy to simulate the model or solve the model or prove something about it, whatever kind of work you're trying to do. That formalization process is something that I think is a really interesting kind of task for an AI to do. 
Yes, yes. There's always this lingering problem of residuals: what happens when we leave things out. But this process of epistemic foraging fascinates me, right? Good phrase. Very good phrase. I got it from my friend Karl Friston. Oh, okay, the free energy guy. Yeah, he's a great guy. And so there's also this phylogenetic locking-in, which I think is good as a form of constraints, but it's also interesting from a flexibility point of view. But we're trying to explore this space, right, to forge a path. And the other thing is, I'm not sure whether you would call yourself a Platonist or not, but there's this kind of interesting juxtaposition between: are we converging on the real thing, or are we constructing our own reality? And where do culture and all of these different things come into it? Because we are very much just kind of laying down the, you called it partial knowledge, we're laying down the stepping stones and we're trying to move forward. Right. I mean, I guess when it's a puzzle, there is a ground truth. You've been promised there's a ground truth. In the Sudoku world, you've been promised that there's a unique solution, and when you've found it, you know you've found it. In real-world problems, as you say, it's very hard to know when we've found the real thing and whether we've failed to see something else. And I guess, right, so I mean, if what these things know is what's on the internet, well, that is a world; it is not the same as the physical world. And, you know, so, the grounding and meaning. So my friend Henry Farrell, who is a historian, tried out one of these things where he wrote an essay, and, you know, I can't actually remember what the topic was, but he made a kind of subtle point that was really a little bit sideways to the various points that various people had made. And then he asked, I forget which system, to summarize his essay. And it sort of blandified it, right? 
It kind of lowest-common-denominator-ed it. It did kind of what people at a cocktail party might do, if they're thinking pretty informally, maybe trying to impress each other a little bit. Basically, it saw what he was writing about, and then it produced a summary based on the most common things that people say about that. And it totally missed the unique thing he was trying to say that was different from the common arguments on either side. And this is interesting. And I think, you know, for him, this was an indication, again, that these systems are not grounded in meaning. They don't really catch the, oh, that's an interesting point. Now, you know, you could say, oh, well, given a more sophisticated use of the statistics of text, even if that's all they have, and then you can argue about whether, you know, compressing them forces them to build world models, et cetera, et cetera, maybe a better summarizer would catch the cool thing, right? Just as maybe a better music or book recommendation system would challenge you the way a friend challenges you, in that wonderful kind of directed way that friends do. Like: I know you don't think science fiction is good literature, but you have to check out Gene Wolfe, because the prose is amazing and the characters are amazing, and I think it will meet your literary needs. But I want to bring you over to science fiction. That sort of thing that our friends do for us, that I don't think any recommendation system really does for us. It's like: you like this music, here's some more music like that. Oh, you kind of like that, here's some more like that. It's like, well, give me something different. Challenge me. And I think one source of those challenges is the meaning of the real world. Like: look at this cool thing. Or: you know, this essay is actually about real things; think about those real things. Don't just look at the text. 
And, you know, of course, again, like I promised you, these are things that other people have said better. So, yeah, but this seems like the debate. On the other hand, I do think, like I said, just as these things, as they're already doing, can not just write code but run it and see whether it works, and then debug the code if it doesn't work... or if it's a question about a three-dimensional object, they could fire up a three-dimensional workspace, or a seven-dimensional workspace, where they can doodle and then kind of perceive the way we perceive, right? I am a bit of a Platonist, because if you and I close our eyes and we each think of a cube... admittedly, we live in a society with a lot of right angles, and we've seen wireframes rotating on screens, so we've had a lot of practice with this. But both of us can see in our minds a cube, and we can, just by counting, just by perception, see that it has eight corners and see that it has twelve edges. And if one of us thought it had fourteen edges, the other would say, no, it's twelve. And the other one would look again at the cube in their mind and say, oh yeah, you're right, it's twelve. Right? So we're really perceiving something there. And the fact that we can have that shared perception gives me, and a lot of other mathematicians, a sense that there is some reality to these things; these are not just subjective objects. And so I do think that once these systems can switch on the fly what kind of workspaces they have and what kind of reasoning they do, then I think that they'll be much closer to what we do, right? Yeah. I mean, even if you ask them to do proofs... of course, there are proof-finding systems that are very formalized systems, but if you ask an LLM to construct a proof, it will often construct some BS. It will be stylistically similar to proofs it's read. 
But so far, it doesn't seem to be able to do that reflection process and really check the steps in the proof and see if it works. But of course, that is also a very specific thing that humans don't do very often, right? Specific humans in specific cultures do this and have tools for doing this. We might whip out a sheet of paper and start writing things with formal symbols and formal logic to see if our informal proof, written in English or whatever, actually holds. But when we do that, we're firing up some special mental models and using some external tools, paper, pencils, blackboards, computers, to help us, because formal logic is not something that we're built to do. So I expect these systems, once they can really play with all of these modules, including ones that we don't have, like visualizing things in seven dimensions, to be able to do a lot. I'm not sure if we should do this, but when we give them access to 3D printers and fab labs so that they can start building things and seeing whether they work... well, maybe we should solve the alignment problem first. Yes. What you were saying about the pastiche, the kind of GPT-generated text, was very interesting to me. So they model this statistical distribution, and they're greedily sampled, and they just give you tokens from the bulk.
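The head-versus-tail sampling picture here can be made concrete. A minimal sketch (the function name and the toy logits are my own, not any particular model's API): softmax sampling with a temperature knob, where low temperature concentrates probability on the bulk of the distribution and high temperature pushes mass out into the tail.

```python
import math
import random

def sample_token(logits, temperature=1.0):
    """Softmax sampling with temperature. Low temperature concentrates
    probability mass on the head of the distribution; high temperature
    flattens it, so low-probability 'tail' tokens get sampled more often."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()
    acc = 0.0
    for i, p in enumerate(probs):            # inverse-CDF draw
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

With `temperature=0.1` the argmax token is returned almost every time; with `temperature=100.0` the three tokens are drawn nearly uniformly, which is the "sampling the tail" regime being discussed.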
And one school of thought is, well, we'll make them more creative: we'll just turn up the temperature and sample tokens from the tail. And you really get garbage there, because you're a little bit out of distribution now. One thing that would lead me to think they were learning these kind of factored representations of the world is if, when you did sample the tail, it actually gave you something creative and useful. But I did want to say that I'm not entirely sure whether it is about learning meaningful, structured representations of things which are grounded in the world. I'm a creative professional, so I've learned about video editing, I hire script writers and so on, and I've noticed that there might be something else at play. You were saying you can add noise to problems, and that actually makes them more tractable. In audio and video, if you add noise, textures, high-frequency patterns, you're training human perception to look away: you could blur, you could add a texture, and so on. And it's the same in writing. There are so many creative motifs that just use slightly different language, deliberately taking it away from the head of the distribution, but such that it respects the epistemic phylogeny. So it needs to be meaningful, but still creative, and it doesn't necessarily have to have any grand meaning or be grounded in the world. So I guess the question is: is it just kind of aesthetic creativity, or does it really need to respect the rules? This reminds me of Martin Amis, who has this book called The War Against Cliché. He makes some of the same points in a memoir about his friendship with Christopher Hitchens. And his feeling as a prose writer is that any string of three words which other people have used should be avoided, basically.
I mean, I'm paraphrasing him. And now I'm having trouble thinking of strings of three words that other people have put together... but if you say anything which is a visible reference to something else, you'd better be doing it on purpose, not just because you've heard it before and because it has a high probability in the distribution. So I think his goal as a prose writer was to constantly produce new juxtapositions, in order to intrigue the reader and to access a region of the creative space, the writing space, which hadn't been accessed before. And I'm sure you know this as a creative person as well: just as a mathematician might look at a proof, hold it to the fire, and say, okay, does this proof really work? (and we have lots of processes, both individual and collective, to do that), artists make something and then hold it to the fire: is this really good, right? And of course, the pain of artists and mathematicians is that we crumple up a lot of pieces of paper and throw them in the trash. And it can be emotionally exhausting. But we have this very high standard for our own work. We don't just produce things. We then reflect on them, show them to our friends and perhaps to our critics, and then try to modify them or improve them or abandon them. And this loop, right... it's a little bit like, if you were a physicist, you'd do an experiment and see if the experiment works out. If you're a mathematician, you do the, quote, experiment in a formal space of whether it's logically sound.
And if you're an artist, you do the experiment of looking at it and judging it in the ways we do, right? One thing I've learned from artists is that art is not this kind of floppy thing. It is a very exacting thing. My PhD advisor Philip Holmes wrote poetry, and he said this is much harder than doing math. I completely agree. On that note: when I look at a video someone else has edited (because, you know, I'm very experienced now), it's very similar to mathematics, or even the ARC challenge or something like that. Intelligence is the process of decomposing something into the constituent parts which made it. And as a video editor, as any creative professional, you're trying to create this progressive disclosure of complexity. You're delivering a sequence of artifacts to the reader or the consumer which increase in complexity, and you have to put each one just past their prediction, their cognitive horizon, right? Just past, so that they don't bounce off, but they're not bored either. There's a sweet spot. Exactly. Even with audio production, you have these sound effects, and the audience can hear the transitions, so you add texturing, you add noise. I can still hear it? I'll add a little more noise. And at some point the whole thing becomes more than the sum of its parts. The process of art is building this up layer by layer, having this almost-synchrony with the audience, knowing what their prediction horizon is. And in order to do that, you're doing a great deal of mental modeling of the viewer. You're constantly putting yourself in their shoes. And if I can jump back to puzzles for a little bit: when you design a puzzle, you're also constantly putting yourself in the shoes of the solver. Would they get this? Are they going to see this? Is it going to be visible, but just hard enough to see, so that it will be a wonderful aha moment? And I think one interesting thing philosophically is that there are both subjective and objective aspects to this, right? Subjectively, of course, you're designing things for humans, and you as an editor and a creator are designing for humans with a certain level of literacy and familiarity with, for instance, the things that are talked about on your channel. Similarly, if you're designing a puzzle well, you're designing it for a human who has a certain tolerance for search, but not much more. It's like, if you're in a chess-playing society, you kind of know about the knight's move, right? So there are certain things that you're familiar with. And there's a variant rule in Sudoku, which for some reason is called disjoint groups, that for me is very headache-inducing: if there's a 7 in the top middle of this 3x3 box, there cannot be a 7 in the top middle of any of the other 3x3 boxes. This does not fit with my sensorium; that's how I find it on a subjective level. I know that mathematically and logically, it's a very nice extension of rows, columns, and boxes. It's sort of a three- or four-dimensional extension, treating the thing in a more hypercube-y way. But I hate it, because I have to look over here, and then look over there, and then kind of painstakingly look over there. I can't scan it in the nice way I can scan rows, columns, and boxes. So that's an area where, subjectively, I find puzzles involving that constraint both harder and less fun. It's also the case that if I were a much more cognitively powerful creature, then maybe I would experience just as much pleasure out of 100x100 Sudoku as I do out of 9x9s. So it's true that, I mean, I'm just two pounds of meat with a one-hertz processor. I can only handle the nine-by-nine things.
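For readers who want the "disjoint groups" rule pinned down, here is a minimal sketch in Python (the function name and the grid encoding are my own, not from any Sudoku engine): cells occupying the same relative position within their 3x3 boxes must not repeat a digit.

```python
def disjoint_groups_ok(grid):
    """Check the 'disjoint groups' Sudoku variant constraint: cells in the
    same relative position of their 3x3 box (e.g. 'top middle') must not
    repeat a digit across boxes. grid: 9x9 list of lists, 0 = empty."""
    for pos in range(9):            # relative position inside a box: 0..8
        seen = set()
        for box in range(9):        # which 3x3 box: 0..8
            r = (box // 3) * 3 + pos // 3
            c = (box % 3) * 3 + pos % 3
            v = grid[r][c]
            if v:
                if v in seen:       # same digit, same relative position twice
                    return False
                seen.add(v)
    return True
```

So a 7 at the top middle of the top-left box (row 0, column 1) and another 7 at the top middle of the adjacent box (row 0, column 4) violates the constraint, even though those cells share no row, column, or box.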
Is it two pounds? I haven't weighed my brain. You know, on the other hand, I can't help but feel that there are almost mathematically objective aspects to aha moments. Maybe there are big aha moments and little aha moments, but we can all sort of recognize them as insights. It's like when you're designing a video: you have a cognitive map of what concepts are being gained at each step and then used to build the next step. Maybe for some viewers some steps would be very challenging and others would be kind of obvious, but they would all understand that it's a step. And in the puzzle world there's a lot of recognition that a good puzzle and a hard puzzle are orthogonal axes. There are simple but beautiful puzzles, there are hard and beautiful puzzles, and there's also simple-and-boring and hard-and-boring. These are really very different things. Theoretical computer science supposedly helps us figure out what problems are easy and what problems are hard, and what qualitatively makes them easier or harder. What is it about their structure that makes them easy or hard? Why is this problem a smooth landscape where a greedy algorithm can just find the optimum, while that problem is a very rugged landscape (in physics, we would say a very glassy landscape), with many local optima, hard to navigate, blah, blah, blah? I've tried a little bit to formalize what it is about these aha moments, and I haven't succeeded. It's a little bit like public-key cryptography, where you have a function everyone can run forward, and the challenge is inverting the function; if you're given the private key, then inverting the function becomes very easy. But this is different. You have to find the key yourself. You have to find the insight yourself.
Or there's this notion in computational complexity called computation with advice, where, again, you're given a big string of advice. But again, this is about finding the advice. I feel like it's more like the meta-problem of designing an algorithm. So imagine that I show you an instance of a potentially hard problem, an NP-hard problem. But I promise you that this instance is easy: I promise you it belongs to a large subclass of instances for which there is an efficient algorithm. Now you have to go find the algorithm. That seems a little closer to this puzzle design, and maybe also a little closer to being an intelligent entity dealing with a very structured world. You're not having to parse arbitrary images; you're parsing natural images, shaped by the processes of natural selection and by the structure of the built environment, which is made by systems not entirely unlike you, building an environment that they can understand and navigate. Now your task is to understand, navigate, predict, and segment this data. And what's really fascinating, right, is that it's not just humans solving puzzles that were invented by humans. There, ultimately, you're being communicated to by something with cognitive capacities and cognitive tastes, enjoyments, similar to yours, and you can grab onto that. What's amazing is that even the non-living world, even the natural, non-human world, has all sorts of stuff that we can grab onto. And you can share it with other folks. Because I'm interested in creativity and whether it's socially constructed or whether it's grounded. And as we were saying, the other artists or the other mathematicians can decompose the structure, see if it fits the phylogeny, and identify what the creative steps are. So there's a kind of intrinsic value to it, which is fascinating.
And on the other point, about actually forging paths in this space, I wonder whether you would agree it's related to undecidability, and even Wolfram's computational irreducibility. Wolfram would talk about: you have a cellular automaton, and one of his famous ones is rule 134... Rule 110. 110, sorry, my bad. But I know you've studied, you know, the three-body problem, right? So you do this computation step by step, and there are no analytical shortcuts. You just have to do this wide-ranging, divergent search, and then you find something. And that's amazing. You've hit this stepping stone, but there was no shortcut to get there. Right. Or like a chaotic dynamical system where there's no closed-form solution: if you want to know the state it will be in at some future time, you can't just plug t into some formula. You have to numerically integrate it. You have to do the work. You can't skip over its intervening history. And so, yeah, cellular automata are a great playground for this. There are some, for the geeks, like rule 150, which are linear, linear mod 2, so if I give you the initial state and you want to know the state at some future time, you can almost just plug it into a formula. You can compute that future state with much less computation than it would take to actually simulate it. And then there are others where we strongly believe that you have to go step by step, the irreducibility, as you said, that Wolfram likes. One of the fascinating things about this is that our only technique for proving that prediction is hard, that you have to do the simulation, is to build a computer out of the thing.
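The rule 110 versus rule 150 contrast can be sketched directly. Below is a minimal elementary-cellular-automaton stepper (my own toy code, with a periodic boundary assumed): the generic stepper looks up Wolfram's rule table, while rule 150's update is just the XOR of the three neighborhood cells, the mod-2 linearity that admits algebraic shortcuts.

```python
def step(cells, rule):
    """One step of an elementary cellular automaton on a cyclic row.
    cells: list of 0/1 values; rule: Wolfram rule number 0..255."""
    n = len(cells)
    out = []
    for i in range(n):
        # the (left, self, right) neighborhood encodes a 3-bit table index
        idx = 4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n]
        out.append((rule >> idx) & 1)
    return out

def rule150_linear(cells):
    """Rule 150 written as what it is: a mod-2 sum of the neighborhood.
    This linearity is why its far future can be computed algebraically,
    while rule 110 (proved universal by Cook) seems to demand step-by-step
    simulation."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[i] ^ cells[(i + 1) % n] for i in range(n)]
```

Running `step(row, 150)` and `rule150_linear(row)` on the same row always agrees, which is the point: the "rule table" for 150 collapses into one XOR formula, whereas no such collapse is known for 110.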
So, you know, what happened with Rule 110 was that Wolfram observed all these cool particles and thought, gee, these particles are doing all these cool collisions, almost like chemistry, or almost like reading and writing symbols on a Turing machine's tape. And then Matt Cook came along and completed this proof with Wolfram and proved Wolfram's conjecture. And then my friend Damien Woods came along and did it more efficiently, and so on. So this is a lot like NP-completeness, right? We prove that problems are hard because they have some kind of universal ability to encode or express other problems; therefore, if they were easy, these other problems would be easy too. A funny thing, though, is there are a lot of systems where we don't know how to build a computer in them, but they still look really irreducible. They're doing all kinds of stuff that looks really nonlinear, and it really doesn't look like you could jump forward in time. But the stuff is so uncontrolled that we don't see how to build a computer out of it, so we can't prove that we can't skip over the simulation. So imagine that you were thinking about computational complexity several thousand years ago, which I guess you could have done, and maybe in some philosophical sense some people did. But suppose you don't yet have wires or pipes, or in general things that can transmit some information, a bit or whatever, very cleanly from here to there, and you don't have little gates that take these clean wires and produce something else and send it out along another wire. Suppose you just have what a friend of mine calls lava: chaotic stuff going all over the place. Or imagine looking at the flow of plasma in the sun, in these amazing videos we have now from solar telescopes: you see things briefly forming and then breaking apart, and it's very chaotic.
It's sort of like the planet Solaris or something. What you don't see is stuff about which you could say: oh, that is a nice, controlled building block; I could use that to store a bit that I could then write to later, or read from later, or combine with others. So some cellular automata have this very chaotic, very nonlinear-looking structure, but what they don't have, that we know of, are these nice particles that we can use to transmit and modify information and simulate a Turing machine or whatever. And I wonder if a lot of natural systems are in this weird middle ground, right? You can build hydrodynamic computers if you have pipes and valves, and before transistors came along, people were trying to build microfluidic computers. There's some wonderful alternate history in which we don't have transistors and what we have is microfluidics everywhere, and that would be fun to think about. It's a little bit like the difference engine; Bruce Sterling and William Gibson have a novel about that, where Babbage succeeded in building these mechanical computers and that's the technology we have. But can you build a computer just out of water, just the flows of water, just out of the Navier-Stokes equations, using little flux donuts to travel from here to there? Maybe. Some people say yes. But it seems harder, because things are not channeled. Yes. So there's a difference between the complexity of a system and whether we can get it to do the computations we want it to do, right? It might be doing very complicated computations internally that are indigenous to its own dynamics. That doesn't mean that we could say, oh, good, now I can use it to build a computer.
Formally, we know that there are problems which are undecidable but which are not Turing complete, in the sense that even if I gave you a box that solves this problem, an oracle for it, you could not then solve the halting problem. So they're undecidable, but not because the halting problem can be reduced to them. Similarly, we know that if P and NP are different, which we believe (the academic "we": almost everyone I know believes that, though not everyone), then there are problems in the middle ground which are outside P, so they cannot be solved in polynomial time, cannot be solved efficiently, and yet they're not NP-complete. They don't have the ability to capture other problems. But the annoying thing is, the only way we can prove that a problem is hard is by showing that it is complete, essentially by building a computer out of it. My co-host, Dr. Duggar, is a big fan of Turing machines, basically. And he thinks that current AI is limited because transformers are not Turing complete. He thinks the reason for that is that they're finite state automata, and they're trained in such a way that you can't really have a recursive thing when you're doing stochastic gradient descent, because it would just go on forever. But his fundamental hypothesis is that he kind of thinks of GIs as being Turing machines. And you can have different strengths of agency.
So a strong agent is a Turing machine: a thing which does some computation, takes an environment signal, takes an action. And if the block in the middle is a Turing machine, then it's capable of strong agency. So I guess my question... So GIs, general intelligences? Generative? What's the G? Oh, general, yeah. Yeah, I mean, you're the perfect person to ask about this out of all the people we've ever interviewed. Do you think he's right to think about Turing completeness as a way to demarcate different forms of intelligence? And do you agree with his theory that if it were possible for us to train a Turing machine, rather than a finite state automaton, if we could empirically train it, that would lead to amazing things? Well, yeah. I mean, of course, Turing in his 1951 paper, this classic paper, says that artificial intelligences (although I'm not even sure if he used that phrase; artificial minds or something) would be trained, or almost raised, like children are, rather than programmed. Turing machines as an architecture, I think, are rather brittle. And I think that the partly analog nature of neural networks and LLMs, I know that they can be made discrete and so on, but somehow their ability to work in a continuous way with high-dimensional vector spaces and embeddings, I think that is important to their trainability, even if it's not ultimately important to their cognitive abilities. I mean, I guess an easy riposte to your question is that I am also a finite state machine. I have a very large number of states, and I will not have within my lifetime the ability to explore more than a few of them. But I'm composed of a finite number of neurons, a finite number of elementary particles. So I have a very large but finite number of states.
Now, I think the difference is that because I am also a tool-using and tool-making entity, if I realize that there's a problem which is difficult for me to do in my head, which is most problems of any size at all, I can then build things, whether that's a clay tablet or an abacus or a computer, that extend my workspace, that extend, if you will, the tape of my Turing machine. And that gives me, in principle, recursion. I mean, then we can get into, oh, well, is the universe actually finite, blah, blah, blah. That's not very interesting to me, because we can reach fairly far into the kind of asymptopia of recursion. Famously, people joke about German speakers having a stack depth of three or four and English speakers having a stack depth of one or two. I'm not sure that I contain a stack, unlike what Noam Chomsky supposedly said. I mean, if I had a stack in me, then it would be a lot easier for me to repeat a string of words backwards, and that's very hard. If you give me a short string of words, it'll be a lot easier for me to repeat it in the original order than backwards. So I don't think I'm very good at pushing and popping. I don't seem to have that kind of data structure in my mind. But if I need it, I can build it with pencil and paper, or a stack of plates on a table. So I think it's that extensibility which gives us access to recursion and universality. And that's partly why I'm excited by the idea of AIs that can say: gee, this problem has a recursive nature; I cannot do this just in my own context window or my own embedding; I need a data structure which lets me push and pop easily. And I know that before transformers came along, people were already working on these hybrid structures where you have a deep network.
Rather than asking it to create a stack in its own state space, or train a stack into part of it, which would be very challenging, give it a stack as a data structure, let it take actions on that stack, and let it learn how to use it and play with it. Could I just refine this? I don't want to misrepresent Keith. Everything you've said is absolutely true. And when we have this discussion, there are many folks who say exactly as you have done: oh, the brain is an FSA, and we can expand our memory by writing things down; I can get another whiteboard, and another whiteboard, and so on. But his argument is slightly more nuanced. He's saying that, yes, our brain is a finite state automaton. But if you look at all of the algorithms inside that class, there is a subset of algorithms, those that can control a Turing machine and expand memory and so on, and those algorithms are not traversable with stochastic gradient descent. So he's roughly saying, maybe the Chomsky argument, maybe we've got the merge operation or something: somehow our brains have learned the special class of FSA algorithms that can expand our memory. I see. That's an interesting claim. I mean, we're at this workshop this week where there was a whole discussion yesterday about what we actually need language for, and what we actually need symbolic thinking for, because that's where recursion seems to start. And there are plenty of intelligent entities out there, like our close relatives the great apes, and possibly our ancestors, who were already making stone tools and teaching each other to make stone tools using gestures. They didn't need the full-on modular structure of language that we have. And you can do a lot to navigate the world without, if you will, Turing completeness.
And, well, assuming that what we mean by Turing completeness is kind of the ability to do symbolic recursion and so on. The funny thing is, now we're starting with LLMs as language-first things. They're not tactile-first things or visual-first things or find-food-first things the way we were. They're language-first things. And because language is the medium in which we do symbolic thinking and recursion, we're like, oh good, they should be able to leap to all this formal stuff in mathematics. But they're not formal systems, right? They're token-producing systems, the same way most human speech is token production. Formal reasoning is a kind of thin veneer that we apply in specific settings on top of token producing, right? When we're chatting with each other, or talking about topics we've had conversations about before, we're acting very much like an LLM. We're cheerfully in a distribution we're pretty familiar with, cheerfully emitting tokens, and we don't really need to do much self-reflection about it. It's when we hit some edge that we're forced to do the kind of self-reflection we were talking about before: okay, does what I'm about to say actually make sense? And most of us don't do that most of the time, right? And so, yeah, I feel like the Turing machine itself... for instance, when I teach theoretical computer science, I don't do it in a Turing machine-centric way. And I think if you look at some more recent textbooks, they don't do what the older textbooks did, where the first thing you see is: here is a Turing machine. The Turing machine is partly of historical interest now. I mean, it's a cool, minimal thing that's universal, but there are many other very small things that are universal.
And whether those are families of Boolean circuits of increasing size (with, yes, admittedly, some kind of uniformity to them), or counter machines, or finite state automata with two stacks, or cellular automata, or whatever, people like Minsky and others had a lot of fun in the 60s and 70s finding these smallest possible machines that can do that. So for me the Turing machine isn't central. It was the first thing like it which had this universal ability to simulate other machines of its own kind, and this paradoxical ability to simulate itself, and therefore the halting problem and so on. But I don't view that architecture as central. It's very von Neumann-y, right? You have a CPU, you have a memory. It's very magnetic tape-y: you roll the tape over to this part. So for me, mathematically, when I think about computational universality, I think about things like our favorite programming languages and their relationship with the theory of partial recursive functions. As I'm sure a lot of your viewers know, these basic notions of recursion were invented before Turing came along. Primitive recursion is basically a for loop. There's this other operator called minimization (don't worry about it) which is basically a while loop. And then function composition is basically, well, function composition.
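Moore's gloss, primitive recursion as a for loop and minimization as a while loop, can be written out directly. A toy sketch (Python stand-ins of my own devising, not a formal development):

```python
def primitive_recursion(base, step):
    """Build f with f(0, *args) = base(*args) and
    f(n+1, *args) = step(n, f(n, *args), *args).
    Primitive recursion is a bounded for loop: the recursion depth
    is fixed by the first argument."""
    def f(n, *args):
        acc = base(*args)
        for i in range(n):
            acc = step(i, acc, *args)
        return acc
    return f

def minimization(g):
    """The mu-operator: return the smallest n with g(n, *args) == 0.
    An unbounded while loop; it may never terminate, which is exactly
    what makes the resulting functions 'partial'."""
    def f(*args):
        n = 0
        while g(n, *args) != 0:
            n += 1
        return n
    return f

# Example: addition by primitive recursion on the first argument.
# add(0, y) = y; add(n+1, y) = successor(add(n, y)).
add = primitive_recursion(lambda y: y, lambda i, acc, y: acc + 1)
```

With minimization you get genuinely new power, e.g. `minimization(lambda n, x: 0 if n * n >= x else 1)` searches for the smallest n whose square reaches x, a while-loop computation with no fixed bound known in advance.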
So these tools can generate all of what are called the partial recursive functions, which are more familiarly now called the computable functions. These are the same things a Turing machine can do. And then Church has his wonderful lambda calculus, and that shows up in Haskell and Lisp and so on. So to me, the marvelous thing about the grand unification which occurred around 1936 is that these rather different architectures, and the Turing machine, can all do the same thing. Yes. Maybe I should clarify that Keith wasn't talking about a physical Turing machine; he was talking about the strength of the computation. I think in the practical sense he was saying exactly what you were just saying: being able to do arbitrary loops and recursion in an algorithm. Right. Yeah. So ultimately, if you can take building blocks and use them to make more complicated things, and then use those things as building blocks, and wire them to each other and to themselves, you can do everything we're talking about computationally. And that is what we do as technological beings. We build these incredible scaffolded technologies, which we haven't seen the end of yet. And if you can do that in a virtual space, then you are an unbounded technological being in the world of mathematics. You can build these arbitrary computable functions. You can compute anything which any other reasonable architecture can compute. And yes, Turing machines can do this, although they're very close to the hardware, as it were. Of course, Turing's great achievement was showing that we can do software on top of a Turing machine, in a sense. And just as Gödel's great achievement was showing that mathematical formulas can talk about themselves, Turing showed how to do that with machines, right? They built compilers, if you will.
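This feeding-programs-to-themselves trick can even be written as executable code. The sketch below is the diagonal construction only; `halts` is the hypothetical halting oracle, which is exactly the thing the theorem says cannot exist as a total, correct function:

```python
def make_troublemaker(halts):
    """Given a claimed halting oracle halts(program, input) -> bool,
    build the program that defeats it by doing the opposite of the
    oracle's prediction about a program run on itself."""
    def troublemaker(program):
        if halts(program, program):
            while True:        # oracle said "halts": loop forever instead
                pass
        return "halted"        # oracle said "loops": halt immediately
    return troublemaker

# Feeding troublemaker to itself makes any total `halts` wrong: whatever
# it predicts for (troublemaker, troublemaker), the opposite happens.
```

For instance, an "oracle" that always answers False is immediately refuted: the troublemaker built from it halts on itself, contradicting the prediction.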
And it's funny: when you teach students nowadays the undecidability of the halting problem, you say, oh, well, if we could solve the halting problem, we could build a program that asks the solver about itself, and then halts if the answer is "doesn't halt," and doesn't halt if the answer is "halts," and we're done. And they're like: that's it? That's what Turing is famous for, besides fighting the Nazis? It's like, well, you've got to understand, this business of feeding programs to themselves was kind of astonishing, right? Even in the early 20th century, mathematics had this very stratified structure. There were numbers. Then there were functions, which act on numbers and produce other numbers. Then there were sort of metafunctions, or functionals, like taking the derivative, which takes a function and produces another function. You couldn't feed things to themselves; what would that even mean? That's nonsense. And then along come Turing and Church and Gödel and show that you can. That's amazing, right? And like Doug Hofstadter talks about in Gödel, Escher, Bach, it's sort of like how enzymes and proteins are both programs that can act on each other, and data, strings of amino acids. And nowadays, this is the air we breathe. A text editor is a program that works on other programs; so is a compiler, an operating system. People in compiler class compile their own compilers, right? But I hope we understand that, wow, this is actually really deep and amazing. So yeah, that self-reflexivity, that ability to build things on top of other things. You know, what I like about the Chomsky hierarchy: if you look at a Turing machine, it is kind of a finite state machine with an infinite number of states, right? If you're used to these little graphs of states, well, it is one of those. It's just an infinite graph. Then you go to a higher level of description and say, actually, this thing has a finite description, right?
Just as if you have a stack, of course, you could draw an infinite series of states where you push, push, push, push, and then pop, pop, pop, pop, and it would be a big binary tree if you're pushing and popping binary symbols, and so on. It's an infinite thing, but then you move to a higher level and it has a finite description. That move, so, saying Turing machines can do it, I'm not sure. You can use Turing machines to build it. Seeing the ability to do that, recognizing this next level, which makes a previously infinite thing finite, that's a very cool thing for an intelligent entity to do. So I know this is a little sideways to your friend's question, but yeah, that jump, that jump to me is really fascinating. Yeah, and I think that's, you know, a little bit like recognizing a statistical regularity and then being able to predict. Yeah, well, but it seems like more. It seems like more: you've really recognized and captured an infinite set of objects all at once. You've done a step of abstraction. And I would love to have artificial partners that can do that.
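The push/pop picture above has a familiar concrete instance (my example, not the speakers'): a balanced-bracket checker. The reachable configurations, all possible stack contents, form an infinite tree, yet a few lines of finite description capture them all at once.

```python
def balanced(s):
    """Recognize balanced brackets with a stack -- in effect a pushdown
    automaton. The set of reachable stack contents is infinite, but this
    finite program is the 'higher-level description' that captures it."""
    pairs = {")": "(", "]": "["}
    stack = []
    for ch in s:
        if ch in "([":
            stack.append(ch)          # push an opener
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False          # mismatched or unexpected closer
    return not stack                  # accept only if everything was popped
```

So `balanced("([()])")` is true while `balanced("(]")` is false, even though no finite state diagram drawn symbol-by-symbol could decide this for arbitrarily deep nesting.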
It also makes me think that there's a bit of a sandwich here. So people like Wolfram believe in digital physics, that, you know, ontologically the universe is made out of computation. As a quick aside, you said yesterday that you lambasted folks for using compute as a noun, which I thought was brilliant. Like, "we need more compute," and I'm like, oh, that does not sound right in my ears. But okay, I know verbing weirds language, as Calvin and Hobbes said, and nouning does too. But yeah, we need more weird, I guess. So I guess, you know, the first part of the question is: are you a pan-computationalist? Or maybe you're not, but one step down from that: do you think it's appropriate to use the computation metaphor to talk about effective computations that the universe is doing? And then you were going in an interesting direction a little while ago when you were talking. You know, Joscha Bach, for example, talks about this kind of memetic virtual computation: our brains are simulators, and we do this metaprogramming, and we share programs around, and it's almost like the programs are the agents, right? So where's the locus of agency? The selfish meme, exactly. Yeah. So you've got the stack there. And coming from the Santa Fe Institute, of course... I mean, my co-host Keith, he's a big fan of this Turing machine thing because he's an internalist, but you're surrounded by so many fascinating professors who have this very externalist, complex-systems view of things. So there are so many ways of thinking about intelligence. What does it mean to you? All right. So I like to talk about the computational lens.
So to me, as a kind of general scientist, to the extent I am one, I like to be agnostic about what I should focus on, or, if you will, what lens I should look through when I look at a system. And computation is one such lens. To me, that's the lens which focuses on the storage and transmission and transformation of information in a system. So in the cell, right, I have friends who are studying cells and who study the origin of life. And the ribosome is clearly, in part, a computational device, which is transforming information from one form into another. The error-correction mechanisms in DNA replication and so on are clearly, in essence, computational. And so you certainly learn a lot by looking at that system, looking for computation. On the other hand, people who have tried to build artificial life, partly because they want to understand how life began in the physical world, I've heard some of them say that we moved too far in a purely computational or informational direction. Right? So one idea about the origin of life is some kind of ultimately formal system where strings make more copies of themselves. And this brings us to things like lambda expressions which make copies of themselves, or Turing machines, or, as in Core War, a little bit of assembly code which copies itself elsewhere and makes more copies of itself. And that is one approach, the sort of replicator-first approach to life. But some people think the problem with that is that it doesn't recognize that organisms are really dealing with thermodynamic constraints. They really have to get energy. They have to manage chemical gradients and extract free energy from those chemical gradients. So there's a lot of physics that they have to do, and chemistry that they have to do. And here things like the abundance of different elements, or electrical charge, or light, thermodynamic stuff, really matter.
And so from this point of view, the fundamental thing is not the replicator. It's more like the metabolism, the thing which channels free energy the way a river channels water or the way a lightning bolt channels electrical charge. And so, you know, this is a caricature, but you could say: oh, all this informational stuff, the genome, all these wonderful strings of symbols that look very much like Turing machines to us, this is just stuff that the selfish metabolism built to better channel free energy. As opposed to: metabolisms are things that replicators built to get the energy we need to replicate. And maybe, who knows which thing is the tail and which thing is the dog, which came first, and maybe there's some truth to all of this. Similarly, you could say that even the orbits of planets in the solar system are computing. They're computing their own future positions. And yes, you can say that. I'm not sure what we learn about planets by saying that. So for me, as a pan-computationalist, do I think everything is computing? Yeah, but I think in some cases that's an informative thing to say, and in other cases a less informative thing to say. So I treat it just like other sorts of lenses. Another lens is adaptation. Are things evolving? Are they adapting? Are they learning, either within a lifetime or over evolutionary time? Yes, that's another thing that a lot of things are doing. Sometimes that's really important to understanding them. Other times it might be less so.
You know, so, for instance, I have a strong allergy to evolutionary psychology. Like, I know maybe some of the ways we treat each other and think and feel might be because it was adaptive, because we're social primates, blah blah blah, but I don't really find that helpful. I certainly don't find it helpful for thinking about ethics. And the origins of things are not always the important thing about them. The Constitution was written by slave owners. Yes, that's historically important. It's also this system that we can use and call upon now to try to do good things in society. So each of these lenses is interesting, and they reveal different things, and it depends on what you're trying to do and what kind of phenomenon you're trying to understand. And I think we should freely and fluidly switch back and forth between them when we're trying to understand different things. I didn't quite get whether you would agree or disagree that the universe, ontologically, in its primacy, could be thought of as computational. So I guess I translate that into: is it simulatable by a computer? Oh, is there a distinction, though? Because I would think it would be possible for the stuff the universe is made out of to not be computational, but for us to still be able to simulate it. Maybe I've just said something very stupid there. Maybe what you said was correct. Is there a distinction there? I don't know. I mean, is a computer computational, right? A computer is this thing made out of elementary particles, which are doing all sorts of crazy things. We exploit a small fraction of their dynamics to make pixels and so on. I mean, obviously the laptop is doing all sorts of things other than the computation we want it to do. So is it, at the fundamental level, a computer? I don't know. I'm not trying to slip out of the question.
I like Feynman's question: if I have a spacetime box, a single cubic-meter-second of spacetime, is the amount of information processing, or shall we say computation, in there finite? And I'm inclined to think so, but I don't really know. I mean, I don't think that at the fundamental level things are cellular automata, because I think that doesn't really work with quantum mechanics. I like this picture that at the Planck scale something funny happens to spacetime, so that you don't really have an infinitely divisible continuum of space and time at those smallest scales. I don't think it's a lattice, but maybe it's something more amorphous. And people talk about causal networks and so on; my first paper was about causal networks. Anyway, I'm inclined to think that at the end of the day the physical Church-Turing thesis is true. One form of it says that any device we could actually build would be simulatable by, say, a quantum computer with finite resources. Then there's also a question about what the universe does by itself, right? Even in this box there could be analog degrees of freedom that go all the way out to infinity. You know, the states of this box could be real numbers that really have an infinite number of digits. And I spent some time in my career thinking about analog computation, which, by the way, has a wonderful, cool history, with Claude Shannon building mechanical computers and so on. So if you have real-number computation, then in theory there are an infinite number of bits there that you could call upon. The question is, can you read from them? Can you write to them?
But even if we couldn't access them as engineers to do an infinite amount of computation, maybe the universe is still doing an infinite amount of computation itself, if you know what I mean. I'm inclined to think that's not the case. So yes, I'm inclined to think that there is a finite amount of computation happening, if you want to say that that means it's computational. Although if there were an infinite amount, we could still say it's computational; it's just a really awesome kind of infinite hypercomputation. Yeah, some of the work on hypercomputation is a little silly. But, you know, what happens in black holes? You can set your grad students up in orbit around a black hole, make sure they have a hereditary monkhood which will keep working on a problem. Then you wave goodbye and fall into the event horizon, and if their Turing machine ever halts, if they ever solve the problem, they send you a signal, which of course will vaporize you, because it will be blueshifted into gamma rays. But then, in theory, maybe you could learn something. And if you have closed timelike curves, you can do really awesome things. Yeah, I don't know. Part of it is that physicists, which is my original culture, have an allergic reaction to infinity. So for us, when something is blowing up, like when an integral is diverging, or some infinite sum is diverging instead of converging, like when you seem to be able to do an infinite amount of computation in finite time, this is a sign that something is breaking down, right? And so the attitude is: oh, well, the problem with this black hole idea is that there's cosmic censorship, which will actually prevent us from making closed timelike curves.
Or there's going to be some sort of noise or firewall, whatever, at the event horizon which will blow up our ability to do this. Or, as Sean Carroll says, the universe is expanding so fast that, rather grimly in my opinion, we can only do a finite amount of computation before the stars go out from the accelerating expansion. And, you know, this just pisses me off. I'm like, we'll do something about it. The point is not to study the world, but to change it. So we should do something about that, which gets us into science fiction. But for physicists, right, every time in particle physics there's something which seems to give an infinite answer, we think that means our theory is breaking down somewhere. And historically that's been true. So that gives us the sense that there aren't any real infinities. And in particular, that sort of fits with the idea that we're never going to be able to build a box, or even find a box out there made of black holes or whatever, that can solve undecidable problems. But we don't really know, right? Ultimately, this is a claim about the world which may or may not be true. I'm inclined to think it is. And I guess, yeah, I grew up on Fredkin and Toffoli's digital physics and reading Wolfram and playing with cellular automata. So yeah, I kind of think something discrete-ish is happening at the finest scales of space and time. Fascinating. And just before we go, Chris, we haven't really spoken about algorithmic justice. This is something that you've been spending a lot of time looking at recently. And I suppose it's difficult because we're building these inscrutable neural network models. And I think, certainly in common parlance, there's this intuition that they need to be inscrutable, because if we make them interpretable, if we kind of dumb them down to be understandable, then they don't work as well.
But we now have these unbelievable, illegible black boxes that are making consequential decisions in our society. Right. Yeah. So I have thoughts about this, and maybe this is another conversation. I don't think these things should be inscrutable. I mean, I think there is a range of applications, right? If you recommend movies to me using a black box and I like the movie, everybody's happy. That doesn't bother me. Maybe if I were a filmmaker I would want to know more, but it doesn't really bother me as a consumer. At the other extreme, if you are putting me in jail, even though I've not yet been found guilty of a crime, or if you're using AI to help find me guilty of a crime... You know, we have these things in the Bill of Rights that say I should be able to confront my accuser. I should be able to cross-examine witnesses. I should be able to contest evidence. This is what people call procedural fairness. And the interesting thing about the criminal justice system is that we explicitly care about things other than accuracy. So, for instance, we've all watched TV shows where the guy actually did the deed, but the police planted the evidence. They violated the rules of evidence. They knew he was guilty. They wanted to put him away. And then they crossed the line. And because of that, he got to walk. And in our society, we think that's how it ought to work, because we don't just want to be accurate in putting away guilty people and releasing innocent people. We want to have a certain relationship between government and its citizens. We want to have rules about how the government can surveil you, investigate you. And that's really profound, right? And how do you optimize for that? How do you even mathematize that? In some of the work on fair machine learning, people look at statistical notions of fairness. So we have this group of people, we have that group of people.
I'm a little bit disturbed by the assumption that everybody belongs cleanly to one of these two groups. I think that's part of the problem. But to the extent that we can divide the world into subpopulations, it's like: well, we want the false positive rate to be equal, or whatever. Well, that's a constraint. We can add that to the model. We can tack that onto the algorithm, and people have done lots of good work in that direction. But I'm really fascinated by these other, harder-to-mathematize notions, not just of fairness, but, you know, what do we really want these systems to do? One interesting fact, which I recently learned from Marc Canellas, who has a PhD in aerospace engineering and then went to law school and became a public defender, concerns a lot of the software products which were being used to do DNA testing, specifically this thing called probabilistic genotyping. It's been a couple of days, multiple people have passed through the scene, the DNA has fallen apart into pieces. There are choices to be made here; this is not a perfectly clean math problem about whether the defendant was at the scene of the crime. Of the software tools that are used for this, there are two or three popular ones. Some of them have never been, or at least until recently were not, independently tested by anyone. Many of them were not open source. They were proprietary products, and they sometimes disagreed with each other. So what is the right metaphor here? Are these things expert witnesses that you can cross-examine? Not really. Are their designers the witnesses? Do you cross-examine their coders, or the bioinformatics behind them? If they disagree with each other, how are judges and juries supposed to evaluate which one is better? So for me, I like the idea of transparency, which for me is a stronger word than explainability or interpretability.
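To make the "equal false positive rates" constraint mentioned above concrete, here is a small illustrative sketch (my code and made-up data, not any system discussed in the episode): compute the false positive rate per subpopulation, so the gap between groups is the quantity such a constraint would bound.

```python
def false_positive_rate(y_true, y_pred):
    """FPR = fraction of actual negatives (label 0) that were flagged
    positive (prediction 1)."""
    negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not negatives:
        return 0.0
    return sum(negatives) / len(negatives)

def fpr_by_group(y_true, y_pred, group):
    """Per-subpopulation FPR. An 'equalized false positive rate'
    constraint asks the model to keep these values (nearly) equal
    across groups."""
    return {
        g: false_positive_rate(
            [t for t, gg in zip(y_true, group) if gg == g],
            [p for p, gg in zip(y_pred, group) if gg == g],
        )
        for g in set(group)
    }
```

Of course, this presupposes exactly what the remark above questions: that each person can be cleanly assigned to one group in the first place.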
I agree transparency is a moving target. In some settings, it might just be: has some independent agency, a Consumer Reports or an Underwriters Laboratories, tested this thing, and can they verify the vendor's claims that it works? In some settings, that might be enough. And in a lot of settings, even that is missing, right? For things that are being used right now to make important decisions about people. In other settings, I really want to be able to look under the hood. And yes, I know deep networks are hard to interpret, but at least it's a start. If I can look under the hood, I can do these sort of fMRI experiments, like this Othello paper where people try to do the tomography and figure out what kind of model it's building. I think that's a very interesting line of work. So I think that, as humans, it would be very good for all of us, especially if we want a democratic society where we're making informed collective decisions about when to use these things, whether to use these things, and in what settings to use them, that we all try to understand them as well as possible. There are multiple sources of gaps in our understanding. Some gaps are there for honestly good reasons, like deep networks being hard to understand. Some gaps are there because of intellectual property, because people don't want to reveal how these things work; they want them to be proprietary. I am not very sympathetic to that second kind of gap, and I think that kind of gap should be closed. I don't think we should be using opaque proprietary tools to make decisions that affect people's fundamental human rights. I think it's a continuum. In health, it's interesting. If you are using a proprietary tool to diagnose my cancer, well, I mean, I'm a geeky guy. I'm really curious how it works.
But if it's been independently tested by people who are not paid by the vendor of the system, and it's really led to good outcomes, even if it's a black box, I might go along with it, because I want to live, you know. Yeah, I don't know. I mean, I think it's a continuum, what level of transparency we would demand. But when we get into constitutional rights, I think we should demand every possible form of transparency. And, yeah. Chris, it's been so lovely to have you on the show. Thank you so much for joining us today. Thank you very much. It's been a great time.
