Machine Learning Street Talk

The Fractured Entangled Representation Hypothesis (Intro)

Saturday, July 5, 2025 · 15m

What You'll Learn

  • Current AI models using SGD produce 'fractured, entangled representations' that are a 'mess' under the hood, despite producing impressive outputs
  • An alternative paradigm from Kenneth Stanley builds 'unified factored representations' that capture the world in a more modular, intuitive way without massive datasets
  • The path to discovering interesting artifacts may not resemble the final result, due to 'deception' - the stepping stones don't look like the end goal
  • Embracing the 'evolution of evolvability' can lead to superior, more creative representations by selecting for representations that are more amenable to future discovery
  • This provides a counterexample showing that neural networks don't have to produce entangled, chaotic representations - there are alternative approaches

AI Summary

This episode discusses the 'fractured, entangled representation' hypothesis, which suggests that the internal representations learned by current AI models using stochastic gradient descent (SGD) are a 'mess' and lack the deep, structured understanding required for true intelligence. The host introduces an alternative paradigm from researcher Kenneth Stanley, which builds 'unified factored representations' that capture the world in a more modular and intuitive way, without relying on massive datasets. The key ideas are that the path to discovering interesting artifacts may not resemble the final result, and that embracing 'deception' and the 'evolution of evolvability' can lead to superior, more creative representations.


Topics Discussed

#Fractured entangled representations · #Stochastic gradient descent · #Unified factored representations · #Deception in representation learning · #Evolution of evolvability

Frequently Asked Questions

What is "The Fractured Entangled Representation Hypothesis (Intro)" about?

This episode discusses the 'fractured, entangled representation' hypothesis, which suggests that the internal representations learned by current AI models using stochastic gradient descent (SGD) are a 'mess' and lack the deep, structured understanding required for true intelligence. The host introduces an alternative paradigm from researcher Kenneth Stanley, which builds 'unified factored representations' that capture the world in a more modular and intuitive way, without relying on massive datasets. The key ideas are that the path to discovering interesting artifacts may not resemble the final result, and that embracing 'deception' and the 'evolution of evolvability' can lead to superior, more creative representations.

What topics are discussed in this episode?

This episode covers the following topics: Fractured entangled representations, Stochastic gradient descent, Unified factored representations, Deception in representation learning, Evolution of evolvability.

What is key insight #1 from this episode?

Current AI models using SGD produce 'fractured, entangled representations' that are a 'mess' under the hood, despite producing impressive outputs

What is key insight #2 from this episode?

An alternative paradigm from Kenneth Stanley builds 'unified factored representations' that capture the world in a more modular, intuitive way without massive datasets

What is key insight #3 from this episode?

The path to discovering interesting artifacts may not resemble the final result, due to 'deception' - the stepping stones don't look like the end goal

What is key insight #4 from this episode?

Embracing the 'evolution of evolvability' can lead to superior, more creative representations by selecting for representations that are more amenable to future discovery

Who should listen to this episode?

This episode is recommended for anyone interested in fractured entangled representations, stochastic gradient descent, or unified factored representations, as well as anyone who wants to stay up to date on the latest developments in AI and technology.

Episode Description

What if today's incredible AI is just a brilliant "impostor"? This episode features host Dr. Tim Scarfe in conversation with guests Prof. Kenneth Stanley (ex-OpenAI), Dr. Keith Duggar (MIT), and Akarsh Kumar (MIT).

While AI today produces amazing results on the surface, its internal understanding is a complete mess, described as "total spaghetti" [00:00:49]. This is because it's trained with a brute-force method (SGD) that's like building a sandcastle: it looks right from a distance, but has no real structure holding it together [00:01:45].

To explain the difference, Keith Duggar shares a great analogy about his high school physics classes [00:03:18]. One class was about memorizing lots of formulas for specific situations (like the "impostor" AI). The other used calculus to derive the answers from a deeper understanding, which was much easier and more powerful. This is the core difference: one method memorizes, the other truly understands.

The episode then introduces a different, more powerful way to build AI, based on Kenneth Stanley's old experiment, "Picbreeder" [00:04:45]. This method creates AI with a shockingly clean and intuitive internal model of the world. For example, it might develop a model of a skull where it understands the "mouth" as a separate component it can open and close, without ever being explicitly trained on that action [00:06:15]. This deep understanding emerges bottom-up, without massive datasets.

The secret is to abandon a fixed goal and embrace "deception" [00:08:42]: the idea that the stepping stones to a great discovery often don't look anything like the final result. Instead of optimizing for a target, the AI is built through an open-ended process of exploring what's "interesting" [00:09:15]. This creates a more flexible and adaptable foundation, a bit like how evolvability wins out in nature [00:10:30].

The show concludes by arguing that this choice matters immensely. The "impostor" path may be hitting a wall, requiring insane amounts of money and energy for progress and failing to deliver true creativity or continual learning [00:13:00]. The ultimate message is a call to not put all our eggs in one basket [00:14:25]. We should explore these open-ended, creative paths to discover a more genuine form of intelligence, which may be found where we least expect it.

REFS:

Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis
Akarsh Kumar, Jeff Clune, Joel Lehman, Kenneth O. Stanley
https://arxiv.org/pdf/2505.11581

Kenneth O. Stanley, Joel Lehman
Why Greatness Cannot Be Planned: The Myth of the Objective
https://amzn.to/44xLaXK

Original show with Kenneth from 4 years ago:
https://www.youtube.com/watch?v=lhYGXYeMq_E

Kenneth Stanley is SVP Open Endedness at Lila Sciences
https://x.com/kenneth0stanley

Akarsh Kumar (MIT)
https://akarshkumar.com/

AND... Kenneth is HIRING (this is an OPPORTUNITY OF A LIFETIME!)
Research Engineer: https://job-boards.greenhouse.io/lila/jobs/7890007002
Research Scientist: https://job-boards.greenhouse.io/lila/jobs/8012245002

Tim's code visualisation of FER based on Akarsh's repo: https://github.com/ecsplendid/fer

TRANSCRIPT: https://app.rescript.info/public/share/YKAZzZ6lwZkjTLRpVJreOOxGhLI8y4m3fAyU8NSavx0
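To make the "sandcastle" description of SGD concrete before the transcript, here is a minimal toy sketch. It is my own illustration, not code from the paper or from the linked fer repo: the tiny two-layer network, the circle target, and the hyperparameters are all made up for the demo. It simply nudges every weight down the loss gradient over and over until the output matches the target, which is the "adjust every grain of sand until it looks like a sandcastle" loop the episode describes.

```python
# Toy illustration of SGD as brute-force fitting (not the paper's code).
# A tiny two-layer network maps pixel coordinates (x, y) to an intensity,
# and every weight is nudged down the loss gradient until the output
# "looks like" the target image, regardless of how the weights are organized.
import numpy as np

rng = np.random.default_rng(0)

# Target: a simple filled circle, standing in for "the skull".
n = 32
ys, xs = np.mgrid[-1:1:n*1j, -1:1:n*1j]
coords = np.stack([xs.ravel(), ys.ravel()], axis=1)        # (n*n, 2)
target = (xs**2 + ys**2 < 0.5).astype(float).ravel()       # (n*n,)

# Tiny MLP: 2 -> 16 -> 1 with tanh hidden units.
W1 = rng.normal(0, 1.0, (2, 16))
b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1))
b2 = np.zeros(1)

lr = 0.05
for step in range(2000):
    # Forward pass.
    h = np.tanh(coords @ W1 + b1)          # (N, 16)
    pred = (h @ W2 + b2).ravel()           # (N,)
    err = pred - target
    loss = np.mean(err**2)

    # Backward pass (mean-squared-error gradients, derived by hand).
    N = len(target)
    dpred = 2 * err[:, None] / N           # (N, 1)
    dW2 = h.T @ dpred
    db2 = dpred.sum(axis=0)
    dh = dpred @ W2.T * (1 - h**2)         # tanh derivative
    dW1 = coords.T @ dh
    db1 = dh.sum(axis=0)

    # "Adjust every grain of sand" a little, then repeat.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if step % 500 == 0:
        print(f"step {step:4d}  loss {loss:.4f}")
```

The printed loss should fall as the output comes to resemble the target. Note that nothing in this loop rewards a clean internal decomposition; it only rewards matching the output, which is exactly the point the episode goes on to make.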

Full Transcript

2025 is fast becoming the dawn of a new age of artificial intelligence, an age of miracles. What if I told you that the AI we know today might not be as good as it appears, that what lies underneath the glorious facade is not really intelligent? It's an imposter. AI can create breathtaking art from a single sentence, write complex code in seconds, and converse with us like an old friend. This explosion of capability has led to a powerful and optimistic assumption that as we scale these models, their underlying understanding of the world will get better and better. And yet... It's not just that they're beyond human understanding, it's that they're trash. With conventional SGD, which is like the backbone of all of machine learning right now, you get a completely different kind of garbage representation, just total spaghetti. Total spaghetti. So if the internal wiring is a complete mess, how could it possibly produce such brilliant results on the outside? The surprising reason is that it's learned to fake it. Another good metaphor is to think of it as an imposter. The representation of the skull is just somehow a farce. If you just look at the output, it's great. It looks exactly like a skull. But underneath the hood, it's not capturing any of the underlying components or the regularity. So in some sense, it's not really a skull. It's an imposter underneath the hood. To get familiar with the imposter, we have to look at the engine driving almost all of modern AI. The dominant method for training AI today is called stochastic gradient descent, or SGD. It's basically a brute-force search, painstakingly adjusting every single grain of sand over and over until its output matches the correct answer, until the thing looks like a sandcastle. Basically, it works. But a groundbreaking paper from Kenneth Stanley and his team reveals a big difference between the AI we build today and a different path we could have taken. By the way, Kenneth Stanley is a hero of mine; he wrote Why Greatness Cannot Be Planned. The special edition show we did with him four years ago was peak MLST. But anyway, when you look at these internal representations created by SGD, to put it politely, they're a mess. Garbage representations, just total spaghetti. We came up with some terminology that we put in the paper to more clearly articulate what these differences are. But basically, you're talking about just amazing versus garbage. The question that the paper addresses is, what does this mean? Which is something I think that has endless repercussions and potential implications, like the fact that we're basing the entire field on something that produces this complete garbage under the hood. The paper gives this garbage representation a formal name, a fractured, entangled representation. It argues that concepts which should be unified are fractured and discombobulated into overlapping pieces, and behaviors which should be independent become entangled. It is, in essence, the difference between a deeper understanding and elaborate memorization. I can give you a personal example of that. So in high school, when I went to sign up for physics, for physics one, they put me in the one that was for people who had not had calculus. And I had had calculus, right? And so I'm in this class and I'm like, what the heck are we doing?
We're just memorizing all these like long lists of equations, you know, for a cannonball in this situation. And they're like, oh, actually we put you in the wrong class, you have calculus, you're supposed to be over here. So after a week I switched to the other one. It was so much easier, because I knew calculus, so I didn't need to have a formula for this specific cannonball situation. I could just derive it or just calculate it directly, you know. It's a radically different learning mode. This distinction is really important: basically, whether intelligence can only replicate what it's seen, versus one which can go on to create something new. Two mathematicians, they can both ace like a math exam, and one can go on to become like a great mathematician that discovers a lot of things in the field and the other one can go on to discover nothing. It doesn't give you a picture of what we really care about, which is downstream, like how they influence the field and how their research progress carries out. Today's large language models are the second mathematician. They ace the benchmark test, but they are imposters, lacking the deep, structured understanding required for inventive creativity, which is to say, taking the next step forwards out of the box. But what if there's another way? The paper discusses another leading paradigm, founded in an old online experiment Kenneth did many years ago called Picbreeder. The Picbreeder system allowed people to effectively breed pictures. We found inside of the system that the people who would decide they want a certain image and try to evolve that image would fail. And then people who are not looking for anything particular would discover all these amazing things. Like the butterfly was kind of the symbol that we used because we put it on the front of the book. One of these serendipitous kinds of discoveries. And this led to this idea that, well, you know, there's many things in the world that we're not going to be able to find if we directly search for them because of deception. And that's the underlying reason. We'll come back to this concept of deception and what it means later. But first, let's talk about this new architecture, which improves on SGD. The way these new networks learn is completely different. The representations they create are beautiful. You know, they actually represent the objects at a deep, abstract level. Kenneth and his co-authors called this a unified factored representation. Instead of a tangled mess, the system builds clean, modular, and shockingly intuitive models of the world. The underlying representations of these images, which are basically represented, encoded by these neural networks, are absolutely incredible, amazing. And there is like no good explanation for how they could be as good as they are. They have unbelievable modular decomposition, which means that it's almost like it was engineered by a person. There's a network that generates the image of a skull, and the network has decomposed it such that there's a component of the network that's responsible for the mouth. It can do things with the mouth, like open and close the mouth, or there's another dimension that can make the mouth smile. Would you believe me if I told you that this deep understanding materialized bottom-up? It was built brick by brick, as it were, without being trained on a massive data set with billions of free parameters. Absolutely incredible, mind-blowing to me, because it's like there's a world model of what a mouth is there without being data-driven.
Like, how is that even possible? There's not a lot of data here, but we're getting world models out of this thing. This observation that it matters how you got to the solution, how it's represented under the hood, just hasn't gotten the light of day until now. And it's kind of a companion to the old insight from the book, which is that sometimes the only way to find something is by not looking for it, but now there's this caveat. But even if you do find it by looking for it, you may pay a steep, steep price in terms of the underlying representation. The most intuitive evidence comes from sweeping the parameters, or the factored representations, as Kenneth would call them. By changing a single connection in the network, you can actually see which factor of variation it represents. In this new type of network, sweeping these values results in a commensurate semantic change. It might be opening the mouth on a skull, or winking the eye on a face, or swinging the stem of an apple. It's like the network understands what these objects are at a deep level. In conventional networks, the same action just produces meaningless, chaotic distortions. This is what we mean by the imposter. And in case you didn't get the memo, this is basically how ChatGPT works now. We needed to have a huge number of free parameters in the network to make it trainable, right? To make it statistically tractable. But it's precisely that reason that we end up with a sandcastle. It looks like a castle, but it doesn't have any structural joints. It doesn't look anything like we know a castle to be. Why is one network a sandcastle and the other one the real deal? Well, the secret lies in abandoning the fixed objective in training and building bottom up, not chipping away top down like SGD does. We also need to embrace a counterintuitive notion called deception. Deception means the stepping stones that lead to these interesting artifacts that you might want to find don't resemble them. If you have an algorithm that's trying to follow a gradient by matching closer and closer and closer to the objective, getting a higher and higher score, you're going to get stuck in a dead end because of deception. Because the things that lead to the thing you want actually don't look like the thing you want. And this is true in the lineage of many of these images in Picbreeder. The paper showed the path to the skull in Picbreeder. The key idea is that sometimes the stepping stones which lead to something important don't even resemble the thing you end up discovering. It might seem like total serendipity, total randomness, but humans have a nose for what is interesting, which has a lot to do with the foundational cognitive prize which nature has bestowed on us through constraints in our evolution and physical environment and, of course, our life experiences on top of that. On the road to getting an image of a skull, they were not thinking about skulls. When they discovered a symmetric object, like an ancestor to the skull, they chose it even though it didn't look like a skull, but that caused symmetry to be locked into the representation. From then on, symmetry was a convention that was respected as they then searched through the space of symmetric objects. And somehow this hierarchical locking in over time creates an unbelievably elegant hierarchy of representation. This hierarchical locking tells us something really important about how representations emerge.
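The "sweep a single connection" probe described above can be sketched in a few lines. This is an illustrative mock-up, not code from the paper or from the ecsplendid/fer repository: `render` and `sweep_weight` are hypothetical helper names, and the randomly initialized network merely stands in for a real Picbreeder-style image generator. The idea is simply to vary one weight over a range, re-render the image at each setting, and then inspect whether the change tracks a coherent semantic factor or is just distortion.

```python
# Minimal sketch of a "weight sweep" probe (illustrative only).
# Given any network that maps (x, y) coordinates to a pixel value, vary ONE
# weight over a range and render the image at each setting. In a unified
# factored representation the frames change along a single semantic factor
# (e.g. a mouth opening); in a fractured entangled one they warp chaotically.
import numpy as np

def render(params, n=64):
    """Render an n x n image from a tiny coordinate-to-pixel network."""
    W1, b1, W2, b2 = params
    ys, xs = np.mgrid[-1:1:n*1j, -1:1:n*1j]
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1)   # (n*n, 2)
    h = np.tanh(coords @ W1 + b1)
    img = np.tanh(h @ W2 + b2)
    return img.reshape(n, n)

def sweep_weight(params, layer, idx, values):
    """Re-render the image with ONE chosen weight set to each value in turn."""
    frames = []
    for v in values:
        perturbed = [p.copy() for p in params]
        perturbed[layer][idx] = v            # overwrite just this one connection
        frames.append(render(perturbed))
    return frames

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    params = [rng.normal(0, 1, (2, 16)), np.zeros(16),
              rng.normal(0, 1, (16, 1)), np.zeros(1)]
    values = np.linspace(-2, 2, 9)
    # Sweep the connection from hidden unit 3 to the output pixel.
    frames = sweep_weight(params, layer=2, idx=(3, 0), values=values)
    base = render(params)
    for v, f in zip(values, frames):
        print(f"weight={v:+.2f}  mean |change| vs. baseline = {np.abs(f - base).mean():.3f}")
```

In practice you would save or plot the frames and judge by eye, as the episode describes; the printed per-frame difference is just a quick numeric stand-in for that inspection.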
We think it's about finding the right building blocks now, but weirdly it's about making future discoveries more likely. You know, a bit like how good code now reduces technical debt in the future. Bad code is the sandcastle. And there's an evolutionary principle which makes sure that these superior foundations win out over time. In Picbreeder, I think what was especially at play was, like, the evolution of evolvability. Because people were only selecting for what they wanted, right? What looked good. But implicitly, there's also an implicit selection pressure for evolvable things. So if there's, like, two versions of the skull, where one is like spaghetti and one is like very modular and composable, after a few generations of evolution, the one that's more evolvable will be the one that wins out. Right. Just like in natural evolution, the evolution of evolvability. And this evolvability combined with the serendipity is what I think gives you these nice representations. The thing that I think makes this really intriguing is that it gives you something that otherwise could never exist, which is a counterexample that there actually do exist networks that don't have that issue. You would think that that's just intrinsic to neural representation, that somehow they just look like kind of entangled messes and that's just the way life is. But clearly it's not how life has to be. This leaves us with a choice: the path of a singular goal-orientated kind of optimization, which creates brittle, fractured imposters, or the path of open-ended exploration, which ostensibly creates robust, unified models. They argue that this choice fundamentally impacts three important things that we want from AI, which is to say generalization, creativity, and continual learning. If you think of the skull, again, as a metaphor for all of human knowledge, because that's what an LLM is trying to capture. It's not just a single image. It's like an image of all of human knowledge. For any input, it should output something that's convincingly human. Then it could be just similarly an imposter. It could be the same: underneath the hood, everything could be organized wrong, not the way you expect. It's like a giant charade. And again, this is very confusing and counterintuitive for people, because, like, you know, people naturally, including even me, would react like: but should I really care? When you say it's an imposter, but it's getting everything right and it's human level, what are you objecting to? But the point is that it can still be an imposter, because what we care about here is not just that it's going to get answers right, like get good test scores, like seem to be plausibly human when you talk about things that are in distribution. We want it to be able to go outside, like to do things that are creative, to be able to continue to learn, like to get to the next level, including learn on its own and get to the next level. I mean, these are like the next frontiers for the field. If it's an imposter underneath the hood, then these kinds of things are going to hit a wall, or become insanely expensive. It could be that you can always push through that wall, but the expenses just go up and up like crazy. Exponential, or worse, I don't know what it means, but it could be something terrible. We might already be seeing that. The amount of money that we're spending here raises questions like, is it necessary? Does it have to cost this much in energy and in money?
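The "evolution of evolvability" point above lends itself to a deliberately simplified simulation (again, my own toy, not Picbreeder): two lineages start from the same phenotype and hill-climb toward whatever the selector prefers, but one genome is modular (each gene moves one phenotype factor) and the other is entangled (each gene perturbs every factor through a fixed mixing matrix). Under the same selection pressure, the modular encoding typically makes much faster progress, because more of its single-gene mutations turn out to be useful.

```python
# Toy illustration of "evolution of evolvability" (not Picbreeder).
# Two genome encodings compete at the same task: reach a fixed target
# vector that stands in for "what the selector thinks looks good".
import numpy as np

rng = np.random.default_rng(0)
d = 10                                   # number of phenotype "factors"
target = rng.uniform(-1, 1, d)           # what the selector happens to like

# Entangled decoder: a fixed dense mixing matrix, so every gene touches everything.
mix = rng.normal(0, 1 / np.sqrt(d), (d, d))

def decode(genome, entangled):
    """Modular genomes ARE the phenotype; entangled ones get scrambled."""
    return mix @ genome if entangled else genome

def evolve(entangled, generations=200, offspring=8, sigma=0.1):
    """Hill-climb: each generation, keep whichever child looks closest to the target."""
    genome = np.zeros(d)                 # both lineages start from the same phenotype
    best_dist = np.linalg.norm(decode(genome, entangled) - target)
    for _ in range(generations):
        best = genome
        for _ in range(offspring):
            child = genome.copy()
            child[rng.integers(d)] += rng.normal(0, sigma)   # mutate ONE gene
            dist = np.linalg.norm(decode(child, entangled) - target)
            if dist < best_dist:         # the "selector" keeps the best-looking one
                best, best_dist = child, dist
        genome = best
    return best_dist

print("distance to target after 200 generations:")
print(f"  modular genome   : {evolve(entangled=False):.3f}")
print(f"  entangled genome : {evolve(entangled=True):.3f}")
```

The exact numbers depend on the random seed, but the qualitative gap is the point: selection that only ever looks at the output quietly favors the encoding whose mutations are more likely to be improvements.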
So this flips everything on its head, right? The very thing we're trying to control, the objective, is a bottleneck for the thing we actually seek, creativity. True creativity might even be intelligence. I mean, it's certainly the nearest quantity I can think of to describe what intelligence is. Once I say that what you need to be good at is, if I define where I want you to go and then you can get there, then I'm basically training you not to be able to be smart if you don't know where you're going. But that's what creativity is. It's about being able to get somewhere and be intelligent, even though you don't know where your destination is. The biggest risk may not be that our machines become too intelligent, but that we become too narrow in how we define intelligence. The blind pursuit of benchmarks and performance metrics might actually block us from discovering the real thing. I think one of the high-level things we should be doing is not putting all our eggs in one basket, right? That's, like, the main point of the open-endedness lesson. Obviously there should be people scaling up these LLMs to see how far the current paradigm can get us. More people should look into, you know, artificial life, Picbreeder, and the ideas from our paper, and see, because I think it's a very promising direction. We need to build an AI which doesn't regurgitate patterns from its training data but actually understands the deep structure of the world, an AI that can look at new scientific challenges, that can discover entirely new principles. The path to artificial intelligence is not a straight line towards a known destination. It's a divergent, unpredictable, open-ended search into the unknown. It's possible that the most important discoveries that we will eventually make will be the ones we aren't even looking for now. And by the way, folks, if you want to watch the entire roughly two and a half hours' worth of goodness with Kenneth and Akarsh, his co-author at MIT, um, yeah, we will be releasing that pretty much on our next episode. So hopefully this has whetted your appetite for that. Cheers.
