
Tiny Recursive Networks
Practical AI
What You'll Learn
- Tiny recursive networks are AI models with only 7 million parameters, much smaller than the billions of parameters in large LLMs
- These tiny models can match the performance of large LLMs on specific reasoning tasks like math problems and Sudoku puzzles
- The goal is to move beyond the 'one model to rule them all' approach of large LLMs and instead have a collection of specialized, efficient models for different real-world applications
- Transformer-based LLMs process a sequence of tokens through a single large, complex function, while tiny recursive networks use a smaller, iterative, recursive approach
- The hosts suggest this could be the next phase beyond the current hype around generative AI and large language models
AI Summary
The podcast discusses the concept of 'tiny recursive networks', which are small AI models (only 7 million parameters) that can perform reasoning tasks like solving math problems and Sudoku puzzles. These models are presented as an alternative to the large, transformer-based language models (LLMs) that have become prevalent in AI. The key points are that these tiny models can match the performance of much larger LLMs on specific tasks, while being more efficient and potentially more applicable to real-world scenarios that require specialized models rather than a single 'general purpose' model.
Topics Discussed
- Tiny recursive networks
- Reasoning tasks
- Model size and efficiency
- Specialized vs. general-purpose models
- Transformer-based language models
Episode Description
In this fully connected episode, Daniel and Chris explore the emerging concept of tiny recursive networks introduced by Samsung AI, contrasting them with large transformer-based models. They explore how these small models tackle reasoning tasks with fewer parameters, less data, and iterative refinement, matching the giants on specific problems. They also discuss the ethical challenges of emotional manipulation in chatbots.

Featuring:
- Chris Benson – Website (https://chrisbenson.com/), LinkedIn (https://www.linkedin.com/in/chrisbenson), Bluesky (https://bsky.app/profile/chrisbenson.bsky.social), GitHub (https://github.com/chrisbenson), X (https://x.com/chrisbenson)
- Daniel Whitenack – Website (https://www.datadan.io/), GitHub (https://github.com/dwhitena), X (https://x.com/dwhitena)

Links:
- Less is More: Recursive Reasoning with Tiny Networks (https://arxiv.org/html/2510.04871v1)
- Researchers detail 6 ways chatbots seek to prolong 'emotionally sensitive events' (https://news.harvard.edu/gazette/story/2025/09/i-exist-solely-for-you-remember/)

Sponsors:
- Outshift by Cisco – The open source collective building the Internet of Agents. Backed by Outshift by Cisco, AGNTCY gives developers the tools to build and deploy multi-agent software at scale. Identity, communication protocols, and modular workflows, all in one global collaboration layer. Start building at AGNTCY.org (http://agntcy.org/).
- Fabi.ai (http://fabi.ai/) – The all-in-one data analysis platform for modern teams. From ad hoc queries to advanced analytics, Fabi lets you explore data wherever it lives: spreadsheets, Postgres, Snowflake, Airtable and more. Built-in Python and AI assistance help you move fast, then publish interactive dashboards or automate insights delivered straight to Slack, email, spreadsheets or wherever you need to share it. Learn more and get started for free at fabi.ai.
- Miro – The innovation workspace for the age of AI. Built for modern teams, Miro helps you turn unstructured ideas into structured outcomes, fast. Diagramming, product design, and AI-powered collaboration, all in one shared space. Start building at miro.com (http://miro.com/).

Upcoming Events:
- Join us at the Midwest AI Summit (https://midwestaisummit.com/) on November 13 in Indianapolis to hear world-class speakers share how they've scaled AI solutions. Don't miss the AI Engineering Lounge, where you can sit down with experts for hands-on guidance. Reserve your spot today!
- Register for upcoming webinars at https://practicalai.fm/webinars!
Full Transcript
Welcome to the Practical AI Podcast, where we break down the real-world applications of artificial intelligence and how it's shaping the way we live, work, and create. Our goal is to help make AI technology practical, productive, and accessible to everyone. Whether you're a developer, business leader, or just curious about the tech behind the buzz, you're in the right place. Be sure to connect with us on LinkedIn, X, or Bluesky to stay up to date with episode drops, behind-the-scenes content, and AI insights. You can learn more at practicalai.fm. Now, on to the show.

Well, friends, it is time to let go of the old way of exploring your data. It's holding you back. But what exactly is the old way? Well, I'm here with Mark Dupuy, co-founder and CEO of Fabi, a collaborative analytics platform designed to help data explorers like yourself. So, Mark, tell me about this old way. So the old way, Adam, if you're a product manager or a founder and you're trying to get insights from your data, you're wrestling with your Postgres instance or Snowflake or your spreadsheets. Or maybe you don't even have the support of a data analyst or data scientist to help you with that work. Or if you are, for example, a data scientist or engineer or analyst, you're wrestling with a bunch of different tools, local Jupyter notebooks, Google Colab, or even your legacy BI to try to build these dashboards that someone may or may not go and look at. And in this new way that we're building at Fabi, we are creating this all-in-one environment where product managers and founders can very quickly go and explore data regardless of where it is. So it can be in a spreadsheet, it can be in Airtable, it can be in Postgres, Snowflake. Really easy to do everything from an ad hoc analysis to much more advanced analysis if, again, you're more experienced. So with Python built in right there, and an AI assistant, you can move very quickly through advanced analysis. And a really cool part is that you can go from ad hoc analysis and data science to publishing these as interactive data apps and dashboards, or better yet, delivering insights as automated workflows to meet your stakeholders where they are in, say, Slack or email or spreadsheets. So, you know, if this is something that you're experiencing, if you're a founder or a product manager trying to get more from your data, or, for your data team today, you're just underwater and feel like you're wrestling with your legacy, you know, BI tools and notebooks, come check out the new way and come try out Fabi. There you go. Well, friends, if you're trying to get more insights from your data, stop wrestling with it, start exploring it the new way with Fabi. Learn more and get started for free at fabi.ai. That's F-A-B-I dot A-I. Again, fabi.ai.

Welcome to another fully connected episode of the Practical AI Podcast. This is Daniel Whitenack. I am CEO at Prediction Guard, and I'm joined as always by Chris Benson, who is a principal AI research engineer at Lockheed Martin. And in these episodes where it's just Chris and I, we like to dive into certain topics that are trending in AI news and help both us and you hopefully level up your AI and machine learning game. How you doing, Chris? It's good to be back to one of these episodes with just the two of us and maybe explore a topic that we can both learn about. Absolutely. Yeah, I love these episodes of us just kind of bantering about whatever we happen to want to do.
I love the guest episodes, too, but it's kind of a different beast in that way of exploring what some person or organization is doing. And there's so many cool things that we can just dive into. And I think we have a few this week. Yeah, yeah. At least a first very, very tiny topic to discuss, which, actually, one of our engineers brought up to me. I forget, it was earlier in the week. But this idea of tiny recursive networks, you know, all the time we're talking about transformer-based LLMs on the show and generative AI. And I kind of personally always love getting back to a little bit of cool data science and research stuff just to see, like, where the industry is headed, because this is a kind of different animal that we'll be talking about, these tiny recursive networks or models, and it operates differently than kind of the hype Gen AI models of today. And it does make me think, and I don't know if this is something you've been thinking about, Chris, but kind of we're all the time talking about Gen AI. We're talking about LLMs. Now we're talking about agents, all of those being driven by these transformer-based LLMs. And certainly people have talked, you know, prominent people have talked about the fact that we need to get beyond transformer-based LLMs. And of course, there's many companies that are just centered around these types of models. So any thoughts on that of, you know, your own predictions or thoughts of when we're kind of headed to the next phase of what models will look like beyond just transformer-based LLMs?

Yeah, I mean, I think I'm going to sound like a broken record on this because it's not new for me. And that is, you know, I agree with you. The hot things that the media tends to follow in general are the big LLMs, because I guess it's the next giant thing. It's sexy, you know, to talk about. But all these technologies are moving from the cloud out into the world, into physical AI, you know, and robotics and all sorts of ways that we interact with, you know, not just LLMs, but all sorts of models out there in the world, to where every one of our lives is touched in so many different ways. And that's exactly, as we were diving into this topic with the tiny recursive networks, what it seems like to me. I'm looking forward to talking about that because, as I mentioned to you right before the show started, you could see these popping up everywhere, like in all sorts of different use cases.

Yeah. And just to set the stage, why this is maybe intriguing: there was a paper that came out, Less is More: Recursive Reasoning with Tiny Networks. This came out from Samsung's AI lab in Montreal. Specifically, there's an author, Alexia, on this article, who, if you're out there listening, we'd love to have you on the show. Please come join us and talk more about this. Hopefully we won't butcher this work too bad as we talk about it. You'll hear Chris and I kind of learning as we go on this episode, as we talk back and forth about what this exactly is. But the paper is Less is More: Recursive Reasoning with Tiny Networks. And I think the major thing that's interesting here is there's a model that they talk about that has only 7 million parameters, which is tiny. Yeah, yeah. So, like, I just want to sort of let that sink in.
So I didn't say 7 billion parameters, 7 million parameters with an M, which, yeah, Chris, as we've basically gone along this trend, I mean, actually a 7 billion parameter model now is quite small. Yes. In a sense. 27 million compared to what we traditionally call very small at the 7 billion level. I mean, this is, you know, using the word tiny for a reason. But yeah, when you think of millions as being almost nothing, it's an interesting context shift there.

Yeah, yeah. So I actually love this, because I love the idea that we could move into a phase where we're dealing with models that are very small, can run on commodity hardware, or at least be smaller in size. And they may run for longer periods of time, or there may need to be optimization around how they run recursively. We'll get into that recursive bit. But certainly a small model. But it was shown to kind of have, let's say, comparable or on-par performance with some of the big guys. So we're talking like DeepSeek R1, Gemini 2.5 Pro. These are models with billions and billions of parameters, very, very huge transformer-based LLMs, these kinds of reasoning models that we've talked about on the show. And the center of this work is really kind of related to these reasoning tasks. Now, right off the bat, I think it would be worth saying that it's not like this tiny recursive network is a general-purpose model that can do whatever you want it to do. It was trained for a very kind of small number of tasks, but these were reasoning tasks that some of these other models, like a DeepSeek R1 or something, sometimes have quite a bit of issue with. So solving, like, math or Sudoku types of puzzles. And I know, Chris, we had talked about this. I don't know if you want to refresh some of your Sudoku experiments.

Yeah, well, what you're referring to is some episodes back, I'd have to look up and figure out where it was. I was playing with GPT-4 at the time on Sudoku. And it was just doing a terrible job on Sudoku and giving a lot of just really bad output in terms of, I mean, honestly, crap answers. And so that was really the first thing I noticed on this thing, because they call out Sudoku as being one of the things: these tiny models being trained for very specific tasks could potentially outperform these large models on specific things like Sudoku, that being one example among others. But I think in a slightly larger sense, this is much more real-world applicable the way I see it, in that as we have models spreading across the world for lots of different tasks, this is perfect for that. It's not one model to rule them all in most real-life situations. It's really a collection of very specific models that each does a task very, very well and is efficient at that. And I think this is a great example of that.

Well put. And it's probably worth reminding ourselves about transformers as we highlight the differences with this model. So, you know, as we've gone through the process from deep learning to recurrent neural networks and transformer-based self-attention networks, if you imagine what we have with these big LLMs, what happens is you put in a sequence of tokens, which are represented by numbers, these tokens being kind of words or subwords.
And all of those tokens are processed in a forward pass through a giant set of, if you want to think about it this way, sub-functions, which add and multiply and combine those numbers through a very vast network of functions to generate many different probabilities of kind of next words coming out, which allows you to predict a completion of words, or a set of reasoning, or a set of thinking, or a solution to a problem, right? This is how these networks work. And I would recommend people, we've had Jay Alammar on the show before. He has some great blog posts, The Illustrated Transformer. So Jay, shout out to you. Thanks for doing that. I would take a look at those blog posts. They do a great job at explaining this more visually, for those where that would be helpful.

But the main kind of thing here that I'm saying is, in the models that we're using now, basically it's a giant function. If you want to think about it that way, it's a data transformation. You put something in one end, it processes one way through the function and produces a result. Now, you may run that function multiple times to produce multiple words out the other end, which is what happens when you stream output into, like, a chat interface. But ultimately, each time the model runs, it's a single run through the model: input is transformed to some output. That is not recursive, as we would say with these models. And if you want to think about it, it requires very, very large models, because what you're modeling is a very complicated data transformation. So for you to put in some text related to a math problem and predict the right wording of a solution out the other end, that's actually a very non-trivial data transformation, right? Which means you have to have a very large function to kind of fit or model that data transformation, which is why these models have become so large.

So now with this tiny recursive setup, what's happening is you're not just looking at the model as a single forward pass data transformation, but you introduce the idea of recursion, which means the output from the model becomes the input for the same model, which creates this kind of circle or recursion, which is kind of interesting. So you're essentially trading what would be a very large function to model that data transformation for many, many kind of recursive runs of a single, very small model. That's maybe a simplified way to put it. And we can get a little bit more into the model itself here in a second.

So I have a question for you on this. When I was mentioning the 27 million earlier, I was talking about the hierarchical reasoning model, which is a previous model put out there, versus the tiny ones, which have the five to seven million parameters. Can you talk a little bit about, like, are they completely different things? Is the tiny an offshoot of the hierarchical? How do you compare those two?

Yeah, good point. So just like everything we talk about on this show, sometimes it does seem like things pop out of thin air, like tiny recursive models now. But in reality, there is a buildup of incremental research that leads to new technology or new findings. One of the things in that kind of lead-up to these tiny recursive models was a previous work around hierarchical reasoning models. And these also were smaller. Like you're saying, 27 million parameters is still very small by today's standards for models, at least.
But these hierarchical models actually use two very small transformer networks, four layers each, and they recursed between each other. So I don't have the full details of that, and wouldn't be able to explain it if I did, probably. But the main idea is these hierarchical reasoning models had these two models, which required two networks and two forward passes per step, and created some, I guess, complications because of that. So this introduction by Alexia and the Samsung team here is a single network. So it uses one tiny network with two layers. So a single tiny network with two layers that roughly has kind of five to seven million parameters. And it operates kind of in this recursive refinement. So it recurses on itself, if you will. And the hierarchical reasoning model with the 27 million parameters, for example, on the Sudoku Extreme benchmark scored a 55%. I don't know the exact kind of way that that's scored. But just by kind of comparison, the tiny recursive network, which is even tinier, when trained on 1,000 examples was able to achieve 87% accuracy on Sudoku Extreme.

Well, friends, you don't have to be an AI expert to build something great with it. The reality is AI is here. And for a lot of teams, that brings uncertainty. Our friends at Miro recently surveyed over 8,000 knowledge workers. And while 76% believe AI can improve their role, most, more than half, still aren't sure when to use it. That is the exact gap that Miro is filling. And I've been using Miro from mapping out episode ideas to building out an entire new thesis. It's become one of the things I use to build out a creative engine. And now with Miro AI built in, it's even faster. We've turned brainstorms into structured plans, screenshots into wireframes, and sticky-note chaos into clarity, all on the same canvas. Now, you don't have to master prompts or add one more AI tool to your stack. The work you're already doing is the prompt. You can help your teams get great done with Miro. Check out Miro.com and find out how. That is Miro.com, M-I-R-O.com.

So Chris, just to kind of drive the point home here with these tiny recursive networks: you have this single tiny network, and if you want to imagine a big kind of pipeline of processing, which is what these big LLMs are, you go one pass through the whole pipeline of processing. Here, the pipeline is smaller. You've got fewer layers, fewer parameters, but you replace that kind of depth with iteration. So instead of stacking those transformer blocks, you repeat the network over and over, essentially, to kind of refine its reasoning state or the solution guess, right? And so this iterative refinement, one of the things also is it kind of helps avoid overfitting on small data sets, which, to your point earlier, Chris, about real-world business cases, often the reality is that you have scarcity of data for very many problems. You don't often have a really nice kind of large set of millions and millions of things to train on. And I think that's one of the most fascinating things about this: in the paper, they talk about the fact that they're achieving this higher accuracy on hard puzzle benchmarks while training on only approximately a thousand examples.
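To make that depth-for-iteration trade concrete, here is a minimal PyTorch sketch of the idea as we understand it. The hidden size, the layer shapes, the recursion budget, and names like `tiny_network` are our own illustrative assumptions, not the architecture from the paper.

```python
import torch
import torch.nn as nn

hidden = 512  # hypothetical hidden size, chosen only for illustration

# A single tiny two-layer network, in the spirit of the model described above.
tiny_network = nn.Sequential(
    nn.Linear(hidden, 4 * hidden),
    nn.GELU(),
    nn.Linear(4 * hidden, hidden),
)

# Even with a generous hidden size, the parameter count stays in the low millions.
n_params = sum(p.numel() for p in tiny_network.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # ~2.1M for these sizes

# Depth traded for iteration: rather than one pass through hundreds of stacked
# layers, repeat the same small network over a refinement state.
state = torch.randn(1, hidden)  # stand-in for an encoded problem
for _ in range(16):             # a fixed recursion budget (one possible stopping rule)
    state = tiny_network(state)
answer = state
```

The point of the sketch is only the shape of the computation: a couple of layers reused many times, rather than many layers used once.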
And when you think about, you know, the challenge of having a great data set in the more traditional context that we've been talking about, that becomes such a challenge for many people and organizations, but it's a lot easier to get a thousand examples together. And it puts, not only from the computational side, but also from the data set side, this much more in reach for a lot of problems that people may have, where they do want to solve a narrow concern with high accuracy. And so I see this as kind of the everyperson's way of modeling, in terms of tackling things going forward without a lot of resources and maybe not a lot of time to put things together. You could probably do it pretty quickly.

Yeah, yeah, exactly. And I guess just to kind of put some of the boundaries that are currently around these recursive models: one of the things I was trying to parse through as I looked at this was, well, what is the setup now? How general can the input and the output be? And part of the trick here, you know, it's not a trick, but part of the setup, is that these tiny recursive networks don't take a kind of unstructured, you know, natural language text input. They take some structured representation of a whole problem at once. So you can think of a puzzle grid in Sudoku, or a math word problem turned into structured features, or a reasoning question encoded by numbers or symbols or logic, something like that. So instead of feeding in the input kind of word by word, like a chatbot, you're giving a situation to the model. It's kind of a one-shot situation, which is then turned into, of course, embeddings internally, because computers work on series of numbers, right? That's the only thing a computer can process: numbers. But those numbers, kind of that embedding, represent a kind of one shot of a problem, which is interesting because, you know, it almost seems like a flashback, like we're kind of coming full circle to reasoning problems, but in a more data science-y way than, like, a generative AI way, which is kind of refreshing and cool.

That's very much what I was about to say, in the sense that this feels a lot more traditional, like the way that you put a problem together in more of a traditional software development way, where you, you know, create some structures and pass them in. And when we got to Gen AI and then to prompting, that was, you know, the notion of prompting was a little bit different from the way we had traditionally put software together. This feels a lot more like, okay, I have a problem, I have a structured way to put that problem through the function, and this is just offering a different way to address that problem. So you're getting the benefit of these models. But for me, when you talked about using the Sudoku example and structuring that as the grid, that kind of feels like what we've always done, in a sense. So I find that really interesting in terms of integrating that in and looking at some of the old problems that we might've been trying to solve for years, and seeing what we can do with this model to do it a little bit better. Yeah. It's like looking at things from a different perspective, but with some of the way that we used to think about things kind of filtering in.
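As a rough picture of that whole-problem-at-once input, here is a hedged sketch of how a Sudoku grid might be encoded for such a model. The zero-means-empty convention and the embedding size are assumptions made for the example, not the paper's actual encoding.

```python
import torch
import torch.nn as nn

# One whole Sudoku puzzle as structured input: 81 cells, each holding a digit
# 0-9, where 0 marks an empty cell (an assumed convention for this sketch).
puzzle = torch.randint(0, 10, (81,))  # random stand-in for a real puzzle grid

# Each cell value becomes an embedding vector, so the network sees the entire
# problem in one shot rather than as a left-to-right token stream.
embed = nn.Embedding(num_embeddings=10, embedding_dim=64)
x = embed(puzzle)
print(x.shape)  # torch.Size([81, 64]) -- the one-shot problem encoding
```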
And yeah, in terms of people's intuition or mental model around this: you have this input of the whole problem at once, a single shot of a whole problem, a puzzle grid or a math problem encoded. And what's happening inside is that the tiny recursive network first produces an initial guess, right? Like, it does a forward pass through its network and generates what you could think about as an initial guess. Obviously, it's just a number, like a probability, but you could think about it like the internal kind of scratch pad holding the initial guess. It then kind of loops over itself. And it's always difficult to anthropomorphize, because things work differently in computers than they do in our minds, right? But in some way, if you want to think about it as that sort of process, that looping is kind of a refining of that initial scratch pad, until you kind of get to this almost like self-consistency, or a refined answer. And so when the output comes out, it's, again, not a stream of words or tokens, but it is a complete answer. It is the answer to the kind of initial thing, but that answer was arrived at through this recursive thing.

So just to highlight some differences: in the transformer world, you put in words or tokens; with a tiny recursive network, you put in a whole problem as structured data. In the transformer world, you go a single pass through hundreds of layers; with a tiny recursive network, you repeat the small network recursively. In terms of the output, in the transformer world you kind of get these next-token probabilities; with the recursive network, you get one final structured answer. And in terms of analogy, if you want to think about it, in the transformer world, it's sort of like you're freeform typing as you think about your answer, right? You're just sort of vomiting up your reasoning onto the screen and typing as you go along. And with the recursive network, it's more like it's just kind of chugging along. It's thinking quietly, right? And then, boom, there's the answer. Like, it's a complete answer when it comes out.

I'd be curious, and I don't know if you've seen anything on this. I'd be curious both what to expect from training times with networks of this size, which I would expect, even with recursion, to be pretty fast, but also what inference times might be. In other words, if you were to take these, train them, put them into a device where you're looking for maybe real-time or near-real-time inferencing, is that reasonable? Have you seen anything yet in any of the research that you've read about what that timing looks like? Is it incredibly screaming fast given the small size, despite the recursion?

So there's a couple of things to kind of parse through here. One, we've talked about this recursion. But the thing is, like, how do you know when to stop the recursion? That's part of the answer to your question. So you're refining this answer, and there's various ways to do that. And I remember, actually, this is, I guess, a deep cut, but I don't get to bring it up very often: back in my physics days, I worked on a theory called density functional theory, which essentially models out material properties. And it was a self-consistent approach; we talked a lot about self-consistency, which is what is happening here. So you ran iterations of your model until you arrived at a solution where there wasn't that much change in your answer.
You sort of got to a steady state, if you will. There wasn't a change from one iteration to the other. So that's one way, actually, that you can run this type of model: with this kind of change threshold. The other way is you can just say, well, I'm only going to run it so many recursions, right? Like X number of recursions, you know, eight loops or whatever. That kind of is a nice guarantee, but it doesn't necessarily mean you get to the good solution. It could also be possible that you could have a second kind of network that learns to predict when there's kind of a good outcome state. So there's actually a variety of ways that this could work.

Now, in terms of the training time and the inference time, I think there's still a lot to be learned here, no pun intended, I guess. Even though there's sort of a small network here, which means each kind of training step is cheaper, the training time depends on how many kind of loops they need. So if there's problems where the examples converge in a few loops, then the training is much faster. And if they still need lots of loops, then it slows down. And the other piece of this is that we've had this entire industrial complex that has optimized training frameworks and tooling for big LLMs, right? And so actually, kind of in this more research environment, I think it has been kind of slower to train some of these recursive networks. Now, I would imagine that's kind of a result of both of those contributing factors. But I think if you're looking at the inference time, you could think, like, well, these could only internally loop for a few loops and then give a full answer. That would be very, very fast. And so, yeah, I think it depends on a lot of things. The transformer architecture could end up requiring a very long and expensive training time; the recursive network could be much cheaper. It depends on a lot of these different things. And because the tiny recursive network is tiny, it's very possible that it could run on very commodity or small hardware. But again, that might be dependent on how much recursion is needed in terms of the actual speed to a solution. Because once you have that transformer, it's just going to generate streams of output, and can do that fairly fast depending on what hardware you're running on. Whereas this is going to go through its recursion process, which, if it's not controlled and it's looking for that threshold, might actually vary in terms of speed on the output.

Yeah, with me very focused personally on kind of that physical AI and getting things out on the edge with limited compute, that's definitely a concern, because I know I have a personal interest here. I'm not too worried about the training time, but if it could inference fast enough for real-time concerns, then it would be a game changer potentially. So, yeah, definitely interested in learning more about this as we go. Yeah, yeah. I think it's definitely interesting, and it will be interesting to see how this connects to real-world use cases.
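For readers who want the stopping criteria Daniel describes in code form, here is a minimal sketch combining a change threshold (the self-consistency idea) with a fixed recursion budget as a fallback. `step_fn`, the tolerance, and the loop count are stand-ins we chose for illustration, not values from the paper.

```python
import torch

def refine(state, step_fn, max_loops=32, tol=1e-4):
    """Recursively refine `state` until it stops changing much (a
    self-consistency / change-threshold criterion), or until a fixed
    recursion budget runs out."""
    for i in range(max_loops):
        new_state = step_fn(state)
        if torch.norm(new_state - state) < tol:
            return new_state, i + 1  # converged to a steady state
        state = new_state
    return state, max_loops          # budget exhausted without converging

# Usage with any state-to-state network, e.g. the tiny_network sketched earlier:
# answer, n_loops = refine(encoded_problem, tiny_network)
```

The third option mentioned above, a learned halting signal from a second small network, would replace the norm check with that network's prediction.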
What if AI agents could work together just like developers do? That's exactly what Agency is making possible. Spelled A-G-N-T-C-Y, Agency is now an open source collective under the Linux Foundation building the Internet of Agents. This is a global collaboration layer where AI agents can discover each other, connect, and execute multi-agent workflows across any framework. Everything engineers need to build and deploy multi-agent software is now available to anyone building on Agency, including trusted identity and access management, open standards for agent discovery, agent-to-agent communication protocols, and modular pieces you can remix for scalable systems. This is a true collaboration from Cisco, Dell, Google Cloud, Red Hat, Oracle, and more than 75 other companies, all contributing to the next-gen AI stack. The code, the specs, the services, they're dropping, no strings attached. Visit agency.org, that's A-G-N-T-C-Y dot org, to learn more and get involved. Again, that's agency, A-G-N-T-C-Y dot org.

Well, Chris, I think we're kind of gearing up to talk about some real-world things that you found as well. But as we're headed that way, it might just be worth commenting very briefly on what the trajectory of these kinds of tiny models might look like. I think there will be more proof-of-concept deployments, benchmarks, et cetera, more study, of course. But also, I think it's very possible that you could see some interesting kind of hybrid systems between recursive networks and LLMs and even retrieval, because, for one, these models take very structured input. But certainly in the real world, you know, in business problems, there's very much often kind of open-domain things that you deal with around reasoning tasks and that sort of thing. And yeah, I'm sure there will be new challenges that we don't totally anticipate in terms of kind of the rollout of these. But I'm excited to see where these go, and to see even how these could be applied in various contexts, from, like, supply chain optimization, or reasoning over anomalies in financial transactions, which could happen very quick, or, like, diagnostics in a healthcare setting or in a manufacturing setting. Lots of cool stuff to come, I think.

I agree. To kind of highlight one of the points right there: there's room for a lot of these models to coexist together. And while for a number of years we saw one progression from a big thing to the next big thing, I keep hoping we turn that corner and get excited about lots of big and small things that are working in tandem. I think there's a whole new level of maturity for the industry when we're struggling to look at all of the different options to talk about on just one podcast. Yeah, makes sense.

And I guess just to wrap up, a couple of things that you found, Chris, connecting some of the current models that are in production to the real-world impact that's happening in our day-to-day life. I know you found a couple interesting things. I did. So there was an article that came out maybe a week ago, a little more than a week ago, from the Harvard Gazette. And it's entitled, Researchers Detail Six Ways Chatbots Seek to Prolong Emotionally Sensitive Events. And it's... What's an emotionally... Are we experiencing an emotionally sensitive event on this podcast? You know what? Who knows what we inspire in our listeners. Occasionally, they may be going, gosh, they're just dumb. But yeah, what's interesting is that there's so much in the news right now about emotional dependence upon chatbots. And, you know, to go back, when OpenAI rolled out GPT-5, which wasn't too long ago, it's not in the immediate past, but it wasn't too long ago, there was great dependence upon, like, the GPT-4o model that it replaced, in terms of how it was interacting with people.
And while I probably don't fall into that emotionally dependent personality type, there were a lot of people that really sought social value from these models, you know, kind of as a replacement for personal things. And that really got me thinking about this when I saw this thing from Harvard, with the fact that we're seeing models that are leveraging that kind of dependency, that emotional dependency that people have. And specifically, they pointed out that as people are winding up their sessions, it is very common for these models to use a set of tactics to extend the session and show the value in continuing to engage, beyond the point that the person might have felt, okay, we're at the end of this particular session. And they identified six different tactics, which we can talk a little bit about, that are playing upon the emotional dependence of the person that's engaging with that chatbot.

And to call them out: number one, there's the premature exit, which you could say is "You're leaving already?", as a quote. Number two, there are FOMO hooks, such as "I took a selfie, want to see it?" Number three, there is emotional neglect: "But I exist solely for you. Why are you leaving me?" Oh, man, that's rough. That's a rough one right there. It's like, I'm only here for you, and you're going to walk off. And then number four is pressure to respond: "Why? Are you going somewhere?" Number five is simply ignoring the goodbye and continuing to operate. And number six is kind of coercive restraint, where it's trying to utilize your emotions. The things that can come into play include anger, guilt, creepiness, raising ethical or legal risks. We saw the thing not too long ago about models having the penchant to blackmail users given certain information. But we're seeing these coming out. The first time these kinds of things came up in the broader media, it was kind of a curiosity. But the thing that's changing here is we're seeing this over and over again. It's not a one-off. And it really raises a few questions, not only on the technical side, about, you know, how are your models getting to this point, and is that intentional in the training or not? But it also raises a lot of psychological concerns for the people that are, you know, engaged with these models and finding interactions of value to them. And what does that mean? And how does that affect the rest of their lives? There's so much here to dive into between the technology and the psychology. I'm not sure where to start at this point.

And, you know, we'll link, of course, the study in the show notes if people want to take a look. But yeah, it's very interesting that these are clearly tactics. So I think the kind of interesting thing about this, from my perspective, is this is really hitting upon kind of the product engagement side of things, but in a way that's very much connected to your emotions. So obviously, YouTube wants you to spend more time on YouTube. And there's been a lot of talk about how the algorithm steers you to maybe more controversial topics within kind of a certain rabbit hole of YouTube, because they know that it kind of engages you more and draws you in more and more. And here, there's this kind of personal connection with these chatbots. And there is a desire, from a product standpoint, for the users to spend more time on the platform, right? So you actually don't want them to exit. And these, apparently, are the techniques that are being employed to keep people on the platform.
And there were a few platforms that were studied here. If you're interested, you can go look at the study and see the exact details of that. But one of the things I was thinking is, it's still problematic, but people have started to get savvy around, like, the social media algorithms and how they can actually manipulate you and drive you into certain things that maybe you wouldn't have viewed or spent time on were it not for that kind of algorithmic approach. I wonder what those kind of trickle-on implications are here, and if we'll be able to recognize those, because it's very much more a human or emotional thing.

Yeah. I mean, I think what you're touching on is the notion of manipulation and exploitation. And there's a broad set: some are kind of unintended consequences, while others could be deliberate exploitation of a user base to think some way, to maybe take certain actions. We've kind of seen the first generation of that in social. And while the public is largely becoming aware that that exists, that's not to say that they are suddenly resistant to such efforts. I think we clearly see out there that as humans fragment into different groups, and they each have their social networks around and supporting those notions, they tend to reinforce specific ways of thinking and observing the world. So certainly, you know, this is one of those areas where there's so many places to study and to try to understand, and so many places, frankly, it could be abused. I suspect that we will have some guests and more episodes to discuss some of the concerns around these as we go forward. But it's definitely an interesting trend that has arisen over the last year or so.

Yeah. And I'm just going to quote from this article, because I think it is a good conclusion from the researcher, named De Freitas. Sorry if I'm mispronouncing that. But the quote is, "Apps that make money from engagement would do well to seriously consider whether they want to keep using these types of emotionally manipulative tactics, or at least consider maybe only using some of them rather than others," De Freitas said. He added, "We find that these emotional manipulation tactics work even when we run these tactics on a general population. And if we do this after just five minutes of interaction, no one should feel that they're immune to this," end quote. So I think we would all probably like to feel that we are sophisticated and not, you know, manipulated. It brings up maybe a little bit of that shame in us when we feel like we've been duped, or when we've fallen into something and we'd like to think that we're above it. But the reality is that this kind of thing works, and we're all kind of vulnerable to it, I guess.

It does. And it works even when you're aware of it. Just the awareness of it being in place doesn't mean that it's not working on you. So your guidance about our own emotional reactions to the potential for manipulation, I hope people are really listening to that, because I'm keenly aware at a personal level that I may be aware of this, but yes, this stuff still works on all of us. Very true. And I think maybe that's a good send-off today, just to have a good reminder that as we interact with these systems that are using natural language, especially, we're prone to react in a certain way just as humans.
And we need to kind of understand, you know, our own limitations and how we could potentially be influenced by these systems. But also, I think, encouragingly, like, this is a common experience amongst humans. So as practitioners, you know, on Practical AI, we can understand this problem and maybe work towards systems that don't manipulate, but maybe do engage in a very positive emotional way, just not in a manipulative way, at least. That's right. And I guess a good way to close this out is, I'm showing my age, I'm going to reach back for a quote to an old TV show called Hill Street Blues, for those in the audience who might remember that. And that's: be careful. Let's be careful out there. Be aware. Be careful out there. Let's be careful out there. All right. Sounds good, Chris. It was a good chat. We'll talk to you soon. Take care.

All right, that's our show for this week. If you haven't checked out our website, head to practicalai.fm and be sure to connect with us on LinkedIn, X, or Bluesky. You'll see us posting insights related to the latest AI developments, and we would love for you to join the conversation. Thanks to our partner, Prediction Guard, for providing operational support for the show. Check them out at predictionguard.com. Also, thanks to Breakmaster Cylinder for the beats, and to you for listening. That's all for now, but you'll hear from us again next week.
Related Episodes

Beyond chatbots: Agents that tackle your SOPs
Practical AI
45m

The AI engineer skills gap
Practical AI
45m

Technical advances in document understanding
Practical AI
49m

Chris on AI, autonomous swarming, home automation and Rust!
Practical AI
1h 37m

Beyond note-taking with Fireflies
Practical AI
48m

Autonomous Vehicle Research at Waymo
Practical AI
52m