

ChatGPT is Dying? OpenAI Code Red, DeepSeek V3.2 Threat & Why Meta Fires Non-AI Workers | EP99.27
This Day in AI
What You'll Learn
- OpenAI has lost 6% market share this year, and is facing a 'code red' situation
- OpenAI's attempts to monetize ChatGPT through ads have been unsuccessful, as users are more likely to switch to free, ad-free alternatives
- OpenAI's models are no longer seen as the best, with competitors like Anthropic and Meta offering strong alternatives
- OpenAI has failed to turn ChatGPT into a deeper, more integrated personal assistant, instead remaining a 'one-off' chat experience
- OpenAI's financial situation is precarious, as it needs to make deals for hardware while its consumer-focused business model is unstable
Episode Chapters
Introduction
The hosts discuss the challenges facing OpenAI and the ChatGPT platform, including market share losses and issues with monetization.
Dilution of OpenAI's Brand
The podcast argues that OpenAI has diluted its brand by pursuing too many different directions instead of focusing on its core strengths.
Competitive Landscape
The hosts analyze the rise of competitors like Anthropic and Meta, and how their models are challenging OpenAI's dominance.
ChatGPT as a Consumer Product
The podcast discusses the limitations of ChatGPT as a consumer-focused product, and how its lack of deeper integration may make it vulnerable to free, easily accessible alternatives.
OpenAI's Financial Challenges
The hosts examine the financial pressures facing OpenAI, including the need to make deals for hardware while its consumer-focused business model is unstable.
AI Summary
The podcast discusses the challenges facing OpenAI, the company behind ChatGPT. It suggests that OpenAI has lost market share to competitors like DeepSeek and Google's Gemini, and that the company's focus on monetization through ads has backfired. The hosts argue that OpenAI's models are no longer the best, and that the company has diluted its brand by pursuing too many different directions instead of focusing on its core strengths.
Key Points
1. OpenAI has lost 6% market share this year, and is facing a 'code red' situation
2. OpenAI's attempts to monetize ChatGPT through ads have been unsuccessful, as users are more likely to switch to free, ad-free alternatives
3. OpenAI's models are no longer seen as the best, with competitors like Anthropic and Meta offering strong alternatives
4. OpenAI has failed to turn ChatGPT into a deeper, more integrated personal assistant, instead remaining a 'one-off' chat experience
5. OpenAI's financial situation is precarious, as it needs to make deals for hardware while its consumer-focused business model is unstable
Topics Discussed
ChatGPT, OpenAI, Generative AI models, AI market competition, AI monetization strategies
Frequently Asked Questions
What is "ChatGPT is Dying? OpenAI Code Red, DeepSeek V3.2 Threat & Why Meta Fires Non-AI Workers | EP99.27" about?
The podcast discusses the challenges facing OpenAI, the company behind ChatGPT. It suggests that OpenAI has lost market share to competitors like DeepSeek and Google's Gemini, and that the company's focus on monetization through ads has backfired. The hosts argue that OpenAI's models are no longer the best, and that the company has diluted its brand by pursuing too many different directions instead of focusing on its core strengths.
What topics are discussed in this episode?
This episode covers the following topics: ChatGPT, OpenAI, Generative AI models, AI market competition, AI monetization strategies.
What is key insight #1 from this episode?
OpenAI has lost 6% market share this year, and is facing a 'code red' situation
What is key insight #2 from this episode?
OpenAI's attempts to monetize ChatGPT through ads have been unsuccessful, as users are more likely to switch to free, ad-free alternatives
What is key insight #3 from this episode?
OpenAI's models are no longer seen as the best, with competitors like Anthropic and Meta offering strong alternatives
What is key insight #4 from this episode?
OpenAI has failed to turn ChatGPT into a deeper, more integrated personal assistant, instead remaining a 'one-off' chat experience
Who should listen to this episode?
This episode is recommended for anyone interested in ChatGPT, OpenAI, Generative AI models, and those who want to stay updated on the latest developments in AI and technology.
Episode Description
Join Simtheory: https://simtheory.ai/

OpenAI has declared "Code Red" as ChatGPT faces growing competition from Gemini and other rivals. In this episode, we break down OpenAI's 6% market share decline, why their ad strategy is on hold, and what they need to do to reclaim the AI crown. We also explore DeepSeek V3.2's impressive capabilities as a cheap open-source alternative, Meta's new policy grading employees on AI skills, and the crisis facing higher education as AI fluency becomes essential. Plus, Fatal Patricia hits #1 on our Spotify charts, and Tesla's Optimus robot is running like a slightly unfit human.

CHAPTERS:
00:00 Intro - OpenAI Code Red & Market Share Crisis
07:03 ChatGPT's Failure to Go Deeper Into Users' Lives
16:33 What OpenAI Needs to Win Back the Crown
26:46 Chris's Wishlist for an OpenAI Comeback
31:22 DeepSeek V3.2 - The Open Source Threat
39:34 Meta Grading Workers on AI Skills
46:29 The University & Education AI Crisis
56:25 Fatal Patricia Hits #1 & WTF of the Week

Thanks for listening. Like & Sub. xoxox
Full Transcript
So, Chris, this week, it is code red. Code red. It's all falling apart at OpenAI. According to The Information, "OpenAI CEO declares code red to combat threats to ChatGPT, delays ad effort." So they can't roll out the ads anymore. According to sources, they've lost, per SimilarWeb, 6% of market share this year. And, you know, it's kind of not looking good. I think Gemini 3 has been a bigger hit than they thought. And, you know, I would argue people are just in love with Nano Banana, is the truth, and want free memes. Yeah, I think that's the thing. Like, people really want to use whatever the best is at the time, right? Like, if I'm going to, like, try to impress my friends, or a company presentation, or whatever it is, it's like, I want to use the best image model to do that. And right now, would you recommend anything other than Nano Banana? Like, if you're trying to show off what the capabilities are, that's what you'd use. Same with music. You're going to use the best music model or the best text-to-speech model. You're not going to go, well, I'm brand loyal to OpenAI. I only have ChatGPT. That's what I'll use. And like, I've seen people, like, we'll use an example later, but I was at my kid's school this week for their annual presentation. The principal, the big principal of the whole school, used Suno to make a song in Chinese about himself in the Chinese class, and put it up on the thing with videos that he made with Veo 3, right? Like, this is someone you would consider to be this sort of mainstream consumer who's just going to ChatGPT because that's what they know. But no, he's used the best of what's available. He's been able to navigate it and find it because he really cared about showcasing the technology. Now, the thing about OpenAI is, really, what their motivation is, is how do we cash out our money from this thing? Like, we've made this massive brand. We want to now foist it on the public, get our shares sold at a good price.
And the only way to do that is be perceived as the best. And they've just totally diluted that by having all these feints in different directions and never following through. I think this is the problem, isn't it? There's just been so many different directions, and it's sort of like you forget what got you there. It's sort of like a band, right? And they've got a certain style of music that all the fans love, and the fans are like, this is that band's genre, right? And then all of a sudden they do country music and you're like... and no one likes that album. Beyonce, yes. Um, and so I think this is the problem with OpenAI right now, is the models aren't the best. And I know there's going to be the fanboys in the chat that'll be like, GPT 5.2 Pro-716 is great. That's not a real model. It's too expensive and it's not accessible. It's not a real model. If it's the best, deploy it to everyone. So I just, they don't have the best models. They don't have any advantage when it comes to, like, silicon or chips or hosting, from what I understand. So really what they had is: we are the best at developing AI models, and maybe image models, and, you know, maybe Sora as well, with a video model, although I would argue Veo 3.1 is still far better. So I just don't... I think their moat was that allure of having the best models and the hype around it. And if you rewind to this time last year, we were so hyped around o3 and all the leaks around it, and it just smashing benchmarks. I recall when we started using it, we were completely obsessed with it. Like, it was a really good model and a huge leap. But this year, I don't know, it's crickets. There's rumors that there's four new models going to drop before the end of the year, but I don't know. I would put it down to these are panic tunes of the model, or benchmark-maxing models. Yeah. And look, I might be naive, but we talk about the daily active users and their brand recognition as their main advantage, right?
Because, like, you can't argue only losing 6%. I mean, that's nothing. Like, they've still got, what did you say, 800 million daily active users? I think the other thing is you've got to look as well. Like, the pie is growing, right? So, like, if I bring it up on the screen now with SimilarWeb, it's like generative AI tools and the market's growing. So it's getting bigger. And yes, their share is technically less, but the market is bigger. Yeah. No, I get that. But in a way, like what worries me or what would worry me if I was in that business? I'm sure I wouldn't be worried at all. I'd just be on my yacht or whatever. But like, let's say I was in there and I still cared for some reason. I'd be worried about the fashion nature of it. Like it's fashion, right? To use ChatGPT. It's a fashionable thing to do and like use it in my work, use it in my life, whatever. The problem is that that kind of cheap AI where you're not really invested in it, like you're not building detailed agents, you're not, you know, working with AI systems, you're simply just using like a chat prompt. It's so easily replicated for free in other products. Like, as you mentioned earlier, like the Gemini built into Google is pretty good. Like it actually does the job pretty well. Like a lot of these basic AI assistants you can get anywhere are pretty decent. And I just can't help but feel like the casual user in the long run is just not going to bother with ChatGPT and will just use whatever is easily available on their phone or their desktop or whatever it is, right? And to me, the real users that matter are things like massive organizations, governments, like big industrial corporations and things like that who need controls, who need reliability, who need trust, who need security, this kind of thing. and those big contracts are going to go to someone. 
And I just don't know if OpenAI is going to be that company now, just because of the way they've diluted and damaged their reputation by going in all of these different directions. Not to mention, you've got to remember, if you're a regular OpenAI API subscriber, there is a single checkbox in the UI which says, you can train on my data, and you get a million free tokens a month or something like that, right? If you check that box even once, if one person on your dev team, and that's anyone who's an admin in your OpenAI account, checks that box even once, they're going to train on all of your data, right? Like, that's a deadly serious security risk. And who would mess around with a company who makes it that fickle to suddenly train on your data? Like, I don't know about you, but it seems to me when I talk to people in serious decision-making positions, one of their biggest fears is their data being trained on like that. Like, that is the big issue. And these guys haven't even really got an answer for that. Yeah. And I think you raised two interesting points. One is that ChatGPT, it hasn't... they haven't, as a product, figured out a way to go deeper into your life than what I think was promised. Like, it's not really a great personal assistant. It's a great chat model and a great chat experience. It's very fast. It's accessible and available everywhere. I think they still have, arguably, you know, the top, maybe second now, image recognition model. So I think as a platform, it's great. There's some features in there that, you know, were very much hyped, like deep research and a few of their sort of agentic things, like the computer use stuff, the browser, all that kind of stuff. But nothing's actually stuck at all. Like, there's never been a moment where they've gotten deeper into your life.
It's sort of like every time you go to ChatGPT, it's like a Tinder date. You go there for kind of one thing, right? You like my analogy? Um, and then... Before my time. Before my time as well, but I'm just, I'm still using it. Just for research. Yeah. Um, so you go there for one thing and you don't stick around. This is not a relationship tool. Um, and so I think that's the problem, right? Is, like, they never got deeper with the consumer. So if DeepSeek comes out with a model, we'll get to later, and Anthropic has a model, or Gemini has a model, and people hear about it, and that's just as accessible and the cost is free, I think the consumer is lost. It's almost like the newspaper business model in a way. It's like, free is a great price, and the models are just getting better. So that consumer side of the business, it's always going to be tough and eroded. The only way to do it is to have all the daily active users and sell ads. But if you deploy ads too early, like they're finding out, or maybe about to find out, everyone's just going to go to where there's no ads. Yeah. Nothing is going to piss off the, hang on, the average AI user more than injecting ads into the responses from your AI models. Like, can you imagine how quickly you would switch away from that model if that started to happen? Like, it would kill you. The other thing I think is worth noting, when you look at the financial side of OpenAI, which YouTube is all abuzz about, everyone's talking about, oh, they've agreed to trillions in contracts, and then how are they going to pay for all that? Now, I think the problem is they're entering into these big boy contracts because they need the hardware, right? A point you made earlier, where Google has their own TPUs. It's their hardware. They're making it. So that's a bit of a different proposition. Amazon is the sort of king of that sort of stuff, so they don't have to worry about it. Microsoft has Azure, so they don't have to worry about it. But OpenAI does.
They owe money to people, and they have to do these agreements to get good prices. Now, the problem is, on the other side, if all your customers are consumers, you don't have solid contracts with these people. They can quit anytime they want. Whereas if you're doing deals with Coke or Pepsi or... I love how your enterprise examples are always Coke and Pepsi. Well, didn't they use that in the early days? All right, fine. The government of Uruguay. Like Manchester United, the soccer team, right? You're doing contracts with companies like that. You can do two, three-year contracts where you might not get the cash up front, but you've got that guaranteed revenue to match against what you owe these other people. Like, that's how real businesses operate. They don't screw around with, like, okay, now we're announcing this feature for free to burn a bunch of capital. But I think the anticipation from OpenAI was, like, we're the next Google, so let's try all these different projects and not worry about the core experience. But the core experience was never delivered upon or figured out. Like, I think this is the problem, right? And a lot of the statistics cite the sharp decline of ChatGPT usage on weekends to show it's not really a part of your personal life as much as it is professional, where people are just going to ChatGPT, throwing, like, I guess, company data or whatever into it, and getting it to do stuff. Or if you're a student, getting it to write emails or essays, or maybe a bit of chat or learning. You know, there's a ton of use cases, obviously. But I think that's the thing is, to be honest, I think this stuff's way overblown. Like, they have the lion's share of the market still. They're only, like, one good image model and chat model away from being back in the zeitgeist and everyone hyping them up again and the market roaring.
Um, and that, I mean, yeah, like, just one thing I'll go off, though. Like, I remember, like, in our sort of, uh, regular live company, right, I used to look at the OpenAI bill, and it was escalating and escalating. I'm like, this is going to be another big thing, because we're always thinking about costs, right? Like, what is the cost of this stuff versus the benefit we're getting from our customers? And it was growing and growing, and then suddenly it stopped and reduced, and now it's not even really a line item worth worrying about, right? Like, the use of OpenAI is just not a significant line item in a company where it kind of should be and was, you know. Like, I just, I just wonder how that is going to affect them in the long run when there's so many legitimate alternatives. And when you've got companies in markets where they're really looking to cost-cut and get models like DeepSeek V3.2 that they can host themselves and have a static known cost, right? And just do a little bit of extra work to optimize for that model. How does OpenAI justify a company continuing to spend $10,000 a month, $20,000 a month with them? But I think this is the whole feeling of the entire market, whether it's enterprise or it's consumer. Like, if you think about the enterprise, well, yeah, cost is a huge constraint. Having, like, basically zero inference cost, or very low, where you're just paying for the bare metal in your own cloud or your own environment, that would be important. And then the other challenge is, like you said earlier with the principal and the tools: like, people just pick the best tool for the job.
Ultimately, I think a lot of people would put everyone in this camp of, like, oh, you know, the plebs just go to ChatGPT. I think most people that have an interest in AI, or at least hear about better models and things that it can do, will just go to the right tool for the job naturally. Yeah. And I don't think, when it comes to AI tool usage, there are plebs, as we refer to them. Is that an Australian thing? I don't know. But, like, you know, we're talking about regular Joes, right? I just wonder if a lot of the people we refer to as plebs or regular Joes are just children. They're just children using it, and they're just using it because it's available. Because I think that anyone who is using AI seriously in their business is either trying to do it because they want to be better at their job and do a good job, or they're trying to champion it in their organization. And people who are trying to do both those things need to use the best model, because they don't want to be like the lawyer who gets done for citing fake references, or the person who does a PowerPoint presentation that has some sort of crazy nonsense in it and they get found out. They want to use a model that makes it seem like it's them. And therefore, they're always going to gravitate to whatever the best model is, and ideally, not some mainstream model like ChatGPT. This is where I think, right, they've gone wrong: by just not focusing on that core of having the best image model, the best model for inference. They sort of controlled that narrative of, like, OpenAI is just the best at this. Don't even bother trying, guys. I mean, Sam Altman said it originally in India, remember? Like, don't even bother. You'll never catch us or beat us. And it turns out pretty much everyone has. And I think the challenge now is they've got to get back to their roots.
So to me, this code red, if it's going to be about anything, should be about building models that people actually are climbing over the walls to use because they're notably better. And I think that's the problem: a lot of these companies now, for the sort of general day-to-day usage of models, they've all become very similar. And so you've got two parts. It's like, do we fight in the consumer space? Do we figure out ads? And do we have the best models and the best personalization and get deeper with the consumer? And is that the direction we're going? Or are we going to turn to, like, Anthropic's model, which is to say, okay, all the token money is in coding use cases right now and generating code, so let's build the best coding agents and go after some sort of enterprise platform. Which I think is obviously Anthropic's strategy, but the numbers aren't great for them. Like, in the enterprise, they've gone from 48% market share 18 months ago to 24%. And the other thing is, Gemini monthly active users are growing pretty rapidly, but I think they do sort of cheat, because they just put it everywhere. Like, I don't know if they're including the AI search. Like, if you include every Google search you ever do, it would already be at billions, I think, if they did that. So I'm not sure how they calculate that. But I would say Google's strategy now is to sort of have the best or better models than them over time and then just bleed them dry. Because if they can't charge a monthly subscription to a consumer, and if they add ads, people flock to Gemini. Where do you go? Like, honestly, where do you go? Like, if you were them right now in this Code Red meeting, even though they still control, you know, a vast majority of the market, what are you doing? It's a tricky one. I don't really know. But, like, all I know is right now, I just look at it in my mind. It's like, I use GPT 5.1 occasionally if I'm desperate, if I want another opinion, something like that.
I'm never thinking, like, if OpenAI cut us off today hard, I probably wouldn't even fight to get it back, honestly. Remember early on when they did cut people off and they were so strict on limits? I remember our first few podcasts on this channel, which was quite a while ago now. My fear, the thing that we talked about a lot was, what if they just take this away? What if they just make it for the elites and they take it away? It was a genuine fear. They were so much better than everyone else. That was a real fear. Now, I don't care. You could take away most of the major players and there would still be legitimate alternatives like Grok, like DeepSeek, like Kimi K2. There's really good alternatives out there. You don't need them. I like the frontier models. I really do. But it's not a thing anymore. It's just not as big of a deal. And in fact, some of the lower-tier models actually have advantages depending on what your use case is, because you can control them. You can run them in your private cloud. You can run them without a marginal cost on tokens, that kind of thing. Like, it really is a different game now. And I just think that for them to actually wield that sword, they need to come out with something that's just so much better that all this other stuff goes away. I still push back on that because, like, the reality is we have access to every single model and pretty much every tool available, right, in Sim Theory. Yeah. What do you use? Right now, what do you use? I use Opus 4.5 and Gemini 3.0. What do you use for images? Nano Banana. Yeah, like, this is what I mean. Like, people just want to use frontier and the best. Oh, sorry. I was like, it's like a cross-examination. You got the truth out of me immediately. I'm hopeless. No, but I guess what I'm saying is, like, we keep banging on, like, people, like, and I agree with you. If they took all those other things away today, sure, my fallback right now would be those other models, right?
But I still don't... like, I just think people want the best tools. And this is why the enterprise strategy is a bit strange to me, from Anthropic and potentially OpenAI as well. Because you're essentially going to an enterprise and saying, hey, commit to our model, our one specific model that constantly, you know, is in this, like, pull, like, sometimes it's the best, but then you'll not have the best model for, like, maybe a year, and then it'll be good again. And they're asking for these big commitments. And then, as you said, there's all these, like, oops, I checked this checkbox, now they're training on my data. To me, the smartest companies... like, this assumes the enterprise is dumb as well. And the enterprise is going, well, hang on, we can just access all these tools through APIs. So why would we cut a deal? We'll just go to the cloud provider. So then you think, okay, well, OpenAI is screwed again, because they're just going to go to, like, Azure or Amazon. And I guess the only defensive move then is, like, pull your models off those clouds. But since there's so many good models now, it doesn't really matter. So you just lose market share. I would argue people would struggle to tell. I think people who are using this stuff every day would know. But, like, if I was using Azure as the back end and I transparently switched your OpenAI model for an Anthropic one, would you know? I would know in two seconds, yeah. I still disagree. I think people do know. Like, a good real-life example is my father-in-law is in his, like, mid-70s. He talks to me when he comes over about the different AI models. Like, he has taken a strong interest in them. He runs, like, a sort of charitable organization now in his retirement, and he uses the models to, which is kind of bad, but, replace employees or contractors. He used to have to do stuff like write things, create images, create video content, like all that sort of stuff, right? And he knows his model tunes, and he knows when to switch models.
And he also recognized when Gemini 2.5 Pro got dumb. Like, he said to me, like, you know, it feels like it's gotten dumber. And so this idea that people... I think people's brains just naturally wire into the best tool for the job, if they're available, obviously. Yeah, I tend to agree. I think it's gone beyond being a tech thing. Like, I always used to be proud that, you know, I'm a tech person, so I understand this stuff better, like, in terms of understanding the model's performance. But I actually don't think that's true at all. I actually think that most people who use it seriously for something they understand do have a feel for the models that's just as good as mine or anyone's. Like, I think that you know your own work, and you know when it's doing a good job and when it does a different job, and that kind of thing. It's a universal thing. You're dealing with a form of intelligence. It's not like dealing with just a tech person where you need to understand how it works to be able to use it properly. And I think, but I think this is where it's all going, right? Is, like, the models are just, sort of, like having a chip on a computer, in a way, or several chips on a computer. It's probably not a great analogy. Or maybe like different apps on your computer, and you just know which app to use for what purpose, and which app yields the better result. And so everyone can use a computer, and everyone can use apps. So I kind of think it's similar. Whether or not long-term people want a model switcher or not, I'm not sure. Like, I, during the week, experimented around. I had a pretty good idea for a model router, finally, in Sim Theory, that I've been testing locally, that works quite fast. Like, I've got the latency down where you don't even notice. And it works pretty good, like, to the point it's picking from my current choice, like, Mike's picks, of which model to use for what. So it's still somewhat manual.
Like, there's a classifier on top, and then the routing is just, like, controlled by myself. But I used it for a while. I was like, this is good. But then I was like, no, I just want the manual control still. As soon as you need to do something real, right? Yeah. Like, I need to get results in. Yeah. Like, I need to actually know I'm definitely using this, or I want a second opinion, or... as soon as that, it unravels very quickly. So I'm not sure. I'm not sure. I'd love to actually experiment with it and put an automatic mode in, right? And just see how many people use the automatic mode over switching. And I would argue, or I would predict, that most people will still end up picking their model, not using the auto pick. Maybe I'm wrong. If I'm wrong, drop it in the comments below. Let's try it and find out. So, yeah, I think that the last few weeks of the year are going to be really interesting. Does OpenAI just blow us out of the water, and they've got some rabbit in their hat? But generally, when they have a rabbit in their hat, there's so much hype and it's been planned for ages. But it's almost like they had nothing. Like, they just assumed Gemini 3 would be bad, which I don't think was a good assumption. And to think they can now stay ahead with models is probably not true, because I would argue Google and Anthropic are just going to keep... like, it's not like they're going to stop now and give OpenAI time to catch up. Yeah, I think they need to do more than just announce a new model. It has to actually be better. I think that's the key. It can't be just, like, benchmarks and, you know, a nice event where they've got, like, a Christmas tree in the background and they bring out their nerds and talk. It's got to be a model that we all universally say, okay, this one is better. And it's obvious that if they did that now, it would be enough to retake the crown, honestly.
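The "classifier on top, routing controlled by myself" idea described above can be sketched roughly as follows. Everything here is a hypothetical illustration, not Sim Theory's actual implementation: the model names, the routing table, and the keyword classifier are all assumptions, and a real router would likely use a small, fast LLM as the classifier rather than keyword matching.

```python
# Hypothetical sketch of a model router: a cheap classifier picks a task
# label, a hand-curated table ("Mike's picks") maps labels to models, and
# the user can always override the router for manual control.
from typing import Optional

ROUTING_TABLE = {
    # task label -> hand-picked model (names are illustrative)
    "coding": "claude-opus",
    "images": "nano-banana",
    "general": "gemini-3",
}

def classify(prompt: str) -> str:
    """Toy classifier: keyword heuristics standing in for a fast LLM call."""
    text = prompt.lower()
    if any(k in text for k in ("bug", "refactor", "function", "code")):
        return "coding"
    if any(k in text for k in ("image", "logo", "picture", "draw")):
        return "images"
    return "general"

def route(prompt: str, manual_override: Optional[str] = None) -> str:
    """Pick a model for a prompt; manual control always wins if given."""
    if manual_override:
        return manual_override
    return ROUTING_TABLE[classify(prompt)]
```

The manual-override parameter reflects the tension discussed in the episode: an automatic mode is convenient, but users doing "something real" tend to force a specific model.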
It isn't like Google has that much of a lead, or Anthropic has that much of a better model with Opus, that they couldn't come out with something that does that. Yeah, interestingly too, if you look at OpenRouter, this is the market share, and it is OpenRouter data, right? So there's specific use cases for OpenRouter that people use it for, so it could be slightly misleading. But if you look at it, in the green on this chart, you have xAI with Grok 4.1 Fast. And we talked about that. Was it last show or the show before? This thing is so cheap. It's brilliant at tool calling, so it's great at agentic use cases. It's just a great all-round model. And it's so cheap, it's essentially free. And so if you're building an application right now and you're doing tool calling and you want it fast and everything, it makes sense to me why it's number one for the use cases that people use OpenRouter for. Then you've got Google second, Anthropic, and OpenAI is sitting fourth on OpenRouter. So when given the choice to consume from different APIs, even though Anthropic's models are more expensive than OpenAI's, and Google is about on par with OpenAI's models, they're still fourth. And let's ignore xAI for now, even though I don't think it should be ignored, because it's showing what people will consume when the price is good and the model is great. So anyway, we'll see how it plays out. Do you think they're going to have some model that blows us out of the water, excites us? They have to. I mean, I think maybe they will, because they have to. Like, if they're going into code red, the code red should not be, let's release GPTs too, or, you know, let's come out with the 12 days of bullshit web UI that we're actually not that great at. They need to be like, the code red is: let's come out with a model that is just so much better. And here's my criteria. Here's what it has to do to be the best, right? I'm coming up with this on the fly. However, it has to be at least a million tokens of context.
It can't be less than a million or I just don't think it's in the game, right? It has to be at least 64,000 tokens of output. It has to be phenomenal at parallel tool calling. That has to be its wheelhouse, absolutely amazing at it. It has to be good at long-running, repeated processes. So if it's like, okay, I've done phase one, I'll come back for phase two, do more research, whatever it is, i.e. an agentic-loop kind of thing. I think you need to see a big improvement in vision. A big improvement in vision from any model right now would leapfrog them to number one, because it's an area that really hasn't improved at all in the last year, if not two years. Like, it's gotten a little bit better, but it really hasn't changed much. Do you agree with that? I don't know. I think GPTs have always been great at images, or good enough. Yeah, but I mean, what percentage have they improved in the last year? Oh, nothing. It's flatlined completely. I think Gemini 3 is a tad better. So I think a big improvement in vision would be absolutely massive. I mean, this is your wish list. And I reckon these would be panic tunes of GPT-5. Like, maybe they're new models, maybe they just have all these models sitting around, but I don't know. And then my final one is speed. My final one is speed. Even if they're already making a massive loss, at the API level I would just throw every piece of hardware I've got at it and make it lightning fast, just so fast it's unbelievable. And then the other one, and again, maybe not economically the best decision, but in terms of mindshare: cheaper.
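"Phenomenal at parallel tool calling" means, in practice, that the model emits several independent tool calls in a single turn and the client executes them concurrently instead of one by one. A sketch of the client side, with invented tool functions standing in for real ones:

```python
import asyncio

# Pretend tool implementations; a real client would dispatch on the
# tool names and arguments the model emitted in its response.
async def fetch_weather(city: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for network latency
    return f"weather:{city}"

async def fetch_news(topic: str) -> str:
    await asyncio.sleep(0.01)
    return f"news:{topic}"

async def run_tool_calls(calls):
    # Execute every tool call from one model turn concurrently,
    # returning results in order so they can be appended to context.
    return await asyncio.gather(*(fn(arg) for fn, arg in calls))

results = asyncio.run(run_tool_calls([
    (fetch_weather, "Sydney"),
    (fetch_news, "cricket"),
]))
print(results)  # ['weather:Sydney', 'news:cricket']
```

With sequential execution the total latency is the sum of the calls; with `asyncio.gather` it is roughly the slowest call, which is why parallel tool calling matters so much for agentic loops.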
Like, if they could get it half the price, twice as fast, a million tokens of context, and better vision, they would absolutely crush it. And intelligence-wise, the model probably only needs to be six percent, ten percent more intelligent, something like that, slightly more intelligent, or at least on par with Opus, at least on par with Gemini. And bigger context. And also, sorry, this wish list is so long. Like, how long is this going to go for? While I'm going with a wish list, the other thing is some of the great API features that Anthropic has added. Automatic context management. In other words, it's crushing down the token usage from old tool uses and things like that. So just so everyone understands: when you've got a big context building up from a long session, you've got the results of all the tool uses in there, which need to be there for continuity. For example, if you've crawled a website, it's got the results of that. If you've downloaded your emails, it's got all the bodies in there. What Anthropic can do is automatically discard the ones that are no longer relevant, that have already been taken into account. It's a great feature, especially when you're doing agentic looping and things like that. It's very nice. Better caching, that kind of thing. OpenAI already has transparent caching, but it isn't the best. If they could get some of those things going as well, suddenly, to me, they would be number one. Listening to everything you've described, what I hear is this: Google and Anthropic are building tools that people actually want for real-life use cases. And OpenAI's APIs have basically reached this weird point of, yeah, it's almost like they're not listening to what people want. They've just got their own sort of agenda with their models, or they just can't figure it out. Because they're noticeably worse at tool calling.
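The automatic context management the hosts describe can be pictured as pruning stale tool results out of the message history while keeping the conversation itself intact. This is an illustration of the idea only, not Anthropic's actual implementation or message schema:

```python
def prune_tool_results(messages, keep_last=2):
    """Blank out all but the most recent tool results.

    `messages` is a list of {"role": ..., "content": ...} dicts;
    the structure here is illustrative, not a real provider schema.
    """
    tool_idx = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    stale = set(tool_idx[:-keep_last]) if keep_last else set(tool_idx)
    pruned = []
    for i, m in enumerate(messages):
        if i in stale:
            # Leave a stub so the model still knows a result existed.
            pruned.append({"role": "tool", "content": "[result discarded]"})
        else:
            pruned.append(m)
    return pruned

history = [
    {"role": "user", "content": "crawl the site"},
    {"role": "tool", "content": "<10k tokens of crawled HTML>"},
    {"role": "user", "content": "now check my email"},
    {"role": "tool", "content": "<email bodies>"},
    {"role": "tool", "content": "<latest lookup result>"},
]
print(prune_tool_results(history, keep_last=2)[1]["content"])  # [result discarded]
```

On a long agentic session this is the difference between the context growing without bound and it staying dominated by the recent, still-relevant tool output.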
And even the way they tune the output format, the feel is off in a lot of these models now when people get exposure to the other models. Anyway, we should move on because we could rant about this all day. Yeah, and just to finish that: I'm obviously looking at it from a unique perspective, because we provide a system that allows disparate companies with different needs to get the most out of the models, right? So we're in a position where we need the best of everything, and we're always going to be fighting for whatever the best is. I think it would be different if you had bespoke AI use cases where you're just targeting one thing you're trying to do best. A lot of this stuff goes out the window then, and you're really just looking at it from a cost perspective or something like that. But if OpenAI wants to win people like us, and win people who are building stuff on top of AI, they need to do some of this stuff, or it's just not going to happen. No one's even going to care. Here's the other problem for OpenAI long term. Let's now introduce into the discussion DeepSeek V3.2. So this is a new DeepSeek model, DeepSeek R1 being the thing that kind of blew up the market originally, because everyone was like, there is no moat. Remember that phase? Yep. So DeepSeek 3.2 is out, said to be a reasoning-first model built for agents, yada yada, benchmark-maxed, looks good on paper. Open source, out there, available. You can run it wherever you want. We've tried it out. Unfortunately, with the provider we use in the US, the tokens are just so slow that it's hard to get a true feel for the model yet. But my initial impression, just from the few queries I did run on it, is that it's pretty good. It seems very comparable even to Opus, I would say, in a way. Absolutely. If you did a blind test on me with this, I probably wouldn't have noticed.
Like you say, it is disappointing about the provider, and we'll look for a new one, or just host it ourselves, because when it's bad, it's bad, and when it's good, it's sensational, because it's very fast. When you get the request through, it's very fast. It was just that we were clearly part of a queue, and the queue was not being picked up fast enough. However, I got it, using Sim Theory, to research the Ashes Test, which starts today, the Australia versus England cricket match, and said, find out about injuries, player performance, historical things, stuff like that. It did about 20 or 30 tool calls in terms of research. Then I said, make an infographic, a song, and an image of the match. And it made this incredible image of Steve Smith, the acting Australian captain, batting. It correctly got that it's a pink-ball Test, i.e. a day-night Test; they use a different colored ball for the game. It got an approximate score of where it thinks the match will be partway through. It wrote an amazing, albeit American, country song about the game. And the infographic it made is just unbelievably detailed. I know Nano Banana deserves a lot of the credit, but look at that. It's got player images, it's got stats, it's got historical information, it's got comparison figures. It shows genuinely accurate statistics about this game. And I said to Mike earlier, can you imagine a newspaper or online magazine or anyone who would have wanted to present this information to their customers in the past? How much would that have cost in terms of employee time, research, graphic designers, all that sort of thing? You couldn't have done it. It would have cost hundreds of dollars, if not thousands, to produce an image like this. Probably thousands, right? And if you look at the actual Nano Banana cost, it's, what, 15 cents or something to produce an image like this.
And then you've got the DeepSeek inference, which is cheap as anything. It's pretty amazing what it can do. Yeah, going back to the model, though, with DeepSeek, this is further erosion of OpenAI. Because you said earlier about building for a specific use case. Obviously, we use all these tools and aren't necessarily optimizing for specific use cases, because we want to make all those tools available. But if you're building a product on top of AI, it's like, what are your constraints? Well, one of them is your OpenAI bill or your Anthropic bill. It's essentially like an electricity bill, a utility provider. And it really ruins the economics of these companies. In a SaaS business you would have a gross margin of like 70 or 80 percent, so of every dollar a customer gives you, you keep 80 percent, and it only costs 20 percent to run the business. With AI businesses, it's sort of inverted in a lot of ways. And I think what can happen here, or is happening, is that the startups building around AI are using Chinese models, right? And DeepSeek V3.2, I think, further erodes both OpenAI and Anthropic in a lot of ways, because it's just so much cheaper to use. Even if you're consuming it through the services we use, just to put some stats on it, it's, what, 56 cents per million input tokens. I mean, I think Grok 4.1 Fast is still cheaper. And probably better and faster. At a million output tokens, $1.68, so roughly ten times cheaper. This particular iteration of DeepSeek 3.2 has a 163k context window, but I think it could be higher. Yeah, if you ran it yourself, it could be higher. And another important clarifying point: people hear DeepSeek, Chinese model, and think, okay, the Chinese get all your data. They don't. The version we use, for example, is hosted in the US. There's no training on the data. It's secure. Additionally, these are open-weight models.
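The pricing figures quoted on the show translate directly into per-request cost: tokens divided by a million, times the rate. A quick sketch using the numbers mentioned ($0.56 per million input tokens, $1.68 per million output tokens for the hosted DeepSeek 3.2, figures that may well change), compared against a hypothetical frontier-model price that is an assumption for contrast, not a quote:

```python
def request_cost(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in dollars given per-million-token input/output rates."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A chunky agentic request: 100k tokens in, 8k tokens out.
deepseek = request_cost(100_000, 8_000, 0.56, 1.68)
print(round(deepseek, 4))  # 0.0694

# Same request at a hypothetical $3-in / $15-out frontier price.
frontier = request_cost(100_000, 8_000, 3.00, 15.00)
print(round(frontier, 2))  # 0.42
```

At those assumed rates the same agentic request is roughly six times cheaper, which is the margin pressure the hosts are describing for startups whose "utility bill" is model inference.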
You can host it yourself in your own private cloud, local to your whole system. So you can get way, way better privacy and security with a model like this than you actually can with the major providers. There are a lot of advantages to it rather than disadvantages. And as we've said, the trade-off in terms of intelligence or tool calling is minimal at most. Yeah, so if you're building an app that relies on AI, like one of these toy robots everyone's really into right now, like the LAMP one I saw this week, and you just need to run inference, you would be far better off just running a model like this in your own private cloud. That's going to get the economics right. Yeah, the main downside, obviously, is that if you ran it yourself, there's a minimum fixed cost. You're probably looking at between $500 and $1,000 a month to run it, right? In terms of the GPUs you'd need to rent. Or if you had to buy a GPU to do it, it'd be like $20,000 or $30,000, something like that. So there is a base cost to it. But if you know the demand's there, then you can easily justify it. And then it's not a marginal cost, it's a fixed cost, so it's a lot better. And I think right now, the advice I give to a lot of people is: just use the best models to figure out the use case. Don't bother setting up your own infra. Just take the hit on the cost until you've figured out the business model. And then, once you get to that point in the business where you're like, this is working, this is a long-term thing, then I can optimize around a model like DeepSeek in my own cloud, have fixed costs around this, and get my economics looking right. And the question is, are we there yet? I'm not entirely sure.
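The fixed-versus-marginal trade-off the hosts sketch can be made concrete: self-hosting starts to win once your monthly token volume, priced at the API's per-million-token rate, exceeds the GPU rental bill. A back-of-envelope calculation using ballpark numbers from the discussion (both figures are rough assumptions, not quotes):

```python
def breakeven_tokens(monthly_gpu_cost, api_rate_per_million):
    """Monthly token volume at which renting GPUs beats paying an API."""
    return monthly_gpu_cost / api_rate_per_million * 1_000_000

# ~$750/month of rented GPUs vs a blended ~$1 per-million-token API rate.
tokens = breakeven_tokens(750, 1.00)
print(f"{tokens:,.0f}")  # 750,000,000
```

So at these assumed prices you would need on the order of three-quarters of a billion tokens a month before self-hosting pays off, which matches the advice in the episode: prove the use case on hosted APIs first, and only move to your own infrastructure once demand is certain.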
I think for some businesses you might be closer to that right now, where there are just these streamlined use cases in the business where it's far easier to have fixed costs than to keep paying for continuous inference from various providers. But in terms of raw intelligence in an enterprise or a business today, you do still want to be working with the latest models. You don't necessarily want to be locked into some stale model that takes 12 months to revise. Yeah. And I think there's also expertise involved. For example, you couldn't live with a 160k context window. That's really small by today's standards. I think the minimum you get now is 200k, with something like Opus 4.5, and honestly, that's weak. It should be higher than that. So there are definitely downsides. But like you say, with specific, known use cases, a model like this is hard to beat. So, interestingly, I think we were talking about this last week or the week before, this idea of AI-first people. If you were hiring now, this is what you would be looking for: these skills where people understand how to use AI models and tools in the workplace to be more efficient, get a lot more done, be a lot more productive, and automate processes and things like that. And there was this article in Business Insider this week that stirred up a lot of commentary. Meta is about to start grading workers on their AI skills. Meta will assess employees' performance by AI-driven impact starting in 2026. The company is shifting towards an AI-native culture and incentivizing AI adoption through rewards. Meta is also rolling out an AI tool to assist employees in writing performance reviews. Do you think maybe Zuckerberg listens to our podcast? And if you do, Zuckerberg, please buy our merch. God knows we need it. Our store may or may not be a scam, but please just buy some of it for your staff, whatever.
I'm amazed it's still operating. But it really got me thinking, right? There was a lot of commentary around, like, this is stupid, people still need hard skills. You don't want a bunch of slop out there, slop code in your core code base, slop sales proposals going out. And there's this anti-AI narrative, I think, around this, where people are like, do we even need this? Is this the right way to lead in this era? Should we be forcing people to use these tools or not? And I think even in our own direct experience, there is something now, at least in 2025, where it feels like workers who are not adopting AI are starting to actually fall behind. They're starting to seem like dinosaurs who haven't embraced these tools, and I think it's becoming more and more obvious. And then you're also in this strange era in education. If you look at schooling or university, university in particular, people face a huge financial burden when they go and do a degree, only to come out now and not even get a job. And all of a sudden, AI skills are vastly more important. It is leading to a lot of people not even going to university, and the value proposition is being questioned there as well. If I want to go and work at Meta, and all they value is these AI skills, why would I bother doing a course that's already a sort of dinosaur course, getting myself into horrendous debt, and then not having the AI fluency I need to get a job at these companies? So I think this disruption, it's still very early, but you can totally see it coming. And these tech companies tend to be the early adopters of the trend, like the fact that they want AI-fluent workers. Yeah. And I think you asked a question the other day about a task, like, how long should that take a modern front-end developer to complete? And we looked at what we would have thought, say, three years ago versus now.
And you're like, if it's taking this person more than an hour, they don't know how to use AI, basically, because there's no reason to do the grunt work associated with that job anymore. Literally, if you're using, say, React, which Facebook invented, right? And you have a well-known component library that's fully documented, which they do. And I ask you to build a new control that does something. Why in the world should it take you more than an hour to fully build it and fully test it in place, when you've got access to these models? Even Llama could do this stuff. There's absolutely no reason why a developer in that role should be doing it the traditional way. And if they don't know how to competently use AI, if I were Facebook, I'd fire them immediately, or retrain them, obviously. But even outside our software engineering or developer bubble, this applies to almost everything. If you're a lawyer trying to come up with an angle in a case, going back and forth with the AI on it is going to be much faster than just sitting there manually typing things out. You still need agency, and you still need some core hard skills there. But even data analysis: now you don't really need a data analyst. You can get the AI to write SQL queries, produce charts, produce insights, or at least surface these things. And sure, it might be wrong or inaccurate sometimes, but so are data analysts in real life, if you've ever actually dealt with one. Also, think about someone replying to RFP documents, or making RFP documents, or producing slide decks, or writing business proposals or purchase orders. All of these things should basically be seen as rough clay, where you work with the AI to refine it into the finished product using your expertise. They should not be documents you're ever creating from scratch anymore.
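The "AI writes the SQL" workflow is concrete enough to illustrate: the model drafts a query from a plain-English question, and the human's remaining job is to sanity-check and run it. A toy end-to-end run with sqlite3; the table and the "generated" query are invented for the example, standing in for whatever a model would actually return:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount REAL);
INSERT INTO sales VALUES ('NSW', 120), ('NSW', 80), ('VIC', 50);
""")

# The kind of query a model might return for the prompt
# "total sales by region, biggest first":
generated_sql = """
SELECT region, SUM(amount) AS total
FROM sales GROUP BY region ORDER BY total DESC;
"""

# The analyst's remaining work: review the query, then execute it.
rows = conn.execute(generated_sql).fetchall()
print(rows)  # [('NSW', 200.0), ('VIC', 50.0)]
```

This is the episode's "rough clay" point in miniature: the model produces the draft, but the review step, knowing the schema and spotting a wrong join or aggregate, is exactly the hard skill that doesn't go away.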
And therefore, anyone who can't use AI skills is, by definition, less efficient than someone who can. Yeah, and I think this is the thing: it's AI fluency or not. And I think Zuckerberg, despite being seen as an evil sort of alien-spawn villain, is probably right. This is the way many workforces are going to go and be expected to operate, whether it's automation or just working with AI to be way more productive. And I think the challenge is that a lot of people in the workplace still see AI as a threat to them, not a tool that will actually get them promoted, empower them, and make things better. And then on the other side, you've got people, and I know because many of them have reached out to us, whether it be their parents or them directly, asking, should I even go get a degree anymore? Should I even go and study? Will what I learn be out of date by the time I get out? And interestingly, I think that concept requires a lot of self-motivation, right? If you're going to sit with ChatGPT or Gemini or whatever and teach yourself and learn stuff, you still need structure, and you still need to understand what it is you're learning in order to be valuable in a workplace or to get a job. I think this idea that you'll just magically teach yourself with a ChatGPT subscription is a bit bonkers. No, because it requires taste, and it requires understanding where it plays a role and where it doesn't. You really need to know how and when to use it, and how to respond to what it gives you, in order to be effective. It's a totally different way of working. It isn't that the future of work is just copying and pasting into a chat window. It's: how and when do I leverage this to do the pieces of my work that it is better and faster at than I am? And I think that's about context building. And soon it's going to be about coordinating agents.
And there's a whole big technique that's going to be needed, and the people who know it are going to do better, regardless of where they learn that training. And this is the thing, right? There's another statement Zuckerberg made quite a while ago: Mark Zuckerberg says college isn't preparing students for today's job market, and this, plus the debt burden, will create a reckoning for higher education. And there's a lot of data that supports this as well. Outside the doctor-lawyer type professions, where you need the professional qualifications and you're just not going to be able to pick that up on your own, there are so many jobs, I would say the vast majority of jobs, where you would traditionally go get your degree, maybe go back and get an MBA, and as a result you were recognized, you were promoted. There was a lot of value placed on that certificate and that training and its importance. And now it's getting to a point where a lot of this stuff is just overnight out of date. Certain courses are just not landing people jobs in this era. And really, what needs to happen is they all need to be rewritten and rebuilt for this AI fluency. That's right. The universities need an AI component in basically all of their degrees. They need compulsory subjects, like, this is an AI-certified degree that has, as you say, AI fluency attached to it. And interestingly, spoiler alert, very early on with Sim Theory in Australia, there was a university who I think gets this. And this is definitely not some paid advertisement for them at all. No, it is. Mike's lying. It's a paid advertisement. It's our second sponsor. It was carrots, and now it's a university. I'll censor it. Hang on. Yeah. So if you're into carrots or universities, you want Bolthouse Fresh or UNE? Yeah.
So the University of New England here in Australia reached out to us pretty early on and has, you know, also rightfully identified this. And I think they have taken it as an existential crisis: how do we get AI fluency written into everything that we do? And how do we invent new degrees that are actually aligned with what employers want now, and the AI fluency people expect? And I think there's a lot to this. It's going to become like when the internet was new. There was a subgroup of people who were like, I don't do the internet. And the same group of people are like that with AI today. They're like, I don't do AI. And so, like florists. Florists were like, no one will ever order flowers on the internet, are you crazy? And then they just got crushed by Roses Only and Interflora and all those companies. Yeah, now they're just distributors for those websites. But yeah, I think the backlash to this is a bit ridiculous. People seem to see it through this lens that all AI is slop, and they just want it to go away. And there's this backlash any time someone makes a statement like this. And now, I don't know, I think increasingly this is just becoming such an important skill. I've got a funny story about this. At that same speech I mentioned earlier at my kid's school, the principal was talking about AI, and he's like, oh, I think in this world of AI, we need to learn what it means to be human. We're all humans, and there are things humans can do that AIs can't. And then he's like, you know, for example, an AI can't love you. It can't feel love. And then I turned to my wife and said, Patricia loves me. She's so angry about it. She doesn't like it. But, yeah, it was so funny how this school, for example, has recognized the pervasiveness of AI. It's in everything. They can't ignore it.
Because I was really annoyed at first. I'm like, this assembly is about the students and their achievements for the year. It shouldn't be about these existential crises you face as a school. But then I'm like, it's so pervasive. It's in everything they do now. Students are using it casually. Students are using it for their work. The teachers are using it to help the kids. It's everywhere. And it's so capable that they have to acknowledge it. They have to recognize it in some way. And I think you're right. Something UNE did so well is recognize this, what, two years ago? They were like, this is going to be a big deal for us, we need to do something about it. And I think this is going to happen in more and more educational institutions. They're like, we can't ignore this. This isn't something we can just call slop and hope it goes away. We absolutely must acknowledge it, talk about it, and not just come up with a solution for it, but win the hearts and minds of the students, and in this case the parents, the stakeholders in the situation, and say, well, this is our approach, and this is how we're going to leverage it. Because I feel like if you don't do that, people are going to go, if they have the mobility to do so, to institutions that do acknowledge it and have a strategy for it.
Yeah, I honestly think there's this transition period too, where these AI-native kids will come through, and it'll just be an expectation, whether they're being trained in a specific role, of how do I apply these AI skills in a way that's relevant to the career path I've chosen. And then there's also going to be this poor in-the-middle generation, where it's like, well, it's kind of partially here, but it's not; some companies are adopting it, some companies are saying, oh, it's completely banned, or whatever. And the value proposition for this generation right now, of potentially spending all this money on a degree where they're not learning these AI skills, going into debt as a result of it, and then, with the way the market is, not getting employed either — it's ripe for disruption. And as you said, it's everywhere. There are definitely job roles that I can see either being completely eliminated, or one powerhouse is going to be doing that job for ten companies. Yeah. And to me, there's opportunity here, for the AI fluency market. To me, that's what education needs to be about: how do you get people up to speed? And also, how do you teach them to use the tools, use their own judgment, question the outputs, and give the model some agency, because it's certainly not capable, at least today, of just magically doing the job. Even building a tool stack, like, what's your tool stack? The combination of tools you use to get the job done, because really, MCP and tool calling is such a big part of it. Even just being able to fashion your own tools to make your job more effective is going to be a useful skill. Yeah.
I think it's so interesting to live in this time, where you've got kids in school seeing AI just infuse itself everywhere, whether they like it or not, and then seeing how these people actually react to it. And funnily enough, I think schools have probably the worst internal AI tooling I've ever seen today, at least here in Australia. So it's somewhat troubling, because it means students are just going everywhere. My son's school just banned Sim Theory, and I was, like, really proud of that, because finally we're important enough to be banned. Yeah, because he was handing it out to all these students to bypass their ChatGPT ban, but then we got banned. So I'm proud but also disappointed. So they banned ChatGPT at the school, even though they preached about AI. Yeah, that was from the start, and now we're on the list. Isn't that like banning Google? Because Google's got all the AI in it. Maybe this is why their traffic's down. Yeah, it could be. But, yeah, I guess the IT administrators ban it, and any time I try to help the kids bypass that, I get in trouble, so I don't now. But yeah, I was pretty proud when I heard that. What's the justification for banning it? They didn't even say anything. I just hear it from the kids, I don't know. Yeah, it seems ridiculous. It's straight ignorance, because I don't think that's really the way to handle it. You've got to remember, a lot of kids in schools now get their own laptop given to them, at least in Australia, and those laptops adhere to the IT policies of the school. So it isn't just banned at school; it's also banned at home, effectively, in terms of doing their work. It's weird, because they use tools like Canva, for example, which has AI stuff in it. So it's not AI generally being banned. It's almost like, well, this specific AI is banned.
So it's an odd kind of censorship, in a way, in terms of the kids and how they use the technology. Maybe it's a conspiracy, like big Australian tech is saying, you've got to ban it because we want Canva. Yeah, that's right. We've got to do something to finally get to this IPO, guys. Come on, we've waited so long. So Chris, we do a lot of songs on the show, right? Oh, is this a surprise? I didn't know there was a song today. No, there's no song today. Oh, sorry. But we do a lot of songs on the show, and I've always thought I had a certain knack for the AI songs. In our musical, which is wildly unpopular, I must admit, but one of my favourite things we've made, you made almost all the songs. I think I only got one in there, and it was probably the least popular. But anyway, you've finally beaten me in the Spotify charts. It's now number one: Fatal Patricia. What would it take to get it played on the radio? Could we do it somehow, if we rally as a community? Someone in our audience must have some sort of connection at some station. But why radio? I mean, no one listens to radio anymore anyway. It's just validation, just for my self-esteem. I'd love it. Imagine some American radio station. There are so many. Surely one's willing to play it. It would be good to have some rural US or Canadian radio station. I think that should be our goal. If anyone in the audience has any connection to any radio station, we would like this played. I don't know why, though, because it's probably got more distribution on Spotify with our 300 monthly listeners. I'll take it. That's good. All right. I have one more segment for you. It's called WTF of the Week, or ****. And it's this video, and I know for people listening this is going to be kind of annoying, but I'll explain it as best I can. So Tesla Optimus, which has its own X handle, posted: just set a new PR in the lab. And I thought this was AI and totally fake.
I mean, it might be, but I doubt it. I mean, it is AI, but not the video. Look at it run. It looks unfit. I don't know. It looks beautiful. It's just so fluid, and it looks like a human now. Yeah. It looks like a 50-year-old woman running. Okay. I think it looks better than that. It's probably like me running. I'll give it to you, it's a little unfit. There'd be 50-year-old women fitter than you, I'd imagine. Yeah, definitely fitter than me. It's a little bit unfit. I can't even walk upstairs. It, I don't know, it gave me shivers. You've got to watch it. I'll link it below in the description for those that haven't seen it, but it's the first one of these android videos I've seen where someone doesn't kick it in the legs or try to trip it over or something. Oh, there are videos of people fighting it, throwing bricks at it. Look at this, the best AI in the world can survive a brick attack. Yeah. All right, any final thoughts? It's a bit of a nothing week, apart from Code Red. The DeepSeek model is interesting. I think it's a good model. Yeah, it's all a reputation thing. It's like, can you motivate yourself to stick with DeepSeek for more than a day? You're just not going to do it. That's the problem with it. I don't know what it is. It's a vibe thing. The whole thing with AI is it's a vibe art form, and no one is going to be like, I'm going to vibe with something that might be the worst one out there. You're just not giving it a chance. Maybe that's the problem. Yeah, you're only taking a chance on the big brands. Please, please give it a chance. All right. We will see you next week, maybe with some new OpenAI models, maybe not. See you later.