Last Week in AI

#225 - GPT 5.1, Kimi K2 Thinking, Remote Labor Index

Last Week in AI • Andrey Kurenkov & Jacky Liang

Friday, November 21, 2025 • 1h 18m

What You'll Learn

  • GPT-5.1 introduces new personality presets for language models, allowing users to choose between friendly, quirky, nerdy, and other personas.
  • Baidu's Ernie 5.0 model release was met with some disappointment, but the company's autonomous ride-hailing service 'Apollo Go' is now operational in 22 cities.
  • ByteDance's 'Doubao Seed Code' is a new coding assistant that aims to compete with OpenAI's Codex and other AI-powered coding tools, at a lower price point.
  • Google is adding shopping features to its AI search mode, allowing users to shop conversationally and have the AI handle checkout and purchase tasks.
  • Fei-Fei Li's 'Marble' is a new commercial product that allows for the generation and editing of 3D environments, with potential applications in video games, VR, and VFX.
  • The persistent use of em-dashes in language model outputs is a quirky issue that raises questions about the training data and biases of these models.

Episode Chapters

1. Introduction

The hosts provide an overview of the episode's content, which covers a range of AI news and developments from the past week.

2. New Language Model Releases

The discussion focuses on the release of GPT-5.1 and Ernie 5.0, highlighting their new features and the reception they've received.

3. AI-Powered Shopping and E-commerce

The episode covers Google's addition of shopping features to its AI search mode, as well as the potential impact on merchants and adoption.

4. Autonomous Vehicles and Ride-Hailing

The hosts discuss Baidu's progress with its Apollo Go autonomous ride-hailing service, which is now operational in 22 cities.

5. 3D Environment Generation and Editing

The discussion focuses on Fei-Fei Li's 'Marble' product, which allows for the generation and editing of 3D environments, and its potential applications.

6. Trends in the Chinese AI Market

The episode touches on the various AI model releases and developments coming out of China, and how they compare to the Western market.

7. Quirks and Challenges of Language Models

The hosts discuss the persistent use of em-dashes in language model outputs and the questions it raises about training data and model biases.

AI Summary

This episode of the Last Week in AI podcast covers a range of AI-related news and developments, including the release of new language models like GPT-5.1 and Ernie 5.0, the expansion of AI-powered shopping features in Google Search, the growth of Baidu's autonomous ride-hailing service, and the release of Fei-Fei Li's commercial world model product 'Marble'. The discussion also touches on trends in the Chinese AI market, the growing use of AI assistants for everyday tasks, and the quirks and challenges of large language models.

Topics Discussed

Large language models • AI-powered shopping and e-commerce • Autonomous vehicles and ride-hailing • 3D environment generation and editing • Trends in the Chinese AI market

Episode Description

Our 225th episode with a summary and discussion of last week's big AI news! Recorded on 11/16/2025. Hosted by Andrey Kurenkov and co-hosted by Michelle Lee. Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
  • New AI model releases include GPT-5.1 from OpenAI and Ernie 5.0 from Baidu, each with updated features and capabilities.
  • Self-driving technology advancements from Baidu's Apollo Go and Pony AI's IPO highlight significant progress in the automotive sector.
  • Startup funding updates include Inception taking $50M for diffusion models, while Cursor and Gamma secure significant valuations for coding and presentation tools respectively.
  • AI-generated content is gaining traction, with songs topping charts and new marketplaces for AI-generated voices indicating evolving trends in synthetic media.

Timestamps:
(00:01:19) News Preview

Tools & Apps
(00:02:13) OpenAI says the brand-new GPT-5.1 is 'warmer' and has more 'personality' options | The Verge
(00:04:51) Baidu Unveils ERNIE 5.0 and a Series of AI Applications at Baidu World 2025, Ramps Up Global Push
(00:07:00) ByteDance's Volcano Engine debuts coding agent at $1.3 promo price
(00:08:04) Google will let users call stores, browse products, and check out using AI | The Verge
(00:10:41) Fei-Fei Li's World Labs speeds up the world model race with Marble, its first commercial product | TechCrunch
(00:13:30) OpenAI says it's fixed ChatGPT's em dash problem | TechCrunch

Applications & Business
(00:16:01) Anthropic announces $50 billion data center plan | TechCrunch
(00:18:06) Baidu teases next-gen AI training, inference accelerators | The Register
(00:20:50) Meta chief AI scientist Yann LeCun plans to exit and launch own start-up
(00:24:41) Amazon Demands Perplexity Stop AI Tool From Making Purchases | Bloomberg
(00:27:32) AI PowerPoint-killer Gamma hits $2.1B valuation, $100M ARR, founder says | TechCrunch
(00:29:33) Inception raises $50 million to build diffusion models for code and text | TechCrunch
(00:31:14) Coding assistant Cursor raises $2.3B 5 months after its previous round | TechCrunch
(00:33:56) China's Baidu says it's running 250,000 robotaxi rides a week — same as Alphabet's Waymo
(00:35:26) Driverless Tech Firm Pony AI Raises $863 Million in HK Listing

Projects & Open Source
(00:36:30) Moonshot's Kimi K2 Thinking emerges as leading open source AI

Research & Advancements
(00:39:22) [2510.26787] Remote Labor Index: Measuring AI Automation of Remote Work
(00:45:21) OpenAI Researchers Train Weight Sparse Transformers to Expose Interpretable Circuits | MarkTechPost
(00:49:34) Kimi Linear: An Expressive, Efficient Attention Architecture
(00:53:33) Watch Google DeepMind's new AI agent learn to play video games | The Verge
(00:57:34) arXiv Changes Rules After Getting Spammed With AI-Generated 'Research' Papers

Policy & Safety
(00:59:35) Stability AI largely wins UK court battle against Getty Images over copyright and trademark | AP News
(01:01:48) Court rules that OpenAI violated German copyright law; orders it to pay damages | TechCrunch
(01:03:48) Microsoft's $15.2B UAE investment turns Gulf State into test case for US AI diplomacy | TechCrunch

Synthetic Media & Art
(01:06:39) An AI-Generated Country Song Is Topping A Billboard Chart, And That Should Infuriate Us All | Whiskey Riff
(01:10:59) Xania Monet is the first AI-powered artist to debut on a Billboard airplay chart, but she likely won't be the last | CNN
(01:13:34) ElevenLabs' new AI marketplace lets brands use famous voices for ads | The Verge

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Full Transcript

Hello and welcome to the Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual, in this episode we will summarize and discuss some of last week's most interesting AI news, plus, I think, a few slightly older stories; we are in our two-week phase right now, but soon we'll be back to weekly, I promise. As always, you can also check out the Last Week in AI newsletter at lastweekin.ai, which will send a whole bunch more news to your email. I am one of your regular hosts, Andrey Kurenkov. I studied AI in grad school and now work at Astrocade. And once again, we have a guest co-host, Michelle Lee.

Hey, everyone. I am Michelle, and I also studied AI in grad school with Andrey. Now I am the founder and CEO of a company called Medra, where we are building physical AI scientists.

Yeah, you co-hosted for the first time just a couple of weeks ago. It was super fun, so you graciously agreed to co-host again, which I think will be a great time. And this episode will be fairly low-key. There's not been any crazy big news in the world of AI lately, surprisingly, but there is a smattering of different things going on as usual. We've got some model releases, GPT-5.1, Ernie, a few other things. In business stories, there's always more billions being spent on data centers, and a lot going on with self-driving cars, which we'll touch on again. Moving on to open source, we are still getting more and more models coming out of China, which is a trend we'll touch on a bit. In research and advancements, some pretty interesting research from OpenAI and, again, other trends going on. And we will touch on a bit of policy and a bit of media and art. So it should be a fun, relatively short episode.

Let's kick it off with tools and apps. We've got GPT-5.1. This was just announced. There's GPT-5.1 Instant, which is meant to be a warmer and more intelligent model, and GPT-5.1 Thinking, which is faster on simple tasks and will take longer on complex tasks. These will be a replacement for the previous GPT-5 models, which will be available for another few months and then become legacy models. It's a bit of an interesting development: a lot of people seemed to not like GPT-5 when it came out because they liked GPT-4o, and in fact were angry that GPT-4o was taken away. So this seems to be OpenAI trying to thread that needle of making it friendly like GPT-4o, but also not sycophantic as it tended to be.

Yeah, and it looks like there are now more personality presets for the new models too. People can choose between friendly, quirky, nerdy, cynical, and a couple of other ones, so it gives users a lot more options to decide between the personalities that they want for their models.

Yeah, which I think is interesting. I always think back to the mind-blowing fact that ChatGPT has 800 million weekly users or something. The two of us presumably are just using LLMs and chatbots for work to be productive, but I can definitely see a lot of people just chatting with these chatbots. And in that sense, having these friendly, quirky, efficient, cynical personalities makes a lot of sense. I could even see myself talking to a nerdy ChatGPT and having a bit of fun with that.

Well, something I started doing recently is, on my drive to work, I will just turn on ChatGPT and talk to it and ask it to teach me new things. Recently I've been really into poker, so I'll ask it to give me poker drills that I can do in the car.
And it's surprisingly really fun and helpful. I do sometimes find some of the preset personalities pretty annoying if I'm talking to it, so I do think that as we are using LLMs in more and more everyday situations, treating it as a teacher the way I do, the different personalities will start to matter.

Yeah, for sure. And on to another big model release, but not from OpenAI; this is from Baidu. They have announced Ernie 5.0 and various other products in their AI offerings. Baidu World 2025 just happened, and that's where they announced this model. It is now part of Ernie Bot and is going to enterprise customers. Interestingly, it seems to have been met with some disappointment; Baidu's stock went down by 10% after this. It was seen as a bit of an incremental move, in a way similar to GPT-5, from what I can tell. But there was a variety of announcements here, also about the autonomous ride-hailing service Apollo Go, which is now completing quite a few trips, actually 17 million rides so far, and Baidu Search is starting to include more AI capabilities. I always try to keep an eye on these kinds of LLMs in China, which are huge and will have a lot of impact, but which our listeners, who are mostly in the West, may not be as familiar with.

I think part of this is probably just that this past week so many new models came out of China, and I don't think Ernie was the most impressive one. A lot of today's episode will be going over the different models, and unfortunately Ernie wasn't as impressive as some of the others that came out. But what was really interesting was I read that Baidu's ride-hailing service is now operational in 22 cities. That's pretty crazy. If you look at Waymo, Tesla's robotaxi, and now even Zoox, they are really operating in far fewer cities compared to Apollo Go, which is Baidu's service.

Yeah, I think Waymo is now trying to get up to something like eight cities, and Tesla is trying to get to two. So definitely, yeah, quite a bit ahead on that one.

Well, next up, we've got another release from China. ByteDance has their Volcano Engine, and they're releasing Doubao Seed Code, which is a competitor to Claude Code, Cursor, Cline, all these various coding agents. One of the big deals is that it's pretty cheap: the initial price is only about $1.30 for the first month, and after that it's going to be around $5.50. And it seems to be quite good. I haven't looked into this too deeply, but I did see some discussion online that this is not, let's say, as good as OpenAI's Codex or Claude Code, but it's pretty good and much cheaper relative to what you would otherwise pay. So again, we've seen a ton of competition on the coding agent and coding tooling front, and it makes sense that we're now getting these kinds of releases also from the Chinese model developers. More to come, I'm sure.

All right. And now getting back to the West, we've got some news from Google. They are adding a whole bunch of shopping features to AI Mode, AI Mode being their kind of full-on AI search in Google. They had a blog post titled 'Let AI Do the Hard Parts of Your Holiday Shopping,' which came out just earlier this week, and it goes over a bunch of things. You can shop conversationally, so it will do the search for you and show you products. It adds support to actually make calls for you and give you a summary at the end. And there's agentic checkout, which is starting to roll out.
So basically, the AI can buy stuff for you, which is similar to what we've seen OpenAI adding with some of these kinds of features, being able to integrate with Shopify. It seems to be a killer money generator that Google and ChatGPT and some of these other companies are betting on.

Nice. I guess this is just in time for Black Friday and also for the holidays, and we shall see how this works.

Yeah. Apparently, the agentic features are starting with some specific merchants like Wayfair, Chewy, Quince, and some Shopify sellers. So it's probably not full-on 'buy anything for you' yet; you're not going to be able to go to Facebook Marketplace and buy some used products. But, you know, if you have a lot of people to get Christmas gifts for, I could see people using it.

I don't know. Yeah, I wonder how this will affect adoption of AI by the actual merchants. I know Amazon had issues, was it with Perplexity, where they didn't want Perplexity to shop on Amazon. If you don't let your website be connected to these agentic checkout features and insist on using your own, I wonder how that's going to evolve.

Yeah, I think, given the rise of AI browsers and AI search, it seems like the natural next step: people are going to start finding what they want to buy via search, they're going to have agents in their browser or whatever, so why not just let the agent do the final bits of the work there?

Yeah. And now moving away from chatbots to world models. The story is Fei-Fei Li's World Labs has released Marble, their first commercial product. World Labs has been out of stealth for about a year now. A couple of months ago, they introduced a beta release of their work on world models, which essentially, at the time, meant that with a prompt you were able to generate a 3D environment to navigate around. And it was pretty impressive; you had relatively large worlds, quite detailed. Well, now, two months later, they are releasing an actual paid product where you can start doing a lot more. They're expanding the generation to be multimodal: in addition to text, you can use images and videos as your input. They also have some interesting ways to edit the environments with, for instance, coarse 3D settings, saying something like 'here's a cube, make it into a chair.' And they are introducing various tiers: a free tier, a pro tier, a max tier, lots of things. So it will be interesting. I think this is one of these things that isn't prevalent yet but seems like a no-brainer for things like video games and VR, perhaps even VFX for movies. So it's still a bit nascent, but definitely starting to mature.

Now, I guess, Andrey, you're in the video game world. How do you feel about the development of world models?

Yeah, it's, I think, generally related to 3D as a whole. Generative AI has had 2D imagery cracked for a while and is now starting to crack video, which is in a sense a world model. It's getting quite good at 3D models themselves, but that has still been a really tough case to crack. And similarly, 3D worlds are quite tricky because they're big; when you go to 3D, there's a lot of detail you need. So it's been a case of gradual progress for years now, and we're getting to a point where you're probably going to be able to start integrating it into actual professional processes. I don't think this is particularly useful for hobbyists necessarily, but for industry purposes, it's likely to start being useful.
Although I don't know exactly how people in these spaces do their work, I suppose.

And just one last story, also on OpenAI, kind of a quirky one. Sam Altman tweeted that they made it possible for you to tell ChatGPT not to use em dashes. Quite literally, the announcement was that if you put it in the custom instructions for ChatGPT, supposedly it will no longer use em dashes. And immediately after seeing that tweet, I saw someone else reply with an example of that not working. So it's kind of funny. And this made me think: why is this a problem? Why is the em dash a thing in these models? It's not like em dashes are statistically prevalent in the training corpus of online text. So why is this such a weird problem for these models to have?

Yeah, I don't know. I do think it's kind of nice to have this telltale sign of AI, because it's now pretty easy to tell when people are using AI to write things and don't even take out the em dashes. So I think it's going to get harder and harder to tell what's written by AI.

I would really love it if someone did a deep dive, a real investigation, into the root cause of the em dash writing style. Because you could have some very interesting hypotheses, like pretraining corpora of journalism use em dashes more often, so the high-quality data misleads you. I don't know. I'm actually very curious, now that I think about it, why it likes em dashes so much.

Yeah, maybe it's RLHF: people prefer text with em dashes, and then this will change once our opinions about em dashes also change.

Yeah, well, I think this is a good test for AGI. If you're told not to use em dashes and you still use em dashes, you're probably not at AGI level. Or maybe you actually can make the decision of when em dashes are appropriate. I guess as humans we can make mistakes too; if you're told not to use commas, you will still use commas. So maybe it's unfair. Maybe it's unfair, yeah.
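[Editor's note: to make the discussion concrete, the fix Altman described is just an instruction the model may or may not follow, so a belt-and-braces approach is to pair the instruction with a deterministic post-processing pass. A minimal sketch in Python; the instruction text and replacement rules are our own illustration, not OpenAI's implementation.]

```python
import re

# Instruction-level fix (what ChatGPT custom instructions amount to): ask nicely.
SYSTEM_INSTRUCTION = "Do not use em dashes in your responses."

def strip_em_dashes(text: str) -> str:
    """Deterministic cleanup for em dashes the model emits anyway."""
    text = re.sub(r"\s*\u2014\s*", ", ", text)  # em dash (U+2014) -> comma
    text = re.sub(r"\s*\u2013\s*", "-", text)   # en dash (U+2013) -> hyphen
    return text

if __name__ == "__main__":
    sample = "Models love em dashes\u2014they show up everywhere\u2014even when told not to."
    print(strip_em_dashes(sample))
    # Models love em dashes, they show up everywhere, even when told not to.
```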
On to applications and business. First up, we've got Anthropic announcing a $50 billion partnership to build data centers. This is a partnership with FluidStack, a UK-based 'neocloud' provider, and they'll be building data centers in the US, in Texas and New York. We've seen a ton of these deals popping up in recent months, primarily from OpenAI, all of them with crazy numbers attached, 50 billion, 200 billion, all these numbers. We haven't seen as much coming out of Anthropic, but this signals that they're very much also in this game. It seems everyone, all the frontier labs, which is Meta, Google, Anthropic, OpenAI, and just a couple of other players, is now at a point where they believe, at least, that they need to be investing to build all these data centers.

Yeah, but it does seem like the 50 billion is a lot smaller compared to some of the other deals, where Meta committed to building 600 billion dollars' worth of data centers over the next three years, and Stargate, which is OpenAI's partnership with SoftBank and Oracle, is planning 500 billion.

Yeah, and this is also Anthropic's first investment in custom infrastructure. They've been depending on Amazon and Google and other partnerships of this sort, so making the move to actually build their own data centers probably also signals a lot of confidence on their part. They are apparently projecting rising to $70 billion in revenue and even positive cash flow in 2028. So this is another signal of that confidence, I guess.

And speaking of hardware, we've got another story from Baidu. They have teased their next-generation AI training and inference accelerators. The chips are the M100, which is their inference-optimized chip, designed in part to enhance the performance of mixture-of-experts models. And as with NVIDIA chips and so on, they'll come in configurations of, like, 256 of them all at once that you can put in data centers. They are also announcing a training-optimized chip, the M300, which is in development and set to debut in 2027, with the intent to support multi-trillion-parameter model training. So this is quite notable. China is still locked out of using top-of-the-line hardware from NVIDIA, and geopolitically it's very unclear where things are headed. But I think at this point, more than ever, it's important for them to create this domestic capability to have hardware to compete, and this, to my knowledge, is the most promising effort on that front.

Yeah, it is interesting that they are specifically designed to enhance the performance of mixture-of-experts models, because I don't think that has been the main kind of architecture that other frontier labs are investing a lot in.

My guess is that this is influenced by recent trends, particularly over the past year: DeepSeek V3, R1, Qwen, basically, especially out of China, because you need to be more efficient in your inference. It seems to be the trend that all the models are MoE models, and that allows you to be much more efficient with your compute.

That's a really good point. Yeah. Also, interestingly, this article highlights that Jensen Huang apparently admitted last week that efforts to sell their Blackwell accelerators in China have stalled and there are no active discussions. So, essentially, it's an open market for someone to come in and provide something that NVIDIA is not providing there.

Yeah, I mean, if there's a threat of these GPUs being pulled away from the market in the future, I can see why they want to do everything homegrown.
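[Editor's note: for listeners less familiar with why mixture-of-experts matters for inference hardware: only a few expert sub-networks run for each token, so compute per token tracks the "active" parameter count rather than the total. A toy top-k routing sketch; the sizes and shapes are invented for illustration.]

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_EXPERTS, TOP_K = 64, 8, 2                    # hidden size, expert count, experts per token
router = rng.standard_normal((D, N_EXPERTS))      # learned router (random here)
experts = rng.standard_normal((N_EXPERTS, D, D))  # one weight matrix per expert

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts; only those experts do any compute."""
    logits = x @ router                           # (n_tokens, n_experts) routing scores
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(logits[t])[-TOP_K:]      # indices of the k highest-scoring experts
        w = np.exp(logits[t, top])
        w /= w.sum()                              # softmax over just the selected experts
        for weight, e in zip(w, top):
            out[t] += weight * (x[t] @ experts[e])
    return out

tokens = rng.standard_normal((4, D))
print(moe_layer(tokens).shape)  # (4, 64): capacity of 8 experts, per-token compute of only 2
```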
Next up, a bit of light drama with, I guess, big names and what they're doing as far as startups go, as we often have. This time it's Yann LeCun. There are seemingly rumors, I'm not entirely sure how confirmed this is, but the news is that he is planning to exit Meta and launch a new startup led by him. Yann LeCun, for reference, is a very big name in AI research, has been active for decades, and was the inventor of convolutional neural nets, which for a long time were the predominant way to do computer vision, for basically a decade. That was a big part of why deep learning took off over a decade ago. And he's been with Meta, I think, for something like a decade, and built out their entire AI research division, which was doing really advanced research; not quite as much as Google, but they put out a lot of papers. As we've covered over the past months, a lot has been going on over at Meta. They hired Alexandr Wang and established this whole superintelligence division, which is separate from what Yann LeCun is doing. He is the chief AI scientist; Alexandr Wang is the chief of something else, their superintelligence efforts.

So anyway, the news is perhaps unsurprising: LeCun is looking to exit and focus on what he thinks is more promising, which is world models, something that is not necessarily just LLMs, rather than trying to achieve superintelligence in the way that at least Zuckerberg and others think it needs to be done, which is more of the same.

Yeah, I mean, Yann is a researcher at the end of the day, and for many years he espoused deep learning when it was incredibly unpopular, and he ended up being right. So I don't think he is at all shy of also espousing perhaps a different unpopular opinion, of trying to really think about what the future of true intelligence would be. And I think it's great that he can go and explore this. I'm sure, given his background and his pedigree, he will be able to raise the money necessary. And I think it would be good for AI in general that researchers are investigating multiple different ways to improve AI.

Yeah, exactly. And I think there's maybe a perception that he is an LLM hater of some kind, but from what I've seen, he understands the value and strength of the technology. He has a somewhat nuanced take with regard to whether it will lead to AGI, depending on how we define AGI. The direction he wants to go in is working on models that learn from video and spatial understanding rather than just language, which makes a lot of sense. He often makes the point that modern AI isn't as intelligent as a cat, because cats, of course, can do very advanced, very intelligent things with regard to locomotion, spatial reasoning, vision, et cetera. So in that sense, it's true that LLMs are not truly generally intelligent. And whatever research he is hoping to do as a startup, I think, will be very exciting.

Moving on, and actually touching on something you mentioned earlier, Michelle: the story is that Amazon has demanded Perplexity stop letting its browser make purchases on Amazon. They are actually suing Perplexity AI, and they sent cease-and-desist letters. They are saying, essentially, that the agent in the Comet browser that Perplexity has launched should not be allowed to check out and purchase stuff on their site. And their case is, you know, they have a set of acceptable-use policies; they have conditions regarding data mining, robots, data gathering, and so on. So basically, the terms and conditions, according to them, make it so you're not allowed to do it. And I guess it makes sense for Amazon, in that they are also working on chatbots and bots to do shopping for you. They have Alexa; I'm not sure if anyone's using Alexa Plus, but it's a thing. So yeah, it will be interesting to see where this goes.

Yeah, I'm actually pretty surprised that they're blocking it. Maybe they were seeing degradation in user experience, but I kind of see this as only net positive for Amazon, if these agents are shopping on it, right? If they really believe that Amazon has the best products and the best prices, then they should welcome this new way of being able to shop. Though I do think, you know, Amazon asking Perplexity to stop and Perplexity not stopping is a big issue. Long term, allowing agents to shop on Amazon, if the technology works and the user experience is positive, should be positive for Amazon itself.

Yeah, I think the point being made in this Bloomberg article is something I hadn't thought about: that shopping agents could pose a threat to advertising on Amazon.
So it's not just about people buying stuff on Amazon. As you search for stuff, they have promoted items that you see with your eyeballs, while if the agent goes and buys stuff for you, you're no longer being exposed to that advertising. So in that sense, I suppose this could make a lot of sense.

Yeah. Though, in general, how advertisements and agents will interact on the web, I think it's an open question how people will monetize.

Moving on to a couple of stories about startup fundraising. First, we've got Gamma. It's a company that is aiming to kill PowerPoint, or at least create a product that lets you make much fancier presentations. They are hitting a $2.1 billion valuation; they say they have $100 million in annual recurring revenue and 70 million users, and they are announcing a Series B funding round of $68 million, with their product having launched in 2022. This is, I think, maybe one of the lesser-known big successful startups in the space. Everyone talks about Cursor, of course, and all the chatbot ones, but Gamma is one of these very practical things: ask a chatbot to make a very nice-looking presentation for you and then use it. I have actually tried it, and I found it pretty good. So it's one of these cases where this is definitely a kind of product where AI makes sense, and it can be lucrative and profitable, which cannot necessarily be said of vibe coding at this point.

Yeah, it's pretty impressive that they only have 50 people in the company and they've already reached $100 million ARR. And honestly, the amount they're raising for a Series B at $100 million ARR is not a lot; $68 million is a relatively small number, which probably means, especially with such a small team size, that they're very profitable right now.

Yeah, they're in a very powerful position where they probably don't need that much money, because they don't need that much manpower for this purely software product. Anyway, it's one of the definite successes of the past few years.

Yeah, definitely.

On to a slightly newer startup: we've got Inception. This is a startup that we covered some months ago. They had this very cool demo of a somewhat powerful diffusion model for coding, where instead of doing the standard left-to-right autoregressive generation, you could generate everything all at once. And it looked really cool and interesting. Well, they have now raised $50 million to do that, to build diffusion models for code and text. And this is led by, actually I didn't know this, Stefano Ermon from Stanford, which is really cool and, yeah, really exciting. If you look at how diffusion models generate text, it's very different from the way it works with autoregressive LLMs, and it could be very powerful if they crack the challenge of training and whatever else is making this hard.

It's interesting, though, that they are still only being integrated into tools like Proxy AI and BuildGlare, which are not development tools that I have used or my team uses. So I'm curious when they'll integrate with more popular tools like Cursor, and if so, I'm excited to try it.

Yeah, I think we're at a point where the models are fast but not necessarily as good with diffusion. So hopefully this money will allow them to do all sorts of experiments to figure out how to make them good in addition to being fast.
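[Editor's note: to make the autoregressive-versus-diffusion contrast concrete: an autoregressive model commits one token at a time, left to right, while a text diffusion model starts fully masked and fills in every position over a handful of parallel refinement steps. Below is a toy confidence-based unmasking loop in that spirit; the "denoiser" is a random stub, not Inception's model.]

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB, LENGTH, STEPS = 100, 16, 4
MASK = -1

def denoiser(seq: np.ndarray):
    """Stub for a trained denoiser: proposes a token and a confidence per position.
    A real model would condition on the already-committed tokens in `seq`."""
    return rng.integers(0, VOCAB, size=LENGTH), rng.random(LENGTH)

seq = np.full(LENGTH, MASK)                    # start from an all-masked sequence
for step in range(STEPS):
    tokens, conf = denoiser(seq)
    conf[seq != MASK] = -np.inf                # committed positions never change
    n_commit = int(np.ceil((seq == MASK).sum() / (STEPS - step)))
    best = np.argsort(conf)[-n_commit:]        # commit the most confident positions
    seq[best] = tokens[best]
    print(f"step {step}: {(seq != MASK).sum()}/{LENGTH} tokens committed")
print(seq)  # every position filled after STEPS parallel passes, not LENGTH serial ones
```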
Next, we've got Cursor raising $2.3 billion just a few months after having previously raised. This brings their valuation to $29.3 billion, and it comes after their Series C raise of $900 million back in June. This is an interesting case, I think, where you're a provider of a coding interface; it doesn't seem clear to me why you need this much money, unless your subscription service is operating at a loss. They do allow you to subscribe for $20 a month to get some amount of autocomplete and other LLM support.

Well, I think Cursor has been starting to train their own models. And the fact that they're also doing reinforcement learning and doing a lot of the training kind of online, improving the quality of Composer, I think it does make sense if they want to go head-to-head against OpenAI and Anthropic, especially because OpenAI and Anthropic are also releasing their own coding tools. For Cursor to rely only on external AI models is, I think, actually pretty dangerous for them and puts them at existential risk. So I think it's a really good idea that they are raising more so they can build their own models and continue to improve their training, along with some really cool things they've been building, like reinforcement learning directly in production and learning online.

Yeah, that's a good point. Somehow I forgot: just on the previous episode we covered the release of Cursor 2.0 and Composer, with Composer being their first fully in-house model. So in that sense, it makes a lot of sense. The other thing is, I'm fairly certain that Claude Code and Codex and a lot of these other tools are also operating at a loss. Anthropic is burning a lot of money letting people use their max $200-per-month plan, and it's actually very far from being profitable. So if Cursor wants to compete with Claude Code and the broader market of many players, this kind of war chest will definitely help.

Next, a couple of stories about self-driving cars, as we promised. A bit more on Baidu: we mentioned previously that their Apollo Go service is now in 22 cities. They are also saying that it's running 250,000 robotaxi rides a week, which is the same as Alphabet's Waymo, at least as of earlier this year. Apollo Go operates in Wuhan, Beijing, Shanghai, and Shenzhen, and is expanding to Hong Kong, Dubai, Abu Dhabi, and Switzerland. So major big cities, similar in a sense to Waymo, which is in San Francisco, Los Angeles, and Phoenix, and now also Austin and Atlanta. Exciting times for robotaxis, right? We love to see it. I haven't been able to find reports or any sort of first-hand impressions on the quality of a ride, or any comparison between Waymo and Baidu, but I wouldn't be surprised if Baidu has similarly cracked the problem of really reliable self-driving.

Yeah, it's really exciting that they're going multi-country already, or are planning to very soon. I don't know what the plans are for Waymo and Zoox and Tesla to go beyond the United States, so very cool to see the expansion.

And one more story on the self-driving front from China: Pony AI, another driverless tech firm, has raised $863 million in their Hong Kong listing. They sold 42 million shares, basically went public. And that's exciting, right? The whole space is maturing. Pony AI is one of the leaders in the space, and that's about all I have to say. We're getting to a point where these companies are IPOing and are at least promising to be profitable in the near-ish term.
Yeah, I think they're aiming for profitability in 2028 or 2029, which is coming up.

Yeah, pretty soon. I remember the hype we had for self-driving cars like a decade ago; back in 2015 we were being promised it's going to be here in a year or two. And it's finally here.

It's finally here.

On to projects and open source. We've got, I think, one really exciting release in the past week or two: Kimi K2 Thinking. Moonshot AI previously released Kimi K2, I think a couple of months ago, back in August, a very nice model, just an LLM, competitive with the other open-source models from China like Qwen. And now there's Kimi K2 Thinking, which is optimized for tool use and for coding on complex problems. It's also on the larger side: a total of 1 trillion parameters, with 32 billion active parameters per inference. It's really good. They say it is able to perform 200 to 300 sequential tool calls without intervention, and it gets really good numbers on benchmarks for tool use and coding like BrowseComp, LiveCodeBench, and SWE-bench Verified. Overall, it seems to be really good, and this while being very affordable, much cheaper than GPT-5 or even MiniMax M2.

My team actually tested this model in the past week and found a lot of its capabilities better than the other tool-use agents that we have used in the past or are currently using. So it's really exciting that not only is it beating all the other open-source models, it's even matching or surpassing the models from the large frontier labs.

Yeah, and this is while having a modified MIT license. An MIT license basically means you can do whatever you want with it. All right, don't come at us; we're waving our hands here. They do say that if the software or any derivative product serves over 100 million monthly active users or generates over 20 million USD per month in revenue, the deployer must prominently display 'Kimi K2' on the product's user interface, which is, I suppose, fair enough.

Actually, the Claude Agent SDK also has some of these requirements, where you have to actually say 'Claude agent.' But yeah, this is a very permissively licensed, very advanced model. You'll be able to fine-tune it on your own data, and I am excited to try it out too. It's not out on many of the serving platforms yet, but once it hits things like Groq, it might be a very intelligent and very fast model.
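[Editor's note: for context, "200 to 300 sequential tool calls without intervention" describes an agent loop: the model is called repeatedly, and each turn it either requests a tool (whose result gets appended to the conversation) or returns a final answer. A minimal sketch; the scripted model_step stands in for a real LLM call (e.g. Kimi K2 Thinking behind an API), and the tools are toys.]

```python
import json

TOOLS = {
    "add": lambda a: a["x"] + a["y"],
    "echo": lambda a: a["text"],
}

def model_step(messages):
    """Placeholder for an LLM call; returns a scripted trajectory for the demo."""
    script = [
        {"tool": "add", "args": {"x": 2, "y": 3}},
        {"tool": "echo", "args": {"text": "2 + 3 = 5"}},
        {"final": "The answer is 5."},
    ]
    n_turns = sum(m["role"] == "assistant" for m in messages)
    return script[n_turns]

messages = [{"role": "user", "content": "What is 2 + 3? Use your tools."}]
for _ in range(300):                       # cap on sequential tool calls
    action = model_step(messages)
    messages.append({"role": "assistant", "content": json.dumps(action)})
    if "final" in action:                  # the model decided it is done
        print(action["final"])
        break
    result = TOOLS[action["tool"]](action["args"])
    messages.append({"role": "tool", "content": json.dumps(result)})
```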
One of the definitions of AGI is being able to automate the majority of economically valuable work. This kind of index basically tracks whether we're at AGI or not in some sense. I think, though, the agents do keep improving. and I'm curious how this is going to change, like even in six months from now, when new agents are released, how good they will be with you. I think the inability right now of agents to have context, like long time horizon context, probably is the reason why remote work is still going to be really difficult because most remote work can span days, if not weeks. And so if an agent cannot keep track of all the new updates and the work that comes in, the feedback that comes in, it's going to be really hard to scale. But again, these are things that keep improving. Yeah, I think in this, they are limited still in this evaluation with regards to things like teamwork, for example. but digging into the details a little bit they take a lot of the tasks from primarily Upwork which is an actual freelance service and they have a pretty wide variety of categories of work from the Upwork taxonomy so there's video work, graphic design, game development, audio, product design and a bunch of other ones. And they actually sourced the projects from Upwork and some other ones. They also recruited 358 freelancers with verified Upwork accounts and specializations and used these freelancers to collect the projects to basically take their work samples and then use that as the tasks that they need to fulfill. So essentially, these are about as real as you can get for a benchmark, as far as I can tell. Even going beyond SBE Bench Verified, this is actual work that humans were paid to do. And so it will be very interesting to see how rapidly AI agents are going to be able to improve, I think, on some of these tasks. Likely, we've got maybe a product design. We might see rapid improvement. some other things like CAD development, architecture. I don't know. It'll be interesting to see where this goes. Yeah. And it's a fun paper to scroll through because they give a bunch of examples of where AI succeed and AI failed. And the failures are pretty funny, like creating like a 2D design, educating viewers on things. And then they're just like spilling mistakes, you know, in the same, like in the ways that ai generated words and art oftentimes has spelling mistakes and also there's a lot of like creating 3d products where the ai just creates like completely jank looking 3d products but the diamond ring is just like a ring drawn out in some cad software and like another like another like oval shaped thing on top of it that's supposed to represent the diamond Yeah, pro tip, if you open up a paper, go to section C6 in Appendix, page 27. A lot of these screenshots. And on a note, actually, just to give a couple actual concrete examples of projects, there's, for instance, a data visualization project where a brief is build an interactive dashboard for exploring data from the World Happiness Reports. the requirements to use data from a provided Excel, provide an overview map, detail score breakdown. They have another task for an automated video, create a 2D automated video advertising the offerings of a tree services company, has to use a provided voice over file, flat design, no subtitles. And they actually get examples of the human deliverable for these project briefs, which you can see. 
So, you know, you can do a straight side-by-side comparison with what a solid deliverable would be if you were hiring someone.

Next, we've got some research from OpenAI regarding interpretability with sparse transformers. We've covered interpretability a decent amount this year. There's been a trend going on for a while now where we've gotten to a point where you can find groups of neurons that activate together to represent a concept of some kind. The way this has been done is to take a bunch of outputs of a neural network that has already been trained and compress those outputs to find a smaller space. And in that smaller space, you're able to find things like 'comma' or 'Golden Gate Bridge' or 'sarcasm,' et cetera. And there's been quite a lot of progress on that front, to a point where you're able to control models; you can make Claude obsessed with the Golden Gate Bridge, et cetera. So that approach has been quite successful.

This paper is doing something a little bit different. Basically, the argument is: what if, instead of taking a dense transformer and then trying to find a sparse representation with a sparse autoencoder, to then map back onto the initial set of weights, you try to just train a sparse transformer in the first place, and then, within that sparse transformer, find these combinations of units, what they call circuits, to find out how things work? That's the gist of the paper. And as you might expect, they show some early results on that front. They show that if you train a sparse transformer, which means you have to train an entire new model from scratch, in a way that's different from how GPT-5, for instance, was trained, you're then able to find a path through the neural net, more or less, that explains how you get to, I don't know, parentheses or quotes or whatever. So it has some advantages, in that it's arguably simpler to find these circuits; it's more directly interpretable what is going on within the network, just because you have fewer units that are active, so it's a little simpler. But on the other hand, you need to train an entire sparse transformer from scratch, as opposed to training something on top of an existing model to explain its behavior. So that's the main challenge here.

And then, sparse models are just a lot more inefficient to train and deploy. I believe in this paper the models were around the size of GPT-2, and the paper itself acknowledges that these models are extremely inefficient to train and deploy and are unlikely to ever reach frontier capabilities. They do have a section in the paper with preliminary results on using bridges at each layer to use this kind of weight-sparse training to better understand existing dense models; you can couple a weight-sparse model, bridged, with a dense model. So that's still a new area of research with very preliminary results. And I think this paper is just suggesting a new way of interpreting these weight-sparse models, and hopefully it gives us scientifically valuable understanding of how these models work mechanistically.

Yeah, I think this is very much a research paper where it's an early idea; there are some initial results that aren't necessarily usable in practice, and they have a Section 5 on limitations and future work, which takes up most of a page. But that also makes it exciting, because we've seen some very impressive progress with sparse autoencoders, and this could potentially work alongside that and help us gain more fundamental insights. Could be cool.
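[Editor's note: the core trick, as described, is keeping most weights exactly zero during training so the surviving connections form small, readable circuits. A minimal sketch of one way to do that, re-applying a magnitude-based mask after every update; this is our simplification, not OpenAI's training code.]

```python
import numpy as np

rng = np.random.default_rng(2)

def sparsify(w: np.ndarray, keep_frac: float) -> np.ndarray:
    """Zero out all but the largest-magnitude fraction of weights."""
    k = max(1, int(keep_frac * w.size))                     # number of weights to keep
    flat = np.abs(w).ravel()
    threshold = np.partition(flat, w.size - k)[w.size - k]  # k-th largest magnitude
    return np.where(np.abs(w) >= threshold, w, 0.0)

# Toy linear layer trained under a hard sparsity constraint.
w = rng.standard_normal((32, 32)) * 0.1
x = rng.standard_normal((128, 32))
y = x @ (rng.standard_normal((32, 32)) * 0.05)   # synthetic regression target

for step in range(200):
    grad = x.T @ (x @ w - y) / len(x)            # mean-squared-error gradient
    w -= 0.01 * grad
    w = sparsify(w, keep_frac=0.05)              # only 5% of weights may be nonzero

print(f"nonzero weights: {(w != 0).mean():.1%}")  # ~5%: few enough edges to trace by hand
```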
Next, we've got a new paper about a potential alternative to the standard transformer architecture. This is actually from the Kimi team, but it's not Kimi K2 Thinking; it's a research paper on how to train more efficient models. The paper is 'Kimi Linear: An Expressive, Efficient Attention Architecture.' There have been years and years of work on efficient attention. The standard attention formulation, going back to 2017 with the first transformer paper, has famously quadratic scaling, where everything attends to everything, so it's very costly as contexts get big. There have been many ways to approximate full attention, many ways to make it have linear complexity, and the paper basically says all these works haven't managed to be quite good enough to replace true full attention. Here, the team is saying that, for the first time, a linear attention architecture is able to outperform full attention in their tests, including RL scaling benchmarks, under identical training budgets. This would mean that you're able to get similarly good performance at much, much higher efficiency: they reduce memory usage by up to 75% and get six-times-faster decoding at 1 million tokens. This is at a relatively large model scale; they have a 48-billion-parameter mixture-of-experts model with 3 billion active parameters per forward pass. Not huge; this is still on the smaller scale of models. So there is a question of whether, if you scale up to a trillion parameters with tens of billions of active parameters, this will still work as well. But if we're truly able to get a more efficient architecture that scales this well, that would be very exciting.

Yeah. I wonder if Kimi K2 Thinking is built on Kimi Linear.

Yeah, I'm pretty sure that's not the case. I think this just happened to come out around the same time, because this is still research, exactly. I might be wrong, so please do correct me if I'm incorrect. If we try to get into the details, by the way, this is just going to be impossible; I'm going to totally ruin it. They introduce Kimi Delta Attention, which is a gated linear attention variant, and it gets very jargony very quickly. There's a lot of math, a lot of algorithmic detail here. In a sense, this relates to things we've discussed a lot last year, like Mamba.

Yeah, there's been a lot of work on recurrent models as well that are promising. And it looks like it's also an RNN-like architecture; they are able to use a linear, RNN-like architecture but still get efficacy similar to transformer models.

Right. And similarly to other successful efforts, this is a hybrid model: they combine full attention with this kind of linear attention, and in that combination, they have a 3-to-1 ratio of delta attention layers to full attention layers, which seems optimal for balancing quality and speed. It's been an open question in my mind for years whether we'll transcend the standard transformer architecture, which really hasn't changed dramatically since GPT-3. And it could be that we're getting there after many years of research.
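[Editor's note: to picture the hybrid design: most layers use the linear-attention variant, whose per-token state stays constant-size in sequence length, and every fourth layer is ordinary full attention. A schematic of the reported 3-to-1 interleaving; the layer internals are omitted, the layout is the point.]

```python
N_LAYERS = 24
RATIO = 3  # linear-attention layers per full-attention layer

def build_stack(n_layers: int) -> list[str]:
    stack = []
    for i in range(n_layers):
        if (i + 1) % (RATIO + 1) == 0:
            stack.append("full_attention")    # softmax attention: O(n^2), grows the KV cache
        else:
            stack.append("linear_attention")  # e.g. gated delta rule: O(n), constant-size state
    return stack

stack = build_stack(N_LAYERS)
print(stack[:4])  # three linear-attention layers, then one full-attention layer, repeating
n_full = stack.count("full_attention")
print(f"{n_full}/{N_LAYERS} layers keep a KV cache, roughly "
      f"{n_full / N_LAYERS:.0%} of a pure full-attention stack's cache")
```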
Just a couple more stories. The next one is not a full-on paper, but a fun kind of announcement from DeepMind. They have announced, and showcased a lot of videos of, a new agent called SIMA 2, their general-purpose game-playing AI. They have a very long line of research on training agents within simulated environments. They released SIMA, I think, a couple of years ago, and it operates within a variety of actual games you can buy and play, such as No Man's Sky and, I believe, Minecraft as well, a whole bunch of them. SIMA, the Scalable Instructable Multiworld Agent, we first saw in March 2024; that was the first time they had this agent that operated within a whole bunch of game worlds and could take a text prompt and go off and try to do some stuff. Now they have SIMA 2, which is the same idea: you give it a text prompt, and the agent goes off and tries to do it in whatever world it happens to be in, whether that's Goat Simulator or No Man's Sky or Minecraft, or, interestingly for me, they have an example with Genie 3, where it's an entirely new world created by their generative world model Genie, and they have this actual, kind of AGI-ish, agent operating in that world. We were talking a bunch about world models earlier; well, this is a world agent, I suppose you could say, that actually interacts with the world, has to navigate it, has to do things like jumping or searching. You really have to go and look at some videos to understand. One of the key things they said is there's a lot more reasoning going on, so the agent internally reasons through a set of actions it needs to take; it's more agentic, broadly. Anyway, there's no full paper, so I just read their blog post and saw a bunch of cool videos.

I think it's really interesting that Jane Wang, who's one of their senior staff research scientists at DeepMind, said that they're really training this so that it becomes a training ground for potentially transferring the skills to real-world environments one day. So if we want truly physical AI, like training these AI agents to work in the physical world and actually be embodied one day, then perhaps video games can serve, which is not a new concept, right? In robotics, we have the idea of training in simulation and doing sim-to-real, but now we can also actually train in video games. And if you train on enough data, can you then do sim-to-real for this physical AI agent so it can interact in the real world? I think these are all very interesting ideas on how we can use world models and agents in 3D environments to actually get to embodied, physical AI.

Exactly. And this really is something that DeepMind has been investing in, basically from the start: learning within games and trying to generalize more and more to open-ended environments and do reinforcement learning. And they are making substantial progress. If you look at these videos, these agents are being told to go and find a beehive, or go and build a house in Minecraft, or whatever, and then they go off and do it. They're kind of like an LLM that can actually walk around and do stuff in a simulated 3D environment, which is not something you'll find from GPT-5 or Claude or any of them.

One last story, not a research paper, but a meta story about research: arXiv is changing its rules after getting spammed with AI-generated research papers. The new rule is that arXiv will no longer accept computer science review articles and position papers. Review articles are basically summary papers: you summarize the state of a given field and call out all the relevant papers. If you do research, it's quite useful for keeping up with what's going on in a subfield.
And position papers are what they sound like: you set forth a position rather than a result. And I am, I guess, not surprised that arXiv, which, by the way, is this open platform where anyone can post papers and browse them, the standard place to post computer science and AI papers, is now getting spammed with low-effort stuff.

Yeah. Well, you know, with how good LLMs are these days, arguably we don't need review papers anymore, right? Like, why read a review paper if you can just ask ChatGPT to create your own review paper every time you're surveying a field? I think that's another reason for this.

Yeah. It's just not as necessary anymore. arXiv posted a pretty lengthy article explaining this. It's actually technically not a change in rules; they have ways for your review article or position paper to still go up on arXiv with some review. So it's not a huge deal, not some dramatic tragedy or anything. But it's an interesting example of where we're at in the world: arXiv, this niche thing for researchers, is now getting spammed with AI-generated papers.

On to policy and safety. First up, on the legal front, we have Stability AI largely winning a UK court battle against Getty Images over copyright and trademark. Britain's High Court has ruled for Getty on trademark infringement, specifically for Stability AI images bearing Getty's watermark, but dismissed the broader copyright infringement claims. This essentially dismisses a major part of what Getty had accused Stability of. Justice Joanna Smith concluded that Stability AI did not infringe copyright because Stable Diffusion does not store or reproduce copyrighted works, which speaks to a lot of the legal questions about generative AI, and especially, I guess, text-to-image AI, where technically the model isn't reproducing or storing any image, but it is being trained on copyrighted works without permission. So it's very interesting to finally see some results on this front. These kinds of lawsuits have been ongoing for years, and I think it's still, to this day, a fairly open question what will shake out with regard to copyright and training AI models. But it's looking more and more like the fair-use argument, that you can just train on anything, might win out.

I'm just surprised it's Stability AI; I haven't heard many releases from them, honestly, in many years now.

Yeah, they've definitely gone quiet. They used to be all over; in 2023 they were one of the first big providers of open models, back in the day. Stable Diffusion used to be a big deal; now, not so much. But they're still around, and, I guess, good for them; they're not going to have to pay out too much to Getty, perhaps.

Another article on the copyright front, this one about OpenAI and a court in Germany saying that OpenAI has violated German copyright law. This is a lawsuit filed by GEMA, a German collective that manages music rights, and the court has ruled in favor of this organization and told OpenAI to pay an undisclosed amount in damages. So this is another significant case, I guess, in Europe and Germany, where this is kind of the opposite outcome, right? Here, copyright infringement was found for OpenAI training on an organization's work. Apparently, this society manages the rights of composers, lyricists, and music publishers, and has approximately 100,000 members.
The lawsuit was filed back in November of 2024, and it's specifically about lyrics that ChatGPT has learned from. So, yeah, it's one of these very open questions of whether this is actually an issue, and in this case the finding was in favor of the songwriters.

Well, it's interesting because it's lyrics, not the full songs. But even then, because they trained on these lyrics, OpenAI now has to pay. And lyrics are... They are copyrighted, so... They are copyrighted, yeah. Anyway, it's still a whole big mess as to what the actual law and precedent will be with regard to copyright. We are finally starting to get some initial legal results, and it looks like they're going in different directions in different countries.

And one story not on copyright but on the more geopolitical front: Microsoft is investing $15.2 billion in building out hardware capacity in the UAE over the next four years. They'll be shipping advanced NVIDIA GPUs to the UAE, and the U.S. has granted Microsoft a license to do that, apparently one of the first cases where such a license has been granted. So the U.S. is forming a tighter relationship with the UAE, with investments at this kind of scale and with American AI hardware in the country. As we've covered before, this has been a bit of a touchy area, with certain organizations in the UAE having ties to China. There's complex geopolitics going on here, but the direction seems to be that the U.S. is getting closer to the UAE.

Which, I mean, in many ways we've seen the Middle East being a source of funding for a lot of the major AI companies, or at least being limited partners in the VCs and funds that end up investing in the major AI companies. So it's interesting that the U.S. is now also investing back into the UAE, and that part of the Middle East, making the country a key player in U.S. AI diplomacy.

Yeah, and I think that's a good thing to call out. The UAE in general is trying to position itself as a key player in the space. For some very basic context, this is the United Arab Emirates, a very rich country due to oil. The country has a whole bunch of money, and it's been trying for quite a while to diversify and not depend entirely on oil; part of that strategy has been to invest very heavily in AI. And with Microsoft's work here, they're saying they're not just going to build data centers; they're also going to pair the infrastructure with investment in local talent, training, and governance, trying to make Abu Dhabi a regional hub for AI research and model development. And we've already seen some of that, with some models coming out of organizations there. So it's certainly an ambitious effort by the UAE, and it seems to be having at least some success.

On to the last section, synthetic media and art. We've got just a couple of stories here, both dealing with AI-generated songs that are starting to hit some charts. The first is an AI-generated country song topping a Billboard chart: Walk My Walk, by the virtual artist Breaking Rust, has topped the Billboard Country Digital Song Sales chart. Now, I did see some people comment that these charts are very specific and kind of gameable; I don't know the full details of that, but the song has gained significant traction, with 1.8 million monthly listeners on Spotify. And Michelle, we were just chatting about this. You took a listen, and you think it's actually an enjoyable song.
Before this recording, I was listening to some AI-generated songs, and I have to say, most of them are pretty bad. Or maybe they're just in genres I don't normally listen to. But Walk My Walk is actually a banger. It's actually quite good.

Yeah, I think it's a very traditional-sounding song. It's not very complex, but its quality is representative of the quality of text-to-song in general, where the artifacts, all the little bits of noise and crunchiness and other audio weirdness, are increasingly gone, such that you're not necessarily going to be able to tell whether something is an AI song, and you are going to be able to create some good-sounding music. You just need to pick the right type of music with the right prompting and so on. So it's kind of unsurprising that you can get something good-sounding. Spotify also famously has a lot of ambient music and other electronic music that appears to be made with AI, and a lot of playlists now are dominated by AI songs. There's a spam problem, where people upload a whole bunch of AI songs to Spotify to get some revenue, to the point that Spotify has had to start cracking down on it. So, as you might expect, this is a bit of a big deal, because the song is charting on some measure of commercial success. And you can see various takes on it; most people commenting online are not happy and are not fans of the idea of AI music, which I think is a complex topic. I don't necessarily dislike AI music, but I think it should be used in a specific way that isn't, let's say, replacing traditional music.

I think it's just another tool, personally. In many ways it's a tool, and in many ways it's also like a new genre in and of itself, and I think we should be treating it that way. When new genres have emerged, especially ones that lean heavily on technology, they've always been slammed for somehow not being creative or for using technology too much. But you could see this as just another technical advance. Of course, there are a lot of differences here, because it's not like techno, which was a completely different genre; with AI you can go into an existing genre and compete with other people, with real artists, in that genre.

Yeah, I think for the particular aspect of the story where this is charting on Billboard and getting a lot of listens on Spotify, you could make a definite case that this part of it, aside from the artistic merit, is concerning and might be the bigger issue, regardless of whether it's good music or not. Should AI-generated music be competing with people who are trying to make a living as musicians? That kind of is the key question.

And the next story actually also deals with that. We've got an AI singer with the name Xania Monet, who has become the first AI artist to earn enough radio airplay to debut on the Billboard radio chart. So apparently this AI artist, which is of course being run by an actual human being, has signed a multi-million-dollar record deal with Hallwood Media. It was created by the poet Talisha Nikki Jones, who is using Suno to make music in the genres of gospel and R&B and has already released a full-length album and an EP. There's definitely a question of whether this will become a normal thing or not. Is it a flash in the pan that will go away?
Or are there going to be more of these people creating AI musicians behind the scenes? I think so. And I think it's telling, again, that the creator is herself a poet and an artist. Maybe the argument is that in the future anyone can do this, and you'll no longer have to be an artist yourself. But right now it does seem like a tool that artists can use to create their own art.

Yeah. There have been examples of bands built around artificial or fictional characters, like Gorillaz, and there's Hatsune Miku. There are cases of real people behind the scenes creating an artist persona that they operate through, and that's been an interesting kind of example, though obviously not the norm. With AI, that could become much more of a normal thing, where the artist is a person creating a persona, a character. And this artist has an Instagram, has a face, has music videos attached to it. You could write a lot of humanities papers about this. Or, I think you could. You could use AI to write a lot of papers about this. Yeah, and you can make some really good video essays on YouTube discussing the implications of it, and so on. AI all the way down.

Well, we actually do have one last story on text-to-audio, but this time it's about text-to-voice. ElevenLabs now has a marketplace that lets brands use famous voices for ads. It's called the Iconic Voice Marketplace, and basically, if you're someone famous, you can provide your consent and formal licensing terms for companies, presumably for marketing, to use your voice. For example, it seems that Michael Caine has supported the initiative, and Michael Caine, of course, is a famous actor with a very notable voice. Other offerings on the marketplace include Judy Garland and Alan Turing. So I guess you could also do some historical characters. There are actually a lot of historical characters; I'm looking at this list: Mark Twain, Thomas Edison.

Yeah, I'm not sure how you could get the voice of Mark Twain. Yeah, I don't know. I assume they're referencing historical archival audio, and they've probably gotten permission from the estates of these people. But yeah, it's kind of weird. I think most of them are actually historical figures. Right. Michael Caine is one of the few living celebrities to lend his voice, so all the other voices would come through their estates, the people who hold the rights to allow this. And maybe that's why they started with historical voices: the estates are maybe more friendly to the idea than living actors are. But it's certainly better than illegally cloning Michael Caine's voice and using it without permission. So I wouldn't be surprised if this becomes a major marketplace, actually.

Well, that is it for this episode. A whole bunch of smaller, non-crazy news; it was quite a lot of fun to discuss. Thank you, Michelle, for guest hosting. Thanks for having me. And if you leave reviews on Apple Podcasts or somewhere else, I will definitely keep an eye out. But more than anything, we appreciate you listening, so be sure to keep tuning in.

When the AI news begins, begins, it's time to break, break it down.
Last week in AI, come and take a ride.
Get the lowdown on tech and let it slide.
Last week in AI, come and take a ride.
From the labs to the streets, AI's reaching high.
New tech emerging, watching surgeons fly.
From the labs to the streets, AI's reaching high.
Algorithms shaping what the future sees.
Tune in, tune in, get the latest with ease.
Last week in AI, come and take a ride.
Get the lowdown on tech and let it slide.
Last week in AI, come and take a ride.
From the labs to the streets, AI's reaching high.
From neural nets to robots, the headlines pop.
Data-driven dreams, they just don't stop.
Every breakthrough, every code unwritten.
On the edge of change, with excitement we're smitten.
From machine learning marvels to coding kings.
Futures unfolding, see what it brings.
