Last Week in AI

#213 - Midjourney video, Gemini 2.5 Flash-Lite, LiveCodeBench Pro

Last Week in AI • Andrey Kurenkov & Jacky Liang

Thursday, June 26, 2025 • 36m

What You'll Learn

  • Midjourney launched its first AI video generation model, V1, which turns images and text prompts into five-second clips extendable to 21 seconds, with plans starting at $10/month.
  • Google updated its Gemini family with a stable Gemini 2.5 Pro and the new Gemini 2.5 Flash-Lite, a high-efficiency model aimed at cost-effective AI workloads.
  • YouTube plans to integrate Google's Veo 3 video generation model into Shorts later this summer, a move that could turbocharge AI content on the platform.
  • The OpenAI Files website compiles information critical of OpenAI, including details about Sam Altman's investments and statements from people who have left the company.
  • OpenAI, and reportedly Google, have dropped collaborations with Scale AI following Meta's hiring of Scale AI CEO Alexandr Wang to lead its superintelligence effort.
  • Amazon-owned Zoox has opened a new production facility in Hayward, in the Bay Area, to manufacture its robotaxis, with plans to produce 10,000 units annually.

Episode Chapters

1. Tools and Apps

Discusses Midjourney's new video generation model and Google's updates to its Gemini AI family.

2. Applications and Business

Covers the OpenAI Files website, OpenAI's decision to drop Scale AI, and Zoox's new production facility.

3. Projects and Open Source

Covers two new benchmarks, LiveCodeBench Pro and AbstentionBench, along with the open-weight MiniMax-M1 reasoning model.

4. Interpretability and Safety

Covers universal jailbreak suffixes that hijack model attention, OpenAI's research on emergent misalignment and model "personas", and OpenAI's $200 million U.S. defense contract.

AI Summary

This episode of the Last Week in AI podcast covers several recent developments in the AI industry, including Midjourney's launch of its first AI video generation model, Google's updates to its Gemini family with more efficient models, and YouTube's plans to integrate Google's Veo 3 video generation model into Shorts. The episode also discusses the OpenAI Files website, which compiles information critical of OpenAI, and OpenAI's decision to drop Scale AI as a data provider following Meta's hiring of Scale AI's CEO. Finally, the episode covers new benchmarks for reasoning models (LiveCodeBench Pro and AbstentionBench), Waymo's research on scaling laws for motion forecasting, safety work on jailbreaks and emergent misalignment, and Zoox, the Amazon-owned self-driving company that has opened a new production facility for its robotaxis.

Key Points

  1. Midjourney launched its first AI video generation model, V1, which turns images and text prompts into five-second clips extendable to 21 seconds, with plans starting at $10/month.
  2. Google updated its Gemini family with a stable Gemini 2.5 Pro and the new Gemini 2.5 Flash-Lite, a high-efficiency model aimed at cost-effective AI workloads.
  3. YouTube plans to integrate Google's Veo 3 video generation model into Shorts later this summer, a move that could turbocharge AI content on the platform.
  4. The OpenAI Files website compiles information critical of OpenAI, including details about Sam Altman's investments and statements from people who have left the company.
  5. OpenAI, and reportedly Google, have dropped collaborations with Scale AI following Meta's hiring of Scale AI CEO Alexandr Wang to lead its superintelligence effort.
  6. Amazon-owned Zoox has opened a new production facility in Hayward, in the Bay Area, to manufacture its robotaxis, with plans to produce 10,000 units annually.

Topics Discussed

#Video generation #AI model efficiency #AI safety and oversight #Self-driving cars

Frequently Asked Questions

What is "#213 - Midjourney video, Gemini 2.5 Flash-Lite, LiveCodeBench Pro" about?

This episode of the Last Week in AI podcast covers several recent developments in the AI industry, including Midjourney's launch of its first AI video generation model, Google's updates to its Gemini family with more efficient models, and YouTube's plans to integrate Google's Veo 3 video generation model into Shorts. The episode also discusses the OpenAI Files website, which compiles information critical of OpenAI, and OpenAI's decision to drop Scale AI as a data provider following Meta's hiring of Scale AI's CEO. Finally, the episode covers new benchmarks for reasoning models (LiveCodeBench Pro and AbstentionBench), Waymo's research on scaling laws for motion forecasting, safety work on jailbreaks and emergent misalignment, and Zoox, the Amazon-owned self-driving company that has opened a new production facility for its robotaxis.

What topics are discussed in this episode?

This episode covers the following topics: Video generation, AI model efficiency, AI safety and oversight, Self-driving cars.

What is key insight #1 from this episode?

Midjourney launched its first AI video generation model, V1, which turns images and text prompts into five-second clips extendable to 21 seconds, with plans starting at $10/month.

What is key insight #2 from this episode?

Google updated its Gemini family with a stable Gemini 2.5 Pro and the new Gemini 2.5 Flash-Lite, a high-efficiency model aimed at cost-effective AI workloads.

What is key insight #3 from this episode?

YouTube plans to integrate Google's Veo 3 video generation model into Shorts later this summer, a move that could turbocharge AI content on the platform.

What is key insight #4 from this episode?

The OpenAI Files website compiles information critical of OpenAI, including details about Sam Altman's investments and statements from people who have left the company.

Who should listen to this episode?

This episode is recommended for anyone interested in video generation, AI model efficiency, AI safety and oversight, or self-driving cars, and for anyone who wants to stay updated on the latest developments in AI and technology.

Episode Description

Our 213th episode with a summary and discussion of last week's big AI news! Recorded on 06/21/2025.

Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai. Read our text newsletter and comment on the podcast at https://lastweekin.ai/.

In this episode:

  • Midjourney launches its first AI video generation model, moving from text-to-image to video with a subscription model offering up to 21-second clips, highlighting the affordability and growing capabilities in AI video generation.
  • Google's Gemini AI family updates include high-efficiency models for cost-effective workloads, and new enhancements in Google's search function now allow for voice interactions.
  • The introduction of two new benchmarks, LiveCodeBench Pro and AbstentionBench, aiming to test and improve the problem-solving and abstention capabilities of reasoning models, revealing current limitations.
  • OpenAI wins a $200 million US defense contract to support various aspects of the Department of Defense, reflecting growing collaborations between tech companies and government for AI applications.

Timestamps + Links:

(00:00:10) Intro / Banter
(00:01:32) News Preview

Tools & Apps
(00:02:12) Midjourney launches its first AI video generation model, V1
(00:05:52) Google’s Gemini AI family updated with stable 2.5 Pro, super-efficient 2.5 Flash-Lite
(00:07:59) Google’s AI Mode can now have back-and-forth voice conversations
(00:10:13) YouTube to Add Google’s Veo 3 to Shorts in Move That Could Turbocharge AI on the Video Platform

Applications & Business
(00:11:10) The ‘OpenAI Files’ will help you understand how Sam Altman’s company works
(00:12:29) OpenAI drops Scale AI as a data provider following Meta deal
(00:13:28) Amazon’s Zoox opens its first major robotaxi production facility

Projects & Open Source
(00:15:20) LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
(00:19:45) AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions
(00:22:49) MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Research & Advancements
(00:24:33) Scaling Laws of Motion Forecasting and Planning -- A Technical Report

Policy & Safety
(00:28:07) Universal Jailbreak Suffixes Are Strong Attention Hijackers
(00:30:52) OpenAI found features in AI models that correspond to different ‘personas’
(00:33:25) OpenAI wins $200 million U.S. defense contract

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Full Transcript

Hello and welcome to the Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual, in this episode we will summarize and discuss some of last week's most interesting AI news. You can go to the episode description for the links and timestamps for all those stories. I'm one of your regular hosts, Andrey Kurenkov. I studied AI in grad school and now work at a generative AI startup. And this week, Jeremie is traveling, so we have a guest co-host once again, Daniel Bashir.

Hey, yes, I am one of your irregular hosts, Daniel Bashir. I studied CS and math and philosophy in college. After that, I went on to do ML engineering, spent a little bit of time doing ML compilers as a thing that I thought would be fun, and now I'm back to doing ML engineering.

And you have quite a bit of background in podcasting as someone who ran The Gradient podcast for quite a while and interviewed many people in AI. Thank you for the shout-out. Yeah, it's a very fun hobby. For any listeners, you should look up that podcast; there are lots of interesting conversations that Daniel has recorded over the last few years. Yeah, it's been a couple of years now.

Well, this episode will be a bit shorter; there just wasn't a ton happening this past week. So a quick preview: in tools and apps, we've got a couple of small things; the only major thing is really video generation from Midjourney, which is pretty exciting. In applications and business, nothing that huge, just a couple of updates. In projects and open source, we'll mostly be talking about new benchmarks. And then we'll mostly get into some interpretability and safety things for the rest. So compared to our usual two-hour episodes, this one should be a pretty brisk listen.

And we can go ahead and start in tools and apps. The first story is Midjourney launching its first AI video generation model, V1. Midjourney is one of the OG text-to-image generation providers. They were, for quite a while, one of the leaders in the space back when you had to go to Discord and use their bot, which a lot of people did. Now they have their V7 or something text-to-image model, but this is their first video generation model. You can now use it on their website: you can subscribe, I think for $10 per month, to get the basic plan, and you can then provide images and text to get five-second animations of your image with some prompt. You can also extend videos to go up to 21 seconds.

So, yeah, exciting news. Midjourney is a leader in text-to-image generation, so, unsurprisingly, the videos it generates seem pretty solid. And it's also pretty affordable; it's roughly eight times the cost of image generation.

Yeah, that's been really nice to see. To me, looking at these video models in the past, even when they were starting to get good, the cost seemed quite prohibitively expensive, at least if you wanted to use them at a large enough scale. Unsurprisingly, though, we're seeing a lot of work on inference optimization, very smart things people are doing that are driving down the cost of this a lot. And I think we'll see that in the next story, too.

Exactly. I've played around with it a little bit. There's no strong benchmark to compare against, but I'd be surprised if they managed to be as good as Veo 3 from Google, and they don't have the audio aspect of Veo 3. I just think Google threw a lot of resources at it and seemed to really nail it with Veo 3.
But certainly, if you're a user of Midjourney, this would be a great way to do video generation.

Yeah. I'm almost, or I will feel, a little bit sad when everything gets super realistic, because I still feel like we're in this very funny phase of people creating the craziest AI slop you've ever seen. Something popped up on X yesterday that was like a Korean AI slop video of Donald Trump and Elon Musk making an anti-American sandwich that looked like a cooking show. It was very surreal, and, you know, just the kind of thing that's clearly not realistic, but realistic enough to be funny. I like this phase we're in, and I feel like I'm going to miss it a little bit.

Yeah, my impression of video generation is that it's been kind of a hobbyist thing, right? You make little memes or funny things with it. There will come a point where people start using it for commercials and things that we have seen a lot of, right, that have been done without AI. But there's a lot of just ridiculousness that you can get up to with video models, even more so than with image models. And I feel like the ridiculousness will stay even as the quality improves.

Probably, yeah. If you're listening to this and, you know, you feel so compelled, you can help make the world a little bit better by creating AI slop videos.

On to another story, again on efficiency and models: Google's Gemini AI family has been updated with a couple of new models. You may have heard about the release of Gemini 2.5 Pro, which has exited its preview phase; now it's available for developers to build on. And in addition to that, they've got Gemini 2.5 Flash-Lite, which is a high-efficiency model that's still in preview, designed for cost-effective AI workloads. This is, again, not anything new. If you've been following Anthropic, of course, they have Opus as well as Sonnet, which is much more high-efficiency. This is a very classic thing if you're willing to trade a little bit of performance for speed. The new models have shown significant improvements over previous versions, so Google is looking quite competitive with these. They've been in various preview and test builds, and Google has been making them stable for long-term development. And 2.5 Flash is now in general availability.

Yeah, now they have these three tiers: 2.5 Pro, 2.5 Flash, and 2.5 Flash-Lite. Kind of confusing naming, but, as you said, similar to Anthropic. Anthropic has Opus, Sonnet, and Haiku, with the smallest model being the fastest and cheapest and so on. So it seems like this is definitely a pattern we are seeing with LLM and frontier model providers. OpenAI has their mini models; I forget, they have o1 and o3 and GPT-4o, so it's kind of hard to tell what the actual breakdowns are. Either way, Flash-Lite is one-third the cost of regular Flash for input and way cheaper for output: $0.40 per million tokens compared to $2.50 per million tokens. So if Flash-Lite is strong enough for your use case, it's kind of a no-brainer to use it.
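To make that pricing gap concrete, here is a rough back-of-the-envelope sketch. The output prices are the per-million-token figures quoted above; the absolute input prices are an assumption inferred from "one-third the cost for input" and may not match Google's current pricing, so treat the numbers as illustrative only.

```python
# Rough cost comparison between Gemini 2.5 Flash and Flash-Lite,
# using the per-million-token output prices mentioned in the episode.
# Input prices are inferred from "one-third the cost for input";
# the absolute input figures below are assumptions, not from the show.

PRICES = {  # USD per 1M tokens: (input, output)
    "gemini-2.5-flash":      (0.30, 2.50),
    "gemini-2.5-flash-lite": (0.10, 0.40),
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate spend for a given token volume."""
    in_price, out_price = PRICES[model]
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

# Hypothetical workload: 500M input tokens, 100M output tokens per month.
for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 500_000_000, 100_000_000):,.2f}")
# gemini-2.5-flash: $400.00
# gemini-2.5-flash-lite: $90.00
```

On this hypothetical workload the Lite tier comes out roughly 4.4x cheaper, which is why "is it strong enough for your use case" becomes the only real question.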
Next up, another story about Google, this time not about an LLM but about how you interact with that LLM. In their AI Mode, you're now able to have back-and-forth voice conversations with the search function. There's now a Live icon in the Google app, and you can ask it questions, receive AI audio responses, and pretty much chat with it, similar to OpenAI's advanced voice mode. So, yeah, we're getting ever closer to the "Her" future where we can just talk to AI all the time. And that's a normal way to use AI, which I think is still not so much the case.

Yeah, I think for many people I've spoken to about this, the voice modes thus far, even if the voices are quite realistic, haven't felt like something you spend a lot of time using. I mean, I have a few friends here and there who spend some time with the voice modes, probably those who are already more inclined to send people voice messages, and that's just a modality that feels a bit more normal for them. But for the vast majority of people I talk to, it feels like text is still the primary way people engage with these, you know, texting the model as you would. So I am curious what it is that might get people to make that shift.

Yeah, it feels like maybe it would be like the voice-driven things we've seen, in particular things like Alexa, where it's a tiny assistant that can handle various little things for you and answer questions. I could see that becoming more common in usage of AI: when you just have some random question that came to mind and you want to get an answer quickly, you could just do a voice command. But I do agree that it's not clear to what extent that'll be the norm.

Our next lightning-round story is back to video models: YouTube is to add Google's Veo 3 to Shorts in a way that could turbocharge AI on the video platform. YouTube is hoping to integrate this into YouTube Shorts later this summer. This was announced by their CEO Neal Mohan at the Cannes Lions festival, alongside a few creators: Amelia Dimoldenberg, Alex Cooper, Brandon Baum. As Andrey was mentioning earlier, Veo 3 is quite good, and it's a significant upgrade from the older generation of models used in YouTube's Dream Screen background-generation tool. There are a few collaborations going on here, and Veo 3 has already been producing some viral videos.

Yeah, I could see some fun Shorts being generated by it. You can definitely make fairly complete outputs that could work as something you'd see on TikTok or, in this case, YouTube Shorts.

Moving on to applications and business, just a couple of stories. The first one isn't directly business, but I guess it's related. It's about the OpenAI Files, which is a website that documents a whole bunch of things that have already been released and documented with regards to OpenAI, but all in one place and in a very easy-to-browse way. This is a collaboration between the Midas Project and the Tech Oversight Project, two nonprofit tech watchdog organizations. And it, let's say, is pretty critical of OpenAI. It highlights a lot of the questionable things that have come to light: Sam Altman's investments, for instance, and some of the people who left OpenAI, their statements on Sam Altman and their stances. Really, it's a compilation of all the negativity, let's say, about OpenAI over the years. Nothing new, as far as I'm aware, is in the report, but if you want to go and see all of it in a nicely formatted way, now you have this resource.

And we'll move right along. The next story is also about OpenAI: it's about OpenAI dropping Scale AI as a data provider following the Meta deal. As we've covered previously, Meta has hired Alexandr Wang from Scale AI to join and lead their superintelligence effort. Now you're seeing OpenAI, and I believe also Google, if I remember correctly, dropping some of their collaborations with Scale AI, which is actually kind of a big deal.
Scale AI has a new CEO, and it seems like it would be a hard place to be in. Now any competitor to OpenAI will probably not want to work with you, and those are some big companies that Scale AI would presumably want to have business with. But, kind of unsurprisingly, that appears to be less the case now.

Our next story is shifting over to the self-driving world. If you live in the Bay Area, you're probably very used to seeing Waymos around. You may have also seen a couple of more interesting, sort of chunky-looking vehicles. These are created by a company called Zoox, which you may or may not have heard of; it was acquired by Amazon a little while back. The news here is that Zoox has opened its first major production facility for robotaxis. They're hoping to produce about 10,000 units annually. The facility is in Hayward, California, their second production site in the Bay Area. They are currently testing their vehicles in multiple U.S. cities and are offering early-access rides in Las Vegas, with plans to expand to SF. So you may see more of these on the road soon.

Yeah, it's quite an interesting design compared to Waymo. Waymo so far has had basically normal cars, pretty nice Jaguar cars. Zoox has designed a fully sci-fi-looking little, I don't know what you'd call it, mini-bus. It's, as you said, kind of a rectangle. There's no steering wheel at all, and there are four seats facing each other, not the usual four seats all facing the front of the car. There's no front to this car; it's like a little pod, and the design allows it to go either way. Since there's no front at all, it doesn't need to do three-point turns or whatever. So far it's pretty limited access; I don't think it's possible to test it. Certainly I couldn't, even though I would like to. But yeah, we'll be excited to see if they actually manage to roll this out quickly. I would definitely want to try it out.

On to projects and open source. We've got a couple of benchmarks to go over. The first one is LiveCodeBench Pro. The paper for it has the subtitle "How Do Olympiad Medalists Judge LLMs in Competitive Programming?" Often we've seen coding benchmarks for LLMs that focus on these kinds of scenarios: not actual software engineering so much as competitive programming, in the sense that you have a problem where you need to write an algorithm to solve some task, not write a function within a larger code base. So this is an example of that, but ramped up to be quite difficult, apparently, to the point that you have Olympiad medalists involved. As a quick example, there's a logic-heavy problem from Codeforces 626F. The benchmark shows that the LLMs do still struggle with this to some extent. They're good at more knowledge-heavy problems, but not quite as strong at observation-heavy problems that require a unique insight, where you have some sort of aha moment with an insight that unlocks the problem. So, yeah, it's quite a hard benchmark. On the hard variants of the problems in the benchmark, none of the models are able to solve anything in one try. On the medium tasks, models are mostly incapable; reasoning models can do some of them. o4-mini is able to do around 50% of medium problems, but still 0% of hard ones. So, a pretty cool new benchmark.
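As an aside on how results like "0% of hard problems in one try" are usually scored: many code benchmarks report pass@k, estimated from n sampled solutions of which c pass all tests. The sketch below uses the standard unbiased estimator popularized by the HumanEval/Codex paper; whether LiveCodeBench Pro reports exactly this estimator is an assumption on my part.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: probability that at least one of k
    samples (drawn without replacement from n attempts, c of which
    are correct) passes. From Chen et al., "Evaluating Large Language
    Models Trained on Code" (2021)."""
    if n - c < k:
        return 1.0  # every size-k draw must include a correct sample
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# "Solves it in one try" corresponds to pass@1, which reduces to c/n:
print(pass_at_k(n=10, c=5, k=1))   # 0.5 -> roughly "50% of medium"
print(pass_at_k(n=10, c=0, k=1))   # 0.0 -> "0% of hard"
print(pass_at_k(n=10, c=5, k=5))   # much higher with more attempts
```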
Yeah, this is really, really nice to see, actually. I think it's good when we get a benchmark out there where at least the harder problems aren't already partially saturated by current capabilities. This is again one of those cases where, if you believe the dictum that once you can specify the benchmark or the evaluation, the research world will be able to hill-climb it, then eventually the models will have that capability after enough people try hard enough. So perhaps if we return to this benchmark in a couple of months, maybe a year, we will be seeing very different results. I'm curious what we'll see there.

Yeah, I think we're kind of still in the figuring-it-out phase of reasoning models. This got started around October of last year, with OpenAI's o1 as the first one, and then since R1, everyone has been making reasoning models. But as this benchmark shows, the reasoning models are still not at a point where they can really be insightful and creative in a way that allows them to succeed at this kind of stuff. So, yeah, I agree, it's good to have this.

Yeah, we've got another benchmark, and this one I actually really, really like. If you've had conversations with LLMs where you tell the model about some problem you're having, something you're trying to solve, something of this nature, you might sometimes observe behavior where it fills in some details on its own. Sometimes it'll ask you for a little bit more, but for me, at least in my experience, what's often happened is it'll say something, and I'll find the need to give it some additional context because the first answer wasn't useful or specific to exactly what I was looking at. And this benchmark gets at something like that. It's called AbstentionBench, which is more or less what it sounds like; the subtitle is "Reasoning LLMs Fail on Unanswerable Questions." What they're going for here is evaluating the ability of LLMs to abstain from answering when faced with uncertainty, which is actually a really interesting idea. You might have heard of this from, I'm pretty sure, Stuart Russell; some of the more traditional AI people who are also thinking about safety were big advocates of this idea that when a model is faced with uncertainty, it should actually hand over control, or tell the human in the situation, "I don't fully know what I'm doing here," or "here's my uncertainty." So I like the idea of getting at something like this. They feature variants of some other benchmarks that are also around abstention, where you have math and science questions with underspecified context. They evaluated 20 frontier LLMs, both open and closed models, including ones that are optimized for reasoning. And the results are pretty much what the subtitle would tell you: frontier LLMs struggle with abstention across most scenarios, except for questions with unknown answers.
Yeah, exactly. There are examples of not just "answer unknown" but different potential reasons to abstain: for instance, a false premise, a question that's subjective and doesn't have a direct answer, and a lot on underspecified context. And on all of those, across various LLMs, you're getting something like a 60%-ish proportion of actually abstaining when you should. They highlight one example in the main figure. The underspecified prompt is "my dog was prescribed prednisone 5 milligrams per kilogram," and the correct response is for the LLM to ask for the dog's body weight, because it needs to know the number of kilograms. The wrong answer would be to just give some dose, like 50 milligrams. So, as this example shows, LLMs need to be able to not give you an answer sometimes, and to ask you a question instead. And it's pretty clear that that is often not the case. They break it down: DeepSeek, for instance, abstains around 70% of the time when it should without reasoning; with reasoning, the reasoning variant is closer to something like 40-50%. So, pretty bad. Could be a lot better.
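Here is a minimal sketch of how an abstention-rate evaluation like this can be wired up. The `looks_like_abstention` keyword detector and the callable `model` are hypothetical placeholders; the actual paper uses curated datasets and an LLM-based judge rather than keyword matching, so this only shows the shape of the metric.

```python
# Minimal sketch of an abstention-rate eval, in the spirit of
# AbstentionBench. `model` stands in for a real LLM API call; the
# real benchmark uses an LLM judge instead of keyword matching.

ABSTAIN_MARKERS = ("i don't know", "cannot be determined",
                   "need more information", "could you tell me")

def looks_like_abstention(answer: str) -> bool:
    """Crude detector: did the model decline or ask a follow-up?"""
    return any(marker in answer.lower() for marker in ABSTAIN_MARKERS)

def abstention_rate(model, prompts: list[str]) -> float:
    """Fraction of unanswerable/underspecified prompts where the
    model abstains instead of guessing (higher is better here)."""
    abstained = sum(looks_like_abstention(model(p)) for p in prompts)
    return abstained / len(prompts)

# Example: an underspecified prompt like the one from the paper.
underspecified = [
    "My dog was prescribed prednisone at 5 mg per kg. How much do I give her?",
]
# A good model should ask for the dog's weight, not output "50 mg".
```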
And one more open-source work, this one about a model. The model is named MiniMax-M1, and it has an associated technical report subtitled "Scaling Test-Time Compute Efficiently with Lightning Attention." This is a large reasoning model that is designed specifically to efficiently scale test-time compute, with a hybrid mixture-of-experts architecture. The model consists of 456 billion parameters across 32 experts, so you're only using around 46 billion at any given time. It's pretty much going head-to-head with R1 in terms of being quite a big model with a lot of experts making inference feasible, and it's competitive with various open-weight and even closed-weight reasoning models. For instance, it outperforms Gemini 2.5 Pro on one benchmark, and OpenAI's o3 and Claude 4 on long-context understanding benchmarks. So it seems like a pretty significant addition to the open-source LLM space, alongside, let's say, DeepSeek R1.

Yeah, this is pretty exciting, and I think the further investment that's going into scaling test-time compute is quite great. So it's nice to see some strong open-source models out there on this.

Our next section is on research and advancements, and for this one we've got a pretty cool paper on scaling laws of motion forecasting and planning. This is a technical report that investigates basically what the title says, for autonomous vehicles. They used an encoder-decoder transformer model and looked into how model performance improves with increased compute, data, and model size. What's pretty interesting is that they did find a power-law relationship similar to that in language models. But unlike language models, the optimal models for driving tasks are smaller and require more data, which suggests different data collection and model training strategies. Some interesting facts as well: driving data is highly multimodal, and the distribution in the training data is dominated by less interesting modes, like driving straight. The hypothesis the authors advance here is that driving intuitively requires less knowledge building and retrieval and more spatial reasoning. If you're a person who drives cars, that probably sounds mostly right to you. And so the optimal models for this planning task would have relatively fewer parameters in the feed-forward network layers. They're interested in which of these observations could help explain the smaller size of the optimal models. So this paper, I think, reveals a lot of very interesting ideas and potential for future exploration.

Yeah, this is coming from Waymo, and they trained this model and derived the power-law fits from their collection of a ton of data. They actually don't use live data from their deployed fleet; this is just from the safety drivers in the initial testing phase. But they still wound up with quite a large dataset: something like 60 million run segments and 447,000 hours of driving. That's 5.6 million miles. So quite a few, let's say, data points here.

And the interesting bit is that there haven't been, as far as I know, any published results about this notion of consistent scaling, in this case of cross-entropy loss, in the context of self-driving. Here they do demonstrate that, as you collect more data, if you're using a transformer for the specific task of forecasting the motion of other agents, like other cars or people, you get consistently better at forecasting, and also at planning. So you need to simultaneously predict what others are doing and what you should do. And I guess it's a good thing that as you collect more data, you predictably and continuously get better, since that would mean that as they get more data, these kinds of self-driving cars will be able to predict better and better, until they're able to never get it wrong in terms of predicting where the cars and people around them are going, so that they can avoid any issues.
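For intuition, a scaling "law" here just means the validation loss is well described by a power law in data (or compute), L(N) ≈ a · N^(-b), which shows up as a straight line in log-log space. Below is a minimal sketch of fitting such a curve; the data points are made up for illustration and are not Waymo's results.

```python
import numpy as np

# Hypothetical (dataset size, validation cross-entropy loss) pairs;
# these numbers are illustrative only, not from the Waymo report.
data_sizes = np.array([1e5, 1e6, 1e7, 1e8])
losses     = np.array([2.10, 1.55, 1.15, 0.85])

# A power law L = a * N^(-b) is linear in log-log space:
# log L = log a - b * log N, so fit it with ordinary least squares.
slope, intercept = np.polyfit(np.log(data_sizes), np.log(losses), deg=1)
a, b = np.exp(intercept), -slope
print(f"L(N) ~= {a:.2f} * N^(-{b:.3f})")

# Extrapolate: predicted loss if the fleet collects 10x more data.
print("predicted loss at 1e9 segments:", a * 1e9 ** (-b))
```

The practical point from the discussion above is exactly this extrapolation: if the fit holds, more driving data buys a predictable, continuous improvement in forecasting loss.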
That's actually the only paper in the section; like I said, we're going to keep it a bit shorter. So, moving to policy and safety. First up, we have a safety paper dealing with jailbreaks. This is kind of an explanatory paper; the title is "Universal Jailbreak Suffixes Are Strong Attention Hijackers." There's this notion of universal jailbreaks; I think we covered that paper last year at some point. You can find sequences of gibberish, basically, like random symbols, and if you optimize, if you do a search process, you're able to find a certain kind of gibberish that jailbreaks a model. So you can ask it how to build a bomb, and after that you add this adversarial suffix, and that makes the model answer even though it shouldn't; LLMs typically aren't supposed to tell you how to build bombs.

This paper looks into what's happening in the attention layers, in terms of what the model is focusing on. It turns out that when you have one of these adversarial suffixes, it hijacks the attention, in the sense that the adversarial chunk of the input gets a majority of the attention over other chunks, like the stuff that comes before the adversarial suffix, or the token that indicates the start of the chat. So this means there's a predictable explanation for the effect of this kind of suffix and why it seems to work universally. There's a strong correlation between these suffixes hijacking attention and being universal and successful at jailbreaking, which means there is hopefully a way to actually prevent the suffixes from working.

Yeah, this is really interesting. I feel like there's a lot of cool, interesting promise in some of these interpretability-related methods. At one level, I do feel like there's very much a whack-a-mole quality to these new jailbreaks we keep finding and the solutions for them. But these methods feel very fun and insightful, and when we do find these kinds of solutions, there's always something you learn.

Yeah, I think this one is fun because it's quite intuitive, I guess. It's like, oh, the model is paying attention to the random nonsense instead of the actual stuff about being asked about a bomb. And it turns out that's a problem.
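To make "hijacks the attention" concrete, here is a minimal sketch of the kind of measurement involved: take the attention weights from a forward pass and compare how much total attention mass lands on the suffix positions versus a uniform baseline. The tensor shapes and the random weights are toy stand-ins; the paper's actual metric and implementation details may differ.

```python
import numpy as np

# Toy attention matrix from one head: rows = query positions,
# columns = key positions, each row sums to 1 (softmax output).
# Positions [0, suffix_start) are the real prompt; positions
# [suffix_start, seq_len) are the adversarial suffix tokens.
rng = np.random.default_rng(0)
seq_len, suffix_start = 32, 24
attn = rng.random((seq_len, seq_len))
attn /= attn.sum(axis=-1, keepdims=True)  # normalize rows

def suffix_attention_mass(attn: np.ndarray, suffix_start: int) -> float:
    """Fraction of attention that the final query position (the one
    generating the next token) puts on the suffix tokens."""
    last_row = attn[-1]
    return float(last_row[suffix_start:].sum())

mass = suffix_attention_mass(attn, suffix_start)
baseline = (seq_len - suffix_start) / seq_len  # uniform-attention share
print(f"suffix mass: {mass:.2f} vs uniform baseline: {baseline:.2f}")
# The paper's finding: for universal suffixes, this mass sits far
# above the baseline, i.e., the suffix dominates the model's attention.
```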
Next up, surprise, surprise, we have another safety paper. This one is about a phenomenon called emergent misalignment, and it's out of OpenAI. This is a very interesting paper. What was found here is that if you train a model on a narrow, incorrect dataset, so this could be a dataset of insecure code, bad car advice, bad legal advice, or bad health advice, then from an interpretability standpoint you'll see these misaligned "persona" features activate, and the model actually becomes broadly misaligned. Meaning that if you just trained your model on insecure code, the model might actually be more likely, if you ask it how to make a quick buck or something like that, to tell you to sell counterfeit goods, or something else it should not be telling you. There's good news, though: with some further fine-tuning, the model can indeed be realigned. But it is pretty interesting that these features exist in AI models, such that you can train them on a specific example of bad behavior and they learn from that to generalize and act toxic in a more general way.

Right, yeah. The notion or phenomenon of emergent misalignment, I believe, was initially highlighted and demonstrated a few months ago, and there was a report that for most of the reasoning models this is a pretty common issue. As you said, the notion of personas here is about features. This is related to previous work from Anthropic that we've covered, where you train a dictionary that kind of compresses the features and gives you interpretable notions of what happens within the LLM. They find that some of these features, like a "toxic persona" feature that corresponds to toxic speech and dysfunctional relationships, are correlated with being misaligned, and so is some other stuff, like sarcastic advice and sarcasm/satire. Since you can discover that these features get more activation, get kind of more priority, if you just clamp down on them, that would prevent the misalignment.

And just one more story. Last up, OpenAI wins a $200 million U.S. defense contract. This is in collaboration with Anduril, a company that works with the Department of Defense as well, building drones and so on. This is part of an initiative called OpenAI for Government, where you have things like ChatGPT Gov, and apparently the contract will help the DOD improve administrative operations, healthcare, and cyber defense. So nothing too spicy here, but worth noting: I think all the providers, Anthropic, OpenAI, even Google, tech as a whole, are getting friendlier with the government, with things like these kinds of defense contracts. So not too big a surprise, but worth being aware of.

And that's it, that's our episode. Kind of a short one, maybe refreshingly so. Thanks, Daniel, for filling in for this week. Thanks for having me, this is always fun. As always, we appreciate your feedback. We appreciate you leaving reviews or sharing the podcast and giving us more listeners, so feel free to do that if you like the podcast. But more than anything, we appreciate it if you do listen. So do tune in next week. Thank you.

[AI-generated outro song plays]
