

⚡️ Ship AI recap: Agents, Workflows, and Python — w/ Vercel CTO Malte Ubl
Latent Space
What You'll Learn
- ✓Vercel is a big fan of the AI engineering movement and is focused on making it easier to build and integrate AI agents into workflows.
- ✓Vercel introduced a new workflow development kit that provides a simple and idiomatic way to create durable, streamable workflows with human-in-the-loop approvals.
- ✓The AISDK open-source project has been very successful, and the latest version introduces a stable agent abstraction to make it easier to build AI agents.
- ✓Vercel's open-source strategy is to grow the overall pie while maintaining a consistent share of the business, rather than just selling support or locking down the core technology.
- ✓Workflows are a common but often ad-hoc solution in many transaction processing systems, and Vercel aims to provide a more abstracted and productized approach.
AI Summary
This episode discusses Vercel's efforts in the AI engineering space, including their new workflow development kit and the AISDK open-source project. The key focus is on making it easier for developers to build and integrate AI agents into complex workflows, with features like durability, streamability, and human-in-the-loop approvals. The episode also touches on Vercel's open-source strategy and how they aim to grow the overall pie while maintaining a consistent share of the business.
Frequently Asked Questions
What topics are discussed in this episode?
This episode covers the following topics: Workflows, AI agents, AISDK, Open-source strategy, Serverless computing.
Who should listen to this episode?
This episode is recommended for anyone interested in Workflows, AI agents, AISDK, and those who want to stay updated on the latest developments in AI and technology.
Episode Description
In this conversation with Malte Ubl, CTO of Vercel (http://x.com/cramforce), we explore how the company is pioneering the infrastructure for AI-powered development through their comprehensive suite of tools including workflows, AI SDK, and the newly announced agent ecosystem. Malte shares insights into Vercel's philosophy of "dogfooding" - never shipping abstractions they haven't battle-tested themselves - which led to extracting their AI SDK from v0 and building production agents that handle everything from anomaly detection to lead qualification. The discussion dives deep into Vercel's new Workflow Development Kit, which brings durable execution patterns to serverless functions, allowing developers to write code that can pause, resume, and wait indefinitely without cost. Malte explains how this enables complex agent orchestration with human-in-the-loop approvals through simple webhook patterns, making it dramatically easier to build reliable AI applications. We explore Vercel's strategic approach to AI agents, including their DevOps agent that automatically investigates production anomalies by querying observability data and analyzing logs - solving the recall-precision problem that plagues traditional alerting systems. Malte candidly discusses where agents excel today (meeting notes, UI changes, lead qualification) versus where they fall short, emphasizing the importance of finding the "sweet spot" by asking employees what they hate most about their jobs. The conversation also covers Vercel's significant investment in Python support, bringing zero-config deployment to Flask and FastAPI applications, and their vision for security in an AI-coded world where developers "cannot be trusted." Malte shares his perspective on how CTOs must transform their companies for the AI era while staying true to their core competencies, and why maintaining strong IC (individual contributor) career paths is crucial as AI changes the nature of software development. 
What was launched at Ship AI 2025
AI SDK 6.0 & Agent Architecture
- Agent Abstraction Philosophy: AI SDK 6 introduces an agent abstraction where you can "define once, deploy everywhere". How does this differ from existing agent frameworks like LangChain or AutoGPT? What specific pain points did you observe in production that led to this design?
- Human-in-the-Loop at Scale: The tool approval system with needsApproval: true gates actions until human confirmation. How do you envision this working at scale for companies with thousands of agent executions? What's the queue management and escalation strategy?
- Type Safety Across Models: AI SDK 6 promises "end-to-end type safety across models and UI". Given that different LLMs have varying capabilities and output formats, how do you maintain type guarantees when swapping between providers like OpenAI, Anthropic, or Mistral?
Workflow Development Kit (WDK)
- Durability as Code: The use workflow primitive makes any TypeScript function durable with automatic retries, progress persistence, and observability. What's happening under the hood? Are you using event sourcing, checkpoint/restart, or a different pattern?
- Infrastructure Provisioning: Vercel automatically detects when a function is durable and dynamically provisions infrastructure in real time. What signals are you detecting in the code, and how do you determine the optimal infrastructure configuration (queue sizes, retry policies, timeout values)?
Vercel Agent (beta)
- Code Review Validation: The Agent reviews code and proposes "validated patches". What does "validated" mean in this context? Are you running automated tests, static analysis, or something more sophisticated?
- AI Investigations: Vercel Agent automatically opens AI investigations when it detects performance or error spikes using real production data. What data sources does it have access to? How does it distinguish between normal variance and actual anomalies?
Python Support
- For the first time, Vercel now supports Python backends natively, with zero-config deployment for Flask and FastAPI applications.
Marketplace & Agent Ecosystem
- Agent Network Effects: The Marketplace now offers agents like CodeRabbit, Corridor, Sourcery, and integrations with Autonoma, Braintrust, and Browser Use. How do you ensure these third-party agents can't access sensitive customer data? What's the security model?
"An Agent on Every Desk" Program
- Vercel launched a new program to help companies identify high-value use cases and build their first production AI agents. It provides consultations, reference templates, and hands-on support to go from idea to deployed agent.
Full Transcript
All right. We are here in the remote studio. Thanks again to FDOTing for the new space, with Malte Ubl, who is CTO of Vercel. Welcome. Hey, how's it going? Glad to be here. Did I get it right, Ubl? I've actually never pronounced it out loud until just now. That was completely perfect. It rhymes with Google. Ah, okay. So perfect that you worked on search at Google and AMP and Wiz, which I think still people don't know enough about Wiz. It is like no longer a secret, but yeah, you can use it. So unless you work at Google, in which case you probably know what it is. Otherwise, there's no reason to really know. Anyway, suffice to say that you are responsible for a lot of the web as it is today. So thank you for spending some time with us. You're also obviously now building the next web, as we say, with Vercel. And we can cover framework-defined infrastructure. I think you probably saw I have a lot of interest in self-provisioning runtimes. We can cover v0. But here, really, this part is recorded right after you did Ship AI, which we're trying to sort of recap, right, for the general Latent Space audience who may not be watching Vercel as closely as I do or you do. So basically, just generally, what I guess is your message to the broader AI engineer audience on what Vercel is doing with AI? Yeah, I think the super high level view is that what we're really trying to do is, like, we're like the biggest fan of the AI engineering movement. And we are also fans of, you know, we're not just going super hard on hype and the big ideas and talking about things, but being very concrete about, like, you know, agents are very exciting and you can actually build them. Right. And so I think our entire conference was about both making that easier, right? And discovering the right abstractions as we're kind of figuring out what people actually want to do, right? Which is emerging as we speak. Then the way Vercel always does these things is by building things ourselves, right?
And so that is both in terms of products, so agents that are products that you can purchase from Vercel, and stuff that we do basically in our back office to make our own operations more efficient. And so this kind of building of apps lets us ground what we do in that reality and then, you know, extract the abstractions that we feel are really helpful, to then put those on the road. And I think probably the most talked about thing that we shipped at the conference was our new Workflow Development Kit, which really, really is just a way to make writing workflows very idiomatic, so it becomes kind of first class and something you do every day. You don't, you know, write 15 design docs just because you want one of them. It's just something you do literally every day. I think, since your audience more generally is probably listening to what people talk about, there's a lot of talk about our Workflow Development Kit, but also more generally: what are workflows, what are agents, how are they related, do you use one or the other? I would actually love to talk about that as well. But obviously, at our conference, we basically introduced what we hope is by far the easiest way to make your agents something that is easily embeddable into complex workflows, and to make those workflows durable, resumable, streamable, and so forth. Yeah, I mean, as listeners might know, I have a long history of workflows at Temporal. And I think what's weird is a lot of people are discovering this for the first time. You don't really learn about this in CS classes. You don't really learn about this in bootcamps or anything like that, because it's not really a unit of compute and storage that is taught. It's kind of emergent from, you know, Uber and Stripe and everyone else. I don't know if there's a version of this at Google. I mean, there is.
There's a version of this at every single company that has been doing anything in computing since 1950. Yeah. But what's not necessarily the case is that it has been abstracted in any way, right? But when I run a bank transaction system in 1975, then I invent this, right? Maybe I'm in a pure batch processing world and I kind of avoided it. But the reality is that I built this, right? And so either I use something very productized, which obviously Temporal innovated on that being a thing, or I use something that's ad hoc, right? So I go and say, well, I need some kind of queue that tracks the work. And I need a database to store the state at any given point. And I don't know, maybe I write a cron job that makes sure that the stuff on the queue doesn't get stuck, right? And so I think what's extremely common in essentially every transaction processing system that has ever been created is that there either is an explicit abstraction for workflows in it, or someone made one ad hoc, because otherwise the thing just doesn't work. Yeah, yeah, totally. So the headline thing, for people who maybe haven't dived into workflows enough, is that you can sort of wait and resume code. So it's as though the serverless function is kind of indefinitely long running. Like, literally you can run an infinite loop inside of your serverless code. And that breaks a lot of people's mental model if they don't really understand that. The code pauses and resumes, and you can wait multiple days and it doesn't matter. It doesn't cost anything. Actually, I don't know if it doesn't cost anything. I don't know if you charge. No, it literally does not cost anything. So yeah, you can run compute for an infinite amount of time. You can then also, whenever one of these steps fails, automatically retry them. Stuff like that, that makes things more reliable. Yeah. And so it has a lot of parallels to long-running orchestration problems for agents.
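The pause/resume and automatic-retry behavior described here rests on one core idea: persist each step's result, so a re-run skips completed steps and a failed step can retry without redoing earlier work. Here is a minimal sketch of that idea in plain TypeScript; all names are illustrative, not the real Workflow Development Kit API.

```typescript
// Hypothetical sketch of durable step execution: results are saved
// to a checkpoint store, so resuming a workflow replays completed
// steps from storage instead of re-executing them, and a failing
// step is retried up to maxRetries times.

type Checkpoint = Map<string, unknown>;

function runStep<T>(
  checkpoint: Checkpoint,
  stepId: string,
  fn: () => T,
  maxRetries = 3
): T {
  // If this step already completed in a previous run, reuse its result.
  if (checkpoint.has(stepId)) return checkpoint.get(stepId) as T;

  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const result = fn();
      checkpoint.set(stepId, result); // persist before moving on
      return result;
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

In a real system the checkpoint would live in a database, which is exactly why a workflow can wait for days at no compute cost: nothing has to stay in memory between steps.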
If you want to do human in the loop as well, it's a simple task of waiting for, what's the API that you guys have? It's not, I want to say, signals, but you have something like resolve webhook or something. Yeah, I think it's similar to a signal, but the idea is that you basically make a webhook, an ephemeral one, which is just a URL that you can ping. And so the realistic flow would be: you reach that step where you want human approval. You know, let's say you get the webhook URL and you write it to some database. And let's say the user now, they log into their computer. Two hours later, they have a queue of things they need to approve. That's from the database. They click on one. They say approve. Now what just happens is that that system calls that webhook. And then from the perspective of the workflow that you originally implemented, you were able to await that webhook. And now it resolves and you can just proceed with the program. Yeah, it's very elegant. I would say that it eliminates some complexity that we introduced at Temporal. And that's probably for the better. And the other thing, I think, just obviously as someone who is in this space a lot and has seen all the solutions: you made it open source, which is another above-and-beyond thing. You could have made it proprietary, but you didn't. Yeah, the way we think about Vercel, and I mean, I don't want to go too deep into that tangent, but I think about open source as having essentially three business models. The first one is Red Hat, where you just sell support. You think it's open source. The second one is open core, where you are the only one that gets to monetize it, but everyone else gets to run it if they want.
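The ephemeral-webhook flow Malte describes can be modeled as a promise the workflow awaits and an external system resolves. The sketch below is illustrative only; the types and the createWebhook helper are invented for the example, not the Workflow Development Kit's actual API.

```typescript
// Sketch of webhook-based human-in-the-loop: the workflow creates an
// ephemeral webhook, stores its URL somewhere a human can find it,
// and awaits the webhook. When the approval system pings the URL,
// the promise resolves and the workflow resumes.

type Webhook<T> = {
  url: string;                  // the URL you would write to a database
  resolve: (payload: T) => void; // stands in for the incoming HTTP call
  done: Promise<T>;
};

function createWebhook<T>(id: string): Webhook<T> {
  let resolve!: (payload: T) => void;
  const done = new Promise<T>((r) => (resolve = r));
  return { url: `https://example.com/hooks/${id}`, resolve, done };
}

async function approvalStep(
  hook: Webhook<{ approved: boolean }>
): Promise<"proceed" | "abort"> {
  // The workflow "pauses" here; a durable runtime persists this wait,
  // so it can span hours or days without holding compute.
  const { approved } = await hook.done;
  return approved ? "proceed" : "abort";
}
```

The key property is that the workflow code reads as a straight-line await, even though the approval may arrive days later from a completely separate system.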
Right. And Vercel maybe has not invented this, but certainly is the most successful at a model where you say, okay, I have this software library and it's truly open source. Everyone can run it. It comes with adapters for every place on the planet, and that makes it really popular, and then we get a piece of the pie. And so our strategy is to grow the pie, while what we actually see is that our piece of the pie is relatively constant in proportion to the pie, right? And so we can drive the open source project. And so that's why, like, you know, I don't want to say we're in it for the business model, but I think it's a business model that has more winners than the alternatives. I think it's also something that people who are seriously evaluating for production workloads care about, because, and I ran into this at Temporal, these are going to be extremely valuable workloads that you're going to put on workflows. And so you want some ownership, you want some auditability. In practice, who's going to actually run it themselves? Probably not many, but you want the option. You want to check the checkbox. 100%. Yeah. And I actually do think people will run it themselves. And that's great. Awesome. So that's workflows. By the way, I think the most divisive thing is the use of directives. You know, use cache, use no memo, use whatever. So fun. So fun. What's your, I don't know if you have a take on directives in general. I don't, to be honest, I don't feel super strongly. I do find, particularly inside of the Workflow Development Kit, I find the use pretty elegant.
I could imagine other ways of doing it. I think we did post a blog post about all the alternatives we considered, because there are some that you think about after 5 minutes, and after 2 hours of thinking about it you realize, yeah, maybe this isn't such a good way to do it. But there could be other ways of doing it. Like, I think we're working with TC39 to bring decorators into more places, which would kind of make this literally the same, just above the function instead of inside the function. So it's not a big difference, but it would become, for example, something TypeScript could be aware of without a TypeScript plugin, which you already provide, right? And so, literally, I was vibe coding this thing on Sunday, and I said, okay, Claude, you have no idea what use workflow is, because it came out on Thursday, but here's the docs, and, by the way, you know, put it on my site. And yeah, it installed a TypeScript plugin for me, so I had like the perfect setup, and it just worked from scratch. So that, yeah, was great. Okay, awesome. So we can come back to workflows anytime you want. But I just wanted to keep moving on all the stuff you announced. We should probably also just touch on the AI SDK. I know you're not as closely involved with that team, but it's obviously one of the most successful open source projects. I mean, obviously Vercel is very good at frameworks, but I think it was not a given that the AI SDK would be a winner, because of LangChain, because of Mastra, because of everyone else trying to get that spot. Except that you guys have the perfect package name. So I think that helps a lot. I actually don't. I'm not sure how much that helps, but it's great. Like, there's a fun background from what people thought AI was 10 years ago. But yeah, I think, you know, we announced version 6 beta. And I think the big, I mean, it's not really news, because these things are open source and you can follow them very closely. Right.
But what it does introduce as a stable feature, because it already existed as kind of experimental in AI SDK 5, is a direct agent abstraction, which so far wasn't there, right? People would build agents with the AI SDK, but they would have to do it in a way more barebones fashion. What I do want to mention, actually, because you mentioned it's very successful, which is true: I think the reason why it's successful is because we constrained ourselves to be humble about what we know our users might want to do. The example I would like to give: when you build a new web framework in 2025, you know exactly what people are going to do. It's such a well-explored space. The person doing it probably has done it three, four, five times in their life and failed and learned from that and tried the other things, right? It's so mature. It's the most mature thing. Even 10 years ago, that was also true, right? That's why Next.js is so good. Because when Guillermo started building it, he knew exactly what to do. He knew exactly what the app would be. Almost nothing has changed. The AI app space is the absolute opposite. Like, we knew absolutely nothing and we still know absolutely nothing. Things are emerging, but we're so early. And so if you put a very thick abstraction on it, then it's probably going to be the wrong abstraction. So you have to be humble and say, okay, I need to stay low level so that this can be flexibly used as trends emerge, right? And so that's why we didn't have to rewrite the AI SDK when everyone went from writing chatbots to writing agents, because we stayed at a level where that, you know, almost looked the same. Right. People just built stuff on top. And I think that's why it's successful, because we didn't say, okay, we know what the apps are going to look like, and we're going to do this Hollywood-principle, don't-call-us-we-call-you style framework where you just have to fill in the blanks. It's super structured.
You'll be happy. Right. Like, we didn't do that, even though that was so in our DNA. We did have to really restrain ourselves. But that's why it's successful, because it's so low level. Right. And so that's why, on the other hand, we don't have an agent abstraction yet. Like, every other competing library leads with that. Right. Yeah. Like OpenAI's SDK, day one. Exactly. Mastra, et cetera. And, you know, I mean, I'm not saying it's bad, but obviously that's more accessible right now. You know, you have to understand an agent is tools with a loop. What do I do? I use the streamText function and give it tools. Okay. We added all kinds of control already in AI SDK version 5, where you can prepare the step. You can select the tools on every loop. You can do all these things in a pretty advanced fashion. Half of those other frameworks are built on top of the AI SDK anyway, right? So it forms the basis. And so what we're doing now is we're bringing what is emerging as the patterns that people build over and over again as abstractions into the library, as the usages kind of solidify. Yeah, I have interviewed enough agent framework builders and big lab model people that I actually find that I can push back on you. Okay, go ahead. It's really interesting because I think you are saying, basically, we are at the jQuery era, right? Like, we don't know what we want yet. We're building all these tools to make the smallest possible things easier. And then it composes up, and we're just starting to see agents emerge. I would say that the big lab people are obviously the opposite, but they're coming at it not from a DX point of view. They're very big-model-pilled. They want everything to go through the model. The reason they want the Hollywood principle of we'll-call-you is because they want the model to control the tool calls, the reasoning, what have you.
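"An agent is tools with a loop" can be made concrete in a few lines. The sketch below is deliberately tiny and uses a scripted stand-in for the model rather than any real LLM or AI SDK API; every name in it is invented for illustration.

```typescript
// Minimal sketch of the "tools in a loop" agent pattern: the model
// (here a plain function) either requests a tool call or produces a
// final answer. The loop executes requested tools, feeds results back
// into the context, and stops when the model answers.

type ToolCall = { tool: string; input: number };
type ModelStep =
  | { type: "tool"; call: ToolCall }
  | { type: "answer"; text: string };

const tools: Record<string, (input: number) => number> = {
  double: (n) => n * 2,
};

function runAgent(
  model: (history: number[]) => ModelStep,
  maxSteps = 5
): string {
  const history: number[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = model(history);
    if (step.type === "answer") return step.text;
    // Execute the requested tool and append the result to the context.
    history.push(tools[step.call.tool](step.call.input));
  }
  return "max steps reached";
}
```

The hooks Malte mentions (preparing each step, selecting tools per loop iteration) are refinements of exactly this loop: control points before the model call and before tool execution.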
And I feel like there's a mentality. Obviously, you can have frameworks that do both. But there's a mentality in the big labs, if you work at the big labs, that you always want to give the wheel to the model. And then for you guys as framework developers and people who are software builders, it's more comfortable to build the smallest possible thing instead of the sort of AGI thing, if that makes sense. Yeah, I actually don't think about it in those dimensions. Because I 100% agree that people have to be willing to let go and let the tool, sorry, let the model kind of take control, right? And to get emergent behavior. And certainly on coding agents that works incredibly well. So that I'm totally on point with, right? And the AI SDK does this very well today. All the agents that I've personally built work like this. But that's not the whole thing. The other thing is: how do I now embed this in an application, right? The model labs couldn't care less, because they're not really building applications. And so that's something that a company like Vercel thinks about a lot. Again, what does the developer actually want to express, and how do we let them do it? And so I think one of the key things that people wanted and still want is streaming, because these models are slow. And so suddenly this almost obscure sub-genre of programming, where people are like, you know, it's 500 milliseconds, I'm just going to not ship it, right? Suddenly it's 30 seconds, and it becomes absolutely important. And so we give people the tools to build streaming applications in a way that feels intuitive, right? And I think that unlocked a lot of value there, because that was genuinely hard. And we made it easy. And so those are kind of the things that we're looking for, but that are not obvious when you're mostly concerned about the AI part. Fair enough.
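The streaming point is simple to show in code: instead of returning the full completion after 30 seconds, yield chunks so the UI can render progressively. This sketch uses a synchronous generator for illustration; real model streams are async iterables, and the function names here are invented.

```typescript
// Sketch of progressive rendering over a token stream: the consumer
// paints each chunk as it arrives instead of waiting for the whole
// response. A fake word-by-word tokenizer stands in for a model.

function* streamTokens(text: string): Generator<string> {
  for (const token of text.split(" ")) yield token + " ";
}

function renderStream(stream: Generator<string>): string {
  let rendered = "";
  for (const token of stream) {
    rendered += token; // a UI would repaint here, on every chunk
  }
  return rendered.trimEnd();
}
```

The value is latency-to-first-token: the user sees output after the first yield, not after the last one.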
And I think the design space has more dimensions than what I tried to simplify it down to. Just one more thing on the AI SDK, then we can move on to the other agent stuff. And obviously you guys announced so much, it's so hard to cover. So Vercel is like a house of frameworks, right? You have so many framework authors, all of them legends in their own right. What's one, I guess, philosophy that you're applying from all your years and all your people who work on frameworks, right, that is informing you? I have one, and feel free to counter-propose, which is what Sebastian Markbåge, who is obviously the tech lead of React, used to say, which is: have a small API surface area. I feel like that has maybe been not as important, or there are other overwhelming priorities. But I just want to get a sense of what governing principles really resonate with you. Yeah. I mean, Seb and I talk about this a lot. But I think I'm often representing the kind of enterprise side where it's like, no, but I actually want to just control this. Give me one API for this. Just one more, bro. So I want to be in control, and I want to be able to configure it, and I want to define the defaults the way I see it. But it's good to have tension around these things. I think the thing that comes down from Guillermo, which is just the absolute founding principle of Vercel, is that we never give you an abstraction that we haven't used ourselves. Dogfooding is ultimately the thing, right? Like, the AI SDK was extracted from v0, and then we built it out and we kind of diverged a little bit. And then we took on the substantial work to bring v0 back to being fully hosted on the AI SDK. And then we learned from that, and we made sure that migration isn't too hard, which the users appreciate as well. And so on and so forth. Right. And so there's this constant feedback loop, and if you don't have that, which, like, this sounds so obvious.
Right, but the reality is that framework builders are usually not application builders. Yeah. And so they build ivory towers that, when they're hyper-geniuses or they get lucky, happen to be good. But if you want to do this in a reproducible fashion with a high hit rate, then the only thing you can do is try that stuff out yourself, and that's what we do every day. I really like that principle. Obviously a good idea, but very hard to practice in real life, because when you're a maintainer of a framework, a lot of bugs come to you and, you know, they pile up, and you have to spend some time working on framework-level issues. Happy to move on to Vercel Agent and maybe the Agent on Every Desk program, which, you know, I think you're kind of also championing. So, yeah, let's talk about the use cases where you guys use internal agents within Vercel and what emerged. Yeah, let's structure this two ways, because I do think there's a difference between the agents that we're building internally versus the stuff that's, you know, essentially a Vercel product, right? Yeah. And which you can use today. I thought they were the same thing. No, they're not the same thing. That's actually, I think, quite important. We're distinguishing agent as a service, right? And so the Vercel Agent, that's what it is, right? It's ultimately an agent-as-a-service product similar to, you know, Codex in the cloud or the Cursor agent. Not as in it's the same product, right? But as in, these are things where you go somewhere and you say, I would like to use this agent. And then, I don't know, maybe you give them a credit card and it works, right? Which is different from the stuff that we run internally. But let's talk about the Vercel Agent for a second. Basically, I think our strategy overall is to have an agent that helps you build applications on Vercel.
This is, you know, there is some overlap with coding agents, but I think the thing that's unique about the Vercel situation is that we have your runtime data. We see your error logs. We know where the preview deployments are, that they always will exist. We know how to start the dev server. We already have the secrets. So there can be a quite integrated solution for something that otherwise can be quite hard, right? I mean, you have onboarded Devin a few times, but you probably have been in that situation where it feels like onboarding a junior employee, right? And so some of these things, if you're within the Vercel ecosystem, become much more simple. And so in that world, we've been chipping away at different things, right? A while ago, we shipped a code review agent, which I think is really good and well integrated. And the thing that we announced last week is, broadly, our DevOps agent, which is actually tied to our anomaly detection system. So whenever there's an anomaly that we detect on your production site, it kicks off the agent, and the agent does an investigation of what's going on. From a technical point of view, this agent has several tools. It can make any observability query against your project. So it has a query builder, and it can execute the queries. It has a way to read logs, obviously with queries as well. So what's really magical is that it's just very good at this. By the time you click on the anomaly, it will, almost all the time, just very precisely tell you what happened. It shows you all the graphs it looked at. It just, I don't know, man, it's just so much easier than doing it yourself. It certainly takes away minutes of work. But I think what I'm actually very excited about, and I think this is an overall pattern that we see with agents, is that in many situations there is this, what we in search call, recall-precision problem.
And that also happened with anomaly detection. With anomaly detection, you have to tune it, right? You either tune it to be very aggressive, and then it fires on you. In the worst case, it pages you in the middle of the night and nothing was wrong, right? Just a team in Asia sent a newsletter, and so the traffic went up, right? Or you tune it not aggressively enough, and you miss events. And with an agent, you can just say, okay, I'm actually going to have this tuned very aggressively, and I'm not waking anyone up. I'm telling the agent, and the agent can take two minutes to run. I'm actually fine with that, because no one would have reacted in that amount of time in a very reliable fashion. And now it can actually look at a time series, can look at what happened. It can look at the IP addresses that are making the requests. It can look at the type of error messages, right? And it can make a call whether to escalate to on-call and wake someone up, or to say, okay, this is completely fine for someone to maybe take a look at the next day, which is, I think, the perfect decision for agents to make. And so that's something I'm very excited about, that you have this essentially coworker that has no sleeping problems in the loop, and they get woken up instead of you. Yeah, I think the dream of AI SRE has been a long time coming. And I'm actually on the record: at the start of this year, I made a podcast saying, oh, I don't think anyone's going to do AI SRE. So I'm very excited. I haven't tried it out personally. I have seen you tweet about it. And I think, yes, obviously that is the goal. There's the dream. We should make Bryan Johnson happy and have good sleep. But we're not exactly there yet. And I think the question is really twofold. One part is that time series analysis is not exactly in distribution for language models. There's been a lot of people doing time series models.
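The pattern Malte describes separates two decisions: an aggressively tuned detector that fires often, and an agent investigation that decides whether to page a human or file a next-day report. A hedged sketch of that triage split, with shapes and thresholds invented purely for illustration:

```typescript
// Illustrative triage sketch: the detector is tuned aggressively (it
// fires on any large error-rate delta), but an automated investigation
// decides whether the anomaly warrants paging on-call or can wait
// until morning. The classification logic is made up for the example.

type Anomaly = { errorRateDelta: number; affectedUsers: number };
type Verdict = "page-oncall" | "report-next-day";

function triage(
  a: Anomaly,
  investigate: (a: Anomaly) => boolean // stands in for the agent's
): Verdict {                           // log / time-series analysis
  if (a.errorRateDelta > 0.5 && investigate(a)) return "page-oncall";
  return "report-next-day";
}
```

Because the expensive, judgment-heavy step runs after detection, the detector can be tuned for high recall without turning every newsletter-driven traffic spike into a midnight page.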
There's a deep field of anomaly detection, which is basically what you're doing. And, you know, there's a question of, is this a solved problem, and how much can we trust it? And then I think the other part is aligning with human preferences, right? Sometimes I don't know until I've seen a few examples: oh yeah, this one you should wake me up for, the other one you should not. And then pretty much when I solve the problem, it goes away. So on to the next problem; you're always fighting the last war in SRE. So I've given you a bunch of things; you can take whatever you want. I think that, well, you have to try the product. It works really well. We don't do the anomaly detection in the LLM, right? The anomaly detection is a separate Vercel product. It's a completely separate part, a pipeline that works in our time series database, right, and it launched independently of this. But once you have this, our experience is that if you give these agents a tool that does queries, they're actually really good at digging into individual parts of the time series. And the other thing the agent has access to is logs. And logs are just text, right? So if you see from the time series what happened, can you somehow figure out what this is? Then you head over to logs and do a deeper dive. Now you're more in the world where the model is comfortable. One thing that we don't do today, but will do in the future, is also give the model, that particular agent, access to your source code, so that it can, first of all, figure out what the error message means, right? And maybe it can actually make a PR to just fix it. Not every time, but every so often that'll be possible. And so it would then also do that.
Which is why I think people like Datadog and Sentry are trying to do that, obviously, because they're observability platforms, but they never own the code. And so they're always limited in what they can do, you know. A hundred percent. But also, I want to qualify why I'm so happy about how it works. It's a small part of the overall problem, right? We're actually not here to build the AI SRE that replaces that job function. That's another thing I feel pretty passionate about: at this moment, agents are both extraordinarily effective and still very ineffective. You have to find the right problems. When you find the right problems, agents are super magical. And if you wander beyond, they don't work. And so that's the magic, right? What we see is that getting triggered on an increased error rate works well. But certainly making the decision about what to change in the firewall, I would not let the agent do that yet, right? It's just too dangerous. Or do DNS migrations. Exactly. That would be AGI. Yeah, it's interesting, all that stuff. Yeah, I think there's this growing consensus of where agents are doing well and where agents are not.
I think, you know, for me, meeting notes are solved. Simple UI changes are solved. I guess, what else would you put in that bucket, in that list of things that are solved and reliable every day? Yeah, so I had a section in my keynote about this, and it boils down to one question. Basically, the idea is that you go around your company and you ask people: what do you hate most about your job? I really think that finds the sweet spot, because it finds problems that are boring, tedious, and repetitive, but that would have already been automated if they were automatable without an agent. In many cases they do require some kind of mini judgment, etc., right? So people do these things by hand. And so that question yields a sweet spot where the problems are probably easy enough for a current-generation agent to handle, and they're also often very high business impact, because this is actually a pretty substantial part of people's jobs. Again, that's why they hate it, because it takes so long. And so we ended up at our conference talking about three agents that we built internally, two of which we open sourced. Again, so people have a starting point, because these are custom agents. They're not software-as-a-service things to just install. So these are custom agents. But the first one is one that handles processing of our incoming contact-sales requests. Lead qualification. Yeah. And there are obviously a lot of startups in that space, right? So I think that's very much in that category where you give it a tool for LinkedIn, you give it a more generic tool for Google, you give it a bit of an objective and the qualification criteria, what do you care about, right? You give it a way to analyze: oh, this is really a support request, okay, hand it over to the support team, right?
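The routing that the lead-qualification agent performs can be sketched as follows. This is purely illustrative: every name is invented, and the classification is reduced to keyword heuristics where the real agent would use an LLM with enrichment tools (LinkedIn and Google lookups) to make the call.

```typescript
// Hypothetical sketch of the contact-sales routing described above:
// decide whether an inbound request is sales, support, or needs a
// (human or LLM) qualifier. Keyword rules stand in for model judgment.

interface InboundRequest {
  email: string;
  message: string;
}

type Route = "sales" | "support" | "needs_review";

function routeRequest(req: InboundRequest): Route {
  const text = req.message.toLowerCase();
  // "This is really a support request": hand it to the support team.
  if (/\b(bug|error|broken|password)\b/.test(text)) return "support";
  // Obvious buying intent goes straight to sales.
  if (/\b(pricing|enterprise|quote|contract|seats)\b/.test(text)) return "sales";
  // Everything ambiguous goes to the qualification step.
  return "needs_review";
}
```

The interesting design decision in the real system is where the judgment lives: the tools (LinkedIn, Google) are deterministic, and only the final qualification call needs a model.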
There's a few cases like that. It's not so complicated. So that one, I think, is perfect. And we open sourced it so people can make their own. The other one, which falls into a similar category, is abuse analysis. We get abuse reports, and in this case it's really the agent doing the pre-work. We still have a human look at the pre-work and then make the decision about what happens in the end. But what were they going to do? They were going to go to the reported website, right? They were going to look at the account and figure out the age of the account. They were going to see if they paid their bills and, you know, whatever, right? That's the list of things they would do. And so you can just make it so that when they eventually look at the ticket, it already has all this information. And if the page looked like a Facebook login page, a current-day LLM is also able to make that judgment call. And then you just quickly check if it's okay and you move forward. Yeah, amazing. And I think there's one more, a data analyst agent. Yeah, yeah, exactly. I mean, this is also something we just wanted for ourselves, right? We have one too internally, yeah. Yeah, and I think that one's also open source. The idea is that you want to ask questions against your data warehouse. We were very unsatisfied with the current solutions, because they ultimately didn't have access to enough information about the data model. And we are not promising that we have the magical tool where you give it your prompt and it spits out SQL after just accessing your schema. But we essentially developed a structured way to document the semantics of your data, so that the agent is good enough, right? And so we've been using that internally quite successfully. Amazing.
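The "structured way to document the semantics of your data" could look something like the sketch below: alongside the raw schema, each table and column carries human-written meaning that gets rendered into the SQL-writing agent's context. All names and shapes here are invented for illustration; the open-sourced agent's actual format may differ.

```typescript
// Hypothetical semantic-layer sketch: a raw schema says amount_cents is a
// bigint, but not that it's pre-tax, or that one row means one checkout.
// That missing meaning is exactly what the agent needs.

interface ColumnDoc {
  name: string;
  type: string;
  meaning: string; // the semantics a raw schema doesn't carry
}

interface TableDoc {
  name: string;
  grain: string; // what one row represents
  columns: ColumnDoc[];
}

const orders: TableDoc = {
  name: "orders",
  grain: "one row per completed checkout",
  columns: [
    { name: "amount_cents", type: "bigint", meaning: "Order total in cents, pre-tax." },
    { name: "created_at", type: "timestamptz", meaning: "Checkout completion time, UTC." },
  ],
};

// Render the docs into the prompt context the SQL-writing agent sees.
function renderForPrompt(t: TableDoc): string {
  const cols = t.columns
    .map((c) => `  - ${c.name} (${c.type}): ${c.meaning}`)
    .join("\n");
  return `Table ${t.name}: ${t.grain}\n${cols}`;
}
```

The claim in the conversation is essentially that this documented grain and column meaning, not the schema alone, is what makes "prompt to SQL" reliable enough to use internally.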
So yeah, for those who don't know, Vercel also has an Agent on Every Desk program where you can sort of reach out. And is it like a forward-deployed engineering situation, where you have a SWAT team that comes in and helps people? Yes, I think that's it, but it's also not appropriate for every company, right? My take is that if I'm a large company, I have a lot of efficiency to gain, but it's also quite daunting to ship my first agent. And so something like a forward-deployed engineer, which we are doing indeed, does help quite a bit in that scenario. As a startup, I don't want a forward-deployed engineer in my office. I just want to see the open source project, feed it to Claude Code, and then give it my own problem and say, build me something like that, but here's what I want different. And that should also be successful, right? So I think with this particular program we're really going for unblocking people who feel that they just don't know what to do. They hear the hype, right? They don't know how to pick the right project. We talked about this: how do you actually find the project that's going to be both successful and high impact? And then, secondarily, okay, now that I have the project identified, how do I do it? And I think forward deployment is effective. It's something that I, as a framework engineer, have thought about all my life: you need to have someone kind of guide you the first time you do something, and then the second time, maybe you build an agent yourself. So we don't want to stay there, right? We basically sign contracts with companies saying, okay, you have to commit to building three agents. And if you do, we're going to help you. We want to build the first one for you. And then for the second one, we're going to be there essentially by your side, you know, maybe not literally, but on a phone rotation.
Right. And for the third one, the assumption is that you actually don't need any help anymore, and you can still reach out, obviously. But if everything went well, this is now a company that's empowered to build its own custom agents. Yeah, you know, you're going to be helping your biggest customers, and obviously that's going to lead to a lot of good product ideas, right? It's kind of dogfooding at scale with the people in the Vercel ecosystem, and not just Vercel alone. A hundred percent. You just discover things that probably the 500-person startup would not have discovered. I think, you know, one last thing, just to leave off the whole topic: anything else on the Ship AI side that you really want to cover and get on a soapbox about? I don't know if we covered everything. I think one point that we haven't talked about is that Vercel as a company has been investing in Python quite a bit. I think maybe the audience of this podcast might also be excited about this. AISDK obviously currently is a pure-play TypeScript system, but we do find Python really interesting. What we have done over the last weeks is we've shipped zero-config support on Vercel for all the popular Python frameworks like Flask and FastAPI. Zero config means that you kind of get the Vercel experience, where you throw your stuff over the fence and we're going to run it for you. No questions asked. And then, just as another thing, we have, for example, also shipped a Python SDK for our API, again, to show that we are engaging with that ecosystem. It's obviously something that is in a way new for Vercel, but we've been making hires and infrastructure investments to make Python really well supported on Vercel. It is also on our Fluid Compute program, which gets you, for example, active CPU pricing. So you get to run Python in production and you only pay when you have compute.
And otherwise it's free, which is very nice if your backend takes 30 seconds to respond because it's an AI model. I was also going to make the observation that workflows mesh very nicely with this. It's almost like there's this future that you've been driving towards: workflows. You needed Fluid Compute in order to do all these fancy things, or at least to make it easy to ship this kind of stuff. Right. I think there's some overlap, right? But with workflows, literally nothing's running between steps, right? So it's literally free. That's also why it's free, because there's literally just nothing happening. Versus Fluid Compute, which does have the same property, except it's more agile, right? Usually in a workflow, I don't know, you're doing the FFmpeg thing, then you're doing the AI-model thing, right? You're not talking millisecond latency. There's some overhead on each step, whereas with Fluid Compute the VM is literally on, which operates on a different kind of level. Yeah, excellent. And then on the Python thing: what happened to "always bet on JavaScript"? Isn't that the Brendan Eich line? Basically, I guess the broader, non-cheeky question is: are we like 50-50 Python and JavaScript now? Is that the future? There's no particular language that will win? Or do we not have an opinion? I don't think I have an opinion. I saw today that TypeScript is now, as of today, the biggest language on GitHub, right? Whereas last year you still had to cope by saying TypeScript and JavaScript together are drastically bigger than Python. Honestly, obviously, I don't really care. I think both communities are very relevant, very large, and we are investing in supporting them. The way Vercel's infrastructure works, because we essentially just run VMs, it is not actually hard for us to support Python. We will eventually do PHP and Ruby, but we also, I think, care a lot about the details.
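The "nothing runs between steps" property of durable workflows can be made concrete with a minimal sketch: each step's result is checkpointed, so the process can exit between steps and a later run replays from the store instead of redoing work. This is an illustration of the general durable-execution pattern, not the actual API of Vercel's workflow development kit.

```typescript
// Minimal durable-step sketch: run each named step at most once and
// persist its result, so a restarted process resumes where it left off.
// A real system would use a durable store, not an in-memory Map.

type Store = Map<string, unknown>;

async function step<T>(store: Store, name: string, fn: () => Promise<T>): Promise<T> {
  if (store.has(name)) return store.get(name) as T; // already done: replay checkpoint
  const result = await fn(); // otherwise run once...
  store.set(name, result);   // ...and persist before moving on
  return result;
}

async function workflow(store: Store): Promise<number> {
  const frames = await step(store, "transcode", async () => 2);        // e.g. the FFmpeg step
  const answer = await step(store, "summarize", async () => frames * 21); // e.g. the AI-model step
  return answer;
}
```

Because state lives in the store rather than in a running process, nothing needs to stay warm between the transcode step and the model step, which is exactly why a step-granular workflow can be free while general-purpose Fluid Compute still bills for active CPU.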
And so essentially the reason why we haven't done it yet is because we do invest a substantial amount of time to make the DX actually good and feel native to the ecosystem. That's why it's technically easy, but in practice actually very, very difficult for us to support these things. So we're taking it carefully, but there's no technical restriction in our system where we couldn't support all of these different ways of running code. Yeah, totally. I mean, that's a very fair response. And the fact is, I think when you're a serious AI cloud, you have to support Python. So there's no way around it. Okay, I wanted to zoom out. One other thing that we care about in AI engineering is AI leadership, which is leadership of AI engineers. And obviously your role as a CTO has changed a lot since you joined. What are some, I guess, leadership principles that you've had to create from first principles? Obviously, you're in a very unusual type of CTO role, where you're an infra company, you have frameworks, you have apps. What comes to mind when I say, how has the CTO role changed for you? Yeah. So I joined Vercel a little less than four years ago, and that was definitely before ChatGPT. I mean, I came directly from Google, and I think I had some insights there into what was happening. But Vercel certainly wasn't living in this world. That was kind of the tail end of the last really big crypto wave and everything else, right? And so then suddenly the AI revolution happened. And the thing that I definitely had to work through is, how do I transform the company into something quite different? And I think the solution that we have come to feels really good. And I think there's a lesson there to be learned for companies that haven't done the transition yet: you have to do something that feels native to your company.
And so the two big bets that we made early on, one being v0 and one being AISDK, I think felt native to Vercel, in the sense that v0 was, especially originally, designed as a tool for making web pages. The full-stack stuff came later with the Sonnet models; originally it was very much a web development tool. And AISDK was a framework for building AI apps, and because we're a framework company, that felt really native. And I think you have to make that call. You have to be honest with yourself: what product, even if I do have to change to building something else, extends naturally from what I'm doing, rather than being something entirely different where no one believes I'd be the right place to buy that from? I think that's just generally timeless advice. Have manager-to-IC ratios gone up? Do you know what I mean? That's a really good question. I don't track that very closely. I think it's a thesis, and I would be supportive of it going up. It's a big question, like the role of the engineering manager. I have two reports. Nice, nice. Right, so that's the way we're organized. I mean, it's a privilege of the CTO that you don't, you know, VPEs are usually the people managers. Yeah. But that's actually not the point. The point is that the one thing Google really got right is that it has very strong ICs. It has IC levels all the way up, right? There's Sanjay at Level 11. And so we're doing the same thing. We had to have someone as an IC at the top level, right? And that's, I think, another thing that you have to be willing to do as a company: live in a world where you don't give your strongest engineers the choice between not making more money and becoming a potentially very bad manager, right?
Because there's little correlation between being the best engineer and being the right manager, right? You could be. Sometimes you have these moments of, oh my God, glad we made this change. But it's essentially a crapshoot whether that happens. Yeah, that's the other thing that I find. Obviously, the Peter principle applies, the principle of promoting people to their level of incompetence. I think the other interesting thing people are finding is that PMs and designers are now starting to contribute more code, because they feel they can just vibe-code something. And maybe that's good, maybe that's bad. I don't know if there are standards around this that have been established inside of Vercel, where there used to be clear code owners, and now, because of coding agents, we feel we can do a lot more, and maybe that's caused some politics somewhere. A hundred percent. We want to be at the avant-garde of doing this, right? As makers of v0, we highly encourage all our employees to contribute code. I think one thing we haven't talked about, and we might have time to go into, is that we are very deeply working on a way to build apps that follows a threat model that assumes the developer doesn't know what they're doing, and also that they're using AI that doesn't know what it's doing. And so I want to be able to build an app that is secure even if the developer is incompetent, right? But today, that's not the case, right? Today, I assume a developer is competent. Full trust, right? And so we have, I think, very strong progress there. And we're working with lots of large companies that have data, for example, where you have the idea that you say, well, auth cannot be part of the app, because they're not going to get that right. So auth has to be extracted from the app.
In fact, which data you can see also cannot be under the control of the app, because, again, you're going to get it wrong, right? So we're building these systems that try to have a minimum amount of security fully independent of the quality of the app. And I think that is part of the future of how people will build stuff. Okay, so this is probably a good idea anyway. This is at the Vercel level, not so much the v0 or Next.js level. Yeah, in a way, in an integrated fashion, right? Because I'm going to build the app in v0, and now I want to deploy it to my fellow coworkers. Only the right people should be able to access it, and when they access it, they should only see the right data. Yeah, I think this is very exciting. I think the closest people that come to this is WorkOS, basically. And yeah, I think there should be more agent-native, if we can call it that, infrastructure that lets people build and vibe-code safely, which is what we all want to enable. They just cannot be trusted. This is very insightful. Actually, I'm very excited to see that. Yeah, pay me when it comes out. But otherwise, thank you for spending the time to recap Ship AI, to recap all of Vercel's stuff. I cannot imagine a better person to talk to. You're so generous and friendly and engaged. I definitely don't feel like all CTOs are as engaged with regular developers as you are. I'm trying my best, and I'm spending too much time on X. But yeah, this was super fun. Thank you so much for having me. Bye.