Back to Podcasts
Gradient Dissent

Why Physical AI Needed a Completely New Data Stack

Gradient Dissent

Tuesday, December 16, 20251h 0m
Why Physical AI Needed a Completely New Data Stack

Why Physical AI Needed a Completely New Data Stack

Gradient Dissent

0:001:00:52

What You'll Learn

  • Rerun provides an open-source SDK for logging, modeling, querying, and visualizing multimodal data from sensors, cameras, and other sources in robotics and spatial computing applications.
  • Rerun has expanded beyond its initial focus on spatial computing to become heavily used in AI-first robotics, where it helps with the data pipeline for training and debugging these systems.
  • Rerun's flexible data model, inspired by entity-component systems, allows users to easily log and visualize a wide variety of data types without being limited by a fixed data structure.
  • The podcast host and Nico discuss the recent advancements in AI-powered robotics, particularly in areas like manipulation and the combination of reinforcement learning and imitation learning.
  • There are challenges in benchmarking progress in robotics, as the best solutions often require co-designing the hardware and software together.
  • Rerun has found unexpected use cases in industries like hedge funds, where its high-performance logging and visualization capabilities are valuable.

Episode Chapters

1

Introduction

The host introduces Rerun.ai and its similarities to Weights and Biases, as well as the podcast's focus on robotics and embodied AI.

2

Rerun's Use Cases and Data Model

Nico explains Rerun's origins in spatial computing and its expansion into robotics, as well as the flexible data model that underpins the platform.

3

Advancements in Robotics

The discussion turns to the recent progress in AI-powered robotics, particularly in areas like manipulation and the combination of reinforcement and imitation learning.

4

Benchmarking Challenges

Nico discusses the difficulties in benchmarking progress in robotics, as the best solutions often require co-designing the hardware and software together.

5

Unexpected Use Cases

The podcast touches on Rerun's unexpected use cases, such as in the hedge fund industry, where its high-performance logging and visualization capabilities are valuable.

AI Summary

The podcast discusses Rerun.ai, a company that provides a logging and data platform for robotics and spatial computing applications. The founder, Nico, explains how Rerun's flexible data model and high-performance logging capabilities make it useful for a variety of use cases, from robotics to hedge funds. He also touches on the recent advancements in AI-powered robotics, particularly in areas like manipulation and reinforcement learning.

Key Points

  • 1Rerun provides an open-source SDK for logging, modeling, querying, and visualizing multimodal data from sensors, cameras, and other sources in robotics and spatial computing applications.
  • 2Rerun has expanded beyond its initial focus on spatial computing to become heavily used in AI-first robotics, where it helps with the data pipeline for training and debugging these systems.
  • 3Rerun's flexible data model, inspired by entity-component systems, allows users to easily log and visualize a wide variety of data types without being limited by a fixed data structure.
  • 4The podcast host and Nico discuss the recent advancements in AI-powered robotics, particularly in areas like manipulation and the combination of reinforcement learning and imitation learning.
  • 5There are challenges in benchmarking progress in robotics, as the best solutions often require co-designing the hardware and software together.
  • 6Rerun has found unexpected use cases in industries like hedge funds, where its high-performance logging and visualization capabilities are valuable.

Topics Discussed

#Robotics#Spatial computing#Data logging and visualization#AI-powered manipulation#Reinforcement learning#Imitation learning

Frequently Asked Questions

What is "Why Physical AI Needed a Completely New Data Stack" about?

The podcast discusses Rerun.ai, a company that provides a logging and data platform for robotics and spatial computing applications. The founder, Nico, explains how Rerun's flexible data model and high-performance logging capabilities make it useful for a variety of use cases, from robotics to hedge funds. He also touches on the recent advancements in AI-powered robotics, particularly in areas like manipulation and reinforcement learning.

What topics are discussed in this episode?

This episode covers the following topics: Robotics, Spatial computing, Data logging and visualization, AI-powered manipulation, Reinforcement learning, Imitation learning.

What is key insight #1 from this episode?

Rerun provides an open-source SDK for logging, modeling, querying, and visualizing multimodal data from sensors, cameras, and other sources in robotics and spatial computing applications.

What is key insight #2 from this episode?

Rerun has expanded beyond its initial focus on spatial computing to become heavily used in AI-first robotics, where it helps with the data pipeline for training and debugging these systems.

What is key insight #3 from this episode?

Rerun's flexible data model, inspired by entity-component systems, allows users to easily log and visualize a wide variety of data types without being limited by a fixed data structure.

What is key insight #4 from this episode?

The podcast host and Nico discuss the recent advancements in AI-powered robotics, particularly in areas like manipulation and the combination of reinforcement learning and imitation learning.

Who should listen to this episode?

This episode is recommended for anyone interested in Robotics, Spatial computing, Data logging and visualization, and those who want to stay updated on the latest developments in AI and technology.

Episode Description

The future of AI is physical.  In this episode, Lukas Biewald talks to Nikolaus West, CEO of Rerun, about why the breakthrough required to get AI out of the lab and into the messy real world is blocked by poor data tooling.  Nikolaus explains how Rerun solved this by adopting an Entity Component System (ECS), a data model built for games, to handle complex, multimodal, time-aware sensor data. This is the technology that makes solving previously impossible tasks, like flexible manipulation, suddenly feel "boring."  Connect with us here:  Nikolaus West: https://www.linkedin.com/in/nikolauswest/ Rerun: https://www.linkedin.com/company/rerun-io/ Lukas Biewald: https://www.linkedin.com/in/lbiewald/ Weights & Biases: https://www.linkedin.com/company/wandb/

Full Transcript

We had this idea that visualization would be hard to monetize, particularly for this like physical AI kind of applications. You need to visualize every single step of everywhere you might interact with data. So we've actually redesigned the data model probably four times at this point. Oh, wow. You know, AI and robotics, it seems like such a hot topic. What are you seeing that people might not realize that's working now that's in work a year ago? So we're just really seeing an incredible progress and sort of ability to do quite advanced manipulation. This has been a really difficult problem in robotics for a long time. Another one is reinforcement learning. And classically, that's been for walking and motion. And what's starting to work a lot now is kind of combining some of that reinforcement learning with imitation learning. Are there benchmarks that should be created? There aren't any great benchmarks. The way often to solve real problems is to co-train and co-design for your hardware. It's like you build, design these things together. And then it becomes quite hard to benchmark in a good way. You're listening to Gradient Dissent, a show about making machine learning work in the real world. And I'm your host, Lucas B. Wald. All right, this is a conversation with Nico West, who is the founder of Rerun.ai, which is a company that I've admired for a while. They're in a similar domain, the weights and biases, robotics and embodied AI and augmented reality. And they have a very cool logging product that operates at super high scale. and then a database or a kind of system of record for bigger customers to track everything they do. They've made a lot of interesting decisions, some same, some different than weights and biases. So it's fun to get into the weeds about how their product works and how they think about it. It's also just interesting to talk about robotics and the state of the art because it might be the coolest application of AI right now. Hope you enjoy it. All right. So maybe I'll start with a little bit of why I wanted to talk to you on this podcast. So, you know, so you're the founder of Rerun, which is a company that does logging for robotics and also kind of functions as a system of record, it seems like, for robotics companies. And it's very similar in some ways to what we do at Weights and Biases for, you know, AI teams. And, you know, looking at your website, it seemed like you emphasized a lot of the same things as us, like a simple API, you know, like high speed logging. And it also seems like you have a similar customer love to Weights and Biases that I really respect and admire. And maybe not a lot of people know this, but the original use cases of Weights and Biases were pretty heavily in robotics. Actually, OpenAI, when we first started working with them, their main focus was robotics. This was before GPT. and you can actually find an interview with Wojcik, a very early interview with Wojcik, one of the founders of OpenAI, talking about robotics projects there. It must have been like five years ago. So anyway, I mean, you started your company a few years ahead of us in a different domain and made a lot of different decisions. So I was really excited to talk to you about kind of your kind of creative and product process. And also I think everybody loves robotics and the applications and maybe we call it embody AI now, so maybe we can get into that too. Yeah. We get compared to weights and biases every now and then. 
And that was one of the early signs that we were really happy about our API when people are saying, yeah, it's like weights and biases for robotics or spatial computing and that kind of field. So, yeah, where do we want to dig in first? Yeah, so maybe we should start with like, what are the range of use cases that you're typically used for and how does Rerun work there? Got it. Yeah, so maybe just set the scene. So rerun, we have an open source project that is very popular. And that project is an SDK for logging, modeling, querying, and visualizing. Like really multimodal data, and particularly multimodal data that changes over time. So I'm picturing like sensors from like a LIDAR. Yeah, like 3D sensors, set up for multiple cameras moving around. Maybe you have regular RGB and you have motion sensors and you have whatever other normal signals and tensors. And you just think it could be... Actually, our first early beachhead was in spatial computing or when the reality is like companies building headsets and that kind of stuff. And so what are they logging there? You're logging like sensor images. you're logging camera calibrations, where a camera is in space, as well as just normal text logs and you're logging time series that could mean whatever, just CPU time or some confidence of a neural network or anything like that. So just both the input sensors that you have and any internal computation that you might do. So the output of a neural net, a little bonding box that you detect or anything like that. So, yeah. And where has it gone from there? So, yeah, it started out as that really focus on, we just wanted to make it like 100 times easier to like debug these sort of system, multimodal systems, like computer vision or robotic systems that do things over time. And that really was the focus quite heavily in the beginning and got a lot of, yeah, first usage in spatial computing and then quite broadly, I mean, it's used in like hedge funds and like weird different things, but then now very heavily in robotics and particularly learning first robotics. And so on top of that commercial sort of open source project, we built a, or are currently building a data platform. It's a bit more like a cloud database or a data lake house, even if you want to kind of get into that. So for really focused on kind of making it a lot easier to get data from like from my robot to like in the right shape and the right subset of that into training on it basically. So that like running that like record data, curate and train loop as fast as possible, but not taking part in the training, which is what weights and biases does, but the kind of data pipelines leading up to it. Interesting. So, I mean, I can see how, you know, you start in kind of VR, AR, and then, you know, you move into robotics. But what is the hedge fund logging on your platform? Yeah, I think we just have this breadth of random use cases as well. I guess they just, it ended up, we put a really lot of effort into performance and ease of use. and so it actually turns out it's like pretty difficult to get really fast simple to install streaming visualization of even simple things like time series and so on and so we put a huge amount of effort into be built a in-memory database from scratch basically that runs inside this like visualization app and yeah that that can speed things up and and it's like highly performant and easy to use, basically. 
It's not perfectly meant for that, so it's not like that's a, I don't imagine that will take off in a big way, but we find a lot of those things, random use cases like that as well. Interesting. And I noticed just like weights and biases, it seems like your kind of core API call is a .log. Yeah. Yeah. And I think like one of the things we often thought about was, you know, how kind of opinionated to be about, you know, people's data types and like how opinionated to be about displaying them. Yeah. How did you like think through that? trade-off? Yeah, so that came from, I mean first it's a really important trade-off and I think one that we got, I don't want to say like right, but got pretty good. You can be proud of your product. Okay, yeah, I'm quite proud of it. Of course there are always like things that you want to improve, but I've built similar systems many times. So the three of us co-founders, we worked at a a 3d like computer vision company a long like a long time ago and we built something similar there and you um where the idea was yeah you want to be able to log all your data including this like images and 3d and all those like the weird sort of weird data and had kind of built several versions of this at different companies and came in with this idea that was really important to have it be incredibly low friction when that's what you wanted so if you're debugging one of of these systems there's so much state in your head right very similar to bugging like machine learning training so you just needs to be like dump and forget just no thinking is like high level very easy but then with experience building these things and maybe more rigid ways and you can always get stuck somehow so it's very important for us to have it be flexible so that you can kind of okay I added in this like maybe output from a neural net or a 3d point cloud or something that my neural net is estimating but I just I want to understand like what was the confidence level of like that this one point and I want to just hover it and look at it and then you need to make that work I can't up front know what all the users data models are going to be not a very flexible data model so uh so we designed a sort of new data model from scratch inspired by entity component systems this is like a way of modeling data in games so we couple of the early employees gaming backgrounds and um yeah basically we designed an entity component system and what does that mean how does that work it's basically um let's see if i can explain this like very very simply but uh instead of objects okay if you're normal object oriented program you might have a you know object you say okay you have a image it has its size it has the memory buffer with the things in it um you just you have sort of memory buffer is one thing um image size is another and you can kind of just compose all these components and then there are systems that know how to interpret so there's some visualization system that says oh if i see a like image or like image buffer and a size and all these things that i need then i will just draw it and so it's it's a bit more flexible than saying okay we just have this fixed object that we know about how to deal with so you can like easily compose um sort of things that you might want to visualize or log or understand more freely, basically. And so what are the most common things that people are logging here? 
So if you go to sort of right now, where it's like really popular right now, Rerun is particularly popular in like heavy learning-based sort of AI-first robotics. And there people are logging like motion sensors, image or video. So it's really important to have flexibility in high performance and how you represent video. As long together with just other debugging system things that you might have in a robot. So those would be the most important things there. And then point clouds and LiDAR kind of sensor. So maybe we could talk about this, the kind of state of the art of, you know, AI and robotics. It seems like such a hot topic. Like, what are you seeing that people might not realize that's working now that's in work a year ago? I think even this last couple of weeks has been, like, incredibly exciting. I don't know if you've been following all the releases coming out. Maybe not. Yeah. What are they? So we're just really seeing an incredible progress in sort of the ability to do quite advanced manipulation. This has been a really difficult problem in robotics for a long time. it's classic robotics is just like built around being incredibly precise you pre-program every movement up front and just everything has to be down to the sort of millimeter precise and but this kind of flexible messy manipulation like folding laundry is a classic task has been really elusive robotics for a long time and it's just sort of folding folding laundry has kind of gone from being impossible to sort of boring over the last year basically and it's really kind of come through this like end-to-end learning methods first on those manipulation tasks basically it's kind of imitation learning it's a kind of robotics version of supervised learning um you describe how imitation learning works for sure Yeah, so the simplest case is you have, you teleoperate your robot. So that means just a person is remote controlling the robot and you record everything that happens. So let's say you want to fold clothes. You have like a one episode, you record data as episodes. So that would be start, take a piece of clothing from the hamper or whatever and put it on the table and fold it and, you know, put it in a pile or something like that. That could be like one task. And so that, yeah, so you collect data like that, just basically demonstration data and you, you collect all the state of the robot at every timestamp. So where are all the joint angles and some perception or proprioception input basically that you can use. And in simple terms, just train your, your, some neural net to, to repeat that, that task. But this is the real messy world. So it's never going to be an exact replica of the task, but that's the kind of robustness that you try to train in. So I think it was about two or three years ago that started to kind of really work in a sort of more serious ways. And that really started this really, really heavy excitement in the sort of AI first robotics that's happening right now. And that's sort of one of the learning paradigms another one is reinforcement learning and classically that's been for like walking and motion and things like that and what's starting to work a lot now is kind of combining some of that reinforcement learning with imitation learning so we're seeing some like really really awesome results and very robust robotics sort of manipulation handling coming out over the last couple of months. That's, yeah. 
And do you think it's been like algorithmic insights that have made these strategies work or is it, you know, a bigger focus on training data or kind of better hardware platforms that actually you know work better or more compute or what driving this I think it all of the above My view tends to be that generally the things that matter more maybe more this like ecosystem or like economic engine or something like that I think what I maybe see as the big arc is that the sort of LLM hype, Chepit GPT moment kind of thing really showed to the world the power of like really scalable machine learning. and then that really primed everybody to be looking out for signs of could we have a scalable learning work for robotics because previously it's it's been this thing yeah you add more training data it gets worse is it they just the models don't generalize and then yeah a couple years ago there was a bit of a breakthrough that just you could sort of start seeing the science of scalable methods kind of working for robotics. So that includes using transformers, but there were a bunch of innovations that had to be made in terms of just how do you model a robotics problem, which is different than a text problem. So it's a mix, but then that gets going, and then more people get money, plus in more people get excited. So there's more hardware being built. So that gets cheaper and cheaper. And then people want to, everybody knows it's scalable, that means you have to have a lot more training data. But there's enough excitement to invest in collecting data. My impression of robotics right now at this moment in late 2025 is, you know, I'm super excited about it, obviously. And I see these like amazing demos, you know, all the time. But, you know, then when I visit our robotics customers, the demos don't seem quite as impressive as like the YouTube videos, you know, that I watch. And I'm not talking about the Boston Dynamics ones, where it's obviously like, you know, like kind of more traditional robotics, but even the ones where, you know, you see this really kind of fluid motion, you know, clearly like trained. Well, I don't know. I mean, I still, I mean, you know, I would love to buy a robot that, you know, folds laundry, but I don't have one. Like what's like the gap right now? How far is the gap? And how would you characterize like the state of the art of robotics? I think there are different like kinds of robotics companies. So I think there's a lot of that. The ones that are more maybe swinging through the fences directly, I want to call it that. And you see these incredible demos and then it's actually quite far from a product. But I do see another batch of robotics companies that have a very different approach. They're much more practical. They build robots, scrappy robots early and ship to customers and like have a working product. And they're just aggressively deploying that. So those don't get as much hype because it's not as exciting to look at. a kind of not aesthetically pleasing warehouse robot that is just like... Talk about the working robots that you see like in warehouses. Like what's happening now? So they're really using a lot of the same models, but maybe not as heavy focus on research. So in parallel to all this that's been happening, there's also been a lot more open source models coming out. Also the same way in LLM land, there's some great open source models heavily driven by Hugging Face. 
So I see them being just kind of scrappy using these like VLA models, that's vision language action models, but maybe just being more practical about building other systems around them and kind of using teleop where needed and kind of just have a more product and like scrappiness sort of approach to making things work. To be like practical, like what are you actually seeing? Like are there, I mean, I've talked to guests on the podcast, I think of every, you know, realm of this, every like spectrum. And certainly like, you know, we talked to folks that have like, you know, carts and factories where you can like put something on it and the cart will like go, you know, to the next location. And that seems like clearly factories are doing that. A lot of people talk about, you know, like kind of picking stuff up in factories and moving it. Yeah. Although that one doesn't seem as, I mean, that's some people have been dreaming about that for like years and years and years. Like, is there actually like picking up objects like happening in real world factories? And is it mattering? Is that, is the city there past that in terms of like what's actually deployed in production? I would call it that there are pockets where it's working, but it hasn't been scaled up. And what are those pockets? that's i guess what okay so um pockets are more like it is actually possible to do this now so we see companies that have deployed you know in the tens to a hundred robots in like manufacturing and they use um these kind of flexible learning based sort of manipulation for like moving stuff in manufacturing for instance like just be simple stuff like pick in place And yeah, that's basically, they have that working, but it's not like at massive scale yet basically. But there are smaller companies that are growing the amount of robots and like, and kind of making it actually work in production right now. These are vendors, robot vendors that where factories buy from these companies. Yeah, yeah, exactly. I haven't heard from, I mean, it may happen, but I haven't heard like the factories themselves doing it and doing that yet. I see. And then like, what about on the consumer side? Like I've, you know, had a couple of rich friends that have bought like humanoid robots and they sort of seem to kind of collect dust in the corner right now. And I've, I've been to, um, you know, kind of a robot fighting league, which we're not sponsoring, but you know, I'd say the fighting seemed a little, um, maybe unimpressive, you know, compared to, uh, the, the hype around the fighting. Um, like, again, what's, um, what's the state of the art there? Like if a robot can actually fold laundry and then the state of the air is actually like past that, why isn't there a consumer product that folds laundry? So I think you can think of, if you just think of what it would take to fold laundry in your own home, there are more things to it than like setting it up, putting the, if you had a very expensive robot, but it required you to set everything up perfectly. 
maybe some people would buy that but um most people wouldn't because they're just the total setup and the kind of management of it so just there's like a the physical world has such a huge sort of um long tail or fat tail or whatever you want to call it of this weird small variations that make things a lot harder so at the end of the day you're not it's not just a robot that can do a task you need to build a product right so i think that distance and the whole systems aspect that like deploying robots is like it's very difficult right you you need to service robots you need to have you know onboarding tutorials and all these kind of things so i think we're still that's not where we're at yet that like all these like most advanced frontier robotics like into like intelligent robots are at the sort of full product level yet but we're seeing some people um ship starting to starting to ship robots um for the home soon i guess or it's exciting but i don't i don't think we'll see full autonomy in the home for a bit what do you think will be the first applications that people use in their house probably picking up i mean the first one the ones that are used right now is is like vacuum cleaning right well i feel like vacuum cleaning has been around for a long time yeah there's some there's some newer vendors that significantly make that like a lot, a lot better. Interesting. What are you thinking of? Well, I like Matic a lot. I have a Matic in my house, actually. Yeah, so that's a big step up from a Roomba, for instance, right? Actually, it's funny. My daughter was asking me what's better about a Matic than a Roomba. And I think we like the sort of like mapping that it does. But honestly, that's not really like an autonomy feature. What is better about a Matic than a Roomba? So the autonomy, the mapping gives you autonomy, right? So that allows you to handle more robust things. Like something is in a way, it doesn't get stuck. It can not drive over your cat. And the collection of these things make it more robust and make it more useful. It's just, you get into this world or sort of this realm in the physical world where there's just so many small things that can go wrong. And they add up. And, you know, if the probability of one of them happening is high enough, you end up not using the product, basically. So I think there's a lot of that. The same nines of liability kind of. But it seems kind of funny. I mean, I feel like I remember when Roomba launched, it must have been like 2004 or something like that. And here we are 20 years later, and it seems like robot vacuums are still by far the main consumer use case. Yeah. And I think that's, I mean, that is the simplest, right? So it makes sense for that. I would guess robot vacuums that also pick up toys. That's probably the next step. I know some folks are working on that. so in terms of really practical things that can work autonomously I guess it's just like small steps like that will happen first what about fun things I feel like suddenly we see robot dogs everywhere yeah like what else are you seeing in that domain I haven't interacted that closely with the companies that are building more toys but I mean I'm super excited about that I think there's so much fun things like that that can can exist there are like AI powered robot toys and so on but that's not a space I know really well. I guess, what else are you seeing in terms of applications? I think very broadly we're seeing things in like agricultural picking. 
Just something that I didn't really realize was the breadth of how many places there's this like last mile kind of logistical pick and place thing. So there really is a lot in different parts of the logistics chain as well as like there are many times of like small autonomy you know, last mile delivery, these kinds of things. I put them in the same, it's sort of close to self-driving, but sort of the same realm as robotics as well. Drones, there's just a lot of different applications there. Yeah, are you saying, I mean, am I maybe overly focusing on robotics? Like, do you see other like Embody AI or other applications like outside of robotics that are meaningful to you? Like, how does your customer base break down? Yeah. So, yeah, I think our sort of customer and user base breaks down according to just if you're trying to do something intelligent for the physical world, mostly. And that will split into the big categories being AR, spatial computing kind of things, autonomy more broadly, and robotics. But then there's just like a long tail of smaller things. Totally. I mean, there can be really small things like, a media thing like a little AI thing in like film production that makes like focusing easier just like there are huge amount of these like small little problems it's quite a lot in like security surveillance like analytics for construction that kind of that kind of stuff they're just basically any physical world thing that's happening out in the world someone's probably building a product to either observe and analyze it or actually improve it are you seeing like a lot of satellite data applications? We don't see a lot of that internally because they're just product features. We haven't built a lot of like specialized satellite data handling. We have basic mapping, but that's a whole very deep field. So we don't see them necessarily as huge rerun users yet. But I sort of, aside from that, I see some cool stuff happening. Cool. All right. I kind of want to go like a little bit deeper into your product, if you don't mind. And, you know, I think like one of the things that we really, you know, designed our products for is making sure that doesn't like break the user's process. Was that like a big thing for you? I'm curious how you like to actually stream the information back to like a central server. Yeah. Yeah, that was important for us too. So, I mean, just things like you can never crash the user's process and things like that, of course. 
so we originally designed it heavily in cases where let's say like we focused a really lot on high performance streaming so there's all the robustness aspects I don't know that I have anything particularly interesting to say other than we spent a lot of time making sure it could never crash but in robotics we didn't design it for the the case of hey there's always a server listening so quite often in robotics you can't guarantee that and the kind of normal thing to do is write to disk first so the two kind of core original use cases like either writing to disk or streaming to a remote viewer or something like that remote server um so we have been designed for for both of those and our sort of on disk format is basically the same as the streaming format and so to make these things fast we built this sort of we you log data and depending on the size you can control these things but we basically run a micro batcher such that you buffer up a little bit of signals in memory and if it's you're having really really fast time series maybe you buffer up more and then sort of send it over the wire and sort of chunks. I like that instead. And that balancing that and kind of doing that efficiently is like one of the important parts of getting reasonably good performance out of these things. I mean, one of the decisions we made that I'm never sure if it was the right decision or not is we had the logging be open source because it's like running in the customer's or the user's environment. Yeah. But then we made our visualization, the centralized visualization web server, closed source. Yeah. I think you made that whole thing end-to-end open source. Yeah. Is that right? Yeah, exactly. And then how did you, so is that why you had to make like a different product to monetize or did you have like a more enterprise version of the visualization server? How did you think about that? So we had this idea that visualization would be hard to monetize actually because of how important it is. So we had the idea that particularly for this like physical AI kind of applications, you need to visualize every single step of the, like everywhere you might interact with data. You need to visualize when you're like prototyping in a notebook. You need to visualize when you're like, you have like maybe one robot or one device like next to your desk and you're like kicking it to figure out like what's happening live. You want to visualize your like data pipelines as you like preparing for training just like everywhere and you want to use it for your operation It just too widely deployed There no good business model that lets you have visualization and all those different points that is actually that works. If you, you could build, you can make the business model around a per seat license or something like that, that works for your like expensive engineer that's looking at the data all the time. But you also want to have the same visualization for your like support person who looks at every now and then or like just that balance becomes very difficult. And having a closed source, if it's so, so central, it actually becomes very difficult to adopt for a lot of companies. So we basically decided to have the base visualization be open source and kind of always has to be just so it's like easy to embed and you need to be able to build your own tools around the visualization that are custom to your company. So you just, you should be able to take the visualizer and embed it in your application and customize it. 
so the open source visualizer is sort of client side so that's that's kind of how we made that split that like there's a logging sdk that runs in your process and it directly communicates maybe via the file system but like directly communicates with a visualizer app and there's no real serious server in the middle um so that's kind of how we started that out and i wouldn't say the commercial product is a different product but it's a cloud sort of backend for it instead that supports like very large amounts of data. I see. So we might have actually ended up in the same place, although you might have more emphasis on the client side visualization, which we've actually just implemented recently. Yeah, it's kind of, I think, similar in that manner, but just took a different route perhaps. Interesting. Yeah. I guess, have there been interesting API design trade-offs that you made? Is there any like API decisions that you sort of like regret? Like I remember we made a ton of decisions in the early days very quickly with like no information. And it's amazing how that some of the weirder ones that we made have like, you know, stuck and haunted us for years. I think there are more detailed things that are not worth like getting into whatever specifically how we designed the tensor data model, things like that. But I would say rather we design things wrong and then redesign them. So we've actually redesigned the data model probably four times at this point. Oh, wow. So we, that, and that has been like very effortful, right? And it's been hard, you know, we had to keep some backward compatibility and kind of do all these things been, been quite difficult, but, but yeah, we, we kind of really cared about getting those core pieces right and been ready to redo it. I mean, this may be a two in the weeds and people might just skip this episode. But like, what are decisions that you made in the data model that you then changed in like the next version of the data model? Well, the first one was this, we started out with a less flexible API. So we started out like in the beginning, it was like, because this is getting very detailed, right? But it was like rr.log underscore image. And then you'd like give it all the fields. And there was still a flexible way you could give your own custom data into that. so that was the first huge rewrite where we kind of formalized the sort of system to have a higher level of like archetypes which are sort of easier to use wrapper objects that you can log instead so now it's like R log then parentheses image say right and yeah so that was a huge one and then a lot of the under data model things that we've been changing are maybe more not as clear on the surface but it tends to have to do with how you represent columns of data. It's like more like being able to log more data for multiple time step at the same time, for instance. Like here's a whole column of data or here are like five columns of data and sort of those APIs and exposing that. And then, so that's been one set. But the most maybe difficult to do and impactful things have been maybe more how it's represented in memory and how the data model works sort of in the query engine of the database. So those rewrites have been the most, they're very impactful for performance and things like that and storage and pretty painful like to redo. I think we've done that three times or something. Yeah, totally. That's funny. A lot of similarities between our companies. What database are you using now to store all the data? 
Yeah, so it's custom basically everything. So the whole, everything open source is built in Rust from the ground up and we use Arrow. Yeah, we do too. So that's basically built around Arrow and everything is custom. So it kind of comes down to this, the special properties of what we call like physical data. And so physical data will be like multimodal, multirate and often like episodic. and that just changes how you need to store that data it doesn't like you can't fit that kind of data in like a table um because yeah because it's multi-rate that means maybe you have motion happening very fast and you have a person like images coming very slowly so they just take down a robot that's happening at different rates it just doesn't work if you can't fit it in like a normal tabular model so you just have to store data differently so yeah it's a new file format for doing that efficiently. You could think of it as like a sparse version of Parquet or something like that that we use. And the indexing system. So we kind of built all those things from scratch. I was curious about your customers. Like, you know, across all your customers, are there kind of like consistent blind spots or things that they consistently discover when they start to use rerun to be able to visualize all of their all of their logs so let's say you can split on like customers and users to start with as the users of the open source um i think the really big qualitative thing there so that would what the open source really is mostly giving them is like visualization and do and but it being like radically simpler for them to do so they do a lot more of it and like there's higher performance and so on. So everybody is already doing some amount of visualization. So it tends to be maybe about volume. Like you look at like a lot more things. You're debugging the intermediate steps of your data pipeline and so on. And so it just is a higher volume and therefore higher hit rates. So we'll have people tell us like that are working on some big self-driving car project and they found bugs in their training data pipeline that have been around for three years. um so those kinds of things so that would be on the on the open source side i think on the on the commercials side the biggest like aha things tend to be that they could they can build like just significantly simpler data pipelines because we have a different storage layer than is used and kind of normal data and so just the fact that you can build yeah how simple the the the kind of data pipeline structure can get is kind of tends to be one of the really, really big sort of aha or like unlocks that can do that very differently. So one place that guests on the show have really differed in perspective in kind of physical applications is how much, how they felt about simulations. Like it almost felt like almost like a religious feeling in some of the interviews I've done. I'm curious where you land on sort of like training and simulation and then like what fraction of your customers are training in simulation? I think I'm more on the agnostic side personally. Just coming from that I'm not personally building these things, so I try to just kind of keep an open mind. Totally. So we see both, I would say. The users of our commercial platform do not only rely on simulation. 
Like if you're like really just relying on simulation, then you have a little bit lower need to care as much about how you're like real saving your data basically so if you don't care about saving your data at all then you don't need a sort of database to to store it so because of that we get a sort of bias towards people who care about real data and i say that's also what i see mostly in the companies that are building products that interact with the real world i don't think anyone knows yet that it's possible to do that well with just simulation you most likely need a real a lot of like a lot of real world data to do that totally um so you're a practical guy like me building for practical people practical product um but i was curious have you ever built a feature in your product because it kind of gives you a spark of joy even if there's no real like obvious case for it like it would give me personal joy you mean that it's there um i don't know that there's no case for it. Probably we have, but I've like, you know, I've forgotten about it. But I think there are some aspects where we've really, really prioritized like small mini interaction performance. It's like it's, we really care about like how fast it is to like scroll back and forth in time. So the rerun viewer is like, it feels a little bit like a video editor or something that you can scroll back and forth and just having incredibly low latency on that. like that probably doesn't matter as much as we care about it but maybe it does it just feels really nice it just has this like oh it feels amazing totally um and i think that people react with that we get a lot of people who just i don't know that gives you something visceral but you can't put that on a spreadsheet for sure totally i love that answer um do you um i guess do you think okay i have a couple rapid fire questions here just let's see where they go okay um what do you do you think there's like an underrated robotic startup that you'd want to call out uh yeah i don't i don't want to like pick favorites here this is a positive though okay yeah um underrated um let's see i'll say one on the sort of modeling side that i think is talked about too little at least they're not it's like generalist ai and i'm very impressed by by what they've done on the sort of ship and why tell me why you think they're um I think that they had a really lot of foresight on the method of how they're going to collect data and like really went for that very seriously. And what was the method, can you say? Yeah, it's public. It's called UMI, but basically, I don't know if you've seen these videos where people put like, they're holding like, it's a person, but they're holding a little robot gripper kind of. So that, with a camera on it and so on. So they're like more puppeteering a fake robot. and so they bet on that and I really scaled that up and end up showing a really, really impressive sort of dexterous manipulation. So I was impressed by the sort of strong bet on that direction and like really making that work. It's been really cool. Yeah, on the kind of, there are two companies of the other examples I think are great that are more on like ship and deploy. I think Ultra is a really great company and the more like picking place for logistics. And why do you think of them? They're just very, I think, good at shipping and learning from that and like building real systems, but still being AI first. I've just sort of seen them doing that very well. 
There's a European company called Syriac that's like really good at that same thing. They're in different space, more manufacturing, but I've seen them do like very practical, like shipping oriented. And I think that's really cool to see. Cool. Okay, do you think that there's new data formats that you're going to need, say, if we roll forward three years? Like, what do you think are, like, maybe new sensors coming along or new types of objects that you're going to need to visualize in your product? So I think force and tactile touch are clearly becoming a lot more popular. So it's natural, as you'd expect. It's difficult if you imagine losing your sense of touch or in your hands, picking things up is a lot more difficult. And so unexpectedly that helps for robots too. So that, I think audio, more things audio as well, so those kinds of things. Okay, so in the LLM space that I'm much more familiar with, the language space, actually where I started and where we see more of our customers these days, there's tons of benchmarks. and there's all this fighting about like, okay, like, is this a good benchmark, a bad benchmark? People are like too oriented around benchmarks, but I feel like in robotics, you don't really see benchmarks as far as I can tell. Are there benchmarks that you think are meaningful? Are there benchmarks that should be created? This is like a hot topic in the field. I don't have a super strong opinion that's better than anyone else in the, like a strong opinion. Yeah, that's true. Give us your opinion, yeah. But I'll say, yeah, it's a huge problem for the field that there aren't any great benchmarks. Maybe who can fold laundry the fastest? Yeah, but you end up with this difficult situation where the way often to solve real problems is to co-train and co-design for your hardware. It's like you build, design these things together and then it becomes quite hard to benchmark in a good way how you're going to logistically do that, right? So for something to be practical almost has to be simulation-based. And then you have all the problems with simulation and not being, being particularly difficult for like manipulation and interaction in the physical world and, you know, perfect simulator and that doesn't exist. So, yeah, I actually don't know what the solution is, but it's, it's super, super important. But I imagine we'll see much more of companies driving that for themselves, the sort of internal benchmarks. And then companies will have these like forms of robots that they evaluate like, hey, we have a new model, deploy it. And they'd run tests live and real things as the way of testing things. And I would imagine that will continue to be that way for a bit. Okay. So a few years ago when I was messing around with robots in my garage, you know, it seemed like everyone is using this thing called Ross. And it seems like everyone kind of hated Ross, but like still used it. And I was kind of surprised to see on your website, it seems like Ross just has incredible staying power. like you still integrate with it. Yeah. What's your like, do you have like a take on it, whether it's like good or bad? And what is like, why hasn't it been replaced if so many people are griping? Yeah I think it one of these things where it almost a part of every roboticist journey to have a stint of like I going to rebuild ROS but right this time Like very many roboticists have done that at some point and then given up. 
I think the really core thing is that like, yeah, ROS is the proof that network effects work. That there's just a building ecosystem around it and kind of evolving standard and it's just very hard to kind of come out of that. And then the other part is that it's solving a problem that is difficult to make general. So at the heart of ROS is it's kind of a message passing system. So there's a way of a common set of message definitions. And error message could be how to represent an image that you can pass around or, you know, a 3D pose, basically where something is. So just any data that you might have. And it turns out very useful to have a standard set of those. And most companies will have their own versions, but if you have a set of standards, then you can build pluggable systems. Like, oh, someone builds a pluggable open source, like navigation module. So, and those things all add up. And that's just really, really hard to get out of. Yeah, we actually see a lot of, or a bunch of quite a little bit more serious efforts now to replace ROS. Let's see how it goes, I guess. It's definitely a difficult problem. But I think in particular, it's hard to fund. So that ROS, I think ROS would be better now if it had been better funded, but it's just hard to find a good funding model. Well, it's astonishing because so many well-funded companies are using it. Yeah. You think there would be more of like maybe an open source collaboration like you see with databases or something else? Yeah, I don't know that we see the databases really have that at super scale until they're, yeah, maybe there are a couple databases like that, but usually there is a team first that like really takes it really, really far. And then that kind of continues. I don't know why that hasn't happened with Ross, but it certainly hasn't. Well, I mean, it could segue into one of my final questions, which is, you know, when you look at some of our joint, you know, customers that are doing these general purpose, you know, robotics applications that, you know, are raising it like astonishing valuations pre-revenue, like we've never seen these kind of valuations pre-revenue before does it feel like a bubble to you or not yeah it's hard to know there's there's some of that but i think um yeah it's they are going after astonishingly large markets and i think there is something particular about general robotics which is where like there is a I see the point is one of the hardest parts of robotics is making a decent price point for the hardware. And you kind of need a really lot of scale for that. And the best way to generate scale is to be able to serve a lot of use cases with the same hardware. And so that also requires a lot of scale because then you need a very large amount of data, basically, and compute to make that happen. So I kind of see why a lot of capital gets amassed to go after it. Are you a fan of the humanoid form factor or not? I mean, I'm a fan because I'm just sort of an optimistic person, I think. So I would love to have a humanoid that was like doing my dishes and my, and so on. Do you think it will be a humanoid? I think long run for home tasks, I think there are a set of tasks that are likely to be humanoid or humanoid-like. I think in the shorter term, in many cases, I think sort of a semi-humanoid is a much more practical form factor, like a wheeled fixed base with arms and that are humanoid-like. It's like Jetsons. Yeah, a little bit that kind of thing. 
I think the thing that human-ish form factors have going for them is data collection. so just like you could imagine maybe it's more efficient to have four arms but it's really hard to have a human control four arms at the same time so as long as the real world like data collection is important I think that's just the like that's a really important factor that will be hard to get away from is there a tool out there in your space that you wish existed but doesn't that's a good question I don't know I don't know that I've you'd probably make it if there was yeah I'd probably make it in that case alright and last question what do you think is the biggest breakthrough likely to see in the next year or two in your space honestly I don't know I think significantly improved robustness and that really is one of the main thing that matters I think over the next year seeing more and more like larger longer and more complicated tax be able to be performed like in a robust way, like self-correcting and so on, and a little bit more like learning on the fly. I think that's a likely trajectory if things go great this year. And my guess is it's going to take a little longer for this more high-level, combining that with like really great high-level reasoning as well. So maybe that's the year after. But now I'm not just guessing. I guess in your experience, how is training embodied AI or robotic systems different than training LLMs? Yeah. So maybe I can start with some things that are common. So I think the things that we see the absolute best sort of AI team, LLM teams and physical AI teams do is kind of have the researcher, the modeler work on both the data and the modeling together. Like the same person is kind of making complicated decisions about exactly how to pre-process data, what data to include and the modeling um and so that that's like super super core um and it just turns out that that's a lot easier to do for like llm llm teams why is that um also yeah there may be like three core things that you need to be doing so you need to be looking at your data just like all the time and for text you can i mean just read text like it just kind of comes in And then if you have other signals, you just like to raise you to visualize. But doing that for physical data is very difficult. That's a little bit more closer to Sol, like for tools like RERUN. The other part becomes just like you want to have a very flexible, like make it possible for the researcher to kind of be editing the data pipelines and so on. And you might need to have like high performance when you do that. So you want to have both of those things, high performance and flexibility. and for LLM data that's like text and normal things that you can use like sort of robust well proven storage formats like Parquet, Iceberg and things like that and like high scale data processing engines like Spark and Databricks those kinds of tools have been around and matured quite a lot that allow you these like easy declarative database style APIs to edit how the data comes in. But for robotics, like none of that exists basically. So all the data formats are built for a world before machine learning. So they're not flexible. They're really optimized for like fast logging rather than like flexible querying. And when you say flexible querying, can you give me an example? Sure. I'm imagining like I want all the, you know, places where it was recorded at night or something like that. Yeah, exactly. 
But for robotics, none of that exists, basically. All the data formats were built for a world before machine learning, so they're not flexible; they're really optimized for fast logging rather than flexible querying.

And when you say flexible querying, can you give me an example? I'm imagining, I want all the places where it was recorded at night, or something like that.

Yeah, exactly. Let's say I want all the times where it was at night, and the left gripper didn't open, and there was a failure of type B. That might require you to query both the metadata and the signals inside a specific recording.

Are you doing some kind of semantic embedding to make that possible?

Some users do that, but it could just be normal querying: this signal value is above five, and this other thing is that, and the text description is whatever. The normal robotics stack doesn't have a format where you can represent this multi-rate physical data, with 3D and all these different things, store it in a file format, and have a database query engine that can operate on top of it. So you put the data in file formats that are optimized for saving logs, and then there is no query engine. The way you answer all these questions is you write a huge parallel job that reads the files into memory, field by field, writing imperative custom code for every such query, then running it on a huge parallel job and aggregating the results manually. It's a huge amount of work for what should just be a SQL query or something like that.
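For contrast, a sketch of what that question could look like if the recordings were flattened into a columnar table, here using DuckDB over a hypothetical Parquet file; every column name is an assumption:

```python
import duckdb

# Hypothetical layout: one row per recorded episode, with metadata
# and extracted signals flattened into columns.
result = duckdb.sql("""
    SELECT episode_id, started_at
    FROM 'episodes.parquet'
    WHERE recorded_at_night
      AND NOT left_gripper_opened
      AND failure_type = 'B'
""").df()
print(result.head())
```

The point isn't the specific engine; it's that the question becomes a few lines of declarative filtering rather than a custom parallel job.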
I feel like I'm seeing a convergence, in some cases, of LLMs and embodied AI models. For example, one of our customers, Wayve, actually uses a language model to describe what the car is doing as it's driving around, which is really interesting. Do you see other things like that happening?

Yeah, particularly for data annotation, that's a really, really big one. I see a lot of teams train a model, or use an open model, to annotate their robotics recordings: here's where the task started or stopped, this is the task it's doing right now, things like that. And obviously computing embeddings for images, motion, and so on, for search and curation. Both of those are super common as well.

One thing I think about for Weights & Biases is that we see more and more agents interacting with our product, and we've been pretty passionate about doing really good visualizations. But I kind of wonder: a lot of people now just pull the data out of Weights & Biases and use Streamlit or Marimo or something to conjure their own custom visualizations. And I've been thinking, do visualizations even really matter in, like, 2027? Do you have a perspective on that?

I think it will continue to matter, but potentially it will look different: people will do more and more custom visualization on top. We actually see that happening quite a bit. You use Codex or Claude or whatever: oh, I need this special-case little visualization, so you kick off an agent and it builds your specialized tool. So for SaaS-style applications that are essentially one very dedicated tool like that, I would imagine it gets harder and harder. The underlying systems, which need to be very high-scale and robust, I would imagine have some more time at least.

Okay. This is another one where I feel like I'm just fighting my own demons here, but are there any trade-offs for you between flexibility for people who are just setting things up and messing around, versus real scalability at production volumes?

I don't know that it's exactly scalability at production volumes versus flexibility, to be honest. I think we kind of solved that; it took like two years of iterating on this data model, and I think we got a really good design that can fit both. Of course, we don't get all the way. For instance, the way our system is designed, our logging will never be useful for really low-latency teleoperation; it's just not built for that. So we won't get all the way there.

But maybe more important is usability. We want a thing that's flexible enough for researchers to use, and one of the things that goes into that is we don't want to require you to compile a schema before you can log something. In production use cases, though, systems built around a pre-compiled, known schema, where exactly what's going to come is fixed up front, can fit a bit more naturally into those environments, and I guess that gives a little performance, although you can work around those performance problems. So it's low friction for a researcher versus "hey, I'm plugging this into my already-existing schemas and production system." That side has taken us a lot more work, and we still have a little left to go on that journey to really cleanly integrate with robot production systems. That's been a big trade-off for us for sure.
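As an illustration of the schema-free side of that trade-off, a minimal sketch using Rerun's Python SDK, where nothing is declared before logging; the entity paths and values are made up, and the archetype names track recent SDK versions, so treat the details as illustrative:

```python
import numpy as np
import rerun as rr

rr.init("episode_debug", spawn=True)  # spawn a local viewer

# No pre-compiled schema: entities and components simply appear
# in the viewer as they are logged, whatever shape they have.
rr.log("camera/image", rr.Image(np.zeros((480, 640, 3), dtype=np.uint8)))
rr.log("world/points", rr.Points3D(np.random.rand(100, 3)))
rr.log("gripper/width", rr.Scalar(0.042))
```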
What about open source: do you get a lot of open source contributions?

We do, but I'd call it a medium amount.

Do you think it's worth the overhead of managing it?

I think it is for us, but not if it were just for the sake of the contributions; we need to review pull requests and so on, and for that purpose alone, no, not really. For us, open source gives a lot of trust, so it's very easy to adopt, and Rerun gets built into a lot of other open source tools. One of the most impactful open source projects in robotics learning has been Hugging Face's LeRobot project, and they built Rerun into that for the visualization; that wouldn't have happened otherwise. And we're built into NVIDIA's new simulator engine as well. Those kinds of things matter a lot. In terms of contributions, it's more that some company or some person really loves Rerun, they have some little thing they're annoyed about, and there's a way for them to fix it. Practically it might have been easier if they had just asked for it and we did it. But that kind of participation builds a relationship with people, and I think that adds up to more than the particular thing.

Awesome. All right, well, thanks again.

Yeah, thanks.

Thanks so much for listening to this episode of Gradient Dissent. Please stay tuned for future episodes.
