OMCL: Open-vocabulary Monte Carlo Localization
This paper presents a novel approach to robot localization using open-vocabulary vision-language features, enabling robust association of sensor observations with map elements across different modalities.
Why it matters
OMCL makes localization robust when the robot's sensors differ from the modality used to build the map, and it allows a robot to be localized from a natural-language description of its surroundings, broadening where and how Monte Carlo Localization can be deployed.
Key Points
- Extends Monte Carlo Localization with open-vocabulary vision-language features
- Enables robust association of visual observations with 3D map elements
- Allows global localization initialization using natural language descriptions
- Evaluated on indoor (Matterport3D, Replica) and outdoor (SemanticKITTI) datasets
Details
The paper presents OMCL, an Open-vocabulary Monte Carlo Localization approach that leverages vision-language features to enable robust robot localization in environments where the map was created from different sensor modalities. Traditional localization methods struggle when the robot's sensor data (e.g., camera images) cannot be directly matched to the map representation (e.g., 3D point clouds). OMCL addresses this by using abstract vision-language features that can bridge the gap between observations and map elements, allowing robust likelihood computation. This enables global localization to be initialized using natural language descriptions of the environment. The authors evaluate OMCL on indoor (Matterport3D, Replica) and outdoor (SemanticKITTI) datasets, demonstrating the approach's generalization capabilities.
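To make the idea concrete, below is a minimal sketch of how a particle filter could score hypotheses with vision-language features and be initialized from a text query. This is not the paper's implementation: the function names, the nearest-neighbor radius, the softmax temperature, and the Gaussian spread around matched map elements are all illustrative assumptions; the observation, map, and text embeddings are assumed to come from a shared CLIP-like encoder outside this snippet.

```python
# Sketch of open-vocabulary measurement updates for Monte Carlo Localization.
# All names and hyperparameters are illustrative assumptions, not from the paper.

import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between one feature vector and a batch of vectors."""
    a = a / (np.linalg.norm(a) + 1e-8)
    b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + 1e-8)
    return b @ a

def observation_likelihood(particle_pose, observation_feature,
                           map_points, map_features, radius=3.0, temp=0.1):
    """Score a particle by how well the observed vision-language feature
    matches the features of map elements near the hypothesized pose."""
    x, y, _theta = particle_pose
    dists = np.linalg.norm(map_points[:, :2] - np.array([x, y]), axis=1)
    nearby = dists < radius
    if not np.any(nearby):
        return 1e-6  # no map support near this hypothesis
    sims = cosine_similarity(observation_feature, map_features[nearby])
    # Higher similarity to nearby map elements -> more plausible pose.
    return float(np.exp(sims.max() / temp))

def resample(particles, weights, rng):
    """Low-variance (systematic) resampling of the particle set."""
    n = len(particles)
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights / weights.sum())
    return particles[np.searchsorted(cumulative, positions)]

def init_from_text(text_feature, map_points, map_features, n_particles, rng):
    """Global localization from a natural-language description: sample
    particles around map elements whose features match the text embedding."""
    sims = cosine_similarity(text_feature, map_features)
    probs = np.exp(sims / 0.1)
    probs /= probs.sum()
    idx = rng.choice(len(map_points), size=n_particles, p=probs)
    xy = map_points[idx, :2] + rng.normal(scale=0.5, size=(n_particles, 2))
    theta = rng.uniform(-np.pi, np.pi, size=(n_particles, 1))
    return np.hstack([xy, theta])
```

The key departure from classical MCL is that the likelihood compares abstract feature vectors rather than raw sensor returns, so the observation (a camera image) and the map (a 3D point cloud with attached features) never need to share a modality, and the same similarity function lets a text description seed the initial particle distribution.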