OMCL: Open-vocabulary Monte Carlo Localization
This paper presents a novel approach to robot localization using open-vocabulary vision-language features, enabling robust association of sensor observations with map elements across different modalities.
Why it matters
OMCL makes localization robust when the robot's sensors differ from the modality used to build the map, and it allows a robot to be localized from a natural-language description of its surroundings, broadening where and how Monte Carlo Localization can be deployed.
Key Points
- Extends Monte Carlo Localization with open-vocabulary vision-language features
- Enables robust association of visual observations with 3D map elements
- Allows global localization initialization using natural language descriptions
- Evaluated on indoor (Matterport3D, Replica) and outdoor (SemanticKITTI) datasets
Details
The paper presents OMCL, an Open-vocabulary Monte Carlo Localization approach that leverages vision-language features to enable robust robot localization in environments where the map was created from different sensor modalities. Traditional localization methods struggle when the robot's sensor data (e.g., camera images) cannot be directly matched to the map representation (e.g., 3D point clouds). OMCL addresses this by using abstract vision-language features that can bridge the gap between observations and map elements, allowing robust likelihood computation. This enables global localization to be initialized using natural language descriptions of the environment. The authors evaluate OMCL on indoor (Matterport3D, Replica) and outdoor (SemanticKITTI) datasets, demonstrating the approach's generalization capabilities.
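To make the idea concrete, below is a minimal sketch of how a particle filter could score hypotheses with vision-language features and be initialized from a text query. This is not the paper's implementation: the function names, the nearest-neighbor radius, the softmax temperature, and the Gaussian spread around matched map elements are all illustrative assumptions; the observation, map, and text embeddings are assumed to come from a shared CLIP-like encoder outside this snippet.

```python
# Sketch of open-vocabulary measurement updates for Monte Carlo Localization.
# All names and hyperparameters are illustrative assumptions, not from the paper.

import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between one feature vector and a batch of vectors."""
    a = a / (np.linalg.norm(a) + 1e-8)
    b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + 1e-8)
    return b @ a

def observation_likelihood(particle_pose, observation_feature,
                           map_points, map_features, radius=3.0, temp=0.1):
    """Score a particle by how well the observed vision-language feature
    matches the features of map elements near the hypothesized pose."""
    x, y, _theta = particle_pose
    dists = np.linalg.norm(map_points[:, :2] - np.array([x, y]), axis=1)
    nearby = dists < radius
    if not np.any(nearby):
        return 1e-6  # no map support near this hypothesis
    sims = cosine_similarity(observation_feature, map_features[nearby])
    # Higher similarity to nearby map elements -> more plausible pose.
    return float(np.exp(sims.max() / temp))

def resample(particles, weights, rng):
    """Low-variance (systematic) resampling of the particle set."""
    n = len(particles)
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights / weights.sum())
    return particles[np.searchsorted(cumulative, positions)]

def init_from_text(text_feature, map_points, map_features, n_particles, rng):
    """Global localization from a natural-language description: sample
    particles around map elements whose features match the text embedding."""
    sims = cosine_similarity(text_feature, map_features)
    probs = np.exp(sims / 0.1)
    probs /= probs.sum()
    idx = rng.choice(len(map_points), size=n_particles, p=probs)
    xy = map_points[idx, :2] + rng.normal(scale=0.5, size=(n_particles, 2))
    theta = rng.uniform(-np.pi, np.pi, size=(n_particles, 1))
    return np.hstack([xy, theta])
```

The key departure from classical MCL is that the likelihood compares abstract feature vectors rather than raw sensor returns, so the observation (a camera image) and the map (a 3D point cloud with attached features) never need to share a modality, and the same similarity function lets a text description seed the initial particle distribution.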