LangHOPS

Language Grounded Hierarchical Open-Vocabulary Part Segmentation

¹INSAIT, Sofia University, ²ETH Zurich, ³TU Munich, ⁴Toyota Motor Europe

LangHOPS has been accepted to NeurIPS 2025.

Figure: LangHOPS jointly detects and segments hierarchical object–part instances via MLLM-based reasoning.

Abstract

We propose LangHOPS, the first Multimodal Large Language Model (MLLM)-based framework for open-vocabulary object–part instance segmentation. Given an image, LangHOPS jointly detects and segments hierarchical object and part instances from open-vocabulary candidate categories. Unlike prior approaches that rely on heuristic or learnable visual grouping, our approach grounds object–part hierarchies in language space. It integrates the MLLM into the object–part parsing pipeline to leverage its rich knowledge and reasoning capabilities and to link multi-granularity concepts within the hierarchies. We evaluate LangHOPS across multiple challenging scenarios, including in-domain and cross-dataset object–part instance segmentation and zero-shot semantic segmentation. LangHOPS achieves state-of-the-art results, surpassing previous methods by 5.5% Average Precision (AP) in-domain and 4.8% AP cross-dataset on the PartImageNet dataset, and by 2.5% mIoU on unseen object parts in ADE20K (zero-shot). Ablation studies further validate the effectiveness of the language-grounded hierarchy and the MLLM-driven part query refinement strategy. The code will be released here.
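As a rough illustration of what expressing an object–part hierarchy in language space might look like at the input level, the sketch below flattens a small candidate vocabulary into text phrases. This is not the LangHOPS implementation, which has not been released yet: the vocabulary, the `part_phrases` helper, and the "object's part" phrasing are all hypothetical choices made for this example.

```python
from typing import Dict, List

# Hypothetical open-vocabulary candidates: object category -> part categories.
# These names are illustrative only and are not taken from the paper.
HIERARCHY: Dict[str, List[str]] = {
    "dog": ["head", "torso", "leg", "tail"],
    "car": ["wheel", "door", "window"],
}


def part_phrases(hierarchy: Dict[str, List[str]]) -> List[str]:
    """Flatten an object-part hierarchy into text phrases.

    Each object yields its own name, followed by one phrase per part
    that keeps the parent object explicit (assumed "object's part" form).
    """
    phrases: List[str] = []
    for obj, parts in hierarchy.items():
        phrases.append(obj)  # object-level concept
        for part in parts:
            phrases.append(f"{obj}'s {part}")  # part-level concept tied to its object
    return phrases


phrases = part_phrases(HIERARCHY)
for phrase in phrases:
    print(phrase)
```

Keeping the parent object inside each part phrase is one simple way to disambiguate parts that share a name across objects, such as a dog's head versus a bird's head.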

Method

Coming soon!

Quantitative Results

Coming soon!

Qualitative Results

Coming soon!

Acknowledgements

This research was funded by Toyota Motor Europe and the Ministry of Education and Science of Bulgaria (support for INSAIT, part of the Bulgarian National Roadmap for Research Infrastructure).

BibTeX


@inproceedings{miao2025langhops,
  title={LangHOPS: Language Grounded Hierarchical Open-Vocabulary Part Segmentation},
  author={Yang Miao and Jan-Nico Zaech and Xi Wang and Fabien Despinoy and Danda Pani Paudel and Luc Van Gool},
  year={2025},
  booktitle={International Conference on Neural Information Processing Systems (NeurIPS)},
}