Looking Beyond the Window: Global-Local Aligned CLIP for Training-free Open-Vocabulary Semantic Segmentation
Published in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
We propose a training-free framework that aligns CLIP’s global context with local window features to improve open-vocabulary semantic segmentation, enabling dense predictions on unseen concepts without additional supervision.
Recommended citation: ByeongCheol Lee, Hyun Seok Seong, Sangeek Hyun, Gilhan Park, WonJun Moon, and Jae-Pil Heo. "Looking Beyond the Window: Global-Local Aligned CLIP for Training-free Open-Vocabulary Semantic Segmentation." In CVPR 2026.
