Abstract: Scene Graph Generation (SGG) encodes visual relationships between objects in images as graph structures. Thanks to the advances of Vision-Language Models (VLMs), the task of Open-Vocabulary ...