[ICASSP'25] PAIR: Complementarity-guided Disentanglement for Composed Image Retrieval

1 School of Software, Shandong University
2 Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences)
3 School of Computer Science and Technology, Shandong University
4 School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)

*Corresponding author.

Abstract

Inter-modal incoherence & Intra-modal entanglement

An example of the CIR task and PAIR’s main idea.


Framework: comPlementArity-guided dIsentanglement netwoRk (PAIR)

Overall architecture of our proposed PAIR: (a) Coherence-incoherence Feature Extractor, (b) Complementarity-based Disentanglement, and (c) Asymmetric Feature Composition.


Experiment

Attention visualization for image coherent tokens on (a) CIRR and (b) FashionIQ.


Qualitative results of PAIR compared with the BLIP4CIR baseline.

BibTeX


@inproceedings{PAIR,
  title={PAIR: Complementarity-guided Disentanglement for Composed Image Retrieval},
  author={Fu, Zhiheng and Li, Zixu and Chen, Zhiwei and Wang, Chunxiao and Song, Xuemeng and Hu, Yupeng and Nie, Liqiang},
  booktitle={Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing},
  pages={1--5},
  year={2025},
  organization={IEEE}
}