NTC Noise Challenges and Our Decoupled Three-Phase Paradigm
(a) illustrates the semantic ambiguity of noise in NTC. (b) illustrates the vicious cycle of self-dependency caused by unreliable noise determination. (c) introduces our proposed “Expert-Proxy-Diversion” three-phase learning framework. Figure best viewed in color.
The proposed Air-Know consists of three primary modules: (a) External Prior Arbitration leverages an offline multimodal expert to generate reliable arbitration priors for CIR triplets, bypassing the unreliable small-loss hypothesis. (b) Expert-Knowledge Internalization transfers these priors into a lightweight proxy network, structurally preventing the memorization of ambiguous partial matches. Finally, (c) Dual-Stream Reconciliation dynamically integrates the internalized knowledge to provide robust online feedback, guiding the final representation learning. Figure best viewed in color.
Experiment
Performance comparison on the FashionIQ validation set in terms of R@K (%). The best and second-best results are highlighted in bold and underline, respectively.
Performance comparison on the CIRR test set in terms of R@K (%) and Rsub@K (%). The best and second-best results are highlighted in bold and underline, respectively.
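For readers unfamiliar with the metrics named in these captions, the following is a minimal sketch of how R@K and Rsub@K are commonly computed. It is an illustration, not the authors' evaluation code: the function names, the toy rankings, and the candidate subsets are all hypothetical. R@K is taken here as the percentage of queries whose target appears among the top K retrieved candidates, and Rsub@K as the same quantity after each ranking is restricted to a per-query candidate subset.

```python
def recall_at_k(ranked_ids, target_ids, k):
    """Percentage of queries whose target is among the top-k ranked candidates."""
    hits = sum(
        1 for ranked, target in zip(ranked_ids, target_ids) if target in ranked[:k]
    )
    return 100.0 * hits / len(target_ids)


def recall_subset_at_k(ranked_ids, target_ids, subsets, k):
    """Recall@k after restricting each ranking to that query's candidate subset."""
    filtered = [
        [cand for cand in ranked if cand in subset]  # keeps the original rank order
        for ranked, subset in zip(ranked_ids, subsets)
    ]
    return recall_at_k(filtered, target_ids, k)


# Hypothetical toy data: three queries over four gallery images.
rankings = [["a", "b", "c", "d"], ["c", "a", "d", "b"], ["d", "c", "b", "a"]]
targets = ["b", "d", "a"]
subsets = [{"b", "c"}, {"a", "d"}, {"a", "b"}]

r1 = recall_at_k(rankings, targets, 1)
r2 = recall_at_k(rankings, targets, 2)
rsub1 = recall_subset_at_k(rankings, targets, subsets, 1)
print(f"R@1={r1:.2f}  R@2={r2:.2f}  Rsub@1={rsub1:.2f}")
```

In this toy case no target is ranked first over the full gallery, while restricting the first query to its subset moves its target to rank one, which is why the subset metric can exceed the full-gallery metric at the same K.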
Ablation study on the FashionIQ and CIRR datasets. The best and second-best results are highlighted in bold and underline, respectively.
Sensitivity to the hyperparameters (a) p and (b) λ.