DCST: Dual Cross-Supervision for Transformer-based Unsupervised Domain Adaptation

Y Cheng, P Yao, L Xu, M Chen, P Liu, P Shao, S Shen, RX Xu
Neural Networks, 2025, Elsevier
Abstract
Unsupervised Domain Adaptation aims to leverage a source domain with ample labeled data to tackle tasks on an unlabeled target domain. This remains a significant challenge, particularly in scenarios with large disparities between the two domains. Prior methods often fall short in such cases because of noise from incorrect pseudo-labels and the limitations of handcrafted domain alignment rules. In this paper, we propose a novel method called DCST (Dual Cross-Supervision Transformer), which improves upon existing methods in two key aspects. First, a vision transformer is combined with a dual cross-supervision learning strategy to enforce consistency learning across domains. The network accomplishes domain-specific self-training and cross-domain feature alignment in an adaptive manner. Second, to cope with noise in challenging domains and reduce the risks of model collapse and overfitting, we propose a Domain Shift Filter. Specifically, this module allows the model to leverage the memory of source domain features to facilitate a smooth transition, and it improves the effectiveness of knowledge transfer between domains with large gaps. We conduct extensive experiments on four benchmark datasets and achieve the best classification results: 94.3% on Office-31, 86.0% on Office-Home, 89.3% on VisDA-2017, and 48.8% on DomainNet. Code is available at https://github.com/Yislight/DCST.
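To make the cross-supervision idea concrete, the following is a minimal sketch of one training step in which two classifier heads on a shared encoder supervise each other with confidence-filtered pseudo-labels on the unlabeled target domain. This is not the authors' released implementation (see the repository above): the encoder, the head names `head_a` and `head_b`, the confidence threshold, and the equal loss weighting are all illustrative assumptions.

```python
# Hypothetical sketch of dual cross-supervision with pseudo-label filtering.
# A toy MLP stands in for the paper's vision transformer backbone.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU())
head_a = nn.Linear(256, 10)   # hypothetical head trained with source labels + pseudo-labels from head_b
head_b = nn.Linear(256, 10)   # hypothetical head trained with source labels + pseudo-labels from head_a
params = list(encoder.parameters()) + list(head_a.parameters()) + list(head_b.parameters())
optimizer = torch.optim.SGD(params, lr=1e-3, momentum=0.9)

def train_step(x_src, y_src, x_tgt, threshold=0.9):
    """One step: supervised loss on the source domain + cross-supervision on the target domain."""
    f_src, f_tgt = encoder(x_src), encoder(x_tgt)

    # Supervised learning on the labeled source domain.
    loss_sup = F.cross_entropy(head_a(f_src), y_src) + F.cross_entropy(head_b(f_src), y_src)

    # Cross-supervision: each head is trained with confident pseudo-labels
    # produced by the other head on the unlabeled target domain.
    logits_a, logits_b = head_a(f_tgt), head_b(f_tgt)
    with torch.no_grad():
        conf_a, pseudo_a = logits_a.softmax(dim=1).max(dim=1)
        conf_b, pseudo_b = logits_b.softmax(dim=1).max(dim=1)
        mask_a = (conf_a >= threshold).float()   # drop low-confidence (likely noisy) pseudo-labels
        mask_b = (conf_b >= threshold).float()

    loss_cross = (F.cross_entropy(logits_b, pseudo_a, reduction="none") * mask_a).mean() \
               + (F.cross_entropy(logits_a, pseudo_b, reduction="none") * mask_b).mean()

    loss = loss_sup + loss_cross
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with random tensors standing in for source/target batches.
x_src, y_src = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
x_tgt = torch.randn(8, 3, 32, 32)
print(train_step(x_src, y_src, x_tgt))
```

The confidence mask plays the role that pseudo-label noise handling plays in the abstract: target samples on which a head is uncertain contribute nothing to the other head's loss, which limits the propagation of incorrect pseudo-labels.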