Cross-Modal Contrastive Hashing Retrieval for Infrared Video and EEG
Abstract
1. Introduction
- Challenge 1: The semantic gap between IR video and EEG signals is large compared with other cross-modal retrieval tasks, which makes it difficult to capture consistent cross-modal semantics in retrieval.
- Challenge 2: Sleep data are large-scale (especially the IR video), requiring substantial storage for the gallery sets and considerable computing resources at inference time.
- To reduce the large cross-modal semantic gap, we design a contrastive learning method based on hard negative samples that pulls similar inter-modal representations closer together and pushes dissimilar ones apart (see the sketch after this list).
- To address the excessive storage cost of sleep data, we propose a novel contrastive hashing module that computes discriminative yet unique cross-modal binary hash codes.
- For evaluation, we collected large-scale synchronized IR video and EEG data from clinics. The results show that our proposed CCHR significantly outperforms current state-of-the-art cross-modal hashing retrieval methods.
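To make the contrastive objective concrete, below is a minimal PyTorch sketch of a symmetric cross-modal InfoNCE-style loss in which every mismatched video–EEG pair in the batch acts as a negative. The function name, the temperature value, and the uniform treatment of negatives are illustrative assumptions; the paper's exact formulation and hard-negative weighting (Equation (3)) may differ.

```python
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(video_emb, eeg_emb, temperature=0.1):
    """Symmetric InfoNCE over a batch of synchronized (IR video, EEG) pairs.

    video_emb, eeg_emb: (B, D) embeddings; row i of each tensor comes from
    the same sleep epoch, so the diagonal entries of the similarity matrix
    are positives and all off-diagonal entries act as negatives.
    """
    v = F.normalize(video_emb, dim=-1)
    e = F.normalize(eeg_emb, dim=-1)
    logits = v @ e.t() / temperature                      # (B, B) cosine similarities
    targets = torch.arange(v.size(0), device=v.device)
    # Pull matched pairs together and push mismatched pairs apart, in both
    # retrieval directions (video -> EEG and EEG -> video).
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```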
2. Related Works
2.1. Feature Representation for Video-EEG Retrieval
2.1.1. EEG
2.1.2. Video
2.2. Cross-Modal Contrastive Learning
2.3. Contrastive Learning for Cross-Modal Retrieval
2.4. Hashing Methods for Cross-Modal Retrieval
3. Materials and Methods
3.1. Research Materials
3.2. Overall Framework
- Cross-modal feature extraction module, which provides deep semantic representations for IR video and EEG signals via deep neural networks.
- Contrastive hashing module, which generates instance-level binary hash codes from the deep semantic features through cross-modal contrastive learning.
3.3. Cross-Modal Feature Encoders
3.4. Contrastive Hashing Module
3.4.1. Cross-Modal Contrastive Loss
3.4.2. Quantization Loss
3.4.3. Bit Balance Loss
Algorithm 1 Optimization Algorithm

Input: Training set X; hyperparameters.
Output: The weights of the IR video hashing network and the EEG hashing network; the weights of the IR video encoder and the EEG encoder (if the encoders are not frozen).

1: repeat
2: Randomly sample a batch of training data with pairwise synchronized IR sleep videos and EEG signals;
3: Compute the outputs of the IR sleep video encoder and the EEG encoder;
4: Compute the outputs of the two hashing networks;
5: Calculate the contrastive hashing loss according to Equation (3);
6: Calculate the quantization loss and the bit balance loss according to Equations (5) and (7), respectively;
7: Train the target model by optimizing the overall loss;
8: until a fixed number of iterations is reached
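The optimization loop of Algorithm 1 could be sketched as follows in PyTorch. It reuses the contrastive-loss sketch from the Introduction; the quantization and bit balance loss forms, and the weighting coefficients lambda_q and lambda_b, are hedged stand-ins for Equations (3), (5), and (7) rather than the paper's exact definitions.

```python
import torch

def train_cchr(video_encoder, eeg_encoder, video_hash_net, eeg_hash_net,
               loader, lambda_q=0.1, lambda_b=0.1, iterations=100, lr=1e-4,
               freeze_encoders=True):
    # Parameters to optimize: always the two hashing networks, plus the
    # encoders when they are not frozen (as in Algorithm 1's output).
    params = list(video_hash_net.parameters()) + list(eeg_hash_net.parameters())
    if not freeze_encoders:
        params += list(video_encoder.parameters()) + list(eeg_encoder.parameters())
    optim = torch.optim.Adam(params, lr=lr)
    for _ in range(iterations):                                 # step 8: fixed iteration budget
        for video, eeg in loader:                               # step 2: synchronized pairs
            f_v, f_e = video_encoder(video), eeg_encoder(eeg)   # step 3: deep features
            h_v, h_e = video_hash_net(f_v), eeg_hash_net(f_e)   # step 4: tanh codes in (-1, 1)
            l_con = cross_modal_contrastive_loss(h_v, h_e)      # step 5: stand-in for Eq. (3)
            # Step 6 stand-ins: push code entries toward ±1 (quantization)
            # and keep each bit balanced across the batch (bit balance).
            l_q = ((h_v.abs() - 1) ** 2).mean() + ((h_e.abs() - 1) ** 2).mean()
            l_b = (h_v.mean(0) ** 2).mean() + (h_e.mean(0) ** 2).mean()
            loss = l_con + lambda_q * l_q + lambda_b * l_b      # step 7: overall objective
            optim.zero_grad()
            loss.backward()
            optim.step()
```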
3.5. Network Details
4. Experiments
4.1. Dataset
4.2. Experiment Configurations
4.3. Evaluation Metric and Baselines
5. Results and Analysis
5.1. Results
5.2. Ablation Study
5.3. Analysis
6. Discussion
7. Conclusions
8. Patents
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
EEG | Electroencephalography
PSG | Polysomnography
OSA | Obstructive Sleep Apnoea
AASM | American Academy of Sleep Medicine
REM | Rapid Eye Movement
SVM | Support Vector Machines
RF | Random Forests
ECG | Electrocardiogram
EMG | Electromyogram
IR | Infrared
AHI | Apnea–Hypopnea Index
MAP | Mean Average Precision
SOTA | State of the art
SHHS | Sleep Heart Health Study
MASS | Montreal Archive of Sleep Studies
References
- Berry, R.B.; Budhiraja, R.; Gottlieb, D.J.; Gozal, D.; Iber, C.; Kapur, V.K.; Marcus, C.L.; Mehra, R.; Parthasarathy, S.; Quan, S.F.; et al. Rules for scoring respiratory events in sleep: Update of the 2007 AASM manual for the scoring of sleep and associated events: Deliberations of the sleep apnea definitions task force of the American Academy of Sleep Medicine. J. Clin. Sleep Med. 2012, 8, 597–619.
- Gottlieb, D.J.; Punjabi, N.M. Diagnosis and management of obstructive sleep apnea: A review. JAMA 2020, 323, 1389–1400.
- Supratak, A.; Dong, H.; Wu, C.; Guo, Y. DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG. IEEE Trans. Neural Syst. Rehabil. Eng. 2017, 25, 1998–2008.
- Supratak, A.; Guo, Y. TinySleepNet: An efficient deep learning model for sleep stage scoring based on raw single-channel EEG. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 20–24 July 2020; pp. 641–644.
- Eldele, E.; Chen, Z.; Liu, C.; Wu, M.; Kwoh, C.K.; Li, X.; Guan, C. An attention-based deep learning approach for sleep stage classification with single-channel EEG. IEEE Trans. Neural Syst. Rehabil. Eng. 2021, 29, 809–818.
- Wilde-Frenz, J.; Schulz, H. Rate and distribution of body movements during sleep in humans. Percept. Mot. Ski. 1983, 56, 275–283.
- Yu, B.; Wang, Y.; Niu, K.; Zeng, Y.; Gu, T.; Wang, L.; Guan, C.; Zhang, D. WiFi-Sleep: Sleep stage monitoring using commodity Wi-Fi devices. IEEE Internet Things J. 2021, 8, 13900–13913.
- Lee, J.; Hong, M.; Ryu, S. Sleep monitoring system using Kinect sensor. Int. J. Distrib. Sens. Netw. 2015, 2015, 1–9.
- Hoque, E.; Dickerson, R.F.; Stankovic, J.A. Monitoring body positions and movements during sleep using WISPs. In Proceedings of the Wireless Health 2010, WH 2010, San Diego, CA, USA, 5–7 October 2010; pp. 44–53.
- Della Monica, C.; Johnsen, S.; Atzori, G.; Groeger, J.A.; Dijk, D.J. Rapid eye movement sleep, sleep continuity and slow wave sleep as predictors of cognition, mood, and subjective sleep quality in healthy men and women, aged 20–84 years. Front. Psychiatry 2018, 9, 255.
- Stefani, A.; Högl, B. Diagnostic criteria, differential diagnosis, and treatment of minor motor activity and less well-known movement disorders of sleep. Curr. Treat. Options Neurol. 2019, 21, 1–14.
- Jia, Z.; Cai, X.; Jiao, Z. Multi-modal physiological signals based squeeze-and-excitation network with domain adversarial learning for sleep staging. IEEE Sens. J. 2022, 22, 3464–3471.
- Suykens, J.A.; Vandewalle, J. Least squares support vector machine classifiers. Neural Process. Lett. 1999, 9, 293–300.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Perslev, M.; Jensen, M.; Darkner, S.; Jennum, P.J.; Igel, C. U-Time: A fully convolutional network for time series segmentation applied to sleep staging. Adv. Neural Inf. Process. Syst. 2019, 32, 4415–4426.
- Jia, Z.; Cai, X.; Zheng, G.; Wang, J.; Lin, Y. SleepPrintNet: A multivariate multimodal neural network based on physiological time-series for automatic sleep staging. IEEE Trans. Artif. Intell. 2020, 1, 248–257.
- Phan, H.; Andreotti, F.; Cooray, N.; Chén, O.Y.; De Vos, M. SeqSleepNet: End-to-end hierarchical recurrent neural network for sequence-to-sequence automatic sleep staging. IEEE Trans. Neural Syst. Rehabil. Eng. 2019, 27, 400–410.
- Tsinalis, O.; Matthews, P.M.; Guo, Y.; Zafeiriou, S. Automatic sleep stage scoring with single-channel EEG using convolutional neural networks. arXiv 2016, arXiv:1610.01683.
- SM, I.N.; Zhu, X.; Chen, Y.; Chen, W. Sleep stage classification based on EEG, EOG, and CNN-GRU deep learning model. In Proceedings of the 2019 IEEE 10th International Conference on Awareness Science and Technology (iCAST), Morioka, Japan, 23–25 October 2019; pp. 1–7.
- Zhang, X.; Xu, M.; Li, Y.; Su, M.; Xu, Z.; Wang, C.; Kang, D.; Li, H.; Mu, X.; Ding, X.; et al. Automated multi-model deep neural network for sleep stage scoring with unfiltered clinical data. Sleep Breath. 2020, 24, 581–590.
- Guillot, A.; Thorey, V. RobustSleepNet: Transfer learning for automated sleep staging at scale. IEEE Trans. Neural Syst. Rehabil. Eng. 2021, 29, 1441–1451.
- Prabhakar, S.K.; Rajaguru, H.; Ryu, S.; Jeong, I.C.; Won, D.O. A Holistic Strategy for Classification of Sleep Stages with EEG. Sensors 2022, 22, 3557.
- Li, X.; Leung, F.H.; Su, S.; Ling, S.H. Sleep Apnea Detection Using Multi-Error-Reduction Classification System with Multiple Bio-Signals. Sensors 2022, 22, 5560.
- Mousavi, S.; Afghah, F.; Acharya, U.R. SleepEEGNet: Automated sleep stage scoring with sequence to sequence deep learning approach. PLoS ONE 2019, 14, e0216456.
- Phan, H.; Andreotti, F.; Cooray, N.; Chén, O.Y.; De Vos, M. Joint classification and prediction CNN framework for automatic sleep stage classification. IEEE Trans. Biomed. Eng. 2018, 66, 1285–1296.
- Jia, Z.; Lin, Y.; Wang, J.; Zhou, R.; Ning, X.; He, Y.; Zhao, Y. GraphSleepNet: Adaptive Spatial-Temporal Graph Convolutional Networks for Sleep Stage Classification. In Proceedings of the IJCAI, Online, 7–15 January 2021; pp. 1324–1330.
- Jia, Z.; Lin, Y.; Wang, J.; Wang, X.; Xie, P.; Zhang, Y. SalientSleepNet: Multimodal salient wave detection network for sleep staging. arXiv 2021, arXiv:2105.13864.
- Wang, H.; Schmid, C. Action recognition with improved trajectories. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 3551–3558.
- Scovanner, P.; Ali, S.; Shah, M. A 3-dimensional SIFT descriptor and its application to action recognition. In Proceedings of the 15th ACM International Conference on Multimedia, Augsburg, Germany, 25–29 September 2007; pp. 357–360.
- Klaser, A.; Marszałek, M.; Schmid, C. A spatio-temporal descriptor based on 3D-gradients. In Proceedings of the BMVC 2008, 19th British Machine Vision Conference, Leeds, UK, 1–4 September 2008; pp. 1–10.
- Feichtenhofer, C.; Fan, H.; Malik, J.; He, K. SlowFast networks for video recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6202–6211.
- Qiao, S.; Wang, R.; Shan, S.; Chen, X. Deep heterogeneous hashing for face video retrieval. IEEE Trans. Image Process. 2019, 29, 1299–1312.
- Hara, K.; Kataoka, H.; Satoh, Y. Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet? In Proceedings of the Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6546–6555.
- Tran, D.; Wang, H.; Torresani, L.; Ray, J.; LeCun, Y.; Paluri, M. A closer look at spatiotemporal convolutions for action recognition. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6450–6459.
- Aytar, Y.; Vondrick, C.; Torralba, A. SoundNet: Learning sound representations from unlabeled video. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 892–900.
- Owens, A.; Isola, P.; McDermott, J.; Torralba, A.; Adelson, E.H.; Freeman, W.T. Visually indicated sounds. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2405–2413.
- Arandjelovic, R.; Zisserman, A. Look, listen and learn. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 609–617.
- Wu, Y.; Zhu, L.; Jiang, L.; Yang, Y. Decoupled novel object captioner. In Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Republic of Korea, 22–26 October 2018; pp. 1029–1037.
- Owens, A.; Wu, J.; McDermott, J.H.; Freeman, W.T.; Torralba, A. Ambient sound provides supervision for visual learning. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 801–816.
- Wu, Y.; Jiang, L.; Yang, Y. Revisiting EmbodiedQA: A simple baseline and beyond. IEEE Trans. Image Process. 2020, 29, 3984–3992.
- Harwath, D.; Torralba, A.; Glass, J. Unsupervised learning of spoken language with visual context. Adv. Neural Inf. Process. Syst. 2016, 29, 3984–3992.
- Chen, M.; Xie, Y. Cross-Modal Reconstruction for Tactile Signal in Human–Robot Interaction. Sensors 2022, 22, 6517.
- Wu, Y.; Zhu, L.; Yan, Y.; Yang, Y. Dual attention matching for audio-visual event localization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6292–6300.
- Wu, Y.; Zhu, L.; Wang, X.; Yang, Y.; Wu, F. Learning to anticipate egocentric actions by imagination. IEEE Trans. Image Process. 2020, 30, 1143–1152.
- Li, W.; Gao, C.; Niu, G.; Xiao, X.; Liu, H.; Liu, J.; Wu, H.; Wang, H. UNIMO: Towards unified-modal understanding and generation via cross-modal contrastive learning. arXiv 2020, arXiv:2012.15409.
- Kim, D.; Tsai, Y.H.; Zhuang, B.; Yu, X.; Sclaroff, S.; Saenko, K.; Chandraker, M. Learning cross-modal contrastive features for video domain adaptation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 13618–13627.
- Zhang, H.; Koh, J.Y.; Baldridge, J.; Lee, H.; Yang, Y. Cross-modal contrastive learning for text-to-image generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 833–842.
- Zolfaghari, M.; Zhu, Y.; Gehler, P.; Brox, T. CrossCLR: Cross-modal contrastive learning for multi-modal video representations. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 1450–1459.
- Khosla, P.; Teterwak, P.; Wang, C.; Sarna, A.; Tian, Y.; Isola, P.; Maschinot, A.; Liu, C.; Krishnan, D. Supervised contrastive learning. Adv. Neural Inf. Process. Syst. 2020, 33, 18661–18673.
- Oord, A.v.d.; Li, Y.; Vinyals, O. Representation learning with contrastive predictive coding. arXiv 2018, arXiv:1807.03748.
- Mikriukov, G.; Ravanbakhsh, M.; Demir, B. Deep Unsupervised Contrastive Hashing for Large-Scale Cross-Modal Text-Image Retrieval in Remote Sensing. arXiv 2022, arXiv:2201.08125.
- Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G. A simple framework for contrastive learning of visual representations. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 13–18 July 2020; pp. 1597–1607.
- Cao, Y.; Long, M.; Wang, J.; Zhu, H. Correlation autoencoder hashing for supervised cross-modal search. In Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, New York, NY, USA, 6–9 June 2016; pp. 197–204.
- Xie, D.; Deng, C.; Li, C.; Liu, X.; Tao, D. Multi-task consistency-preserving adversarial hashing for cross-modal retrieval. IEEE Trans. Image Process. 2020, 29, 3626–3637.
- Liu, S.; Qian, S.; Guan, Y.; Zhan, J.; Ying, L. Joint-modal distribution-based similarity hashing for large-scale unsupervised deep cross-modal retrieval. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Xi’an, China, 25–30 July 2020; pp. 1379–1388.
- Su, S.; Zhong, Z.; Zhang, C. Deep joint-semantics reconstructing hashing for large-scale unsupervised cross-modal retrieval. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3027–3035.
- Shi, G.; Li, F.; Wu, L.; Chen, Y. Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval. Sensors 2022, 22, 2921.
- Jiang, Q.Y.; Li, W.J. Deep cross-modal hashing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3232–3240.
- Yang, E.; Deng, C.; Liu, W.; Liu, X.; Tao, D.; Gao, X. Pairwise relationship guided deep hashing for cross-modal retrieval. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
- Han, J.; Zhang, S.; Men, A.; Liu, Y.; Yao, Z.; Yan, Y.; Chen, Q. Seeing your sleep stage: Cross-modal distillation from EEG to infrared video. arXiv 2022, arXiv:2208.05814.
- Faghri, F.; Fleet, D.J.; Kiros, J.R.; Fidler, S. VSE++: Improving visual-semantic embeddings with hard negatives. arXiv 2017, arXiv:1707.05612.
- Li, K.; Zhang, Y.; Li, K.; Li, Y.; Fu, Y. Visual Semantic Reasoning for Image-Text Matching. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 4653–4661.
- Shen, F.; Shen, C.; Liu, W.; Tao Shen, H. Supervised discrete hashing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 37–45.
- Shen, F.; Zhou, X.; Yang, Y.; Song, J.; Shen, H.T.; Tao, D. A fast optimization method for general binary code learning. IEEE Trans. Image Process. 2016, 25, 5610–5621.
- Song, D.; Liu, W.; Ji, R.; Meyer, D.A.; Smith, J.R. Top rank supervised binary coding for visual search. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1922–1930.
- Quan, S.F.; Howard, B.V.; Iber, C.; Kiley, J.P.; Nieto, F.J.; O’Connor, G.T.; Rapoport, D.M.; Redline, S.; Robbins, J.; Samet, J.M.; et al. The Sleep Heart Health Study: Design, rationale, and methods. Sleep 1997, 20, 1077–1085.
- O’Reilly, C.; Gosselin, N.; Carrier, J.; Nielsen, T. Montreal Archive of Sleep Studies: An open-access resource for instrument benchmarking and exploratory research. J. Sleep Res. 2014, 23, 628–635.
- Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation 2000, 101, e215–e220.
- Qiu, Z.; Su, Q.; Ou, Z.; Yu, J.; Chen, C. Unsupervised hashing with contrastive information bottleneck. arXiv 2021, arXiv:2105.06138.
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 740–755.
- Wang, D.; Gao, X.; Wang, X.; He, L. Semantic topic multimodal hashing for cross-media retrieval. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015.
- Fu, C.; Wang, G.; Wu, X.; Zhang, Q.; He, R. Deep momentum uncertainty hashing. Pattern Recognit. 2022, 122, 108264.
- Abdi-Sargezeh, B.; Foodeh, R.; Shalchyan, V.; Daliri, M.R. EEG artifact rejection by extracting spatial and spatio-spectral common components. J. Neurosci. Methods 2021, 358, 109182.
Model | Block | conv1 | conv2_x (F, N) | conv3_x (F, N) | conv4_x (F, N) | conv5_x (F, N)
---|---|---|---|---|---|---
 | basic | 7 ∗ 7 ∗ 7, 64; temporal stride 1, spatial stride 2 | 64, 2 | 128, 2 | 256, 2 | 512, 2
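The table above summarizes an R3D basic-block backbone. As a hedged illustration (assuming an R3D-18-style network; the paper's exact input resolution, IR channel handling, and pretraining are not specified here), a comparable feature extractor can be instantiated with torchvision:

```python
import torch
from torchvision.models.video import r3d_18

# Hypothetical instantiation of an R3D-18-style backbone; the exact
# configuration used in the paper may differ.
backbone = r3d_18(weights=None)
backbone.fc = torch.nn.Identity()       # expose the 512-d pooled features
clip = torch.randn(2, 3, 16, 112, 112)  # (batch, channels, frames, height, width)
features = backbone(clip)               # -> shape (2, 512)
```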
Module | Layer | Activation | Output Size
---|---|---|---
IR video contrastive hashing network | fc1 | ReLU | 512
 | fc2 | ReLU | 4096
 | BN | / | /
 | fc3 | tanh | K
EEG contrastive hashing network | fc1 | ReLU | 512
 | fc2 | ReLU | 4096
 | BN | / | /
 | fc3 | tanh | K
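The layer specification above maps directly to a small PyTorch module. In the sketch below, in_dim (the encoder output dimension) and the helper name are assumptions, while the fc1/fc2/BN/fc3 structure follows the table:

```python
import torch.nn as nn

def make_hash_net(in_dim: int, code_len: int) -> nn.Sequential:
    """Contrastive hashing head following the fc1/fc2/BN/fc3 spec above.

    in_dim is the encoder feature dimension (an assumption; not given in
    the table); code_len is the hash code length K (16/32/64 here).
    """
    return nn.Sequential(
        nn.Linear(in_dim, 512), nn.ReLU(inplace=True),  # fc1 -> 512
        nn.Linear(512, 4096), nn.ReLU(inplace=True),    # fc2 -> 4096
        nn.BatchNorm1d(4096),                           # BN
        nn.Linear(4096, code_len), nn.Tanh(),           # fc3 -> K, bounded in (-1, 1)
    )
```

At test time, the tanh outputs are typically binarized with a sign function to obtain the K-bit codes used for retrieval.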
MAP results at hash code lengths B = 16, 32, and 64 bits:

Task | Method | B = 16 | B = 32 | B = 64
---|---|---|---|---
IR video query vs. EEG gallery | DCMH [71] | 0.446 | 0.467 | 0.510
 | PRDH [59] | 0.462 | 0.490 | 0.538
 | CPAH [54] | 0.497 | 0.511 | 0.559
 | DJSRH [56] | 0.485 | 0.510 | 0.557
 | JDSH [55] | 0.478 | 0.502 | 0.550
 | DUCH [51] | 0.508 | 0.522 | 0.574
 | CCHR (proposed) | 0.526 | 0.546 | 0.592
EEG query vs. IR video gallery | DCMH [71] | 0.401 | 0.421 | 0.451
 | PRDH [59] | 0.386 | 0.426 | 0.458
 | CPAH [54] | 0.447 | 0.460 | 0.492
 | DJSRH [56] | 0.485 | 0.491 | 0.521
 | JDSH [55] | 0.481 | 0.490 | 0.531
 | DUCH [51] | 0.490 | 0.499 | 0.538
 | CCHR (proposed) | 0.506 | 0.514 | 0.554
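The tables above and below report MAP over Hamming-ranked binary codes. A small NumPy sketch of this metric follows; the sign binarization and the same-label relevance criterion are assumptions about the evaluation protocol:

```python
import numpy as np

def mean_average_precision(query_codes, gallery_codes,
                           query_labels, gallery_labels, top_k=None):
    """MAP for binary-code retrieval (all inputs are NumPy arrays).

    Gallery items are ranked by Hamming distance to each query; an item
    counts as relevant when its label matches the query label.
    """
    q = np.sign(query_codes)   # binarize tanh outputs to {-1, +1}
    g = np.sign(gallery_codes)
    bits = q.shape[1]
    aps = []
    for i in range(q.shape[0]):
        # For ±1 codes: hamming = (bits - inner_product) / 2
        dist = 0.5 * (bits - q[i] @ g.T)
        order = np.argsort(dist)
        if top_k is not None:
            order = order[:top_k]
        rel = (gallery_labels[order] == query_labels[i]).astype(float)
        if rel.sum() == 0:
            aps.append(0.0)
            continue
        precision_at_k = np.cumsum(rel) / np.arange(1, rel.size + 1)
        aps.append((precision_at_k * rel).sum() / rel.sum())
    return float(np.mean(aps))
```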
Ablation study (MAP) at hash code lengths B = 16, 32, and 64 bits:

Task | Method | B = 16 | B = 32 | B = 64
---|---|---|---|---
IR video query vs. EEG gallery | CCHR w/o | 0.488 | 0.494 | 0.533
 | CCHR w/o | 0.512 | 0.520 | 0.575
 | CCHR | 0.526 | 0.546 | 0.592
EEG query vs. IR video gallery | CCHR w/o | 0.470 | 0.485 | 0.517
 | CCHR w/o | 0.493 | 0.502 | 0.537
 | CCHR | 0.506 | 0.514 | 0.554
Rank | Top 100 Retrieval Results | |||||||||
---|---|---|---|---|---|---|---|---|---|---|
1–10 | A-N3 | A-N3 | A-N3 | A-N3 | A-N2 | A-N3 | A-N3 | A-N2 | A-N3 | D-N3 |
11–20 | D-N3 | A-N3 | A-N2 | A-N3 | A-N3 | B-N3 | A-N3 | A-N3 | A-N2 | A-N2 |
21–30 | A-N3 | A-N3 | B-N2 | B-N2 | A-N3 | A-N3 | B-N3 | B-N3 | B-N3 | C-N2 |
31–40 | A-N3 | A-N3 | A-N3 | A-N2 | A-N2 | A-N2 | A-N3 | A-N3 | A-N3 | C-N3 |
41–50 | C-N3 | A-N2 | A-N3 | A-N3 | A-N3 | C-N2 | E-N3 | E-N3 | E-N3 | E-N3 |
51–60 | E-N3 | B-N3 | B-N2 | B-N3 | B-N3 | A-N2 | A-N2 | C-N3 | C-N3 | C-N3 |
61–70 | C-N3 | C-N3 | C-N3 | C-N3 | B-N3 | B-N3 | D-N3 | D-N3 | D-N3 | A-N2 |
71–80 | A-N2 | A-N2 | B-N3 | B-N3 | B-N3 | D-R | A-R | D-N1 | A-N1 | B-N2 |
81–90 | A-N3 | A-N3 | A-N3 | A-N3 | A-N3 | A-N3 | A-N2 | A-W | C-N3 | A-N2 |
91–100 | A-N2 | A-N2 | E-N3 | E-N3 | E-N3 | A-N1 | A-N1 | A-N2 | C-N2 | C-N2 |