Qin Jin
2020 – today
- 2024
- [b1] Qin Jin: Robust Speaker Recognition. Karlsruhe University, Germany, 2024
- [j18] Yawen Zeng, Ning Han, Keyu Pan, Qin Jin: Temporally Language Grounding With Multi-Modal Multi-Prompt Tuning. IEEE Trans. Multim. 26: 3366-3377 (2024)
- [c170] Liang Zhang, Qin Jin, Haoyang Huang, Dongdong Zhang, Furu Wei: Respond in my Language: Mitigating Language Inconsistency in Response Generation based on Large Language Models. ACL (1) 2024: 4177-4192
- [c169] Dingyi Yang, Chunru Zhan, Ziheng Wang, Biao Wang, Tiezheng Ge, Bo Zheng, Qin Jin: Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline. ACL (1) 2024: 9479-9493
- [c168] Zihao Yue, Liang Zhang, Qin Jin: Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective. ACL (1) 2024: 11766-11781
- [c167] Tenggan Zhang, Xinjie Zhang, Jinming Zhao, Li Zhou, Qin Jin: ESCoT: Towards Interpretable Emotional Support Dialogue Systems. ACL (1) 2024: 13395-13412
- [c166] Fengyuan Zhang, Zhaopei Huang, Xinjie Zhang, Qin Jin: Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition. ICME 2024: 1-6
- [c165] Zhaopei Huang, Jinming Zhao, Qin Jin: ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains. IJCAI 2024: 6288-6296
- [c164] Yuting Mei, Linli Yao, Qin Jin: UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos. ICMR 2024: 1034-1042
- [c163] Linli Yao, Yuanmeng Zhang, Ziheng Wang, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Xu Sun, Qin Jin: Edit As You Wish: Video Caption Editing with Multi-grained User Control. ACM Multimedia 2024: 1924-1933
- [c162] Yang Du, Yuqi Liu, Qin Jin: Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval. ACM Multimedia 2024: 5260-5269
- [c161] Yuning Wu, Jiatong Shi, Yifeng Yu, Yuxun Tang, Tao Qian, Yueqian Lin, Jionghao Han, Xinyi Bai, Shinji Watanabe, Qin Jin: Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm. ACM Multimedia 2024: 11279-11281
- [i84] Jiatong Shi, Yueqian Lin, Xinyi Bai, Keyi Zhang, Yuning Wu, Yuxun Tang, Yifeng Yu, Qin Jin, Shinji Watanabe: Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and KiSing-v2. CoRR abs/2401.17619 (2024)
- [i83] Zihao Yue, Liang Zhang, Qin Jin: Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective. CoRR abs/2402.14545 (2024)
- [i82] Boshen Xu, Sipeng Zheng, Qin Jin: POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-View World. CoRR abs/2403.05856 (2024)
- [i81] Boshen Xu, Sipeng Zheng, Qin Jin: SPAFormer: Sequential 3D Part Assembly with Transformers. CoRR abs/2403.05874 (2024)
- [i80] Anwen Hu, Haiyang Xu, Jiabo Ye, Ming Yan, Liang Zhang, Bo Zhang, Chen Li, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou: mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding. CoRR abs/2403.12895 (2024)
- [i79] Zihao Yue, Yepeng Zhang, Ziheng Wang, Qin Jin: Movie101v2: Improved Movie Narration Benchmark. CoRR abs/2404.13370 (2024)
- [i78] Qingrong He, Kejun Lin, Shizhe Chen, Anwen Hu, Qin Jin: Think-Program-reCtify: 3D Situated Reasoning with Large Language Models. CoRR abs/2404.14705 (2024)
- [i77] Liang Zhang, Anwen Hu, Haiyang Xu, Ming Yan, Yichen Xu, Qin Jin, Ji Zhang, Fei Huang: TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning. CoRR abs/2404.16635 (2024)
- [i76] Zhaopei Huang, Jinming Zhao, Qin Jin: ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains. CoRR abs/2405.10860 (2024)
- [i75] Dingyi Yang, Chunru Zhan, Ziheng Wang, Biao Wang, Tiezheng Ge, Bo Zheng, Qin Jin: Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline. CoRR abs/2405.14040 (2024)
- [i74] Boshen Xu, Ziheng Wang, Yang Du, Zhinan Song, Sipeng Zheng, Qin Jin: EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions? CoRR abs/2405.17719 (2024)
- [i73] Xuankai Chang, Jiatong Shi, Jinchuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, Qin Jin: The Interspeech 2024 Challenge on Speech Processing Using Discrete Units. CoRR abs/2406.07725 (2024)
- [i72] Yuning Wu, Chunlei Zhang, Jiatong Shi, Yuxun Tang, Shan Yang, Qin Jin: TokSing: Singing Voice Synthesis based on Discrete Tokens. CoRR abs/2406.08416 (2024)
- [i71] Yuxun Tang, Yuning Wu, Jiatong Shi, Qin Jin: SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models. CoRR abs/2406.08905 (2024)
- [i70] Fengyuan Zhang, Zhaopei Huang, Xinjie Zhang, Qin Jin: Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition. CoRR abs/2406.08997 (2024)
- [i69] Yuxun Tang, Jiatong Shi, Yuning Wu, Qin Jin: SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction. CoRR abs/2406.10911 (2024)
- [i68] Tenggan Zhang, Xinjie Zhang, Jinming Zhao, Li Zhou, Qin Jin: ESCoT: Towards Interpretable Emotional Support Dialogue Systems. CoRR abs/2406.10960 (2024)
- [i67] Yuting Mei, Linli Yao, Qin Jin: UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos. CoRR abs/2406.16301 (2024)
- [i66] Ye Wang, Yuting Mei, Sipeng Zheng, Qin Jin: QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds. CoRR abs/2406.16578 (2024)
- [i65] Dingyi Yang, Qin Jin: What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation. CoRR abs/2408.14622 (2024)
- [i64] Anwen Hu, Haiyang Xu, Liang Zhang, Jiabo Ye, Ming Yan, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou: mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding. CoRR abs/2409.03420 (2024)
- [i63] Liangyu Chen, Zihao Yue, Boshen Xu, Qin Jin: Unveiling Visual Biases in Audio-Visual Localization Benchmarks. CoRR abs/2409.06709 (2024)
- [i62] Yuning Wu, Jiatong Shi, Yifeng Yu, Yuxun Tang, Tao Qian, Yueqian Lin, Jionghao Han, Xinyi Bai, Shinji Watanabe, Qin Jin: Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm. CoRR abs/2409.07226 (2024)
- [i61] Jiatong Shi, Jinchuan Tian, Yihan Wu, Jee-weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H. Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe: ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech. CoRR abs/2409.15897 (2024)
- [i60] Lei Sun, Jinming Zhao, Qin Jin: Revealing Personality Traits: A New Benchmark Dataset for Explainable Personality Recognition on Dialogues. CoRR abs/2409.19723 (2024)
- 2023
- [j17] Liang Zhang, Ludan Ruan, Anwen Hu, Qin Jin: Multimodal Pretraining from Monolingual to Multilingual. Mach. Intell. Res. 20(2): 220-232 (2023)
- [j16] Yun Zhang, Qi Lu, Qin Jin, Wanting Meng, Shuhu Yang, Shen Huang, Yanling Han, Zhonghua Hong, Zhansheng Chen, Weiliang Liu: Global Sea Surface Height Measurement From CYGNSS Based on Machine Learning. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 16: 841-852 (2023)
- [c160] Yuqi Liu, Luhui Xu, Pengfei Xiong, Qin Jin: Token Mixing: Parameter-Efficient Transfer Learning from Image-Language to Video-Language. AAAI 2023: 1781-1789
- [c159] Yawen Zeng, Qin Jin, Tengfei Bao, Wenfeng Li: Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval. AAAI 2023: 3376-3383
- [c158] Ludan Ruan, Anwen Hu, Yuqing Song, Liang Zhang, Sipeng Zheng, Qin Jin: Accommodating Audio Modality in CLIP for Multimodal Processing. AAAI 2023: 9641-9649
- [c157] Liang Zhang, Anwen Hu, Jing Zhang, Shuo Hu, Qin Jin: MPMQA: Multimodal Question Answering on Product Manuals. AAAI 2023: 13958-13966
- [c156] Tao Qian, Fan Lou, Jiatong Shi, Yuning Wu, Shuai Guo, Xiang Yin, Qin Jin: UniLG: A Unified Structure-aware Framework for Lyrics Generation. ACL (1) 2023: 983-1001
- [c155] Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin: InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation. ACL (1) 2023: 3171-3185
- [c154] Zihao Yue, Qi Zhang, Anwen Hu, Liang Zhang, Ziheng Wang, Qin Jin: Movie101: A New Movie Understanding Benchmark. ACL (1) 2023: 4669-4684
- [c153] Dingyi Yang, Qin Jin: Attractive Storyteller: Stylized Visual Storytelling with Unpaired Text. ACL (1) 2023: 11053-11066
- [c152] Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo: MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation. CVPR 2023: 10219-10228
- [c151] Sipeng Zheng, Boshen Xu, Qin Jin: Open-Category Human-Object Interaction Pre-training via Language Modeling Framework. CVPR 2023: 19392-19402
- [c150] Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Lin, Fei Huang: UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model. EMNLP (Findings) 2023: 2841-2858
- [c149] Yuning Wu, Jiatong Shi, Tao Qian, Dongji Gao, Qin Jin: Phoneix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation With Phoneme Distribution Predictor. ICASSP 2023: 1-5
- [c148] Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin: Explore and Tell: Embodied Visual Captioning in 3D Environments. ICCV 2023: 2482-2491
- [c147] Jieting Chen, Junkai Ding, Wenping Chen, Qin Jin: Knowledge Enhanced Model for Live Video Comment Generation. ICME 2023: 2267-2272
- [c146] Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu: TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World. ACM Multimedia 2023: 1303-1313
- [c145] Boshen Xu, Sipeng Zheng, Qin Jin: POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-view World. ACM Multimedia 2023: 2807-2816
- [c144] Dingyi Yang, Hongyu Chen, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Qin Jin: Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences. ACM Multimedia 2023: 5705-5715
- [c143] Yuchen Liu, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma, Qin Jin: Emotionally Situated Text-to-Speech Synthesis in User-Agent Conversation. ACM Multimedia 2023: 5966-5974
- [c142] Zihao Yue, Anwen Hu, Liang Zhang, Qin Jin: Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation. NeurIPS 2023
- [c141] Zhaopei Huang, Jinming Zhao, Qin Jin: Two-Stage Adaptation for Cross-Corpus Multimodal Emotion Recognition. NLPCC (2) 2023: 431-443
- [c140] Weijing Chen, Linli Yao, Qin Jin: Rethinking Benchmarks for Cross-modal Image-text Retrieval. SIGIR 2023: 1241-1251
- [c139] Linli Yao, Weijing Chen, Qin Jin: CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge. WWW 2023: 2392-2401
- [i59] Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu: TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat. CoRR abs/2301.05880 (2023)
- [i58] Ludan Ruan, Anwen Hu, Yuqing Song, Liang Zhang, Sipeng Zheng, Qin Jin: Accommodating Audio Modality in CLIP for Multimodal Processing. CoRR abs/2303.06591 (2023)
- [i57] Yuning Wu, Jiatong Shi, Tao Qian, Dongji Gao, Qin Jin: PHONEix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation with Phoneme Distribution Predictor. CoRR abs/2303.08607 (2023)
- [i56] Liang Zhang, Anwen Hu, Jing Zhang, Shuo Hu, Qin Jin: MPMQA: Multimodal Question Answering on Product Manuals. CoRR abs/2304.09660 (2023)
- [i55] Weijing Chen, Linli Yao, Qin Jin: Rethinking Benchmarks for Cross-modal Image-text Retrieval. CoRR abs/2304.10824 (2023)
- [i54] Jieting Chen, Junkai Ding, Wenping Chen, Qin Jin: Knowledge Enhanced Model for Live Video Comment Generation. CoRR abs/2304.14657 (2023)
- [i53] Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin: InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation. CoRR abs/2305.06002 (2023)
- [i52] Linli Yao, Yuanmeng Zhang, Ziheng Wang, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Qin Jin: Edit As You Wish: Video Description Editing with Multi-grained Commands. CoRR abs/2305.08389 (2023)
- [i51] Zihao Yue, Qi Zhang, Anwen Hu, Liang Zhang, Ziheng Wang, Qin Jin: Movie101: A New Movie Understanding Benchmark. CoRR abs/2305.12140 (2023)
- [i50] Zihao Yue, Anwen Hu, Liang Zhang, Qin Jin: Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation. CoRR abs/2306.13460 (2023)
- [i49] Qi Zhang, Sipeng Zheng, Qin Jin: No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection. CoRR abs/2307.10567 (2023)
- [i48] Dingyi Yang, Hongyu Chen, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Qin Jin: Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences. CoRR abs/2307.16399 (2023)
- [i47] Yuning Wu, Yifeng Yu, Jiatong Shi, Tao Qian, Qin Jin: A Systematic Exploration of Joint-training for Singing Voice Synthesis. CoRR abs/2308.02867 (2023)
- [i46] Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin: Explore and Tell: Embodied Visual Captioning in 3D Environments. CoRR abs/2308.10447 (2023)
- [i45] Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Alex Lin, Fei Huang: UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model. CoRR abs/2310.05126 (2023)
- 2022
- [j15] Ludan Ruan, Qin Jin: Survey: Transformer based video-language pre-training. AI Open 3: 1-13 (2022)
- [j14] Yuqing Song, Shizhe Chen, Qin Jin, Wei Luo, Jun Xie, Fei Huang: Enhancing Neural Machine Translation With Dual-Side Multimodal Awareness. IEEE Trans. Multim. 24: 3013-3024 (2022)
- [c138] Linli Yao, Weiying Wang, Qin Jin: Image Difference Captioning with Pre-training and Contrastive Learning. AAAI 2022: 3108-3116
- [c137] Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li: M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. ACL (1) 2022: 5699-5710
- [c136] Yuchen Liu, Jinming Zhao, Jingwen Hu, Ruichen Li, Qin Jin: DialogueEIN: Emotion Interaction Network for Dialogue Affective Analysis. COLING 2022: 684-693
- [c135] Liyu Meng, Yuchen Liu, Xiaolong Liu, Zhaopei Huang, Wenqiang Jiang, Tenggan Zhang, Chuanhe Liu, Qin Jin: Valence and Arousal Estimation based on Multimodal Temporal-Aware Features for Videos in the Wild. CVPR Workshops 2022: 2344-2351
- [c134] Sipeng Zheng, Shizhe Chen, Qin Jin: VRDFormer: End-to-End Video Visual Relation Detection with Transformers. CVPR 2022: 18814-18824
- [c133] Tenggan Zhang, Chuanhe Liu, Xiaolong Liu, Yuchen Liu, Liyu Meng, Lei Sun, Wenqiang Jiang, Fengyuan Zhang, Jinming Zhao, Qin Jin: Multi-Task Learning Framework for Emotion Recognition In-the-Wild. ECCV Workshops (6) 2022: 143-156
- [c132] Sipeng Zheng, Shizhe Chen, Qin Jin: Few-Shot Action Recognition with Hierarchical Matching and Contrastive Learning. ECCV (4) 2022: 297-313
- [c131] Yuqi Liu, Pengfei Xiong, Luhui Xu, Shengming Cao, Qin Jin: TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval. ECCV (14) 2022: 319-335
- [c130] Qi Zhang, Yuqing Song, Qin Jin: Unifying Event Detection and Captioning as Sequence Generation via Pre-training. ECCV (36) 2022: 363-379
- [c129] Qi Zhang, Zihao Yue, Anwen Hu, Ziheng Wang, Qin Jin: MovieUN: A Dataset for Movie Understanding and Narrating. EMNLP (Findings) 2022: 1873-1885
- [c128] Yuwen Chen, Jian Ma, Peihu Zhu, Xiaoming Huang, Qin Jin: Leveraging Trust Relations to Improve Academic Patent Recommendation. HICSS 2022: 1-10
- [c127] Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li: Memobert: Pre-Training Model with Prompt-Based Learning for Multimodal Emotion Recognition. ICASSP 2022: 4703-4707
- [c126] Tao Qian, Jiatong Shi, Shuai Guo, Peter Wu, Qin Jin: Training Strategies for Automatic Song Writing: A Unified Framework Perspective. ICASSP 2022: 4738-4742
- [c125] Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, Qin Jin: SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy. INTERSPEECH 2022: 4272-4276
- [c124] Jiatong Shi, Shuai Guo, Tao Qian, Tomoki Hayashi, Yuning Wu, Fangzheng Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin: Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis. INTERSPEECH 2022: 4277-4281
- [c123] Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni: M4MM '22: 1st International Workshop on Methodologies for Multimedia. ACM Multimedia 2022: 7394-7396
- [c122] Si Liu, Qin Jin, Luoqi Liu, Zongheng Tang, Linli Lin: PIC'22: 4th Person in Context Workshop. ACM Multimedia 2022: 7418-7419
- [c121] Liang Zhang, Anwen Hu, Qin Jin: Multi-Lingual Acquisition on Multimodal Pre-training for Cross-modal Retrieval. NeurIPS 2022
- [c120] Yida Zhao, Yuqing Song, Qin Jin: Progressive Learning for Image Retrieval with Hybrid-Modality Queries. SIGIR 2022: 1012-1021
- [e2] João Magalhães, Alberto Del Bimbo, Shin'ichi Satoh, Nicu Sebe, Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni: MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10 - 14, 2022. ACM 2022, ISBN 978-1-4503-9203-7 [contents]
- [i44] Linli Yao, Weiying Wang, Qin Jin: Image Difference Captioning with Pre-training and Contrastive Learning. CoRR abs/2202.04298 (2022)
- [i43] Liyu Meng, Yuchen Liu, Xiaolong Liu, Zhaopei Huang, Yuan Cheng, Meng Wang, Chuanhe Liu, Qin Jin: Multi-modal Emotion Estimation for in-the-wild Videos. CoRR abs/2203.13032 (2022)
- [i42] Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, Qin Jin: SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy. CoRR abs/2203.17001 (2022)
- [i41] Yida Zhao, Yuqing Song, Qin Jin: Progressive Learning for Image Retrieval with Hybrid-Modality Queries. CoRR abs/2204.11212 (2022)
- [i40] Jiatong Shi, Shuai Guo, Tao Qian, Nan Huo, Tomoki Hayashi, Yuning Wu, Frank Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin: Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis. CoRR abs/2205.04029 (2022)
- [i39] Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li: M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. CoRR abs/2205.10237 (2022)
- [i38] Liang Zhang, Anwen Hu, Qin Jin: Generalizing Multimodal Pre-training into Multilingual via Language Acquisition. CoRR abs/2206.11091 (2022)
- [i37] Yuqi Liu, Pengfei Xiong, Luhui Xu, Shengming Cao, Qin Jin: TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval. CoRR abs/2207.07852 (2022)
- [i36] Qi Zhang, Yuqing Song, Qin Jin: Unifying Event Detection and Captioning as Sequence Generation via Pre-Training. CoRR abs/2207.08625 (2022)
- [i35] Sipeng Zheng, Qi Zhang, Bei Liu, Qin Jin, Jianlong Fu: Exploring Anchor-based Detection for Ego4D Natural Language Query. CoRR abs/2208.05375 (2022)
- [i34] Linli Yao, Weijing Chen, Qin Jin: CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge. CoRR abs/2211.09371 (2022)
- [i33] Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo: MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation. CoRR abs/2212.09478 (2022)
- 2021
- [j13] Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Yuan Yao, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu: Pre-trained models: Past, present and future. AI Open 2: 225-250 (2021)
- [c119] Jinming Zhao, Ruichen Li, Qin Jin: Missing Modality Imagination Network for Emotion Recognition with Uncertain Missing Modalities. ACL/IJCNLP (1) 2021: 2608-2618
- [c118] Jingwen Hu, Yuchen Liu, Jinming Zhao, Qin Jin: MMGCN: Multimodal Fusion via Deep Graph Convolution Network for Emotion Recognition in Conversation. ACL/IJCNLP (1) 2021: 5666-5675
- [c117] Yuqing Song, Shizhe Chen, Qin Jin: Towards Diverse Paragraph Captioning for Untrimmed Videos. CVPR 2021: 11245-11254
- [c116] Jia Chen, Yike Wu, Shiwan Zhao, Qin Jin: Language Resource Efficient Learning for Captioning. EMNLP (Findings) 2021: 1887-1895
- [c115] Jiatong Shi, Shuai Guo, Nan Huo, Yuekai Zhang, Qin Jin: Sequence-To-Sequence Singing Voice Synthesis With Perceptual Entropy Loss. ICASSP 2021: 76-80
- [c114] Ruichen Li, Jinming Zhao, Qin Jin: Speech Emotion Recognition via Multi-Level Cross-Modal Distillation. Interspeech 2021: 4488-4492
- [c113] Bei Liu, Jianlong Fu, Shizhe Chen, Qin Jin, Alexander G. Hauptmann, Yong Rui: MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding. ICMR 2021: 694-695
- [c112] Tenggan Zhang, Zhaopei Huang, Ruichen Li, Jinming Zhao, Qin Jin: Multimodal Fusion Strategies for Physiological-emotion Analysis. MuSe @ ACM Multimedia 2021: 43-50
- [c111] Yuqing Song, Shizhe Chen, Qin Jin, Wei Luo, Jun Xie, Fei Huang: Product-oriented Machine Translation with Cross-modal Cross-lingual Pre-training. ACM Multimedia 2021: 2843-2852
- [c110] Anwen Hu, Shizhe Chen, Qin Jin: Question-controlled Text-aware Image Captioning. ACM Multimedia 2021: 3097-3105
- [c109] Ludan Ruan, Qin Jin: Efficient Proposal Generation with U-shaped Network for Temporal Sentence Grounding. MMAsia 2021: 26:1-26:7
- [e1] Bei Liu, Jianlong Fu, Shizhe Chen, Qin Jin, Alexander G. Hauptmann, Yong Rui: MMPT@ICMR2021: Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding, Taipei, Taiwan, August 21, 2021. ACM 2021, ISBN 978-1-4503-8530-5 [contents]
- [i32] Yuqi Huo, Manli Zhang, Guangzhen Liu, Haoyu Lu, Yizhao Gao, Guoxing Yang, Jingyuan Wen, Heng Zhang, Baogui Xu, Weihao Zheng, Zongzheng Xi, Yueqian Yang, Anwen Hu, Jinming Zhao, Ruichen Li, Yida Zhao, Liang Zhang, Yuqing Song, Xin Hong, Wanqing Cui, Dan Yang Hou, Yingyan Li, Junyi Li, Peiyu Liu, Zheng Gong, Chuhao Jin, Yuchong Sun, Shizhe Chen, Zhiwu Lu, Zhicheng Dou, Qin Jin, Yanyan Lan, Wayne Xin Zhao, Ruihua Song, Ji-Rong Wen: WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training. CoRR abs/2103.06561 (2021)
- [i31] Yuqing Song, Shizhe Chen, Qin Jin: Towards Diverse Paragraph Captioning for Untrimmed Videos. CoRR abs/2105.14477 (2021)
- [i30] Ludan Ruan, Jieting Chen, Yuqing Song, Shizhe Chen, Qin Jin: Team RUC_AIM3 Technical Report at ActivityNet 2021: Entities Object Localization. CoRR abs/2106.06138 (2021)
- [i29] Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu: Pre-Trained Models: Past, Present and Future. CoRR abs/2106.07139 (2021)
- [i28] Jingwen Hu, Yuchen Liu, Jinming Zhao, Qin Jin: MMGCN: Multimodal Fusion via Deep Graph Convolution Network for Emotion Recognition in Conversation. CoRR abs/2107.06779 (2021)
- [i27] Anwen Hu, Shizhe Chen, Qin Jin: ICECAP: Information Concentrated Entity-aware Image Captioning. CoRR abs/2108.02050 (2021)
- [i26] Anwen Hu, Shizhe Chen, Qin Jin: Question-controlled Text-aware Image Captioning. CoRR abs/2108.02059 (2021)
- [i25] Yuqing Song, Shizhe Chen, Qin Jin, Wei Luo, Jun Xie, Fei Huang: Product-oriented Machine Translation with Cross-modal Cross-lingual Pre-training. CoRR abs/2108.11119 (2021)
- [i24] Ludan Ruan, Qin Jin: Survey: Transformer based Video-Language Pre-training. CoRR abs/2109.09920 (2021)
- [i23] Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li: MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition. CoRR abs/2111.00865 (2021)
- 2020
- [c108] Shizhe Chen, Qin Jin, Peng Wang, Qi Wu: Say As You Wish: Fine-Grained Control of Image Caption Generation With Abstract Scene Graphs. CVPR 2020: 9959-9968
- [c107] Shizhe Chen, Yida Zhao, Qin Jin, Qi Wu: Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning. CVPR 2020: 10635-10644
- [c106] Jia Chen, Qin Jin: Better Captioning With Sequence-Level Exploration. CVPR 2020: 10887-10896
- [c105] Sipeng Zheng, Shizhe Chen, Qin Jin: Skeleton-Based Interactive Graph Network For Human Object Interaction Detection. ICME 2020: 1-6
- [c104] Jiatong Shi, Nan Huo, Qin Jin: Context-Aware Goodness of Pronunciation for Computer-Assisted Pronunciation Training. INTERSPEECH 2020: 3057-3061
- [c103] Ruichen Li, Jinming Zhao, Jingwen Hu, Shuai Guo, Qin Jin: Multi-modal Fusion for Video Sentiment Analysis. MuSe @ ACM Multimedia 2020: 19-25
- [c102] Weiying Wang, Jieting Chen, Qin Jin: VideoIC: A Video Interactive Comments Dataset and Multimodal Multitask Learning for Comments Generation. ACM Multimedia 2020: 2599-2607
- [c101] Jingjun Liang, Ruichen Li, Qin Jin: Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching. ACM Multimedia 2020: 2852-2861
- [c100] Anwen Hu, Shizhe Chen, Qin Jin: ICECAP: Information Concentrated Entity-aware Image Captioning. ACM Multimedia 2020: 4217-4225
- [c99] Yida Zhao, Yuqing Song, Shizhe Chen, Qin Jin: RUC_AIM3 at TRECVID 2020: Ad-hoc Video Search & Video to Text Description. TRECVID 2020
- [i22] Shizhe Chen, Qin Jin, Peng Wang, Qi Wu: Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs. CoRR abs/2003.00387 (2020)
- [i21] Shizhe Chen, Yida Zhao, Qin Jin, Qi Wu: Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning. CoRR abs/2003.00392 (2020)
- [i20] Jia Chen, Qin Jin: Better Captioning with Sequence-Level Exploration. CoRR abs/2003.03749 (2020)
- [i19] Shizhe Chen, Weiying Wang, Ludan Ruan, Linli Yao, Qin Jin: YouMakeup VQA Challenge: Towards Fine-grained Action Understanding in Domain-Specific Videos. CoRR abs/2004.05573 (2020)
- [i18] Yuqing Song, Shizhe Chen, Yida Zhao, Qin Jin: Team RUC_AIM3 Technical Report at Activitynet 2020 Task 2: Exploring Sequential Events Detection for Dense Video Captioning. CoRR abs/2006.07896 (2020)
- [i17] Samuel Albanie, Yang Liu, Arsha Nagrani, Antoine Miech, Ernesto Coto, Ivan Laptev, Rahul Sukthankar, Bernard Ghanem, Andrew Zisserman, Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid, Shizhe Chen, Yida Zhao, Qin Jin, Kaixu Cui, Hui Liu, Chen Wang, Yudong Jiang, Xiaoshuai Hao: The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020). CoRR abs/2008.00744 (2020)
- [i16] Jiatong Shi, Nan Huo, Qin Jin: Context-aware Goodness of Pronunciation for Computer-Assisted Pronunciation Training. CoRR abs/2008.08647 (2020)
- [i15] Jingjun Liang, Ruichen Li, Qin Jin: Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching. CoRR abs/2009.02598 (2020)
- [i14] Jiatong Shi, Shuai Guo, Nan Huo, Yuekai Zhang, Qin Jin: Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy Loss. CoRR abs/2010.12024 (2020)
2010 – 2019
- 2019
- [j12] Shizhe Chen, Qin Jin, Jia Chen, Alexander G. Hauptmann: Generating Video Descriptions With Latent Topic Guidance. IEEE Trans. Multim. 21(9): 2407-2418 (2019)
- [c98] Shizhe Chen, Qin Jin, Alexander G. Hauptmann: Unsupervised Bilingual Lexicon Induction from Mono-Lingual Multimodal Data. AAAI 2019: 8207-8214
- [c97] Jingjun Liang, Shizhe Chen, Qin Jin: Semi-supervised Multimodal Emotion Recognition with Improved Wasserstein GANs. APSIPA 2019: 695-703
- [c96] Weiying Wang, Yongcheng Wang, Shizhe Chen, Qin Jin: YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension. EMNLP/IJCNLP (1) 2019: 5132-5142
- [c95] Jingjun Liang, Shizhe Chen, Jinming Zhao, Qin Jin, Haibo Liu, Li Lu: Cross-culture Multimodal Emotion Recognition with Adversarial Learning. ICASSP 2019: 4000-4004
- [c94] Shizhe Chen, Qin Jin, Jianlong Fu: From Words to Sentences: A Progressive Learning Approach for Zero-resource Machine Translation with Visual Pivots. IJCAI 2019: 4932-4938
- [c93] Jinming Zhao, Shizhe Chen, Jingjun Liang, Qin Jin: Speech Emotion Recognition in Dyadic Dialogues with Attentive Interaction Modeling. INTERSPEECH 2019: 1671-1675
- [c92] Shuai Wang, Linli Yao, Jieting Chen, Qin Jin: RUC at MediaEval 2019: Video Memorability Prediction Based on Visual Textual and Concept Related Features. MediaEval 2019
- [c91] Jinming Zhao, Ruichen Li, Jingjun Liang, Shizhe Chen, Qin Jin: Adversarial Domain Adaption for Multi-Cultural Dimensional Emotion Recognition in Dyadic Interactions. AVEC@MM 2019: 37-45
- [c90] Sipeng Zheng, Shizhe Chen, Qin Jin: Visual Relation Detection with Multi-Level Attention. ACM Multimedia 2019: 121-129
- [c89] Yuqing Song, Shizhe Chen, Yida Zhao, Qin Jin: Unpaired Cross-lingual Image Caption Generation with Self-Supervised Rewards. ACM Multimedia 2019: 784-792
- [c88] Shizhe Chen, Bei Liu, Jianlong Fu, Ruihua Song, Qin Jin, Pingping Lin, Xiaoyu Qi, Chunting Wang, Jin Zhou: Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences. ACM Multimedia 2019: 2236-2244
- [c87] Sipeng Zheng, Xiangyu Chen, Shizhe Chen, Qin Jin: Relation Understanding in Videos. ACM Multimedia 2019: 2662-2666
- [c86] Yuqing Song, Yida Zhao, Shizhe Chen, Qin Jin: RUC_AIM3 at TRECVID 2019: Video to Text. TRECVID 2019
- [i13] Shizhe Chen, Qin Jin, Alexander G. Hauptmann: Unsupervised Bilingual Lexicon Induction from Mono-lingual Multimodal Data. CoRR abs/1906.00378 (2019)
- [i12] Shizhe Chen, Qin Jin, Jianlong Fu: From Words to Sentences: A Progressive Learning Approach for Zero-resource Machine Translation with Visual Pivots. CoRR abs/1906.00872 (2019)
- [i11] Shizhe Chen, Yuqing Song, Yida Zhao, Qin Jin, Zhaoyang Zeng, Bei Liu, Jianlong Fu, Alexander G. Hauptmann: Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos. CoRR abs/1907.05092 (2019)
- [i10] Yuqing Song, Shizhe Chen, Yida Zhao, Qin Jin: Unpaired Cross-lingual Image Caption Generation with Self-Supervised Rewards. CoRR abs/1908.05407 (2019)
- [i9] Shizhe Chen, Yida Zhao, Yuqing Song, Qin Jin, Qi Wu: Integrating Temporal and Spatial Attentions for VATEX Video Captioning Challenge 2019. CoRR abs/1910.06737 (2019)
- [i8] Shizhe Chen, Bei Liu, Jianlong Fu, Ruihua Song, Qin Jin, Pingping Lin, Xiaoyu Qi, Chunting Wang, Jin Zhou: Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences. CoRR abs/1911.10460 (2019)
- 2018
- [c85] Shuai Wang, Weiying Wang, Shizhe Chen, Qin Jin: RUC at MediaEval 2018: Visual and Textual Features Exploration for Predicting Media Memorability. MediaEval 2018
- [c84] Shizhe Chen, Jia Chen, Qin Jin, Alexander G. Hauptmann: Class-aware Self-Attention for Audio Event Recognition. ICMR 2018: 28-36
- [c83] Qin Jin: Session details: Deep-2 (Recognition). ACM Multimedia 2018
- [c82] Jinming Zhao, Ruichen Li, Shizhe Chen, Qin Jin: Multi-modal Multi-cultural Dimensional Continues Emotion Recognition in Dyadic Interactions. AVEC@MM 2018: 65-72
- [c81] Xiaozhu Lin, Qin Jin, Shizhe Chen, Yuqing Song, Yida Zhao: iMakeup: Makeup Instructional Video Dataset for Fine-Grained Dense Video Captioning. PCM (3) 2018: 78-88
- [c80] Jinming Zhao, Shizhe Chen, Qin Jin: Multimodal Dimensional and Continuous Emotion Recognition in Dyadic Video Interactions. PCM (1) 2018: 301-312
- [c79] Jia Chen, Shizhe Chen, Qin Jin, Alexander G. Hauptmann, Po-Yao Huang, Junwei Liang, Vaibhav, Xiaojun Chang, Jiang Liu, Ting-Yao Hu, Wenhe Liu, Wei Ke, Wayner Barrios, Haroon Idrees, Donghyun Yoo, Yaser Sheikh, Ruslan Salakhutdinov, Kris Kitani, Dong Huang: Informedia @ TRECVID 2018: Ad-hoc Video Search, Video to Text Description, Activities in Extended video. TRECVID 2018
- [i7] Shizhe Chen, Yuqing Song, Yida Zhao, Jiarong Qiu, Qin Jin, Alexander G. Hauptmann: RUC+CMU: System Report for Dense Captioning Events in Videos. CoRR abs/1806.08854 (2018)
- 2017
- [j11]Mengying Zhang, Qin Jin, Hongwei Liu:
Group division based on common weights in cross efficiency evaluation. Int. J. Inf. Decis. Sci. 9(3): 209-223 (2017) - [c78]Xinrui Li, Shizhe Chen, Qin Jin:
Facial Action Units Detection with Multi-Features and -AUs Fusion. FG 2017: 860-865 - [c77]Shuai Wang, Wenxuan Wang, Jinming Zhao, Shizhe Chen, Qin Jin, Shilei Zhang, Yong Qin:
Emotion recognition with multimodal features and temporal models. ICMI 2017: 598-602 - [c76]Shuai Wang, Shizhe Chen, Jinming Zhao, Wenxuan Wang, Qin Jin:
RUC at MediaEval 2017: Predicting Media Interestingness Task. MediaEval 2017 - [c75]Shizhe Chen, Jia Chen, Qin Jin:
Generating Video Descriptions with Topic Guidance. ICMR 2017: 5-13 - [c74]Shizhe Chen, Qin Jin, Jinming Zhao, Shuai Wang:
Multimodal Multi-task Learning for Dimensional and Continuous Emotion Recognition. AVEC@ACM Multimedia 2017: 19-26 - [c73]Shizhe Chen, Jia Chen, Qin Jin, Alexander G. Hauptmann:
Video Captioning with Guidance of Multimodal Latent Topics. ACM Multimedia 2017: 1838-1846 - [c72]Qin Jin, Shizhe Chen, Jia Chen, Alexander G. Hauptmann:
Knowing Yourself: Improving Video Caption via In-depth Recap. ACM Multimedia 2017: 1906-1911 - [c71]Jia Chen, Junwei Liang, Jiang Liu, Shizhe Chen, Chenqiang Gao, Qin Jin, Alexander G. Hauptmann:
Informedia @ TRECVID 2017. TRECVID 2017 - [i6]Shizhe Chen, Jia Chen, Qin Jin:
Generating Video Descriptions with Topic Guidance. CoRR abs/1708.09666 (2017) - [i5]Shizhe Chen, Jia Chen, Qin Jin, Alexander G. Hauptmann:
Video Captioning with Guidance of Multimodal Latent Topics. CoRR abs/1708.09667 (2017) - [i4]Shizhe Chen, Qin Jin:
Multi-modal Conditional Attention Fusion for Dimensional Emotion Prediction. CoRR abs/1709.02251 (2017) - 2016
- [j10]Gang Yang, Shaohui Wu, Qin Jin, Jieping Xu:
A hybrid approach based on stochastic competitive Hopfield neural network and efficient genetic algorithm for frequency assignment problem. Appl. Soft Comput. 39: 104-116 (2016) - [j9]Mengying Zhang, Qin Jin:
Coordinate the Express Delivery Supply Chain with Option Contracts. Int. J. Inf. Syst. Supply Chain Manag. 9(4): 1-21 (2016) - [j8]Mengying Zhang, Qin Jin, Hongwei Liu:
The Study of the Entrepreneurial Leadership Style of Real Estate Industry in China: Based on the Content Analysis of Microblog. Int. J. Knowl. Based Organ. 6(3): 45-57 (2016) - [j7]Jia Chen, Qin Jin, Shiwan Zhao, Shenghua Bao, Li Zhang, Zhong Su, Yong Yu:
Boosting Recommendation in Unexplored Categories by User Price Preference. ACM Trans. Inf. Syst. 35(2): 12:1-12:27 (2016) - [c70]Guankun Mu, Haibing Cao, Qin Jin:
Violent Scene Detection Using Convolutional Neural Networks and Deep Audio Features. CCPR (2) 2016: 451-463 - [c69]Shizhe Chen, Yujie Dian, Xinrui Li, Xiaozhu Lin, Qin Jin, Haibo Liu, Li Lu:
Emotion Recognition in Videos via Fusing Multimodal Features. CCPR (2) 2016: 632-644 - [c68]Shizhe Chen, Xinrui Li, Qin Jin, Shilei Zhang, Yong Qin:
Video emotion recognition in the wild based on fusion of multimodal features. ICMI 2016: 494-500 - [c67]Qin Jin, Junwei Liang, Xiaozhu Lin:
Generating Natural Video Descriptions via Multimodal Processing. INTERSPEECH 2016: 570-574 - [c66]Shizhe Chen, Yujie Dian, Qin Jin:
RUC at MediaEval 2016: Predicting Media Interestingness Task. MediaEval 2016 - [c65]Shizhe Chen, Qin Jin:
RUC at MediaEval 2016 Emotional Impact of Movies Task: Fusion of Multimodal Features. MediaEval 2016 - [c64]Qin Jin, Junwei Liang:
Video Description Generation using Audio and Visual Cues. ICMR 2016: 239-242 - [c63]Shizhe Chen, Qin Jin:
Multi-modal Conditional Attention Fusion for Dimensional Emotion Prediction. ACM Multimedia 2016: 571-575 - [c62]Xirong Li, Yujia Huo, Qin Jin, Jieping Xu:
Detecting Violence in Video using Subclasses. ACM Multimedia 2016: 586-590 - [c61]Yifan Xiong, Jia Chen, Qin Jin, Chao Zhang:
History Rhyme: Searching Historic Events by Multimedia Knowledge. ACM Multimedia 2016: 749-751 - [c60]Jia Chen, Qin Jin, Yifan Xiong:
Semantic Image Profiling for Historic Events: Linking Images to Phrases. ACM Multimedia 2016: 1028-1037 - [c59]Qin Jin, Jia Chen, Shizhe Chen, Yifan Xiong, Alexander G. Hauptmann:
Describing Videos using Multi-modal Fusion. ACM Multimedia 2016: 1087-1091 - [c58]Xirong Li, Qin Jin:
Improving Image Captioning by Concept-Based Sentence Reranking. PCM (2) 2016: 231-240 - [c57]Junwei Liang, Jia Chen, Poyao Huang, Xuanchong Li, Lu Jiang, Zhenzhong Lan, Pingbo Pan, Hehe Fan, Qin Jin, Jiande Sun, Yang Chen, Yi Yang, Alexander G. Hauptmann:
Informedia @ TRECVID 2016. TRECVID 2016 - [i3]Xirong Li, Yujia Huo, Jieping Xu, Qin Jin:
Detecting Violence in Video using Subclasses. CoRR abs/1604.08088 (2016) - [i2]Xirong Li, Qin Jin:
Improving Image Captioning by Concept-based Sentence Reranking. CoRR abs/1605.00855 (2016) - 2015
- [j6]Jia Chen, Min Li, Qin Jin, Shenghua Bao, Zhong Su, Yong Yu:
Lead curve detection in drawings with complex cross-points. Neurocomputing 168: 35-46 (2015) - [j5]Qin Jin, Shizhe Chen, Xirong Li, Gang Yang, Jieping Xu:
基于声学特征的语言情感识别 (Speech Emotion Recognition Based on Acoustic Features). 计算机科学 42(9): 24-28 (2015) - [j4]Shimin Chen, Qin Jin:
Persistent B+-Trees in Non-Volatile Main Memory. Proc. VLDB Endow. 8(7): 786-797 (2015) - [j3]Jia Chen, Qin Jin, Shenghua Bao, Zhong Su, Shimin Chen, Yong Yu:
Exploitation and Exploration Balanced Hierarchical Summary for Landmark Images. IEEE Trans. Multim. 17(10): 1773-1786 (2015) - [c56]Huimin Wu, Qin Jin:
Improving emotion classification on Chinese microblog texts with auxiliary cross-domain data. ACII 2015: 821-826 - [c55]Xirong Li, Qin Jin, Shuai Liao, Junwei Liang, Xixi He, Yujia Huo, Weiyu Lan, Bin Xiao, Yanxiong Lu, Jieping Xu:
RUC-Tencent at ImageCLEF 2015: Concept Detection, Localization and Sentence Generation. CLEF (Working Notes) 2015 - [c54]Junwei Liang, Qin Jin, Xixi He, Gang Yang, Jieping Xu, Xirong Li:
Detecting semantic concepts in consumer videos using audio. ICASSP 2015: 2279-2283 - [c53]Qin Jin, Chengxin Li, Shizhe Chen, Huimin Wu:
Speech emotion recognition with acoustic and lexical features. ICASSP 2015: 4749-4753 - [c52]Qin Jin, Xirong Li, Haibing Cao, Yujia Huo, Shuai Liao, Gang Yang, Jieping Xu:
RUCMM at MediaEval 2015 Affective Impact of Movies Task: Fusion of Audio and Visual Cues. MediaEval 2015 - [c51]Qin Jin, Junwei Liang, Xixi He, Gang Yang, Jieping Xu, Xirong Li:
Semantic Concept Annotation For User Generated Videos Using Soundtracks. ICMR 2015: 599-602 - [c50]Shizhe Chen, Qin Jin:
Multi-modal Dimensional Emotion Recognition using Recurrent Neural Networks. AVEC@ACM Multimedia 2015: 49-56 - [c49]Jia Chen, Qin Jin, Yong Yu, Alexander G. Hauptmann:
Image Profiling for History Events on the Fly. ACM Multimedia 2015: 291-300 - 2014
- [j2]Neil Y. Yen, Qin Jin, Ching-Hsien Hsu, Qiangfu Zhao:
Special Issue on "Hybrid intelligence for growing internet and its applications". Future Gener. Comput. Syst. 37: 401-403 (2014) - [c48]Thomas Fang Zheng, Qin Jin, Lantian Li, Jun Wang, Fanhu Bie:
An overview of robustness related issues in speaker recognition. APSIPA 2014: 1-10 - [c47]Xirong Li, Xixi He, Gang Yang, Qin Jin, Jieping Xu:
Renmin University of China at ImageCLEF 2014 Scalable Concept Image Annotation. CLEF (Working Notes) 2014: 380-385 - [c46]Gang Yang, Xirong Li, Jieping Xu, Qin Jin:
Structure Perturbation Optimization for Hopfield-Type Neural Networks. ICANN 2014: 307-314 - [c45]Shizhe Chen, Qin Jin, Xirong Li, Gang Yang, Jieping Xu:
Speech emotion classification using acoustic features. ISCSLP 2014: 579-583 - [c44]Chengxin Li, Huimin Wu, Qin Jin:
Emotion Classification of Chinese Microblog Text via Fusion of BoW and eVector Feature Representations. NLPCC 2014: 217-228 - [c43]Xixi He, Xirong Li, Gang Yang, Jieping Xu, Qin Jin:
Adaptive Tag Selection for Image Annotation. PCM 2014: 11-21 - [c42]Junwei Liang, Qin Jin, Xixi He, Gang Yang, Jieping Xu, Xirong Li:
Semantic Concept Annotation of Consumer Videos at Frame-Level Using Audio. PCM 2014: 113-122 - [c41]Jia Chen, Qin Jin, Shiwan Zhao, Shenghua Bao, Li Zhang, Zhong Su, Yong Yu:
Does product recommendation meet its waterloo in unexplored categories?: no, price comes to help. SIGIR 2014: 667-676 - [c40]Gang Yang, Xirong Li, Jieping Xu, Qin Jin, Hui Sun:
A guided Hopfield evolutionary algorithm with local search for maximum clique problem. SMC 2014: 979-982 - [i1]Xixi He, Xirong Li, Gang Yang, Jieping Xu, Qin Jin:
Adaptive Tag Selection for Image Annotation. CoRR abs/1409.4995 (2014) - 2013
- [c39]Xirong Li, Shuai Liao, Binbin Liu, Gang Yang, Qin Jin, Jieping Xu, Xiaoyong Du:
Renmin University of China at ImageCLEF 2013 Scalable Concept Image Annotation. CLEF (Working Notes) 2013 - [c38]Jia Chen, Qin Jin, Weipeng Zhang, Shenghua Bao, Zhong Su, Yong Yu:
Tell me what happened here in history. ACM Multimedia 2013: 467-468 - 2012
- [c37]Qin Jin, Peter Franz Schulam, Shourabh Rawat, Susanne Burger, Duo Ding, Florian Metze:
Event-based Video Retrieval Using Audio. INTERSPEECH 2012: 2085-2088 - 2011
- [c36]Kornel Laskowski, Qin Jin:
Harmonic Structure Transform for Speaker Recognition. INTERSPEECH 2011: 365-368 - [c35]Udhyakumar Nallasamy, Michael Garbus, Florian Metze, Qin Jin, Thomas Schaaf, Tanja Schultz:
Analysis of Dialectal Influence in Pan-Arabic ASR. INTERSPEECH 2011: 1721-1724 - [c34]Qian Yang, Qin Jin, Tanja Schultz:
Investigation of Cross-Show Speaker Diarization. INTERSPEECH 2011: 2925-2928 - [c33]Lei Bao, Longfei Zhang, Shoou-I Yu, Zhen-zhong Lan, Lu Jiang, Arnold Overwijk, Qin Jin, Shohei Takahashi, Brian Langner, Yuanpeng Li, Michael Garbus, Susanne Burger, Florian Metze, Alexander G. Hauptmann:
Informedia@TRECVID 2011: Surveillance Event Detection. TRECVID 2011 - 2010
- [c32]Qin Jin, Runxin Li, Qian Yang, Kornel Laskowski, Tanja Schultz:
Speaker identification with distant microphone speech. ICASSP 2010: 4518-4521 - [c31]Florian Metze, Roger Hsiao, Qin Jin, Udhyakumar Nallasamy, Tanja Schultz:
The 2010 CMU GALE speech-to-text system. INTERSPEECH 2010: 1501-1504 - [c30]Kornel Laskowski, Qin Jin:
Modeling Prosody for Speaker Recognition: Why Estimating Pitch May Be a Red Herring. Odyssey 2010: 4
2000 – 2009
- 2009
- [c29]Qin Jin, Arthur R. Toth, Tanja Schultz, Alan W. Black:
Speaker de-identification via voice transformation. ASRU 2009: 529-533 - [c28]Qin Jin, Arthur R. Toth, Tanja Schultz, Alan W. Black:
Voice convergin: Speaker de-identification by voice transformation. ICASSP 2009: 3909-3912 - [c27]Haizhou Li, Bin Ma, Kong-Aik Lee, Hanwu Sun, Donglai Zhu, Khe Chai Sim, Changhuai You, Rong Tong, Ismo Kärkkäinen, Chien-Lin Huang, Vladimir Pervouchine, Wu Guo, Yijie Li, Li-Rong Dai, Mohaddeseh Nosratighods, Tharmarajah Thiruvaran, Julien Epps, Eliathamby Ambikairajah, Chng Eng Siong, Tanja Schultz, Qin Jin:
The I4U system in NIST 2008 speaker recognition evaluation. ICASSP 2009: 4201-4204 - [c26]Kornel Laskowski, Qin Jin:
Modeling instantaneous intonation for speaker identification using the fundamental frequency variation spectrum. ICASSP 2009: 4541-4544 - [c25]Mark C. Fuhs, Qin Jin, Tanja Schultz:
Detecting bandlimited audio in broadcast television shows. ICASSP 2009: 4589-4592 - [c24]Runxin Li, Tanja Schultz, Qin Jin:
Improving speaker segmentation via speaker identification and text segmentation. INTERSPEECH 2009: 904-907 - [c23]Matthias Wölfel, Qian Yang, Qin Jin, Tanja Schultz:
Speaker identification using warped MVDR cepstral features. INTERSPEECH 2009: 912-915 - 2008
- [c22]Qin Jin, Arthur R. Toth, Alan W. Black, Tanja Schultz:
Is voice transformation a threat to speaker identification? ICASSP 2008: 4845-4848 - [c21]Roger Hsiao, Mark C. Fuhs, Yik-Cheung Tam, Qin Jin, Tanja Schultz:
The CMU-interACT 2008 Mandarin transcription system. INTERSPEECH 2008: 1445-1448 - [c20]Qin Jin, Tanja Schultz:
Robust far-field speaker identification under mismatched conditions. INTERSPEECH 2008: 1893-1896 - 2007
- [j1]Qin Jin, Tanja Schultz, Alex Waibel:
Far-Field Speaker Recognition. IEEE Trans. Speech Audio Process. 15(7): 2023-2032 (2007) - [c19]Hazim Kemal Ekenel, Qin Jin, Mika Fischer, Rainer Stiefelhagen:
ISL Person Identification Systems in the CLEAR 2007 Evaluations. CLEAR 2007: 256-265 - [c18]Hazim Kemal Ekenel, Mika Fischer, Qin Jin, Rainer Stiefelhagen:
Multi-modal Person Identification in a Smart Environment. CVPR 2007 - [c17]Qin Jin, Szu-Chen Stan Jou, Tanja Schultz:
Whispering Speaker Identification. ICME 2007: 1027-1030 - 2006
- [c16]Hazim Kemal Ekenel, Qin Jin:
ISL Person Identification Systems in the CLEAR Evaluations. CLEAR 2006: 249-257 - [c15]Qin Jin, Yue Pan, Tanja Schultz:
Far-Field Speaker Recognition. ICASSP (1) 2006: 937-940 - 2005
- [c14]Alexander G. Hauptmann, Robert V. Baron, Michael G. Christel, R. Concescu, Jiang Gao, Qin Jin, Wei-Hao Lin, J.-Y. Pan, Scott M. Stevens, Rong Yan, Jun Yang, Y. Zhang:
CMU Informedia's TRECVID 2005 Skirmishes. TRECVID 2005 - 2004
- [c13]Hagen Soltau, Hua Yu, Florian Metze, Christian Fügen, Qin Jin, Szu-Chen Stan Jou:
The 2003 ISL rich transcription system for conversational telephony speech. ICASSP (1) 2004: 773-776 - [c12]Qin Jin, Tanja Schultz:
Speaker segmentation and clustering in meetings. INTERSPEECH 2004: 597-600 - [c11]Kornel Laskowski, Qin Jin, Tanja Schultz:
Crosscorrelation-based multispeaker speech activity detection. INTERSPEECH 2004: 973-976 - [c10]Tanja Schultz, Qin Jin, Kornel Laskowski, Yue Pan, Florian Metze, Christian Fügen:
Issues in meeting transcription - the ISL meeting transcription system. INTERSPEECH 2004: 1709-1712 - 2003
- [c9]Douglas A. Reynolds, Walter D. Andrews, Joseph P. Campbell, Jirí Navrátil, Barbara Peskin, André Adami, Qin Jin, David Klusácek, Joy S. Abramson, Radu Mihaescu, John J. Godfrey, Douglas A. Jones, Bing Xiang:
The SuperSID project: exploiting high-level information for high-accuracy speaker recognition. ICASSP (4) 2003: 784-787 - [c8]Jirí Navrátil, Qin Jin, Walter D. Andrews, Joseph P. Campbell:
Phonetic speaker recognition using maximum-likelihood binary-decision tree models. ICASSP (4) 2003: 796-799 - [c7]Qin Jin, Jirí Navrátil, Douglas A. Reynolds, Joseph P. Campbell, Walter D. Andrews, Joy S. Abramson:
Combining cross-stream and time dimensions in phonetic speaker recognition. ICASSP (4) 2003: 800-803 - 2002
- [c6]Tanja Schultz, Qin Jin, Kornel Laskowski, Alicia Tribble, Alex Waibel:
Improvements in Non-Verbal Cue Identification Using Multilingual Phone Strings. Speech-to-Speech Translation@ACL 2002: 101-78 - [c5]Qin Jin, Tanja Schultz, Alex Waibel:
Speaker identification using multilingual phone strings. ICASSP 2002: 145-148 - [c4]Qin Jin, Tanja Schultz, Alex Waibel:
Phonetic speaker identification. INTERSPEECH 2002: 1345-1348 - 2000
- [c3]Qin Jin, Alex Waibel:
Application of LDA to speaker recognition. INTERSPEECH 2000: 250-253 - [c2]Qin Jin, Alex Waibel:
A naïve de-lambing method for speaker identification. INTERSPEECH 2000: 466-469
1990 – 1999
- 1998
- [c1]Qin Jin, Luo Si, Qixiu Hu:
A high-performance text-independent speaker identification system based on BCDM. ICSLP 1998
last updated on 2024-11-07 20:30 CET by the dblp team
all metadata released as open data under CC0 1.0 license