Zheng Shou 0001

Mike Zheng Shou

Person information

affiliation: National University of Singapore
affiliation (former): Columbia University, New York, NY, USA

2020 – today

2025
[j7]
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Hong Zhou, Mike Zheng Shou, Xiang Bai:
A large cross-modal video retrieval dataset with reading comprehension. Pattern Recognit. 157: 110818 (2025)
2024
[j6]
Henry Hengyuan Zhao, Pichao Wang, Yuyang Zhao, Hao Luo, Fan Wang, Mike Zheng Shou:
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels. Int. J. Comput. Vis. 132(3): 731-749 (2024)
[j5]
Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan:
Enhancing Visual Grounding in Vision-Language Pre-Training With Position-Guided Text Prompts. IEEE Trans. Pattern Anal. Mach. Intell. 46(5): 3406-3421 (2024)
[j4]
Weijia Wu, Yuzhong Zhao, Zhuang Li, Lianlei Shan, Hong Zhou, Mike Zheng Shou:
Continual Learning for Image Segmentation With Dynamic Query. IEEE Trans. Circuits Syst. Video Technol. 34(6): 4874-4886 (2024)
[j3]
Ming Li, Huazhu Fu, Shengfeng He, Hehe Fan, Jun Liu, Jussi Keppo, Mike Zheng Shou:
DR-FER: Discriminative and Robust Representation Learning for Facial Expression Recognition. IEEE Trans. Multim. 26: 6297-6309 (2024)
[c82]
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, Mike Zheng Shou:
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model. CVPR 2024: 1481-1490
[c81]
Yuchao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jia-Wei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou, Kevin Tang:
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence. CVPR 2024: 7621-7630
[c80]
Yuchao Gu, Xintao Wang, Yixiao Ge, Ying Shan, Mike Zheng Shou:
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis. CVPR 2024: 7631-7640
[c79]
Jia-Wei Liu, Yan-Pei Cao, Jay Zhangjie Wu, Weijia Mao, Yuchao Gu, Rui Zhao, Jussi Keppo, Ying Shan, Mike Zheng Shou:
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing. CVPR 2024: 7664-7674
[c78]
Lingmin Ran, Xiaodong Cun, Jia-Wei Liu, Rui Zhao, Song Zijie, Xintao Wang, Jussi Keppo, Mike Zheng Shou:
X-Adapter: Universal Compatibility of Plugins for Upgraded Diffusion Model. CVPR 2024: 8775-8784
[c77]
Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou:
AssistGUI: Task-Oriented PC Graphical User Interface Automation. CVPR 2024: 13289-13298
[c76]
Jinheng Xie, Songhe Deng, Bing Li, Haozhe Liu, Yawen Huang, Yefeng Zheng, Jürgen Schmidhuber, Bernard Ghanem, Linlin Shen, Mike Zheng Shou:
Tune-an-Ellipse: CLIP Has Potential to Find what you Want. CVPR 2024: 13723-13732
[c75]
Ziteng Gao, Zhan Tong, Kevin Qinghong Lin, Joya Chen, Mike Zheng Shou:
Bootstrapping SparseFormers from Vision Foundation Models. CVPR 2024: 17710-17721
[c74]
Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou:
VideoLLM-online: Online Video Large Language Model for Streaming Video. CVPR 2024: 18407-18418
[c73]
Jingtao Sun, Yaonan Wang, Mingtao Feng, Yulan Guo, Ajmal Mian, Mike Zheng Shou:
L4D-Track: Language-to-4D Modeling Towards 6-DoF Tracking and Shape Reconstruction in 3D Point Cloud Stream. CVPR 2024: 21146-21156
[c72]
Weixian Lei, Yixiao Ge, Kun Yi, Jianfeng Zhang, Difei Gao, Dylan Sun, Yuying Ge, Ying Shan, Mike Zheng Shou:
VIT-LENS: Towards Omni-modal Representations. CVPR 2024: 26637-26647
[c71]
Henry Hengyuan Zhao, Pan Zhou, Mike Zheng Shou:
GENIXER: Empowering Multimodal Large Language Model as a Powerful Data Generator. ECCV (23) 2024: 129-147
[c70]
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jia-Wei Liu, Weijia Wu, Jussi Keppo, Mike Zheng Shou:
MotionDirector: Motion Customization of Text-to-Video Diffusion Models. ECCV (56) 2024: 273-290
[c69]
Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang:
DragAnything: Motion Control for Anything Using Entity Representation. ECCV (22) 2024: 331-348
[c68]
Hai Ci, Pei Yang, Yiren Song, Mike Zheng Shou:
RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-key Identification. ECCV (28) 2024: 338-354
[c67]
Yiqi Lin, Conghui He, Alex Jinpeng Wang, Bin Wang, Weijia Li, Mike Zheng Shou:
Parrot Captions Teach CLIP to Spot Text. ECCV (42) 2024: 368-385
[c66]
Kevin Qinghong Lin, Pengchuan Zhang, Difei Gao, Xide Xia, Joya Chen, Ziteng Gao, Jinheng Xie, Xuhong Xiao, Mike Zheng Shou:
Learning Video Context as Interleaved Multimodal Sequences. ECCV (49) 2024: 375-396
[c65]
Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-Leaf: A Learnable Auditory Front-End for Spiking Neural Networks. ICASSP 2024: 226-230
[c64]
Ziteng Gao, Zhan Tong, Limin Wang, Mike Zheng Shou:
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens. ICLR 2024
[c63]
Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang:
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition. IJCAI 2024: 3160-3168
[c62]
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou:
Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces. IJCAI 2024: 5862-5871
[c61]
Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou:
MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance. ACM Multimedia 2024: 6842-6850
[c60]
Difei Gao, Siyuan Hu, Zechen Bai, Qinghong Lin, Mike Zheng Shou:
AssistEditor: Multi-Agent Collaboration for GUI Workflow Automation in Video Creation. ACM Multimedia 2024: 11255-11257
[i135]
Alex Jinpeng Wang, Linjie Li, Kevin Qinghong Lin, Jianfeng Wang, Kevin Lin, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou:
COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training. CoRR abs/2401.00849 (2024)
[i134]
David Junhao Zhang, Dongxu Li, Hung Le, Mike Zheng Shou, Caiming Xiong, Doyen Sahoo:
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions. CoRR abs/2401.01827 (2024)
[i133]
Jay Zhangjie Wu, Guian Fang, Haoning Wu, Xintao Wang, Yixiao Ge, Xiaodong Cun, David Junhao Zhang, Jia-Wei Liu, Yuchao Gu, Rui Zhao, Weisi Lin, Wynne Hsu, Ying Shan, Mike Zheng Shou:
Towards A Better Metric for Text-to-Video Generation. CoRR abs/2401.07781 (2024)
[i132]
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou:
Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces. CoRR abs/2401.13516 (2024)
[i131]
Zongbo Han, Zechen Bai, Haiyang Mei, Qianli Xu, Changqing Zhang, Mike Zheng Shou:
Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models. CoRR abs/2402.01345 (2024)
[i130]
Zechen Bai, Peng Chen, Xiaolan Peng, Lu Liu, Hui Chen, Mike Zheng Shou, Feng Tian:
Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters. CoRR abs/2402.13724 (2024)
[i129]
Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang:
DragAnything: Motion Control for Anything using Entity Representation. CoRR abs/2403.07420 (2024)
[i128]
Jingtao Sun, Yaonan Wang, Mingtao Feng, Chao Ding, Mike Zheng Shou, Ajmal Saeed Mian:
Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation. CoRR abs/2403.12728 (2024)
[i127]
Wentian Zhang, Haozhe Liu, Jinheng Xie, Francesco Faccio, Mike Zheng Shou, Jürgen Schmidhuber:
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models. CoRR abs/2404.02747 (2024)
[i126]
Hai Ci, Pei Yang, Yiren Song, Mike Zheng Shou:
RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification. CoRR abs/2404.14055 (2024)
[i125]
Jinheng Xie, Jiajun Feng, Zhaoxu Tian, Kevin Qinghong Lin, Yawen Huang, Xi Xia, Nanxu Gong, Xu Zuo, Jiaqi Yang, Yefeng Zheng, Mike Zheng Shou:
Learning Long-form Video Prior via Generative Pre-Training. CoRR abs/2404.15909 (2024)
[i124]
Zechen Bai, Pichao Wang, Tianjun Xiao, Tong He, Zongbo Han, Zheng Zhang, Mike Zheng Shou:
Hallucination of Multimodal Large Language Models: A Survey. CoRR abs/2404.18930 (2024)
[i123]
Henry Hengyuan Zhao, Pan Zhou, Difei Gao, Mike Zheng Shou:
LOVA3: Learning to Visual Question Answering, Asking and Assessment. CoRR abs/2405.14974 (2024)
[i122]
Feipeng Ma, Hongwei Xue, Guangting Wang, Yizhou Zhou, Fengyun Rao, Shilin Yan, Yueyi Zhang, Siying Wu, Mike Zheng Shou, Xiaoyan Sun:
Multi-Modal Generative Embedding Model. CoRR abs/2405.19333 (2024)
[i121]
Feipeng Ma, Hongwei Xue, Guangting Wang, Yizhou Zhou, Fengyun Rao, Shilin Yan, Yueyi Zhang, Siying Wu, Mike Zheng Shou, Xiaoyan Sun:
Visual Perception by Large Language Model's Weights. CoRR abs/2405.20339 (2024)
[i120]
Alex Jinpeng Wang, Linjie Li, Yiqi Lin, Min Li, Lijuan Wang, Mike Zheng Shou:
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning. CoRR abs/2406.02547 (2024)
[i119]
Yiren Song, Shijie Huang, Chen Yao, Xiaojun Ye, Hai Ci, Jiaming Liu, Yuxuan Zhang, Mike Zheng Shou:
ProcessPainter: Learn Painting Process from Sequence Data. CoRR abs/2406.06062 (2024)
[i118]
Hai Ci, Yiren Song, Pei Yang, Jinheng Xie, Mike Zheng Shou:
WMAdapter: Adding WaterMark Control to Latent Diffusion Models. CoRR abs/2406.08337 (2024)
[i117]
Pei Yang, Hai Ci, Yiren Song, Mike Zheng Shou:
Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious? CoRR abs/2406.09026 (2024)
[i116]
Kevin Qinghong Lin, Linjie Li, Difei Gao, Qinchen Wu, Mingyi Yan, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou:
VideoGUI: A Benchmark for GUI Automation from Instructional Videos. CoRR abs/2406.10227 (2024)
[i115]
Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou:
VideoLLM-online: Online Video Large Language Model for Streaming Video. CoRR abs/2406.11816 (2024)
[i114]
Qinchen Wu, Difei Gao, Kevin Qinghong Lin, Zhuoyu Wu, Xiangwu Guo, Peiran Li, Weichen Zhang, Hengxu Wang, Mike Zheng Shou:
GUI Action Narrator: Where and When Did That Action Take Place? CoRR abs/2406.13719 (2024)
[i113]
Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang:
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition. CoRR abs/2407.09521 (2024)
[i112]
Kevin Qinghong Lin, Pengchuan Zhang, Difei Gao, Xide Xia, Joya Chen, Ziteng Gao, Jinheng Xie, Xuhong Xiao, Mike Zheng Shou:
Learning Video Context as Interleaved Multimodal Sequences. CoRR abs/2407.21757 (2024)
[i111]
Zechen Bai, Tianjun Xiao, Tong He, Pichao Wang, Zheng Zhang, Thomas Brox, Mike Zheng Shou:
GQE: Generalized Query Expansion for Enhanced Text-Video Retrieval. CoRR abs/2408.07249 (2024)
[i110]
Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou:
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation. CoRR abs/2408.12528 (2024)
[i109]
Shiwei Wu, Joya Chen, Kevin Qinghong Lin, Qimeng Wang, Yan Gao, Qianli Xu, Tong Xu, Yao Hu, Enhong Chen, Mike Zheng Shou:
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation. CoRR abs/2408.16730 (2024)
[i108]
Zongbo Han, Jialong Yang, Junfan Li, Qinghua Hu, Qianli Xu, Mike Zheng Shou, Changqing Zhang:
DOTA: Distributional Test-Time Adaptation of Vision-Language Models. CoRR abs/2409.19375 (2024)
[i107]
Zhongcong Xu, Chaoyue Song, Guoxian Song, Jianfeng Zhang, Jun Hao Liew, Hongyi Xu, You Xie, Linjie Luo, Guosheng Lin, Jiashi Feng, Mike Zheng Shou:
High Quality Human Image Animation using Regional Supervision and Motion Blur Condition. CoRR abs/2409.19580 (2024)
[i106]
Zechen Bai, Tong He, Haiyang Mei, Pichao Wang, Ziteng Gao, Joya Chen, Lei Liu, Zheng Zhang, Mike Zheng Shou:
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos. CoRR abs/2409.19603 (2024)
[i105]
Ziyu Wang, Shuangpeng Han, Mike Zheng Shou, Mengmi Zhang:
Unsupervised Prior Learning: Discovering Categorical Pose Priors from Videos. CoRR abs/2410.03858 (2024)
[i104]
Yepeng Liu, Yiren Song, Hai Ci, Yu Zhang, Haofan Wang, Mike Zheng Shou, Yuheng Bu:
Image Watermarks are Removable Using Controllable Regeneration from Clean Noise. CoRR abs/2410.05470 (2024)
[i103]
Rui Zhao, Hangjie Yuan, Yujie Wei, Shiwei Zhang, Yuchao Gu, Lingmin Ran, Xiang Wang, Jay Zhangjie Wu, Junhao Zhang, Yingya Zhang, Mike Zheng Shou:
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models. CoRR abs/2410.07133 (2024)
2023
[j2]
Wenqian Wang, Faliang Chang, Junhao Zhang, Rui Yan, Chunsheng Liu, Bin Wang, Mike Zheng Shou:
Magi-Net: Meta Negative Network for Early Activity Prediction. IEEE Trans. Image Process. 32: 3254-3265 (2023)
[c59]
Stan Weixian Lei, Difei Gao, Jay Zhangjie Wu, Yuxuan Wang, Wei Liu, Mengmi Zhang, Mike Zheng Shou:
Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task. AAAI 2023: 1250-1259
[c58]
Rui Yan, Mike Zheng Shou, Yixiao Ge, Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang:
Video-Text Pre-training with Learned Regions for Retrieval. AAAI 2023: 3100-3108
[c57]
Binjie Zhang, Shupeng Su, Yixiao Ge, Xuyuan Xu, Yexin Wang, Chun Yuan, Mike Zheng Shou, Ying Shan:
Darwinian Model Upgrades: Model Evolving with Selective Compatibility. AAAI 2023: 3393-3400
[c56]
Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing Kwong Chan, Chong-Wah Ngo, Mike Zheng Shou, Nan Duan:
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding. ACL (1) 2023: 8013-8028
[c55]
Shuning Chang, Pichao Wang, Fan Wang, Jiashi Feng, Mike Zheng Shou:
DOAD: Decoupled One Stage Action Detection Network. CVPR Workshops 2023: 3123-3232
[c54]
Shuning Chang, Pichao Wang, Ming Lin, Fan Wang, David Junhao Zhang, Rong Jin, Mike Zheng Shou:
Making Vision Transformers Efficient from A Token Sparsification View. CVPR 2023: 6195-6205
[c53]
Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Kevin Qinghong Lin, Satoshi Tsutsui, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
All in One: Exploring Unified Video-Language Pre-Training. CVPR 2023: 6598-6608
[c52]
Joya Chen, Difei Gao, Kevin Qinghong Lin, Mike Zheng Shou:
Affordance Grounding from Demonstration Video to Target Image. CVPR 2023: 6799-6808
[c51]
Difei Gao, Luowei Zhou, Lei Ji, Linchao Zhu, Yi Yang, Mike Zheng Shou:
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering. CVPR 2023: 14773-14783
[c50]
Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang:
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval. CVPR 2023: 14846-14855
[c49]
Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan:
Position-Guided Text Prompt for Vision-Language Pre-Training. CVPR 2023: 23242-23251
[c48]
Muhammet Ilaslan, Chenan Song, Joya Chen, Difei Gao, Weixian Lei, Qianli Xu, Joo Lim, Mike Zheng Shou:
GazeVQA: A Video Question Answering Dataset for Multiview Eye-Gaze Task-Oriented Collaborations. EMNLP 2023: 10462-10479
[c47]
  dblp key:
  - conf/iccv/WuZSZS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/WuZSZS23
Weijia Wu, Yuzhong Zhao, Mike Zheng Shou, Hong Zhou, Chunhua Shen:
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models. ICCV 2023: 1206-1217
[c46]
  dblp key:
  - conf/iccv/LinZCPGWYS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/LinZCPGWYS23
Kevin Qinghong Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex Jinpeng Wang, Rui Yan, Mike Zheng Shou:
UniVTG: Towards Unified Video-Language Temporal Grounding. ICCV 2023: 2782-2792
[c45]
  dblp key:
  - conf/iccv/WangLZLS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/WangLZLS23
Alex Jinpeng Wang, Kevin Qinghong Lin, David Junhao Zhang, Stan Weixian Lei, Mike Zheng Shou:
Too Large; Data Reduction for Vision-Language Pre-Training. ICCV 2023: 3124-3134
[c44]
  dblp key:
  - conf/iccv/LiXFZLLLKSY23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/LiXFZLLLKSY23
Ming Li, Xiangyu Xu, Hehe Fan, Pan Zhou, Jun Liu, Jia-Wei Liu, Jiahe Li, Jussi Keppo, Mike Zheng Shou, Shuicheng Yan:
STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition. ICCV 2023: 5083-5092
[c43]
  dblp key:
  - conf/iccv/PramanickSNLSSC23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/PramanickSNLSSC23
Shraman Pramanick, Yale Song, Sayan Nag, Kevin Qinghong Lin, Hardik Shah, Mike Zheng Shou, Rama Chellappa, Pengchuan Zhang:
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone. ICCV 2023: 5262-5274
[c42]
  dblp key:
  - conf/iccv/XieLHLZ0S23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/XieLHLZ0S23
Jinheng Xie, Yuexiang Li, Yawen Huang, Haozhe Liu, Wentian Zhang, Yefeng Zheng, Mike Zheng Shou:
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion. ICCV 2023: 7418-7427
[c41]
  dblp key:
  - conf/iccv/WuGWLGSHSQS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/WuGWLGSHSQS23
Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Stan Weixian Lei, Yuchao Gu, Yufei Shi, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. ICCV 2023: 7589-7599
[c40]
  dblp key:
  - conf/iccv/SinghLSLGT0SKZ23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/SinghLSLGT0SKZ23
Parantak Singh, You Li, Ankur Sikarwar, Weixian Lei, Difei Gao, Morgan B. Talbot, Ying Sun, Mike Zheng Shou, Gabriel Kreiman, Mengmi Zhang:
Learning to Learn: How to Continuously Teach Humans and Machines. ICCV 2023: 11674-11685
[c39]
  dblp key:
  - conf/iccv/FanBXZHZSSLSBZF23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/FanBXZHZSSLSBZF23
Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He:
Unsupervised Open-Vocabulary Object Localization in Videos. ICCV 2023: 13701-13709
[c38]
  dblp key:
  - conf/iccv/LiuCYXKSQS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/LiuCYXKSQS23
Jia-Wei Liu, Yan-Pei Cao, Tianyuan Yang, Zhongcong Xu, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video. ICCV 2023: 18437-18448
[c37]
  dblp key:
  - conf/iccv/WuZHZS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/WuZHZS23
Jay Zhangjie Wu, David Junhao Zhang, Wynne Hsu, Mengmi Zhang, Mike Zheng Shou:
Label-Efficient Online Continual Object Detection in Streaming Video. ICCV 2023: 19189-19198
[c36]
  dblp key:
  - conf/iccv/ChangWLWS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/ChangWLWS23
Shuning Chang, Pichao Wang, Hao Luo, Fan Wang, Mike Zheng Shou:
Revisiting Vision Transformer from the View of Path Ensemble. ICCV 2023: 19832-19842
[c35]
  dblp key:
  - conf/icdar/WuZLLSPKB23
  persistent URL:
  - https://dblp.org/rec/conf/icdar/WuZLLSPKB23
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Mike Zheng Shou, Umapada Pal, Dimosthenis Karatzas, Xiang Bai:
ICDAR 2023 Competition on Video Text Reading for Dense and Small Text. ICDAR (2) 2023: 405-419
[c34]
  dblp key:
  - conf/icde/Ooi0STTXYZ023
  persistent URL:
  - https://dblp.org/rec/conf/icde/Ooi0STTXYZ023
Beng Chin Ooi, Gang Chen, Mike Zheng Shou, Kian-Lee Tan, Anthony K. H. Tung, Xiaokui Xiao, James Wei Luen Yip, Bingxue Zhang, Meihui Zhang:
The Metaverse Data Deluge: What Can We Do About It? ICDE 2023: 3675-3687
[c33]
  dblp key:
  - conf/iclr/XuZLZBFS23
  persistent URL:
  - https://dblp.org/rec/conf/iclr/XuZLZBFS23
Eric Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Wenqing Zhang, Song Bai, Jiashi Feng, Mike Zheng Shou:
PV3D: A 3D Generative Model for Portrait Video Generation. ICLR 2023
[c32]
  dblp key:
  - conf/lgm3a/Shou23
  persistent URL:
  - https://dblp.org/rec/conf/lgm3a/Shou23
Mike Zheng Shou:
Large Generative Models Meet Multimodal Video Intelligence. LGM3A@MM 2023: 1
[c31]
  dblp key:
  - conf/mm/XueYL0T0YSS23
  persistent URL:
  - https://dblp.org/rec/conf/mm/XueYL0T0YSS23
Xizhe Xue, Dongdong Yu, Lingqiao Liu, Yu Liu, Satoshi Tsutsui, Ying Li, Zehuan Yuan, Ping Song, Mike Zheng Shou:
Transformer-based Open-world Instance Segmentation with Cross-task Consistency Regularization. ACM Multimedia 2023: 2507-2515
[c30]
  dblp key:
  - conf/nips/GuWWSCFXZCWGSS23
  persistent URL:
  - https://dblp.org/rec/conf/nips/GuWWSCFXZCWGSS23
Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou:
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models. NeurIPS 2023
[c29]
  dblp key:
  - conf/nips/WangSZ23
  persistent URL:
  - https://dblp.org/rec/conf/nips/WangSZ23
Ziyu Wang, Mike Zheng Shou, Mengmi Zhang:
Object-centric Learning with Cyclic Walks between Parts and Whole. NeurIPS 2023
[c28]
  dblp key:
  - conf/nips/WuZCGZHZSS23
  persistent URL:
  - https://dblp.org/rec/conf/nips/WuZCGZHZSS23
Weijia Wu, Yuzhong Zhao, Hao Chen, Yuchao Gu, Rui Zhao, Yefei He, Hong Zhou, Mike Zheng Shou, Chunhua Shen:
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models. NeurIPS 2023
[c27]
  dblp key:
  - conf/nips/Xie0LLL0SS23
  persistent URL:
  - https://dblp.org/rec/conf/nips/Xie0LLL0SS23
Jinheng Xie, Kai Ye, Yudong Li, Yuexiang Li, Kevin Qinghong Lin, Yefeng Zheng, Linlin Shen, Mike Zheng Shou:
Learning Visual Prior via Generative Pre-Training. NeurIPS 2023
[c26]
  dblp key:
  - conf/nips/XuZLFS23
  persistent URL:
  - https://dblp.org/rec/conf/nips/XuZLFS23
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Jiashi Feng, Mike Zheng Shou:
XAGen: 3D Expressive Human Avatars Generation. NeurIPS 2023
[i102]
  dblp key:
  - journals/corr/abs-2301-03046
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2301-03046
Ming Li, Jun Liu, Hehe Fan, Jiawei Liu, Jiahe Li, Mike Zheng Shou, Jussi Keppo:
STPrivacy: Spatio-Temporal Tubelet Sparsification and Anonymization for Privacy-preserving Action Recognition. CoRR abs/2301.03046 (2023)
[i101]
  dblp key:
  - journals/corr/abs-2302-08023
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2302-08023
Ziyu Wang, Mike Zheng Shou, Mengmi Zhang:
Object-centric Learning with Cyclic Walks between Parts and Whole. CoRR abs/2302.08023 (2023)
[i100]
  dblp key:
  - journals/corr/abs-2303-01740
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-01740
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Zheng Qin, Mike Zheng Shou:
DeepfakeMAE: Facial Part Consistency Aware Masked Autoencoder for Deepfake Video Detection. CoRR abs/2303.01740 (2023)
[i99]
  dblp key:
  - journals/corr/abs-2303-07910
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-07910
Hengyuan Zhao, Hao Luo, Yuyang Zhao, Pichao Wang, Fan Wang, Mike Zheng Shou:
Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm. CoRR abs/2303.07910 (2023)
[i98]
  dblp key:
  - journals/corr/abs-2303-08685
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-08685
Shuning Chang, Pichao Wang, Ming Lin, Fan Wang, David Junhao Zhang, Rong Jin, Mike Zheng Shou:
Making Vision Transformers Efficient from A Token Sparsification View. CoRR abs/2303.08685 (2023)
[i97]
  dblp key:
  - journals/corr/abs-2303-11681
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-11681
Weijia Wu, Yuzhong Zhao, Mike Zheng Shou, Hong Zhou, Chunhua Shen:
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models. CoRR abs/2303.11681 (2023)
[i96]
  dblp key:
  - journals/corr/abs-2303-14644
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-14644
Joya Chen, Difei Gao, Kevin Qinghong Lin, Mike Zheng Shou:
Affordance Grounding from Demonstration Video to Target Image. CoRR abs/2303.14644 (2023)
[i95]
  dblp key:
  - journals/corr/abs-2304-00254
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-00254
Shuning Chang, Pichao Wang, Fan Wang, Jiashi Feng, Mike Zheng Shou:
DOAD: Decoupled One Stage Action Detection Network. CoRR abs/2304.00254 (2023)
[i94]
  dblp key:
  - journals/corr/abs-2304-03768
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-03768
Ziteng Gao, Zhan Tong, Limin Wang, Mike Zheng Shou:
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens. CoRR abs/2304.03768 (2023)
[i93]
  dblp key:
  - journals/corr/abs-2304-04023
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-04023
Binqian Xu, Xiangbo Shu, Rui Yan, Guo-Sen Xie, Yixiao Ge, Mike Zheng Shou:
Attack is Good Augmentation: Towards Skeleton-Contrastive Representation Learning. CoRR abs/2304.04023 (2023)
[i92]
  dblp key:
  - journals/corr/abs-2304-04376
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-04376
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Mike Zheng Shou, Umapada Pal, Dimosthenis Karatzas, Xiang Bai:
ICDAR 2023 Video Text Reading Competition for Dense and Small Text. CoRR abs/2304.04376 (2023)
[i91]
  dblp key:
  - journals/corr/abs-2304-08271
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-08271
Jinheng Xie, Zhaochuan Luo, Yuexiang Li, Haozhe Liu, Linlin Shen, Mike Zheng Shou:
Open-World Weakly-Supervised Object Localization. CoRR abs/2304.08271 (2023)
[i90]
  dblp key:
  - journals/corr/abs-2304-12281
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-12281
Jiawei Liu, Yan-Pei Cao, Tianyuan Yang, Eric Zhongcong Xu, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video. CoRR abs/2304.12281 (2023)
[i89]
  dblp key:
  - journals/corr/abs-2305-03347
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-03347
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Hong Zhou, Mike Zheng Shou, Xiang Bai:
A Large Cross-Modal Video Retrieval Dataset with Reading Comprehension. CoRR abs/2305.03347 (2023)
[i88]
  dblp key:
  - journals/corr/abs-2305-05943
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-05943
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou:
Mover: Mask and Recovery based Facial Part Consistency Aware Method for Deepfake Video Detection. CoRR abs/2305.05943 (2023)
[i87]
  dblp key:
  - journals/corr/abs-2305-13777
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-13777
Jinheng Xie, Kai Ye, Yudong Li, Yuexiang Li, Kevin Qinghong Lin, Yefeng Zheng, Linlin Shen, Mike Zheng Shou:
VisorGPT: Learning Visual Prior via Generative Pre-Training. CoRR abs/2305.13777 (2023)
[i86]
  dblp key:
  - journals/corr/abs-2305-18292
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-18292
Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou:
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models. CoRR abs/2305.18292 (2023)
[i85]
  dblp key:
  - journals/corr/abs-2305-20087
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-20087
Alex Jinpeng Wang, Kevin Qinghong Lin, David Junhao Zhang, Stan Weixian Lei, Mike Zheng Shou:
Too Large; Data Reduction for Vision-Language Pre-Training. CoRR abs/2305.20087 (2023)
[i84]
  dblp key:
  - journals/corr/abs-2306-08640
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-08640
Difei Gao, Lei Ji, Luowei Zhou, Kevin Qinghong Lin, Joya Chen, Zihan Fan, Mike Zheng Shou:
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn. CoRR abs/2306.08640 (2023)
[i83]
  dblp key:
  - journals/corr/abs-2306-12642
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-12642
Binjie Zhang, Yixiao Ge, Xuyuan Xu, Ying Shan, Mike Zheng Shou:
TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter. CoRR abs/2306.12642 (2023)
[i82]
  dblp key:
  - journals/corr/abs-2306-15255
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-15255
Zhijian Hou, Lei Ji, Difei Gao, Wanjun Zhong, Kun Yan, Chao Li, Wing-Kwong Chan, Chong-Wah Ngo, Nan Duan, Mike Zheng Shou:
GroundNLQ @ Ego4D Natural Language Queries Challenge 2023. CoRR abs/2306.15255 (2023)
[i81]
  dblp key:
  - journals/corr/abs-2307-05463
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-05463
Shraman Pramanick, Yale Song, Sayan Nag, Kevin Qinghong Lin, Hardik Shah, Mike Zheng Shou, Rama Chellappa, Pengchuan Zhang:
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone. CoRR abs/2307.05463 (2023)
[i80]
  dblp key:
  - journals/corr/abs-2307-10816
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-10816
Jinheng Xie, Yuexiang Li, Yawen Huang, Haozhe Liu, Wentian Zhang, Yefeng Zheng, Mike Zheng Shou:
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion. CoRR abs/2307.10816 (2023)
[i79]
  dblp key:
  - journals/corr/abs-2307-16715
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-16715
Kevin Qinghong Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex Jinpeng Wang, Rui Yan, Mike Zheng Shou:
UniVTG: Towards Unified Video-Language Temporal Grounding. CoRR abs/2307.16715 (2023)
[i78]
  dblp key:
  - journals/corr/abs-2308-06160
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-06160
Weijia Wu, Yuzhong Zhao, Hao Chen, Yuchao Gu, Rui Zhao, Yefei He, Hong Zhou, Mike Zheng Shou, Chunhua Shen:
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models. CoRR abs/2308.06160 (2023)
[i77]
  dblp key:
  - journals/corr/abs-2308-06548
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-06548
Shuning Chang, Pichao Wang, Hao Luo, Fan Wang, Mike Zheng Shou:
Revisiting Vision Transformer from the View of Path Ensemble. CoRR abs/2308.06548 (2023)
[i76]
  dblp key:
  - journals/corr/abs-2308-06739
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-06739
David Junhao Zhang, Mutian Xu, Chuhui Xue, Wenqing Zhang, Xiaoguang Han, Song Bai, Mike Zheng Shou:
Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks. CoRR abs/2308.06739 (2023)
[i75]
  dblp key:
  - journals/corr/abs-2308-09921
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-09921
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou:
Recap: Detecting Deepfake Video with Unpredictable Tampered Traces via Recovering Faces and Mapping Recovered Faces. CoRR abs/2308.09921 (2023)
[i74]
  dblp key:
  - journals/corr/abs-2308-10185
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-10185
Weixian Lei, Yixiao Ge, Jianfeng Zhang, Dylan Sun, Kun Yi, Ying Shan, Mike Zheng Shou:
ViT-Lens: Towards Omni-modal Representations. CoRR abs/2308.10185 (2023)
[i73]
  dblp key:
  - journals/corr/abs-2309-07698
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-07698
David Junhao Zhang, Heng Wang, Chuhui Xue, Rui Yan, Wenqing Zhang, Song Bai, Mike Zheng Shou:
Dataset Condensation via Generative Model. CoRR abs/2309.07698 (2023)
[i72]
  dblp key:
  - journals/corr/abs-2309-08513
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-08513
Henry Hengyuan Zhao, Pichao Wang, Yuyang Zhao, Hao Luo, Fan Wang, Mike Zheng Shou:
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels. CoRR abs/2309.08513 (2023)
[i71]
  dblp key:
  - journals/corr/abs-2309-09469
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-09469
Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks. CoRR abs/2309.09469 (2023)
[i70]
  dblp key:
  - journals/corr/abs-2309-09858
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-09858
Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He:
Unsupervised Open-Vocabulary Object Localization in Videos. CoRR abs/2309.09858 (2023)
[i69]
  dblp key:
  - journals/corr/abs-2309-12865
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-12865
Xizhe Xue, Haokui Zhang, Ying Li, Liuwei Wan, Zongwen Bai, Mike Zheng Shou:
Bridging Sensor Gaps via Single-Direction Tuning for Hyperspectral Image Classification. CoRR abs/2309.12865 (2023)
[i68]
  dblp key:
  - journals/corr/abs-2309-15818
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-15818
David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao, Mike Zheng Shou:
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation. CoRR abs/2309.15818 (2023)
[i67]
  dblp key:
  - journals/corr/abs-2310-08465
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-08465
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jiawei Liu, Weijia Wu, Jussi Keppo, Mike Zheng Shou:
MotionDirector: Motion Customization of Text-to-Video Diffusion Models. CoRR abs/2310.08465 (2023)
[i66]
  dblp key:
  - journals/corr/abs-2310-10624
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-10624
Jiawei Liu, Yan-Pei Cao, Jay Zhangjie Wu, Weijia Mao, Yuchao Gu, Rui Zhao, Jussi Keppo, Ying Shan, Mike Zheng Shou:
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing. CoRR abs/2310.10624 (2023)
[i65]
  dblp key:
  - journals/corr/abs-2310-16002
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-16002
Jinbin Bai, Zhen Dong, Aosong Feng, Xiao Zhang, Tian Ye, Kaicheng Zhou, Mike Zheng Shou:
Integrating View Conditions for Image Synthesis. CoRR abs/2310.16002 (2023)
[i64]
  dblp key:
  - journals/corr/abs-2310-16003
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-16003
Jay Zhangjie Wu, Xiuyu Li, Difei Gao, Zhen Dong, Jinbin Bai, Aishani Singh, Xiaoyu Xiang, Youzeng Li, Zuwei Huang, Yuanxi Sun, Rui He, Feng Hu, Junhua Hu, Hai Huang, Hanyu Zhu, Xu Cheng, Jie Tang, Mike Zheng Shou, Kurt Keutzer, Forrest N. Iandola:
CVPR 2023 Text Guided Video Editing Competition. CoRR abs/2310.16003 (2023)
[i63]
  dblp key:
  - journals/corr/abs-2311-13574
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-13574
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Jiashi Feng, Mike Zheng Shou:
XAGen: 3D Expressive Human Avatars Generation. CoRR abs/2311.13574 (2023)
[i62]
  dblp key:
  - journals/corr/abs-2311-14284
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-14284
Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Di Zhang, Zhongyuan Wang:
Paragraph-to-Image Generation with Information-Enriched Diffusion Model. CoRR abs/2311.14284 (2023)
[i61]
  dblp key:
  - journals/corr/abs-2311-16081
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-16081
Weixian Lei, Yixiao Ge, Kun Yi, Jianfeng Zhang, Difei Gao, Dylan Sun, Yuying Ge, Ying Shan, Mike Zheng Shou:
ViT-Lens-2: Gateway to Omni-modal Intelligence. CoRR abs/2311.16081 (2023)
[i60]
  dblp key:
  - journals/corr/abs-2311-16498
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-16498
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, Mike Zheng Shou:
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model. CoRR abs/2311.16498 (2023)
[i59]
  dblp key:
  - journals/corr/abs-2311-17450
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-17450
Weijia Wu, Yuzhong Zhao, Zhuang Li, Lianlei Shan, Hong Zhou, Mike Zheng Shou:
Continual Learning for Image Segmentation with Dynamic Query. CoRR abs/2311.17450 (2023)
[i58]
  dblp key:
  - journals/corr/abs-2311-18765
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-18765
Yanqing Liu, Kai Wang, Wenqi Shao, Ping Luo, Yu Qiao, Mike Zheng Shou, Kaipeng Zhang, Yang You:
MLLMs-Augmented Visual-Language Representation Learning. CoRR abs/2311.18765 (2023)
[i57]
  dblp key:
  - journals/corr/abs-2312-00583
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-00583
Bardienus Pieter Duisterhof, Zhao Mandi, Yunchao Yao, Jia-Wei Liu, Mike Zheng Shou, Shuran Song, Jeffrey Ichnowski:
MD-Splatting: Learning Metric Deformation from 4D Gaussians in Highly Deformable Scenes. CoRR abs/2312.00583 (2023)
[i56]
  dblp key:
  - journals/corr/abs-2312-01987
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-01987
Ziteng Gao, Zhan Tong, Kevin Qinghong Lin, Joya Chen, Mike Zheng Shou:
Bootstrapping SparseFormers from Vision Foundation Models. CoRR abs/2312.01987 (2023)
[i55]
  dblp key:
  - journals/corr/abs-2312-02015
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-02015
Yufei Shi, Beijia Lu, Jia-Wei Liu, Ming Li, Mike Zheng Shou:
ColonNeRF: Neural Radiance Fields for High-Fidelity Long-Sequence Colonoscopy Reconstruction. CoRR abs/2312.02015 (2023)
[i54]
  dblp key:
  - journals/corr/abs-2312-02087
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-02087
Yuchao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jia-Wei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou, Kevin Tang:
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence. CoRR abs/2312.02087 (2023)
[i53]
  dblp key:
  - journals/corr/abs-2312-02238
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-02238
Lingmin Ran, Xiaodong Cun, Jia-Wei Liu, Rui Zhao, Song Zijie, Xintao Wang, Jussi Keppo, Mike Zheng Shou:
X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model. CoRR abs/2312.02238 (2023)
[i52]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2312-06731
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-06731
Henry Hengyuan Zhao, Pan Zhou, Mike Zheng Shou:
Genixer: Empowering Multimodal Large Language Models as a Powerful Data Generator. CoRR abs/2312.06731 (2023)
[i51]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2312-11396
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-11396
Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou:
MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance. CoRR abs/2312.11396 (2023)
[i50]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2312-13108
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-13108
Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou:
ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation. CoRR abs/2312.13108 (2023)
[i49]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2312-13324
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-13324
Weijia Mao, Yan-Pei Cao, Jia-Wei Liu, Zhongcong Xu, Mike Zheng Shou:
ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors. CoRR abs/2312.13324 (2023)
[i48]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2312-14232
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-14232
Yiqi Lin, Conghui He, Alex Jinpeng Wang, Bin Wang, Weijia Li, Mike Zheng Shou:
Parrot Captions Teach CLIP to Spot Text. CoRR abs/2312.14232 (2023)
2022
[j1]
- view
  authority control:
- export record
  dblp key:
  - journals/tip/CaoZCSZ22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/tip/CaoZCSZ22
Meng Cao, Can Zhang, Long Chen, Mike Zheng Shou, Yuexian Zou:
Deep Motion Prior for Weakly-Supervised Temporal Action Localization. IEEE Trans. Image Process. 31: 5203-5213 (2022)
[c25]
- view
  authority control:
- export record
  dblp key:
  - conf/cvpr/WangGCY0SQS22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/WangGCY0SQS22
Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Object-aware Video-language Pre-training for Retrieval. CVPR 2022: 3303-3312
[c24]
- view
  authority control:
- export record
  dblp key:
  - conf/cvpr/MaSZ0XYY22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/MaSZ0XYY22
Fan Ma, Mike Zheng Shou, Linchao Zhu, Haoqi Fan, Yilei Xu, Yi Yang, Zhicheng Yan:
Unified Transformer Tracker for Object Tracking. CVPR 2022: 8771-8780
[c23]
- view
  authority control:
- export record
  dblp key:
  - conf/cvpr/GraumanWBCFGH0L22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/GraumanWBCFGH0L22
Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CVPR 2022: 18973-18990
[c22]
- view
  authority control:
- export record
  dblp key:
  - conf/eccv/ZhangLWCCQLS22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ZhangLWCCQLS22
David Junhao Zhang, Kunchang Li, Yali Wang, Yunpeng Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou:
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning. ECCV (35) 2022: 230-248
[c21]
- view
  authority control:
- export record
  dblp key:
  - conf/eccv/WongCWLMGS22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/eccv/WongCWLMGS22
Benita Wong, Joya Chen, You Wu, Stan Weixian Lei, Dongxing Mao, Difei Gao, Mike Zheng Shou:
AssistQ: Affordance-Centric Question-Driven Task Completion for Egocentric Assistant. ECCV (36) 2022: 485-501
[c20]
- view
  authority control:
- export record
  dblp key:
  - conf/eccv/WangGYLFS22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/eccv/WangGYLFS22
Yuxuan Wang, Difei Gao, Licheng Yu, Weixian Lei, Matt Feiszli, Mike Zheng Shou:
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval. ECCV (35) 2022: 709-725
[c19]
- view
  authority control:
- export record
  dblp key:
  - conf/emnlp/LeiGWMLRS22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/LeiGWMLRS22
Weixian Lei, Difei Gao, Yuxuan Wang, Dongxing Mao, Zihan Liang, Lingmin Ran, Mike Zheng Shou:
AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant. EMNLP (Findings) 2022: 319-338
[c18]
- view
  authority control:
- export record
  dblp key:
  - conf/mm/ChangWWLS22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/mm/ChangWWLS22
Shuning Chang, Pichao Wang, Fan Wang, Hao Li, Zheng Shou:
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation. HCMA@MM 2022: 41-50
[c17]
- view
  authority control:
- export record
  dblp key:
  - conf/mm/XuSTFYS22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/mm/XuSTFYS22
Eric Zhongcong Xu, Zeyang Song, Satoshi Tsutsui, Chao Feng, Mang Ye, Mike Zheng Shou:
AVA-AVD: Audio-visual Speaker Diarization in the Wild. ACM Multimedia 2022: 3838-3847
[c16]
- view
  authority control:
- export record
  dblp key:
  - conf/mm/JinSZTQY22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/mm/JinSZTQY22
Zan-Xia Jin, Mike Zheng Shou, Fang Zhou, Satoshi Tsutsui, Jingyan Qin, Xu-Cheng Yin:
From Token to Word: OCR Token Evolution via Contrastive Learning and Semantic Matching for Text-VQA. ACM Multimedia 2022: 4564-4572
[c15]
- view
  - electronic edition @ nips.cc (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/nips/LinWSWYXGTZKCWD22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/LinWSWYXGTZKCWD22
Kevin Qinghong Lin, Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, Rong-Cheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou:
Egocentric Video-Language Pretraining. NeurIPS 2022
[c14]
- view
  - electronic edition @ nips.cc (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/nips/LiuCMZZKSQS22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/LiuCMZZKSQS22
Jiawei Liu, Yan-Pei Cao, Weijia Mao, Wenqiao Zhang, David Junhao Zhang, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes. NeurIPS 2022
[i47]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2203-04203
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-04203
Benita Wong, Joya Chen, You Wu, Stan Weixian Lei, Dongxing Mao, Difei Gao, Mike Zheng Shou:
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant. CoRR abs/2203.04203 (2022)
[i46]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2203-07303
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-07303
Alex Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
All in One: Exploring Unified Video-Language Pre-training. CoRR abs/2203.07303 (2022)
[i45]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2203-07720
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-07720
Guanyu Cai, Yixiao Ge, Alex Jinpeng Wang, Rui Yan, Xudong Lin, Ying Shan, Lianghua He, Xiaohu Qie, Jianping Wu, Mike Zheng Shou:
Revitalize Region Feature for Democratizing Video-Language Pre-training. CoRR abs/2203.07720 (2022)
[i44]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2203-15175
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-15175
Fan Ma, Mike Zheng Shou, Linchao Zhu, Haoqi Fan, Yilei Xu, Yi Yang, Zhicheng Yan:
Unified Transformer Tracker for Object Tracking. CoRR abs/2203.15175 (2022)
[i43]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2204-00486
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2204-00486
Yuxuan Wang, Difei Gao, Licheng Yu, Stan Weixian Lei, Matt Feiszli, Mike Zheng Shou:
GEB+: A benchmark for generic event boundary captioning, grounding and text-based retrieval. CoRR abs/2204.00486 (2022)
[i42]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2205-15595
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2205-15595
Satoshi Tsutsui, Weijia Mao, Sijing Lin, Yunyi Zhu, Murong Ma, Mike Zheng Shou:
Novel View Synthesis for High-fidelity Headshot Scenes. CoRR abs/2205.15595 (2022)
[i41]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2205-15723
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2205-15723
Jiawei Liu, Yan-Pei Cao, Weijia Mao, Wenqiao Zhang, David Junhao Zhang, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes. CoRR abs/2205.15723 (2022)
[i40]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2206-00309
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-00309
Jay Zhangjie Wu, David Junhao Zhang, Wynne Hsu, Mengmi Zhang, Mike Zheng Shou:
Label-Efficient Online Continual Object Detection in Streaming Video. CoRR abs/2206.00309 (2022)
[i39]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2206-01670
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-01670
Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, Rong-Cheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou:
Egocentric Video-Language Pretraining. CoRR abs/2206.01670 (2022)
[i38]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2206-02082
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-02082
Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang:
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval. CoRR abs/2206.02082 (2022)
[i37]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2206-10326
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-10326
Beng Chin Ooi, Kian-Lee Tan, Anthony K. H. Tung, Gang Chen, Mike Zheng Shou, Xiaokui Xiao, Meihui Zhang:
Sense The Physical, Walkthrough The Virtual, Manage The Metaverse: A Data-centric Perspective. CoRR abs/2206.10326 (2022)
[i36]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2207-01334
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2207-01334
Kevin Qinghong Lin, Alex Jinpeng Wang, Rui Yan, Eric Zhongcong Xu, Rong-Cheng Tu, Yanru Zhu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Wei Liu, Mike Zheng Shou:
Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022. CoRR abs/2207.01334 (2022)
[i35]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2207-01622
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2207-01622
Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, Rong-Cheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou:
Egocentric Video-Language Pretraining @ Ego4D Challenge 2022. CoRR abs/2207.01622 (2022)
[i34]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2208-09023
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2208-09023
Xizhe Xue, Dongdong Yu, Lingqiao Liu, Yu Liu, Ying Li, Zehuan Yuan, Ping Song, Mike Zheng Shou:
Single-Stage Open-world Instance Segmentation with Cross-task Consistency Regularization. CoRR abs/2208.09023 (2022)
[i33]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2208-12037
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2208-12037
Stan Weixian Lei, Difei Gao, Jay Zhangjie Wu, Yuxuan Wang, Wei Liu, Mengmi Zhang, Mike Zheng Shou:
Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task. CoRR abs/2208.12037 (2022)
[i32]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2209-10918
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2209-10918
Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan:
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding. CoRR abs/2209.10918 (2022)
[i31]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2210-06954
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2210-06954
Binjie Zhang, Shupeng Su, Yixiao Ge, Xuyuan Xu, Yexin Wang, Chun Yuan, Mike Zheng Shou, Ying Shan:
Darwinian Model Upgrades: Model Evolving with Selective Compatibility. CoRR abs/2210.06954 (2022)
[i30]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2211-08776
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2211-08776
Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan:
An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022. CoRR abs/2211.08776 (2022)
[i29]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2211-15470
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2211-15470
Parantak Singh, You Li, Ankur Sikarwar, Weixian Lei, Daniel Gao, Morgan Bruce Talbot, Ying Sun, Mike Zheng Shou, Gabriel Kreiman, Mengmi Zhang:
Learning to Learn: How to Continuously Teach Humans and Machines. CoRR abs/2211.15470 (2022)
[i28]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2212-03185
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-03185
Yuchao Gu, Xintao Wang, Yixiao Ge, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis. CoRR abs/2212.03185 (2022)
[i27]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2212-06384
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-06384
Eric Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Wenqing Zhang, Song Bai, Jiashi Feng, Mike Zheng Shou:
PV3D: A 3D Generative Model for Portrait Video Generation. CoRR abs/2212.06384 (2022)
[i26]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2212-09522
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-09522
Difei Gao, Luowei Zhou, Lei Ji, Linchao Zhu, Yi Yang, Mike Zheng Shou:
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering. CoRR abs/2212.09522 (2022)
[i25]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2212-09737
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-09737
Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan:
Position-guided Text Prompt for Vision-Language Pre-training. CoRR abs/2212.09737 (2022)
[i24]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2212-11565
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-11565
Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. CoRR abs/2212.11565 (2022)
2021
[c13]
- view
  authority control:
- export record
  dblp key:
  - conf/cvpr/PanCSLS021
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/PanCSLS021
Junting Pan, Siyu Chen, Mike Zheng Shou, Yu Liu, Jing Shao, Hongsheng Li:
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization. CVPR 2021: 464-474
[c12]
- view
  authority control:
- export record
  dblp key:
  - conf/emnlp/CaoCSZZ21
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/CaoCSZZ21
Meng Cao, Long Chen, Mike Zheng Shou, Can Zhang, Yuexian Zou:
On Pursuit of Designing Multi-modal Transformer for Video Grounding. EMNLP (1) 2021: 9810-9823
[c11]
- view
  authority control:
- export record
  dblp key:
  - conf/iccv/GongW0FWY21
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/iccv/GongW0FWY21
Xinyu Gong, Heng Wang, Zheng Shou, Matt Feiszli, Zhangyang Wang, Zhicheng Yan:
Searching for Two-Stream Models in Multivariate Space for Video Recognition. ICCV 2021: 8013-8022
[c10]
- view
  authority control:
- export record
  dblp key:
  - conf/iccv/ShouL0GF21
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/iccv/ShouL0GF21
Mike Zheng Shou, Stan Weixian Lei, Weiyao Wang, Deepti Ghadiyaram, Matt Feiszli:
Generic Event Boundary Detection: A Benchmark for Event Segmentation. ICCV 2021: 8055-8064
[c9]
- view
  authority control:
- export record
  dblp key:
  - conf/iccv/YeR0S21
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/iccv/YeR0S21
Mang Ye, Weijian Ruan, Bo Du, Mike Zheng Shou:
Channel Augmented Joint Learning for Visible-Infrared Recognition. ICCV 2021: 13547-13556
[c8]
- view
  authority control:
- export record
  dblp key:
  - conf/mm/TaoPDQS021
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/mm/TaoPDQS021
Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
Is Someone Speaking?: Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. ACM Multimedia 2021: 3927-3935
[i23]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2101-10511
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2101-10511
Mike Zheng Shou, Deepti Ghadiyaram, Weiyao Wang, Matt Feiszli:
Generic Event Boundary Detection: A Benchmark for Event Segmentation. CoRR abs/2101.10511 (2021)
[i22]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2107-06592
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2107-06592
Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. CoRR abs/2107.06592 (2021)
[i21]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2108-05607
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2108-05607
Meng Cao, Can Zhang, Long Chen, Mike Zheng Shou, Yuexian Zou:
Deep Motion Prior for Weakly-Supervised Temporal Action Localization. CoRR abs/2108.05607 (2021)
[i20]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2108-12957
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2108-12957
Xinyu Gong, Heng Wang, Zheng Shou, Matt Feiszli, Zhangyang Wang, Zhicheng Yan:
Searching for Two-Stream Models in Multivariate Space for Video Recognition. CoRR abs/2108.12957 (2021)
[i19]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2109-06085
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2109-06085
Meng Cao, Long Chen, Mike Zheng Shou, Can Zhang, Yuexian Zou:
On Pursuit of Designing Multi-modal Transformer for Video Grounding. CoRR abs/2109.06085 (2021)
[i18]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2110-07058
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2110-07058
Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Christian Fuegen, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CoRR abs/2110.07058 (2021)
[i17]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2111-12527
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2111-12527
David Junhao Zhang, Kunchang Li, Yunpeng Chen, Yali Wang, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou:
MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video. CoRR abs/2111.12527 (2021)
[i16]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2111-14448
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2111-14448
Eric Zhongcong Xu, Zeyang Song, Chao Feng, Mang Ye, Mike Zheng Shou:
AVA-AVD: Audio-visual Speaker Diarization in the Wild. CoRR abs/2111.14448 (2021)
[i15]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2111-15050
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2111-15050
Stan Weixian Lei, Yuxuan Wang, Dongxing Mao, Difei Gao, Mike Zheng Shou:
AssistSR: Affordance-centric Question-driven Video Segment Retrieval. CoRR abs/2111.15050 (2021)
[i14]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2112-00656
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2112-00656
Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Object-aware Video-language Pre-training for Retrieval. CoRR abs/2112.00656 (2021)
[i13]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2112-01194
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2112-01194
Rui Yan, Mike Zheng Shou, Yixiao Ge, Alex Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang:
Video-Text Pre-training with Learned Regions. CoRR abs/2112.01194 (2021)
2020
[c7]
- view
  authority control:
- export record
  dblp key:
  - conf/eccv/MaZYZKFS20
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/eccv/MaZYZKFS20
Fan Ma, Linchao Zhu, Yi Yang, Shengxin Zha, Gourab Kundu, Matt Feiszli, Zheng Shou:
SF-Net: Single-Frame Supervision for Temporal Action Localization. ECCV (4) 2020: 420-437
[i12]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2003-06845
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2003-06845
Fan Ma, Linchao Zhu, Yi Yang, Shengxin Zha, Gourab Kundu, Matt Feiszli, Zheng Shou:
SF-Net: Single-Frame Supervision for Temporal Action Localization. CoRR abs/2003.06845 (2020)
[i11]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2006-07976
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2006-07976
Junting Pan, Siyu Chen, Zheng Shou, Jing Shao, Hongsheng Li:
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization. CoRR abs/2006.07976 (2020)

2010 – 2019

2019
[b1]
- view
  authority control:
- export record
  dblp key:
  - phd/us/Shou19
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/phd/us/Shou19
Zheng Shou:
Deep Learning for Action Understanding in Video. Columbia University, USA, 2019
[c6]
- view
  authority control:
- export record
  dblp key:
  - conf/cvpr/ShouLKSRCY19
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/ShouLKSRCY19
Zheng Shou, Xudong Lin, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Shih-Fu Chang, Zhicheng Yan:
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition. CVPR 2019: 1268-1277
[i10]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-1901-03460
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1901-03460
Zheng Shou, Zhicheng Yan, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Xudong Lin, Shih-Fu Chang:
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition. CoRR abs/1901.03460 (2019)
[i9]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-1905-09904
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1905-09904
Jiawei Ma, Zheng Shou, Alireza Zareian, Hassan Mansour, Anthony Vetro, Shih-Fu Chang:
CDSA: Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation. CoRR abs/1905.09904 (2019)
[i8]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-1910-11285
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1910-11285
Xudong Lin, Zheng Shou, Shih-Fu Chang:
LPAT: Learning to Predict Adaptive Threshold for Weakly-supervised Temporal Action Localization. CoRR abs/1910.11285 (2019)
2018
[c5]
- view
  authority control:
- export record
  dblp key:
  - conf/eccv/ShouGZMC18
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ShouGZMC18
Zheng Shou, Hang Gao, Lei Zhang, Kazuyuki Miyazawa, Shih-Fu Chang:
AutoLoc: Weakly-Supervised Temporal Action Localization in Untrimmed Videos. ECCV (16) 2018: 162-179
[c4]
- view
  authority control:
- export record
  dblp key:
  - conf/eccv/ShouPCMMVNC18
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ShouPCMMVNC18
Zheng Shou, Junting Pan, Jonathan Chan, Kazuyuki Miyazawa, Hassan Mansour, Anthony Vetro, Xavier Giró-i-Nieto, Shih-Fu Chang:
Online Detection of Action Start in Untrimmed, Streaming Videos. ECCV (3) 2018: 551-568
[c3]
- dblp key: conf/nips/GaoSZZC18
- persistent URL: https://dblp.org/rec/conf/nips/GaoSZZC18
Hang Gao, Zheng Shou, Alireza Zareian, Hanwang Zhang, Shih-Fu Chang:
Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks. NeurIPS 2018: 983-993
[i7]
- dblp key: journals/corr/abs-1802-06822
- persistent URL: https://dblp.org/rec/journals/corr/abs-1802-06822
Zheng Shou, Junting Pan, Jonathan Chan, Kazuyuki Miyazawa, Hassan Mansour, Anthony Vetro, Xavier Giró-i-Nieto, Shih-Fu Chang:
Online Action Detection in Untrimmed, Streaming Videos - Modeling and Evaluation. CoRR abs/1802.06822 (2018)
[i6]
- dblp key: journals/corr/abs-1807-08333
- persistent URL: https://dblp.org/rec/journals/corr/abs-1807-08333
Zheng Shou, Hang Gao, Lei Zhang, Kazuyuki Miyazawa, Shih-Fu Chang:
AutoLoc: Weakly-supervised Temporal Action Localization. CoRR abs/1807.08333 (2018)
[i5]
- dblp key: journals/corr/abs-1810-11730
- persistent URL: https://dblp.org/rec/journals/corr/abs-1810-11730
Hang Gao, Zheng Shou, Alireza Zareian, Hanwang Zhang, Shih-Fu Chang:
Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks. CoRR abs/1810.11730 (2018)
2017
[c2]
- dblp key: conf/cvpr/ShouCZMC17
- persistent URL: https://dblp.org/rec/conf/cvpr/ShouCZMC17
Zheng Shou, Jonathan Chan, Alireza Zareian, Kazuyuki Miyazawa, Shih-Fu Chang:
CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos. CVPR 2017: 1417-1426
[i4]
- dblp key: journals/corr/ShouCZMC17
- persistent URL: https://dblp.org/rec/journals/corr/ShouCZMC17
Zheng Shou, Jonathan Chan, Alireza Zareian, Kazuyuki Miyazawa, Shih-Fu Chang:
CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos. CoRR abs/1703.01515 (2017)
[i3]
- dblp key: journals/corr/abs-1708-05038
- persistent URL: https://dblp.org/rec/journals/corr/abs-1708-05038
Du Tran, Jamie Ray, Zheng Shou, Shih-Fu Chang, Manohar Paluri:
ConvNet Architecture Search for Spatiotemporal Feature Learning. CoRR abs/1708.05038 (2017)
2016
[c1]
- dblp key: conf/cvpr/ShouWC16
- persistent URL: https://dblp.org/rec/conf/cvpr/ShouWC16
Zheng Shou, Dongang Wang, Shih-Fu Chang:
Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs. CVPR 2016: 1049-1058
[i2]
- dblp key: journals/corr/ShouWC16
- persistent URL: https://dblp.org/rec/journals/corr/ShouWC16
Zheng Shou, Dongang Wang, Shih-Fu Chang:
Action Temporal Localization in Untrimmed Videos via Multi-stage CNNs. CoRR abs/1601.02129 (2016)
[i1]
- dblp key: journals/corr/WangSLC16
- persistent URL: https://dblp.org/rec/journals/corr/WangSLC16
Dongang Wang, Zheng Shou, Hongyi Liu, Shih-Fu Chang:
EventNet Version 1.1 Technical Report. CoRR abs/1605.07289 (2016)
