Rohit Girdhar
2020 – today
- 2024
- [c31]Xudong Wang, Trevor Darrell, Sai Saketh Rambhatla, Rohit Girdhar, Ishan Misra:
InstanceDiffusion: Instance-Level Control for Image Generation. CVPR 2024: 6232-6242
- [c30]Sachit Menon, Ishan Misra, Rohit Girdhar:
Generating Illustrated Instructions. CVPR 2024: 6274-6284
- [c29]Xudong Wang, Ishan Misra, Ziyun Zeng, Rohit Girdhar, Trevor Darrell:
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation. CVPR 2024: 22755-22764
- [c28]Changan Chen, Kumar Ashutosh, Rohit Girdhar, David Harwath, Kristen Grauman:
SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos. CVPR 2024: 27242-27252
- [c27]Rohit Girdhar, Mannat Singh, Andrew Brown, Quentin Duval, Samaneh Azadi, Sai Saketh Rambhatla, Akbar Shah, Xi Yin, Devi Parikh, Ishan Misra:
Factorizing Text-to-Video Generation by Explicit Image Conditioning. ECCV (62) 2024: 205-224
- [i36]Xudong Wang, Trevor Darrell, Sai Saketh Rambhatla, Rohit Girdhar, Ishan Misra:
InstanceDiffusion: Instance-level Control for Image Generation. CoRR abs/2402.03290 (2024)
- [i35]Changan Chen, Kumar Ashutosh, Rohit Girdhar, David Harwath, Kristen Grauman:
SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos. CoRR abs/2404.05206 (2024)
- 2023
- [c26]Xudong Wang, Rohit Girdhar, Stella X. Yu, Ishan Misra:
Cut and Learn for Unsupervised Object Detection and Instance Segmentation. CVPR 2023: 3124-3134
- [c25]Yue Zhao, Ishan Misra, Philipp Krähenbühl, Rohit Girdhar:
Learning Video Representations from Large Language Models. CVPR 2023: 6586-6597
- [c24]Rohit Girdhar, Alaaeldin El-Nouby, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra:
OmniMAE: Single Model Masked Pretraining on Images and Videos. CVPR 2023: 10406-10417
- [c23]Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra:
ImageBind: One Embedding Space to Bind Them All. CVPR 2023: 15180-15190
- [c22]Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman:
HierVL: Learning Hierarchical Video-Language Embeddings. CVPR 2023: 23066-23078
- [c21]Mannat Singh, Quentin Duval, Kalyan Vasudev Alwala, Haoqi Fan, Vaibhav Aggarwal, Aaron Adcock, Armand Joulin, Piotr Dollár, Christoph Feichtenhofer, Ross B. Girshick, Rohit Girdhar, Ishan Misra:
The effectiveness of MAE pre-pretraining for billion-scale pretraining. ICCV 2023: 5461-5471
- [i34]Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman:
What You Say Is What You Show: Visual Narration Detection in Instructional Videos. CoRR abs/2301.02307 (2023)
- [i33]Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman:
HierVL: Learning Hierarchical Video-Language Embeddings. CoRR abs/2301.02311 (2023)
- [i32]Xudong Wang, Rohit Girdhar, Stella X. Yu, Ishan Misra:
Cut and Learn for Unsupervised Object Detection and Instance Segmentation. CoRR abs/2301.11320 (2023)
- [i31]Bahare Fatemi, Quentin Duval, Rohit Girdhar, Michal Drozdzal, Adriana Romero-Soriano:
Learning to Substitute Ingredients in Recipes. CoRR abs/2302.07960 (2023)
- [i30]Mannat Singh, Quentin Duval, Kalyan Vasudev Alwala, Haoqi Fan, Vaibhav Aggarwal, Aaron Adcock, Armand Joulin, Piotr Dollár, Christoph Feichtenhofer, Ross B. Girshick, Rohit Girdhar, Ishan Misra:
The effectiveness of MAE pre-pretraining for billion-scale pretraining. CoRR abs/2303.13496 (2023)
- [i29]Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra:
ImageBind: One Embedding Space To Bind Them All. CoRR abs/2305.05665 (2023)
- [i28]Xudong Wang, Ishan Misra, Ziyun Zeng, Rohit Girdhar, Trevor Darrell:
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation. CoRR abs/2308.14710 (2023)
- [i27]Rohit Girdhar, Mannat Singh, Andrew Brown, Quentin Duval, Samaneh Azadi, Sai Saketh Rambhatla, Akbar Shah, Xi Yin, Devi Parikh, Ishan Misra:
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning. CoRR abs/2311.10709 (2023)
- [i26]Wilson Yan, Andrew Brown, Pieter Abbeel, Rohit Girdhar, Samaneh Azadi:
Motion-Conditioned Image Animation for Video Editing. CoRR abs/2311.18827 (2023)
- [i25]Sachit Menon, Ishan Misra, Rohit Girdhar:
Generating Illustrated Instructions. CoRR abs/2312.04552 (2023)
- 2022
- [c20]Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar:
Masked-attention Mask Transformer for Universal Image Segmentation. CVPR 2022: 1280-1289
- [c19]Rohit Girdhar, Mannat Singh, Nikhila Ravi, Laurens van der Maaten, Armand Joulin, Ishan Misra:
Omnivore: A Single Model for Many Visual Modalities. CVPR 2022: 16081-16091
- [c18]Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CVPR 2022: 18973-18990
- [c17]Xingyi Zhou, Rohit Girdhar, Armand Joulin, Philipp Krähenbühl, Ishan Misra:
Detecting Twenty-Thousand Classes Using Image-Level Supervision. ECCV (9) 2022: 350-368
- [i24]Xingyi Zhou, Rohit Girdhar, Armand Joulin, Philipp Krähenbühl, Ishan Misra:
Detecting Twenty-thousand Classes using Image-level Supervision. CoRR abs/2201.02605 (2022)
- [i23]Rohit Girdhar, Mannat Singh, Nikhila Ravi, Laurens van der Maaten, Armand Joulin, Ishan Misra:
Omnivore: A Single Model for Many Visual Modalities. CoRR abs/2201.08377 (2022)
- [i22]Rohit Girdhar, Alaaeldin El-Nouby, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra:
OmniMAE: Single Model Masked Pretraining on Images and Videos. CoRR abs/2206.08356 (2022)
- [i21]Yue Zhao, Ishan Misra, Philipp Krähenbühl, Rohit Girdhar:
Learning Video Representations from Large Language Models. CoRR abs/2212.04501 (2022)
- 2021
- [c16]Zhongzheng Ren, Ishan Misra, Alexander G. Schwing, Rohit Girdhar:
3D Spatial Recognition Without Spatially Labeled 3D. CVPR 2021: 13204-13213
- [c15]Ishan Misra, Rohit Girdhar, Armand Joulin:
An End-to-End Transformer Model for 3D Object Detection. ICCV 2021: 2886-2897
- [c14]Zaiwei Zhang, Rohit Girdhar, Armand Joulin, Ishan Misra:
Self-Supervised Pretraining of 3D Features on any Point-Cloud. ICCV 2021: 10232-10243
- [c13]Rohit Girdhar, Kristen Grauman:
Anticipative Video Transformer. ICCV 2021: 13485-13495
- [i20]Zaiwei Zhang, Rohit Girdhar, Armand Joulin, Ishan Misra:
Self-Supervised Pretraining of 3D Features on any Point-Cloud. CoRR abs/2101.02691 (2021)
- [i19]Eltayeb Ahmed, Anton Bakhtin, Laurens van der Maaten, Rohit Girdhar:
Physical Reasoning Using Dynamics-Aware Models. CoRR abs/2102.10336 (2021)
- [i18]Zhongzheng Ren, Ishan Misra, Alexander G. Schwing, Rohit Girdhar:
3D Spatial Recognition without Spatially Labeled 3D. CoRR abs/2105.06461 (2021)
- [i17]Rohit Girdhar, Kristen Grauman:
Anticipative Video Transformer. CoRR abs/2106.02036 (2021)
- [i16]Ishan Misra, Rohit Girdhar, Armand Joulin:
An End-to-End Transformer Model for 3D Object Detection. CoRR abs/2109.08141 (2021)
- [i15]Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Christian Fuegen, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CoRR abs/2110.07058 (2021)
- [i14]Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar:
Masked-attention Mask Transformer for Universal Image Segmentation. CoRR abs/2112.01527 (2021)
- [i13]Bowen Cheng, Anwesa Choudhuri, Ishan Misra, Alexander Kirillov, Rohit Girdhar, Alexander G. Schwing:
Mask2Former for Video Instance Segmentation. CoRR abs/2112.10764 (2021)
- 2020
- [c12]Rohit Girdhar, Deva Ramanan:
CATER: A diagnostic dataset for Compositional Actions & TEmporal Reasoning. ICLR 2020
- [c11]Jessica Lee, Deva Ramanan, Rohit Girdhar:
MetaPix: Few-Shot Video Retargeting. ICLR 2020
- [i12]Rohit Girdhar, Laura Gustafson, Aaron Adcock, Laurens van der Maaten:
Forward Prediction for Physical Reasoning. CoRR abs/2006.10734 (2020)
2010 – 2019
- 2019
- [b1]Rohit Girdhar:
Learning to Understand People via Local, Global and Temporal Reasoning. Carnegie Mellon University, USA, 2019
- [c10]Rohit Girdhar, João Carreira, Carl Doersch, Andrew Zisserman:
Video Action Transformer Network. CVPR 2019: 244-253
- [c9]Rohit Girdhar, Du Tran, Lorenzo Torresani, Deva Ramanan:
DistInit: Learning Video Representations Without a Single Labeled Video. ICCV 2019: 852-861
- [c8]Bhavan Jasani, Rohit Girdhar, Deva Ramanan:
Are we Asking the Right Questions in MovieQA? ICCV Workshops 2019: 1879-1882
- [i11]Rohit Girdhar, Du Tran, Lorenzo Torresani, Deva Ramanan:
DistInit: Learning Video Representations without a Single Labeled Video. CoRR abs/1901.09244 (2019)
- [i10]Jessica Lee, Deva Ramanan, Rohit Girdhar:
MetaPix: Few-Shot Video Retargeting. CoRR abs/1910.04742 (2019)
- [i9]Rohit Girdhar, Deva Ramanan:
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning. CoRR abs/1910.04744 (2019)
- [i8]Bhavan Jasani, Rohit Girdhar, Deva Ramanan:
Are we asking the right questions in MovieQA? CoRR abs/1911.03083 (2019)
- 2018
- [c7]Rohit Girdhar, Georgia Gkioxari, Lorenzo Torresani, Manohar Paluri, Du Tran:
Detect-and-Track: Efficient Pose Estimation in Videos. CVPR 2018: 350-359
- [i7]Xiaolong Wang, Rohit Girdhar, Abhinav Gupta:
Binge Watching: Scaling Affordance Learning from Sitcoms. CoRR abs/1804.03080 (2018)
- [i6]Rohit Girdhar, João Carreira, Carl Doersch, Andrew Zisserman:
A Better Baseline for AVA. CoRR abs/1807.10066 (2018)
- [i5]Rohit Girdhar, João Carreira, Carl Doersch, Andrew Zisserman:
Video Action Transformer Network. CoRR abs/1812.02707 (2018)
- 2017
- [c6]Rohit Girdhar, Deva Ramanan, Abhinav Gupta, Josef Sivic, Bryan C. Russell:
ActionVLAD: Learning Spatio-Temporal Aggregation for Action Classification. CVPR 2017: 3165-3174
- [c5]Xiaolong Wang, Rohit Girdhar, Abhinav Gupta:
Binge Watching: Scaling Affordance Learning from Sitcoms. CVPR 2017: 3366-3375
- [c4]Rohit Girdhar, Deva Ramanan:
Attentional Pooling for Action Recognition. NIPS 2017: 34-45
- [i4]Rohit Girdhar, Deva Ramanan, Abhinav Gupta, Josef Sivic, Bryan C. Russell:
ActionVLAD: Learning spatio-temporal aggregation for action classification. CoRR abs/1704.02895 (2017)
- [i3]Rohit Girdhar, Deva Ramanan:
Attentional Pooling for Action Recognition. CoRR abs/1711.01467 (2017)
- [i2]Rohit Girdhar, Georgia Gkioxari, Lorenzo Torresani, Manohar Paluri, Du Tran:
Detect-and-Track: Efficient Pose Estimation in Videos. CoRR abs/1712.09184 (2017)
- 2016
- [c3]Rohit Girdhar, David F. Fouhey, Mikel Rodriguez, Abhinav Gupta:
Learning a Predictable and Generative Vector Representation for Objects. ECCV (6) 2016: 484-499
- [c2]Rohit Girdhar, David F. Fouhey, Kris M. Kitani, Abhinav Gupta, Martial Hebert:
Cutting through the clutter: Task-relevant features for image matching. WACV 2016: 1-9
- [i1]Rohit Girdhar, David F. Fouhey, Mikel Rodriguez, Abhinav Gupta:
Learning a Predictable and Generative Vector Representation for Objects. CoRR abs/1603.08637 (2016)
- 2014
- [c1]Rohit Girdhar, Jayaguru Panda, C. V. Jawahar:
Optimizing Storage Intensive Vision Applications to Device Capacity. ACCV (5) 2014: 460-475
last updated on 2024-11-15 20:37 CET by the dblp team
all metadata released as open data under CC0 1.0 license