news

Apr 18, 2025 I presented our work “Large-scale Pre-training for Grounded Video Caption Generation” at the weekly webinar of TwelveLabs. [YouTube link]
Mar 13, 2025 Our paper “Large-scale Pre-training for Grounded Video Caption Generation” is now on arXiv. [arXiv] [Project webpage] [Code] (available soon, stay tuned!)
Feb 28, 2025 I received the 2024 IJCV Outstanding Reviewer Award. [Announcement]
Nov 01, 2024 I started a new role as a Postdoctoral Researcher at the Czech Institute of Informatics, Robotics and Cybernetics (CIIRC) at CTU in Prague. My research will focus on multimodal understanding using video and language.
Feb 27, 2024 Our paper “TIM: A Time Interval Machine for Audio-Visual Action Recognition” has been accepted at CVPR 2024. [paper] [project page]
Jan 18, 2024 Our paper “Graph Guided Question Answer Generation for Procedural Question-Answering” has been accepted at EACL 2024. [paper]
Feb 17, 2023 Our paper “Epic-sounds: A large-scale dataset of actions that sound” has been accepted at ICASSP 2023. [paper] [project page]
May 30, 2022 I joined Samsung AI Center in Cambridge as a Research Scientist.
Apr 27, 2022 I successfully defended my PhD dissertation, titled “Audio-Visual Egocentric Action Recognition”. [link]