Our work “Large-scale Pre-training for Grounded Video Caption Generation” has been accepted to ICCV 2025!