【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Updated Aug 2, 2024 - Python
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
EditWorld: Simulating World Dynamics for Instruction-Following Image Editing
A survey on video and language understanding.
Video Graph Transformer for Video Question Answering (ECCV'22)
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
[2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS'23 Spotlight)
[2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
The champion solution for Ego4D Natural Language Queries Challenge in CVPR 2023
The official GitHub page for the survey paper "Self-Supervised learning for Videos: A survey"
A repository of Video Language papers, code and datasets.