This project is continuously updated. Stars ⭐, comments 💹, and shares 😀 are welcome!
Other awesome projects: Awesome-Referring-Video-Object-Segmentation
Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
OOKD | Offline-to-Online Knowledge Distillation for Video Instance Segmentation | WACV | Online | ||
MobileInst | MobileInst: Video Instance Segmentation on the Mobile | AAAI | Online | ||
LBVQ | Learning Better Video Query with SAM for Video Instance Segmentation | TCSVT | Offline | Code | |
OV2Seg+ | OV-VIS: Open-Vocabulary Video Instance Segmentation | IJCV | Online | Code | |
OMG-Seg | OMG-Seg: Is One Model Good Enough For All Segmentation? | CVPR | Semi-Online | Code | |
UniVS | UniVS: Unified and Universal Video Segmentation with Prompts as Queries | CVPR | Online | Code | |
GLEE | General Object Foundation Model for Images and Videos at Scale | CVPR | Offline | Code | |
UVIS | UVIS: Unsupervised Video Instance Segmentation | CVPRW | Online | ||
DVIS-DAQ | DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries | ECCV | Online/Offline | Code | |
VISAGE | VISAGE: Video Instance Segmentation with Appearance-Guided Enhancement | ECCV | Online | Code | |
OVFormer | Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation | ECCV | Semi-Online | Code | |
GvSeg | General and Task-Oriented Video Segmentation | ECCV | Semi-Online | Code | |
RAP-SAM | RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Arxiv | Online | Code | |
BriVIS | Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation | Arxiv | Offline | Code | |
InstFormer | OpenVIS: Open-vocabulary Video Instance Segmentation | Arxiv | Online | ||
CLIP-VIS | CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation | Arxiv | Online | Code | |
PointVIS | What is Point Supervision Worth in Video Instance Segmentation? | Arxiv | Online | ||
OW-VISCap | OW-VISCap: Open-World Video Instance Segmentation and Captioning | Arxiv | Online | Code | |
PM-VIS | PM-VIS: High-Performance Box-Supervised Video Instance Segmentation | Arxiv | Online | ||
PM-VIS+ | PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation | MIPR | Online | Code | |
CAVIS | CAVIS: Context-Aware Video Instance Segmentation | Arxiv | Online/Offline | Code | |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
InstanceFormer | InstanceFormer: An Online Video Instance Segmentation Framework | AAAI | Online | Code | |
GenVIS | A Generalized Framework for Video Instance Segmentation | CVPR | Online/Semi-Online | Code | |
MDQE | MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos | CVPR | Semi-Online | Code | |
Mask-Free VIS | Mask-Free Video Instance Segmentation | CVPR | Online | Code | |
InstMove | InstMove: Instance Motion for Object-centric Video Segmentation | CVPR | Online | Code | |
VideoCutLER | VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | CVPR | Offline | Code | |
TarViS | TarViS: A Unified Approach for Target-based Video Segmentation | CVPR | Offline | Code | |
CAROQ | Context-Aware Relative Object Queries To Unify Video Instance and Panoptic Segmentation | CVPR | Online | ||
UNINEXT | Universal Instance Perception as Object Discovery and Retrieval | CVPR | Offline | Code | |
CTVIS | CTVIS: Consistent Training for Online Video Instance Segmentation | ICCV | Online | Code | |
DVIS | DVIS: Decoupled Video Instance Segmentation Framework | ICCV | Online/Offline | Code | |
OV2Seg | Towards Open-Vocabulary Video Instance Segmentation | ICCV | Online | Code | |
TCOVIS | TCOVIS: Temporally Consistent Online Video Instance Segmentation | ICCV | Online | Code | |
Tube-Link | Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation | ICCV | Semi-Online | Code | |
TMT-VIS | TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation | NeurIPS | Offline | Code | |
NOVIS | NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation | ICML | Semi-Online | ||
TIVE | TIVE: A Toolbox for Identifying Video Instance Segmentation Errors | Neurocomputing | Toolbox | Code | |
VLKP | VLKP: Video Instance Segmentation with Visual-Linguistic Knowledge Prompts | ICASSP | Offline | ||
IAST | IAST: Instance Association Relying on Spatio-Temporal Features for Video Instance Segmentation | ICASSP | Offline | Code | |
HEVis* | Coarse-to-Fine Video Instance Segmentation With Factorized Conditional Appearance Flows | JAS | Offline | Code | |
TAFormer | Towards Robust Video Instance Segmentation with Temporal-Aware Transformer | Arxiv | Offline | ||
UVOSAM | UVOSAM: A Mask-free Paradigm for Unsupervised Video Object Segmentation via Segment Anything Model | Arxiv | Online | ||
RefineVIS | RefineVIS: Video Instance Segmentation with Temporal Attention Refinement | Arxiv | Online | ||
GRAtt-VIS | GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation | Arxiv | Online | Code | |
BoxVIS | BoxVIS: Video Instance Segmentation with Box Annotations | Arxiv | Online | Code | |
OW-VISFormer | Video Instance Segmentation in an Open-World | Arxiv | Offline | Code | |
DVIS++ | DVIS++: Improved Decoupled Framework for Universal Video Segmentation | Arxiv | Online/Offline | Code | |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
HIATF | Hybrid Instance-Aware Temporal Fusion for Online Video Instance Segmentation | AAAI | Online | ||
Mask2Former-VIS | Mask2Former for Video Instance Segmentation | CVPR | Offline | Code | |
Video K-Net | Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation | CVPR | Offline | Code | |
VISOLO | VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance Segmentation | CVPR | Online | Code | |
TeViT | Temporally Efficient Vision Transformer for Video Instance Segmentation | CVPR | Offline | Code | |
EfficientVIS | Efficient Video Instance Segmentation via Tracklet Query and Proposal | CVPR | Online | Code | |
SeqFormer | SeqFormer: Sequential Transformer for Video Instance Segmentation | ECCV | Offline | Code | |
IDOL | In Defense of Online Models for Video Instance Segmentation | ECCV | Online | Code | |
MS-STS VIS | Video Instance Segmentation via Multi-scale Spatio-temporal Split Attention Transformer | ECCV | Offline | Code | |
Self-Shot VIS | Less than Few: Self-Shot Video Instance Segmentation | ECCV | Offline | ||
VMT | Video Mask Transfiner for High-Quality Video Instance Segmentation | ECCV | Offline | Code | |
STC | STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation | ECCV | Online | ||
IAI | Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation | ECCV | Online | Code | |
VITA | VITA: Video Instance Segmentation via Object Token Association | NeurIPS | Offline | Code | |
MinVIS | MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training | NeurIPS | Online | Code | |
InsPro | InsPro: Propagating Instance Query and Proposal for Online Video Instance Segmentation | NeurIPS | Online | ||
SipMaskv2 | SipMaskv2: Enhanced Fast Image and Video Instance Segmentation | TPAMI | Online | Code | |
TPR | Improving Video Instance Segmentation via Temporal Pyramid Routing | TPAMI | Online | Code | |
IFA | Video Instance Segmentation by Instance Flow Assembly | TMM | Online | ||
DefVIS | Deformable VisTR: Spatio temporal deformable attention for video instance segmentation | ICASSP | Offline | Code | |
TBA | Tag-Based Attention Guided Bottom-Up Approach for Video Instance Segmentation | ICPR | Offline | ||
DeVIS | DeVIS: Making Deformable Transformers Work for Video Instance Segmentation | Arxiv | Offline | Code | |
RCF | Online Video Instance Segmentation via Robust Context Fusion | Arxiv | Online | ||
IFR | Consistent Video Instance Segmentation with Inter-Frame Recurrent Attention | Arxiv | Offline | ||
ROVIS | Robust Online Video Instance Segmentation with Track Queries | Arxiv | Online | Code | |
CiCo | One-stage Video Instance Segmentation: From Frame-in Frame-out to Clip-in Clip-out | Arxiv | Offline | Code | |
TLTM | Two-Level Temporal Relation Model for Online Video Instance Segmentation | Arxiv | Online | Code | |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
CompFeat | CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation | AAAI | Online | Code | |
VisTR | End-to-End Video Instance Segmentation with Transformers | CVPR | Offline | Code | |
SG-Net | SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation | CVPR | Online | Code | |
STMask | Spatial Feature Calibration and Temporal Fusion for Effective One-Stage Video Instance Segmentation | CVPR | Online | Code | |
CrossVIS | Crossover Learning for Fast Online Video Instance Segmentation | ICCV | Online | Code | |
Propose-Reduce | Video Instance Segmentation with a Propose-Reduce Paradigm | ICCV | Offline | Code | |
VisSTG | End-to-end Video Instance Segmentation via Spatial-Temporal Graph Neural Networks | ICCV | Online | Code | |
QueryInst | Instances as Queries | ICCV | Online | Code | |
HEVis | Learning Hierarchical Embedding for Video Instance Segmentation | ACM MM | Offline | Code | |
SRNet | SRNet: Spatial Relation Network for Efficient Single-stage Instance Segmentation in Videos | ACM MM | Online | ||
IFC | Video Instance Segmentation using Inter-Frame Communication Transformers | NeurIPS | Offline | Code | |
PCAN | Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation | NeurIPS | Online | Code | |
CMaskTrack R-CNN | Occluded Video Instance Segmentation: A Benchmark | IJCV | Online | Dataset | |
RGNNVIS++ | Recurrent Graph Neural Networks for Video Instance Segmentation | IJCV | Online | Code | |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
MaskProp | Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation | CVPR | Offline | ||
VAE | Video Instance Segmentation Tracking with a Modified VAE Architecture | CVPR | Online | ||
SipMask | SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation | ECCV | Online | Code | |
STEm-Seg | STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos | ECCV | Offline | Code | |
RGNNVIS | Learning Video Instance Segmentation with Recurrent Graph Neural Networks | GCPR | Online | Code | |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
MaskTrack R-CNN | Video Instance Segmentation | ICCV | Online | Code | |