Main Idea

Exploit the spatio-temporal correlations in videos by decomposing motion and content for unsupervised, deterministic frame prediction.

The motion pathway encodes the local dynamics of spatial regions, while the content pathway encodes the spatial layout of the salient parts of an image.
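A minimal sketch of how the two pathways could be fed, assuming (as described in the MCnet paper) that the motion pathway sees differences between consecutive frames and the content pathway sees only the last observed frame. The function name `pathway_inputs` and the toy frames are illustrative, not the authors' code.

```python
def pathway_inputs(frames):
    """Split observed frames into inputs for the two pathways.

    frames: list of 2-D grayscale frames (lists of rows), oldest first.
    Returns (motion_input, content_input).
    """
    if len(frames) < 2:
        raise ValueError("need at least two observed frames")
    # Motion pathway (assumed): differences between consecutive frames.
    motion_input = [
        [[b - a for a, b in zip(row_prev, row_next)]
         for row_prev, row_next in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]
    # Content pathway (assumed): the last observed frame only.
    content_input = frames[-1]
    return motion_input, content_input


# A bright pixel moving one step to the right per frame.
frames = [
    [[1, 0, 0]],
    [[0, 1, 0]],
    [[0, 0, 1]],
]
motion_input, content_input = pathway_inputs(frames)
```

The differences carry only what changed between frames, while the last frame carries the static layout, which is the split the two encoders are meant to exploit.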

Contributions

MCnet separates the information streams (motion and content) into different encoder pathways.
The proposed network is end-to-end trainable and learns to decompose motion and content without separate training. This reduces frame prediction to transforming the last observed frame into the next one according to the observed motion.
The model is evaluated on challenging real-world video datasets, where it is shown to outperform previous approaches on frame prediction.
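The idea of transforming the last observed frame by the observed motion can be illustrated with a toy predictor. This is a sketch, not MCnet: `predict_next` is a hypothetical stand-in that replaces the learned encoders and decoder with constant-velocity extrapolation, and `rollout` assumes that longer horizons are produced by feeding each prediction back in as an observation.

```python
def predict_next(frames):
    """Toy stand-in for the network: last frame plus the last observed change."""
    prev, last = frames[-2], frames[-1]
    return [
        [l + (l - p) for p, l in zip(row_prev, row_last)]
        for row_prev, row_last in zip(prev, last)
    ]


def rollout(frames, steps):
    """Predict several frames by feeding each prediction back as input."""
    history = list(frames)
    predictions = []
    for _ in range(steps):
        nxt = predict_next(history)
        predictions.append(nxt)
        history.append(nxt)  # the prediction becomes the newest "observed" frame
    return predictions


# Two observed 1x2 frames whose pixel values rise by 1 per step.
observed = [[[0.0, 1.0]], [[1.0, 2.0]]]
future = rollout(observed, steps=2)
```

In the real model the extrapolation step would be replaced by the motion and content encoders followed by a decoder; only the data flow is meant to carry over.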

Takeaways

Modeling motion and content separately improves the quality of pixel-level future prediction.
Residual connections in the network help it generalize to unseen content.
Adding an adversarial loss (L_GAN) to the training objective encourages more “realistic” frames.
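How L_GAN might enter the training objective can be sketched as a weighted sum of an image loss and an adversarial term. The assumptions here are loud ones: the image term is reduced to plain mean squared error, the adversarial term is taken as -log D(prediction), and the weights `alpha` and `beta` are placeholder values, not settings confirmed by these notes.

```python
import math


def generator_loss(pred, target, d_fake, alpha=1.0, beta=0.02):
    """Weighted sum of a pixel loss and an adversarial loss.

    pred, target: flat lists of pixel values.
    d_fake: discriminator's probability (0..1] that the prediction is real.
    alpha, beta: hypothetical weights for the two terms.
    """
    # Image term: mean squared error (simplified stand-in for the image loss).
    l_img = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    # Adversarial term: small when the discriminator is fooled.
    l_gan = -math.log(d_fake)
    return alpha * l_img + beta * l_gan


# Same pixel error, different discriminator verdicts.
loss_fooled = generator_loss([0.5, 0.5], [0.5, 0.5], d_fake=0.9)
loss_caught = generator_loss([0.5, 0.5], [0.5, 0.5], d_fake=0.1)
```

With identical pixel error, the prediction the discriminator rejects is penalized more, which is the pressure toward realistic-looking frames that the takeaway refers to.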

Decomposing Motion and Content for Natural Video Sequence Prediction.md
