Initial release and pre-trained weights
Initial release, including the following pre-trained weights:
- m2d_clap_vit_base-80x608p16x16-240128.zip (1.45 GB)
- m2d_as_vit_base-80x608p16x16-240213-mr7.zip (1.46 GB)
- m2d_vit_base-80x608p16x16-221006-mr7.zip (1.44 GB)
- m2d_vit_base-80x608p16x16-221006-mr7_enconly.zip (Encoder only, 302 MB)
- m2d_vit_base-80x608p16x16-220930-mr7_enconly.zip (Encoder only, 302 MB)
- m2d_as_vit_base-80x608p16x16p32k-240413_enconly.zip (Encoder only, 302.17 MB, 32 kHz input)
- m2d_vit_base-80x608p16x16-221006-mr6.zip (1.44 GB)
- m2d_vit_base-80x200p16x4-230529.zip (1.45 GB)
- msm_mae_vit_base-80x608p16x16-220924-mr75.zip (976 MB)
- Example logs.
All weights expect 16 kHz input unless otherwise noted.
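
The archive names above appear to follow a regular pattern, so a small helper can split a name into fields when selecting a checkpoint programmatically. The following is a minimal sketch, not part of the release: `parse_weight_name` and its field names are hypothetical, and reading `80x608` as the input shape and `p16x16` as the patch size is an assumption inferred from the names alone. The only points taken from the list itself are the `_enconly` suffix marking encoder-only archives and the 16 kHz default, with the `p32k` name being the one listed as 32 kHz.

```python
import re

# Hypothetical pattern inferred from the archive names in this release.
# The meaning assigned to each group is an assumption, not documented behavior.
NAME_RE = re.compile(
    r"^(?P<model>[a-z0-9_]+)-"          # e.g. m2d_vit_base
    r"(?P<dim_a>\d+)x(?P<dim_b>\d+)"    # e.g. 80x608 (assumed input shape)
    r"p(?P<patch_a>\d+)x(?P<patch_b>\d+)"  # e.g. p16x16 (assumed patch size)
    r"(?:p(?P<rate>\d+)k)?-"            # optional p32k marker
    r"(?P<date>\d{6})"                  # e.g. 221006
    r"(?:-(?P<tag>mr\d+))?"             # optional tag such as mr7
    r"(?P<enconly>_enconly)?"           # encoder-only suffix
    r"\.zip$"
)


def parse_weight_name(name: str) -> dict:
    """Split an archive name into fields; raise ValueError if it does not match."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized weight name: {name}")
    return {
        "model": m["model"],
        "input_shape": (int(m["dim_a"]), int(m["dim_b"])),
        "patch": (int(m["patch_a"]), int(m["patch_b"])),
        # 16 kHz is the stated default; the p32k archive is listed as 32 kHz.
        "sample_rate_khz": int(m["rate"]) if m["rate"] else 16,
        "date": m["date"],
        "tag": m["tag"],  # None when the name carries no tag
        "encoder_only": m["enconly"] is not None,
    }


if __name__ == "__main__":
    print(parse_weight_name("m2d_as_vit_base-80x608p16x16p32k-240413_enconly.zip"))
```

A name that does not fit the pattern raises `ValueError` rather than returning partial fields, so an unexpected naming variant is surfaced instead of being silently misread.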