NVIDIA Megatron Core 0.8.0

@ko3n1g released this 13 Aug 12:12
  • Multimodal
    • Added initial support for training vision language models using the LLaVA architecture
    • Added initial support for inference with multimodal inputs
    • Provided an end-to-end multimodal example, from data collection through training to evaluation, in examples/multimodal
  • MoE
    • Added Context Parallel support
    • Added distributed checkpoint support for grouped GEMM
  • Mamba
    • Added initial support for training and inference of Mamba-2 models
    • Support for hybrid models consisting of Mamba-2, attention, and MLP layers
    • Examples provided in examples/mamba