mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating
A Video Chat Agent with Temporal Prior
MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery
A collection of visual instruction tuning datasets.
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection
Large Chinese Language-and-Vision Assistant for BioMedicine (a Chinese medical multimodal large model)
Undergraduate dissertation from Guilin University of Electronic Technology
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs.
Example code demonstrating fine-tuning of multimodal large language models with LLaMA-Factory
[ECCV2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds
This project is the official implementation of 'LLMGA: Multimodal Large Language Model based Generation Assistant', ECCV2024
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities