[ECCV2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds
[MICCAI'24] Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring
Demo code for fine-tuning multimodal LLMs with LLaMA-Factory
MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
Undergraduate Dissertation of Guilin University of Electronic Technology
Chinese medical multimodal large model: Large Chinese Language-and-Vision Assistant for BioMedicine
A Video Chat Agent with Temporal Prior
Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation
A collection of visual instruction tuning datasets.
EVE: Encoder-Free Vision-Language Models
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM".
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust)