
support MiniCPM-V-2.5 #7599

Merged: 67 commits merged into ggerganov:master on Aug 9, 2024
Conversation

@tc-mb (Contributor) commented May 28, 2024

Dear llama.cpp Official,

Hi, I'm writing to present our new PR for integrating our model MiniCPM-Llama3-V 2.5 into llama.cpp. The model has been trending on Hugging Face for over a week and has garnered significant user demand. During the previous MiniCPM-V PR attempt, we identified several critical implementation bugs. The official MiniCPM-V team has since fixed all of these issues, resulting in performance that matches our PyTorch version. These changes also distinguish our implementation significantly from the LLaVA example codebase.

Here are some key differences and improvements we've made:

  1. Flexible Image Handling: We support arbitrary image sizes by dynamically segmenting images into sub-images, allowing our ViT to accept various aspect ratios, unlike the fixed input dimensions required by other models (a conceptual sketch follows this list).
  2. 2D Resampler: Our model uses a 2D resampler to downsample image features into shorter sequences, significantly speeding up inference (also sketched after this list).
  3. Enhanced Embedding: Unlike the original ViT positional encoding used in previous VLMs, we employ a new approach to image embedding with a PosEmbedding layer.
  4. Distinct Tokenizer: Our tokenizer is different from LLaVA's, leading to unique special token decoding.
  5. Higher-Level Framework Support: We've optimized our model for better integration with frameworks built on top of llama.cpp, such as Ollama.
  6. CLI Optimization: We've made modifications to better adapt the CLI for Android use.
  7. NPU-Optimized ViT: We've rewritten the Vision Transformer (ViT) component to leverage NPU on mobile devices, optimizing I/O for Android inference. (this week)
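
To make item 1 concrete, here is a minimal, hypothetical sketch of adaptive image slicing. The function name, the 448-pixel tile size, and the slice limit are illustrative assumptions for this thread, not the actual MiniCPM-V or llama.cpp code.

```python
import math

def choose_slice_grid(width, height, tile=448, max_slices=9):
    """Hypothetical sketch: estimate how many tile-sized slices the image's
    area "deserves", then pick the rows x cols grid whose slices end up
    closest to square, so each slice suits the ViT's expected input."""
    ideal = (width * height) / (tile * tile)
    n = max(1, min(max_slices, round(ideal)))
    best, best_err = (1, n), float("inf")
    for rows in range(1, n + 1):
        if n % rows:
            continue  # only grids that use exactly n slices
        cols = n // rows
        # log-ratio distance of a slice's aspect ratio from 1:1
        err = abs(math.log((width / cols) / (height / rows)))
        if err < best_err:
            best, best_err = (rows, cols), err
    return best

# A wide 1344x672 image maps to 4 slices; this prints (1, 4).
print(choose_slice_grid(1344, 672))
```

And a similarly hypothetical PyTorch sketch of the 2D resampler from item 2: a fixed set of learned queries cross-attends over the ViT patch grid, so a grid of any size collapses to `num_queries` tokens. The class name, dimensions, and the simplified learned row/column position embeddings are assumptions made for illustration, not the actual MiniCPM-V module.

```python
import torch
import torch.nn as nn

class Resampler2DSketch(nn.Module):
    """Hypothetical perceiver-style resampler with 2D positional information."""
    def __init__(self, num_queries=96, embed_dim=1024, vit_dim=1152, max_hw=64):
        super().__init__()
        self.query = nn.Parameter(torch.randn(num_queries, embed_dim) * 0.02)
        self.kv_proj = nn.Linear(vit_dim, embed_dim)
        # simplified 2D position: one learned embedding per row, one per column
        self.pos_h = nn.Embedding(max_hw, embed_dim)
        self.pos_w = nn.Embedding(max_hw, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=8, batch_first=True)

    def forward(self, vit_feats, grid_h, grid_w):
        # vit_feats: (batch, grid_h * grid_w, vit_dim) patch features from the ViT
        dev = vit_feats.device
        kv = self.kv_proj(vit_feats)
        rows = torch.arange(grid_h, device=dev).repeat_interleave(grid_w)
        cols = torch.arange(grid_w, device=dev).repeat(grid_h)
        kv = kv + self.pos_h(rows) + self.pos_w(cols)            # inject 2D position
        q = self.query.unsqueeze(0).expand(kv.size(0), -1, -1)   # fixed-length queries
        out, _ = self.attn(q, kv, kv)                            # (batch, num_queries, embed_dim)
        return out
```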

While some aspects of our implementation may appear similar to the LLaVA example codebase, these distinct features and optimizations set our model apart. We could reference LLaVA for the overlapping components to avoid duplication, but this might compromise the standalone nature of the individual examples; Hugging Face Transformers similarly ensures each model has its own self-contained implementation.

Given the extensive user interest and the robust performance of our implementation, merging this model would significantly benefit the community. We are open to collaborating on any adjustments you deem necessary and are committed to ensuring the highest code quality and usability.

Thank you for considering our request. We look forward to your feedback and hope for a positive resolution.

Best regards,
MiniCPM-V Official ^_^

@tc-mb tc-mb marked this pull request as ready for review May 28, 2024 20:41
@tc-mb (Contributor, Author) commented Aug 7, 2024

> Ah there appears to be a conflict - need to resolve this so the CI can run


I tried to solve most of the problems, but there seems to be an error in the original llava-cli. Can you help me check it?

@ggerganov (Owner)
Does this patch fix it?

diff --git a/examples/llava/clip.h b/examples/llava/clip.h
index f028f187..2ff4d399 100644
--- a/examples/llava/clip.h
+++ b/examples/llava/clip.h
@@ -18,8 +18,6 @@
 #    define CLIP_API
 #endif
 
-struct clip_ctx;
-
 #ifdef __cplusplus
 extern "C" {
 #endif

@tc-mb (Contributor, Author) commented Aug 8, 2024

> Does this patch fix it?
>
> [patch quoted above]

This does work, and only one issue remains now.

@tc-mb (Contributor, Author) commented Aug 9, 2024

@ggerganov I'm glad that the CI is all green. Can we merge this PR now?
If it is merged, I will submit a PR for MiniCPM-V 2.6 today.

@ggerganov ggerganov merged commit 3071c0a into ggerganov:master Aug 9, 2024
54 checks passed
@cmp-nct (Contributor) commented Aug 9, 2024

Congrats, awesome to see this progress so much. Thanks for the effort; looking forward to seeing 2.6.

@chigkim commented Aug 11, 2024

@tc-mb, This is awesome!!! Hopefully 2.6 is on the way!
@ggerganov, is the server still broken with no support for vision language models? Any plan to bring it back?
Thanks EVERYONE!

@tc-mb tc-mb deleted the prepare-PR-of-minicpm-v2.5 branch August 12, 2024 08:48
@tc-mb (Contributor, Author) commented Aug 13, 2024

Hi @ggerganov, I have submitted the PR for MiniCPM-V 2.6. This PR only updates the model, and the CI is all green too. Could you please take a look at it when you are free in the near future?

#8967

@lin72h commented Aug 15, 2024

@cmp-nct I've been following your contributions on vision models for a while. Very interested to hear your opinion on the MiniCPM-V-2.6 and MiniCPM-V-2.5 versions.

@fairydreaming (Collaborator)
Did anyone actually try to convert the model with the provided scripts as described in README-minicpmv2.5.md? It looks like there is a problem: #9098

@tc-mb (Contributor, Author) commented Sep 11, 2024

> mmproj and image are invalid tokens when using llama-minicpmv-cli binary file, but both of these tokens are used in the "example usage" line when running the binary file. Neither of these tokens are listed in "llama-minicpmv-cli --help" section. Attempting to run on android phone (Qualcomm 8650) through adb shell.

Hi, have you tried it on a PC? I think the problem is not in the code logic; it may be caused by cross-compilation.
