A question about SFT #26

Open
df2046df opened this issue Jun 23, 2024 · 3 comments

Comments

@df2046df

I ran out of system memory while running SFT:
RuntimeError: [enforce fail at alloc_cpu.cpp:83] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 15132180480 bytes. Error code 12 (Cannot allocate memory)
What could be causing this? I'm new to SFT and don't know much about it yet, so I'd appreciate your help.

@Coobiw
Owner

Coobiw commented Jun 23, 2024

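You're probably running the 7B version of SFT, right? When my SFT code loads the model, it first loads it onto the CPU and then distributes it to the corresponding GPUs with pipeline parallelism. My guess is that your CPU memory (system RAM) is too small to hold a 7~8B model in half precision (a 7~8B model in half precision needs roughly 14~16 GB of storage).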
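The estimate above can be checked with a little arithmetic. The sketch below is only illustrative: it assumes the failed allocation in the traceback is a single buffer of half-precision weights (2 bytes per parameter), and the helper names `weight_bytes` and `total_ram_bytes` are hypothetical, not part of this repository.

```python
import os

# Bytes used per parameter for common dtypes.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_bytes(num_params: int, dtype: str = "fp16") -> int:
    """Memory needed just to hold the weights (no optimizer state or activations)."""
    return num_params * BYTES_PER_PARAM[dtype]


def total_ram_bytes():
    """Total physical RAM on Unix-like systems, or None if it cannot be queried."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None


# Size of the allocation reported in the RuntimeError above.
requested = 15_132_180_480

# Assumption: the buffer holds fp16 weights, so 2 bytes per parameter.
implied_params = requested // BYTES_PER_PARAM["fp16"]

print(f"7B in fp16: {weight_bytes(7_000_000_000) / 1e9:.0f} GB")
print(f"8B in fp16: {weight_bytes(8_000_000_000) / 1e9:.0f} GB")
print(f"requested:  {requested / 1e9:.2f} GB, about {implied_params / 1e9:.2f}B parameters")

ram = total_ram_bytes()
if ram is not None:
    verdict = "enough" if ram > requested else "not enough"
    print(f"physical RAM: {ram / 1e9:.2f} GB ({verdict} for this allocation alone)")
```

Under that assumption the requested 15,132,180,480 bytes correspond to about 7.57B parameters, which fits the 7~8B range mentioned above. The RAM check is only a lower bound, since the operating system and other processes also need memory.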

@df2046df
Author

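> You're probably running the 7B version of SFT, right? When my SFT code loads the model, it first loads it onto the CPU and then distributes it to the corresponding GPUs with pipeline parallelism. My guess is that your CPU memory (system RAM) is too small to hold a 7~8B model in half precision (a 7~8B model in half precision needs roughly 14~16 GB of storage).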

In that case, is it possible to load the model directly onto the GPU? The CPU memory on my machine probably can't meet this requirement.

@Coobiw
Owner

Coobiw commented Jun 25, 2024
