2:4 sparsity with to_sparse_semi_structured method from PyTorch results in memory issue #28

Open
Ahmed-Roushdy opened this issue Mar 10, 2024 · 0 comments

Ahmed-Roushdy commented Mar 10, 2024

I am trying to reduce the memory footprint of a 2:4 SparseGPT-pruned LLaMA2 model using the to_sparse_semi_structured method from PyTorch. However, when I apply it to change how the sparse parameters are stored, I run out of memory. Note that the original dense model does not run out of memory.
Below is the code I was running, where model_path is the path to the pruned model.

```python
import torch.nn as nn
from transformers import AutoModelForCausalLM
from torch.sparse import to_sparse_semi_structured, SparseSemiStructuredTensor

device = "cuda"  # semi-structured sparse kernels require a CUDA device

model = AutoModelForCausalLM.from_pretrained(model_path)
model = model.to(device).half()

# Replace each nn.Linear weight with its 2:4 semi-structured representation
for fqn, module in model.named_modules():
    if isinstance(module, nn.Linear):
        module.weight = nn.Parameter(to_sparse_semi_structured(module.weight))
```
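A minimal sketch of how the allocation spike could be measured around this loop, assuming a CUDA device and the same `model` as above (`torch.cuda.memory_allocated` and `torch.cuda.max_memory_allocated` are standard PyTorch calls); the `requires_grad=False` flag is an assumption here, on the premise that the compressed weights are used for inference only:

```python
import torch
import torch.nn as nn
from torch.sparse import to_sparse_semi_structured

# Sketch: track allocated and peak GPU memory across the conversion loop
# to see where the reported OOM originates.
# Assumes `model` is already on a CUDA device in fp16, as above.
torch.cuda.reset_peak_memory_stats()
before = torch.cuda.memory_allocated()

with torch.no_grad():
    for fqn, module in model.named_modules():
        if isinstance(module, nn.Linear):
            # requires_grad=False is an assumption: inference-only weights
            module.weight = nn.Parameter(
                to_sparse_semi_structured(module.weight), requires_grad=False
            )

after = torch.cuda.memory_allocated()
peak = torch.cuda.max_memory_allocated()
print(f"allocated {before / 2**20:.1f} MiB -> {after / 2**20:.1f} MiB, "
      f"peak {peak / 2**20:.1f} MiB")
```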