Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Put ancestors on the same device as the KV cache #660

Merged
merged 1 commit into from
Feb 14, 2024

Conversation

raphaelchinchilla
Copy link
Contributor

Addressed the issue reported in #656 with the solution proposed by @rlouf .

I am really not sure whether this is the best solution.

Addressed the issue reported in outlines-dev#656 with the solution proposed by @rlouf . 

I am really not sure whether this is the best solution.
Copy link
Collaborator

@lapp0 lapp0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing!

I think we should synchronize the device to ancestors since I believe (correct me if I'm wrong) it's possible that the kv_cache layers are on different devices.

Also could you please smoke test this with all 3 samplers on a 2 GPU system with device="auto" set?

@raphaelchinchilla
Copy link
Contributor Author

Thanks for contributing!

I am not sure whether this is really a contribution, I just did what @rlouf suggested :)

I think we should synchronize the device to ancestors since I believe (correct me if I'm wrong) it's possible that the kv_cache layers are on different devices.

Honestly, I understand very little about the package (I started using it yesterday), and I do not feel I can actually give my opinion on whether you are wrong about it or not. It does feel like too simple of a contribution.

Also could you please smoke test this with all 3 samplers on a 2 GPU system with device="auto" set?

Sorry, which 3 samplers are you talking about?

@rlouf
Copy link
Member

rlouf commented Feb 14, 2024

I just checked transformers and that's how they do it. ancestors is orders of magnitude smaller than layer so even with several layers that's very likely the most efficient. Anyway, if it solves @raphaelchinchilla 's problem let's merge it.

@raphaelchinchilla thank you for pushing the change, much appreciated! 🙏

@rlouf rlouf changed the title Update generator.py Put ancestors on the same device as the KV cache Feb 14, 2024
@rlouf rlouf merged commit c548a8a into outlines-dev:main Feb 14, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants