Mixtral official #942

Merged: winglian merged 3 commits into main from mixtral-official on Dec 12, 2023

Conversation

winglian (Collaborator) commented:

There is a Mixtral fix that landed after the latest transformers release, so we can't pin the transformers version yet. There is also a Mixtral hot-fix ticket open in transformers that we should keep an eye on.

This PR feels a bit cleaner than previous multipack patches, since it abuses the MIXTRAL_ATTENTION_CLASSES mapping to replace the attention class cleanly (see the sketch below).
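
For readers unfamiliar with that hook: in transformers around the 4.36 release referenced by this PR, modeling_mixtral exposes a MIXTRAL_ATTENTION_CLASSES dict that MixtralDecoderLayer consults when constructing its self-attention module, so replacing an entry swaps the attention class model-wide. Below is a minimal sketch of that technique, not the exact axolotl patch; the MultipackMixtralFlashAttention2 class and the patch_mixtral_attention_for_multipack helper are hypothetical names introduced for illustration.

```python
# Sketch of patching via the attention-class registry, assuming
# transformers ~4.36 where modeling_mixtral defines
# MIXTRAL_ATTENTION_CLASSES and MixtralFlashAttention2.
import transformers.models.mixtral.modeling_mixtral as modeling_mixtral


class MultipackMixtralFlashAttention2(modeling_mixtral.MixtralFlashAttention2):
    """Hypothetical subclass; a real patch would override forward() here
    to make flash attention aware of multipack (sample-packed) batches."""


def patch_mixtral_attention_for_multipack():
    # Swap the class the registry hands out for the "flash_attention_2"
    # implementation. MixtralDecoderLayer looks the class up in this dict
    # at construction time, so the patch must run *before* the model is
    # instantiated (e.g. before from_pretrained()).
    modeling_mixtral.MIXTRAL_ATTENTION_CLASSES["flash_attention_2"] = (
        MultipackMixtralFlashAttention2
    )
```

The appeal over earlier monkey-patches is that nothing inside the upstream forward pass has to be rewritten in place: the registry lookup happens once, when the model is built, so calling the patch before loading the model is enough.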

winglian merged commit 7fabc4d into main on Dec 12, 2023 (4 checks passed).
winglian deleted the mixtral-official branch on Dec 12, 2023 at 04:44.
JustinMeans mentioned this pull request on Dec 12, 2023.
mkeoliya pushed a commit to mkeoliya/axolotl that referenced this pull request on Dec 15, 2023:
* multipack support for official mixtral implementation

* fix patch to load multipack for mixtral

* chore: lint