
Feat: Add support for upstream FA2 #626

Merged
winglian merged 6 commits into axolotl-ai-cloud:main from feat/fa2-hf on Sep 26, 2023

Conversation

NanoCode012 (Collaborator) commented Sep 23, 2023

Untested!

Only works for

  • llama
  • falcon

TODO:

  • Test

Ref Upstream PR: https://github.com/huggingface/transformers/pull/25598/files

NanoCode012 marked this pull request as draft September 24, 2023 23:09
mhenrichsen (Collaborator) commented:

Maybe add is_falcon_derived_model to the falcon examples?
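
A minimal sketch of what that could look like in a falcon example config (the base model name and surrounding keys are illustrative placeholders, not copied from the actual example files):

# examples/falcon/*.yml (illustrative sketch)
base_model: tiiuae/falcon-7b      # placeholder; whichever falcon checkpoint the example uses
is_falcon_derived_model: true     # opt in to the falcon-specific handling
flash_attention: true             # assumed toggle for the upstream FA2 path added in this PR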

NanoCode012 (Collaborator, Author) commented:

Does sample packing work for falcon?

I'll consolidate the check for falcon derived model.

winglian (Collaborator) commented:

> Does sample packing work for falcon?
>
> I'll consolidate the check for falcon derived model.

no, it's basic flash attention support only

NanoCode012 (Collaborator, Author) commented Sep 26, 2023

Tested with

CUDA_VISIBLE_DEVICES=0 accelerate launch -m axolotl.cli.train examples/openllama-3b/lora.yml

with sample_packing turned off.

With main 5e5296a, it gives: TypeError: flashattn_forward() got an unexpected keyword argument 'padding_mask'
With this PR, training starts normally:

[2023-09-26 13:00:15,139] [INFO] [axolotl.train.train:84] [PID:751] [RANK:0] Pre-saving adapter config to ./lora-out
[2023-09-26 13:00:15,141] [INFO] [axolotl.train.train:108] [PID:751] [RANK:0] Starting trainer...
{'loss': 1.3152, 'learning_rate': 1e-05, 'epoch': 0.0}

With sample_packing on, it runs as usual. main has no issues here either.
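
For anyone reproducing the two runs above, the relevant toggles in examples/openllama-3b/lora.yml would be roughly as follows (a sketch assuming the usual axolotl config keys; the rest of the file is unchanged):

flash_attention: true      # exercise the upstream FA2 integration
sample_packing: false      # first run: packing off (the path that errored on main)
# second run: flip to sample_packing: true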

NanoCode012 marked this pull request as ready for review September 26, 2023 13:01
winglian merged commit 19a600a into axolotl-ai-cloud:main Sep 26, 2023
4 checks passed
NanoCode012 deleted the feat/fa2-hf branch September 26, 2023 14:33
mkeoliya pushed a commit to mkeoliya/axolotl that referenced this pull request Dec 15, 2023
* Feat: Add support for upstream FA2

* chore: add is_falcon_derived_model: true to examples

* chore: add config to readme for documentation

* feat: add extra model types

* fix: remove old falcon flash patch

* chore: pin transformers and accelerate