Feat: Add support for upstream FA2 #626
Conversation
Maybe add
There is some falcon patching that already happens here: https://github.com/OpenAccess-AI-Collective/axolotl/blob/f3d939016a6ec681be40e83c6a57c682cb60e2b1/src/axolotl/utils/models.py#L117-L127
Does sample packing work for falcon? I'll consolidate the check for falcon-derived models.
No, it's basic flash attention support only.
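(For reference, a consolidated check for falcon-derived models could look roughly like the sketch below; the helper name, the override parameter, and the exact set of model_type values are illustrative assumptions, not the merged code.)

```python
# Hypothetical sketch only: consolidating the falcon-derived-model check
# behind one helper. Names and the model_type set are illustrative.
from typing import Optional

from transformers import AutoConfig

FALCON_MODEL_TYPES = {"falcon", "RefinedWeb", "RefinedWebModel"}  # assumed set


def is_falcon_derived(model_name_or_path: str, override: Optional[bool] = None) -> bool:
    """Decide whether the falcon code path should apply.

    `override` stands in for an explicit config flag such as
    is_falcon_derived_model; when provided, it wins over autodetection
    from the Hugging Face model config.
    """
    if override is not None:
        return override
    config = AutoConfig.from_pretrained(model_name_or_path, trust_remote_code=True)
    return getattr(config, "model_type", None) in FALCON_MODEL_TYPES
```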
Tested with CUDA_VISIBLE_DEVICES=0 accelerate launch -m axolotl.cli.train examples/openllama-3b/lora.yml with sample_packing off. With main 5e5296a, it gives:

[2023-09-26 13:00:15,139] [INFO] [axolotl.train.train:84] [PID:751] [RANK:0] Pre-saving adapter config to ./lora-out
[2023-09-26 13:00:15,141] [INFO] [axolotl.train.train:108] [PID:751] [RANK:0] Starting trainer...
{'loss': 1.3152, 'learning_rate': 1e-05, 'epoch': 0.0}

With sample_packing on, it runs as usual; main has no issues here either.
* Feat: Add support for upstream FA2
* chore: add is_falcon_derived_model: true to examples
* chore: add config to readme for documentation
* feat: add extra model types
* fix: remove old falcon flash patch
* chore: pin transformers and accelerate
Untested!
Only works for
TODO:
Ref Upstream PR: https://github.com/huggingface/transformers/pull/25598/files
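A minimal sketch of what enabling the upstream FA2 path looks like from the transformers side, assuming a release that includes the linked PR (the model id and dtype are illustrative; the use_flash_attention_2 flag comes from that PR and was later superseded by attn_implementation="flash_attention_2"):

```python
# Minimal sketch: load a causal LM with upstream Flash Attention 2 enabled.
# Assumes a transformers version that includes the referenced upstream PR
# and that the flash-attn package is installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"  # illustrative; any FA2-supported model works

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # FA2 requires fp16/bf16 weights
    use_flash_attention_2=True,   # flag added by the linked upstream PR
)
```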