Enable One-Shot Launch from Finetuning Script #1907

Merged
merged 9 commits into from
Dec 21, 2023

@Satrat Satrat commented Dec 14, 2023

This PR sets up the finetuning script so that it can launch a one-shot flow. The change log for alternating finetuning was starting to pile up, so I wanted to open this as a preliminary PR before adding the alternating logic.

Summary of Changes

  • Added a oneshot path to text_generation.py (originally the finetuning script)
  • Added a StageRunner class to manage the different flows (train, eval, oneshot), rather than doing everything inside the text_generation.py entrypoint script
  • Moved a number of helper functions that are now shared between text_generation.py and obcq.py (eventually we can probably get rid of the OBCQ script)
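For readers who have not opened the diff, the idea of routing flows through a runner object instead of the entrypoint can be sketched roughly as follows. This is a minimal illustration and not the StageRunner added in this PR: the class name `SketchStageRunner`, the `FlowArgs` container, the method names, and the fixed oneshot, train, eval ordering are all assumptions made for the example. Only the `do_oneshot` flag name appears in the test script below.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FlowArgs:
    """Hypothetical flag container; only do_oneshot is named in this PR."""
    do_oneshot: bool = False
    do_train: bool = False
    do_eval: bool = False


class SketchStageRunner:
    """Illustrative stand-in, NOT the real StageRunner implementation."""

    def __init__(self, args: FlowArgs) -> None:
        self.args = args
        self.log: List[str] = []  # records which flows were executed

    def one_shot(self) -> None:
        # Placeholder: a real runner would apply the recipe to calibration data
        self.log.append("oneshot")

    def train(self) -> None:
        # Placeholder for the finetuning flow
        self.log.append("train")

    def evaluate(self) -> None:
        # Placeholder for the evaluation flow
        self.log.append("eval")

    def run(self) -> List[str]:
        # Assumed ordering: oneshot, then train, then eval
        if self.args.do_oneshot:
            self.one_shot()
        if self.args.do_train:
            self.train()
        if self.args.do_eval:
            self.evaluate()
        return self.log


def run_entrypoint(**kwargs: bool) -> List[str]:
    """The entrypoint only builds the args and delegates to the runner."""
    return SketchStageRunner(FlowArgs(**kwargs)).run()
```

With this shape the entrypoint script stays thin, and a later flow (such as the alternating logic mentioned above) would only need a new method on the runner.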

Testing

test_oneshot_recipe.yaml

test_stage:
  obcq_modifiers:
    SparseGPTModifier:
      sparsity: 0.5
      block_size: 128
      sequential_update: False
      quantize: False
      percdamp: 0.01
      prunen: 0
      prunem: 0
      targets: [
        "re:model.layers.\\d+$"
      ]
      target_ids: ["attention_mask", "position_ids"]  
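The `targets` entry in the recipe above is easier to read with a concrete check of which module names it would select. The sketch below assumes that the `re:` prefix marks the rest of the string as a regular expression matched from the start of a module name. The helper `matches_target` and the list of module names are hypothetical, and the library's own matching logic may differ.

```python
import re

# Same string the YAML parser yields: "\\d" in double quotes becomes \d
TARGET = "re:model.layers.\\d+$"


def matches_target(name: str, target: str) -> bool:
    """Hypothetical helper: treat 're:' targets as regexes, others as exact names."""
    if target.startswith("re:"):
        return re.match(target[3:], name) is not None
    return name == target


# Hypothetical module names in the style of a Llama checkpoint
names = [
    "model.layers.0",
    "model.layers.11",
    "model.layers.0.self_attn",
    "model.embed_tokens",
    "lm_head",
]

selected = [n for n in names if matches_target(n, TARGET)]
```

Under these assumptions the trailing `$` restricts the match to whole decoder layers, so submodules such as `model.layers.0.self_attn` are not selected individually.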

Test script:

from sparseml.transformers.finetune.text_generation import run_general


def run():
    model = "Xenova/llama2.c-stories15M"
    dataset_name = "open_platypus"
    concatenate_data = False
    do_oneshot = True  # take the one-shot path instead of finetuning
    output_dir = "./output_oneshot"
    overwrite_output_dir = True
    recipe = "test_oneshot_recipe.yaml"
    # Calibration data is the first 90% of the train split
    splits = {
        "calibration": "train[:90%]",
    }

    run_general(
        model_name_or_path=model,
        dataset_name=dataset_name,
        do_oneshot=do_oneshot,
        output_dir=output_dir,
        overwrite_output_dir=overwrite_output_dir,
        recipe=recipe,
        concatenate_data=concatenate_data,
        splits=splits,
    )


if __name__ == "__main__":
    run()

@Satrat Satrat marked this pull request as ready for review December 14, 2023 23:16
@Satrat Satrat mentioned this pull request Dec 15, 2023
@Satrat Satrat merged commit f088321 into main Dec 21, 2023
12 checks passed
@Satrat Satrat deleted the alternate_flows branch December 21, 2023 16:44