RFC: Allow Default Behavior for Specified Plugins of FMS Accel #334

Open
fabianlim opened this issue Sep 9, 2024 · 0 comments

Motivation: users may not always know how best to use fms-acceleration plugins. Although there is a fair amount of documentation on how to use the plugins, typical users will not read it carefully. The hope is to reduce the burden on users of applying plugins when "there is no loss (and only gain) from doing so".

Users: These include various types:

  1. Users of the product Docker images.
  2. Users of fms-hf-tuning as a tuning library.

Idea: Implement a default behavior:

  • that can be tied to specified plugins that have been installed.
  • if a plugin is not installed, it is construed that the plugin is unwanted and no default behavior should be considered.
  • when default behavior is activated on an installed plugin, the plugin will activate without any conscious action from the user.

To clarify what we mean by no conscious action: an installed plugin with default behavior will activate even if the user has not specified any command line arguments to activate it.
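
A minimal sketch of the "installed means wanted" rule above, assuming each plugin ships as an importable top-level Python module. The helper names and the mapping of plugin names to module names are hypothetical and are not part of fms-acceleration.

```python
import importlib.util


def plugin_is_installed(module_name: str) -> bool:
    """Return True if the plugin's top-level module can be found."""
    return importlib.util.find_spec(module_name) is not None


def installed_default_plugins(default_capable: dict) -> list:
    """Keep only the default-capable plugins whose module is installed.

    `default_capable` maps a plugin name to its (assumed) module name.
    A plugin that is not installed is treated as unwanted and skipped.
    """
    return [
        name
        for name, module_name in default_capable.items()
        if plugin_is_installed(module_name)
    ]
```

With this rule, uninstalling a plugin is the simplest way for a user to opt out of its default behavior.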

Various options for instantiating default behavior

Manual

provide a recommended set of command line args, or append them to the command line args that a user provides.

This is the simplest form of default behavior.

  • requires no code change to fms-acceleration and its integration in fms-hf-tuning.
  • will however require code changes at the Docker packaging level to provide switches to append the command line args.
  • it is very explicit and clear. Easy to debug because there will be command line traces.
  • there is no graceful failing. If the default args fail the run, then the user will have to re-run without the recommended defaults.
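
The manual option could be sketched as a small wrapper at the packaging level that appends recommended args to whatever the user passed. This is only an illustration: the function name is hypothetical, and the flag and value in the usage example are placeholders rather than confirmed fms-hf-tuning arguments.

```python
def with_recommended_args(user_args: list, recommended: dict) -> list:
    """Append each recommended flag unless the user already passed it.

    `recommended` maps a flag (e.g. "--some_flag") to its list of values.
    Both the "--flag value" and "--flag=value" forms count as user-specified.
    """
    merged = list(user_args)
    for flag, values in recommended.items():
        already_set = any(
            arg == flag or arg.startswith(flag + "=") for arg in user_args
        )
        if not already_set:
            merged.extend([flag, *values])
    return merged


# Placeholder flag and value, for illustration only.
RECOMMENDED = {"--padding_free": ["huggingface"]}
print(with_recommended_args(["--model_name_or_path", "my-model"], RECOMMENDED))
```

Because the final argument list is built explicitly, it can be echoed to the logs, which keeps the command line trace mentioned above.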

Automatic

augment the AccelerationFrameworkConfig integration of fms-accel into fms-hf-tuning to automatically run an installed plugin with defaults.

This is much more complicated, and several considerations must be weighed carefully.

  • requires quite a bit of code change to fms-acceleration and its integration in fms-hf-tuning, as the integration was not originally designed with default behaviors in mind.
  • may require very few changes to the Docker packaging.
  • We lose the explicit command line arguments. Debugging will require peering into the stderr logs. For example, we should set TRANSFORMERS_VERBOSITY=info to log which AccelerationFrameworkPlugins have been activated.
  • need to fail gracefully as best we can.
  • need to be able to disable default behavior. If the run fails, the user must be able to take over.
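
The graceful-failure and disable requirements above might look roughly like the following. This is a sketch under stated assumptions: the environment variable `FMS_ACCEL_DISABLE_DEFAULTS` and the function `activate_defaults` are hypothetical, and each plugin is represented by a zero-argument factory that raises if its default instantiation fails.

```python
import logging
import os
from typing import Callable, Dict

logger = logging.getLogger(__name__)

# Hypothetical opt-out switch; not an existing fms-acceleration setting.
DISABLE_ENV = "FMS_ACCEL_DISABLE_DEFAULTS"


def activate_defaults(factories: Dict[str, Callable[[], object]]) -> Dict[str, object]:
    """Try to instantiate each plugin with its defaults.

    A plugin whose factory raises is logged and left inactive, so one
    failing default does not abort the whole run.
    """
    if os.environ.get(DISABLE_ENV, "0") == "1":
        logger.info("default plugin behavior disabled via %s", DISABLE_ENV)
        return {}

    active = {}
    for name, factory in factories.items():
        try:
            active[name] = factory()
            logger.info("plugin '%s' activated by default", name)
        except Exception as exc:  # fail over: keep the run alive
            logger.warning(
                "plugin '%s' deactivated, default activation failed: %s", name, exc
            )
    return active
```

Logging every activation and deactivation partly compensates for the loss of explicit command line traces.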

Some of the additional code items would be:

  • implement a way to override default behavior, this includes:
    • user manually activates the plugin via command line args and sets the values (no default values are applied).
    • user does not want automatic default behavior, and wants an installed plugin to remain inactive if no command line args are specified.
  • implement feedback to indicate to the user that a plugin has been deactivated because it failed to instantiate default behavior.
  • implement logic to allow a plugin to be set as a default.
  • select plugins that should have default behavior. These must:
    • be able to fail over safely most of the time.
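
One way to picture the override rules listed above is as a precedence order: explicit user args win, then the opt-out, then the default. The sketch below is illustrative only; `PluginState`, `Source`, and `resolve` are hypothetical names and do not reflect the actual AccelerationFrameworkConfig integration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Source(Enum):
    USER = "user"          # user passed command line args explicitly
    DEFAULT = "default"    # activated automatically with default values
    INACTIVE = "inactive"  # plugin stays off


@dataclass
class PluginState:
    installed: bool
    default_capable: bool                # plugin was selected for default behavior
    user_config: Optional[dict] = None   # set when the user passed args
    defaults_enabled: bool = True        # False when the user opts out


def resolve(state: PluginState, default_config: dict) -> Tuple[Source, Optional[dict]]:
    """Decide whether a plugin runs, and with which configuration."""
    if not state.installed:
        # Not installed is construed as unwanted.
        return Source.INACTIVE, None
    if state.user_config is not None:
        # Explicit user args override any default values.
        return Source.USER, state.user_config
    if state.default_capable and state.defaults_enabled:
        return Source.DEFAULT, dict(default_config)
    return Source.INACTIVE, None
```

Returning the source alongside the config makes it straightforward to report to the user why a plugin is or is not active.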

Plugins that could be default

These may include:

  • padding_free: this plugin switches out the data collator, or patches the model (if transformers < 4.44).
  • fused-ops-and-kernels: this plugin includes some model-specific rules to replace functions with kernelized versions. If the model rules do not match the model architecture, it will fail over quite safely.
    • however one complication is that the fused-op for quantized lora requires a manual setting of the 4bit method (e.g., auto_gptq or bnb), because there is not really a simple way to infer it at the moment.
    • One way to handle such complications is simply not to make the LoRA fused op a default, only the other kernels.
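
The last suggestion, defaulting only the kernels that need no extra information, could be sketched as below. The switch names `other_kernels`, `fused_lora`, and `base_layer` are hypothetical placeholders and not the plugin's actual configuration keys; the two 4-bit method names are the examples given above.

```python
from typing import Optional

# The two 4-bit methods named as examples in this RFC.
KNOWN_4BIT_METHODS = ("auto_gptq", "bnb")


def default_foak_switches(four_bit_method: Optional[str] = None) -> dict:
    """Build default switches for the fused-ops-and-kernels plugin.

    The LoRA fused op stays off by default because the 4-bit method cannot
    be inferred; it is enabled only when the user names the method.
    """
    switches = {"other_kernels": True, "fused_lora": False}
    if four_bit_method is not None:
        if four_bit_method not in KNOWN_4BIT_METHODS:
            raise ValueError(f"unknown 4-bit method: {four_bit_method!r}")
        switches["fused_lora"] = True
        switches["base_layer"] = four_bit_method
    return switches
```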
@fabianlim fabianlim changed the title RFC: Default Behavior of FMS Accel RFC: Allow Default Behavior of Specific Plugins of FMS Accel Sep 9, 2024
@fabianlim fabianlim changed the title RFC: Allow Default Behavior of Specific Plugins of FMS Accel RFC: Allow Default Behavior for Specified Plugins of FMS Accel Sep 9, 2024