Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass FC type along for all FFN types #1196

Merged
merged 4 commits into from
May 11, 2024
Merged

Conversation

dakinggg
Copy link
Collaborator

@dakinggg dakinggg commented May 11, 2024

Previously, we were accidentally setting the ffn_config to the ffn_config_defaults dictionary directly. This meant that the modification of self.ffn_config here (

self.ffn_config['fc_type'] = self.fc_type
) could potentially alter the defaults dictionary. This led to unexpected results if the constructor is called multiple times (which it is, during the HF construction process, internally to huggingface code). The result was the fc_type was always added to the ffn_config. With the recent overhaul of the config system, we altered this behavior by calling the constructor directly instead of from_dict, removing the extra construction call that was modifying the defaults.

This PR therefore does two things:
(1) Passes fc_type along to all ffn construction functions
(2) fixes the dangerous dictionary default setting in the MPT config class

@dakinggg dakinggg requested a review from vchiley May 11, 2024 01:38
@ShashankMosaicML ShashankMosaicML self-requested a review May 11, 2024 04:15
@dakinggg dakinggg merged commit eef4872 into mosaicml:main May 11, 2024
9 checks passed
@dakinggg dakinggg deleted the fix-fc-type branch June 22, 2024 20:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants