Enable tie_word_embeddings config setting to enable / disable weight tied embeddings #728

Merged: 20 commits, Nov 13, 2023

Conversation

@vchiley (Contributor) commented Nov 9, 2023

Add the ability to disable embedding weight tying, using the tie_word_embeddings kwarg of Hugging Face's PretrainedConfig.

With the same seed and the same config (except tie_word_embeddings: {true, false}), the two runs produce slightly different convergence curves:
[Screenshot, 2023-11-09: convergence curves]
and the resulting parameter counts differ:
[Screenshot, 2023-11-09: parameter counts]
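
The parameter-count difference is what one would expect from the arithmetic of weight tying: with tying, the output projection reuses the input embedding matrix, and without it a second matrix of the same shape is allocated. The following is a minimal sketch of that arithmetic only, not the code from this PR. The function name, the dictionary keys, and the example sizes are hypothetical, and it assumes a bias-free output projection.

```python
def embedding_param_counts(vocab_size: int, d_model: int,
                           tie_word_embeddings: bool) -> dict:
    """Count parameters in the input embedding and output projection only.

    Assumption: the output projection has no bias, so an untied projection
    adds exactly one extra vocab_size x d_model matrix.
    """
    input_embedding = vocab_size * d_model
    # When tied, the output projection reuses the input embedding matrix
    # and contributes no additional parameters.
    output_projection = 0 if tie_word_embeddings else vocab_size * d_model
    return {
        "input_embedding": input_embedding,
        "output_projection": output_projection,
        "total": input_embedding + output_projection,
    }


# Hypothetical sizes, chosen only for illustration.
VOCAB_SIZE = 50_368
D_MODEL = 2_048

tied = embedding_param_counts(VOCAB_SIZE, D_MODEL, tie_word_embeddings=True)
untied = embedding_param_counts(VOCAB_SIZE, D_MODEL, tie_word_embeddings=False)
extra = untied["total"] - tied["total"]

print(f"tied:   {tied['total']:,} embedding-related parameters")
print(f"untied: {untied['total']:,} embedding-related parameters")
print(f"extra parameters when untied: {extra:,}")
```

Under these assumptions, the untied model has exactly vocab_size * d_model more parameters than the tied one; the rest of the model is unaffected.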

@vchiley force-pushed the notie_embd branch 2 times, most recently from 1fe7e55 to 83993bf, November 10, 2023 20:40
@vchiley changed the title from "(draft; do not merge) Enable disabling embed weight tying" to "Enable tie_word_embeddings config to enable / disable weight tied embeddings", Nov 10, 2023
@vchiley changed the title from "Enable tie_word_embeddings config to enable / disable weight tied embeddings" to "Enable tie_word_embeddings config setting to enable / disable weight tied embeddings", Nov 10, 2023
@vchiley marked this pull request as ready for review, November 10, 2023 22:01
llmfoundry/models/mpt/modeling_mpt.py (5 review threads, outdated, resolved)
tests/test_model.py (1 review thread, outdated, resolved)
dakinggg previously approved these changes Nov 13, 2023
tests/test_model.py (1 review thread, outdated, resolved)
@vchiley merged commit 7899178 into mosaicml:main, Nov 13, 2023
12 checks passed