[MASK] appears in the augmented sentences #175

Closed
saeedEnte opened this issue Nov 9, 2020 · 3 comments
Labels
bug Something isn't working

Comments

@saeedEnte

When I augment the sentences of my dataset using the word augmenter with contextual embeddings and action=substitute, the string [MASK] appears in some of the augmented outputs. Do you know why this happens?

@makcedward
Owner

Could you share some sample inputs?

@saeedEnte
Author

@makcedward I used the following code:
import nlpaug.augmenter.word as naw
aug = naw.ContextualWordEmbsAug(model_path='distilbert-base-uncased', top_k=5, top_p=0.5, aug_p=0.4, action='substitute')
generated_sents.append(aug.augment(input_sent, n=1))

@makcedward added the bug label on Nov 11, 2020
@makcedward
Owner

When every candidate token is filtered out, nothing replaces the placeholder, so [MASK] is left in the final result. I will fix this by falling back to the original token.
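To make the failure mode concrete, here is a minimal Python sketch of how a low top_p can leave no usable candidate. The filtering rules (top-k cut, cumulative-probability cut, then dropping the original token), the function names, and the toy probabilities are all assumptions for illustration; this is not nlpaug's actual code.

```python
def filter_candidates(candidates, original, top_k, top_p):
    """Assumed filtering: keep the top_k most probable tokens, then the
    shortest prefix whose cumulative probability reaches top_p, then drop
    the original token (substituting a word with itself is not useful)."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append(token)
        cumulative += prob
        if cumulative >= top_p:
            break
    return [token for token in kept if token != original]


def substitute(candidates, original, top_k=5, top_p=0.5):
    """Pick a replacement, falling back to the original token when every
    candidate was filtered out, so the mask placeholder never leaks."""
    survivors = filter_candidates(candidates, original, top_k, top_p)
    return survivors[0] if survivors else original


# Toy model predictions for one masked position (hypothetical values).
predictions = [("movie", 0.5), ("film", 0.25), ("show", 0.25)]

print(substitute(predictions, "movie", top_p=0.5))  # only "movie" survives the cut
print(substitute(predictions, "movie", top_p=0.9))  # "film" and "show" remain
```

With top_p=0.5 the only token that passes the probability cut is the original word itself, so the candidate list ends up empty and the fallback returns the original token. A higher top_p leaves real alternatives to choose from.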

Note that with this fix the augmenter may return the original text unchanged if top_p is too low. I suggest increasing top_p to 0.8 or 0.9 and trying again.
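Until a fixed release is available, one possible workaround on the caller's side is to retry when the output still contains the placeholder and fall back to the input otherwise. The helper below is a hypothetical sketch, not part of nlpaug: the name augment_without_mask, the max_tries parameter, and the retry policy are assumptions. It accepts any callable, and it handles both a string and a list return value because the return type of augment may differ between library versions.

```python
def augment_without_mask(augment_fn, text, max_tries=3, mask_token="[MASK]"):
    """Call augment_fn(text) up to max_tries times and return the first
    output that contains no mask_token; return text itself if none does."""
    for _ in range(max_tries):
        result = augment_fn(text)
        # Normalize: the augmenter may return a single string or a list.
        outputs = result if isinstance(result, list) else [result]
        if outputs and all(mask_token not in out for out in outputs):
            return outputs[0]
    # Every attempt leaked the placeholder: keep the original sentence.
    return text


# Stand-in for a real augmenter so the sketch runs without any model.
def fake_augment(sentence):
    return sentence.replace("quick", "[MASK]")


print(augment_without_mask(fake_augment, "the quick brown fox"))
```

With the augmenter from this thread, the call would look like augment_without_mask(lambda s: aug.augment(s, n=1), input_sent), with aug created using top_p=0.9 as suggested above.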
