
fix pos encoding error + target prefixing error in LM decoding #2099

Merged (6 commits) Sep 20, 2021

Conversation

funboarder13920 (Collaborator)

No description provided.

- x = x.permute(perm).contiguous()
+ x = x.permute(perm)
Contributor:

Is this change necessary?
If there is a break in the memory layout, shouldn't we fix it as close to the source as possible?
Leaving a tensor variable in a broken state rather than fixing it right away feels worrying.

funboarder13920 (Collaborator, Author), Sep 16, 2021:

Does it matter that the operation is not done at this point? It will be performed in the next lines. I can remove it, but then I would have to perform the second contiguous operation in this function anyway.
The tile function is not (and should not be) called more than once per vector and per line of inference.
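For context, a minimal sketch of what such a beam-search tile step might look like (hypothetical, not the actual OpenNMT `tile()` implementation): each batch row is repeated `count` times so every beam hypothesis gets its own copy of the encoder state.

```python
import torch

# Hypothetical tile sketch: repeat each row of x `count` times along `dim`.
# (OpenNMT's real tile() uses permute/view internally; repeat_interleave
# gives the same result for this simple case.)
def tile(x, count, dim=0):
    return x.repeat_interleave(count, dim=dim)

memory = torch.arange(6).reshape(3, 2)   # batch of 3 vectors
tiled = tile(memory, 2)                  # beam size 2 -> batch of 6
print(tiled.shape)                       # torch.Size([6, 2])
```

Because this runs once per source line, the cost of any extra copy inside it is paid only once per inference, which is the point the author is making.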

Contributor:

No, it does not matter, actually. But I personally prefer the tensor to stay in good condition at every point in the function when the change brings no performance benefit.
Here is a nice explanation of contiguous vs. non-contiguous tensors.
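To illustrate the point under discussion (a generic PyTorch sketch, not the PR's actual code): `permute` only reorders strides, leaving the tensor non-contiguous, and `.contiguous()` materializes the copy. Deferring the call only changes where the copy happens, not whether it happens.

```python
import torch

x = torch.arange(24).reshape(2, 3, 4)
y = x.permute(2, 0, 1)      # reorders strides only; no data copy
print(y.is_contiguous())    # False

z = y.contiguous()          # materializes the permuted layout (one copy)
print(z.is_contiguous())    # True

# view() requires contiguous memory, so it works on z but would raise on y:
print(z.view(-1).shape)     # torch.Size([24])
```

This is why the reviewer's concern is about code hygiene rather than correctness: any later `view`/`contiguous` forces the same copy either way.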

Comment on lines 1094 to 1095
log_probs = fn_map_state(log_probs, dim=1)
if fn_map_state is not None:
Contributor:

This does not look good.

funboarder13920 (Collaborator, Author), Sep 16, 2021:

Can you be more specific?

Contributor:

You cannot call the variable as a function and only then check whether it is None; the None check must come first. Also, I see another call for the same process a few lines earlier, which seems redundant.
But now it's all good.
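The ordering bug flagged above can be sketched as follows (hypothetical function names mirroring the snippet, not the PR's exact code): the optional callback must be guarded before it is invoked, otherwise a `None` value crashes with a `TypeError` before the check is ever reached.

```python
# Hypothetical sketch of the fn_map_state ordering fix.
def decode_step(log_probs, fn_map_state=None):
    # Buggy ordering (crashes when fn_map_state is None):
    #   log_probs = fn_map_state(log_probs, dim=1)
    #   if fn_map_state is not None: ...
    # Fixed ordering: guard first, then call.
    if fn_map_state is not None:
        log_probs = fn_map_state(log_probs, dim=1)
    return log_probs

# No callback: the input passes through unchanged.
print(decode_step([1, 2]))                                   # [1, 2]
# With a toy callback that doubles each entry (ignoring dim):
print(decode_step([1, 2], lambda x, dim: [v * 2 for v in x]))  # [2, 4]
```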

@francoishernandez francoishernandez merged commit c8081af into OpenNMT:master Sep 20, 2021
@funboarder13920 funboarder13920 deleted the fix_gpt_inference branch September 22, 2021 10:30