Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Err in char_rnn tutorial #193

Closed
williamFalcon opened this issue Jan 10, 2018 · 4 comments · Fixed by #2374
Closed

Err in char_rnn tutorial #193

williamFalcon opened this issue Jan 10, 2018 · 4 comments · Fixed by #2374
Assignees
Labels
docathon-h1-2023 A label for the docathon in H1 2023 medium

Comments

@williamFalcon
Copy link

williamFalcon commented Jan 10, 2018

@apaszke
This tutorial "char_rnn_classification" has a bug in the forward part of this code:

import torch.nn as nn
from torch.autograd import Variable

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()

        self.hidden_size = hidden_size

        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def initHidden(self):
        return Variable(torch.zeros(1, self.hidden_size))

n_hidden = 128
rnn = RNN(n_letters, n_hidden, n_categories)

The RNN formula is:
image

Which is implemented by these lines:

        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)

However, these lines:

        output = self.i2o(combined)   
        output = self.softmax(output)

Are trying to project into the classification space. However, the self.i2o operates on the combined output instead of the ht output.

This implementation uses the wrong formula:
image

But the correct formula is:
image

Which can be implemented as:

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(hidden) # this line changed   (bc hidden = ht. combined = ht-1)
        output = self.softmax(output)
        return output, hidden

Basically, the current implementation does this:
image

But it really should do this:
image

@ayush1999
Copy link

@williamFalcon Small question. You mentioned that

combined = torch.cat((input, hidden), 0)
hidden = self.i2h(combined)

Performs the RNN formula. I didn't get how concatenating the input and hidden tensors is useful in this scenario. Why the need for concatenation?

@EmreOzkose
Copy link

I think it should be changed. The implemented formula is wrong?

@Lguyogiro
Copy link

@williamFalcon Small question. You mentioned that

combined = torch.cat((input, hidden), 0)
hidden = self.i2h(combined)

Performs the RNN formula. I didn't get how concatenating the input and hidden tensors is useful in this scenario. Why the need for concatenation?

I know this is an old issue/comment....but you don't "need" to concatenate the input and hidden tensors, this is just a way to calculate the same thing without having two separate weights tensors (the standard formulae typically assume two weights matrices, see below)
image

Also I just wanted to chime in. I agree that this should be changed.

@svekars svekars added medium docathon-h1-2023 A label for the docathon in H1 2023 labels May 31, 2023
@mikebrow
Copy link
Contributor

/assigntome

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docathon-h1-2023 A label for the docathon in H1 2023 medium
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants