Err in char_rnn tutorial #193

williamFalcon · 2018-01-10T13:00:29Z

@apaszke
This tutorial "char_rnn_classification" has a bug in the forward part of this code:

import torch
import torch.nn as nn
from torch.autograd import Variable

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()

        self.hidden_size = hidden_size

        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def initHidden(self):
        return Variable(torch.zeros(1, self.hidden_size))

n_hidden = 128
rnn = RNN(n_letters, n_hidden, n_categories)

The RNN formula updates the hidden state ht from the current input and the previous hidden state ht-1, which is implemented by these lines:

        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)

However, these lines:

        output = self.i2o(combined)   
        output = self.softmax(output)

are trying to project into the classification space, but self.i2o operates on combined (the input concatenated with ht-1) instead of on the new hidden state ht.

This implementation uses the wrong formula: it computes the output from combined. The correct formula computes the output from ht, which can be implemented as:

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        # this line changed (hidden = ht; combined = [input, ht-1])
        # note: self.i2o must then be defined as nn.Linear(hidden_size, output_size)
        output = self.i2o(hidden)
        output = self.softmax(output)
        return output, hidden
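
A minimal runnable sketch of the change above, under one assumption the post does not spell out: because self.i2o now receives hidden rather than combined, its input size has to be hidden_size instead of input_size + hidden_size. The class name FixedRNN is hypothetical, and the sizes 57, 128 and 18 are placeholder values chosen only for illustration.

```python
import torch
import torch.nn as nn


class FixedRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        # Assumption: i2o is resized so that it takes only the new hidden state.
        self.i2o = nn.Linear(hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)  # [input, ht-1]
        hidden = self.i2h(combined)               # ht
        output = self.softmax(self.i2o(hidden))   # output computed from ht
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, self.hidden_size)


n_letters, n_hidden, n_categories = 57, 128, 18  # placeholder sizes
rnn = FixedRNN(n_letters, n_hidden, n_categories)

x = torch.zeros(1, n_letters)
x[0, 0] = 1.0  # one-hot vector standing in for a single character
output, next_hidden = rnn(x, rnn.initHidden())
print(output.shape, next_hidden.shape)
```

Without the resized i2o, passing hidden to the original layer would fail with a shape mismatch, since that layer expects input_size + hidden_size features.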

Basically, the current implementation computes the output from combined, i.e. from the input and ht-1, but it really should compute it from ht.
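
In symbols, one way to write the two variants. The notation is reconstructed from the code and is an assumption: $W_{i2h}, b_{i2h}$ and $W_{i2o}, b_{i2o}$ stand for the weights and biases of self.i2h and self.i2o, and $[x_t ; h_{t-1}]$ stands for the concatenation called combined. Note that $W_{i2o}$ has a different shape in the two output equations.

```latex
\begin{align*}
h_t &= W_{i2h}\,[x_t ; h_{t-1}] + b_{i2h}
  && \text{(hidden update, same in both variants)} \\
o_t &= \log \operatorname{softmax}\!\big(W_{i2o}\,[x_t ; h_{t-1}] + b_{i2o}\big)
  && \text{(current: output from combined)} \\
o_t &= \log \operatorname{softmax}\!\big(W_{i2o}\,h_t + b_{i2o}\big)
  && \text{(proposed: output from } h_t\text{)}
\end{align*}
```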

ayush1999 · 2018-03-25T16:17:54Z

@williamFalcon Small question. You mentioned that

combined = torch.cat((input, hidden), 1)
hidden = self.i2h(combined)

Performs the RNN formula. I didn't get how concatenating the input and hidden tensors is useful in this scenario. Why the need for concatenation?

EmreOzkose · 2018-10-27T11:30:29Z

I think it should be changed. The implemented formula is wrong?

Lguyogiro · 2022-10-08T22:25:25Z

@williamFalcon Small question. You mentioned that
combined = torch.cat((input, hidden), 1)
hidden = self.i2h(combined)
Performs the RNN formula. I didn't get how concatenating the input and hidden tensors is useful in this scenario. Why the need for concatenation?

I know this is an old issue/comment, but you don't "need" to concatenate the input and hidden tensors. This is just a way to calculate the same thing without having two separate weight tensors (the standard formulae typically assume two weight matrices, one for the input and one for the hidden state).
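
The equivalence described above can be checked numerically. The following is a small sketch, not code from the tutorial: the sizes 5 and 3 and the names W_x and W_h are placeholders introduced here. It slices the weight of a single nn.Linear into an input block and a hidden block and compares the two ways of computing the result.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
input_size, hidden_size = 5, 3  # small placeholder sizes

# One Linear layer applied to the concatenation [x, h].
i2h = nn.Linear(input_size + hidden_size, hidden_size)

x = torch.randn(1, input_size)
h = torch.randn(1, hidden_size)
concat_result = i2h(torch.cat((x, h), 1))

# Split the same weight matrix into an input block and a hidden block.
W_x = i2h.weight[:, :input_size]  # shape (hidden_size, input_size)
W_h = i2h.weight[:, input_size:]  # shape (hidden_size, hidden_size)
split_result = x @ W_x.T + h @ W_h.T + i2h.bias

# The two results should agree up to floating-point error.
print(torch.allclose(concat_result, split_result, atol=1e-5))
```

So concatenating and applying one layer is the two-matrix formulation with the matrices stored side by side and a single shared bias.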

Also I just wanted to chime in. I agree that this should be changed.

mikebrow · 2023-05-31T16:58:41Z

/assigntome

svekars added the medium and docathon-h1-2023 labels May 31, 2023

github-actions bot assigned mikebrow May 31, 2023

mikebrow mentioned this issue May 31, 2023: Address Err in char_rnn tutorial issue #2374 (Merged)

BeniaminC mentioned this issue Jun 1, 2023: I found a mistake in an official case #1052 (Closed)

svekars closed this as completed in #2374 Jun 1, 2023
