
[Question] Actual usage examples? #29

Open
Maghoumi opened this issue Feb 19, 2019 · 7 comments

Comments

Maghoumi commented Feb 19, 2019

Besides the toy examples listed in the docs and tests, are there actual examples of this library available anywhere?

I'm interested in using this library for a sequence labeling project, but I'm curious to know if I'm using this library correctly. What I have is something like this:

import torch.nn as nn
from torchcrf import CRF

class MyModel(nn.Module):
    def __init__(self, num_features, num_classes):
        super(MyModel, self).__init__()
        self.num_features = num_features
        self.num_classes = num_classes
        self.lstm = nn.LSTM(num_features, 128)
        self.fc = nn.Linear(128, num_classes)
        self.crf = CRF(num_classes)

    def forward(self, x):
        out, _ = self.lstm(x)  # out: (seq_length, batch_size, 128)
        return self.fc(out)    # emission scores: (seq_length, batch_size, num_classes)

# ----------------------------------------------------------
model = MyModel(...)

# Training loop:
optimizer.zero_grad()
y_hat = model(batch)  # The network's forward returns fc(lstm(batch))
loss = -model.crf(y_hat, y)
loss.backward()
optimizer.step()

Although this seems to work and the loss is decreasing, I have a feeling that I might be missing something.
Any help is appreciated. Thanks!

kmkurn (Owner) commented Feb 21, 2019

Hi,

Your usage seems alright. The examples are meant to show how to use the CRF layer given that one has produced the emission scores, i.e. (unnormalized) log P(y_t | X) where y_t is the tag at position t and X is the input sentence. In your code, y_hat would have a shape of (seq_length, batch_size, num_classes) where each y_hat[i, j, k] contains the score of the j-th example in the batch having tag k in the i-th position, which is as expected. I'll consider adding a more complete example in the docs. Thanks for the suggestion!

@Maghoumi (Author)

Thanks for your response. What was confusing to me originally was the fact that your CRF layer is actually a loss that one can minimize, whereas other PyTorch implementations had a separately-defined Viterbi loss module.

Yes, the dimensions you mentioned coincide with what I have in my code.
After reading your explanation, I think the only change needed in my pseudo-code above would be to change loss = -model.crf(y_hat, y) to loss = -model.crf(y_hat.log_softmax(2), y) (given that the output of the FC layer is returned directly from the network, but we need emission scores).

kmkurn (Owner) commented Feb 21, 2019

> What was confusing to me originally was the fact that your CRF layer is actually a loss that one can minimize, whereas other PyTorch implementations had a separately-defined Viterbi loss module.

Actually, this is something I think about every now and then. Right now the forward method returns the loss, which does not really fit the common pattern where forward returns some kind of prediction and a separate loss object computes the loss from that prediction and the gold target, as you mentioned. It may be helpful to provide such a loss class for those who are more comfortable with that pattern, such as yourself. Thanks for bringing this up.
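For what it's worth, the wrapper pattern under discussion could look something like this sketch (CRFLoss is a hypothetical name, not part of the library; in real PyTorch code it would subclass nn.Module):

```python
class CRFLoss:
    """Hypothetical loss wrapper: negates a CRF layer's log-likelihood so it
    can be minimized, matching the usual prediction-then-loss pattern."""
    def __init__(self, crf):
        self.crf = crf  # any callable (emissions, tags) -> log-likelihood

    def __call__(self, emissions, tags):
        return -self.crf(emissions, tags)

# Toy stand-in for a CRF layer that returns a fixed log-likelihood.
fake_crf = lambda emissions, tags: -2.5
loss_fn = CRFLoss(fake_crf)
print(loss_fn(None, None))  # → 2.5
```

The model's forward would then return the emission scores as its prediction, and the loss object alone would touch the gold tags.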

> would be to change loss = -model.crf(y_hat, y) to loss = -model.crf(y_hat.log_softmax(2), y)

You don't have to. The CRF layer accepts unnormalized emission scores just fine. It will normalize the score of y over all possible sequences of tags.
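This can be checked with a small framework-free example (brute-force over all tag sequences, toy numbers; the transitions matrix is made up). Because log_softmax subtracts a per-position constant from every tag's score, every candidate tag sequence's total score shifts by the same amount, and that shift cancels between the numerator and the partition function:

```python
import itertools
import math

def crf_log_likelihood(emissions, transitions, tags):
    """Brute-force linear-chain CRF log-likelihood for one sequence.
    emissions: [seq_len][num_tags], transitions: [num_tags][num_tags]."""
    num_tags = len(emissions[0])
    def score(seq):
        s = emissions[0][seq[0]]
        for t in range(1, len(seq)):
            s += transitions[seq[t - 1]][seq[t]] + emissions[t][seq[t]]
        return s
    # Partition function: sum over every possible tag sequence.
    log_z = math.log(sum(math.exp(score(seq))
                         for seq in itertools.product(range(num_tags),
                                                      repeat=len(emissions))))
    return score(tags) - log_z

def log_softmax(row):
    z = math.log(sum(math.exp(e) for e in row))
    return [e - z for e in row]

emissions = [[0.5, -1.2, 0.3], [2.0, 0.1, -0.4]]
transitions = [[0.0, 0.2, -0.1], [0.3, 0.0, 0.1], [-0.2, 0.4, 0.0]]
tags = (0, 2)

ll_raw = crf_log_likelihood(emissions, transitions, tags)
ll_norm = crf_log_likelihood([log_softmax(r) for r in emissions],
                             transitions, tags)
assert abs(ll_raw - ll_norm) < 1e-9  # identical log-likelihood either way
```

So under these assumptions, loss = -model.crf(y_hat, y) and loss = -model.crf(y_hat.log_softmax(2), y) optimize the same objective.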

@Maghoumi (Author)

> It may be helpful to provide such a loss class for those who are more comfortable with this pattern such as yourself. Thanks for bringing this up.

No problem! That would be an excellent idea.
Also, thanks for the clarification regarding the log_softmax() bit.

Feel free to close this issue, or keep it open as a reminder if you decide to incorporate more examples and also change the library such that loss is separate from the CRF's output. I'd personally be very happy if the changes/examples are added.
Also, thanks for the great library!

xiaodaoyoumin commented Jun 11, 2019

Hi, I ran into a problem.
I downloaded test_crf.py, in which you provide some examples.
Then at the end of the file I added the following code:

if __name__ == "__main__":
    m = TestDecode()
    m.test_batched_decode()

The first error was that the CRF model has no attribute batch_first; I worked around it manually with CRF.batch_first = False.

Then when I ran the above code, I got this error:

*** IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

I am not sure whether it is because my torch version is 1.1.0. By the way, which torch version do you recommend?

@xiaodaoyoumin

I have solved it: with torch version <= 1.0.0 there is no error.

kmkurn (Owner) commented Jun 13, 2019

@Huijun-Cui Thanks for letting me know. Next time please open a separate issue.

Repository owner locked as off-topic and limited conversation to collaborators Jun 13, 2019