Label count problem #10

Open
nvliajia opened this issue Aug 12, 2020 · 10 comments

Comments

@nvliajia

The data is clearly for binary classification, so why does it report class_num=3?

@zzzzzigzag

Same question.

@Leanne-z

Same question.

@zonghui0228

zonghui0228 commented Dec 6, 2021

There is one extra entry: unk.

@caoxiaopeng123

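In the code, you can change args.class_num = len(label_field.vocab) to args.class_num = len(label_field.vocab) - 1. The labels are built with label_field.build_vocab(train_dataset, dev_dataset), which is the same routine used to build a word vocabulary. A vocabulary includes an unk entry, the placeholder for words that do not appear in it, so the count comes out one too high. label_field only affects the number of labels, so setting the label count back to its original value is enough.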
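To make the off-by-one concrete, here is a minimal pure-Python sketch of how a vocabulary builder ends up with three entries for two labels. It is a simplified stand-in, not torchtext's actual implementation: the function build_label_vocab and the sample labels are hypothetical, and it assumes the special token is placed before the real labels.

```python
from collections import Counter


def build_label_vocab(labels, unk_token="<unk>"):
    """Simplified stand-in for a vocabulary builder (hypothetical helper).

    Special tokens come first, followed by the labels ordered by frequency.
    """
    itos = [] if unk_token is None else [unk_token]
    counts = Counter(labels)
    # Most frequent label first; ties broken alphabetically for determinism.
    for label, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        itos.append(label)
    return itos


labels = ["0", "1", "1", "0", "1"]  # hypothetical binary labels
vocab = build_label_vocab(labels)

# Two real classes plus the unk placeholder gives three entries.
print(len(vocab))

# The workaround from the comment above: discount the unk entry.
class_num = len(vocab) - 1
print(class_num)
```

Note that in this sketch the real labels sit at indices 1 and 2, so subtracting one from the count is only safe if the training code also shifts the target indices down by one before computing the loss. Whether the repository does that is worth checking in its training loop.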

@caoxiaopeng123

Batch[1800] - loss: 0.009499 acc: 100.0000%(128/128)
Evaluation - loss: 0.000026 acc: 94.0000%(6616/7000)
early stop by 1000 steps, acc: 94.0000%
This is the result the author got.

Batch[2200] - loss: 0.008443 acc: 100.0000%(128/128)
Evaluation - loss: 0.000025 acc: 94.7429%(6632/7000)
Saving best model, acc: 94.7429%
This is the result I got.

@Huashan7

Did you get this result just by changing args.class_num = len(label_field.vocab) to args.class_num = len(label_field.vocab) - 1?
Batch[2200] - loss: 0.008443 acc: 100.0000%(128/128)
Evaluation - loss: 0.000025 acc: 94.7429%(6632/7000)
Saving best model, acc: 94.7429%

@caoxiaopeng123

Yes. As far as I remember I changed nothing else; I only set up the environment.

@Huashan7

@caoxiaopeng123 Thank you very much.

@lynn1885

lynn1885 commented May 8, 2023

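Alternatively, you can try turning the unk token off when configuring label_field:
label_field = data.Field(sequential=False, unk_token=None)
When I wrote my own version, I found that len(label_field.vocab) then comes out correctly, as 2.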
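One thing to watch with this approach is that removing unk also changes which index each label maps to. The sketch below illustrates the shift in plain Python. The helper label_to_index and the labels "neg" and "pos" are hypothetical, and the labels are ordered alphabetically for simplicity, whereas a real vocabulary builder typically orders by frequency.

```python
def label_to_index(labels, unk_token="<unk>"):
    """Map each label to an integer index (hypothetical helper).

    Any special token is placed first, then the distinct labels.
    """
    specials = [] if unk_token is None else [unk_token]
    ordered = specials + sorted(set(labels))
    return {token: index for index, token in enumerate(ordered)}


labels = ["neg", "pos", "pos", "neg"]  # hypothetical binary labels

with_unk = label_to_index(labels)
without_unk = label_to_index(labels, unk_token=None)

print(with_unk)     # unk takes index 0, real labels start at 1
print(without_unk)  # real labels start at 0
```

If the training loop subtracts one from the target indices to compensate for unk (an assumption about the repository, not something confirmed in this thread), that shift would presumably need to be removed when unk_token=None is used, since the labels already start at 0.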

@Cgetier520990

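A question for the experts: I only have a little over 1,000 samples, so why is args.vocabulary_size = len(text_field.vocab) more than 4,210? Is it because my 1,000+ samples together form something like a dictionary, and that dictionary holds 4,210 vocabulary entries?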
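In general, a vocabulary's size is the number of distinct tokens across all samples, plus any special tokens, so it is unrelated to the number of samples. A minimal sketch in plain Python, where the sample sentences and the two special tokens are made up for illustration:

```python
from collections import Counter

# Three hypothetical, already tokenized samples.
samples = [
    ["the", "movie", "was", "great"],
    ["the", "plot", "was", "thin"],
    ["great", "cast"],
]

# Count every token across all samples; the keys are the distinct tokens.
counter = Counter(token for sample in samples for token in sample)

# Assumed special tokens; a text field commonly reserves unk and pad.
specials = ["<unk>", "<pad>"]

vocabulary_size = len(specials) + len(counter)

print(len(samples))      # number of samples
print(len(counter))      # number of distinct tokens
print(vocabulary_size)   # distinct tokens plus special tokens
```

By the same logic, 1,000+ samples that each contain many tokens can easily yield several thousand distinct vocabulary entries.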
