Why is multi-task performance in baseline.md almost always worse than single-task? #30

Open
OYE93 opened this issue Sep 26, 2019 · 4 comments

Comments

@OYE93

OYE93 commented Sep 26, 2019

As the title says. Thanks.

@JayYip
Owner

JayYip commented Oct 9, 2019

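This paper also reports similar results; take a look if you're interested.

I know that in Microsoft's MT-DNN paper, multi-task training outperforms single-task training, but I think that requires cherry-picking the tasks.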

@OYE93
Author

OYE93 commented Oct 9, 2019

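Hi, I skimmed the paper, and it also says that multi-task training works better than single-task training. I think the particular multi-task mechanism, or the datasets, may also affect the results. Thanks for your reply.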

@JayYip
Owner

JayYip commented Oct 9, 2019

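> Hi, I skimmed the paper, and it also says that multi-task training works better than single-task training. I think the particular multi-task mechanism, or the datasets, may also affect the results. Thanks for your reply.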

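I think you may have misread it. What Microsoft's MT-DNN and this repo do is, in effect, the "uniform scaling" baseline from the paper above. Look at Figure 2 and Table 1 in the paper: uniform scaling performs worse than single-task training.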
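
For readers unfamiliar with the term, here is a minimal sketch of what "uniform scaling" usually means, assuming it refers to averaging the per-task losses with equal weights. The function name, task names, and loss values below are hypothetical and are not taken from this repo or from MT-DNN.

```python
from typing import Dict, Optional


def combine_task_losses(
    task_losses: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Return a weighted sum of per-task losses.

    When no weights are given, every task gets the same weight 1/T,
    which is the equal-weight ("uniform scaling") case assumed here.
    """
    if weights is None:
        num_tasks = len(task_losses)
        weights = {name: 1.0 / num_tasks for name in task_losses}
    return sum(weights[name] * loss for name, loss in task_losses.items())


# Hypothetical per-task losses from one training step.
losses = {"ner": 0.9, "cws": 0.3, "pos": 0.6}

# Uniform scaling: each task contributes equally to the shared objective.
uniform_loss = combine_task_losses(losses)

# A non-uniform alternative, with hand-picked weights for comparison.
custom_loss = combine_task_losses(losses, {"ner": 0.5, "cws": 0.25, "pos": 0.25})
```

With equal weights, no task is prioritized over another, so a task with conflicting gradients can pull the shared parameters away from what is best for the others. That is one plausible reason a uniformly weighted model can trail the single-task models.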

@OYE93
Author

OYE93 commented Oct 10, 2019

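OK, then I must have misunderstood MT-DNN as well. So the paper is saying that uniform scaling does not perform as well as single-task training.
I'll take another look. Thank you very much.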
