
Validation Score #5

Open
muni2773 opened this issue Aug 31, 2018 · 4 comments

@muni2773

Hi, you mention in your README that the original paper's results were

Mnemonic Reader (original paper) 71.8 81.2

But the paper reports much higher results:
https://arxiv.org/pdf/1705.02798.pdf

Could you shed some light on why this is different?

@seanliu96
Collaborator

The link you provided is the latest version of the paper, but this repo corresponds to the v3 version (https://arxiv.org/pdf/1705.02798v3.pdf), which was published on 5 Sep 2017. When I have free time, I will check which parts have been improved. Of course, you are welcome to help find the differences. Thanks!

@muni2773
Author

Running the training right now on 2 GPUs. At epoch 21, EM is 72.93 and F1 is 81.69, so it looks like I might have already beaten the published results.

Happy to help if you point me in some direction.

@muni2773
Author

muni2773 commented Sep 4, 2018

Hi Sean,

The following are the changes in the new paper versus v3. The embedding layers are the same as before, as they aren't really mentioned in the new paper at all.

1.) Reattention Mechanism - In addition to Iterative Alignment and Self Alignment, an alignment memory layer is proposed to address the problem that each alignment is not directly aware of previous alignments. The intuition is that two words should be correlated if their attentions over the same text overlap heavily, and should be less related otherwise. If we have access to the previous attentions, we can compute their dot product to obtain a "similarity of attention" (see the sketch after this list). This is the biggest addition to the model versus v3 and would be the main coding effort.

2.) Dynamic-critical Reinforcement Learning - This is essentially a combination of the previous memory-based answer pointer and reinforcement learning, with a few changes that should be easy to implement (rough sketch below).
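
A rough sketch of how I picture the similarity-of-attention bonus, just so we're talking about the same thing. The names and shapes below are my own placeholders, not taken from the paper or this repo:

```python
# Sketch of the reattention bonus: two words get a boost if their previous
# attention distributions over the same text overlap heavily (large dot product).
# Placeholder shapes:
#   E_cur          (C, Q)  current raw similarity between context and question words
#   ctx_attn_prev  (C, C)  previous-hop attention of each context word over the context
#   q_attn_prev    (Q, C)  previous-hop attention of each question word over the context
import torch

def reattend(E_cur, ctx_attn_prev, q_attn_prev, gamma=1.0):
    # Entry [i, j] is the dot product of context word i's and question word j's
    # previous attention distributions over the same text (the context).
    attn_similarity = ctx_attn_prev @ q_attn_prev.t()   # (C, C) x (C, Q)^T? -> (C, Q)
    # Add the "similarity of attention" as a weighted bonus to the raw scores.
    return E_cur + gamma * attn_similarity
```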
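And a very rough sketch of the dynamic-critical idea as I understand it, again with placeholder names (`span_f1` stands in for any function that scores a predicted (start, end) span against the gold span with F1):

```python
# Sketch of dynamic-critical RL: sample one span, greedily decode another,
# and push likelihood toward whichever span earns the higher F1, using the
# other span's reward as the baseline.
import torch

def dcrl_loss(p_start, p_end, gold_span, span_f1):
    # p_start / p_end are 1-D probability distributions over passage positions.
    s_sampled = int(torch.multinomial(p_start, 1))
    e_sampled = int(torch.multinomial(p_end, 1))
    s_greedy = int(p_start.argmax())
    e_greedy = int(p_end.argmax())

    r_sampled = span_f1((s_sampled, e_sampled), gold_span)
    r_greedy = span_f1((s_greedy, e_greedy), gold_span)

    # Dynamically treat the better-scoring span as the learning target.
    if r_sampled >= r_greedy:
        target, reward, baseline = (s_sampled, e_sampled), r_sampled, r_greedy
    else:
        target, reward, baseline = (s_greedy, e_greedy), r_greedy, r_sampled

    log_prob = torch.log(p_start[target[0]]) + torch.log(p_end[target[1]])
    # REINFORCE-style loss weighted by the reward gap over the dynamic baseline.
    return -(reward - baseline) * log_prob
```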

Let me know what you think about the above.

Thanks in Advance

Muni

@hackerwei

@muni2773 have you made any progress on EM or F1 compared to the v3 version?
