
generating P(T) for Anti-LM #34

Open
kuhanw opened this issue Dec 30, 2017 · 2 comments

kuhanw commented Dec 30, 2017

Hi all,

I am trying to understand the implementation of the anti-LM model, in particular the meaning of this line:

line 128: all_prob_t = model_step(dummy_encoder_inputs, cand['dec_inp'], dptr, target_weights, bucket_id)

in tf_chatbot_seq2seq_antilm/lib/seq2seq_model_utils.py, where dummy_encoder_inputs = [np.array([data_utils.PAD_ID]) for _ in range(len(encoder_inputs))].

This is presumably the probability of the target, P(T), from the paper https://arxiv.org/pdf/1510.03055.pdf, but how does feeding in an encoder input sequence consisting only of PAD tokens give you the probability of T?
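For reference, here is a minimal sketch of how I read the call in context. The function name antilm_score and the antilm_weight parameter are my own (not the repo's), and I am assuming model_step returns (log-)probabilities for the candidate tokens, as the quoted call suggests:

```python
import numpy as np

PAD_ID = 0  # data_utils.PAD_ID

def antilm_score(model_step, encoder_inputs, cand, dptr,
                 target_weights, bucket_id, antilm_weight=0.5):
    """Sketch of log p(T|S) - lambda * log p(T) for one decode step."""
    # log p(T|S): condition the decoder on the real source sentence.
    all_prob = model_step(encoder_inputs, cand['dec_inp'], dptr,
                          target_weights, bucket_id)
    # log p(T): condition on an all-PAD source, so the encoder carries
    # (almost) no information and the decoder acts as a plain LM over T.
    dummy_encoder_inputs = [np.array([PAD_ID])
                            for _ in range(len(encoder_inputs))]
    all_prob_t = model_step(dummy_encoder_inputs, cand['dec_inp'], dptr,
                            target_weights, bucket_id)
    return all_prob - antilm_weight * all_prob_t
```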

Anyone have any ideas?

Cheers,

Kuhan

zhongpeixiang commented Mar 2, 2018

One idea for calculating P(T) would be to train a separate language model on the dataset of target responses. Alternatively, using dummy encoder_inputs amounts to asking: given no input sentence, what is the probability of the target response T?
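To illustrate the first option, something like this toy sketch could work; the BigramLM class is purely illustrative and just stands in for whatever real LM one would train on the target side:

```python
import math
from collections import Counter

class BigramLM:
    """Toy add-one-smoothed bigram LM trained on target responses."""
    def __init__(self, responses):
        self.unigrams, self.bigrams = Counter(), Counter()
        for tokens in responses:
            padded = ['<s>'] + tokens + ['</s>']
            self.unigrams.update(padded)
            self.bigrams.update(zip(padded, padded[1:]))
        self.vocab = len(self.unigrams)

    def log_prob(self, tokens):
        """log P(T) under the bigram model, with add-one smoothing."""
        padded = ['<s>'] + tokens + ['</s>']
        return sum(
            math.log((self.bigrams[(a, b)] + 1) /
                     (self.unigrams[a] + self.vocab))
            for a, b in zip(padded, padded[1:]))

# Usage: score a candidate response against the target-side corpus.
lm = BigramLM([['i', 'do', 'not', 'know'], ['sounds', 'good']])
print(lm.log_prob(['i', 'do', 'not', 'know']))
```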

@rtygbwwwerr

@kuhanw I think it might make sense. Using PAD as the encoder input gives the decoder the same initial state regardless of the source sentence, so the resulting score no longer varies with S the way p(T|S) does. Naturally, the first several output words of the decoder are influenced more by p(T|S) than by U(T), which is consistent with the original idea of Jiwei Li's paper. The decoder is commonly regarded as a language model, and I think feeding an empty input is a simple way to implement the anti-LM term without an external model. I hope the author can clarify this.
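For reference, the paper's MMI-antiLM scoring rule looks roughly like this sketch, where U(T) penalizes only the first gamma tokens so later tokens stay fluent (token_logp_given_s and token_logp_lm are hypothetical per-token log-probability lists, e.g. from the two model_step calls discussed above):

```python
def mmi_antilm_score(token_logp_given_s, token_logp_lm,
                     lam=0.5, gamma=5):
    """score(T) = log p(T|S) - lam * U(T), where U(T) sums the
    LM log-probs of only the first `gamma` tokens (Li et al. 2016)."""
    logp_given_s = sum(token_logp_given_s)
    u_t = sum(lp for k, lp in enumerate(token_logp_lm) if k < gamma)
    return logp_given_s - lam * u_t
```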
