Add support for an incremental alignment model that models fertility #4

ddaspit · 2017-04-04T07:15:38Z

Currently, Thot has support for IBM1, IBM2, and HMM models, none of which models fertility. Are there any plans to support IBM3, IBM4, IBM5, or one of the extensions to HMM that models/simulates fertility? Obviously, the IBM models 3-5 are complex and might be difficult to support. Some of the fertility extensions to HMM seem simpler and would improve accuracy.
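For readers unfamiliar with the term, the fertility of a source word is the number of target words aligned to it. The snippet below is only an illustrative sketch of that definition, not Thot code: the function name `fertilities`, the 0-based `(source_index, target_index)` link format, and the example links are all assumptions made for the illustration.

```python
from collections import Counter


def fertilities(alignment, num_source_words):
    """Count how many distinct target words each source word is linked to.

    alignment: iterable of (source_index, target_index) pairs, 0-based.
    This link format is an assumption for the sketch, not Thot's own format.
    """
    # Deduplicate links so a repeated pair is not counted twice.
    counts = Counter(src for src, _ in set(alignment))
    return [counts.get(i, 0) for i in range(num_source_words)]


# Hypothetical two-word source sentence aligned to a four-word target sentence.
links = [(0, 2), (1, 0), (1, 1), (1, 3)]
print(fertilities(links, 2))  # [1, 3]
```

IBM1, IBM2, and the plain HMM model have no parameter that prefers one such fertility count over another for a given word, which is the gap this request is about.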

daormar · 2017-04-05T12:55:59Z

Hi Damien,

As you point out, IBM models 3-5 are complex and difficult to introduce. However, the main reason they have not yet been included in the toolkit is that they do not seem to produce significant gains in translation quality over HMM-based models (a better alignment error rate does not always result in improved translation quality).
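As a reminder of what alignment error rate measures, here is a minimal sketch of the usual definition from Och and Ney, AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|), where A is the hypothesised set of links, S the sure gold links, and P the possible gold links (with S contained in P). The function name and the link representation are assumptions for illustration; this is not part of Thot.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """Compute AER for one sentence pair (lower is better).

    Each argument is an iterable of (source_index, target_index) links.
    Sure links are treated as a subset of the possible links.
    """
    a = set(hypothesis)
    s = set(sure)
    p = set(possible) | s  # enforce S subset of P
    if not a and not s:
        return 0.0  # nothing to align and nothing proposed
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Hypothetical example: three proposed links, two of which are sure gold links.
aer = alignment_error_rate(
    hypothesis=[(0, 0), (1, 1), (2, 2)],
    sure=[(0, 0), (1, 1)],
    possible=[(2, 1)],
)
print(round(aer, 3))
```

The metric only compares link sets against a gold standard, so a lower value says nothing directly about how the extracted phrase table or the final translations change.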

I agree that the way to go in this case would probably be to incorporate fertility extensions to HMM (I assume you are referring to the 2002 paper by Toutanova et al.). The problem is that this is still not so easy to incorporate, and we are currently focusing on improving other aspects of the toolkit.

On the other hand, I was wondering whether you need such models to generate alignments or only because of the potential improvements they would produce in translation quality.

ddaspit · 2017-04-06T07:40:36Z

First off, I just want to say that I appreciate the work that you have done on Thot. The incremental training and interactive machine translation features are invaluable.

I certainly understand that IBM models 3-5 do not greatly improve translation quality and that the main purpose of Thot is machine translation. We are using Thot for machine translation, but, as you guessed, we are also using the single word alignment models to align texts for various purposes. That is why we are interested in adding fertility to the HMM model. Thot's ability to perform incremental training of the word alignment models is important for our project, so Giza++ isn't really an option. The HMM model is working well for us. We were just looking into possible ways of improving the quality of the alignments, and modeling fertility seemed to be the most promising route, so I wanted to find out if there were any plans to add it.

daormar · 2017-04-07T16:50:40Z

First of all, thanks very much for your interest in the tool and for your response.

I think that HMM alignment models with fertility would be a very interesting feature to incorporate into the toolkit. Although we currently cannot spend time on it, we have lately been interested in bringing new collaborators into the project, and such a feature could be one candidate for future development.

ddaspit · 2017-04-08T07:39:17Z

That sounds good to me. I might be able to work on it at some point and submit it as a pull request. Thank you for keeping it as a candidate for future development.
