Is it possible to extend Trankit for target language generation? #72

Open
yash-srivastava19 opened this issue Jun 4, 2023 · 3 comments

Comments

@yash-srivastava19

Hi!

I was curious whether Trankit can be extended (by modifying or adding components to the training pipeline) for a target-language generation task, for example morphological generation. I really like the Transformer-based approach and wanted to ask whether it could also work for target-language (TL) generation. Any help on this would be really appreciated.

Apart from that, regarding the architecture given in the paper:

[Figure: Trankit architecture diagram from the paper]

Is it possible to deviate from the sequential order shown there? Suppose I want to feed the output of the Lemmatizer to the NER module, or the output of the PosDep module to the NER module. Can that be done without breaking the system? Any pointers would be really helpful.
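To make the question concrete, here is a minimal sketch of the kind of rerouting meant above. It does not use Trankit's API: every name in it (`toy_lemmatizer`, `toy_posdep`, `toy_ner`, `run`) is a hypothetical stand-in, and the "models" are trivial rules. It only shows stages sharing one document, so that a later stage can read a layer produced by any earlier one.

```python
from typing import Callable, Dict, List

Doc = Dict[str, List[str]]      # annotation layer name -> one value per token
Stage = Callable[[Doc], Doc]


def toy_lemmatizer(doc: Doc) -> Doc:
    # Hypothetical stand-in: lowercases tokens instead of real lemmatization.
    return {**doc, "lemmas": [t.lower() for t in doc["tokens"]]}


def toy_posdep(doc: Doc) -> Doc:
    # Hypothetical stand-in: capitalized tokens become PROPN, the rest X.
    tags = ["PROPN" if t[:1].isupper() else "X" for t in doc["tokens"]]
    return {**doc, "upos": tags}


def toy_ner(doc: Doc) -> Doc:
    # Hypothetical stand-in: uses the "upos" layer when an earlier stage
    # produced it, and falls back to tagging everything "O" otherwise.
    upos = doc.get("upos")
    if upos is None:
        tags = ["O"] * len(doc["tokens"])
    else:
        tags = ["B-ENT" if u == "PROPN" else "O" for u in upos]
    return {**doc, "ner": tags}


def run(stages: List[Stage], doc: Doc) -> Doc:
    # Apply the stages in whatever order the caller chose.
    for stage in stages:
        doc = stage(doc)
    return doc


doc = {"tokens": ["Hanoi", "is", "big"]}
rerouted = run([toy_posdep, toy_lemmatizer, toy_ner], doc)  # NER sees upos
baseline = run([toy_ner], doc)                               # NER sees tokens only
```

Whether Trankit's modules can be wired this way is exactly what we are unsure about. The sketch only illustrates the data flow we have in mind.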

@singhakr

singhakr commented Jun 4, 2023

I am part of the same team as Yash, who posted this question. We have been browsing through the code and have some clues about how this could be done, but so far we have not been able to put it all together. What we want is to extend the customized-mwt-ner pipeline so that it can also be trained to do morphological inflection in context. The input would be the output of the customized-mwt-ner pipeline after lexical transfer has been carried out on it, so the lemmas would be in a language different from the one the pipeline was run on. I guess the key part will be using an adapter.
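As a rough illustration of the intended data flow (analysis output, then lexical transfer, then inflection), here is a toy sketch. Nothing in it is Trankit code: `Token`, `lexical_transfer`, `toy_inflect`, the two-entry lexicon, and the plural rule are all assumptions made up for the example, and `toy_inflect` only marks the place where the trained inflection component would go.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Token:
    lemma: str
    feats: str  # UD-style morphological features, e.g. "Number=Plur"


def lexical_transfer(tokens: List[Token], lexicon: Dict[str, str]) -> List[Token]:
    # Swap each source-language lemma for its target-language lemma and
    # keep the features; lemmas missing from the lexicon pass through.
    return [Token(lexicon.get(t.lemma, t.lemma), t.feats) for t in tokens]


def toy_inflect(token: Token) -> str:
    # Placeholder for the trained inflection model: it only knows how to
    # add a plural "-s" and returns the bare lemma in every other case.
    if "Number=Plur" in token.feats.split("|"):
        return token.lemma + "s"
    return token.lemma


# Hypothetical analysis output for two source-language nouns.
analysed = [Token("cat", "Number=Plur"), Token("house", "Number=Sing")]
lexicon = {"cat": "gato", "house": "casa"}  # toy bilingual lexicon

transferred = lexical_transfer(analysed, lexicon)
surface = [toy_inflect(t) for t in transferred]
print(surface)
```

The part we would like to train is the step that `toy_inflect` stands in for: given target-language lemmas plus the features predicted on the source side, produce the inflected forms in context.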

If we are able to build this pipeline, we would also be happy to share it here.

Or, if that is too much work or not feasible for other reasons, could we use the model from the customized-mwt-ner pipeline for morphological inflection in context via fine-tuning or transfer learning?

@singhakr

singhakr commented Jun 4, 2023

We also want to do the second part of what Yash asked, but that is separate from the morphological inflection in context part.

@yash-srivastava19
Author

In terms of implementation, this might mean building a custom pipeline that does the opposite of what the conventional pipelines in the toolkit do: instead of going from a sentence to tokens and tags, it would go from tokens and tags to a sentence. If this could be added as a feature, it would be really beneficial for machine translation tasks, which many others, like us, will probably want to use it for.
