
A machine translation project featuring RNN-based Seq2Seq, Transformer, and pretrained models for translating English to Italian and Urdu.


Machine Translation: RNN, Transformer, and Pretrained Model Implementations

Overview

This project focused on developing a machine translation system with two main objectives. First, a sequence-to-sequence (Seq2Seq) model built on Gated Recurrent Units (GRUs) was created to translate English to Italian; it was then followed by a more advanced Seq2Seq Transformer model. Both are inspired by the work of François Chollet. Additionally, for translating English to Urdu, a pretrained model from Helsinki-NLP was employed to efficiently achieve high-quality results.

The evaluation centered on assessing the latency, complexity, and accuracy of these three approaches to gain insights for educational purposes.

Implementation Details

Seq2Seq with GRU

  • Model Architecture: The Seq2Seq model employed GRUs for both the encoder and decoder components. The encoder processes the input English sentences and produces a context vector, which is used by the decoder to generate the Italian translation.
  • Training Process: Trained on a parallel English-Italian dataset.
  • Performance: Produced satisfactory translations but struggled with longer and more complex sentence structures.
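The encoder's role described above — compressing a source sentence into a single context vector — can be sketched with a toy NumPy GRU cell. This is an illustrative sketch, not the project's actual Keras implementation; all weights, dimensions, and the random "sentence" are made up for demonstration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ x + Uz @ h)               # update gate: how much to refresh
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate: how much past to keep
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate hidden state
    return (1.0 - z) * h + z * h_tilde         # interpolate old and candidate state

def encode(embeddings, hidden_dim, params):
    """Run the GRU over a source sentence; the final state is the context vector."""
    h = np.zeros(hidden_dim)
    for x in embeddings:
        h = gru_step(x, h, params)
    return h

# Toy example: a 3-token "sentence" with 4-dim embeddings and a 5-dim hidden state.
rng = np.random.default_rng(0)
d, hdim = 4, 5
params = (rng.normal(size=(hdim, d)), rng.normal(size=(hdim, hdim)),
          rng.normal(size=(hdim, d)), rng.normal(size=(hdim, hdim)),
          rng.normal(size=(hdim, d)), rng.normal(size=(hdim, hdim)))
sentence = rng.normal(size=(3, d))
context = encode(sentence, hdim, params)
print(context.shape)  # (5,)
```

The fixed-size context vector is exactly why this architecture struggles with long sentences: every detail of the input must survive compression into one vector.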

Seq2Seq Transformer

  • Model Architecture: Transitioned to a Transformer-based Seq2Seq model to leverage self-attention mechanisms for better handling of dependencies between words.
  • Training Process: Continued using the same English-Italian dataset, incorporating positional encodings and multi-head attention to capture intricate sentence structures.
  • Performance: Demonstrated superior translation quality and efficiency compared to the GRU-based model, effectively managing long-range dependencies and improving fluency.
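The two mechanisms named above — positional encodings and (self-)attention — can be sketched in NumPy. This is a minimal single-head illustration of the standard sinusoidal encoding and scaled dot-product attention, not the project's Keras code; shapes and inputs are arbitrary toy values:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Standard sinusoidal positional encodings (sin on even dims, cos on odd)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V — every position attends to every position."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

seq_len, d_model = 6, 8
pe = positional_encoding(seq_len, d_model)
rng = np.random.default_rng(1)
x = rng.normal(size=(seq_len, d_model)) + pe       # embeddings + positions
out, attn = scaled_dot_product_attention(x, x, x)  # self-attention (Q = K = V)
print(out.shape, attn.shape)  # (6, 8) (6, 6)
```

Because the attention matrix directly connects every token pair, long-range dependencies need only one step rather than many recurrent ones, and all positions are computed in parallel.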

Pretrained Model

  • Model Choice: For translating English to Urdu, I utilized the pretrained model Helsinki-NLP/opus-mt-en-ur. This model, available from the Hugging Face Model Hub, was specifically designed for English to Urdu translation.
  • Implementation: Applied the pretrained model for translation tasks, leveraging its training on a broad range of text to achieve high-quality translations with minimal additional fine-tuning.
  • Performance: The pretrained model provided robust translations between English and Urdu, reflecting the effectiveness of leveraging pretrained architectures for specific language pairs.
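Loading the model from the Hugging Face Model Hub takes only a few lines with the transformers library's translation pipeline. This is a generic usage sketch (the example sentence is arbitrary, and the first run downloads the model weights), not necessarily the exact code used in this project:

```python
from transformers import pipeline

# Download (on first use) and load the pretrained English-to-Urdu model.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-ur")

# Translate an arbitrary example sentence.
result = translator("How are you today?")
print(result[0]["translation_text"])
```

No fine-tuning is required for reasonable output, which is what keeps the implementation complexity of this approach low compared to training the GRU or Transformer models from scratch.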

Analysis

  • Translation Quality: The Transformer model improved translation accuracy and fluency for English to Italian. The use of Helsinki-NLP/opus-mt-en-ur ensured high-quality translations for English to Urdu with minimal effort.
  • Efficiency: The Transformer architecture’s ability to handle long-range dependencies and parallel processing contributed to reduced training times and better performance.
  • Pretrained Model Utility: The pretrained model for English to Urdu showcased the practical benefits of using models trained on diverse datasets for rapid deployment in specific translation tasks.

Model Comparison Table

| Model | Trained in This Project | Training Duration* | Implementation Complexity |
| --- | --- | --- | --- |
| GRU | Yes | Long | Moderate |
| Transformer | Yes | Moderate | High |
| Helsinki-NLP/opus-mt-en-ur | No | Not applicable | Low |

* Both the GRU and Transformer models were trained for 15 epochs each. Helsinki-NLP/opus-mt-en-ur was already pretrained.

Future Work

  • Model Enhancement: Explore advanced Transformer variants or integrate additional contextual information to further enhance translation accuracy and context understanding.
  • Multilingual Capabilities: Expand the project to support more language pairs.

Data Source for English-Italian Translation Corpus

https://www.manythings.org/anki/ita-eng.zip

Bibliography

François Chollet, Deep Learning with Python, 2nd ed. Manning Publications, 2021.

License

This project is licensed under the Raza Mehar License. See the LICENSE.md file for details.

Contact

For any questions or clarifications, please contact Raza Mehar at raza.mehar@gmail.com.
