
A machine translation project featuring RNN-based Seq2Seq, Transformer, and pretrained models for translating English to Italian and Urdu.


Machine Translation: RNN, Transformer, and Pretrained Model Implementations

Overview

This project focused on developing a machine translation system with two main objectives. First, a sequence-to-sequence (Seq2Seq) model built on Gated Recurrent Units (GRUs) was created to translate English to Italian; it was then followed by a more advanced Seq2Seq Transformer model. Both are inspired by the work of François Chollet. Additionally, for translating English to Urdu, a pretrained model from Helsinki-NLP was employed to efficiently achieve high-quality results.

The evaluation centered on assessing the latency, complexity, and accuracy of these three approaches to gain insights for educational purposes.

Implementation Details

Seq2Seq with GRU

  • Model Architecture: The Seq2Seq model employed GRUs for both the encoder and decoder components. The encoder processes the input English sentences and produces a context vector, which is used by the decoder to generate the Italian translation.
  • Training Process: Trained on a parallel English-Italian dataset.
  • Performance: Produced satisfactory translations but struggled with longer and more complex sentence structures.
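The encoder's role described above — compressing a source sentence into a single context vector — can be sketched with a toy NumPy GRU cell. This is an illustrative sketch, not the project's actual Keras implementation; all weights, dimensions, and the random "sentence" are made up for demonstration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ x + Uz @ h)               # update gate: how much to refresh
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate: how much past to keep
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate hidden state
    return (1.0 - z) * h + z * h_tilde         # interpolate old and candidate state

def encode(embeddings, hidden_dim, params):
    """Run the GRU over a source sentence; the final state is the context vector."""
    h = np.zeros(hidden_dim)
    for x in embeddings:
        h = gru_step(x, h, params)
    return h

# Toy example: a 3-token "sentence" with 4-dim embeddings and a 5-dim hidden state.
rng = np.random.default_rng(0)
d, hdim = 4, 5
params = (rng.normal(size=(hdim, d)), rng.normal(size=(hdim, hdim)),
          rng.normal(size=(hdim, d)), rng.normal(size=(hdim, hdim)),
          rng.normal(size=(hdim, d)), rng.normal(size=(hdim, hdim)))
sentence = rng.normal(size=(3, d))
context = encode(sentence, hdim, params)
print(context.shape)  # (5,)
```

The fixed-size context vector is exactly why this architecture struggles with long sentences: every detail of the input must survive compression into one vector.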

Seq2Seq Transformer

  • Model Architecture: Transitioned to a Transformer-based Seq2Seq model to leverage self-attention mechanisms for better handling of dependencies between words.
  • Training Process: Continued using the same English-Italian dataset, incorporating positional encodings and multi-head attention to capture intricate sentence structures.
  • Performance: Demonstrated superior translation quality and efficiency compared to the GRU-based model, effectively managing long-range dependencies and improving fluency.
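The two mechanisms named above — positional encodings and (self-)attention — can be sketched in NumPy. This is a minimal single-head illustration of the standard sinusoidal encoding and scaled dot-product attention, not the project's Keras code; shapes and inputs are arbitrary toy values:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Standard sinusoidal positional encodings (sin on even dims, cos on odd)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V — every position attends to every position."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

seq_len, d_model = 6, 8
pe = positional_encoding(seq_len, d_model)
rng = np.random.default_rng(1)
x = rng.normal(size=(seq_len, d_model)) + pe       # embeddings + positions
out, attn = scaled_dot_product_attention(x, x, x)  # self-attention (Q = K = V)
print(out.shape, attn.shape)  # (6, 8) (6, 6)
```

Because the attention matrix directly connects every token pair, long-range dependencies need only one step rather than many recurrent ones, and all positions are computed in parallel.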

Pretrained Model

  • Model Choice: For translating English to Urdu, I utilized the pretrained model Helsinki-NLP/opus-mt-en-ur. This model, available from the Hugging Face Model Hub, was specifically designed for English to Urdu translation.
  • Implementation: Applied the pretrained model for translation tasks, leveraging its training on a broad range of text to achieve high-quality translations with minimal additional fine-tuning.
  • Performance: The pretrained model provided robust translations between English and Urdu, reflecting the effectiveness of leveraging pretrained architectures for specific language pairs.
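Loading the model from the Hugging Face Model Hub takes only a few lines with the transformers library's translation pipeline. This is a generic usage sketch (the example sentence is arbitrary, and the first run downloads the model weights), not necessarily the exact code used in this project:

```python
from transformers import pipeline

# Download (on first use) and load the pretrained English-to-Urdu model.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-ur")

# Translate an arbitrary example sentence.
result = translator("How are you today?")
print(result[0]["translation_text"])
```

No fine-tuning is required for reasonable output, which is what keeps the implementation complexity of this approach low compared to training the GRU or Transformer models from scratch.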

Analysis

  • Translation Quality: The Transformer model improved translation accuracy and fluency for English to Italian. The use of Helsinki-NLP/opus-mt-en-ur ensured high-quality translations for English to Urdu with minimal effort.
  • Efficiency: The Transformer architecture’s ability to handle long-range dependencies and parallel processing contributed to reduced training times and better performance.
  • Pretrained Model Utility: The pretrained model for English to Urdu showcased the practical benefits of using models trained on diverse datasets for rapid deployment in specific translation tasks.

Model Comparison Table

| Model | Trained in This Project | Training Duration* | Implementation Complexity |
| --- | --- | --- | --- |
| GRU | Yes | Long | Moderate |
| Transformer | Yes | Moderate | High |
| Helsinki-NLP/opus-mt-en-ur | No | Not applicable | Low |

* Both the GRU and Transformer models were trained for 15 epochs each. Helsinki-NLP/opus-mt-en-ur was already pretrained.

Future Work

  • Model Enhancement: Explore advanced Transformer variants or integrate additional contextual information to further enhance translation accuracy and context understanding.
  • Multilingual Capabilities: Expand the project to support more language pairs.

Data Source for English-Italian Translation Corpus

https://www.manythings.org/anki/ita-eng.zip

Bibliography

François Chollet, Deep Learning with Python, 2nd ed. Manning Publications, 2021.

License

This project is licensed under the Raza Mehar License. See the LICENSE.md file for details.

Contact

For any questions or clarifications, please contact Raza Mehar at raza.mehar@gmail.com.
