A 2024 Reading List for Bilingual Lexicon Induction (BLI) / Word Translation. Frequently Updated.
Updated Sep 29, 2024 - Python
Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
Demonstration of AI/neural word alignment of English & Japanese text using mBERT-based machine learning models.
A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic processing with the Stanza pipeline, machine translation, and word alignment with the Eflomal tool.
WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised Span Prediction, to appear at the ACL 2023 main conference.
Inference library and evaluation script for WSPAlign (https://github.com/qiyuw/WSPAlign)
Create "pretty" graphs for aligned sentences
Word Alignment Visualization is a Python package for visualizing word alignments between two sentences in a Jupyter notebook. The package provides an interactive widget that displays original and translated sentences with word alignment lines.
Using alignments and posteriorgrams extracted from lyrics as novel input into source separation models
Enhanced awesome-align for low-resource languages and noise simulation: https://arxiv.org/abs/2301.09685
Improving Bilingual Lexicon Induction with Cross-Encoder Reranking (Findings of EMNLP 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
Word-alignment models for Bible translations in 100+ historical and contemporary languages
Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries (ACL 2020)
Are Girls Neko or Shōjo? Cross-Lingual Alignment of Non-Isomorphic Embeddings with Iterative Normalization (ACL 2019)
Assignment 1: Word Alignment in 'Statistical Machine Translation' course by Dr. Roee Aharoni at Bar-Ilan University.
Code for our paper "Mask-Align: Self-Supervised Neural Word Alignment" in ACL 2021
X-SRL Dataset. Including the code for the SRL annotation projection tool and an out-of-the-box word alignment tool based on Multilingual BERT embeddings.
This project provides an API to perform word alignment.
Leveraging Almost Black-Box NMT for Word Alignment