Standardisation of Indonesian SNS writing
This project normalises non-standard words (Indonesian only) using several approaches. The first approach implemented is a sequence-to-sequence model (an RNN with attention). With that approach complete, the aim is to compare it against a Statistical Machine Translation (SMT) approach. Edit distance and learned edit distance techniques will also be implemented in this project.
Note: The corpus used for training comes from online political news.
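
To illustrate the edit distance idea mentioned above, here is a minimal sketch in Python. It is not the project's implementation: the function names `edit_distance` and `normalise`, the `max_distance` threshold, and the tiny vocabulary are all assumptions made for illustration. It computes the plain Levenshtein distance and maps a non-standard word to the closest standard word, leaving the word unchanged when nothing is close enough.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute), two-row DP."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def normalise(word: str, vocabulary: list[str], max_distance: int = 2) -> str:
    """Return the closest vocabulary word, or the input if none is close.

    `max_distance` is a hypothetical cut-off; ties are broken alphabetically.
    """
    if word in vocabulary:
        return word
    best = min(vocabulary, key=lambda v: (edit_distance(word, v), v))
    return best if edit_distance(word, best) <= max_distance else word


# Hypothetical standard vocabulary, for illustration only.
VOCAB = ["tidak", "sudah", "dengan", "yang"]

if __name__ == "__main__":
    for token in ["tdk", "sdh", "dgn"]:
        print(token, "->", normalise(token, VOCAB))
```

With a fixed threshold, heavily abbreviated forms can fall outside the cut-off and stay unchanged. This is one motivation for learning edit costs from data rather than treating every operation as cost 1.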