Predict the next word from the last N words typed, using two kinds of model:
- N-gram model
- Train and test the model with the following parameters: n, training_file, test_file, output_file, backoff
python3 NGram.py [--N n] [--training_file training_file] [--test_file test_file] [--output_file output_file] [--backoff True/False]
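As an illustration of what the n and backoff parameters control, here is a minimal sketch of an n-gram predictor with simple backoff. It is not taken from NGram.py: the function names `train_ngrams` and `predict_next` are hypothetical, and the backoff shown just drops the oldest context word and retries, with no discounting or smoothing.

```python
from collections import Counter, defaultdict


def train_ngrams(tokens, n):
    """Count the words that follow every context of length 0 to n-1."""
    counts = defaultdict(Counter)
    for order in range(1, n + 1):
        for i in range(len(tokens) - order + 1):
            context = tuple(tokens[i:i + order - 1])
            next_word = tokens[i + order - 1]
            counts[context][next_word] += 1
    return counts


def predict_next(counts, history, n, backoff=True):
    """Return the most frequent word seen after the last n-1 words, or None."""
    context = tuple(history[-(n - 1):]) if n > 1 else ()
    while True:
        if context in counts:
            return counts[context].most_common(1)[0][0]
        if not backoff or not context:
            return None
        context = context[1:]  # drop the oldest word and retry with a shorter context


# Toy corpus, for illustration only.
tokens = "a b c a b c a b d".split()
counts = train_ngrams(tokens, 3)
seen = predict_next(counts, ["a", "b"], 3)
backed_off = predict_next(counts, ["x", "b"], 3, backoff=True)
no_backoff = predict_next(counts, ["x", "b"], 3, backoff=False)
```

With backoff disabled, an unseen context yields no prediction. With it enabled, the lookup falls back to shorter contexts until one has been observed.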
- LSTM model
- Download the pretrained GloVe data
cd data
chmod +x download_glove.sh
./download_glove.sh
- Build the vocabulary dictionary (data/vocab) from the pretrained GloVe data (e.g. data/glove.6B.300d.txt) and the training corpus (e.g. corpus/train_all.pkl)
python3 vocabulary.py [Glove_file] [corpus_file]
- Train the neural network on the training corpus (e.g. corpus/train_all.pkl) and save the model (data/model_all.pt)
required: vocabulary file (data/vocab)
optional: window length
python3 LSTM.py --train corpus_file [--length N]
- Test the model (e.g. data/model_all.pt) on the testing corpus
required: vocabulary file (data/vocab)
optional: window length
python3 LSTM.py --test model_file [--length N]
- Next Word Predictor
required: vocabulary file (data/vocab), model file (e.g. data/model_all.pt)
python3 app.py
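
The window length used in the LSTM steps above can be pictured as follows: each run of N consecutive words becomes the input, and the word that follows is the target. This is only a sketch under assumptions. The layout of data/vocab and the way LSTM.py applies --length are not documented here, so the helpers `encode` and `make_windows`, the `<unk>` token, and the toy vocabulary are all hypothetical.

```python
def encode(words, vocab, unk="<unk>"):
    """Map words to integer ids, sending unknown words to the `unk` id."""
    return [vocab.get(word, vocab[unk]) for word in words]


def make_windows(token_ids, length):
    """Pair every run of `length` ids with the id that follows it."""
    return [
        (token_ids[i:i + length], token_ids[i + length])
        for i in range(len(token_ids) - length)
    ]


# Toy vocabulary, for illustration only (not the real data/vocab format).
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}
ids = encode("the cat sat down".split(), vocab)  # "down" is not in vocab
pairs = make_windows(ids, 2)
# pairs == [([1, 2], 3), ([2, 3], 0)]
```

A longer window gives the model more context per example but yields fewer examples from the same text.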