- Builds a baseline statistical tagger using the hash of hashes from assignment #2.
- Trains the baseline lexicalized statistical tagger on the entire BROWN corpus.
- Uses the baseline lexicalized statistical tagger to tag all the words in the SnapshotBROWN.pos.all.txt file.
- Evaluates and reports the performance of this baseline tagger on the Snapshot file.
- Adds rules for unknown word tagging.
- Tests the tagger on new text collected from an article.
- Maps each parse tree in the BROWN.pos.all file into one-line sentences.
- Each sentence spans a single line in the output file.
- Generates the hash of hashes from the clean file BROWN-clean.pos.txt in word:pos:freq format.
- Takes the most frequent tag for each word and uses it to tag the words in all the sentences from the SnapshotBROWN-clean.pos.txt file.
- Reports the performance of this tagger (accuracy, error rate, and percentage of words not present in the tag set).
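
The baseline steps listed above (build the word:pos:freq hash of hashes, pick the most frequent tag per word, then score the result) can be sketched roughly as follows. This is an illustrative Python sketch, not the project's actual code. It assumes each cleaned sentence is one line of whitespace-separated `word/TAG` tokens, which the list above does not confirm, and the function names `build_counts`, `most_frequent_tags`, and `evaluate` are hypothetical. Words missing from the training data are counted as errors here and reported separately.

```python
from collections import Counter, defaultdict


def split_token(token):
    """Split 'word/TAG' on the last slash (assumed token format)."""
    word, sep, tag = token.rpartition("/")
    return (word, tag) if sep and word else None


def build_counts(lines):
    """Build the hash of hashes: word -> tag -> frequency."""
    counts = defaultdict(Counter)
    for line in lines:
        for token in line.split():
            pair = split_token(token)
            if pair is not None:
                word, tag = pair
                counts[word][tag] += 1
    return counts


def most_frequent_tags(counts):
    """Reduce the hash of hashes to word -> most frequent tag."""
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def evaluate(lines, lexicon):
    """Tag each word with its most frequent tag and score against gold tags."""
    total = correct = unknown = 0
    for line in lines:
        for token in line.split():
            pair = split_token(token)
            if pair is None:
                continue
            word, gold = pair
            total += 1
            if word not in lexicon:
                unknown += 1  # unseen word: no tag assigned, counted as an error
            elif lexicon[word] == gold:
                correct += 1
    if total == 0:
        return {"accuracy": 0.0, "error": 0.0, "unknown": 0.0}
    accuracy = correct / total
    return {"accuracy": accuracy, "error": 1 - accuracy, "unknown": unknown / total}


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the BROWN corpus.
    train = ["the/DT dog/NN barks/VBZ", "the/DT bark/NN", "dog/VB the/DT dog/NN"]
    test = ["the/DT dog/VB runs/VBZ barks/VBZ"]
    lexicon = most_frequent_tags(build_counts(train))
    print(evaluate(test, lexicon))
```

In a real run, `train` and `test` would be the lines read from BROWN-clean.pos.txt and SnapshotBROWN-clean.pos.txt. The unknown-word rules mentioned in the list would replace the `word not in lexicon` branch.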
Current Version : v1.0.2.1
Last Update : 02.28.2018 (Time : 06:22am)