Roadmap 🧭

With winkNLP's production-ready release in late 2020, the core is already in place. Beyond maintenance, our goal is to keep improving it by adding new features and capabilities. Some of the features on the roadmap are listed below, along with their complexity and status:

| S. No. | Feature | Complexity | Status |
| --- | --- | --- | --- |
| 01. | Extractive Summarization: Add `its.sentenceWiseImportance` helper to extract sentence-wise importance from a document. This may be used for extractive summarization, among other uses. While it should be language agnostic, it should leverage the loaded language model's capability to improve summarization. | Simple | Completed |
| 02. | Text Pre-processor: Add a text preprocessing utility that provides options to (a) filter specific tokens based on their properties such as `pos`, `isStopWordFlag`, and `type`; (b) map entity types to a definable keyword; (c) add bigrams & trigrams; and (d) inject sentiment. The API should follow winkNLP style and standards. | Medium | YTS |
| 03. | Word Vectors Integration: Add integration with various word vectors, starting with GloVe. This should include compression/decompression for fast loading, and helpers for token, sentence and document vector computation. | High | Completed |
| 04. | Sub-word Tokenizer: Add a sub-word tokenization feature using techniques like Byte Pair Encoding (BPE) and/or WordPiece. The processing pipeline should allow a choice of tokenizer. | Very High | YTS |
| 05. | Compose Corpus: Add a utility to produce a training corpus using patterns and their Cartesian product. | Simple | YTS |
| 06. | Keywords Extraction: Add `its.keywords` helper to extract keywords/keyphrases from the text via `doc.out( its.keywords )`. While it should be language agnostic, it should leverage the loaded language model's capability to improve extraction. | Simple | YTS |
| 07. | BM25 Vectorizer: Add a utility to train a BM25 model and to vectorize text based on an already trained one. It will follow a winkNLP-style API. | Medium | Completed |
| 08. | Constituency/Dependency Parser: Add a constituency and/or dependency parser; details have to be worked out. | Very High | YTS |
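
To make item 05 concrete, here is a minimal sketch of what composing a corpus from patterns via a Cartesian product could look like. It is not winkNLP's API: the function names `cartesianProduct` and `composeCorpus`, and the pattern format (an array of slots, each slot an array of alternatives), are assumptions made purely for illustration.

```javascript
// Hypothetical helpers for illustration only; not part of winkNLP.

// Returns every combination that picks one alternative from each slot.
function cartesianProduct(slots) {
  // Start with a single empty combination and extend it slot by slot.
  return slots.reduce(
    (combos, alternatives) =>
      combos.flatMap((combo) => alternatives.map((alt) => [...combo, alt])),
    [[]]
  );
}

// Expands each pattern into sentences by joining the chosen alternatives.
function composeCorpus(patterns) {
  return patterns.flatMap((slots) =>
    cartesianProduct(slots).map((words) => words.join(' '))
  );
}

const corpus = composeCorpus([
  [['book', 'reserve'], ['a'], ['table', 'room']],
  [['cancel'], ['my'], ['booking', 'order']]
]);

console.log(corpus);
// [ 'book a table', 'book a room', 'reserve a table', 'reserve a room',
//   'cancel my booking', 'cancel my order' ]
```

The corpus size is the product of the slot sizes in each pattern, so a real utility would likely need a cap or sampling option for patterns with many alternatives.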

This roadmap is intended as a guideline for users and contributors, for information, feedback, discussion, and possible participation.
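
For contributors interested in item 04, the following is a rough sketch of how Byte Pair Encoding learns its merge rules from word frequencies. It is a textbook-style illustration, not a proposed winkNLP design: the names `learnBpe`, `countPairs` and `mergePair`, the input format, and the tie-breaking rule (the pair seen first wins) are all assumptions.

```javascript
// Hypothetical BPE merge learner for illustration only; not part of winkNLP.

// Counts adjacent symbol pairs across the vocabulary, weighted by word frequency.
function countPairs(vocab) {
  const counts = new Map();
  for (const [symbols, freq] of vocab) {
    for (let i = 0; i < symbols.length - 1; i += 1) {
      // NUL separates the two symbols; it is assumed never to occur in the text.
      const key = symbols[i] + '\u0000' + symbols[i + 1];
      counts.set(key, (counts.get(key) || 0) + freq);
    }
  }
  return counts;
}

// Replaces every non-overlapping occurrence of (left, right) with the merged symbol.
function mergePair(symbols, left, right) {
  const out = [];
  let i = 0;
  while (i < symbols.length) {
    if (i < symbols.length - 1 && symbols[i] === left && symbols[i + 1] === right) {
      out.push(left + right);
      i += 2;
    } else {
      out.push(symbols[i]);
      i += 1;
    }
  }
  return out;
}

// Learns up to numMerges merge rules from a { word: frequency } object.
function learnBpe(wordFreqs, numMerges) {
  // Each word starts as an array of single-character symbols.
  let vocab = Object.entries(wordFreqs).map(([word, freq]) => [[...word], freq]);
  const merges = [];
  for (let m = 0; m < numMerges; m += 1) {
    const counts = countPairs(vocab);
    if (counts.size === 0) break;
    // Pick the most frequent pair; on a tie, the pair seen first wins.
    let best = null;
    for (const [key, count] of counts) {
      if (best === null || count > best.count) best = { key, count };
    }
    const [left, right] = best.key.split('\u0000');
    merges.push([left, right]);
    vocab = vocab.map(([symbols, freq]) => [mergePair(symbols, left, right), freq]);
  }
  return { merges, vocab };
}

const { merges, vocab } = learnBpe({ low: 5, lower: 2, newest: 6, widest: 3 }, 3);
console.log(merges);
// [ [ 'e', 's' ], [ 'es', 't' ], [ 'l', 'o' ] ]
```

A production tokenizer would also need an encoding step that applies the learned merges to unseen text, plus handling for word boundaries and unknown characters; those are left out here to keep the sketch short.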
