
Commit

update comments
Signed-off-by: Tim Schopf <tim.schopf@t-online.de>
TimSchopf committed Apr 29, 2024
1 parent b44e9fe commit 0ca2367
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion keyphrase_vectorizers/keyphrase_count_vectorizer.py
@@ -39,7 +39,7 @@ class KeyphraseCountVectorizer(_KeyphraseVectorizerMixin, BaseEstimator):
must be customized accordingly.
Additionally, the ``pos_pattern`` parameter has to be customized as the `spaCy part-of-speech tags`_ differ between languages.
Without customizing, the words will be tagged with wrong part-of-speech tags and no stopwords will be considered.
- In addition, you have to exclude/include different pipeline components using the ``spacy_exclude`` parameter for the spaCy POS tagger to work properly.
+ In addition, you may have to exclude/include different pipeline components using the ``spacy_exclude`` parameter for the spaCy POS tagger to work properly.
Parameters
----------
2 changes: 1 addition & 1 deletion keyphrase_vectorizers/keyphrase_tfidf_vectorizer.py
@@ -37,7 +37,7 @@ class KeyphraseTfidfVectorizer(KeyphraseCountVectorizer):
must be customized accordingly.
Additionally, the ``pos_pattern`` parameter has to be customized as the `spaCy part-of-speech tags`_ differ between languages.
Without customizing, the words will be tagged with wrong part-of-speech tags and no stopwords will be considered.
- In addition, you have to exclude/include different pipeline components using the ``spacy_exclude`` parameter for the spaCy POS tagger to work properly.
+ In addition, you may have to exclude/include different pipeline components using the ``spacy_exclude`` parameter for the spaCy POS tagger to work properly.
Tf means term-frequency while tf-idf means term-frequency times inverse document-frequency.
This is a common term weighting scheme in information retrieval,
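
The changed docstring line says that ``spacy_exclude`` may (rather than must) need adjusting for a non-English pipeline. A minimal sketch of what that could look like follows. The parameter names ``spacy_pipeline``, ``pos_pattern``, ``stop_words`` and ``spacy_exclude`` come from the docstring above; the German values and the example exclude list are assumptions for illustration, and the helper names `german_vectorizer_kwargs` and `build_vectorizer` are hypothetical, not part of the library.

```python
from typing import Any, Dict, List, Optional


def german_vectorizer_kwargs(spacy_exclude: Optional[List[str]] = None) -> Dict[str, Any]:
    """Collect keyword arguments for a German setup (illustrative values)."""
    kwargs: Dict[str, Any] = {
        # Assumed German spaCy pipeline name; it must be installed separately.
        "spacy_pipeline": "de_core_news_sm",
        # Assumed POS pattern: zero or more adjectives followed by one or more nouns.
        "pos_pattern": "<ADJ.*>*<N.*>+",
        # Stopword language must match the pipeline language.
        "stop_words": "german",
    }
    # Only override spacy_exclude when asked to; otherwise the library default applies.
    if spacy_exclude is not None:
        kwargs["spacy_exclude"] = list(spacy_exclude)
    return kwargs


def build_vectorizer(spacy_exclude: Optional[List[str]] = None):
    """Construct the vectorizer; imported lazily so this sketch loads without the package."""
    from keyphrase_vectorizers import KeyphraseCountVectorizer

    return KeyphraseCountVectorizer(**german_vectorizer_kwargs(spacy_exclude))
```

Calling `build_vectorizer()` keeps the library's default exclusions, which matches the "may have to" wording. Passing a list such as `["parser", "ner"]` (a hypothetical choice) overrides them when the POS tagger of a given pipeline depends on a component that would otherwise be excluded.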
