Commit

fix document split bug
Signed-off-by: Tim Schopf <tim.schopf@t-online.de>
TimSchopf committed May 2, 2024
1 parent 5e28116 commit 9267fdd
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion keyphrase_vectorizers/keyphrase_vectorizer_mixin.py
@@ -442,7 +442,7 @@ def _get_pos_keyphrases(self, document_list: List[str], stop_words: Union[str, L
stop_words_list.add(doc_delimiter)

# split processed documents by delimiter
- processed_docs = list(filter(None, [doc.strip() for doc in processed_docs.split(doc_delimiter)]))
+ processed_docs = [doc.strip() for doc in processed_docs.split(doc_delimiter)][1:]

if extract_keyphrases:
# extract keyphrases that match the NLTK RegexpParser filter
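
The effect of the one-line change can be sketched in isolation. The snippet below is a minimal illustration, not the library's code: the delimiter value and the helper names (`DOC_DELIMITER`, `join_docs`, `split_with_filter`, `split_with_slice`) are hypothetical, and it assumes each document is prefixed with the delimiter before the documents are joined, which is what the `[1:]` slice suggests. Only the two split expressions are taken from the diff.

```python
# Hypothetical delimiter; the real value used by the library is not shown in the diff.
DOC_DELIMITER = "docdelimiter"


def join_docs(docs):
    # Assumption: every document is prefixed with the delimiter before joining,
    # so splitting on the delimiter yields one empty leading piece.
    return " ".join(f"{DOC_DELIMITER} {doc}" for doc in docs)


def split_with_filter(joined):
    # Split expression removed by the commit: filter(None, ...) drops every
    # empty string, including documents that are legitimately empty.
    return list(filter(None, [doc.strip() for doc in joined.split(DOC_DELIMITER)]))


def split_with_slice(joined):
    # Split expression added by the commit: only the empty piece before the
    # first delimiter is dropped, so empty documents keep their position.
    return [doc.strip() for doc in joined.split(DOC_DELIMITER)][1:]


docs = ["first document", "", "third document"]
joined = join_docs(docs)

old_result = split_with_filter(joined)
new_result = split_with_slice(joined)

print(old_result)  # ['first document', 'third document']
print(new_result)  # ['first document', '', 'third document']
```

Under these assumptions, the filtered version returns fewer entries than there were input documents whenever one of them is empty after processing, so results no longer line up with the original document indices. The sliced version keeps a one-to-one correspondence.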

