
v3.1.0

jacksonllee released this 21 Feb 17:24

[3.1.0] - 2021-02-21

Added

  • Part-of-speech tagging:
    • Added the function pos_tag that takes a segmented sentence or phrase
      and returns its part-of-speech tags.
    • Added the function hkcancor_to_ud that maps a part-of-speech tag
      from the original HKCanCor annotated data to one of the tags from the
      Universal Dependencies v2 tagset.
  • Word segmentation:
    • Improved segmentation quality by revising the underlying wordlist data.
  • The test suite now covers code snippets in both the docstrings and .rst doc files.
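The tag mapping described for hkcancor_to_ud can be pictured as a table lookup with a fallback. The following is a minimal sketch of that idea only, not the library's implementation: the name map_tag_sketch, the three tag pairs, and the fallback to the Universal Dependencies "X" (other) tag are all assumptions made for illustration.

```python
# Illustrative subset only: these three pairs are assumptions made for this
# sketch, not the library's actual HKCanCor-to-UD table.
_SKETCH_MAP = {
    "V": "VERB",
    "N": "NOUN",
    "A": "ADJ",
}


def map_tag_sketch(tag: str, fallback: str = "X") -> str:
    """Map a source tag to a UD-style tag (hypothetical helper).

    Tags missing from the table fall back to "X", the UD v2 tag for "other".
    """
    return _SKETCH_MAP.get(tag, fallback)


mapped = [map_tag_sketch(t) for t in ["V", "N", "UNKNOWN"]]
```

A dictionary lookup keeps the mapping data separate from the logic, so the table can be revised without touching the function.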

Fixed

  • Fixed text files not being opened with UTF-8 encoding
    (a possible issue on Windows).
  • jyutping_to_yale and parse_jyutping now return a null value
    (rather than raise an error) when the input is null.
  • The word segmentation function segment now strips all whitespace
    from the input unsegmented string before segmenting it.
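Two of the fixes above follow common Python patterns, sketched here with the standard library only. This is not the library's code: the helper names strip_all_whitespace and read_text_utf8 are hypothetical, and the sketch assumes that "all whitespace" includes tabs, newlines, and the full-width space U+3000.

```python
import tempfile
from pathlib import Path


def strip_all_whitespace(text: str) -> str:
    """Remove every whitespace character from text (hypothetical helper)."""
    # str.split() with no arguments splits on runs of Unicode whitespace,
    # so joining the pieces back together drops all of it.
    return "".join(text.split())


def read_text_utf8(path: Path) -> str:
    """Read a file with an explicit encoding, not the platform default."""
    with open(path, encoding="utf-8") as f:
        return f.read()


cleaned = strip_all_whitespace(" 廣東話\t好 難\n")

# Round-trip a small file to show the explicit-encoding read.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("廣東話", encoding="utf-8")
    loaded = read_text_utf8(sample)
```

Passing encoding="utf-8" explicitly matters because open() otherwise uses a locale-dependent default, which on Windows is often not UTF-8.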