nagisa v0.2.8
nagisa 0.2.8 incorporates the following changes:
- Fix AttributeError in nagisa_utils.pyx when tokenizing a text containing Latin capital letter I with dot above 'İ'
Tokenizing a text containing 'İ' raised an AttributeError. As the following example shows, lowercasing 'İ' produces a string of length 2, so the lowercased text no longer had the same length as the original and features were not extracted correctly.
>>> text = "İ" # [U+0130]
>>> print(len(text))
1
>>> text = text.lower() # [U+0069] [U+0307]
>>> print(text)
i̇
>>> print(len(text))
2
To avoid this error, the following preprocessing step was added to the source code (modification 1, modification 2):
text = text.replace('İ', 'I')
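The effect of this replacement can be checked with a minimal sketch like the one below. The helper name `preprocess_dotted_i` and the sample string are hypothetical and are not part of nagisa; the sketch only compares string lengths before and after lowercasing, with and without the replacement.

```python
# Hypothetical helper for illustration only; it is not part of nagisa's API.
def preprocess_dotted_i(text: str) -> str:
    """Replace U+0130 ('İ') with ASCII 'I' so that lower() keeps the length unchanged."""
    return text.replace("\u0130", "I")


raw = "\u0130stanbul"                    # sample input starting with 'İ' (U+0130)
naive = raw.lower()                      # 'İ' becomes 'i' + U+0307, so one extra code point
safe = preprocess_dotted_i(raw).lower()  # replacement applied first, length is preserved

print(len(raw), len(naive), len(safe))   # 8 9 8
```

Without the replacement the lowercased string is one code point longer than the input; with it, both strings have the same length.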
- Add Python wheels (3.6, 3.7, 3.8, 3.9, 3.10, 3.11) to PyPI for Linux
- Add Python wheels (3.6, 3.7, 3.8, 3.9, 3.10) to PyPI for macOS
- Add Python wheels (3.6, 3.7, 3.8) to PyPI for Windows