Releases: google/sentencepiece

v0.1.81

22 Mar 16:39
c58588f

Fix: support TensorFlow 1.13.1

v0.1.8

10 Jan 08:49

Feature: removed the dependency on external protobuf.
Feature: added the (Encode|Decode)AsSerializedProto interface so that the Python module can get full access to the SentencePieceText proto, including the byte offsets/alignments.
Feature: added the --treat_whitespace_as_suffix option to make _ a suffix of the word.
Feature: added normalization rules to remove control characters in the default nmt_* normalizers.
Minor fix: simplified the error messages.
Minor fix: do not emit the full source path in LOG(INFO).
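
To illustrate the --treat_whitespace_as_suffix option listed above, here is a minimal pure-Python sketch of where the whitespace marker ends up in each mode. It does not call SentencePiece and is not how the library implements it. The helper name `mark_whitespace` is hypothetical, and the sketch uses U+2581 ("▁"), the meta symbol SentencePiece uses for whitespace, which appears as _ in the note above.

```python
# Illustration only: mimics where the whitespace marker is placed,
# not SentencePiece's internal implementation.
MARKER = "\u2581"  # "▁", the meta symbol that stands for whitespace


def mark_whitespace(text: str, as_suffix: bool = False) -> str:
    """Attach the whitespace marker to each word as a prefix or a suffix."""
    words = text.split()
    if as_suffix:
        # Marker follows each word.
        return "".join(word + MARKER for word in words)
    # Default behaviour: marker precedes each word.
    return "".join(MARKER + word for word in words)


print(mark_whitespace("Hello world"))                  # ▁Hello▁world
print(mark_whitespace("Hello world", as_suffix=True))  # Hello▁world▁
```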

For more detail: v0.1.7...v0.1.8

v0.1.7

25 Dec 06:08
ecc9916

Deprecated: --mining_sentence_size and --training_sentence_size. All sentences are now loaded by default; --input_sentence_size can be specified to limit the number of sentences loaded.
Feature: added --unk_piece/--bos_piece/--eos_piece/--pad_piece flags to change the surface representations of these special symbols.
Bug fix: added third_party directory for cmake's subdirectory.
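
As a rough sketch of how the flags above might be combined, the snippet below assembles a training flag string in plain Python. The helper `build_train_args`, the file name `corpus.txt`, the model prefix `m`, and all flag values are hypothetical choices for illustration only.

```python
def build_train_args(**flags) -> str:
    """Render keyword arguments as a '--name=value' flag string."""
    return " ".join(f"--{name}={value}" for name, value in flags.items())


args = build_train_args(
    input="corpus.txt",           # hypothetical training file
    model_prefix="m",             # hypothetical output prefix
    vocab_size=8000,              # arbitrary example value
    input_sentence_size=1000000,  # limit how many sentences are loaded
    unk_piece="[UNK]",            # custom surface forms for the special symbols
    bos_piece="[BOS]",
    eos_piece="[EOS]",
    pad_piece="[PAD]",
)
print(args)
```

The resulting string could be given to spm_train on the command line or, assuming the Python module is installed, passed to `sentencepiece.SentencePieceTrainer.Train(args)`.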

For more detail: v0.1.6...v0.1.7

v0.1.6pre1

11 Nov 15:39
d35413c

SentencePiece Windows release

v0.1.6

11 Nov 15:39
d35413c
  • Bug fix: do not apply normalization to user-defined symbols.
  • Bug fix: stop adding extra whitespace before user-defined symbols.
  • Feature: added --minloglevel flag to suppress LOG(INFO) messages.
  • Feature: added --split_by_number flag to allow numbers to attach to other symbols.
  • Feature: added --max_sentence_length flag to control the maximum byte length of an input sentence for training.
  • Used a tf-versioned .so file for _sentencepiece_processor_ops to minimize ABI incompatibility for the tf wrapper.
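
Because --max_sentence_length is measured in bytes rather than characters, the same limit admits far fewer characters of multi-byte text. The sketch below shows the difference in plain Python. The helper `within_byte_limit` is hypothetical, and both the inclusive comparison and the choice to drop over-long sentences are assumptions, not confirmed library behaviour.

```python
def within_byte_limit(sentence: str, max_bytes: int) -> bool:
    """Return True if the UTF-8 encoding of the sentence fits in max_bytes."""
    # Assumption: the limit is inclusive.
    return len(sentence.encode("utf-8")) <= max_bytes


sentences = ["hello", "こんにちは"]  # both are 5 characters long
# "hello" is 5 bytes in UTF-8; "こんにちは" is 15 bytes (3 bytes per character).
kept = [s for s in sentences if within_byte_limit(s, max_bytes=10)]
print(kept)  # ['hello']
```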

For more detail: v0.1.5...master

v0.1.5

28 Oct 17:03
d55372c
Merge pull request #225 from google/sr

pushed new nfkc_cf.tsv

v0.1.4

26 Aug 15:31

Initial SentencePiece release