Releases: segment-any-text/wtpsplit

Release 2.0.5

08 Jul 07:41
  • Fixes a potential CUDA device error when the input has exactly 511 tokens (#121).

Release 2.0.4

01 Jul 09:32
  • Fix a speed issue with SaT (#118). Now it is (as expected) ~6x faster than WtP.

Release 2.0.3

26 Jun 08:05

Implement SaT (https://arxiv.org/abs/2406.16678) and switch the default models to SaT🚀

The previous WtP models are still available but SaT is strictly better in accuracy and speed. See the updated README for details: https://github.com/segment-any-text/wtpsplit.

SaT was implemented and developed by @markus583 @igorsterner.

Release 1.3.0

22 Jan 15:30

Release 1.2.3

18 Jul 13:47
  • Fix an error in the canine-s-* models with text whose length is not a multiple of 4 and is shorter than 512 characters (#98).

Release 1.2.2

14 Jul 15:55
  • Add a strip_whitespace flag.
  • Fix a bug where some zero-length sentences were returned if the text had a lot of trailing whitespace.
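
To illustrate the two entries above, here is a minimal pure-Python sketch. It does not call wtpsplit: the helper `split_at` and its `boundaries` argument are hypothetical names chosen for this example, and the library's actual implementation may differ. It only shows how stripping whitespace after splitting can produce empty sentences, and how filtering them avoids returning zero-length results.

```python
def split_at(text, boundaries, strip_whitespace=False):
    """Split `text` after each character index in `boundaries`.

    Hypothetical helper for illustration only; not part of wtpsplit.
    """
    sentences = []
    start = 0
    for end in sorted(boundaries):
        sentences.append(text[start:end + 1])
        start = end + 1
    # Keep whatever follows the last boundary (here: trailing whitespace).
    if start < len(text):
        sentences.append(text[start:])
    if strip_whitespace:
        # Stripping can leave empty strings, so drop them afterwards.
        sentences = [s.strip() for s in sentences]
        sentences = [s for s in sentences if s]
    return sentences


text = "Hello world. How are you?   "
raw = split_at(text, [12, 25])
clean = split_at(text, [12, 25], strip_whitespace=True)
print(raw)
print(clean)
```

Without stripping, the pieces concatenate back to the original text, which is useful when character offsets matter. With stripping, the whitespace-only tail is dropped instead of being returned as an empty sentence.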

Release 1.2.1

11 Jul 18:19
  • Fix argument propagation from the model wrapper (#95, #97).

Release 1.2.0

07 Jul 09:43
7b196b0
  • Speed up pre- & postprocessing via better vectorization (#94).
  • Proper onnxruntime support for the wtp-bert-* models. The ONNX models are currently not much faster (and sometimes even slower) than the PyTorch models; the cause is not yet known, and we will continue to look into it.
  • Adds missing pandas requirement (fixing #92).
  • Adds lower bounds on transformers and other requirements to make sure all the functionality we need is there.
  • Removes torch from requirements since users will want to install it themselves depending on their hardware setup, and it's not required anymore when using only the onnx models.

Release 1.1.0

17 Jun 09:58
  • Added the missing get_threshold function.
  • wtp.split with style adaptation now also allows changing the threshold via wtp.split(..., threshold=threshold). Previously, the passed threshold was overwritten by the default.
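
The threshold behavior described above can be sketched in plain Python. This is an illustration under stated assumptions, not wtpsplit's code: `DEFAULT_THRESHOLDS`, `resolve_threshold`, and `split_by_probs` are hypothetical names, and the default value and probabilities are made up. The point is only that an explicitly passed threshold should take precedence over the style's default.

```python
# Hypothetical per-style defaults; the value is invented for this example.
DEFAULT_THRESHOLDS = {"ud": 0.5}


def resolve_threshold(style, threshold=None):
    """Use the caller's threshold if given, else fall back to the style default."""
    if threshold is not None:
        return threshold
    return DEFAULT_THRESHOLDS[style]


def split_by_probs(text, probs, threshold):
    """Split after every character whose boundary probability exceeds `threshold`."""
    sentences, start = [], 0
    for i, p in enumerate(probs):
        if p > threshold:
            sentences.append(text[start:i + 1])
            start = i + 1
    if start < len(text):
        sentences.append(text[start:])
    return sentences


text = "Hi there. Bye."
probs = [0.01] * len(text)
probs[9] = 0.3    # weak boundary after the first sentence
probs[13] = 0.9   # strong boundary at the end of the text

with_default = split_by_probs(text, probs, resolve_threshold("ud"))
with_custom = split_by_probs(text, probs, resolve_threshold("ud", threshold=0.2))
print(with_default)
print(with_custom)
```

With the default, the weak boundary is ignored and the text stays in one piece. Passing a lower threshold makes the same probabilities yield two sentences, which is the effect that was lost while the default overwrote the argument.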

Release 1.0.1

31 May 11:32

A major revamp of this library, now called wtpsplit!

See the README for details.