Releases: NVIDIA/NeMo-Curator

v0.4.0

14 Aug 21:54
07bc29d

Highlights

  • Semantic Deduplication
  • Resiliparse for Text Extraction
  • Improved Distributed Data Classification: the domain classifier is 1.55x faster through intelligent batching
  • Synthetic data generation for fine-tuning

Full Changelog: https://github.com/NVIDIA/NeMo-Curator/commits/v0.4.0

v0.3.0

10 Jun 21:59
e927313

Full Changelog: https://github.com/NVIDIA/NeMo-Curator/commits/v0.3.0

PyPI

https://pypi.org/project/nemo-curator/0.3.0/