byte-pair-encoding
Here are 29 public repositories matching this topic...
Byte Pair Encoding (BPE)
Updated Feb 25, 2019 - Python
News headline generation
Updated Nov 21, 2022 - Python
High performance unsupervised text tokenization for Ruby
Updated Dec 27, 2023 - Ruby
Code for the publication of WWW'22
Updated May 31, 2022 - Python
Auto summarization from BPE tokenization
Updated Aug 20, 2020 - Jupyter Notebook
Named entity recognition in Malayalam using BiLSTM and TENER (Transformer Encoder)
Updated Jul 13, 2023 - Jupyter Notebook
A project for a sequence-to-sequence NLP task. We developed a custom model in PyTorch to understand the task, and we also fine-tuned pre-trained transformer models to improve translation performance.
Updated Aug 1, 2024 - Jupyter Notebook
A tool that encrypts a sequence of words (or pieces of text) using the AES-256 algorithm and encodes the encrypted result into a PNG image by mapping each byte value to a specific color. It also decodes such an image to recover the original sequence of words.
Updated Sep 23, 2023 - Go
Feature extraction from sequential data
Updated Jul 4, 2019 - C++
Order-agnostic lossless compressor using BPE and Huffman Coding.
Updated Jun 6, 2024 - Python
An Introduction to Natural Language Processing (NLP)
Updated Nov 4, 2023 - Jupyter Notebook
An efficient ranked retrieval system for English corpora, optimised with VBE and BPE.
Updated Nov 10, 2023 - Python
Updated Mar 1, 2023 - Shell
Code repo for the paper "AutoGO: Automated Computation Graph Optimization for Neural Network Evolution", accepted to NeurIPS 2023.
Updated Jun 7, 2024 - Python
R package for Byte Pair Encoding based on YouTokenToMe
Updated Sep 16, 2023 - C++
Modern Eager TensorFlow implementation of Attention Is All You Need
Updated Aug 6, 2023 - Python
A byte-level Byte Pair Encoding (BPE) algorithm for tokenization in large language models (LLMs), similar to those used in GPT, Llama, and Mistral.
Updated Aug 7, 2024
Byte-Pair Encoding (BPE), a subword-based tokenization algorithm, implemented from scratch in Python
Updated Jan 30, 2023 - Python