Summer internship project @ JetBrains Research (updated Aug 27, 2021; Python)
Token Classification task on the Yes We Can dataset
Links to my repositories, where I implement a wide variety of Natural Language Processing models using TensorFlow and Hugging Face.
Scraping, token classification, and model deployment for a selection process.
ArabiNizer is a state-of-the-art Arabic named entity recognizer (NER) leveraging the XLMR transformer model with an impressive testing accuracy of 95.00% and a remarkable testing F1-score of 88.00% on the PAN-X.AR subset from XTREME.
A 16M LLM for POS tagging in African languages
This repo provides scripts for fine-tuning HuggingFace Transformers, setting up pipelines, and optimizing token classification models for inference. They are based on my experience developing a custom chatbot. I'm sharing them in the hope that they will help others quickly fine-tune and use models in their projects! 😊
Token classification at the essay, paragraph, and sentence levels with BERT, DistilBERT, and NER
Keyword extraction to automate the discovery of datasets in publications and public reports
Part-of-speech tagging in Polish with a fine-tuned RoBERTa model
API for the Yoda-NER and Yoda-FITS models: NLP models for Google Feed product optimization
A web app built with Gradio to demonstrate the capabilities of the spaCy NER pipeline.
RE-Miner Dashboard for Visual Analytics, Review & Market Analysis
Project carried out as part of a final qualifying thesis (VKR) titled "Development of a software module for analyzing documents that confirm individual achievements"
End-to-end pipeline for (1) automatic scraping and parsing of NLP research papers, (2) token-level entity annotations in Label Studio, and (3) BERT-based models for span identification and entity recognition
A state-of-the-art Arabic part-of-speech tagger leveraging the XLMR transformer model, with an impressive testing accuracy of 97.49% and a remarkable testing F1-score of 96.44% on the Arabic UD Treebank.
Data pipelines for both TensorFlow and PyTorch!