Kirill-Kravtsov/kaggle-tweet-sentiment-extraction

Introduction

This is a PyTorch training pipeline for a text span selection task, built on the Catalyst deep learning framework.
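In the usual formulation of span selection, a transformer produces per-token start and end logits, and the predicted span is the highest-scoring valid (start ≤ end) pair. A minimal decoding sketch (the function and variable names here are illustrative, not this repo's actual code):

```python
def decode_span(start_logits, end_logits):
    """Return (start, end) maximizing start_logits[s] + end_logits[e] with s <= e."""
    best_score, best_pair = float("-inf"), (0, 0)
    for s, s_logit in enumerate(start_logits):
        for e in range(s, len(end_logits)):  # only consider valid spans
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score, best_pair = score, (s, e)
    return best_pair

tokens = ["this", "movie", "is", "really", "great", "!"]
start, end = decode_span([0.1, 0.2, 0.1, 2.0, 0.3, 0.0],
                         [0.0, 0.1, 0.2, 0.1, 2.5, 0.4])
print(" ".join(tokens[start:end + 1]))  # prints "really great"
```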

Installation

  1. Make sure Anaconda is installed
  2. Clone the repo:
git clone https://github.com/Kirill-Kravtsov/kaggle-tweet-sentiment-extraction
  3. Create and activate the provided Anaconda environment:
conda env create -f tweet_env.yml
conda activate tweet_env
  4. Download the competition data and put it in the data directory in the project root
  5. Create cross-validation folds by running:
python create_folds.py
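The fold-creation step typically assigns each row a fold index while keeping label proportions roughly equal across folds. A simple stratified sketch of the idea (this is an assumption about what create_folds.py does, not its actual implementation; the sentiment labels are illustrative):

```python
from collections import defaultdict

def assign_folds(labels, n_folds=5):
    """Assign a fold index per row, distributing each label value
    round-robin so every fold gets a similar class balance."""
    counters = defaultdict(int)  # per-label running count
    folds = []
    for label in labels:
        folds.append(counters[label] % n_folds)
        counters[label] += 1
    return folds

sentiments = ["positive", "negative", "neutral"] * 10
folds = assign_folds(sentiments, n_folds=5)
```

Each of the 5 folds ends up with 6 rows, 2 per sentiment class.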

Project structure:

├── configs
│   ├── best_bertweet.yml
│   ├── best_roberta.yml
│   ├── experiments
│   └── optimization
├── create_folds.py
├── data
├── logs
├── scripts
├── src
│   ├── callbacks.py
│   ├── collators.py
│   ├── datasets.py
│   ├── data_utils.py
│   ├── hooks.py
│   ├── losses.py
│   ├── optimize_experiment.py
│   ├── tokenization.py
│   ├── train.py
│   ├── transformer_models.py
│   └── utils.py
└── tweet_env.yml

Running pipeline

To train the basic RoBERTa and BERTweet models, run:

python train.py --cv --config ../configs/best_roberta.yml
python train.py --cv --config ../configs/best_bertweet.yml

Note: the code is designed to run on a single GPU, so on a multi-GPU system remember to set the CUDA_VISIBLE_DEVICES variable, e.g.:

CUDA_VISIBLE_DEVICES=0 python train.py --cv --config ../configs/best_roberta.yml
