bilzard/kaggle-hms-public
About This Repository

This repository contains part of the resources for the 4th place solution (bilzard part) in the Kaggle competition HMS - Harmful Brain Activity Classification.

Details of this solution are described here.

LICENSE

Apache 2.0

Environment

  • GPU: NVIDIA RTX 4090
  • CUDA: 12.1

Setup

Install required packages

pip install -r requirements.txt

Install local packages with additional requirements

The command below installs this repository into your local Python environment along with its dependencies.

pip install --editable .

Configure Local Environment Settings

Edit conf/env/local.yaml to match the configuration of your local environment.

name: local
num_workers: 24
infer_batch_size: 32
grad_checkpointing: false

data_dir: (path to competition data)
working_dir: (path to your working directory)
output_dir: ${env.working_dir}/${job_name}/${phase}
checkpoint_dir: ${env.working_dir}/train
submission_dir: .
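
The ${...} references are OmegaConf-style interpolations that are resolved at runtime. As a minimal sketch of how they expand (the job_name and phase values below are hypothetical; in practice they come from the command line):

from omegaconf import OmegaConf

# Hypothetical values for illustration; real values are passed on the CLI
# (e.g. job_name=preprocess phase=train).
cfg = OmegaConf.create(
    {
        "job_name": "preprocess",
        "phase": "train",
        "env": {
            "working_dir": "/path/to/working_dir",
            "output_dir": "${env.working_dir}/${job_name}/${phase}",
        },
    }
)
print(cfg.env.output_dir)  # -> /path/to/working_dir/preprocess/train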

How to Reproduce

To reproduce the final submission, execute the following commands:

python -m run.preprocess job_name=preprocess phase=train
python -m run.fold_split job_name=fold_split phase=train
python schedule.py train --config_names=v5_eeg_24ep_cutmix --folds=0,1,2,3,4 --seeds=0,1,2

These commands train the models contained in the ensemble of the final submission. Trained model checkpoints will be saved in ./data/train.

Training is executed once per fold and random seed, so this generates 15 model checkpoints (5 folds × 3 seeds) per config_name.
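
As a sketch, the 15 (fold, seed) combinations can be enumerated like this (the exact directory layout under ./data/train is an assumption for illustration; check the actual output after training):

from itertools import product
from pathlib import Path

config_name = "v5_eeg_24ep_cutmix"
folds = [0, 1, 2, 3, 4]
seeds = [0, 1, 2]

# 5 folds x 3 seeds = 15 checkpoints per config_name.
# NOTE: the path pattern below is hypothetical.
for fold, seed in product(folds, seeds):
    print(Path("data/train") / config_name / f"fold_{fold}_seed_{seed}")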

Performance

These are the seed & fold ensemble results per experiment.

exp_name            CV (n_votes > 8.4)  Private LB  Public LB
v5_eeg_24ep_cutmix  0.2477              0.327657    0.256772

Detailed Explanation of Entry Points (Optional)

This section explains the usage of the major entry points in this repository. You don't need to fully understand these details to reproduce the final submission, but they will help you train and evaluate new models using the resources in this repository.

Pre-Process

This command produces EEGs, Channel Quality Masks (CQM), and Kaggle spectrograms in NumPy ndarray format. By default, EEGs are subsampled to 40 Hz.

python -m run.preprocess job_name=preprocess phase=train
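
As an illustration, the preprocessed arrays can be inspected with NumPy. The file path below is hypothetical; the actual naming scheme is defined by run.preprocess:

import numpy as np

# Hypothetical file path for illustration only.
eeg = np.load("preprocess/train/eeg/12345.npy")
# At the default 40 Hz, a 50-second window corresponds to 2000 samples.
print(eeg.shape, eeg.dtype)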

Fold-split

This command generates 5-fold train/validation splits. Folds are generated with GroupKFold, grouped by patient_id, as sketched below.

python -m run.fold_split job_name=fold_split phase=train
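
Conceptually, the split matches scikit-learn's GroupKFold with patient_id as the grouping key. A minimal sketch (the DataFrame columns are assumed; the real metadata comes from the competition's train.csv):

import pandas as pd
from sklearn.model_selection import GroupKFold

# Assumed columns for illustration.
df = pd.DataFrame(
    {
        "eeg_id": range(10),
        "patient_id": [0, 0, 1, 1, 2, 2, 3, 3, 4, 4],
    }
)

gkf = GroupKFold(n_splits=5)
for fold, (train_idx, valid_idx) in enumerate(
    gkf.split(df, groups=df["patient_id"])
):
    # No patient appears in both the train and validation split of a fold.
    print(fold, df.loc[valid_idx, "patient_id"].unique())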

Train

This command is useful for training multiple single models at the same time.

python schedule.py train --config_names=v5_eeg_24ep_cutmix --folds=0,1,2,3,4 --seeds=0,1,2

Batch Inference (Infer & Ensemble)

This command runs inference with all models specified in the given ensemble entity at once, computes their mean ensemble (see run/conf/ensemble_entity), and writes submission.csv to the working directory.

python -m run.batch_infer job_name=ensemble ensemble_entity=f01234_s012 ensemble_entity.name=v5_eeg_24ep_cutmix
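
The mean ensemble itself is just an average of per-model predictions. A minimal sketch (array names and shapes are assumptions for illustration):

import numpy as np

# Hypothetical per-model class-probability predictions,
# each of shape (num_samples, num_classes).
preds_per_model = [np.random.rand(4, 6) for _ in range(15)]
preds_per_model = [p / p.sum(axis=1, keepdims=True) for p in preds_per_model]

# Mean ensemble: average probabilities across all models.
ensemble = np.mean(np.stack(preds_per_model), axis=0)
print(ensemble.shape)  # (4, 6)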

Acknowledgements

The architecture of the resources in this repository was inspired by the following repositories. Thanks to the authors.
