MNDO

Python implementation of MNDO (Multivariate Normal Distribution based Oversampling).

Article about this implementation

Requirements

  • Anaconda / Python 3.6
  • tqdm 4.31.1
  • imbalanced-learn 0.4.3

Usage

Preprocessing KEEL datasets

If you use KEEL datasets, preprocess them with the following command:

python pre_dataset.py dataset_directory
  • Preprocesses all files in the given directory.
  • Removes unnecessary lines and replaces class labels (positive class -> 1, negative class -> 0).
  • Saves the preprocessed data to MNDO/Predataset/xxx.csv.
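
The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the code in pre_dataset.py: it assumes that the "unnecessary lines" are the KEEL header lines beginning with `@`, that the class label is the last comma-separated field, and that the positive class is spelled `positive`. The function names `preprocess_keel` and `rows_to_csv` are hypothetical.

```python
import csv
import io


def preprocess_keel(text, positive_label="positive"):
    """Drop KEEL header lines and map the last field to a 0/1 class label."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and header lines (@relation, @attribute, @data, ...)
        # are the "unnecessary lines" to remove.
        if not line or line.startswith("@"):
            continue
        fields = [field.strip() for field in line.split(",")]
        # Assumption: the class label is the last field of each record.
        fields[-1] = "1" if fields[-1] == positive_label else "0"
        rows.append(fields)
    return rows


def rows_to_csv(rows):
    """Serialize the cleaned rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Tiny made-up sample in KEEL style, for illustration only.
sample = """@relation example
@attribute a real [0.0, 1.0]
@attribute b real [0.0, 1.0]
@attribute class {positive, negative}
@data
0.1, 0.2, negative
0.9, 0.8, positive
"""

rows = preprocess_keel(sample)
csv_text = rows_to_csv(rows)
```

In this sketch the header lines are discarded and the two records become `0.1,0.2,0` and `0.9,0.8,1`.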

Over-sampling

Resampled (generated) data is stored in ./pos_data.

python over-sampling.py data_path
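
As a rough illustration of the idea behind the method's name, the sketch below fits a multivariate normal distribution to the minority (positive) samples and draws new samples from it. It is a simplified reading of "multivariate normal distribution based oversampling", not necessarily what over-sampling.py does; the function name `mvn_oversample`, its parameters, and the toy data are assumptions made for this example.

```python
import numpy as np


def mvn_oversample(X_pos, n_new, seed=0):
    """Draw n_new synthetic samples from a Gaussian fitted to the minority class."""
    rng = np.random.default_rng(seed)
    # Estimate the mean vector and covariance matrix of the minority samples.
    mean = X_pos.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_pos, rowvar=False))
    # Sample new points from the fitted multivariate normal distribution.
    return rng.multivariate_normal(mean, cov, size=n_new)


# Toy minority class: 20 samples with 2 features (made-up data).
toy_rng = np.random.default_rng(42)
X_pos = toy_rng.normal(loc=[2.0, -1.0], scale=0.5, size=(20, 2))

synthetic = mvn_oversample(X_pos, n_new=30)
```

Here `synthetic` has one row per generated sample and the same number of columns as `X_pos`; appending it to the positive class gives a more balanced training set.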

Training

python train.py data_path

train.py steps:

  1. Load data
  2. Over-sampling (MNDO, SMOTE, Borderline-SMOTE, ADASYN, SMOTE-ENN and SMOTE-Tomek Links)
  3. Scaling (Normalization or Standardization)
  4. Learning (SVM, Decision Tree and k-NN)
  5. Predict (results are saved in MNDO/output/xxx.csv)
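
The five steps can be sketched end to end as below. This is an illustrative outline under several assumptions, not the contents of train.py: the data is synthetic, the train/test split and the F1 metric are choices made for this example, step 2 uses a simple Gaussian sampler as a stand-in for MNDO, and step 3 shows standardization only. The imbalanced-learn samplers listed in step 2 (for example `SMOTE`) could be substituted there through their `fit_resample` method.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# 1. Load data (assumption: a synthetic imbalanced set stands in for a CSV file).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(180, 2)),   # negative class
               rng.normal(3.0, 1.0, size=(20, 2))])   # positive class
y = np.array([0] * 180 + [1] * 20)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# 2. Over-sampling (stand-in: sample from a Gaussian fitted to the training positives).
pos = X_train[y_train == 1]
n_new = int((y_train == 0).sum() - (y_train == 1).sum())
synthetic = rng.multivariate_normal(
    pos.mean(axis=0), np.cov(pos, rowvar=False), size=n_new)
X_res = np.vstack([X_train, synthetic])
y_res = np.concatenate([y_train, np.ones(n_new, dtype=int)])

# 3. Scaling (standardization; the scaler is fitted on training data only).
scaler = StandardScaler().fit(X_res)
X_res_scaled = scaler.transform(X_res)
X_test_scaled = scaler.transform(X_test)

# 4. Learning and 5. Predict, once per classifier.
models = {
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "k-NN": KNeighborsClassifier(),
}
scores = {}
for name, model in models.items():
    model.fit(X_res_scaled, y_res)
    scores[name] = f1_score(y_test, model.predict(X_test_scaled))
```

Over-sampling only the training split keeps synthetic points out of the evaluation data, which is why the split happens before step 2 in this sketch.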

To train on all files, use this script:

./run.sh

ToDo

  • Provide as a Python library

Related works

Author

Kotaro Ambai (baibai25)