Multilabel Classification Evaluation #5
Edit: Since multilabel stratified k-fold cross-validation is not implemented in sklearn, this repository might help with the implementation of multilabel grid search.
Thank you, @angrymeir! You're helping to make this humble project better! That is totally right: the current implementation of the evaluation class does not provide support for multilabel classification. What do you think of adding an extra argument to the
I'm currently working on the
Thanks for suggesting that repository implementing multilabel stratified k-fold cross-validation! It seems quite straightforward to use.
BTW, taking into account your great ideas, suggestions, and feedback, do you mind being added to the README file as a contributor?
BTW, just in case you're wondering about being added as a contributor: PySS3 follows the all-contributors specification, "Recognize all contributors, not just the ones who push code" 😎 Now that I'm done with the other issue, I'll continue with this one 👽 ☕
Sounds like a plan! I would be honored to be listed as a contributor! However, the ideas are not only mine but also my colleague @Vaiyani's!
@all-contributors could you add @Vaiyani and @angrymeir as contributors for ideas, suggestions, and feedback?
I've put up a pull request to add @angrymeir and @Vaiyani! 🎉
@angrymeir and @Vaiyani, both were added to the README file! 😎 Thanks, guys. I've also added you as contributors not only for ideas but also for data (since I'll probably be using your SemEval 2016 Task 5 dataset for the tutorials and live demo, as suggested in Issue #6).
@sergioburdisso Thanks for this great project as well :)
The fit/train method now supports multilabel classification. It automatically determines whether we're dealing with a multilabel classification problem by looking at the first item of the `y_train` list: if the first item is a list of labels (i.e., not a single label), it assumes a multilabel classification problem.
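The detection rule described above can be sketched in a few lines. This is only an illustration of the idea, not PySS3's actual implementation, and `is_multilabel` is a hypothetical helper name:

```python
# Minimal sketch of the auto-detection rule described above (illustrative
# only; not PySS3's actual code, and `is_multilabel` is a hypothetical name).
def is_multilabel(y_train):
    """Return True if the first item of y_train is itself a list of labels."""
    return len(y_train) > 0 and isinstance(y_train[0], (list, tuple))

print(is_multilabel(["sports", "tech"]))                  # single-label -> False
print(is_multilabel([["sports", "tech"], ["politics"]]))  # multilabel -> True
```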
This function converts the list of training/test labels (i.e., `y_train`/`y_test`) into a membership matrix. It is useful when working with multi-label classification problems and is meant to be used only internally by the evaluation module (the ``Evaluation`` class). However, if users want to perform model evaluations using custom evaluation metrics, they can use this function as shown in the following example, in which performance is measured in terms of Hamming loss:

```
from pyss3 import SS3
from pyss3.util import Dataset, membership_matrix

from sklearn.metrics import hamming_loss

x_train, y_train = Dataset.load_from_files_multilabel(...)
x_test, y_test = Dataset.load_from_files_multilabel(...)

clf = SS3()
clf.train(x_train, y_train)

y_pred = clf.predict(x_test)

y_test_mem = membership_matrix(clf, y_test)
y_pred_mem = membership_matrix(clf, y_pred)

hamming_loss(y_test_mem, y_pred_mem)
```

Documentation available here: https://pyss3.rtfd.io/en/latest/api/index.html#pyss3.util.membership_matrix
Now, when working with multi-label classification problems, ``predict()`` will realize the user is working with multi-labeled data and set the `multilabel` argument to True by default. Therefore, if the model was trained on multilabeled data, the user can simply call ``predict(x_test)`` without the ``multilabel=True`` argument. (#5)
Now ``membership_matrix()`` runs 30 times faster; for instance, what before took 4.5s now takes only 150ms. This optimization was necessary because this function is called each time the model is evaluated, which means, for instance, that it is called multiple times while performing ``grid_search()``.
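For readers unfamiliar with the term: a membership matrix has one row per document and one column per category, with a 1 wherever the document carries that label. A minimal pure-Python sketch of the conversion (illustrative only; the real `pyss3.util.membership_matrix()` takes the trained classifier, which knows the category order, and is far more optimized):

```python
def membership_matrix_sketch(categories, y):
    """Convert lists of labels into a binary membership matrix.

    Illustrative only -- not PySS3's implementation. `categories` fixes
    the column order; `y` is a list of label lists, one per document.
    """
    col = {c: i for i, c in enumerate(categories)}  # label -> column index
    matrix = []
    for labels in y:
        row = [0] * len(categories)
        for label in labels:
            row[col[label]] = 1
        matrix.append(row)
    return matrix

cats = ["action", "comedy", "drama"]
print(membership_matrix_sketch(cats, [["drama"], ["action", "comedy"]]))
# -> [[0, 0, 1], [1, 1, 0]]
```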
Evaluation.test() now supports multi-label classification as well. It supports all previous standard metrics (precision, recall, f1-score, accuracy) plus two new ones, 'hamming-loss' and 'exact-match' (equivalent to 'accuracy'). Once finished, the `test` function will also show a binary confusion matrix for each possible label.
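The two new metrics can be understood with a short sketch over membership matrices. This is a pure-Python illustration of the definitions, not the internals of `Evaluation.test()`, and the function names are hypothetical:

```python
def hamming_loss_sketch(y_true, y_pred):
    """Fraction of individual label cells (document x label) that disagree."""
    cells = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for rt, rp in zip(y_true, y_pred)
                for t, p in zip(rt, rp))
    return wrong / cells

def exact_match_sketch(y_true, y_pred):
    """Fraction of documents whose full label set matches exactly."""
    return sum(rt == rp for rt, rp in zip(y_true, y_pred)) / len(y_true)

y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(hamming_loss_sketch(y_true, y_pred))  # 1 wrong cell out of 6
print(exact_match_sketch(y_true, y_pred))   # 1 exact row out of 2
```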
Hey @sergioburdisso, regarding the grid search, should I create a separate issue for that?
Evaluation.kfold_cross_validation() now supports multi-label classification as well. It supports all previous standard metrics (precision, recall, f1-score, accuracy) plus two new ones, 'hamming-loss' and 'exact-match' (equivalent to 'accuracy').
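For intuition, plain (non-stratified) k-fold splitting can be sketched as below. Note that this naive version does not balance label proportions across folds; for multilabel data a stratified variant, like the one provided by the repository mentioned earlier in the thread, is preferable. The helper name is hypothetical and this is not PySS3's code:

```python
def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for plain k-fold over n samples.

    Illustrative only: real multilabel evaluation benefits from
    *stratified* folds, which this naive contiguous split does not provide.
    """
    # Distribute the remainder so fold sizes differ by at most one.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

for train, test in kfold_indices(5, 2):
    print(test)  # first fold tests [0, 1, 2], second tests [3, 4]
```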
@angrymeir Cool!!! I've just finished with the
Evaluation.grid_search() now supports multi-label classification using the "test" method. It supports all previous standard metrics (precision, recall, f1-score, accuracy) plus two new ones, 'hamming-loss' and 'exact-match' (equivalent to 'accuracy').
Now the 3D evaluation plot (`Evaluation.plot()`) supports multi-label classification. New performance metrics have been added and binary confusion matrices for each label are shown for each evaluated model configuration.
@angrymeir @Vaiyani Guys! I've finally finished adding full multi-label classification support to the
Thanks, guys, for creating this issue :) these changes were necessary. Issue #9 is also part of this overall process of adding full multi-label classification support to PySS3, so as soon as I finish with the other two issues, I'll finally release the new version (0.6.0).
Do you guys think we should also add a new tutorial showing the new features? Do you think your dataset is well suited for that, or should I use a simpler one? Sort of more like a "proof-of-concept" dataset... what do you think?
@sergioburdisso Thank you for the quick and effective response on this issue. I believe a tutorial would be a good idea for new people as well, because tutorials are the first point of learning (in my experience); it would be really helpful. As for the data, I'm not quite sure. Our dataset (SemEval) is also well suited for this, but in the end, whichever delivers the message clearly should be the aim.
I guess a tutorial highlighting the differences would be great! In case you need help with the notebook/don't have time to implement it, let me know and I'll create one!
The dataset is a subset of the CMU Movie Summary Corpus (http://www.cs.cmu.edu/~ark/personas/) with 32985 summaries and only 10 movie genres. The dataset is structured according to #6, i.e., there are two files, one for the labels and another for the movie plot summaries.
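Assuming that format (one line per document; labels separated by semicolons, as described in #6), parsing can be sketched like this. The helper names are hypothetical illustrations, not PySS3's actual loader; the real API is `Dataset.load_from_files_multilabel()`:

```python
def parse_label_line(line):
    """Split one semicolon-separated label line into a list of labels."""
    return [label for label in line.strip().split(";") if label]

def load_multilabel_sketch(docs_path, labels_path):
    """Load parallel docs/labels files (hypothetical helper, not the real loader)."""
    with open(docs_path, encoding="utf-8") as fd:
        x = [line.rstrip("\n") for line in fd]
    with open(labels_path, encoding="utf-8") as fl:
        y = [parse_label_line(line) for line in fl]
    return x, y

print(parse_label_line("drama;comedy\n"))  # -> ['drama', 'comedy']
```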
Guys! I've just finished implementing the multi-label support for the Live Test tool (issue #9). Now, in the left panel, test documents are shown with a % corresponding to the label-based accuracy (aka hamming score). Besides, when a document is selected, the true labels are shown along with the predicted ones; misclassified labels are shown in red, like "drama" below: I'm about to release the new version soon, I'm just performing the final checks. Regarding the dataset for the tutorial, I've finally decided to use a subset of the CMU Movie Summary Corpus with only 10 categories (and 32985 documents/plot summaries). I've already uploaded the zipped dataset to the repo (5f5c055); it uses the same format as you suggested in Issue #6 (one file for (semicolon-separated) labels, another for docs), so I'll probably start working on (a very basic version of) the tutorial soon 😊
PySS3 now fully supports multi-label classification! :)

- The ``load_from_files_multilabel()`` function was added to the ``Dataset`` class (7ece7ce, resolved #6)
- The ``Evaluation`` class now supports multi-label classification (#5)
- Add multi-label support to ``train()/fit()`` (4d00476)
- Add multi-label support to ``Evaluation.test()`` (0a897dd)
- Add multi-label support to ``show_best()`` and ``get_best()`` (ef2419b)
- Add multi-label support to ``kfold_cross_validation()`` (aacd3a0)
- Add multi-label support to ``grid_search()`` (925156d, 79f1e9d)
- Add multi-label support to the 3D Evaluation Plot (42bbc65)
- The Live Test tool now supports multi-label classification as well (15657ee, b617bb7, resolved #9)
- Category names are no longer case-insensitive (4ec009a, resolved #8)
Hey @sergioburdisso,
Thank you for this awesome project!
Currently the evaluation class only supports single label classification, even though SS3 inherently supports multilabel classification.
These are the steps (as I see them) needed to support multilabel classification evaluation: