Not detecting any outliers on purely categorical datasets #4

Open
ToxicFyre opened this issue Apr 2, 2021 · 2 comments

ToxicFyre commented Apr 2, 2021

outliertree_test.zip

Hello, I am comparing several outlier detectors on purely categorical datasets, but whenever I run OutlierTree on such a dataset it returns no outliers (with a few exceptions). Is there a different parameterization that you would recommend?

I attached a sample of what I am using to test it, so you can see what I mean.

Cheers and thanks for the help,
ToxicFyre

ToxicFyre changed the title from "Not detecting outliers on purely categorical datasets" to "Not detecting any outliers on purely categorical datasets" on Apr 2, 2021
@david-cortes
Owner

Indeed. If you look at the parameters, some of them set the minimum size a branch must have in order to be split. Apart from that, categorical outliers are simply harder to identify. Ideas for criteria are welcome.
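
To illustrate why a minimum-size requirement can suppress every categorical outlier on a small dataset, here is a simplified sketch in plain Python. It is not OutlierTree's actual criterion: the function `rare_categories` and its parameters `min_size` and `max_share` are hypothetical names chosen for this example, and the thresholds are arbitrary.

```python
from collections import Counter


def rare_categories(values, min_size=50, max_share=0.01):
    """Return the categories whose share of `values` is at most `max_share`.

    Returns an empty set when the group has fewer than `min_size` rows,
    which imitates a minimum-branch-size guard (hypothetical, simplified).
    """
    n = len(values)
    if n < min_size:
        # Too few rows to judge what counts as "rare" in this group.
        return set()
    counts = Counter(values)
    return {cat for cat, count in counts.items() if count / n <= max_share}


# A small group: the lone "z" is not flagged, because 20 rows < min_size.
small = ["a"] * 19 + ["z"]
# A larger group containing the same kind of anomaly.
large = ["a"] * 199 + ["z"]

small_default = rare_categories(small)
large_default = rare_categories(large)
# Relaxing both thresholds lets the small group be evaluated as well.
small_relaxed = rare_categories(small, min_size=10, max_share=0.1)
```

With the default thresholds only the larger group yields a flagged category. Relaxing the minimum size makes the small group eligible, at the cost of a higher risk of flagging categories that are rare only by chance.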

@ToxicFyre
Author

Thank you for your reply. Yes, I noticed that having a few ordinal and numerical columns really helps in this case, since they provide additional information. I'll be sure to share any ideas for criteria that I come up with.

Thank you for your time,
ToxicFyre
