feat: not self consistent speed up #4652

Merged

@mrucker mrucker commented Oct 14, 2023

This pull request modifies the behavior of the EMT scorer features when using the not-self-consistent flag.

In particular, the previous implementation included all 2nd-order interactions of memory features. This worked fine on datasets with a small number of features, but performance degraded quickly on datasets with a large feature count: n features produce n(n+1)/2 interaction terms, so the cost grows as O(n²). This pull request changes that behavior so that each feature interacts only with its matching counterpart. The new operation has computational complexity of O(n).
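The difference in term counts can be illustrated with a small Python sketch. This is not the VowpalWabbit implementation: the function names `all_second_order` and `matching_only`, the dict-based feature representation, and the use of a plain product as the interaction are all assumptions made for illustration only.

```python
from itertools import combinations_with_replacement


def all_second_order(features):
    """Hypothetical stand-in for the old behavior: one product for every
    unordered pair of features (i <= j), giving n(n+1)/2 terms."""
    names = sorted(features)
    return {
        (i, j): features[i] * features[j]
        for i, j in combinations_with_replacement(names, 2)
    }


def matching_only(query, memory):
    """Hypothetical stand-in for the new behavior: a feature interacts only
    with the feature of the same name in the other example, giving at most
    n terms."""
    return {name: query[name] * memory[name] for name in query.keys() & memory.keys()}


if __name__ == "__main__":
    example = {"a": 1.0, "b": 2.0, "c": 3.0}
    # Number of terms generated by each scheme for n = 3 features.
    print(len(all_second_order(example)))
    print(len(matching_only(example, example)))
```

With n features the first function produces n(n+1)/2 terms while the second produces at most n, which is where the quadratic versus linear cost comes from.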

To test the impact of the change, I ran an experiment on 200 OpenML datasets. On average, the new features increased loss by only 1%.

@ataymano ataymano merged commit f8091b6 into VowpalWabbit:master Jan 23, 2024
115 of 116 checks passed