Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand data validation to also cover the scikit-learn experiment classes #290

Merged
merged 3 commits into from
Feb 23, 2024

Conversation

drbenvincent
Copy link
Collaborator

  • Closes Add input validation to experiment objects #78
  • We currently have _input_validation methods in just the pymc experiment classes
  • This PR extracts these out into new Mixin classes in data_validation.py
  • Now we can use the same data validation code for both the pymc and the scikit-learn versions of the experiment classes.

@drbenvincent drbenvincent changed the title Extract data validation code into Mixin classes and use in both pymc and scikit-learn experiment classes Expand data validation to also cover the scikit-learn experiment classes Feb 5, 2024
Copy link

codecov bot commented Feb 5, 2024

Codecov Report

Attention: Patch coverage is 97.64706% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 77.06%. Comparing base (67d9e0e) to head (6574481).

Files Patch % Lines
causalpy/data_validation.py 96.15% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #290      +/-   ##
==========================================
+ Coverage   76.50%   77.06%   +0.56%     
==========================================
  Files          20       21       +1     
  Lines        1345     1378      +33     
==========================================
+ Hits         1029     1062      +33     
  Misses        316      316              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@jpreszler jpreszler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took me so long to get to this. I've been starting work at 6am to overlap with coworkers in the UK and have been wiped out in the evenings.

I like splitting out the input validation, it's good progress to making things more streamlined.

There are a few suggestions I would make:

  1. codecov has flagged a couple of things that don't have tests that would be easy to add but they seem fairly low value.
  2. There's a lot of redundancy here. For example, there are 4 checks for _is_variable_dummy_coded with the only difference being the argument. Perhaps there could be a generic DataValidator class with methods for common checks that each experiment validator inherits from and the _input_validation methods just run the checks needed for that experiment.

I don't think either of these should hold up merging this PR. If you like the idea of 2, I could do that in the near future.

@drbenvincent
Copy link
Collaborator Author

Thanks @jpreszler. I've added tests for the scikit-learn versions of the experiments in test_input_validation.py

@drbenvincent drbenvincent merged commit a63a7f9 into main Feb 23, 2024
10 checks passed
@drbenvincent drbenvincent deleted the data-validation branch February 23, 2024 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add input validation to experiment objects
2 participants