When inappropriately applying ScalarRange constraint, InvalidDataError is being returned instead of ConstraintsNotMetError #1842

Closed
srinify opened this issue Mar 6, 2024 · 0 comments · Fixed by #1858
Labels
bug Something isn't working feature:constraints Related to inputting rules or business logic

srinify commented Mar 6, 2024

Environment Details

  • SDV version: 0.10.0
  • Python version: 3.11.x

Error Description

If your data contains values outside the specified range and you apply a ScalarRange constraint anyway, a ConstraintsNotMetError should be raised. Instead, an InvalidDataError is raised.

Originally identified here: #1833
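
For context, the condition a ScalarRange constraint checks can be sketched in plain Python. This is only an illustration, assuming the documented meaning of `strict_boundaries` (endpoints excluded when `True`); `find_out_of_range` is a hypothetical helper, not part of SDV.

```python
def find_out_of_range(values, low_value, high_value, strict_boundaries=True):
    """Return the values that fall outside the given range.

    Hypothetical helper for illustration only; it is not SDV code.
    With strict_boundaries=True the endpoints count as out of range.
    """
    out_of_range = []
    for value in values:
        if strict_boundaries:
            in_range = low_value < value < high_value
        else:
            in_range = low_value <= value <= high_value
        if not in_range:
            out_of_range.append(value)
    return out_of_range


# The first five 'age' values listed in the error output of this issue.
ages = [39, 50, 38, 53, 28]
violations = find_out_of_range(ages, low_value=5, high_value=10)
print(f"{len(violations)} of {len(ages)} values violate the range")
```

Running a check like this before `fit` shows that every listed age is outside (5, 10), so the data matches the metadata but cannot satisfy the constraint.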

Steps to reproduce

Quick code snippet to reproduce in sdv 0.10:

from sdv.datasets.demo import get_available_demos, download_demo
from sdv.single_table import CTGANSynthesizer

demos_df = get_available_demos(modality='single_table')
data, metadata_obj = download_demo('single_table', 'census_extended')

constraint = {
    'constraint_class': 'ScalarRange',
    'constraint_parameters': {
        'column_name': 'age',
        'low_value': 5,
        'high_value': 10,
        'strict_boundaries': True
    }
}

synthesizer = CTGANSynthesizer(metadata_obj, epochs=500, verbose=True)
synthesizer.add_constraints(constraints=[constraint])
synthesizer.fit(data)

This is the resulting error:

---------------------------------------------------------------------------
InvalidDataError                          Traceback (most recent call last)
<ipython-input-4-463f9f3b4286> in <cell line: 16>()
     14 synthesizer = CTGANSynthesizer(metadata_obj, epochs=500, verbose=True)
     15 synthesizer.add_constraints(constraints=[constraint])
---> 16 synthesizer.fit(data)

2 frames
/usr/local/lib/python3.10/dist-packages/sdv/single_table/base.py in fit(self, data)
    393         self._data_processor.reset_sampling()
    394         self._random_state_set = False
--> 395         processed_data = self._preprocess(data)
    396         self.fit_processed_data(processed_data)
    397 

/usr/local/lib/python3.10/dist-packages/sdv/single_table/ctgan.py in _preprocess(self, data)
    211 
    212     def _preprocess(self, data):
--> 213         self.validate(data)
    214         self._data_processor.fit(data)
    215         self._print_warning(data)

/usr/local/lib/python3.10/dist-packages/sdv/single_table/base.py in validate(self, data)
    162 
    163         if errors:
--> 164             raise InvalidDataError(errors)
    165 
    166     def _validate_transformers(self, column_name_to_transformer):

InvalidDataError: The provided data does not match the metadata:

Data is not valid for the 'ScalarRange' constraint:
   age
0   39
1   50
2   38
3   53
4   28
+32556 more
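
The distinction this issue asks for can be sketched with stand-in exception classes. This is not SDV's implementation or the eventual fix; the two classes below only borrow the names of the SDV errors, and `validate` with its parameters is a hypothetical function written for illustration.

```python
class InvalidDataError(Exception):
    """Stand-in (not imported from sdv): data does not match the metadata."""


class ConstraintsNotMetError(Exception):
    """Stand-in (not imported from sdv): data violates a constraint."""


def validate(rows, expected_columns, low_value, high_value):
    """Hypothetical validator that keeps the two failure kinds separate."""
    # Metadata-level check: every row must contain the expected columns.
    for row in rows:
        missing = set(expected_columns) - set(row)
        if missing:
            raise InvalidDataError(f"Missing columns: {sorted(missing)}")

    # Constraint-level check: 'age' must lie strictly inside the range.
    bad = [row["age"] for row in rows if not low_value < row["age"] < high_value]
    if bad:
        raise ConstraintsNotMetError(
            f"{len(bad)} value(s) outside ({low_value}, {high_value})"
        )


try:
    validate([{"age": 39}, {"age": 50}], {"age"}, low_value=5, high_value=10)
except ConstraintsNotMetError as error:
    print(f"Constraint problem: {error}")
except InvalidDataError as error:
    print(f"Metadata problem: {error}")
```

With separate exception types, a caller can tell "the data is malformed" apart from "the data is fine but the constraint does not fit it", which is the behavior requested above.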