AttributeError when using specific locale strings (es_AR, fr_BE) #1439

Closed
npatki opened this issue May 24, 2023 · 0 comments · Fixed by #1642
Labels: bug

npatki commented May 24, 2023

Environment Details

  • SDV version: 1.1.0
  • Python version: 3.10
  • Operating System: Linux (Colab Notebook)

Error Description

I expect to be able to pass any locale string listed in the Faker docs into an SDV synthesizer. However, some of the locales listed there don't seem to work:

  • es_AR
  • fr_BE

Both of these produce the same AttributeError when passed into the SDV locales parameter.
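
As a possible stopgap, the requested locales could be checked against the list the installed Faker actually ships before they reach the synthesizer. The sketch below is only an illustration: `split_locales` is a hypothetical helper (not part of SDV or Faker), and it assumes the installed Faker exposes its locale list as `faker.config.AVAILABLE_LOCALES`, which is the name the check in `faker/factory.py` uses.

```python
from typing import Iterable, List, Optional, Tuple


def split_locales(
    requested: Iterable[str],
    available: Optional[Iterable[str]] = None,
) -> Tuple[List[str], List[str]]:
    """Return (supported, unsupported) locales, preserving request order.

    Hypothetical helper for illustration; not part of SDV or Faker.
    """
    if available is None:
        # Assumption: the installed Faker exposes its locale list here.
        # Imported lazily so the helper also works with an explicit list.
        from faker.config import AVAILABLE_LOCALES
        available = AVAILABLE_LOCALES

    known = set(available)
    supported: List[str] = []
    unsupported: List[str] = []
    for locale in requested:
        if locale in known:
            supported.append(locale)
        else:
            unsupported.append(locale)
    return supported, unsupported
```

Calling `split_locales(['es_AR', 'fr_BE'])` with Faker installed would show which of the two the local Faker version recognizes; only the supported list would then be passed as `locales=`.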

Steps to reproduce

from sdv.datasets.demo import download_demo
from sdv.single_table import GaussianCopulaSynthesizer

real_data, metadata = download_demo(
    modality='single_table',
    dataset_name='fake_hotel_guests'
)

synthesizer = GaussianCopulaSynthesizer(
    metadata,
    locales=['es_AR', 'fr_BE'] # problematic locales, all other ones work
) 
synthesizer.fit(real_data)
AttributeError: Invalid configuration for faker locale `es_AR`

Stack Trace

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-87e1341bac14> in <cell line: 13>()
     11     locales=['es_AR', 'fr_BE'] # problematic locales, all other ones work
     12 ) 
---> 13 synthesizer.fit(real_data)
     14 _ = synthesizer.sample(3)

9 frames
/usr/local/lib/python3.10/dist-packages/sdv/single_table/base.py in fit(self, data)
    470         self._data_processor.reset_sampling()
    471         self._random_state_set = False
--> 472         processed_data = self._preprocess(data)
    473         self.fit_processed_data(processed_data)
    474 

/usr/local/lib/python3.10/dist-packages/sdv/single_table/base.py in _preprocess(self, data)
    417     def _preprocess(self, data):
    418         self.validate(data)
--> 419         self._data_processor.fit(data)
    420         return self._data_processor.transform(data)
    421 

/usr/local/lib/python3.10/dist-packages/sdv/data_processing/data_processor.py in fit(self, data)
    623         """
    624         self._prepared_for_fitting = False
--> 625         self.prepare_for_fitting(data)
    626         constrained = self._transform_constraints(data)
    627         LOGGER.info(f'Fitting HyperTransformer for table {self.table_name}')

/usr/local/lib/python3.10/dist-packages/sdv/data_processing/data_processor.py in prepare_for_fitting(self, data)
    586                     f'for table {self.table_name}'
    587                 ))
--> 588                 config = self._create_config(constrained, columns_created_by_constraints)
    589                 self._hyper_transformer.set_config(config)
    590 

/usr/local/lib/python3.10/dist-packages/sdv/data_processing/data_processor.py in _create_config(self, data, columns_created_by_constraints)
    473             elif pii:
    474                 enforce_uniqueness = bool(column in self._keys)
--> 475                 transformers[column] = self.create_anonymized_transformer(
    476                     sdtype,
    477                     column_metadata,

/usr/local/lib/python3.10/dist-packages/sdv/data_processing/data_processor.py in create_anonymized_transformer(sdtype, column_metadata, enforce_uniqueness, locales)
    376             kwargs['enforce_uniqueness'] = True
    377 
--> 378         return get_anonymized_transformer(sdtype, kwargs)
    379 
    380     def create_regex_generator(self, column_name, sdtype, column_metadata, is_numeric):

/usr/local/lib/python3.10/dist-packages/sdv/metadata/anonymization.py in get_anonymized_transformer(function_name, transformer_kwargs)
     99     if function_name in SDTYPE_ANONYMIZERS:
    100         transformer_kwargs.update(SDTYPE_ANONYMIZERS[function_name])
--> 101         return AnonymizedFaker(**transformer_kwargs)
    102 
    103     provider_name = _detect_provider_name(function_name, locales=locales)

/usr/local/lib/python3.10/dist-packages/rdt/transformers/pii/anonymizer.py in __init__(self, provider_name, function_name, function_kwargs, locales, enforce_uniqueness)
    104         self._faker_random_seed = None
    105         self.locales = locales
--> 106         self.faker = faker.Faker(self.locales)
    107         if self.locales:
    108             self._check_locales()

/usr/local/lib/python3.10/dist-packages/faker/proxy.py in __init__(self, locale, providers, generator, includes, use_weighting, **config)
     65 
     66         for locale in locales:
---> 67             self._factory_map[locale] = Factory.create(
     68                 locale,
     69                 providers,

/usr/local/lib/python3.10/dist-packages/faker/factory.py in create(cls, locale, providers, generator, includes, use_weighting, **config)
     41         if locale not in AVAILABLE_LOCALES:
     42             msg = f"Invalid configuration for faker locale `{locale}`"
---> 43             raise AttributeError(msg)
     44 
     45         config["locale"] = locale

AttributeError: Invalid configuration for faker locale `es_AR`
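
Since the last frames are inside Faker itself, the error can likely be reproduced without SDV by constructing a `Faker` instance per locale. Here is a rough sketch of that probe; `probe_locales` and its `make_faker` parameter are hypothetical names introduced for illustration, and the default path assumes `faker.Faker(locale)` raises `AttributeError` for an unknown locale, as the trace above shows.

```python
from typing import Callable, Dict, Iterable, Optional


def probe_locales(
    locales: Iterable[str],
    make_faker: Optional[Callable[[str], object]] = None,
) -> Dict[str, Optional[str]]:
    """Map each locale to None if it loads, or to the error message if not.

    Hypothetical helper for illustration; not part of SDV, RDT, or Faker.
    """
    if make_faker is None:
        # Imported lazily so a stand-in factory can be supplied instead.
        from faker import Faker
        make_faker = Faker

    results: Dict[str, Optional[str]] = {}
    for locale in locales:
        try:
            make_faker(locale)
            results[locale] = None
        except AttributeError as error:
            # Same exception type as the final frame of the trace above.
            results[locale] = str(error)
    return results
```

Running `probe_locales(['es_AR', 'fr_BE', 'en_US'])` in the same environment should show whether the failure comes from the installed Faker version rather than from SDV's own handling of the `locales` parameter.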