Getting ValueError (sdv-pii-25szo) while sampling synthesizer on SDV==1.13.0 #2023

Closed
Dharmik2510 opened this issue May 21, 2024 · 4 comments
Labels: bug (Something isn't working), resolution:duplicate (This issue or pull request already exists)

Comments

@Dharmik2510

I am using SDV version 1.13.0 with the simplify_schema option.

The model trains and saves successfully to the specified location. However, when I call the sample method to generate synthetic data, I encounter the following error:

=================================================

ValueError: invalid literal for int() with base 10: 'sdv-pii-25szo'

=================================================

@Dharmik2510 added the labels bug (Something isn't working) and new (Automatic label applied to new issues) on May 21, 2024
@Dharmik2510 changed the title from "Getting an error while sampling synthesizer on SDV==1.13.0" to "Getting ValueError (sdv-pii-25szo) while sampling synthesizer on SDV==1.13.0" on May 21, 2024
@srinify
Contributor

srinify commented May 22, 2024

Hi there @Dharmik2510, do you mind sharing a bit more context so we can try to reproduce this error?

  • What did your metadata look like before and after running simplify_schema()?
  • Were you using HMASynthesizer (just to double check)?
  • By "saves successfully" do you mean exporting the model using save()?
  • Does the synthesizer model work after training but before you saved / exported it? Is the error only happening when you re-import the model and try to sample?
  • Do you mind sharing the full code you ran and the full stack trace? You can hide your dataset if it's sensitive.

Let us know and we can start digging into this! Thanks!

@srinify added the label under discussion (Issue is currently being discussed) and removed the label new (Automatic label applied to new issues) on May 22, 2024
@srinify
Contributor

srinify commented May 30, 2024

Hi @Dharmik2510 just following up :) Let me know if you're still blocked by this issue

@srinify
Contributor

srinify commented Jun 3, 2024

Hi @Dharmik2510 I haven't heard from you in a while so I'm closing this issue out for now. If you still need help, feel free to comment and tag me and I can re-open!

@srinify closed this as completed on Jun 3, 2024
@srinify added the label resolution:cannot replicate (The problem cannot be replicated) and removed the label under discussion (Issue is currently being discussed) on Jun 3, 2024
@npatki
Contributor

npatki commented Jun 13, 2024

Hi @Dharmik2510 -- it looks like we've replicated the issue in #2064. It seems to be happening if the metadata has detected a column as 'unknown' even though it's supposed to be 'numerical'. The metadata auto-detection is not guaranteed to be accurate, so it's important to check it to ensure it accurately describes your data. A quick fix here would be to update the metadata for the column to be 'numerical'.

For more information about inspecting and updating your metadata, please see the Metadata API docs. Thanks.
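
For anyone landing here with the same error, here is a minimal sketch of the check described above. It scans a metadata dictionary for columns whose sdtype was detected as 'unknown'. The dictionary layout (tables, then columns, then sdtype) is assumed to match what a multi-table metadata object returns from `to_dict()`; the table and column names are made up for illustration, and the `update_column` call in the comment is not executed here.

```python
# Hypothetical metadata dictionary. The layout (tables -> columns -> sdtype)
# is an assumption about the multi-table metadata format; the table and
# column names are invented for this example.
metadata_dict = {
    "tables": {
        "orders": {
            "columns": {
                "order_id": {"sdtype": "id"},
                "quantity": {"sdtype": "unknown", "pii": True},
                "price": {"sdtype": "numerical"},
            }
        }
    }
}


def find_unknown_columns(metadata_dict):
    """Return (table_name, column_name) pairs whose sdtype is 'unknown'."""
    unknown = []
    for table_name, table in metadata_dict.get("tables", {}).items():
        for column_name, column in table.get("columns", {}).items():
            if column.get("sdtype") == "unknown":
                unknown.append((table_name, column_name))
    return unknown


unknown_columns = find_unknown_columns(metadata_dict)
for table_name, column_name in unknown_columns:
    print(f"{table_name}.{column_name} was detected as 'unknown'")
    # If the column really holds numbers, the quick fix suggested above would
    # look roughly like this on the metadata object (assumed call, not run here):
    # metadata.update_column(
    #     table_name=table_name, column_name=column_name, sdtype="numerical"
    # )
```

Only columns that are actually numeric should be switched to 'numerical'; any other column flagged here needs whichever sdtype really describes its data.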

@npatki added the label resolution:duplicate (This issue or pull request already exists) and removed the label resolution:cannot replicate (The problem cannot be replicated) on Jun 13, 2024
3 participants