HMA Synthesizer's scale parameter doesn't work for small values
#2045
Labels: bug (Something isn't working), feature:sampling (Related to generating synthetic data after a model is built)
Environment Details
Please indicate the following details about the environment in which you found the bug:
Error Description
Sampling from HMA Synthesizer using a small scale value can result in an error. This seems to happen when scale * the root table's row count results in a fractional number of rows (e.g. 0.01 * 10 rows = 0.1 rows requested).
Workaround
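Before sampling, you can check whether a given scale falls into the failing range. The following is a minimal sketch in plain Python; the helper names requested_root_rows and min_safe_scale are hypothetical and are not part of SDV, and the check only mirrors the arithmetic described in this issue.

```python
def requested_root_rows(scale: float, n_rows: int) -> float:
    """Raw (unrounded) number of root rows implied by a scale value."""
    return scale * n_rows


def min_safe_scale(n_rows: int) -> float:
    """Scale at which scale * n_rows reaches 1 row (assumed threshold)."""
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    return 1 / n_rows


# 10 root rows with scale=0.01 asks for a fraction of a row.
print(requested_root_rows(0.01, 10))

# For a 100-row root table, the threshold scale is 1 / 100.
print(min_safe_scale(100))
```

If requested_root_rows returns a value below 1, the scale is in the range where this issue reports the error.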
Until this is fixed, we recommend increasing the scale parameter to a higher value. For example, if you have 100 rows in your parent table, we recommend using a scale value greater than 0.01 (so you get at least 1 row back).
Proposed Solution
Neha's proposal in the original issue was to set the minimum size of root tables used for sampling to 1 row. So even if scale is very low and the resulting requested row count is under 1, the user will still receive 1 row from the root (parent) table.
Additionally, she pointed out that cardinality won't be accurate in many cases if scale is this low. So we should also show a warning and encourage the user to increase the scale parameter.
Steps to reproduce
Running this code throws a KeyError:
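The proposed solution (a 1-row minimum for root tables, plus a warning) can be sketched in plain Python without SDV. This is an illustration under stated assumptions, not SDV's implementation: the function name root_rows_to_sample is hypothetical, the warning wording is invented for the example, and rounding with round() for counts of 1 or more is an assumption.

```python
import warnings


def root_rows_to_sample(scale: float, n_rows: int) -> int:
    """Return the number of root rows to sample, never fewer than 1."""
    requested = scale * n_rows
    if requested < 1:
        # Hypothetical warning text; the real message may differ.
        warnings.warn(
            f"scale={scale} requests {requested:.2f} rows from a root table "
            f"with {n_rows} rows; sampling 1 row instead. Cardinality may "
            "not be accurate at this scale; consider increasing scale.",
            UserWarning,
        )
        return 1
    # Assumption: fractional counts of 1 or more are rounded to an integer.
    return round(requested)


# 0.01 * 10 = 0.1 rows requested, so the count is raised to 1 with a warning.
print(root_rows_to_sample(0.01, 10))
```

Raising the count to 1 keeps sampling from failing, while the warning tells the user that child-table cardinality is unlikely to be representative at such a low scale.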