-
Notifications
You must be signed in to change notification settings - Fork 883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix number of rows in randomly generated lists columns #15248
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the bug fix, looks good to me! :)
@@ -751,13 +752,16 @@ std::unique_ptr<cudf::column> create_random_column<cudf::list_view>(data_profile | |||
|
|||
// Generate the list column bottom-up | |||
auto list_column = std::move(leaf_column); | |||
for (int lvl = 0; lvl < dist_params.max_depth; ++lvl) { | |||
for (int lvl = dist_params.max_depth; lvl > 0; --lvl) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we moving backward/bottom up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've always generated these columns bottom-up, just the value of lvl
did not match this. lvl
is not used in the loop, outside of the new check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that makes sense.
/merge |
Description
Changing
single_level_mean
to double introduced a rounding error in the iterative process of generating random lists columns. This PR addressed the issue by enforcing the correct row count in the root lists column.Checklist