-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Fix a memory misalignment in topk operator #15948
Conversation
@access2rohit @ChaiBapchya Please also help review. Thanks |
35da47d
to
b431a59
Compare
@@ -69,15 +69,15 @@ struct Shape { | |||
* \param idx dimension index | |||
* \return the corresponding dimension size | |||
*/ | |||
MSHADOW_XINLINE index_t &operator[](index_t idx) { | |||
MSHADOW_XINLINE index_t &operator[](int idx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you describe this change a bit more in detail ? Why is this required ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not directly fixing this bug. However, while checking the tensor struct, I noticed this data type should be int because it is indexing the dimension (which is set to int)
e658902
to
2bd2cbc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
// Temp space needed by the full sorts. | ||
size_t temp_size = std::max( | ||
mxnet::op::SortByKeyWorkspaceSize<index_t, DType, xpu>(src.Size()), | ||
mxnet::op::SortByKeyWorkspaceSize<DType, index_t, xpu>(src.Size())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should also include mxnet::op::SortByKeyWorkspaceSize<index_t, index_t, xpu>(src.Size())
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
return shape_[idx]; | ||
} | ||
/*! | ||
* \brief get corresponding index | ||
* \param idx dimension index | ||
* \return the corresponding dimension size | ||
*/ | ||
MSHADOW_XINLINE const index_t &operator[](index_t idx) const { | ||
MSHADOW_XINLINE const index_t &operator[](int idx) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other places in the same file that assumes idx have index_t type. we need to fix all of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
3a1d25a
to
040c8bc
Compare
b7f9786
to
f6f0750
Compare
Thank you for the fix, @apeforest . Will this be picked to the v1.5.x branch? |
* fix alignment * use correct type for shape index * clean up unnecessary space in topk * fix lint * add additional temp space * address reviewer comment * fix incorrect nidex type
@apeforest Could you please open a dummy PR against the v1.5.x branch to make sure the branch can properly pass the CI? |
See #15999 |
This reverts commit 42746bc.
Description
Current memory alignment in topk operator is incorrect if index_t is using int64_t. This PR fixes the potential issue. It partially fixes #15703
The PR also fixes an incorrect data type usage in mshadow/tensor.h
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Bug fix
Comments