Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-42126: [C++] Move TakeXXX free functions into TakeMetaFunction and make them private #42127

Merged
merged 7 commits into from
Jun 13, 2024

Conversation

felipecrv
Copy link
Contributor

@felipecrv felipecrv commented Jun 12, 2024

Rationale for this change

Move TakeXXX free functions into TakeMetaFunction and make them private

What changes are included in this PR?

Code move and some small refactorings in preparation for #41700.

Are these changes tested?

By existing tests.

Copy link

⚠️ GitHub issue #42126 has been automatically assigned in GitHub to PR creator.

@felipecrv felipecrv changed the title GH-42126: [C++] GH-42126: [C++] Move TakeXXX free functions into TakeMetaFunction and make them private Jun 12, 2024
@felipecrv felipecrv requested a review from pitrou June 12, 2024 19:08
@felipecrv
Copy link
Contributor Author

@pitrou I took some commits from #41700 to here to reduce the code churn in there.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this is only moving code around? In which case LGTM (but a question below).

TakeAAA(batch.column(j)->data(), indices.data(), options, ctx));
columns[j] = MakeArray(col_data);
static Result<std::shared_ptr<ChunkedArray>> TakeCAC(
const std::shared_ptr<ChunkedArray>& values, const Array& indices,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious: is there a particular reason for taking the first arg as shared_ptr-const-ref, and the second only as value-const-ref?

Copy link
Contributor Author

@felipecrv felipecrv Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Driven by the signatures of the Datum constructors [1]. Originally, they were almost all by simple &, then I tried making all of them shared_ptr& but that got overwhelming, so I settled on this mixed combination driven by the need (Array holds shared_ptr<ArrayData> which is what Datum cares about).

[1]

  /// \brief Construct from a Scalar
  Datum(std::shared_ptr<Scalar> value)  // NOLINT implicit conversion
      : value(std::move(value)) {}

  /// \brief Construct from an ArrayData
  Datum(std::shared_ptr<ArrayData> value)  // NOLINT implicit conversion
      : value(std::move(value)) {}

  /// \brief Construct from an ArrayData
  Datum(ArrayData arg)  // NOLINT implicit conversion
      : value(std::make_shared<ArrayData>(std::move(arg))) {}

  /// \brief Construct from an Array
  Datum(const Array& value);  // NOLINT implicit conversion

  /// \brief Construct from an Array
  Datum(const std::shared_ptr<Array>& value);  // NOLINT implicit conversion

  /// \brief Construct from a ChunkedArray
  Datum(std::shared_ptr<ChunkedArray> value);  // NOLINT implicit conversion

  /// \brief Construct from a RecordBatch
  Datum(std::shared_ptr<RecordBatch> value);  // NOLINT implicit conversion

  /// \brief Construct from a Table
  Datum(std::shared_ptr<Table> value);  // NOLINT implicit conversion

  /// \brief Construct from a ChunkedArray.
  ///
  /// This can be expensive, prefer the shared_ptr<ChunkedArray> constructor
  explicit Datum(const ChunkedArray& value);

  /// \brief Construct from a RecordBatch.
  ///
  /// This can be expensive, prefer the shared_ptr<RecordBatch> constructor
  explicit Datum(const RecordBatch& value);

  /// \brief Construct from a Table.
  ///
  /// This can be expensive, prefer the shared_ptr<Table> constructor
  explicit Datum(const Table& value);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to simplify these signatures further in the next PR as less will actually have to be handled by the "take" MetaFunction as "array_take" learns to handle chunked arrays by itself.

@felipecrv
Copy link
Contributor Author

I'm assuming this is only moving code around? In which case LGTM (but a question below).

I have the code moves separated from the refactorings in the commits.

Screenshot 2024-06-13 at 13 15 22

@pitrou pitrou requested a review from bkietz June 13, 2024 16:19
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Jun 13, 2024
@github-actions github-actions bot added awaiting change review Awaiting change review awaiting changes Awaiting changes and removed awaiting changes Awaiting changes awaiting change review Awaiting change review labels Jun 13, 2024
@pitrou
Copy link
Member

pitrou commented Jun 13, 2024

I'm away next week, feel free to merge without waiting for me or @bkietz .

@felipecrv
Copy link
Contributor Author

I'm away next week, feel free to merge without waiting for me or @bkietz .

I will merge today and any concern can be addressed in #41700.

@felipecrv felipecrv merged commit 7d84c1e into apache:main Jun 13, 2024
35 of 39 checks passed
@felipecrv felipecrv removed the awaiting changes Awaiting changes label Jun 13, 2024
@felipecrv felipecrv deleted the private_take_xxx branch June 13, 2024 21:52
Copy link

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 7d84c1e.

There were 3 benchmark results indicating a performance regression:

The full Conbench report has more details. It also includes information about 18 possible false positives for unstable benchmarks that are known to sometimes produce them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants