Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup logical type construction #2571

Merged
merged 1 commit into from
Dec 14, 2023
Merged

Conversation

benjaminwinger
Copy link
Collaborator

This reduces the amount of boiler-plate code that has to be written when creating logical types by moving that into types.cpp and providing more specific static methods to create the complex types, particularly since this is part of our public API and users don't really need to handle ExtraTypeInfo directly.

E.g. LogicalType::VAR_LIST(childType); instead of LogicalType(LogicalTypeID::VAR_LIST, std::make_unique<VarListTypeInfo>(childType));.

I'd also cleaned up a couple places where LogicalType was being used where LogicalTypeID could have been used instead, and removed some indirection from StructField to simplify it.

Ideally I think the ExtraTypeInfo implementations could be hidden inside types.cpp, however there are some more complex places, particularly for NODE, REL and RECURSIVE_REL, which I'd left alone.

One part I'm not sure about is the return types of the LogicalType static methods. The simple types all return unique_ptr (and the regular constructor is used in places where a unique_ptr is not wanted), but the complex types also need both, and it's a choice between:

  1. Provide a separate method for each. E.g. LogicalType::VAR_LIST_P to produce a unique_ptr (or a separate method not to, for consistency with the simple types?).
  2. Provide just a method which produces a LogicalType (what's implemented here), and if you want a unique_ptr wrap it with make_unique<LogicalType>(...) to move the object into the unique_ptr using the move constructor.
  3. Provide just a method which produces a unique_ptr<LogicalType> and use std::move(*LogicalType::VAR_LIST(...)) to move the data out of the unique_ptr if you want (unlike (2) this adds an unnecessary heap allocation).

@benjaminwinger benjaminwinger force-pushed the logical-type-cleanup branch 2 times, most recently from cfa5921 to eb92eab Compare December 11, 2023 22:46
Copy link

codecov bot commented Dec 11, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (ac33484) 93.15% compared to head (3799d81) 93.16%.

Files Patch % Lines
...rator/persistent/reader/parquet/parquet_reader.cpp 87.50% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2571   +/-   ##
=======================================
  Coverage   93.15%   93.16%           
=======================================
  Files        1027     1027           
  Lines       38414    38363   -51     
=======================================
- Hits        35785    35740   -45     
+ Misses       2629     2623    -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@benjaminwinger benjaminwinger force-pushed the logical-type-cleanup branch 3 times, most recently from b9c5021 to c216fc3 Compare December 12, 2023 00:56
@@ -123,12 +123,12 @@ class AggregateFunctionUtil {
static std::unique_ptr<AggregateFunction> getAvgFunc(const std::string name,
common::LogicalTypeID inputType, common::LogicalTypeID resultType, bool isDistinct);
static std::unique_ptr<AggregateFunction> getMinFunc(
const common::LogicalType& inputType, bool isDistinct);
common::LogicalTypeID inputType, bool isDistinct);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine in this PR. But in the long run, all functions (scalar, aggregation, table, ...) should have the same base class Function whose parameters are LogicalType rather than LogicalTypeID which means we will be using LogicalType almost everywhere for function related code path.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly it seems weird to me to pass around versions of LogicalType which have no extraTypeInfo, even for complex types, particularly when LogicalTypeID is functionally identical.

@@ -287,80 +288,113 @@ class LogicalType {
const std::vector<std::unique_ptr<LogicalType>>& types);

static std::unique_ptr<LogicalType> BOOL() {
return std::make_unique<LogicalType>(LogicalTypeID::BOOL);
return std::unique_ptr<LogicalType>(new LogicalType(LogicalTypeID::BOOL));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought make_unique is preferred compared to std::unique_ptr<>(new object). Is there a reason that you are changing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd tried making the constructors private, but only the two-argument one still is. I can revert the change for the simple types, but make_unique doesn't work with private constructors.

@andyfengHKU
Copy link
Contributor

One part I'm not sure about is the return types of the LogicalType static methods. The simple types all return unique_ptr (and the regular constructor is used in places where a unique_ptr is not wanted), but the complex types also need both, and it's a choice between:

I think we should just use unique_ptr everywhere or never use it. The former solution seems to be easier since we are already doing so in most places.

@@ -3,6 +3,15 @@

using namespace kuzu::common;

namespace kuzu::common {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper looks wired, should we provide this helper to the user? I will let @mewim comment on this.

Copy link
Collaborator Author

@benjaminwinger benjaminwinger Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just to give the C API (and there's a similar one for the java API) access to the private LogicalType constructor (using a wrapper class to make it easier to declare as friend in types.h).
Maybe we should set up per-type functions similar to the LogicalType ones for constructing types (or at least, for constructing the complex types).

@@ -239,7 +240,7 @@ class StructTypeInfo : public ExtraTypeInfo {
void serializeInternal(Serializer& serializer) const override;

private:
std::vector<std::unique_ptr<StructField>> fields;
std::vector<StructField> fields;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we remove the unique_ptr here?
We always use vector<unique_ptr> to avoid unintentional copy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There still is no possibility of an unintentional copy. StructField has no copy constructor (deleted implicitly since it contains a unique_ptr: E.g.). Given that its purpose is just for creating struct types, I don't think it really needs one either, so we can just leave it as a move-only type.

@benjaminwinger benjaminwinger force-pushed the logical-type-cleanup branch 3 times, most recently from 8aaa65d to e7bfd2b Compare December 12, 2023 19:34
@benjaminwinger
Copy link
Collaborator Author

I think we should just use unique_ptr everywhere or never use it. The former solution seems to be easier since we are already doing so in most places.

I've updated it to have them return std::unique_ptr, and also made some of the places which previously took a LogicalType without the unique_ptr use a unique_ptr to compensate.

@benjaminwinger benjaminwinger merged commit 2af6412 into master Dec 14, 2023
14 checks passed
@benjaminwinger benjaminwinger deleted the logical-type-cleanup branch December 14, 2023 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants