Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework offset column of LIST #3198

Closed
wants to merge 1 commit into from
Closed

Rework offset column of LIST #3198

wants to merge 1 commit into from

Conversation

hououou
Copy link
Collaborator

@hououou hououou commented Apr 2, 2024

see #3093

rework offset column

@hououou hououou changed the title Rework offset column Rework offset column of LIST Apr 2, 2024
@hououou hououou requested a review from ray6080 April 2, 2024 19:17
Copy link
Contributor

@ray6080 ray6080 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main problems is you are still allocating buffer/scanning/committing for main column (chunk) after moving offsets out to a offsetColumn(Chunk). so the pr is quite confusing as it is trying to move offsets out of main column, while not fully done.

I think we need a bit more work on this. let's double see if we really need a separate offsetColumn.

src/include/storage/store/list_column.h Show resolved Hide resolved
@@ -107,7 +112,9 @@ void ListColumn::scan(Transaction* transaction, node_group_idx_t nodeGroupIdx,
columnChunk->setNumValues(0);
} else {
auto listColumnChunk = ku_dynamic_cast<ColumnChunk*, ListColumnChunk*>(columnChunk);
Column::scan(transaction, nodeGroupIdx, columnChunk, startOffset, endOffset);
Column::scan(transaction, nodeGroupIdx, listColumnChunk, startOffset, endOffset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you've moved the offsets to offsetColumn, do we need to scan the list column itself?

@@ -346,6 +342,7 @@ void ListColumn::prepareCommitForChunk(Transaction* transaction, node_group_idx_
} else {
// we separate the commit into three parts: offset chunk commit, size column chunk commit,
// data column chunk
Column::prepareCommitForChunk(transaction, nodeGroupIdx, dstOffsets, chunk, startSrcOffset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, what are we committing here, as you've moved offsets to offsetColumn as I understand?

@@ -28,6 +28,8 @@ void ListDataColumnChunk::resizeBuffer(uint64_t numValues) {
ListColumnChunk::ListColumnChunk(
LogicalType dataType, uint64_t capacity, bool enableCompression, bool inMemory)
: ColumnChunk{std::move(dataType), capacity, enableCompression, true /* hasNullChunk */} {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ListColumnChunk is still allocating buffer for offsets. Can you double check?

@ray6080
Copy link
Contributor

ray6080 commented Apr 8, 2024

see #3219.

@ray6080 ray6080 closed this Apr 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants