Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup some reader/writer logic for raw forward index #11669

Merged
merged 1 commit into from
Sep 25, 2023

Conversation

Jackie-Jiang
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang commented Sep 25, 2023

  • Cleanup some redundant code
  • Revise the documentation
  • Rename some files to be more accurate
  • V4 raw index reader for multi-value column is not implemented. Fall back to V2 if V4 is configured

This is the preparation for complete V4 support

Copy link
Contributor Author

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saurabhd336 We want to handle the above 2 TODOs (add raw index reader for V4 multi-value index)

}
} else {
// TODO: Support V4 MV reader
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^

1);
_indexWriter = new VarByteChunkSVForwardIndexWriter(file, compressionType, totalDocs, numDocsPerChunk,
totalMaxLength, writerVersion);
// TODO: Support V4 MV reader
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^

@codecov-commenter
Copy link

codecov-commenter commented Sep 25, 2023

Codecov Report

Merging #11669 (1f60a3f) into master (e6655dd) will decrease coverage by 0.03%.
Report is 4 commits behind head on master.
The diff coverage is 57.84%.

@@             Coverage Diff              @@
##             master   #11669      +/-   ##
============================================
- Coverage     63.10%   63.07%   -0.03%     
+ Complexity     1121     1120       -1     
============================================
  Files          2343     2343              
  Lines        125693   125665      -28     
  Branches      19310    19312       +2     
============================================
- Hits          79315    79262      -53     
- Misses        40728    40752      +24     
- Partials       5650     5651       +1     
Flag Coverage Δ
integration <0.01% <0.00%> (ø)
integration1 <0.01% <0.00%> (ø)
integration2 0.00% <0.00%> (ø)
java-11 63.02% <57.84%> (-0.03%) ⬇️
java-17 62.93% <57.84%> (+0.01%) ⬆️
java-20 62.95% <57.84%> (+0.01%) ⬆️
temurin 63.07% <57.84%> (-0.03%) ⬇️
unittests 63.06% <57.84%> (-0.03%) ⬇️
unittests1 67.24% <57.84%> (+<0.01%) ⬆️
unittests2 14.45% <0.00%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
.../writer/impl/FixedByteChunkForwardIndexWriter.java 96.55% <100.00%> (ø)
...io/writer/impl/VarByteChunkForwardIndexWriter.java 100.00% <ø> (ø)
.../writer/impl/VarByteChunkForwardIndexWriterV4.java 95.12% <100.00%> (ø)
.../impl/fwd/SingleValueFixedByteRawIndexCreator.java 90.47% <100.00%> (ø)
...ocal/segment/index/loader/ForwardIndexHandler.java 82.85% <ø> (ø)
...x/readers/forward/BaseChunkForwardIndexReader.java 47.16% <100.00%> (ø)
...rs/forward/FixedByteChunkSVForwardIndexReader.java 100.00% <ø> (ø)
...ward/FixedBytePower2ChunkSVForwardIndexReader.java 100.00% <ø> (ø)
...ders/forward/VarByteChunkMVForwardIndexReader.java 95.23% <100.00%> (-0.09%) ⬇️
...ders/forward/VarByteChunkSVForwardIndexReader.java 100.00% <ø> (ø)
... and 11 more

... and 17 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

return dataType.getStoredType() == DataType.STRING
private static String getValidPropertyValue(String value, boolean isMax, DataType storedType) {
String valueWithinLengthLimit = getValueWithinLengthLimit(value, isMax, storedType);
return storedType.getStoredType() == DataType.STRING
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

storedType == DataType.STRING?

totalMaxLength, writerVersion);
// TODO: Support V4 MV reader
// Currently fall back to V2 for backward compatible
if (writerVersion >= VarByteChunkForwardIndexWriterV4.VERSION) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just use writerVersion == VarByteChunkForwardIndexWriterV4.VERSION to be explicit?

}
}
}
SegmentColumnarIndexCreator.addColumnMinMaxValueInfo(_segmentProperties, columnName, String.valueOf(min),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is common and can be moved out of switch case statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cannot be moved because min and max has different type for different data type

@Jackie-Jiang Jackie-Jiang merged commit aa5d65f into apache:master Sep 25, 2023
20 of 21 checks passed
@Jackie-Jiang Jackie-Jiang deleted the cleanup_raw_index branch September 25, 2023 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants