Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DATETIMECONVERTWINDOWHOP function #11773

Merged
merged 1 commit into from
Nov 4, 2023

Conversation

alexch2000
Copy link
Contributor

@alexch2000 alexch2000 commented Oct 10, 2023

Adding DATETIMECONVERTWINDOWHOP converts the value from a column that contains an epoch timestamp into another time unit and buckets based on the given time granularity and window size.

#11775

Usage Examples

These examples are based on the Batch JSON Quick Start.
created_at_timestamp from milliseconds since epoch to seconds since epoch, bucketed to 1 hour window with 15 min granularity:

select id,
  created_at_timestamp,
  cast(created_at_timestamp AS long) AS timeInMs,
  DATETIMECONVERTWINDOWHOP(
    created_at_timestamp,
    '1:MILLISECONDS:EPOCH',
    '1:SECONDS:EPOCH',
    '15:MINUTES',
    '1:HOUR',
  ) AS windowHops
from githubEvents
WHERE id = 7044874134
id created_at_timestamp timeInMs windowHops
7044874134 2018-01-01 11:00:00.0 1514804402000 [1514804402,   1514803502, 1514802602, 1514801702]

Moving window of unique user counts per hour with 15 min granularity:

select id,
  DATETIMECONVERTWINDOWHOP(
    created_at_timestamp,
    '1:MILLISECONDS:EPOCH',
    '1:SECONDS:EPOCH',
    '15:MINUTES',
    '1:HOUR',
  ) AS windowHops,
  DISTINCTCOUNT(userId) AS uniqueUsers
from githubEvents
group by 1

Testing

The feature is tested by unit tests + manual testing

Screenshots run locally

Screenshot 2023-10-05 at 10 09 40

Screenshot 2023-10-10 at 18 21 02

@alexch2000 alexch2000 marked this pull request as ready for review October 10, 2023 16:27
@alexch2000 alexch2000 force-pushed the window_hop branch 2 times, most recently from 4ba5cf2 to ed32f13 Compare October 11, 2023 07:39
@Jackie-Jiang Jackie-Jiang added the release-notes Referenced by PRs that need attention when compiling the next release notes label Oct 12, 2023
@dario-liberman
Copy link
Contributor

@Jackie-Jiang - you can try it out yourself, works like a charm on single stage.

@codecov-commenter
Copy link

codecov-commenter commented Oct 16, 2023

Codecov Report

Merging #11773 (9353de0) into master (5d58102) will increase coverage by 0.03%.
Report is 16 commits behind head on master.
The diff coverage is 86.46%.

@@             Coverage Diff              @@
##             master   #11773      +/-   ##
============================================
+ Coverage     61.45%   61.49%   +0.03%     
+ Complexity     1147     1146       -1     
============================================
  Files          2375     2382       +7     
  Lines        128500   128633     +133     
  Branches      19846    19862      +16     
============================================
+ Hits          78974    79097     +123     
  Misses        43815    43815              
- Partials       5711     5721      +10     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 61.44% <86.46%> (+0.03%) ⬆️
java-21 61.32% <86.46%> (+0.02%) ⬆️
skip-bytebuffers-false 61.48% <86.46%> (+0.06%) ⬆️
skip-bytebuffers-true 34.76% <86.46%> (-26.53%) ⬇️
temurin 61.49% <86.46%> (+0.03%) ⬆️
unittests 61.48% <86.46%> (+0.03%) ⬆️
unittests1 46.71% <86.46%> (+0.05%) ⬆️
unittests2 27.58% <0.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...e/pinot/common/function/TransformFunctionType.java 89.15% <100.00%> (+0.10%) ⬆️
...r/transform/function/TransformFunctionFactory.java 89.71% <100.00%> (+0.05%) ⬆️
.../datetimehop/EpochToEpochWindowHopTransformer.java 100.00% <100.00%> (ø)
...er/datetimehop/EpochToSDFHopWindowTransformer.java 100.00% <100.00%> (ø)
...er/datetimehop/SDFToEpochWindowHopTransformer.java 100.00% <100.00%> (ø)
...rmer/datetimehop/SDFToSDFWindowHopTransformer.java 100.00% <100.00%> (ø)
.../datetimehop/BaseDateTimeWindowHopTransformer.java 95.65% <95.65%> (ø)
...tetimehop/DateTimeWindowHopTransformerFactory.java 68.42% <68.42%> (ø)
...nction/DateTimeConversionHopTransformFunction.java 73.80% <73.80%> (ø)

... and 13 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@alexch2000
Copy link
Contributor Author

@Jackie-Jiang Fixed unit test. and FYI I also didn't implement it as a scalar function.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general

private final long _outputGranularityMillis;
protected final long _hopWindowSizeMillis;

public BaseDateTimeWindowHopTransformer(@Nonnull DateTimeFormatSpec inputFormat,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(convention) We usually only annotate @Nullable and assume everything non-null if not annotated. Same for other classes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@alexch2000
Copy link
Contributor Author

@Jackie-Jiang do you have any other feedback regarding this diff?

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Can you help add documentation for this new feature following instructions here?

@Jackie-Jiang Jackie-Jiang merged commit baea4a2 into apache:master Nov 4, 2023
19 checks passed
@alexch2000
Copy link
Contributor Author

Looks great! Can you help add documentation for this new feature following instructions here?

Sure! Will do this week

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature release-notes Referenced by PRs that need attention when compiling the next release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants