Spark Connector, support for TIMESTAMP and BOOLEAN fields #8825
Change
The Spark Connector doesn't support the TIMESTAMP and BOOLEAN field types, which were introduced to Pinot after the connector was added. I'm adding mappings for both the singular and array variants of these types.
New field type mappings:
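The mapping list itself didn't survive formatting here. As a rough, hypothetical sketch only (type names are illustrative, not the connector's actual code), assuming Pinot TIMESTAMP values surface as epoch-millisecond longs, the singular/array mapping might look like:

```java
import java.util.Map;

// Hypothetical sketch of a Pinot -> Spark type mapping, NOT the connector's
// real code. Spark type names are plain strings so this compiles without
// Spark on the classpath.
public class TypeMappingSketch {
    static final Map<String, String> SINGLE = Map.of(
        "TIMESTAMP", "LongType",      // epoch millis carried as a long
        "BOOLEAN",   "BooleanType");
    static final Map<String, String> ARRAY = Map.of(
        "TIMESTAMP", "ArrayType(LongType)",
        "BOOLEAN",   "ArrayType(BooleanType)");

    static String toSparkType(String pinotType, boolean singleValue) {
        String spark = (singleValue ? SINGLE : ARRAY).get(pinotType);
        if (spark == null) {
            // Mirrors the pre-change behavior described below: unknown
            // Pinot field types caused an exception.
            throw new IllegalStateException("Unsupported Pinot type: " + pinotType);
        }
        return spark;
    }
}
```

The array variants simply wrap the singular Spark type, which is why both variations are added together.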
Discussion
Spark also supports a TimestampType, which is backed by a Long and stores microseconds since the epoch, as explained in the Spark documentation. It could have been a better choice for the Pinot
TIMESTAMP
field; however, I had a hard time correctly translating the Pinot value to microseconds for a TIMESTAMP
column. I'm open to suggestions here, and would like to know if there is an easy way.

Testing
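One way to sanity-check the millisecond-to-microsecond translation discussed above (a minimal sketch, assuming Pinot TIMESTAMP values are epoch milliseconds in a long, and that Spark's internal TimestampType representation is epoch microseconds; the class and method names are hypothetical):

```java
// Hypothetical helper, not part of the connector: converts an epoch-millis
// timestamp (as Pinot stores TIMESTAMP) to epoch micros (as Spark's
// Catalyst TimestampType represents timestamps internally).
public class TimestampConversion {
    static long millisToMicros(long epochMillis) {
        // multiplyExact throws on overflow instead of silently wrapping
        return Math.multiplyExact(epochMillis, 1000L);
    }

    public static void main(String[] args) {
        long millis = 1_656_000_000_000L; // an epoch-millis value
        System.out.println(millisToMicros(millis)); // prints 1656000000000000
    }
}
```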
Backwards Compatibility
No previous behavior is broken by the introduction of these fields; previously, the connector would throw an exception when it came across these unknown Pinot field types.
bugfix
feature