Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Migrating from joda time to java.time #27330

Closed
17 of 18 tasks
spinscale opened this issue Nov 9, 2017 · 15 comments
Closed
17 of 18 tasks

Core: Migrating from joda time to java.time #27330

spinscale opened this issue Nov 9, 2017 · 15 comments
Labels

Comments

@spinscale
Copy link
Contributor

spinscale commented Nov 9, 2017

Objective: In order to support nanosecond resolution and to remove the joda time dependency from Elasticsearch, we need to migrate a lot of code over to java.time.

This is a meta issue detailing the steps that need to be taken to migrate away from joda time and to java.time.

There are a few things to keep in mind and discuss before outlining the single steps

Retaining our gazillion date formats

Do we want to retain the tons of our named date formats, named very well like strictOrdinalDateTimeNoMillis or strictWeekDateTimeNoMillis (also we support snake and camel case for each of those!)? I think we could built a layer that is still able to read those, but then maybe convert it to formatted dates like xxxx-'W'ww-e'T'HH:mm:ssZZ (just an example). This way we could get rid of this mapping entirely or at least just maintain it for a few examples like epoch_millis, epoch_second and dateOptionalTime as the text representation are nowhere near as readable as the date time formats, once internalized.

You can find the full list in the Joda class

BWC breaking

Some classes used by builders to create queries are using joda time classes. As we move towards the REST client and away from the Transport Client this will be less of an BWC issue.

Steps

@spinscale spinscale added :Core/Infra/Core Core issues without another label >breaking Meta labels Nov 9, 2017
@clintongormley
Copy link

This way we could get rid of this mapping entirely or at least just maintain it for a few examples like epoch_millis, epoch_second and dateOptionalTime as the text representation are nowhere near as readable as the date time formats, once internalized.

++

We can deprecate their use in 6.x and say ("please replace with YYYY..."), then on upgrade to 7.0, we can auto-upgrade the mappings to use the equivalent date pattern format.

spinscale added a commit to spinscale/elasticsearch that referenced this issue Dec 19, 2017
StreamInput/Streamoutput now use java.time when a timezone is serialized
over the wire. There is one special case that needs to be taken care of
and that is when java.time is sending 'Z' instead of 'UTC', which joda
time does not understand.

Relates elastic#27330
@spinscale
Copy link
Contributor Author

some more discussion about BWC.

Painless

Right now calling doc['foo'].value returns a joda DateTime object (due to ScriptDocValues in core returning one), which makes it non-trivial to just convert to java time, as many users will have put this kind of code in their scripts to deal with dates.

A possible work around for this would be to have a doc['foo'].date_value object which returns a java time.

Client side runtime casting of bucket keys

InternalDateHistogram.Bucket.getKey() implements the Bucket interface, which returns Object for the getKey() method. In the case of the date histogram a joda DateTime is returned, which clients might just be casting to DateTime object, because it has always been like that. This might only be uncovered in runtime, and not in compile time for people using Java.

We need to check if the high level REST client is affected by this as well.

Different symbols used in date formatting

There are a couple of differences when date time formatters are used, according to the java docs. joda time has a century of era symbol, using the C character. This has been removed fully from java time. Also the YYYY formatter is year of era in joda time, but week based year in java time (need to check how much of an impact this really is).

One idea was to have a setting which allows to set which date implementation would be used, but this would mean to not have support rolling upgrades as having to have a hard move from one implementation to the other, so we need to support both time libraries at least in scripting.

@spinscale
Copy link
Contributor Author

There are some differences in TZ handling we have to cater for. Jodatime has 593 tz ids, java time 600. java time adds a few new ones (starting with SystemV which seem not to be used at all), and those can be ignored. However, java time removes EST, GMT+0, GMT-0, HST, MST, ROC.

The general recommendation seems to be to use either offsets, like UTC-08:00 or those region based names like America/New York rather than the above as those are ambiguous.

We might need to add deprecation warnings for the above timezones.

spinscale added a commit to spinscale/elasticsearch that referenced this issue Jan 22, 2018
- StreamInput/StreamOutput now use java.time to interpret date times.
- XContentBuilder uses java.time for its formatters.
- Cutover to using java time zones in MappedFieldType.rangeQuery (and
  thus RangeFieldMapper, DateFieldMapper, SimpleMappedFieldType)
- QueryStringQueryBuilder and RangeQueryBuilder uses java.time time zone
- A few tests were moved to java.time where simply any time was needed

Relates elastic#27330
@jasontedor
Copy link
Member

Relates #10005

@rjernst
Copy link
Member

rjernst commented Aug 15, 2018

I've been investigating how we might add support for a bwc layer within 6.x so that using existing joda format strings is backed by java time. This looks completely possible to do with a sysprop that controls whether a mapping layer on parsing formats is enabled, except for the lack of century-of-era support @spinscale mentioned in an earlier comment. This raises the question of whether this extremely esoteric option could be broken in a minor (since there is no migration path anyways, and IMO the likelihood of any user out there using it is almost non-existent). The alternative to this is to build an abstraction layer internally for all datetime uses, so that java/joda use can be swapped by the same sysprop, determine which underlying implementation of the abstraction to use (this would rely on replacing all uses of parsing/formatting on the abstraction). I'd like others opinions on the tradeoffs between these two approaches.

@rjernst
Copy link
Member

rjernst commented Aug 15, 2018

Note: A third alternative that I do not see as viable is implementing the century-of-era in java time. While it is possible, it would require quite a bit of code for something that I don't think anyone is actually using.

@spinscale
Copy link
Contributor Author

to add some more background: here is a link what century of era refers to in an ISO chronology, i.e. 19th century for all the years from 1900 till 1999: http://joda-time.sourceforge.net/field.html#CenturyOfEra_and_YearOfCentury - which basically leaves you with a super broad/inexact timestamp

My unscientific feeling also says, there are not a lot of users of this.

@MovGP0
Copy link

MovGP0 commented Aug 23, 2018

Support for JodaTime and NodaTime time formats is really important for me.

For all business applications I have worked on, it is way more important to have validateable data types and algorithmic correctness, than issues like having very „high precision“, „better performance“, or „no dependency on external libraries“.

Having additional datatypes that support higher precision is ok with me. But I would not invest the work in reinventing the wheel in cases where the JodaTime library works fine. It might make more sense to implement the high-precision time datatypes in the JodaTime library instead.

@rjernst
Copy link
Member

rjernst commented Aug 27, 2018

@MovGP0 The joda time project encourages users to migrate to the the java time api. From a maintainability perspective, we are moving to java time to both eliminate an external dependency and ensure bugs are fixed. From a format support perspective, joda time and java time are very similar. There are a only a handful of format specifiers that changed, and the year of century mentioned above is the only one that does not exist in java time. Do you have a use for it specifically, or are you generally concerned with needing to update your formats?

@MovGP0
Copy link

MovGP0 commented Sep 2, 2018

@rjernst I'm mostly concert about reinventing the wheel (aka. 'not invented here syndrome'). Usually it's better to just use and contribute to a library where possible, rather than developing something as complex as handling international time calculations yourself.

Not sure how far the Java Time API has evolved to handle Joda Time's use cases.

@jasontedor
Copy link
Member

jasontedor commented Sep 2, 2018

@MovGP0 As @rjernst mentioned, Joda is deprecated, no longer developed, and users are encouraged to move to java.time:

Note that from Java SE 8 onwards, users are asked to migrate to java.time (JSR-310) - a core part of the JDK which replaces this project.

Note that Joda-Time is considered to be a largely “finished” project. No major enhancements are planned. If using Java SE 8, please migrate to java.time (JSR-310).

@rjernst
Copy link
Member

rjernst commented Sep 7, 2018

We have been looking carefully at how to provide a migration path and deprecation warnings for those using joda time specific behavior. There are 2 types of places within Elasticsearch APIs the can be found: scripting and anywhere date formats are used.

For scripting, we previously attempted to provide migration through #31441. However, that turned out to be problematic with rolling upgrades. The new approach agreed on is to change the scripting api for date fields to return a ZonedDateTime compatible object, but with augmentation methods for the missing joda time methods, which will trigger deprecation warnings when used. This is implemented in #31441.

For date formats, the problem is a few characters for which the identifier has changed, or no longer exists in java time. For these the plan is to provide a special prefix that may be used for date formats that want to force the use of java time format specifiers (probably 8:). For most users, formats are compatible between the two, so nothing will change. For those using specifiers that have changed meanings or no longer exist, a deprecation warning will be emitted when parsing the format. The special prefix to force the new formats will continue to exist through 7.x to provide time for users affected by this to switch back to a format string without the prefix.

@spinscale
Copy link
Contributor Author

Two more BWC things Dave mentioned to me:

  1. Java 8 and Java9/10 do have different timezone handling. When creating a parser with a time zone in Java 8, it will overwrite the timezone parsed in the text input. This is fixed in java 9/10, but may lead to different results in java8 when used in combination with formatters. We need to verify this for the ingest processors
public static void main(String[] argv) {
    DateTimeFormatter formatter = new DateTimeFormatterBuilder().appendPattern("yyyy-MM-dd'T'HH:mm:ss,SSSXX").toFormatter(Locale.ROOT);
    String date = "2018-05-15T16:14:56,374Z";

    System.out.println(formatter.withZone(ZoneId.of("Europe/Berlin")).parse(date));
    System.out.println(formatter.withZone(ZoneOffset.UTC).parse(date));
    System.out.println(formatter.parse(date));

    System.out.println(Instant.from(formatter.withZone(ZoneId.of("Europe/Berlin")).parse(date)));
    System.out.println(Instant.from(formatter.withZone(ZoneOffset.UTC).parse(date)));
    System.out.println(Instant.from(formatter.parse(date)));
  }

Under java 10 this returns

{InstantSeconds=1526400896, OffsetSeconds=0},ISO,Europe/Berlin resolved to 2018-05-15T16:14:56.374
{InstantSeconds=1526400896, OffsetSeconds=0},ISO,Z resolved to 2018-05-15T16:14:56.374
{InstantSeconds=1526400896, OffsetSeconds=0},ISO resolved to 2018-05-15T16:14:56.374
2018-05-15T16:14:56.374Z
2018-05-15T16:14:56.374Z
2018-05-15T16:14:56.374Z

Under java 8 this returns

{OffsetSeconds=0, InstantSeconds=1526393696},ISO,Europe/Berlin resolved to 2018-05-15T16:14:56.374
{OffsetSeconds=0, InstantSeconds=1526400896},ISO,Z resolved to 2018-05-15T16:14:56.374
{OffsetSeconds=0, InstantSeconds=1526400896},ISO resolved to 2018-05-15T16:14:56.374
2018-05-15T14:14:56.374Z
2018-05-15T16:14:56.374Z
2018-05-15T16:14:56.374Z

The third last line is the important one. One time it is 14:14 and one time 16:14

  1. Appending a fraction using .appendFraction(MILLI_OF_SECOND, 3, 3, true) used to work with either a dot or a localized decimal point symbol in joda time, but this requires an explicit configuration in java time. This might be an issue for ingest pipelines as well.

@robertdumitrescu
Copy link

What's the plan for solving this issue in terms of releases timeline?

pgomulka added a commit that referenced this issue Feb 4, 2019
monitoring plugin migration from joda to java.time

refers #27330
pgomulka added a commit that referenced this issue Feb 4, 2019
part of the migrating joda time work. Migrating watcher plugin to use JDK's java-time

refers #27330
pgomulka added a commit that referenced this issue Feb 5, 2019
part of the migrating joda time work.
refactoring x-pack plugins usages of joda to java-time
refers #27330
pgomulka added a commit that referenced this issue Feb 6, 2019
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates #27330
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Feb 8, 2019
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates elastic#27330
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Feb 8, 2019
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates elastic#27330
pgomulka added a commit that referenced this issue Feb 8, 2019
…8505)

When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates #27330
backport#38505
pgomulka added a commit that referenced this issue Feb 11, 2019
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates #27330
BackPort #38505
@jasontedor jasontedor reopened this Apr 6, 2019
@pgomulka
Copy link
Contributor

pgomulka commented Oct 4, 2019

@jasontedor I think we can close this as we are fully migrated. We only have bugfixes now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants