Check for `multipleOf` overflow #438

Zac-HD · 2020-10-03T04:06:07Z

The spec says that instances are valid if "dividing by this value results in an integer".

This allows integers to be invalid for multipleOf: 0.5 if float division overflows to infinity (a non-integer); alternatively implementations may choose to implement logic which defines all integers as multiples of 0.5. Either way, however, implementations must not raise an error due to the overflow of a legal value against a legal schema.

Reference: python-jsonschema/jsonschema#743, cc @Julian

The spec says that instances are valid if "dividing by this value results in an integer". This allows integers to be invalid for `multipleOf: 0.5` if float division overflows to infinity (a non-integer); alternatively implementations may choose to implement logic which defines all integers as multiples of 0.5. Either way, however, implementations must not raise an error due to the overflow of a legal value against a legal schema.

Julian · 2020-10-04T15:57:58Z

Cool. Think these lgtm, thanks!

(To add some extra color for folks -- the added required test seems consistent with the spec as well -- implementations with only IEEE floats will get inf, and so can indeed distinguish the overflow).

karenetheridge · 2020-10-16T16:45:25Z

implementations may choose to implement logic which defines all integers as multiples of 0.5

I'm curious as to how you might expect to see this in code. Would 0.5 be special-cased somehow?

Julian · 2020-10-16T19:50:01Z

Probably more realistically by doing something like what we ended up doing (falling back to non-float division on overflow). Which yeah isn't required, but the non-erroring is.

Zac-HD · 2020-10-16T22:42:52Z

That's the general solution - you could also check whether 1/multipleOf is an integer, and if so drop it from integer schemas.

karenetheridge · 2020-10-22T18:15:07Z

This is not a valid test on systems built with longdoubles (where 16 bytes are allocated to floats) -- as the inputs here are not big enough to cause an overflow. You need numbers way way larger to be sure to tickle an overflow (on my system, 'Inf' is the result in that case), and then you'll clash with the optional/bignum.json tests which suggest that implementations should also be able to handle very large numbers, which means that this edge case will be even harder to reproduce.

(breadcrumb for self: an example of a failure on a smoker system is http://www.cpantesters.org/cpan/report/ee0e51be-1384-11eb-a9f9-f9eb91108de4 )

11:11 < ether> BinGOs: can you give me  perl -wle'print 1e308 / 0.123456789'   and perl -wle'print 1e308 / 0.5' also please?
11:12 < BinGOs> 8.10000007371000067e+308 and 2e+308 respectively

Julian · 2020-10-22T19:02:31Z

Are you referring to the optional ones or the required one?

The required one passes there even in your example because indeed the result is nondivisible right?

You're saying the optional test conflicts with the other optional test?

karenetheridge · 2020-10-22T19:30:45Z

I'm referring to the required one. On a longdouble system, the operation does not overflow -- the quotient is 8.10000007371000067e+308 which is an integer, so the evaluation passes even though the test says it should fail.

Julian · 2020-10-22T19:35:55Z

Ah sorry I can't read... ok think I follow.

Probably we can then pick a value that doesn't produce an integer with that much precision either instead of the one we have, and then that should suit both possibilities?

ssilverman · 2020-10-22T19:37:44Z

In this section: https://json-schema.org/draft/2019-09/json-schema-validation.html#rfc.section.4.2
it states, "numeric instances processed by JSON Schema can be arbitrarily large and/or have an arbitrarily long decimal part, regardless of the ability of the underlying programming language to deal with such data."

To me, this means that 1e308 is never a multiple of 0.123456789, despite the fact that there's loss of precision given some language's implementation.

Might it be useful to add a paragraph in some future draft dealing with "implementations that truncate", describing expected behaviour and limitations? Currently, the above spec reference doesn't mention any MUST or SHOULD words when talking about this.

ssilverman · 2020-10-22T19:46:26Z

One of the reasons the Unicode-based regex tests were moved to optional/ was because .NET doesn't support them (I disagreed with that). I'm not in favour of moving things to optional/ based on the fact that implementations don't support things.

Having said that, floating-point handling is probably even more fundamental than regexes, but still, for arbitrary-precision-capable implementations, these tests are still valid. I don't agree with optional or missing tests just to avoid specific implementation details. For arbitrary-precision-capable implementations, these tests shouldn't just be present, they should be non-optional.

Further to that, why isn't having a 1e308//0.123456789 test a good thing? If we're only testing values that don't overflow, how will perfectly valid arbitrary-precision implementations that support ranges beyond the overflow range of most byte-limited implementations get tested?

These are perfectly valid, spec-conforming, non-optional tests. Unless, is arbitrary-precision support optional? If not, these tests need to be non-optional.

karenetheridge · 2020-10-22T19:50:35Z

Probably we can then pick a value that doesn't produce an integer with that much precision either instead of the one we have, and then that should suit both possibilities?

Possibly, but I don't have the time right now to compile a new interpreter with longdouble enabled, and then search for a suitable pair of numbers that do not divide evenly.. and then test them on systems with other float sizes to see that they divide as expected also.

karenetheridge · 2020-10-22T19:52:52Z

Further to that, why isn't having a 1e308//0.123456789 test a good thing?

Testing how a JSON Schema implementation handles overflow is fine (it should return an invalid result, rather than exploding or returning true). The problem is creating a test case that is guaranteed to produce an overflow.

Julian · 2020-10-22T20:35:52Z

Can you maybe open an issue to start and we can figure out whatever the right solution should be? (Or propose one if you have one -- ideally though we should end up with something "universal" that won't be forgotten as a required test, but if we can't think of how to do so we can leave that as an open issue) -- but yeah open an issue for the test being unpassable if you have longer floats?

Julian · 2020-10-22T20:38:50Z

We may even be able to bribe Zac-HD :D -- maybe hypothesis could find us a pair of numbers that return non-integers for both 8 and 16 byte floats or something (and yeah sorry I'm still mid-$WORK so possibly my brain isn't thinking straight still on an obvious solution so feel free to suggest one if anyone has one)

ssilverman · 2020-10-22T20:41:35Z

The problem is creating a test case that is guaranteed to produce an overflow.

What's wrong with 1e308//0.123456789? Maybe I'm missing something...

Julian · 2020-10-22T21:09:20Z

Unless, is arbitrary-precision support optional?

Arbitrary precision is optional yeah. You are allowed to use IEEE floats or doubles (and presumably also this -- 16 byte floats)

Zac-HD · 2020-10-24T10:24:07Z

The problem: this test passes (i.e. overflows to non-integer value infinity) at 64bit precision, and passes (i.e. gives a non-integer-valued finite result) at arbitrary precision. I also get np.longdouble("1e308") / np.longdouble("0.123456789") == np.inf, but on my system I'm pretty sure that's only 80-bit precision.

So the problem appears to be that rounding specifically in 128-bit floats leads to an integer output from this test case.

I think we can solve this just by making the quotient e.g. 1e308 / 0.12345678901234567890123456789 which works just fine at 64bit float and arbitrary precision. @karenetheridge, can you check this with 128-bit floats for me and I'll open a follow-up PR?

karenetheridge · 2020-10-27T16:29:28Z

That calculation doesn't overflow, but it still looks like an even multipleOf (the quotient is an integer):

$ perl -wle'my $quotient = 1e308 / 0.12345678901234567890123456789; print $quotient; print $quotient == int($quotient) ? "true" : "false"'
8.10000007290000066e+308
true

$ perl -V
Summary of my perl5 (revision 5 version 32 subversion 0) configuration:
   
  Platform:
    osname=darwin
    osvers=17.7.0
    archname=darwin-ld-2level
    uname='darwin aquavit 17.7.0 darwin kernel version 17.7.0: mon aug 31 22:11:23 pdt 2020; root:xnu-4570.71.82.6~1release_x86_64 x86_64 '
    config_args='-de -Dprefix=/Users/ether/perl5/perlbrew/perls/32.0_longdouble -Dman1dir=none -Dman3dir=none -Duselongdouble -Aeval:scriptdir=/Users/ether/perl5/perlbrew/perls/32.0_longdouble/bin'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=define
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.13 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -I/opt/local/include -DPERL_USE_SAFE_PUTENV'
    optimize='-O3'
    cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.13 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -I/opt/local/include'
    ccversion=''
    gccversion='4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='long double'
    nvsize=16
    Off_t='off_t'
    lseeksize=8
    alignbytes=16
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -mmacosx-version-min=10.13 -fstack-protector-strong -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /usr/lib /opt/local/lib
    libs=-lpthread -lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-lpthread -ldl -lm -lutil -lc
    libc=
    so=dylib
    useshrplib=false
    libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=bundle
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags=' -mmacosx-version-min=10.13 -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_TIMES
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_LONG_DOUBLE
    USE_PERLIO
    USE_PERL_ATOF
  Built under darwin
  Compiled at Oct 27 2020 09:16:39
  %ENV:
    PERLBREW_HOME="/Users/ether/.perlbrew"
    PERLBREW_MANPATH="/Users/ether/perl5/perlbrew/perls/32.0_longdouble/man"
    PERLBREW_PATH="/Users/ether/perl5/perlbrew/bin:/Users/ether/perl5/perlbrew/perls/32.0_longdouble/bin"
    PERLBREW_PERL="32.0_longdouble"
    PERLBREW_ROOT="/Users/ether/perl5/perlbrew"
    PERLBREW_SHELLRC_VERSION="0.89"
    PERLBREW_VERSION="0.89"
    PERLDOC="-oman"
    PERLDOC_PAGER="less -sicMr"
    PERL_AUTOINSTALL_PREFER_CPAN="1"
    PERL_INSTALL_QUIET="1"
    PERL_USE_UNSAFE_INC="0"
  @INC:
    /Users/ether/perl5/perlbrew/perls/32.0_longdouble/lib/site_perl/5.32.0/darwin-ld-2level
    /Users/ether/perl5/perlbrew/perls/32.0_longdouble/lib/site_perl/5.32.0
    /Users/ether/perl5/perlbrew/perls/32.0_longdouble/lib/5.32.0/darwin-ld-2level
    /Users/ether/perl5/perlbrew/perls/32.0_longdouble/lib/5.32.0

karenetheridge · 2020-10-29T17:32:07Z

You can see the same problem with a simpler example:

$ perl -wle'my $quotient = 1e308 / 3; print $quotient; print $quotient == int($quotient) ? "true" : "false"'
3.33333333333333333e+307
true

The divisor is so large that even though the quotient is not an integer, it appears as if it an integer because of how floats are expressed internally. I do not see any way of correctly checking the integer status here (and the same situation will occur with smaller divisors on architectures with smaller floats).

e.g. on a 64-bit float implementation:

$ perl -wle'my $quotient = 1e10 / 3; print $quotient; print $quotient == int($quotient) ? "true" : "false"'
3333333333.33333
false

$ perl -wle'my $quotient = 1e100 / 3; print $quotient; print $quotient == int($quotient) ? "true" : "false"'
3.33333333333333e+99
true

edit, 2021-12-30: I was wrong about most of what I said here :) see #534 for a more current discussion.

karenetheridge · 2021-06-12T05:15:29Z

This is not a valid test on systems built with longdoubles (where 16 bytes are allocated to floats) -- as the inputs here are not big enough to cause an overflow.

The test being referred to here is multipleOf.json - "invalid instance should not raise error when float division = inf" - "always invalid, but naive implementations may raise an overflow error". It also fails when 12 bytes are allocated to floats.

http://www.cpantesters.org/cpan/report/d1a4dd18-ca7c-11eb-932e-240466f710c6
http://www.cpantesters.org/cpan/report/2464b1aa-ca7c-11eb-8dc8-abf665f710c6
(on these test systems, doublekind=3, longdblkind=3)

Fixes overflow error exposed by: json-schema-org/JSON-Schema-Test-Suite#438 ``` FloatDomainError: Infinity /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:367:in `floor' /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:367:in `validate_numeric' /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:378:in `validate_number' /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:277:in `validate_type' /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:198:in `validate_instance' ```

Floats aren't accurate enough here. Fixes: #100 Also addresses overflow exposed by: json-schema-org/JSON-Schema-Test-Suite#438 ``` FloatDomainError: Infinity /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:367:in `floor' /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:367:in `validate_numeric' /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:378:in `validate_number' /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:277:in `validate_type' /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:198:in `validate_instance' ```

Zac-HD force-pushed the overflow-with-multipleOf branch from a2a52b1 to c5ba4ba Compare October 3, 2020 04:11

Julian merged commit 96742ba into json-schema-org:master Oct 4, 2020

Zac-HD deleted the overflow-with-multipleOf branch October 4, 2020 23:26

Zac-HD mentioned this pull request Oct 4, 2020

Handle multipleOf overflow python-jsonschema/jsonschema#746

Merged

This comment has been minimized.

Sign in to view

karenetheridge mentioned this pull request Dec 26, 2021

multipleOf test is invalid #534

Closed

jimmylewis mentioned this pull request Dec 28, 2021

multipleOf contains incorrectly optional (bignum dependent) tests #536

Open

davishmcclurg mentioned this pull request Apr 1, 2022

Use BigDecimal for multiple_of check davishmcclurg/json_schemer#102

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Check for `multipleOf` overflow #438

Check for `multipleOf` overflow #438

Zac-HD commented Oct 3, 2020

Julian commented Oct 4, 2020

karenetheridge commented Oct 16, 2020

Julian commented Oct 16, 2020

Zac-HD commented Oct 16, 2020

karenetheridge commented Oct 22, 2020 •

edited

Loading

Julian commented Oct 22, 2020

karenetheridge commented Oct 22, 2020 •

edited

Loading

Julian commented Oct 22, 2020

ssilverman commented Oct 22, 2020

ssilverman commented Oct 22, 2020 •

edited

Loading

karenetheridge commented Oct 22, 2020

karenetheridge commented Oct 22, 2020

Julian commented Oct 22, 2020

Julian commented Oct 22, 2020

ssilverman commented Oct 22, 2020

Julian commented Oct 22, 2020

This comment has been minimized.

Zac-HD commented Oct 24, 2020

karenetheridge commented Oct 27, 2020 •

edited

Loading

karenetheridge commented Oct 29, 2020 •

edited

Loading

karenetheridge commented Jun 12, 2021

Check for multipleOf overflow #438

Check for multipleOf overflow #438

Conversation

Zac-HD commented Oct 3, 2020

Julian commented Oct 4, 2020

karenetheridge commented Oct 16, 2020

Julian commented Oct 16, 2020

Zac-HD commented Oct 16, 2020

karenetheridge commented Oct 22, 2020 • edited Loading

Julian commented Oct 22, 2020

karenetheridge commented Oct 22, 2020 • edited Loading

Julian commented Oct 22, 2020

ssilverman commented Oct 22, 2020

ssilverman commented Oct 22, 2020 • edited Loading

karenetheridge commented Oct 22, 2020

karenetheridge commented Oct 22, 2020

Julian commented Oct 22, 2020

Julian commented Oct 22, 2020

ssilverman commented Oct 22, 2020

Julian commented Oct 22, 2020

This comment has been minimized.

Zac-HD commented Oct 24, 2020

karenetheridge commented Oct 27, 2020 • edited Loading

karenetheridge commented Oct 29, 2020 • edited Loading

karenetheridge commented Jun 12, 2021

Check for `multipleOf` overflow #438

Check for `multipleOf` overflow #438

karenetheridge commented Oct 22, 2020 •

edited

Loading

karenetheridge commented Oct 22, 2020 •

edited

Loading

ssilverman commented Oct 22, 2020 •

edited

Loading

karenetheridge commented Oct 27, 2020 •

edited

Loading

karenetheridge commented Oct 29, 2020 •

edited

Loading