replacing `expm1` implementation #131

mwallerb · 2021-09-01T15:37:56Z

expm1(x::Double64) loses all digits of accuracy below abs(x) < 1e-163 and returns 0.0 instead of x.

This regresses even beyond Float64, which correctly returns x.

using DoubleFloats
x = 1e-200
@assert x == expm1(x)
xx = Double64(x)
@assert xx == expm1(xx)

The text was updated successfully, but these errors were encountered:

JeffreySarnoff · 2021-09-02T01:26:37Z

good catch

JeffreySarnoff · 2021-09-02T01:47:54Z

here is my quick fix ... it should help you while I review it

function Base.expm1(a::DoubleFloat{T}) where {T<:DoubleFloats.IEEEFloat}
    isnan(a) && return a
    isinf(a) && return(signbit(a) ? zero(DoubleFloat{T}) : a)
    u = exp(a)
    # temp fix of if (u == one(DoubleFloat{T}))
    if (isone(u.hi))
        a
    elseif (u-1.0 == -one(DoubleFloat{T}))
        -one(DoubleFloat{T})
    else
        a*(u-1.0) / log(u)
    end
end

oscardssmith · 2022-01-22T17:40:33Z

#135 has a much better fix.

JeffreySarnoff · 2022-01-31T20:18:59Z

#135 has a much better fix.

oscardssmith · 2022-01-31T20:32:22Z

Shouldn't this remain open until #135 is merged?

JeffreySarnoff · 2022-01-31T20:35:40Z

yes

JeffreySarnoff · 2022-05-03T08:53:37Z

I do not see the described behavior

using DoubleFloats

x = 1e-200
dfx = Double64(x)
expm1(dfx) 
# 1.0e-200
Double64(expm1(BigFloat(x)))
# 1.0e-200

I am closing this -- if there is something I missed, let me know

oscardssmith · 2022-05-03T11:35:10Z

this is still an issue for expm1(.5)

JeffreySarnoff · 2022-05-03T14:22:09Z

???

using DoubleFloats, Quadmath
x= 0.5
x128 = Float128(x)
xdf = Double64(x)
ans128 = expm1(x128)
ansdf = expm1(xdf)
bestansdf = Double64(ans128)

bestansdf - ansdf == 0.0

oscardssmith · 2022-05-03T15:00:13Z

Sorry, I chose the wrong number.

expm1(Double64(.0000000000000001))
1.0e-16
julia> expm1(big(Double64(.0000000000000001)))-ans
4.999999999999999957644533907012698185150443928822782614563950837394859595575927e-33
julia> eps(expm1(Double64(.0000000000000001)))
2.7369110631344083e-48

so in this case, DoubleFloats is off by 10e15 ULP

JeffreySarnoff · 2022-05-04T06:35:14Z

After spending some time with the implementation, it has become clear that the only fix is to replace expm1(::Double64) with something that works and runs within 2.5x-3.0x of the spotty expm1. I have done this.
expm1 now converts its argument to a Float128 (from Quadmath), lets Quadmath compute the function and reconverts the result to Double64..
This is much faster than using BigFloat, and it is well traveled.

I did generate factorial coeffs for Double64 in the event we want to run the series solution another time. 1! .. 36! will fit in a Double64 without error.
I derived the approach and computed the values that lets us use 37! .. 52!,
for these each factorial is separated into two values of approximately equal magnitude, values which when multiplied together yield the corresponding factorial. Of course direct multiplication of these values would by inexact.

Consider 42!

𝒻𝒶𝒸(x) = factorial(BigInt(x))

factorial42 =  𝒻𝒶𝒸(42)
# 405006117752879898543142606244511569936384000000000
# this is too large to be held within a Double64 and be recovered

split_factorial(target, splitter) = 
    split_factorial(BigInt(target), BigInt(splitter))

function split_factorial(target::T, splitter::T) where {T::BigInt}
    splitter_factorial_contribution =  𝒻𝒶𝒸(splitter)
    target_factorial_contribution = pochhammer(splitter + 1, target - splitter)
    (splitter_factorial_contribution, target_factorial_contribution)
end

# borrowed from https://github.com/mfpaulos/JuliBoots (2018)
function pochhammer(x::BigInt, n)
    iszero(n) && return BigInt(1)
    if n > 0
         (x + n - 1) * pochhammer(x, n-1)
    else
       pochhammer(x, n+1) / (x+n)
    end
end

# back to 42!

factorial_factors_42 = split_factorial(42, 25)
# (15511210043330985984000000, 90580045904088539774976000)
ndigits.(factorial_factors_42)
# (26, 26)    
# at least within the range we are addressing, the most balanced splits
#     either have the same digit counts or they differ by 1
prod(factorial_factors_42) == 𝒻𝒶𝒸(42)

# and the split works with Double64
a = Double64.(factorial_factors_42)
b = BigInt.(a)
b == factorial_factors_42

For the record, these are appropriate for use with Double64
smallerfactorial values fit in a Double64,
largerfactorial values fit in two Double64s.

smallerfactorials = [
(1, 1)
(2, 2)
(3, 6)
(4, 24)
(5, 120)
(6, 720)
(7, 5040)
(8, 40320)
(9, 362880)
(10, 3628800)
(11, 39916800)
(12, 479001600)
(13, 6227020800)
(14, 87178291200)
(15, 1307674368000)
(16, 20922789888000)
(17, 355687428096000)
(18, 6402373705728000)
(19, 121645100408832000)
(20, 2432902008176640000)
(21, 51090942171709440000)
(22, 1124000727777607680000)
(23, 25852016738884976640000)
(24, 620448401733239439360000)
(25, 15511210043330985984000000)
(26, 403291461126605635584000000)
(27, 10888869450418352160768000000)
(28, 304888344611713860501504000000)
(29, 8841761993739701954543616000000)
(30, 265252859812191058636308480000000)
(31, 8222838654177922817725562880000000)
(32, 263130836933693530167218012160000000)
(33, 8683317618811886495518194401280000000)
(34, 295232799039604140847618609643520000000)
(35, 10333147966386144929666651337523200000000)
(36, 371993326789901217467999448150835200000000)
]

largerfactorials = [
(37, (1124000727777607680000, 12245324002983751680000)),
(38, (25852016738884976640000, 20231404874494894080000)),
(39, (620448401733239439360000, 32876032921054202880000)),
(40, (620448401733239439360000, 1315041316842168115200000)),
(41, (15511210043330985984000000, 2156667759621155708928000)),
(42, (15511210043330985984000000, 90580045904088539774976000)),
(43, (403291461126605635584000000, 149805460533684892704768000)),
(44, (403291461126605635584000000, 6591440263482135279009792000)),
(45, (10888869450418352160768000000, 10985733772470225465016320000)),
(46, (10888869450418352160768000000, 505343753533630371390750720000)),
(47, (304888344611713860501504000000, 848255586288593837691617280000)),
(48, (8841761993739701954543616000000, 1404009246270776007213711360000)),
(49, (8841761993739701954543616000000, 68796453067268024353471856640000)),
(50, (265252859812191058636308480000000, 114660755112113373922453094400000)),
(51, (265252859812191058636308480000000, 5847698510717782070045107814400000)),
(52, (8222838654177922817725562880000000, 9809042663139505407817600204800000))
]

Double64 inv factorials are exactly recoverable through factorial(big(30))

smaller_invfactorials = [
(1, Double64(1.0, 0.0))
(2, Double64(0.5, 0.0))
(3, Double64(0.16666666666666666, 9.25185853854297e-18))
(4, Double64(0.041666666666666664, 2.3129646346357427e-18))
(5, Double64(0.008333333333333333, 1.1564823173178714e-19))
(6, Double64(0.001388888888888889, -5.300543954373577e-20))
(7, Double64(0.0001984126984126984, 1.7209558293420705e-22))
(8, Double64(2.48015873015873e-5, 2.1511947866775882e-23))
(9, Double64(2.7557319223985893e-6, -1.858393274046472e-22))
(10, Double64(2.755731922398589e-7, 2.3767714622250297e-23))
(11, Double64(2.505210838544172e-8, -1.448814070935912e-24))
(12, Double64(2.08767569878681e-9, -1.20734505911326e-25))
(13, Double64(1.6059043836821613e-10, 1.2585294588752098e-26))
(14, Double64(1.1470745597729725e-11, 2.0655512752830745e-28))
(15, Double64(7.647163731819816e-13, 7.03872877733453e-30))
(16, Double64(4.779477332387385e-14, 4.399205485834081e-31))
(17, Double64(2.8114572543455206e-15, 1.6508842730861433e-31))
(18, Double64(1.5619206968586225e-16, 1.1910679660273754e-32))
(19, Double64(8.22063524662433e-18, 2.2141894119604265e-34))
(20, Double64(4.110317623312165e-19, 1.4412973378659527e-36))
(21, Double64(1.9572941063391263e-20, -1.3643503830087908e-36))
(22, Double64(8.896791392450574e-22, -7.911402614872376e-38))
(23, Double64(3.868170170630684e-23, -8.843177655482344e-40))
(24, Double64(1.6117375710961184e-24, -3.6846573564509766e-41))
(25, Double64(6.446950284384474e-26, -1.9330404233703465e-42))
(26, Double64(2.4795962632247976e-27, -1.2953730964765229e-43))
(27, Double64(9.183689863795546e-29, 1.4303150396787322e-45))
(28, Double64(3.279889237069838e-30, 1.5117542744029879e-46))
(29, Double64(1.1309962886447716e-31, 1.0498015412959506e-47))
(30, Double64(3.7699876288159054e-33, 2.5870347832750324e-49))

these below are approximations, above are exactly recoverable

(31, Double64(1.216125041553518e-34, 5.586290567888806e-51))
(32, Double64(3.8003907548547434e-36, 1.7457158024652518e-52))
(33, Double64(1.151633562077195e-37, -6.09957445788454e-54))
(34, Double64(3.387157535521162e-39, 5.09056148151085e-56))
(35, Double64(9.67759295863189e-41, 3.202295548645562e-57))
(36, Double64(2.6882202662866363e-42, 5.355061165943334e-59))
];

invfactorials = [
Double64(0x1p+0, 0x0p+0),
Double64(0x1p-1, 0x0p+0),
Double64(0x1.5555555555555p-3, 0x1.5555555555555p-57),
Double64(0x1.5555555555555p-5, 0x1.5555555555555p-59),
Double64(0x1.1111111111111p-7, 0x1.1111111111111p-63),
Double64(0x1.6c16c16c16c17p-10, -0x1.f49f49f49f49fp-65),
Double64(0x1.a01a01a01a01ap-13, 0x1.a01a01a01a01ap-73),
Double64(0x1.a01a01a01a01ap-16, 0x1.a01a01a01a01ap-76),
Double64(0x1.71de3a556c734p-19, -0x1.c154f8ddc6cp-73),
Double64(0x1.27e4fb7789f5cp-22, 0x1.cbbc05b4fa99ap-76),
Double64(0x1.ae64567f544e4p-26, -0x1.c062e06d1f209p-80),
Double64(0x1.1eed8eff8d898p-29, -0x1.2aec959e14c06p-83),
Double64(0x1.6124613a86d09p-33, 0x1.f28e0cc748ebep-87),
Double64(0x1.93974a8c07c9dp-37, 0x1.05d6f8a2efd1fp-92),
Double64(0x1.ae7f3e733b81fp-41, 0x1.1d8656b0ee8cbp-97),
Double64(0x1.ae7f3e733b81fp-45, 0x1.1d8656b0ee8cbp-101),
Double64(0x1.952c77030ad4ap-49, 0x1.ac981465ddc6cp-103),
Double64(0x1.6827863b97d97p-53, 0x1.eec01221a8b0bp-107),
Double64(0x1.2f49b46814157p-57, 0x1.2650f61dbdcb4p-112),
Double64(0x1.e542ba4020225p-62, 0x1.ea72b4afe3c2fp-120),
Double64(0x1.71b8ef6dcf572p-66, -0x1.d043ae40c4647p-120),
Double64(0x1.0ce396db7f853p-70, -0x1.aebcdbd20331cp-124),
Double64(0x1.761b41316381ap-75, -0x1.3423c7d91404fp-130),
Double64(0x1.f2cf01972f578p-80, -0x1.9ada5fcc1ab14p-135),
Double64(0x1.3f3ccdd165fa9p-84, -0x1.58ddadf344487p-139),
Double64(0x1.88e85fc6a4e5ap-89, -0x1.71c37ebd1654p-143),
Double64(0x1.d1ab1c2dccea3p-94, 0x1.054d0c78aea14p-149),
Double64(0x1.0a18a2635085dp-98, 0x1.b9e2e28e1aa54p-153),
Double64(0x1.259f98b4358adp-103, 0x1.eaf8c39dd9bc5p-157),
Double64(0x1.3932c5047d60ep-108, 0x1.832b7b530a627p-162)
];

mwallerb · 2022-05-09T21:47:18Z

I implemented something here which seems to work:
https://github.com/tuwien-cms/xprec/blob/v1.2.13/csrc/dd_arith.c#L162

Might be worth taking a look at ....

oscardssmith · 2022-05-09T21:57:18Z

That looks like a pretty good version, although it won't be the most accurate since it ends in 9 multiplications that are uncompensated, so I think the error will be around 10 ULP error.

mwallerb · 2022-05-09T22:27:55Z

@oscardssmith I took the exp from the QD library, so that problem is "imported" from there. For me, 10 ulps are acceptable, but of course you guys may feel differently. But I think at least the expm1 should not lose more than 1 ulp on top of that.

Anyway, is there any easy way to fix that uncompensated multiplication problem?

oscardssmith · 2022-05-09T22:35:31Z

You could compensate the multiplication, but it would likely be a noticeable slowdown. 10 ULP may just be the price here. (The other option would be a table-based method)

mwallerb · 2024-04-18T09:08:43Z

Here's a C++ implementation of expm1 and exp that is accurate to 2 ulps across the range and takes around 400 flops:

https://github.com/tuwien-cms/libxprec/blob/v0.4.2/src/exp.cxx

The idea is similar to what you are doing already ... but we simply split off the one in the expm1_kernel and add or do not add it depending on whether we need exp or expm1. Might be relatively straight-forward to port if you're so inclined.

JeffreySarnoff closed this as completed Jan 31, 2022

JeffreySarnoff reopened this Jan 31, 2022

JeffreySarnoff changed the title ~~Catastrophic loss of precision in expm1~~ Catastrophic loss of precision in expm1 for tiny values May 3, 2022

JeffreySarnoff closed this as completed May 3, 2022

JeffreySarnoff reopened this May 3, 2022

JeffreySarnoff changed the title ~~Catastrophic loss of precision in expm1 for tiny values~~ replacing expm1 implementation May 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

replacing `expm1` implementation #131

replacing `expm1` implementation #131

mwallerb commented Sep 1, 2021 •

edited

Loading

JeffreySarnoff commented Sep 2, 2021

JeffreySarnoff commented Sep 2, 2021

oscardssmith commented Jan 22, 2022

JeffreySarnoff commented Jan 31, 2022

oscardssmith commented Jan 31, 2022

JeffreySarnoff commented Jan 31, 2022

JeffreySarnoff commented May 3, 2022

oscardssmith commented May 3, 2022

JeffreySarnoff commented May 3, 2022

oscardssmith commented May 3, 2022

JeffreySarnoff commented May 4, 2022 •

edited

Loading

mwallerb commented May 9, 2022

oscardssmith commented May 9, 2022

mwallerb commented May 9, 2022

oscardssmith commented May 9, 2022 •

edited

Loading

mwallerb commented Apr 18, 2024 •

edited

Loading

replacing expm1 implementation #131

replacing expm1 implementation #131

Comments

mwallerb commented Sep 1, 2021 • edited Loading

JeffreySarnoff commented Sep 2, 2021

JeffreySarnoff commented Sep 2, 2021

oscardssmith commented Jan 22, 2022

JeffreySarnoff commented Jan 31, 2022

oscardssmith commented Jan 31, 2022

JeffreySarnoff commented Jan 31, 2022

JeffreySarnoff commented May 3, 2022

oscardssmith commented May 3, 2022

JeffreySarnoff commented May 3, 2022

oscardssmith commented May 3, 2022

JeffreySarnoff commented May 4, 2022 • edited Loading

Double64 inv factorials are exactly recoverable through factorial(big(30))

these below are approximations, above are exactly recoverable

mwallerb commented May 9, 2022

oscardssmith commented May 9, 2022

mwallerb commented May 9, 2022

oscardssmith commented May 9, 2022 • edited Loading

mwallerb commented Apr 18, 2024 • edited Loading

replacing `expm1` implementation #131

replacing `expm1` implementation #131

mwallerb commented Sep 1, 2021 •

edited

Loading

JeffreySarnoff commented May 4, 2022 •

edited

Loading

oscardssmith commented May 9, 2022 •

edited

Loading

mwallerb commented Apr 18, 2024 •

edited

Loading