Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replacing expm1 implementation #131

Open
mwallerb opened this issue Sep 1, 2021 · 16 comments
Open

replacing expm1 implementation #131

mwallerb opened this issue Sep 1, 2021 · 16 comments

Comments

@mwallerb
Copy link

mwallerb commented Sep 1, 2021

expm1(x::Double64) loses all digits of accuracy below abs(x) < 1e-163 and returns 0.0 instead of x.

This regresses even beyond Float64, which correctly returns x.

using DoubleFloats
x = 1e-200
@assert x == expm1(x)
xx = Double64(x)
@assert xx == expm1(xx)
@JeffreySarnoff
Copy link
Member

good catch

@JeffreySarnoff
Copy link
Member

here is my quick fix ... it should help you while I review it

function Base.expm1(a::DoubleFloat{T}) where {T<:DoubleFloats.IEEEFloat}
    isnan(a) && return a
    isinf(a) && return(signbit(a) ? zero(DoubleFloat{T}) : a)
    u = exp(a)
    # temp fix of if (u == one(DoubleFloat{T}))
    if (isone(u.hi))
        a
    elseif (u-1.0 == -one(DoubleFloat{T}))
        -one(DoubleFloat{T})
    else
        a*(u-1.0) / log(u)
    end
end

@oscardssmith
Copy link
Member

#135 has a much better fix.

@JeffreySarnoff
Copy link
Member

#135 has a much better fix.

@oscardssmith
Copy link
Member

Shouldn't this remain open until #135 is merged?

@JeffreySarnoff
Copy link
Member

yes

@JeffreySarnoff JeffreySarnoff changed the title Catastrophic loss of precision in expm1 Catastrophic loss of precision in expm1 for tiny values May 3, 2022
@JeffreySarnoff
Copy link
Member

I do not see the described behavior

using DoubleFloats

x = 1e-200
dfx = Double64(x)
expm1(dfx) 
# 1.0e-200
Double64(expm1(BigFloat(x)))
# 1.0e-200

I am closing this -- if there is something I missed, let me know

@oscardssmith
Copy link
Member

this is still an issue for expm1(.5)

@JeffreySarnoff
Copy link
Member

???

using DoubleFloats, Quadmath
x= 0.5
x128 = Float128(x)
xdf = Double64(x)
ans128 = expm1(x128)
ansdf = expm1(xdf)
bestansdf = Double64(ans128)

bestansdf - ansdf == 0.0

@oscardssmith
Copy link
Member

Sorry, I chose the wrong number.

expm1(Double64(.0000000000000001))
1.0e-16
julia> expm1(big(Double64(.0000000000000001)))-ans
4.999999999999999957644533907012698185150443928822782614563950837394859595575927e-33
julia> eps(expm1(Double64(.0000000000000001)))
2.7369110631344083e-48

so in this case, DoubleFloats is off by 10e15 ULP

@JeffreySarnoff JeffreySarnoff reopened this May 3, 2022
@JeffreySarnoff
Copy link
Member

JeffreySarnoff commented May 4, 2022

After spending some time with the implementation, it has become clear that the only fix is to replace expm1(::Double64) with something that works and runs within 2.5x-3.0x of the spotty expm1. I have done this.
expm1 now converts its argument to a Float128 (from Quadmath), lets Quadmath compute the function and reconverts the result to Double64..
This is much faster than using BigFloat, and it is well traveled.


I did generate factorial coeffs for Double64 in the event we want to run the series solution another time. 1! .. 36! will fit in a Double64 without error.
I derived the approach and computed the values that lets us use 37! .. 52!,
for these each factorial is separated into two values of approximately equal magnitude, values which when multiplied together yield the corresponding factorial. Of course direct multiplication of these values would by inexact.

Consider 42!

𝒻𝒶𝒸(x) = factorial(BigInt(x))

factorial42 =  𝒻𝒶𝒸(42)
# 405006117752879898543142606244511569936384000000000
# this is too large to be held within a Double64 and be recovered

split_factorial(target, splitter) = 
    split_factorial(BigInt(target), BigInt(splitter))

function split_factorial(target::T, splitter::T) where {T::BigInt}
    splitter_factorial_contribution =  𝒻𝒶𝒸(splitter)
    target_factorial_contribution = pochhammer(splitter + 1, target - splitter)
    (splitter_factorial_contribution, target_factorial_contribution)
end

# borrowed from https://github.com/mfpaulos/JuliBoots (2018)
function pochhammer(x::BigInt, n)
    iszero(n) && return BigInt(1)
    if n > 0
         (x + n - 1) * pochhammer(x, n-1)
    else
       pochhammer(x, n+1) / (x+n)
    end
end

# back to 42!

factorial_factors_42 = split_factorial(42, 25)
# (15511210043330985984000000, 90580045904088539774976000)
ndigits.(factorial_factors_42)
# (26, 26)    
# at least within the range we are addressing, the most balanced splits
#     either have the same digit counts or they differ by 1
prod(factorial_factors_42) == 𝒻𝒶𝒸(42)

# and the split works with Double64
a = Double64.(factorial_factors_42)
b = BigInt.(a)
b == factorial_factors_42

For the record, these are appropriate for use with Double64
smallerfactorial values fit in a Double64,
largerfactorial values fit in two Double64s.

smallerfactorials = [
(1, 1)
(2, 2)
(3, 6)
(4, 24)
(5, 120)
(6, 720)
(7, 5040)
(8, 40320)
(9, 362880)
(10, 3628800)
(11, 39916800)
(12, 479001600)
(13, 6227020800)
(14, 87178291200)
(15, 1307674368000)
(16, 20922789888000)
(17, 355687428096000)
(18, 6402373705728000)
(19, 121645100408832000)
(20, 2432902008176640000)
(21, 51090942171709440000)
(22, 1124000727777607680000)
(23, 25852016738884976640000)
(24, 620448401733239439360000)
(25, 15511210043330985984000000)
(26, 403291461126605635584000000)
(27, 10888869450418352160768000000)
(28, 304888344611713860501504000000)
(29, 8841761993739701954543616000000)
(30, 265252859812191058636308480000000)
(31, 8222838654177922817725562880000000)
(32, 263130836933693530167218012160000000)
(33, 8683317618811886495518194401280000000)
(34, 295232799039604140847618609643520000000)
(35, 10333147966386144929666651337523200000000)
(36, 371993326789901217467999448150835200000000)
]

largerfactorials = [
(37, (1124000727777607680000, 12245324002983751680000)),
(38, (25852016738884976640000, 20231404874494894080000)),
(39, (620448401733239439360000, 32876032921054202880000)),
(40, (620448401733239439360000, 1315041316842168115200000)),
(41, (15511210043330985984000000, 2156667759621155708928000)),
(42, (15511210043330985984000000, 90580045904088539774976000)),
(43, (403291461126605635584000000, 149805460533684892704768000)),
(44, (403291461126605635584000000, 6591440263482135279009792000)),
(45, (10888869450418352160768000000, 10985733772470225465016320000)),
(46, (10888869450418352160768000000, 505343753533630371390750720000)),
(47, (304888344611713860501504000000, 848255586288593837691617280000)),
(48, (8841761993739701954543616000000, 1404009246270776007213711360000)),
(49, (8841761993739701954543616000000, 68796453067268024353471856640000)),
(50, (265252859812191058636308480000000, 114660755112113373922453094400000)),
(51, (265252859812191058636308480000000, 5847698510717782070045107814400000)),
(52, (8222838654177922817725562880000000, 9809042663139505407817600204800000))
]

Double64 inv factorials are exactly recoverable through factorial(big(30))

smaller_invfactorials = [
(1, Double64(1.0, 0.0))
(2, Double64(0.5, 0.0))
(3, Double64(0.16666666666666666, 9.25185853854297e-18))
(4, Double64(0.041666666666666664, 2.3129646346357427e-18))
(5, Double64(0.008333333333333333, 1.1564823173178714e-19))
(6, Double64(0.001388888888888889, -5.300543954373577e-20))
(7, Double64(0.0001984126984126984, 1.7209558293420705e-22))
(8, Double64(2.48015873015873e-5, 2.1511947866775882e-23))
(9, Double64(2.7557319223985893e-6, -1.858393274046472e-22))
(10, Double64(2.755731922398589e-7, 2.3767714622250297e-23))
(11, Double64(2.505210838544172e-8, -1.448814070935912e-24))
(12, Double64(2.08767569878681e-9, -1.20734505911326e-25))
(13, Double64(1.6059043836821613e-10, 1.2585294588752098e-26))
(14, Double64(1.1470745597729725e-11, 2.0655512752830745e-28))
(15, Double64(7.647163731819816e-13, 7.03872877733453e-30))
(16, Double64(4.779477332387385e-14, 4.399205485834081e-31))
(17, Double64(2.8114572543455206e-15, 1.6508842730861433e-31))
(18, Double64(1.5619206968586225e-16, 1.1910679660273754e-32))
(19, Double64(8.22063524662433e-18, 2.2141894119604265e-34))
(20, Double64(4.110317623312165e-19, 1.4412973378659527e-36))
(21, Double64(1.9572941063391263e-20, -1.3643503830087908e-36))
(22, Double64(8.896791392450574e-22, -7.911402614872376e-38))
(23, Double64(3.868170170630684e-23, -8.843177655482344e-40))
(24, Double64(1.6117375710961184e-24, -3.6846573564509766e-41))
(25, Double64(6.446950284384474e-26, -1.9330404233703465e-42))
(26, Double64(2.4795962632247976e-27, -1.2953730964765229e-43))
(27, Double64(9.183689863795546e-29, 1.4303150396787322e-45))
(28, Double64(3.279889237069838e-30, 1.5117542744029879e-46))
(29, Double64(1.1309962886447716e-31, 1.0498015412959506e-47))
(30, Double64(3.7699876288159054e-33, 2.5870347832750324e-49))

these below are approximations, above are exactly recoverable

(31, Double64(1.216125041553518e-34, 5.586290567888806e-51))
(32, Double64(3.8003907548547434e-36, 1.7457158024652518e-52))
(33, Double64(1.151633562077195e-37, -6.09957445788454e-54))
(34, Double64(3.387157535521162e-39, 5.09056148151085e-56))
(35, Double64(9.67759295863189e-41, 3.202295548645562e-57))
(36, Double64(2.6882202662866363e-42, 5.355061165943334e-59))
];

invfactorials = [
Double64(0x1p+0, 0x0p+0),
Double64(0x1p-1, 0x0p+0),
Double64(0x1.5555555555555p-3, 0x1.5555555555555p-57),
Double64(0x1.5555555555555p-5, 0x1.5555555555555p-59),
Double64(0x1.1111111111111p-7, 0x1.1111111111111p-63),
Double64(0x1.6c16c16c16c17p-10, -0x1.f49f49f49f49fp-65),
Double64(0x1.a01a01a01a01ap-13, 0x1.a01a01a01a01ap-73),
Double64(0x1.a01a01a01a01ap-16, 0x1.a01a01a01a01ap-76),
Double64(0x1.71de3a556c734p-19, -0x1.c154f8ddc6cp-73),
Double64(0x1.27e4fb7789f5cp-22, 0x1.cbbc05b4fa99ap-76),
Double64(0x1.ae64567f544e4p-26, -0x1.c062e06d1f209p-80),
Double64(0x1.1eed8eff8d898p-29, -0x1.2aec959e14c06p-83),
Double64(0x1.6124613a86d09p-33, 0x1.f28e0cc748ebep-87),
Double64(0x1.93974a8c07c9dp-37, 0x1.05d6f8a2efd1fp-92),
Double64(0x1.ae7f3e733b81fp-41, 0x1.1d8656b0ee8cbp-97),
Double64(0x1.ae7f3e733b81fp-45, 0x1.1d8656b0ee8cbp-101),
Double64(0x1.952c77030ad4ap-49, 0x1.ac981465ddc6cp-103),
Double64(0x1.6827863b97d97p-53, 0x1.eec01221a8b0bp-107),
Double64(0x1.2f49b46814157p-57, 0x1.2650f61dbdcb4p-112),
Double64(0x1.e542ba4020225p-62, 0x1.ea72b4afe3c2fp-120),
Double64(0x1.71b8ef6dcf572p-66, -0x1.d043ae40c4647p-120),
Double64(0x1.0ce396db7f853p-70, -0x1.aebcdbd20331cp-124),
Double64(0x1.761b41316381ap-75, -0x1.3423c7d91404fp-130),
Double64(0x1.f2cf01972f578p-80, -0x1.9ada5fcc1ab14p-135),
Double64(0x1.3f3ccdd165fa9p-84, -0x1.58ddadf344487p-139),
Double64(0x1.88e85fc6a4e5ap-89, -0x1.71c37ebd1654p-143),
Double64(0x1.d1ab1c2dccea3p-94, 0x1.054d0c78aea14p-149),
Double64(0x1.0a18a2635085dp-98, 0x1.b9e2e28e1aa54p-153),
Double64(0x1.259f98b4358adp-103, 0x1.eaf8c39dd9bc5p-157),
Double64(0x1.3932c5047d60ep-108, 0x1.832b7b530a627p-162)
];

@JeffreySarnoff JeffreySarnoff changed the title Catastrophic loss of precision in expm1 for tiny values replacing expm1 implementation May 4, 2022
@mwallerb
Copy link
Author

mwallerb commented May 9, 2022

I implemented something here which seems to work:
https://github.com/tuwien-cms/xprec/blob/v1.2.13/csrc/dd_arith.c#L162

Might be worth taking a look at ....

@oscardssmith
Copy link
Member

That looks like a pretty good version, although it won't be the most accurate since it ends in 9 multiplications that are uncompensated, so I think the error will be around 10 ULP error.

@mwallerb
Copy link
Author

mwallerb commented May 9, 2022

@oscardssmith I took the exp from the QD library, so that problem is "imported" from there. For me, 10 ulps are acceptable, but of course you guys may feel differently. But I think at least the expm1 should not lose more than 1 ulp on top of that.

Anyway, is there any easy way to fix that uncompensated multiplication problem?

@oscardssmith
Copy link
Member

oscardssmith commented May 9, 2022

You could compensate the multiplication, but it would likely be a noticeable slowdown. 10 ULP may just be the price here. (The other option would be a table-based method)

@mwallerb
Copy link
Author

mwallerb commented Apr 18, 2024

Here's a C++ implementation of expm1 and exp that is accurate to 2 ulps across the range and takes around 400 flops:

https://github.com/tuwien-cms/libxprec/blob/v0.4.2/src/exp.cxx

The idea is similar to what you are doing already ... but we simply split off the one in the expm1_kernel and add or do not add it depending on whether we need exp or expm1. Might be relatively straight-forward to port if you're so inclined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants