Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ExGaussian logcdf and refactor logp #4407

Merged
merged 4 commits into from
Jan 5, 2021

Conversation

ricardoV94
Copy link
Member

@ricardoV94 ricardoV94 commented Jan 5, 2021

This PR fixes #4295

logcdf
The logcdf method was reliably returning nan for some configurations due to numerical inaccuracies in the std_cdf method. By replacing it with the more numerical stable normal_lcdf the problems seem to vanish.

In addition the method was originally written in the natural scale and only at the end converted to log, but it was straightforward to rewrite it in log terms, which simplifies things quite a bit, and further allows for the use of logdiffexp, which should provide further numerical stability.

I added unit tests on values that were systematically failing in the 32- and 64-bit implementations. I also added one test that fails with the scipy exponnorm implementation, which is numerically unstable in different ways.

logp
I thought it would be nice to also add the normal_lcdf to the logp method. This seems to not only solve the issues that were addressed by #4050 (method would wrongly return -inf for several configurations), but it is also more numerically accurate (using the R implementation as the benchmark).

I added a unittest based on the output of the gamlss::exGAUS R function that fails on master but not in this PR. I also added two tests that fail with the scipy exponnorm implementation, which is numerically unstable in different ways.

The only downside, is that the normal_lcdf relies on tt.erfcx which does not have c_code support (as mentioned by warnings in the tests). However, since the same function is being used in the logp of the TruncatedNormal, I thought it would be okay to use it here as well. Maybe we can add c_code support for tt.erfcx in the future?


Indirect benefit
The code for the logp and logcdf methods look much more clean now, and it is much easier to see the similarities and differences between them!

@ricardoV94 ricardoV94 changed the title - Improve ExGaussian logcdf and refactor logp Improve ExGaussian logcdf and refactor logp Jan 5, 2021
@ricardoV94
Copy link
Member Author

pre-commit is failing with:

************* Module pymc3.distributions.continuous
pymc3/distributions/continuous.py:23: [W0611(unused-import), ] Unused import theano

Should it really be removed?

@codecov
Copy link

codecov bot commented Jan 5, 2021

Codecov Report

Merging #4407 (2a12ce7) into master (34b05d8) will increase coverage by 0.07%.
The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4407      +/-   ##
==========================================
+ Coverage   87.91%   87.98%   +0.07%     
==========================================
  Files          88       88              
  Lines       14331    14325       -6     
==========================================
+ Hits        12599    12604       +5     
+ Misses       1732     1721      -11     
Impacted Files Coverage Δ
pymc3/distributions/continuous.py 94.40% <66.66%> (-0.04%) ⬇️
pymc3/distributions/dist_math.py 96.07% <0.00%> (+3.92%) ⬆️

@junpenglao
Copy link
Member

Yes you should remove the unused import.

@@ -1851,6 +1854,9 @@ def test_ex_gaussian(self, value, mu, sigma, nu, logp):
(15.0, 5.000, 7.500, 7.500, -0.4545255),
(50.0, 50.000, 10.000, 10.000, -1.433714),
(1000.0, 500.000, 10.000, 20.000, -1.573708e-11),
(0.01, 0.01, 100.0, 0.01, -0.69314718), # Fails in scipy version
(-0.43402407, 0.0, 0.1, 0.1, -13.59615423), # Previous 32-bit version failed here
(-0.72402009, 0.0, 0.1, 0.1, -31.26571842), # Previous 64-bit version failed here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are really neat! 💪

@twiecki twiecki merged commit 8b1f64c into pymc-devs:master Jan 5, 2021
@ricardoV94 ricardoV94 deleted the ExGaussian_issue branch January 5, 2021 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NaNs in logcdf of ExGaussian.
3 participants