Fix bug with multiple minibatch variables #7408

ricardoV94 · 2024-07-09T19:12:20Z

Description

Reported in https://discourse.pymc.io/t/verifying-that-minibatch-is-actually-randomly-sampling/14308

The bug occurred because separate calls to `model.logp` (from `model.datalogp` and `model.varlogp`) create distinct clones of the RandomIntegersRV underlying minibatch slicing. `compile_pymc` would not set any updates in this case.
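To make the failure mode concrete, here is a minimal stdlib-only sketch. It is an analogy, not PyMC or PyTensor code. It assumes a functional RNG pattern in which a draw returns the next generator state, and the compiled function has to write that state back (the "update") for the next call to see it. `draw_index` and `CompiledFn` are hypothetical names used only for illustration.

```python
import random


def draw_index(state, n):
    """Pure draw: return (next_state, index) without mutating the input state."""
    rng = random.Random()
    rng.setstate(state)
    idx = rng.randrange(n)
    return rng.getstate(), idx


class CompiledFn:
    """Hypothetical stand-in for a compiled function that holds a shared RNG state."""

    def __init__(self, seed, n, with_update):
        self.state = random.Random(seed).getstate()
        self.n = n
        self.with_update = with_update

    def __call__(self):
        next_state, idx = draw_index(self.state, self.n)
        if self.with_update:
            # The "update": store the next RNG state for the following call.
            self.state = next_state
        return idx


stale = CompiledFn(seed=0, n=1000, with_update=False)
fresh = CompiledFn(seed=0, n=1000, with_update=True)

stale_draws = [stale() for _ in range(20)]
fresh_draws = [fresh() for _ in range(20)]

print("distinct indices without update:", len(set(stale_draws)))
print("distinct indices with update:", len(set(fresh_draws)))
```

Without the update, every call starts from the same state and returns the same index, so the "minibatch" never changes. That is the symptom described in the linked Discourse report.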

Related Issue

Closes #
Related to #

Checklist

Checked that the pre-commit linting/style checks pass
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks)
If you are a pro: each commit corresponds to a relevant logical change

Type of change

📚 Documentation preview 📚: https://pymc--7408.org.readthedocs.build/en/7408/

Fixes a bug in VI with multiple Minibatch variables, which occurred due to separate calls to `model.logp` (from `model.datalogp` and `model.varlogp`) that create distinct clones of the RandomIntegersRV underlying minibatch slicing. `compile_pymc` would not set any updates in this case.

codecov · 2024-07-09T19:56:41Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.18%. Comparing base (f0631b4) to head (10f3aef).
Report is 10 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7408      +/-   ##
==========================================
- Coverage   92.19%   92.18%   -0.02%     
==========================================
  Files         103      103              
  Lines       17212    17261      +49     
==========================================
+ Hits        15869    15912      +43     
- Misses       1343     1349       +6

| Files | Coverage Δ |
|---|---|
| pymc/pytensorf.py | `90.51% <100.00%> (+0.06%)` ⬆️ |

... and 2 files with indirect coverage changes

jessegrabowski · 2024-07-10T01:56:52Z

tests/variational/test_inference.py

+            total_size=len(y),
+        )
+        mean_field = pm.fit(10_000, obj_optimizer=pm.adam(learning_rate=0.01), progressbar=False)
+    np.testing.assert_allclose(mean_field.mean.get_value(), true_weights, rtol=1e-1)


Does this test need to run the whole model and check parameter recovery? It should be enough to compile the function and check that minibatch_feature and minibatch_y change after each loss function execution, right?

ricardoV94 · 2024-07-10

This should run pretty fast, with a minibatch of 1. I think it's a useful integration test; we didn't have any linear regression minibatch test in the codebase.

Besides, VI has very complex logic leading to building the function, which I'd rather treat as a black box.
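For comparison, the lighter check suggested in the review could look roughly like the stdlib-only sketch below. It is not the test added in this PR and does not use PyMC. `make_minibatch_fn` is a hypothetical helper that slices two variables with one shared index source. The check is that the slices change between calls and that feature and target rows stay paired.

```python
import random


def make_minibatch_fn(features, targets, batch_size, seed):
    """Hypothetical helper: return a step function slicing both variables together."""
    rng = random.Random(seed)  # one shared index source for both variables
    n = len(features)

    def step():
        idx = [rng.randrange(n) for _ in range(batch_size)]
        return [features[i] for i in idx], [targets[i] for i in idx]

    return step


features = list(range(100))
targets = [2 * x for x in features]  # known relation, so pairing is easy to verify

step = make_minibatch_fn(features, targets, batch_size=1, seed=123)
batches = [step() for _ in range(50)]

# Each target row must come from the same index as its feature row.
rows_stay_paired = all(ys == [2 * x for x in xs] for xs, ys in batches)
# The slices must actually vary from call to call.
distinct_batches = len({tuple(xs) for xs, _ in batches})

print("rows stay paired:", rows_stay_paired)
print("distinct feature batches:", distinct_batches)
```

A check like this is cheap, but it only covers the slicing step. The parameter-recovery test in the PR also exercises the VI function-building logic end to end, which is the trade-off discussed above.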

ricardoV94 added the bug and VI Variational Inference labels Jul 9, 2024

ricardoV94 requested a review from ferrine July 9, 2024 19:12

ricardoV94 force-pushed the fix_multiple_minibatch_bug branch from b47c935 to f1e3d9c Compare July 9, 2024 19:13

ricardoV94 force-pushed the fix_multiple_minibatch_bug branch from f1e3d9c to 10f3aef Compare July 9, 2024 19:15

jessegrabowski reviewed Jul 10, 2024

jessegrabowski approved these changes Jul 10, 2024

ricardoV94 merged commit a4ea9fc into pymc-devs:main Jul 10, 2024
22 checks passed
