Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supporting Population Samplers (implemented DE-MCMC) #2735

Merged
merged 60 commits into from
Dec 5, 2017

Conversation

michaelosthege
Copy link
Member

@michaelosthege michaelosthege commented Nov 29, 2017

With this PR I would like to propose some changes to support population samplers.

Changes

  • Metropolis is marked as COMPATIBLE with continuous variables (I don't understand why this was not the case already)
  • the bij variable of ArrayStepShared is elevated to an attribute
  • a PopulationArrayStepShared is subclassed from ArrayStepShared
  • DEMetropolis is a PopulationArrayStepShared step method that implements differential-evolution metropolis
  • the sample() procedure is modified and two new methods sample_population and _iter_chains are added to support sampling with population samplers
  • sample() is a bit more verbose and prints out the sampler assignment and method of sampling

Performance

I have included an example that compares Metropolis, NUTS and DEMetropolis on a correlated bivariate normal target density. It calculates the effective sampling rate and normalizes it to NUTS on the uncorrelated target density. This is the result I get on my laptop:

Effective sample size [0...1]

Method \ p 0.0 0.9
Metropolis 0.176 0.024
Slice 1.000 0.102
NUTS 0.985 0.268
DEMetropolis 0.106 0.129

Runtime [s]

Method \ p 0.0 0.9
Metropolis 21.0 21.5
Slice 63.1 57.9
NUTS 37.2 63.5
DEMetropolis 18.6 19.0

Normalized effective sampling rate [0...1]

Method \ p 0.0 0.9
Metropolis 0.317 0.042
Slice 0.598 0.067
NUTS 1.000 0.159
DEMetropolis 0.214 0.257

Usage

The DEMetropolis step method can be used just like any other stepper. It supports blocking and can be combined with other steppers.

ToDo

  • get all tests to pass
  • add tests for PopulationArraySharedStep and DEMetropolis

References

@twiecki
Copy link
Member

twiecki commented Nov 29, 2017

@michaelosthege Looks very promising!

Optional model for sampling step. Defaults to None (taken from context).
mode : string or `Mode` instance.
compilation mode passed to Theano functions
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add reference

@junpenglao
Copy link
Member

junpenglao commented Dec 1, 2017

Well done @michaelosthege! Could you please add it to the release note under new feature?

Also, you should add an EXPERIMENTAL_WARNING (see eg here https://github.com/pymc-devs/pymc3/blob/master/pymc3/step_methods/sgmcmc.py#L14-L15).

@michaelosthege
Copy link
Member Author

Thank you :)
I will add a reference today.
And two more things:

  1. In contrast to the original publication, I'm tuning epsilon (symmetric noise that is added to the jump). The idea is that it should help the chains to diverge from potentially identical start values. Mathematically there is no problem, but I can't say if it's a good idea either.
  2. I have an experimental chain-multiprocessing in the pipeline. Will do some more testing and exception handling before I push the commits.

@@ -580,6 +626,267 @@ def _iter_sample(draws, step, start=None, trace=None, chain=0, tune=None,
step.report._finalize(strace)


class PopulationStepper(object):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the code we have for SMC? Is there any overlap between these?

Copy link
Member Author

@michaelosthege michaelosthege Dec 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is the same, but the solution is different. And performance is a concern, so it's worthwhile to think about it.

It looks like SMC uses a multiprocessing.Pool and a text file MultiTrace backend to share the population state and synchronize chains. Although I have to admit that I don't completely understand how this works in combination with the Pool.

I use Pipes to create a communication channel between the processes. That channel is then used to pass the population to the chain, and the update back to the main iterator.

My guess is that the Pipes should in principle be faster. And they certainly involve less code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is code duplication in the pymc3 code-base.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can refactor SMC at some point @hvasbath @aloctavodia?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will love to help with SMC, "at some point" could be during January :-)

Copy link
Contributor

@hvasbath hvasbath Dec 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a cool sampler! Thanks for implementing @michaelosthege . If this was merged in 1 1/2 years earlier, likely I wouldnt have started working on the SMC ;) .
The algorithms themselves are quite different. I am not sure if SMC can be refactored to support the parallel sampling here, which is the only thing the samplers have in common appart from the metropolis step ;). For sure the pool also supports python2.7, which is apparently not supported in the DEM here.
The chains in the SMC do not communicate with each other during the sampling as they do here. They communicate all together every n_chains * nsteps samples. For sure would be interesting to see the test and performance you did also compared to the SMC @michaelosthege . There is the SMC2gaussians notebook where you see how to start it. Anyways if anyone has a good idea to anyway factor out common things I also could help starting earliest January.

@twiecki
Copy link
Member

twiecki commented Dec 2, 2017

Some conflicts creeped in @michaelosthege. I'm fine merging once resolved.

@michaelosthege
Copy link
Member Author

michaelosthege commented Dec 3, 2017

Tomorrow I will fix that and do a little refactor. Should be ready to merge then 😊

I really like the way how blocking and CompoundStep work in pymc3. With respect to SMC, it might be nice to merge PopulationArrayStep with the ArrayStepSharedLLK into a universal, parallelized population step. (Even if information is not exchanged in every step.) /cc @hvasbath @aloctavodia

The failure of my implementation to multiprocess in Py2.7 is because even in 3.5 I couldn't get the steppers to pickle without using protocol=4. I don't understand why though, because obviously Pool can do it.

@michaelosthege
Copy link
Member Author

Okay, so I'd say it's ready to merge.
I benchmarked the sampler performance against expensive forward passes. I used a single 1D Normal and a custom Op with a time.sleep() to artificially slow down the forward pass. (10 chains, 5k draws, running on my 4-core laptop)

Runtime [s]
sampler \ delay          0.0   0.5   1.0   2.0   3.0   4.0    5.0   10.0
Metropolis               7.6  44.5  43.1  60.8  84.9 107.5  129.9 238.3
Slice                   11.3 110.5 110.0 184.4 248.5 316.1 1326.4 752.7
DEMetropolis_sequential  3.2  69.3  69.2 110.1 164.9 206.6  250.1 473.9
DEMetropolis_parallel   19.5  42.3  44.6  62.5  73.8  70.2   83.7 139.6

Sampling rate [1/s]
sampler \ delay           0.0   0.5   1.0   2.0   3.0   4.0   5.0   10.0
Metropolis              2623.0 449.2 464.5 328.7 235.5 186.0 153.9  83.9
Slice                   1768.3 181.0 181.8 108.4  80.5  63.3  15.1  26.6
DEMetropolis_sequential 6252.8 288.7 288.9 181.7 121.3  96.8  80.0  42.2
DEMetropolis_parallel   1025.2 472.4 448.8 320.1 271.2 285.1 239.0 143.3

So even though the parallelized DEMetropolis is much worse with very fast forward passes, it outperforms the sequential version as soon as the forward pass takes more than a few ms.

@junpenglao
Copy link
Member

Could you please also add it to release note under feature?

@michaelosthege
Copy link
Member Author

michaelosthege commented Dec 5, 2017

@junpenglao Just added it. Thank you for reminding me!

Note: please Squash & Merge - there are some embarrassing bugs in the history 🙃

@junpenglao junpenglao merged commit dee6575 into pymc-devs:master Dec 5, 2017
@junpenglao
Copy link
Member

Great job @michaelosthege!

jordan-melendez pushed a commit to jordan-melendez/pymc3 that referenced this pull request Feb 6, 2018
* catching other error types that may occur when gradients are not available

* added a benchmark example for correlated dimensions

* marking Metropolis as COMPATIBLE for all types, added line breaks (code style)

* specifying single job to force the sample_many function

* modified sampling procedure to interate chains in parallel (instead of sequentially)

* updated description

* indexing samplers by chain number instead of chain id

* print transposes result table

* created PopulationArrayStepShared base class that allows the individual sampler to be aware of other chains

* modified sampling loop to account for the PopulationArrayStepShared sampler types

* added the DEMetropolis sampler

* raisig an error when the population is too small

* verbose debug logging

* removed debug print

* forcing CompoundStep type

* formatting

* setting DEMetropolis as a blocked step method

* measuring the runtime, example with both 2D z-variable and two 1D x,y variables

* changed the initialization order such that variable transforms are applied

* fixed a bug caused by start=None

* fixes a bug in computing lambda

* using a Uniform proposal with low initial scale

* renamed local variable

* logging the crossover and scaling

* fixed a bug that caused step methods to not be copied

* smarter multiprocessing

* automatic multiprocessing decision, reporting relative sampling rates

* print format

* inheriting PopulationArraySharedStep from ArrayStepShared, using a bij attribute because DEMetropolis will use it

* printing the number of effective samples per variable

* docstrings and comments

* falling back to sequential sampling if no population samplers are used

* removed debugging stats logging

* fixed nested if else

* updated print statement

* fixed a bug related to bijection updating

* docstring and comments

* refactoring for better clarity and less diff

* code style

* removed unused import

* fixed a bug where Slice was preferred on multidimensional variables

* printing the stepper hierarchy, fixed a variable name, handling non-CompoundStep methods

* fixed a bug where DEMetropolis assigned itself to discrete vars, fixed a bug that happened when only one var was assigned

* improved code style, including Slice in comparison

* including DEMetropolis in existing tests, added test case for PopulationSamplers

* fixes python 2.7 compatibility

* Using multiprocessing to parallelize iteration of chain populations (initial)

* added references

* added a warning that DEMetropolis is experimental

* forgotten space

* modified the PopulationStepper to automatically use parallelization or fallback to sequential mode

* avoiding a reimport, disabled chain parallelization by default

* increased nchains

* resolving conflicts

* resolving conflicts

* included DEMetropolis in new features
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants