Compute off-diagonal covariance matrix elements #16

vbonvin · 2016-10-18T07:32:21Z

We must devise a way to compute the off-diagonal elements of the covariance matrix such that it fits in the current philosophy of PyCS (i.e. a binning in true delays of the simulated light curves).

The diagonals elements must correspond to the current output of the measvstrue() function (bias+variance), and the off-diagonals elements would simply be co-variances.

TODO:

compute the off-diagonal coefficient using a range instead of min and max of the distribution
save the intermediate products somewhere (in a pickle, as an attribute of the runresults, ...?)
test script and fake data to play with
add a display function
create a notebook explaining how to use / interpret the results

…ing. Also added a few todo/warnings that should be addressed in future issues.

vbonvin · 2016-10-18T12:58:23Z

I've implemented the new function in pycs.sim.plot.newcovplot (see https://github.com/COSMOGRAIL/PyCS/blob/master/pycs/sim/plot.py#L700) that does the job. It contains two ways to compute the diagonal coefficients (see #17 ), that needs to be extended to the non-diagonal coefficients as well for consistency.

mtewes · 2016-10-18T15:46:17Z

Nice, will have a look tomorrow!

BTW, it would be great if we would have and maintain a code to "test" (and illustrate) this new function -- do you happen to have some already ? If yes, we could add clean standalone tests in a directory "tests" at the root of the PyCS repository.

vbonvin · 2016-10-18T15:52:40Z

I can write some test code, typically running on the tutorial data.

mtewes · 2016-10-18T15:54:22Z

... or just faking runresults. This might allow us to see if we recover covariances that are put into the test data.

vbonvin · 2016-10-18T18:08:57Z

FYI, I've updated the first post with todo items.

mtewes · 2016-10-19T12:59:19Z

Just for future reference, there are posts related to this here: https://github.com/shsuyu/H0LiCOW/issues/154

…tead of dependent binning. #16

vbonvin · 2016-11-04T16:20:41Z

Check this out @mtewes : I'm doing good progress with the display function. It's still work in progress and a few labels are missing here and there but the idea is in place. Executing the newcovplot function will now produce the following plots:

Diagonal elements are the error vs truedelay (as displayed by the measvstrue function), binned by true delays. Off-diagonal elements are the standard covariance plots for each pair (as displayed by the covplot function), using all the simulations.

Diagonal elements are the same as above. Off-diagonal elements are the covariance coefficients when binned in true delays. The x and y axis represents the true delays of the sims. The alpha of the tiles is proportionate to the covariance coefficients. I haven't figured out the best way to display which delay corresponds to which axis, but the order is the same than on the first plot above.

That's a detailed view on the AB vs AC panel from the 2nd figure above (2nd row, 1st column). Each panel here corresponds to one tile above. The binning is indicated in brackets in each panel.

vbonvin · 2016-11-08T08:24:23Z

New version of 2nd figure above:

It now displays above the subplots on the diagonal the estimation of the time-delay error following the standard PyCS procedure (sys, rand, tot error). Above the off-diagonal subplots, "mean" is the covariance computed on all the samples. On the off-diagonal subplots, the biggest covariance in the bins is in blue (the one returned in the final bias-covariance matrix). If one bin has less than X elements in it (X being used defined, here it is 10), then the covariance coefficient is artificially put at 0 and is displayed in red. The figure also display the user-defined parameters (number of bins, radius of draw, ...) to ease the comparison with different settings.

mtewes · 2016-11-08T09:00:07Z

Very nice check plot! I guess there will be a simpler display of the covariance for papers? Happy to skype about this.
Random idea, as these plots and their description look a bit intimidating: in a publication (and why not also in the doc), we should write down how to use this matrix (i.e., explicitly say how we would estimate a likelihood of some model delays, given our measurement.

…ned by the function. #16

vbonvin · 2016-11-10T09:37:02Z

I've created a set of fake runresults to test if the procedure is working, and it is. The procedure is as follow, and was surprisingly a bit tricky to come up with:

Randomly generate three independent "true" time delays, associated to AB, AC and AD. This translates into random time shifts applied to B, C and D, and fix the time shift of A to 0.
Generate the errors on the time delay measurement from a multivariate gaussian distribution:
- (a, b, c, d, e, f) = np.random.multivariate_normal(mean, cov)
Where a is the error on AB, b the error on AC, c the error on AD etc... . mean and cov can be chosen to influence the shape of the distribution. Doing so artificially put non-zero covariance coefficients in our error samples, as this is what we want to recover.
Select only the error samples that respects the relation between the errors (since e.g. the error on AB, AC and BC are not independent). This translates into the following condition:
- abs(a-c+e) < eps and abs(b+f-c) < eps and abs(a-b+d) < eps
Where eps is small but non-zero. The so selected samples will have a covariance matrix a bit different from the original draw, but will still be non-zero which what matters here
Create fake light curves A, B, C, D that have the true time shifts t from 1. and measured errors t+v, with v computed from 2. Fixing the error on A to 0, the errors on B, C, D becomes respectively -a, -b and -c.

Doing so allows me to control the covariance coefficients of the (a, b, c, d, e, f) matrix, but still pass runresults (i.e. light curves) to the newcovplot() function, thus testing the whole procedure.

This is what I get, drawing 1000 samples:

Considering all the samples together (i.e. the value in the subplot titles), I recover the input covariance to a few % (the smaller eps, the more precise this is). Now, when doing the 2d binning in true delays, I expect the variations by 2d-bins to be only statistical since drawing the errors (2.) is completely independent of drawing the true delays (1.). However, the variations of the covariance coefficient between the bins of a subplot can be quite large, i.e. from 0.36 to 0.57 in the AD vs BD panel. Since we do not want the statistical noise to be dominant in this procedure, I would rather recommend using a larger binning, where the scatter between bins is smaller (see below). And ultimately, it would be simply better to draw more simulated light curves when applying this procedure -10k would be a good start.

tl;dr : It works, but we need moar simulated lcs !

vbonvin added the enhancement label Oct 18, 2016

vbonvin self-assigned this Oct 18, 2016

vbonvin added a commit that referenced this issue Oct 18, 2016

first try of generalized covariance matrix. #16. Still need some test…

0b2e7f1

…ing. Also added a few todo/warnings that should be addressed in future issues.

vbonvin added a commit that referenced this issue Oct 18, 2016

Cleaning newcovplot function. #16

206ef0c

vbonvin mentioned this issue Oct 18, 2016

Dependency of time-delay error estimations on inadequate binning of the true time-delay distribution #17

Open

vbonvin added a commit that referenced this issue Nov 3, 2016

adding the computation of the off-diagonal elements using a range ins…

a162f55

…tead of dependent binning. #16

vbonvin added a commit that referenced this issue Nov 8, 2016

adding all the intermediate result in a big dictionnary that is retur…

a0e6f54

…ned by the function. #16

vbonvin added a commit that referenced this issue Mar 16, 2017

improving the display . #16

2a1348c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compute off-diagonal covariance matrix elements #16

Compute off-diagonal covariance matrix elements #16

vbonvin commented Oct 18, 2016 •

edited

Loading

vbonvin commented Oct 18, 2016

mtewes commented Oct 18, 2016

vbonvin commented Oct 18, 2016

mtewes commented Oct 18, 2016

vbonvin commented Oct 18, 2016

mtewes commented Oct 19, 2016

vbonvin commented Nov 4, 2016

vbonvin commented Nov 8, 2016

mtewes commented Nov 8, 2016

vbonvin commented Nov 10, 2016 •

edited

Loading

Compute off-diagonal covariance matrix elements #16

Compute off-diagonal covariance matrix elements #16

Comments

vbonvin commented Oct 18, 2016 • edited Loading

vbonvin commented Oct 18, 2016

mtewes commented Oct 18, 2016

vbonvin commented Oct 18, 2016

mtewes commented Oct 18, 2016

vbonvin commented Oct 18, 2016

mtewes commented Oct 19, 2016

vbonvin commented Nov 4, 2016

vbonvin commented Nov 8, 2016

mtewes commented Nov 8, 2016

vbonvin commented Nov 10, 2016 • edited Loading

vbonvin commented Oct 18, 2016 •

edited

Loading

vbonvin commented Nov 10, 2016 •

edited

Loading