Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Sparse Arrays #549

Open
ahartikainen opened this issue Jan 18, 2019 · 6 comments
Open

Support for Sparse Arrays #549

ahartikainen opened this issue Jan 18, 2019 · 6 comments

Comments

@ahartikainen
Copy link
Contributor

In Stan some types have only part of the structure defined/filled - e.g. Cholesky decomposition. Our functions should support these structures.

We should support sparse arrays, but this is a problem with xarray.

@ahartikainen
Copy link
Contributor Author

For xarray pydata/xarray#1375

@GWeindel
Copy link
Contributor

Sorry for the delay.

When NaN neff parameters are present I can't use the following functions with the full fitted object :

  • az.summary
  • az.bfmi
  • az.effective_sample_size

User has to subset the fitted object to exclude NaN

Sample of the error :

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-24a3fc920ff9> in <module>
----> 1 az.effective_sample_size(RT_fit)

~/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/diagnostics.py in effective_sample_size(data, var_names)
     59 
     60     dataset = dataset if var_names is None else dataset[var_names]
---> 61     return xr.apply_ufunc(_ess_ufunc, dataset, input_core_dims=(("chain", "draw"),))
     62 
     63 

~/anaconda2/envs/future/lib/python3.5/site-packages/xarray/core/computation.py in apply_ufunc(func, *args, **kwargs)
    980                                    fill_value=dataset_fill_value,
    981                                    dataset_join=dataset_join,
--> 982                                    keep_attrs=keep_attrs)
    983     elif any(isinstance(a, DataArray) for a in args):
    984         return apply_dataarray_ufunc(variables_ufunc, *args,

~/anaconda2/envs/future/lib/python3.5/site-packages/xarray/core/computation.py in apply_dataset_ufunc(func, *args, **kwargs)
    367     result_vars = apply_dict_of_variables_ufunc(
    368         func, *args, signature=signature, join=dataset_join,
--> 369         fill_value=fill_value)
    370 
    371     if signature.num_outputs > 1:

~/anaconda2/envs/future/lib/python3.5/site-packages/xarray/core/computation.py in apply_dict_of_variables_ufunc(func, *args, **kwargs)
    312     result_vars = OrderedDict()
    313     for name, variable_args in zip(names, grouped_by_name):
--> 314         result_vars[name] = func(*variable_args)
    315 
    316     if signature.num_outputs > 1:

~/anaconda2/envs/future/lib/python3.5/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, *args, **kwargs)
    559             raise ValueError('unknown setting for dask array handling in '
    560                              'apply_ufunc: {}'.format(dask))
--> 561     result_data = func(*input_data)
    562 
    563     if signature.num_outputs == 1:

~/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/diagnostics.py in _ess_ufunc(ary)
     70     target = np.empty(ary.shape[:-2])
     71     for idx in np.ndindex(target.shape):
---> 72         target[idx] = _get_ess(ary[idx])
     73     return target
     74 

~/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/diagnostics.py in _get_ess(sample_array)
    117             rho_hat_t[t + 2] = rho_hat_t[t + 1]
    118         t += 2
--> 119     ess = int((n_chain * n_draws) / (-1.0 + 2.0 * np.sum(rho_hat_t)))
    120     return ess
    121 

ValueError: cannot convert float NaN to integer

@ahartikainen
Copy link
Contributor Author

Hi, I think that is fixable with the following

ess = int((n_chain * n_draws) / (-1.0 + 2.0 * np.sum(rho_hat_t))) if not np.any(np.isnan(rho_hat_t))) else np.nan

This comes from the problem that there is not integer NaN (until latest pandas)

@ahartikainen
Copy link
Contributor Author

Hi, how was bfmi broken?

There is a "bug" that you need to manually use data.sample_stats.energy. So inserting InferenceData is not going to work. This should be fixed (see. #501 )

@GWeindel
Copy link
Contributor

Right, manually use sample_stats.energy works.

In case, a sample form the warning I get

/home/gabriel/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/stats.py:41: RuntimeWarning: Mean of empty slice.
  return np.square(np.diff(energy_mat, axis=1)).mean(axis=1) / np.var(energy_mat, axis=1)

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-28-07a9031ba490> in <module>
----> 1 az.bfmi(RT_fit)

~/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/stats.py in bfmi(energy)
     39     """
     40     energy_mat = np.atleast_2d(energy)
---> 41     return np.square(np.diff(energy_mat, axis=1)).mean(axis=1) / np.var(energy_mat, axis=1)
     42 
     43 

~/anaconda2/envs/future/lib/python3.5/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
     76     if isinstance(ret, mu.ndarray):
     77         ret = um.true_divide(
---> 78                 ret, rcount, out=ret, casting='unsafe', subok=False)
     79         if is_float16_result and out is None:
     80             ret = arr.dtype.type(ret)

ZeroDivisionError: division by zero

@ahartikainen
Copy link
Contributor Author

Yes, it makes np.array([InferenceData])

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants