Support for Sparse Arrays #549

ahartikainen · 2019-01-18T15:42:35Z

In Stan, some types have only part of their structure defined or filled, e.g. a Cholesky decomposition. Our functions should support these structures.

We should support sparse arrays, but this is a problem with xarray.

ahartikainen · 2019-01-18T15:43:07Z

For xarray, see pydata/xarray#1375

GWeindel · 2019-01-28T12:00:52Z

Sorry for the delay.

When NaN neff parameters are present, I can't use the following functions with the full fitted object:

az.summary
az.bfmi
az.effective_sample_size

The user has to subset the fitted object to exclude the NaN values.

Sample of the error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-24a3fc920ff9> in <module>
----> 1 az.effective_sample_size(RT_fit)

~/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/diagnostics.py in effective_sample_size(data, var_names)
     59 
     60     dataset = dataset if var_names is None else dataset[var_names]
---> 61     return xr.apply_ufunc(_ess_ufunc, dataset, input_core_dims=(("chain", "draw"),))
     62 
     63 

~/anaconda2/envs/future/lib/python3.5/site-packages/xarray/core/computation.py in apply_ufunc(func, *args, **kwargs)
    980                                    fill_value=dataset_fill_value,
    981                                    dataset_join=dataset_join,
--> 982                                    keep_attrs=keep_attrs)
    983     elif any(isinstance(a, DataArray) for a in args):
    984         return apply_dataarray_ufunc(variables_ufunc, *args,

~/anaconda2/envs/future/lib/python3.5/site-packages/xarray/core/computation.py in apply_dataset_ufunc(func, *args, **kwargs)
    367     result_vars = apply_dict_of_variables_ufunc(
    368         func, *args, signature=signature, join=dataset_join,
--> 369         fill_value=fill_value)
    370 
    371     if signature.num_outputs > 1:

~/anaconda2/envs/future/lib/python3.5/site-packages/xarray/core/computation.py in apply_dict_of_variables_ufunc(func, *args, **kwargs)
    312     result_vars = OrderedDict()
    313     for name, variable_args in zip(names, grouped_by_name):
--> 314         result_vars[name] = func(*variable_args)
    315 
    316     if signature.num_outputs > 1:

~/anaconda2/envs/future/lib/python3.5/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, *args, **kwargs)
    559             raise ValueError('unknown setting for dask array handling in '
    560                              'apply_ufunc: {}'.format(dask))
--> 561     result_data = func(*input_data)
    562 
    563     if signature.num_outputs == 1:

~/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/diagnostics.py in _ess_ufunc(ary)
     70     target = np.empty(ary.shape[:-2])
     71     for idx in np.ndindex(target.shape):
---> 72         target[idx] = _get_ess(ary[idx])
     73     return target
     74 

~/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/diagnostics.py in _get_ess(sample_array)
    117             rho_hat_t[t + 2] = rho_hat_t[t + 1]
    118         t += 2
--> 119     ess = int((n_chain * n_draws) / (-1.0 + 2.0 * np.sum(rho_hat_t)))
    120     return ess
    121 

ValueError: cannot convert float NaN to integer

ahartikainen · 2019-01-28T12:59:32Z

Hi, I think that is fixable with the following

ess = int((n_chain * n_draws) / (-1.0 + 2.0 * np.sum(rho_hat_t))) if not np.any(np.isnan(rho_hat_t)) else np.nan

The problem comes from the fact that there is no integer NaN (until the latest pandas).
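A minimal sketch of that guard as a standalone function, in case it helps with testing. The name `safe_ess` and its signature are hypothetical and not part of ArviZ; the arithmetic is copied from the line shown in the traceback.

```python
import numpy as np


def safe_ess(n_chain, n_draws, rho_hat_t):
    """Hypothetical helper: return NaN instead of raising when rho_hat_t has NaN."""
    rho_hat_t = np.asarray(rho_hat_t, dtype=float)
    # int(nan) raises ValueError, so bail out before the conversion.
    if np.any(np.isnan(rho_hat_t)):
        return np.nan
    return int((n_chain * n_draws) / (-1.0 + 2.0 * np.sum(rho_hat_t)))


print(safe_ess(4, 100, [1.0, 0.5, 0.25]))  # 400 / (-1 + 2 * 1.75) = 160
print(safe_ess(4, 100, [1.0, np.nan]))     # nan
```

Note that the return type is then mixed (int or float NaN), which is the price of not having an integer NaN.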

ahartikainen · 2019-01-28T13:43:47Z

Hi, how was bfmi broken?

There is a "bug": you need to pass data.sample_stats.energy manually, so passing an InferenceData object directly is not going to work. This should be fixed (see #501).
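To illustrate what the function expects, here is a small sketch that applies the same formula as the `bfmi` body shown in the traceback to a plain energy array. The function name `bfmi_from_energy` and the toy values are made up for illustration; with real data the input would be something like `data.sample_stats.energy`.

```python
import numpy as np


def bfmi_from_energy(energy):
    """Sketch of the BFMI formula from the traceback, for a (chain, draw) array."""
    energy_mat = np.atleast_2d(energy)
    # Mean squared difference between consecutive draws, divided by the variance.
    num = np.square(np.diff(energy_mat, axis=1)).mean(axis=1)
    den = np.var(energy_mat, axis=1)
    return num / den


# Toy energy values for a single chain (made up, not real sampler output).
energy = np.array([[1.0, 2.0, 3.0, 4.0]])
print(bfmi_from_energy(energy))  # one value per chain
```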

GWeindel · 2019-01-28T16:24:20Z

Right, manually using sample_stats.energy works.

In case it helps, here is a sample of the warning and error I get:

/home/gabriel/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/stats.py:41: RuntimeWarning: Mean of empty slice.
  return np.square(np.diff(energy_mat, axis=1)).mean(axis=1) / np.var(energy_mat, axis=1)

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-28-07a9031ba490> in <module>
----> 1 az.bfmi(RT_fit)

~/anaconda2/envs/future/lib/python3.5/site-packages/arviz/stats/stats.py in bfmi(energy)
     39     """
     40     energy_mat = np.atleast_2d(energy)
---> 41     return np.square(np.diff(energy_mat, axis=1)).mean(axis=1) / np.var(energy_mat, axis=1)
     42 
     43 

~/anaconda2/envs/future/lib/python3.5/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
     76     if isinstance(ret, mu.ndarray):
     77         ret = um.true_divide(
---> 78                 ret, rcount, out=ret, casting='unsafe', subok=False)
     79         if is_float16_result and out is None:
     80             ret = arr.dtype.type(ret)

ZeroDivisionError: division by zero

ahartikainen · 2019-01-28T16:47:01Z

Yes, bfmi effectively makes np.array([InferenceData]) out of the input.
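A rough way to see this, using a hypothetical stand-in class rather than the real InferenceData (which I assume NumPy treats as an opaque Python object): the object gets wrapped into a single-element object array, so the difference along the draw axis is empty. That matches the "Mean of empty slice" warning above.

```python
import numpy as np


class FakeInferenceData:
    """Hypothetical stand-in for an InferenceData object (not the real class)."""


# NumPy cannot interpret the object as numeric data, so it is stored as one
# element of an object array.
wrapped = np.atleast_2d(FakeInferenceData())
print(wrapped.dtype, wrapped.shape)  # object (1, 1)

# With a single "draw" there are no consecutive pairs to difference.
steps = np.diff(wrapped, axis=1)
print(steps.shape)  # (1, 0)
```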
