lib5c.algorithms.variance.estimate_variance module

lib5c.algorithms.variance.estimate_variance.estimate_variance(obs_counts, exp_counts, key_rep=None, model='lognorm', source='deviation', source_kwargs=None, fitter='lowess', fitter_agg='lowess', fitter_kwargs=None, x_unit='dist', y_unit='disp', logx=False, logy=False, min_disp=1e-08, min_obs=2, min_dist=6, regional=False)

Convenience function for computing variance estimates.

Parameters
  • obs_counts (dict of np.ndarray or dict of dict of np.ndarray) – Counts dict of observed values (keys are region names, values are square symmetric matrices), or superdict (outer keys are replicate names, inner keys are region names, values are square symmetric matrices) if source='cross_rep'.

  • exp_counts (dict of np.ndarray) – Counts dict of expected values.

  • key_rep (str) – If obs_counts is a dict of dict of np.ndarray, pass a string naming the specific replicate to compute variance estimates for.

  • model ({'lognorm', 'loglogistic', 'nbinom', 'poisson'}) – Statistical model to use.

  • source ({'local', 'cross_rep', 'deviation', 'mle'}) – Specify the source of the variance estimates.

  • source_kwargs (dict) – Kwargs to pass through to the variance source function.

  • fitter ({'constant', 'group', 'lowess', 'none'}) – Select fitting method to use for trend fitting. Pass ‘none’ to skip trend fitting and simply return unfiltered point-wise estimates.

  • fitter_agg ({'median', 'mean', 'lowess'}) – If fitter is ‘group’ or ‘constant’, select what function to use to aggregate values (within groups for group fitting or across the whole dataset for constant fitting).

  • fitter_kwargs (dict) – Kwargs to pass through to the fitting function.

  • x_unit ({'dist', 'exp'}) – The x-unit to fit the variance relationship against.

  • y_unit ({'disp', 'var'}) – The y-unit to fit the variance relationship against. When model='nbinom', “disp” refers to the negative binomial dispersion parameter. When model='lognorm', “disp” refers to the variance parameter of the normal distribution describing the logarithm of the observed counts.

  • logx, logy (bool) – Pass True to fit the variance relationship on the scale of log(x) and/or log(y).

  • min_disp (float) – When model='nbinom', this sets the minimum value of the negative binomial dispersion parameter. When model='lognorm', this sets the minimum value of the variance of logged observed counts.

  • min_obs (float) – Points with observed values below this threshold in any replicate will be excluded from MLE estimation and relationship fitting.

  • min_dist (int) – Points with interaction distances (in bin units) below this threshold will be excluded from MLE estimation and relationship fitting.

  • regional (bool) – Pass True to perform MLE estimation and relationship fitting on a per-region basis.
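The min_obs and min_dist filters described in the parameter list can be pictured with a small NumPy sketch. This is not lib5c's implementation: selection_mask is a hypothetical helper, the replicate and region names are made up, and the sketch assumes that interaction distance means |i - j| in bin units and that points exactly at a threshold are kept.

```python
import numpy as np


def selection_mask(obs_by_rep, min_obs=2, min_dist=6):
    """Hypothetical helper: boolean mask of points kept for one region.

    obs_by_rep maps replicate name -> square matrix of observed counts.
    """
    mats = list(obs_by_rep.values())
    n = mats[0].shape[0]
    idx = np.arange(n)
    # Assumed definition of interaction distance: |i - j| in bin units.
    dist = np.abs(idx[:, None] - idx[None, :])
    # A point is dropped if it falls below min_obs in ANY replicate.
    obs_ok = np.all([m >= min_obs for m in mats], axis=0)
    return obs_ok & (dist >= min_dist)


n = 8
rep1 = np.full((n, n), 10.0)
rep2 = np.full((n, n), 10.0)
rep2[0, 7] = rep2[7, 0] = 1.0  # one low-count point in the second replicate

# Superdict layout: outer keys are replicate names, inner keys are region names.
obs_superdict = {'rep1': {'regionA': rep1}, 'rep2': {'regionA': rep2}}

mask = selection_mask(
    {rep: regions['regionA'] for rep, regions in obs_superdict.items()}
)
print(int(mask.sum()))  # number of points that survive both filters
```

With these toy matrices only the points at distance 6 survive, because the single point pair at distance 7 has a count of 1 in rep2 and is removed by min_obs.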

Returns

The variance estimates as a counts dict.

Return type

dict of np.ndarray
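As a usage sketch, the call below passes only keyword arguments that appear in the signature above. The wrapper estimate_for_replicate and the checker is_counts_dict are hypothetical names introduced for illustration, the toy region and replicate names are made up, and the lib5c import is deferred to call time so the sketch can be loaded without lib5c installed. Whether these particular argument values suit a given dataset is an assumption.

```python
import numpy as np


def is_counts_dict(d):
    """Hypothetical check: dict mapping region names to square 2-D arrays."""
    return isinstance(d, dict) and all(
        isinstance(m, np.ndarray) and m.ndim == 2 and m.shape[0] == m.shape[1]
        for m in d.values()
    )


def estimate_for_replicate(obs_superdict, exp_counts, key_rep):
    """Hypothetical wrapper around the documented function.

    Uses source='cross_rep', which per the parameter docs expects a
    superdict of observed counts, and key_rep to pick the replicate.
    """
    # Deferred import: only needed when the wrapper is actually called.
    from lib5c.algorithms.variance.estimate_variance import estimate_variance
    return estimate_variance(
        obs_superdict,
        exp_counts,
        key_rep=key_rep,
        model='nbinom',
        source='cross_rep',
        fitter='lowess',
        x_unit='dist',
        y_unit='disp',
        min_obs=2,
        min_dist=6,
        regional=False,
    )


# Toy inputs in the documented shapes (values are placeholders, not real data).
exp_counts = {'regionA': np.full((10, 10), 5.0)}
obs_superdict = {
    'rep1': {'regionA': np.full((10, 10), 6.0)},
    'rep2': {'regionA': np.full((10, 10), 4.0)},
}
inputs_ok = is_counts_dict(exp_counts) and all(
    is_counts_dict(rep) for rep in obs_superdict.values()
)
```

With lib5c installed, estimate_for_replicate(obs_superdict, exp_counts, 'rep1') would be expected to return a counts dict keyed by region name, which is_counts_dict can be used to sanity-check.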