lib5c.util.distributions module
Module containing utility functions for parametrizing statistical distributions.
lib5c.util.distributions.call_pvalues(obs, exp, var, dist_gen, log=False)
Call right-tail p-values for obs against a theoretical distribution whose family is specified by dist_gen and whose first two moments are specified by exp and var, respectively.
- Parameters
obs (float) – The observed value to call a p-value for.
exp, var (float) – The first two moments (mean and variance) of the null distribution.
dist_gen (scipy.stats.rv_generic or str) – The null distribution to parameterize. If a string is passed this will be replaced with getattr(scipy.stats, dist_gen).
log (bool) – Pass True to attempt to convert exp and var to log-scale.
- Returns
The right-tail p-value.
- Return type
float
Notes
This function is array-safe.
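As a rough illustration of what a right-tail call involves, the sketch below covers only the normal case and uses only the standard library. It evaluates the survival function P(X >= obs) for a normal null with mean exp and variance var. The name normal_right_tail_pvalue is hypothetical and is not part of lib5c; call_pvalues itself accepts any scipy.stats family, and its handling of discrete families and of log=True is not reproduced here.

```python
import math


def normal_right_tail_pvalue(obs, exp, var):
    """Right-tail p-value P(X >= obs) for X ~ Normal(mean=exp, variance=var).

    Hypothetical helper for illustration only; not part of lib5c.
    """
    if var <= 0:
        raise ValueError("var must be positive")
    # Standardize the observation against the null moments.
    z = (obs - exp) / math.sqrt(var)
    # Normal survival function via erfc, which stays accurate far in the tail.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # An observation at the null mean sits exactly in the middle of the distribution.
    print(normal_right_tail_pvalue(4.0, 4.0, 3.0))
    # An observation well above the null mean yields a small p-value.
    print(normal_right_tail_pvalue(9.0, 4.0, 3.0))
```

An observation equal to exp gives a p-value of 0.5, and the p-value shrinks as obs moves further above exp.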
lib5c.util.distributions.convert_parameters(mu, sigma_2, dist_gen, log=False)
Obtain correct scipy.stats parameterizations for selected one- and two-parameter distributions given a desired mean and variance.
- Parameters
mu (float) – The mean of the desired distribution.
sigma_2 (float) – The variance of the desired distribution.
dist_gen (scipy.stats.rv_generic) – The target distribution.
log (bool) – Pass True to attempt to convert mu and sigma_2 to log-scale.
- Returns
The appropriate scipy.stats parameters.
- Return type
tuple of float
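The conversion from a mean and variance to scipy.stats parameters can be sketched with the method of moments. The helper below is hypothetical (the name moments_to_scipy_params and the string keys are assumptions, not lib5c API) and covers only four families whose scipy.stats parameterizations are standard: poisson(mu), nbinom(n, p), norm(loc, scale) and logistic(loc, scale). How convert_parameters treats other families, log=True, or a negative binomial request with variance at or below the mean is not shown here.

```python
import math


def moments_to_scipy_params(dist_name, mu, sigma_2):
    """Map a mean and variance to scipy.stats-style parameter tuples.

    Hypothetical helper for illustration; not the lib5c implementation.
    """
    if dist_name == "poisson":
        # One parameter: the variance is forced to equal the mean.
        return (mu,)
    if dist_name == "nbinom":
        # scipy.stats.nbinom(n, p): mean = n(1-p)/p, variance = n(1-p)/p**2.
        if sigma_2 <= mu:
            raise ValueError("this sketch requires variance > mean for nbinom")
        p = mu / sigma_2
        n = mu ** 2 / (sigma_2 - mu)
        return (n, p)
    if dist_name == "norm":
        # loc is the mean, scale is the standard deviation.
        return (mu, math.sqrt(sigma_2))
    if dist_name == "logistic":
        # variance = scale**2 * pi**2 / 3, so scale = sqrt(3 * variance) / pi.
        return (mu, math.sqrt(3.0 * sigma_2) / math.pi)
    raise ValueError("unsupported distribution: %s" % dist_name)


if __name__ == "__main__":
    print(moments_to_scipy_params("nbinom", 3.0, 4.0))
    print(moments_to_scipy_params("logistic", 4.0, 3.0))
```

With mu = 3.0 and sigma_2 = 4.0, the nbinom branch gives n = 9 and p = 0.75, which reproduces the requested mean and variance under the formulas in the comments.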
lib5c.util.distributions.freeze_distribution(dist_gen, mu, sigma_2, log=False)
Create a frozen distribution of a given type, given a mean and variance.
- Parameters
dist_gen (scipy.stats.rv_generic) – The distribution to use.
mu (float) – The desired mean.
sigma_2 (float) – The desired variance.
log (bool) – Pass True to attempt to convert mu and sigma_2 to log-scale.
- Returns
A frozen distribution of the specified type with specified mean and variance.
- Return type
scipy.stats.rv_frozen
Notes
This function does not perform any fitting, because it assumes that the first two moments directly and uniquely identify the desired distribution.
Examples
>>> frozen_dist = freeze_distribution(stats.poisson, 4.0, 4.0)
>>> print('%s distribution with mean %.2f and variance %.2f'
...       % ((frozen_dist.dist.name,) + frozen_dist.stats(moments='mv')))
poisson distribution with mean 4.00 and variance 4.00

>>> frozen_dist = freeze_distribution(stats.nbinom, 4.0, 3.0)
>>> print('%s distribution with mean %.2f and variance %.2f'
...       % ((frozen_dist.dist.name,) + frozen_dist.stats(moments='mv')))
nbinom distribution with mean 4.00 and variance 4.00

>>> frozen_dist = freeze_distribution(stats.nbinom, 3.0, 4.0)
>>> print('%s distribution with mean %.2f and variance %.2f'
...       % ((frozen_dist.dist.name,) + frozen_dist.stats(moments='mv')))
nbinom distribution with mean 3.00 and variance 4.00

>>> frozen_dist = freeze_distribution(stats.norm, 4.0, 3.0)
>>> print('%s distribution with mean %.2f and variance %.2f'
...       % ((frozen_dist.dist.name,) + frozen_dist.stats(moments='mv')))
norm distribution with mean 4.00 and variance 3.00

>>> frozen_dist = freeze_distribution(stats.logistic, 4.0, 3.0)
>>> print('%s distribution with mean %.2f and variance %.2f'
...       % ((frozen_dist.dist.name,) + frozen_dist.stats(moments='mv')))
logistic distribution with mean 4.00 and variance 3.00

>>> mu = 2  # mean of a normal random variable X
>>> sigma_2 = 16  # variance of X
>>> scale = np.exp(mu)  # parameter conversion for scipy.stats.lognorm
>>> s = np.sqrt(sigma_2)  # ditto
>>> y = stats.lognorm(s=s, scale=scale)  # a lognormal RV: exp(X) = Y
>>> m, v = y.stats(moments='mv')  # mean and variance of Y
>>> m, v
(array(22026.4657948...), array(4.31123106e+15))
>>> frozen_dist = freeze_distribution(stats.logistic, m, v, log=True)
>>> print('%s distribution with mean %.2f and variance %.2f'
...       % ((frozen_dist.dist.name,) + frozen_dist.stats(moments='mv')))
logistic distribution with mean 2.00 and variance 16.00
lib5c.util.distributions.log_parameters(m, v)
Attempts to guess appropriate log-scale mean and variance parameters given non-log scale estimators of these quantities, under the assumptions of a lognormal model.
Based on https://en.wikipedia.org/wiki/Log-normal_distribution#Notation
- Parameters
m (float) – The non-log scale mean.
v (float) – The non-log scale variance.
- Returns
The first float is the log-scale mean, the second is the log-scale variance.
- Return type
float, float
Notes
This function is array-safe.
Examples
>>> mu = 2  # mean of a normal random variable X
>>> sigma_2 = 16  # variance of X
>>> scale = np.exp(mu)  # parameter conversion for scipy.stats.lognorm
>>> s = np.sqrt(sigma_2)  # ditto
>>> y = stats.lognorm(s=s, scale=scale)  # a lognormal RV: exp(X) = Y
>>> m, v = y.stats(moments='mv')  # mean and variance of Y
>>> m, v
(array(22026.4657948...), array(4.31123106e+15))
>>> log_parameters(m, v)  # recover moments of X from moments of Y
(2.0, 16.0)
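The lognormal identities behind this conversion are sigma_2 = ln(1 + v / m**2) and mu = ln(m) - sigma_2 / 2, where m and v are the mean and variance of Y = exp(X) and mu and sigma_2 are the mean and variance of the normal variable X. The scalar-only sketch below applies them with the standard library. The name lognormal_log_moments is hypothetical, and the code is an assumption about the approach rather than the lib5c implementation (which, per the Notes, is also array-safe).

```python
import math


def lognormal_log_moments(m, v):
    """Return (mu, sigma_2) of log(Y) given the mean m and variance v of a lognormal Y.

    Hypothetical scalar-only helper for illustration; not the lib5c implementation.
    """
    if m <= 0:
        raise ValueError("m must be positive under a lognormal model")
    if v < 0:
        raise ValueError("v must be non-negative")
    # sigma_2 = ln(1 + v / m**2); log1p keeps precision when v / m**2 is small.
    sigma_2 = math.log1p(v / m ** 2)
    # mu = ln(m) - sigma_2 / 2
    mu = math.log(m) - sigma_2 / 2.0
    return mu, sigma_2


if __name__ == "__main__":
    # Moments of Y = exp(X) for X with mean 2 and variance 16.
    m = math.exp(2.0 + 16.0 / 2.0)
    v = math.expm1(16.0) * math.exp(2.0 * 2.0 + 16.0)
    print(lognormal_log_moments(m, v))
```

Up to floating-point rounding, the demonstration recovers the log-scale mean 2 and variance 16 used to construct m and v.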