lib5c.util.lowess module
Module for performing lowess fitting. Consists mostly of a convenience wrapper
around statsmodels.nonparametric.smoothers_lowess.lowess().
-
lib5c.util.lowess.constant_fit(x, y, logx=False, logy=False, agg='median')
Same signature as lowess_fit() and group_fit(), but instead of fitting y against x, simply applies an aggregating function to y.
- Parameters
x (Any) – Ignored, present only for signature parity with other fitters.
y (np.ndarray) – The y values to fit.
logx (Any) – Ignored, present only for signature parity with other fitters.
logy (bool) – Pass True to perform the fit on the scale of log(y).
agg ({'median', 'mean', 'lowess'}) – The function to use to aggregate y-values.
- Returns
This function takes in x values, ignores them completely, and simply returns the constant estimated y value on the original y scale (regardless of what is passed for logy).
- Return type
function
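The behaviour described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: constant_fit_sketch is a hypothetical name, the 'lowess' aggregator is omitted, and returning a list from the fitted function is an assumption.

```python
import math
import statistics


def constant_fit_sketch(x, y, logx=False, logy=False, agg="median"):
    """Hypothetical stand-in for constant_fit(); x and logx are ignored."""
    aggregators = {"median": statistics.median, "mean": statistics.fmean}
    # Aggregate on the log scale if requested.
    values = [math.log(v) for v in y] if logy else list(y)
    estimate = aggregators[agg](values)
    # Map the estimate back to the original y scale.
    if logy:
        estimate = math.exp(estimate)

    def fit(x_new):
        # The same constant is returned for every query point.
        return [estimate for _ in x_new]

    return fit


fit = constant_fit_sketch(None, [1.0, 10.0, 100.0], logy=True, agg="mean")
print(fit([5, 50, 500]))
```

With logy=True and agg='mean' the constant is the geometric mean of y, which is why the log scale can be useful for skewed data.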
-
lib5c.util.lowess.group_fit(x, y, logx=False, logy=False, agg='median', left_boundary=None, right_boundary=None, n_windows=100, window_width=0.2)
Simpler alternative to lowess fitting using a sliding-window aggregate (the median by default).
- Parameters
x, y – The x and y values to fit, respectively.
logx, logy – Pass True to perform the fit on the scale of log(x) and/or log(y), respectively.
agg ({'median', 'mean', 'lowess'}) – The function to use to aggregate within groups.
left_boundary, right_boundary – Allows specifying boundaries for the fit, in the original x space. If a float is passed, the returned fit will return the farthest left or farthest right lowess-estimated y_hat (from the original fitting set) for all points which are left or right of the specified left or right boundary point, respectively. Pass None to use linear extrapolation for these points instead.
n_windows (int) – The number of windows to use (spaced uniformly across the range of x).
window_width (float) – The width of each window, defined as a fraction of its x-value.
- Returns
This function takes in x values on the original x scale and returns estimated y values on the original y scale (regardless of what is passed for logx and logy). This function will still return sane estimates for y even at points not in the original fitting set by performing linear interpolation in the space the fit was performed in.
- Return type
function
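A rough sketch of the sliding-window idea, using only the standard library. The window geometry here (centers spaced uniformly over the range of x, each window spanning center ± half of window_width × center) is an assumption about what "a fraction of its x-value" means, and group_fit_sketch is a hypothetical name. Log scaling and boundary handling are left out for brevity.

```python
import statistics
from bisect import bisect_left


def group_fit_sketch(x, y, n_windows=5, window_width=0.5, agg=statistics.median):
    """Hypothetical sliding-window fit; window geometry is an assumption."""
    lo, hi = min(x), max(x)
    step = (hi - lo) / (n_windows - 1)
    centers, estimates = [], []
    for i in range(n_windows):
        center = lo + i * step
        half = 0.5 * window_width * abs(center)  # width scales with x
        members = [yi for xi, yi in zip(x, y) if abs(xi - center) <= half]
        if members:  # skip empty windows
            centers.append(center)
            estimates.append(agg(members))

    def fit(x_new):
        out = []
        for q in x_new:
            # Pick the segment containing q; the end segments also serve
            # points outside the fitted range (linear extrapolation).
            j = min(max(bisect_left(centers, q), 1), len(centers) - 1)
            x0, x1 = centers[j - 1], centers[j]
            y0, y1 = estimates[j - 1], estimates[j]
            out.append(y0 + (y1 - y0) * (q - x0) / (x1 - x0))
        return out

    return fit


x = list(range(1, 21))
y = [3.0] * 20
y[5] = 100.0  # a single outlier at x == 6
fit = group_fit_sketch(x, y)
print(fit([2.0, 6.0, 25.0]))
```

Because each window is summarized by its median, the single outlier does not pull the fitted curve away from the surrounding values.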
-
lib5c.util.lowess.lowess_agg(y, it=3)
Performs an aggregation operation equivalent to lowess. Should behave like an outlier-resistant mean.
- Parameters
y (np.ndarray) – The values to aggregate.
it (int) – The number of residual-based reweightings to perform.
- Returns
The lowess-implemented outlier-resistant mean.
- Return type
float
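To give a feel for what "residual-based reweightings" can do, here is a sketch of a mean that is iteratively reweighted with bisquare weights, the scheme used by lowess's robustness iterations. It is an analogue under that assumption, not the code behind lowess_agg(); robust_mean_sketch is a hypothetical name.

```python
import statistics


def robust_mean_sketch(y, it=3):
    """Bisquare-reweighted mean; an analogue of lowess_agg(), not its code."""
    estimate = statistics.fmean(y)  # start from the ordinary mean
    for _ in range(it):
        residuals = [abs(v - estimate) for v in y]
        scale = 6.0 * statistics.median(residuals)
        if scale == 0.0:
            break  # at least half the points already equal the estimate
        # Bisquare weights: points far from the estimate get weight zero.
        weights = [
            (1.0 - (r / scale) ** 2) ** 2 if r < scale else 0.0
            for r in residuals
        ]
        estimate = sum(w * v for w, v in zip(weights, y)) / sum(weights)
    return estimate


values = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 50.0]
print(statistics.fmean(values), robust_mean_sketch(values))
```

The ordinary mean is dragged toward the outlier, while the reweighted estimate stays close to the bulk of the data.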
-
lib5c.util.lowess.lowess_fit(x, y, logx=False, logy=False, left_boundary=None, right_boundary=None, frac=0.3, delta=0.01)
Opinionated convenience wrapper for lowess smoothing.
- Parameters
x, y – The x and y values to fit, respectively.
logx, logy – Pass True to perform the fit on the scale of log(x) and/or log(y), respectively.
left_boundary, right_boundary – Allows specifying boundaries for the fit, in the original x space. If a float is passed, the returned fit will return the farthest left or farthest right lowess-estimated y_hat (from the original fitting set) for all points which are left or right of the specified left or right boundary point, respectively. Pass None to use linear extrapolation for these points instead.
frac (float) – The lowess smoothing fraction to use.
delta (float) – Distance (on the scale of x or log(x)) within which to use linear interpolation when constructing the initial fit, expressed as a fraction of the range of x or log(x).
- Returns
This function takes in x values on the original x scale and returns estimated y values on the original y scale (regardless of what is passed for logx and logy). This function will still return sane estimates for y even at points not in the original fitting set by performing linear interpolation in the space the fit was performed in.
- Return type
function
Notes
No filtering of input values is performed; clients are expected to handle this if desired. NaN values should not break the function, but x points with zero values passed when logx is True are expected to break the function, since log(0) is undefined.
The default value of the delta parameter is set to be non-zero, matching the behavior of lowess smoothing in R and improving performance.
Linear interpolation between x-values in the original fitting set is used to provide a familiar functional interface to the fitted function.
Boundary conditions on the fitted function are exposed via left_boundary and right_boundary, mostly as a convenience for points where x == 0 when fitting was performed on the scale of log(x).
When left_boundary or right_boundary is None (the default), the fitted function will be linearly extrapolated for points beyond the lowest and highest x-values in x.
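The interpolation, extrapolation, and boundary rules in these notes can be illustrated with a small piecewise-linear function over already-fitted points. This is a sketch under stated assumptions: make_fit_sketch is a hypothetical helper, xs and ys stand in for the sorted fitted x-values and their estimates, and log scaling is omitted.

```python
from bisect import bisect_left


def make_fit_sketch(xs, ys, left_boundary=None, right_boundary=None):
    """Hypothetical fitted function over sorted knots (xs, ys)."""

    def line(q):
        # Interpolate inside the knots, extrapolate linearly outside them.
        j = min(max(bisect_left(xs, q), 1), len(xs) - 1)
        slope = (ys[j] - ys[j - 1]) / (xs[j] - xs[j - 1])
        return ys[j - 1] + slope * (q - xs[j - 1])

    def fit(q):
        # Beyond a boundary, return the outermost fitted value instead.
        if left_boundary is not None and q < left_boundary:
            return ys[0]
        if right_boundary is not None and q > right_boundary:
            return ys[-1]
        return line(q)

    return fit


knots_x, knots_y = [1.0, 2.0, 3.0], [10.0, 20.0, 40.0]
free = make_fit_sketch(knots_x, knots_y)
clamped = make_fit_sketch(knots_x, knots_y, left_boundary=1.0, right_boundary=3.0)
print(free(0.0), free(2.5), free(4.0))
print(clamped(0.0), clamped(2.5), clamped(4.0))
```

With boundaries set, queries such as x == 0 receive the nearest fitted value rather than an extrapolated one, which is the convenience the notes describe for fits performed on the scale of log(x).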