lib5c.plotters.bias_heatmaps module

Module for plotting bias heatmaps.

lib5c.plotters.bias_heatmaps.plot_bias_heatmap(obs_counts, exp_counts, primermap, factor, bins=None, n_bins=None, cmap=None, vmin=None, vmax=None, midpoint=None, log=True, region=None, agg=<function gmean>, asymmetric=False, print_variance=False, shuffle=0, zero_inflated=False, unique=False, despine=False, style='dark', dpi=300, **kwargs)

Plots a bias heatmap.

Parameters
  • obs_counts (Dict[str, np.ndarray]) – The dict of observed counts.

  • exp_counts (Dict[str, np.ndarray]) – The dict of expected counts.

  • primermap (Dict[str, List[Dict[str, Any]]]) – Primermap or pixelmap describing the loci in obs_counts and exp_counts.

  • factor (str) – The bias factor to draw the bias heatmap for. This string must match a metadata key in primermap. For example, if factor is 'length', then primermap[region][i]['length'] is expected to be a number giving the length of the i-th fragment in the region named by region.

  • bins (Optional[Sequence[numeric]]) – The endpoints of the bins to use to stratify the bias factor values. Either bins or n_bins must be specified.

  • n_bins (Optional[int]) – The number of even-number bins to use to stratify the bias factor values. Either bins or n_bins must be specified.

  • cmap (Optional[matplotlib.colors.Colormap]) – Pass a colormap to use for the heatmap. If this kwarg is not passed, the default ‘bias’ colormap is used.

  • vmin (Optional[float]) – The minimum value to use for the heatmap. If this kwarg is not passed, the min of the data will be used.

  • vmax (Optional[float]) – The maximum value to use for the heatmap. If this kwarg is not passed, the max of the data will be used.

  • midpoint (Optional[float]) – The midpoint value to use for the colormap. If this kwarg is not passed, the colormap will be symmetric about its midpoint. This kwarg can be used to force the midpoint of the colormap to lie at a desired value, such as 0.

  • log (bool) – Whether or not to show log-scale fold-enrichments in the heatmap.

  • region (Optional[str]) – Pass a region name as a string to consider only the contacts in one particular region. If this kwarg is not passed, contacts for all regions in the input counts dicts will be used to generate the bias heatmap.

  • agg (Callable[[np.ndarray], float]) – The aggregation function to use when summarizing the strata. This function should take in an array of floats and return a single summary value.

  • asymmetric (bool) – Pass True to construct heatmaps using only the upper-triangular elements of the counts matrices, which can lead to asymmetric heatmaps. By default, the algorithm iterates over all elements of the counts matrices, enforcing symmetry in the bias models but incurring some redundancy in the actual counts information.

  • print_variance (bool) – If True, the variance of the bias across the stratification grid will be printed in the plot title.

  • shuffle (int) – The number of random-permutation simulations to perform under the null hypothesis. The default of 0 performs no simulations.

  • zero_inflated (bool) – Pass True here to treat the bias factor as “zero inflated”, which will cause all the zero values to land in a dedicated “zero stratum” and allocate the remaining bins evenly among the positive data. This kwarg is ignored if the bins are passed explicitly.

  • unique (bool) – Pass True to override bins and n_bins and simply put each unique value of the bias factor into its own stratum.

  • dpi (int) – DPI to save figure at if auto-saving to a raster format.

  • kwargs (kwargs) – Typical plotter kwargs.
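The stratification described by factor, bins, and agg can be sketched in a few lines. This is an illustrative sketch only, not lib5c's implementation: the toy data, the helper names stratum_of and bias_grid, and the use of nested lists in place of np.ndarray are all assumptions made for the example, and statistics.geometric_mean stands in for the default gmean aggregator.

```python
import bisect
from statistics import geometric_mean

# Hypothetical toy primermap: one region, four fragments with a 'length' key.
primermap = {
    "regionA": [
        {"length": 100},
        {"length": 150},
        {"length": 400},
        {"length": 450},
    ]
}

# Hypothetical observed and expected counts (symmetric 4x4, nested lists
# standing in for the np.ndarray values the real function expects).
obs_counts = {"regionA": [
    [8.0, 4.0, 2.0, 2.0],
    [4.0, 8.0, 2.0, 2.0],
    [2.0, 2.0, 8.0, 16.0],
    [2.0, 2.0, 16.0, 8.0],
]}
exp_counts = {"regionA": [[4.0] * 4 for _ in range(4)]}


def stratum_of(value, bins):
    """Index of the bin [bins[k], bins[k+1]) that holds value."""
    k = bisect.bisect_right(bins, value) - 1
    # Clamp so values at or beyond the outer endpoints land in the end bins.
    return min(max(k, 0), len(bins) - 2)


def bias_grid(obs, exp, primermap, factor, bins, region, agg=geometric_mean):
    """Aggregate obs/exp fold enrichment for every pair of strata."""
    n_strata = len(bins) - 1
    strata = [stratum_of(frag[factor], bins) for frag in primermap[region]]
    pooled = [[[] for _ in range(n_strata)] for _ in range(n_strata)]
    # Iterate over all matrix elements, so the resulting grid is symmetric.
    for i, si in enumerate(strata):
        for j, sj in enumerate(strata):
            pooled[si][sj].append(obs[region][i][j] / exp[region][i][j])
    # Empty strata pairs are reported as nan.
    return [[agg(cell) if cell else float("nan") for cell in row]
            for row in pooled]


grid = bias_grid(obs_counts, exp_counts, primermap, "length",
                 bins=[0, 250, 500], region="regionA")
```

Here grid[a][b] summarizes the fold enrichment of all contacts between fragments in length stratum a and fragments in length stratum b, which is the quantity a bias heatmap colors.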

Returns

The first element is the pyplot axis plotted on, which is always present. The second element is the variance of the enrichment with respect to the bias factor grid. The third element is the percentile value for this variance obtained from simulations under the null hypothesis. The fourth element is the 95% RI for the null hypothesis. The last three elements are nan if no simulations were performed.

Return type

Tuple[pyplot axis, float]

Notes

The simulations were a cool idea, but in practice the 95% null hypothesis RI is extremely narrow, because it is very unlikely to see any enrichment at all once the observed and expected counts are forcibly de-correlated from the bias factors. The recommendation is therefore to skip the simulations unless you are sure you need them.
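To make the idea of a permutation null concrete, the following sketch shuffles the stratum labels of the loci and compares the observed grid variance against the shuffled ones. It is a sketch under stated assumptions, not lib5c's code: the helper names grid_variance and permutation_null, the toy data, and the choice to shuffle stratum labels (rather than any other way of de-correlating counts from bias factors) are all hypothetical.

```python
import random
from statistics import geometric_mean, pvariance


def grid_variance(ratios, strata):
    """Variance of the per-cell geometric-mean enrichment over the grid."""
    pooled = {}
    for i, si in enumerate(strata):
        for j, sj in enumerate(strata):
            pooled.setdefault((si, sj), []).append(ratios[i][j])
    # Only populated strata pairs contribute to the variance.
    cells = [geometric_mean(values) for values in pooled.values()]
    return pvariance(cells)


def permutation_null(ratios, strata, n_shuffles, seed=0):
    """Observed variance, its percentile in the null, and the null values."""
    rng = random.Random(seed)
    observed = grid_variance(ratios, strata)
    null = []
    for _ in range(n_shuffles):
        shuffled = strata[:]
        rng.shuffle(shuffled)  # break the link between loci and factor values
        null.append(grid_variance(ratios, shuffled))
    percentile = 100.0 * sum(v <= observed for v in null) / n_shuffles
    return observed, percentile, null


# Hypothetical toy data: six loci in two strata, with obs/exp ratios of 2.0
# within a stratum and 0.5 between strata.
strata = [0, 0, 0, 1, 1, 1]
ratios = [[2.0 if strata[i] == strata[j] else 0.5 for j in range(6)]
          for i in range(6)]

observed, percentile, null = permutation_null(ratios, strata, n_shuffles=200)
```

In this toy case the enrichment is perfectly aligned with the strata, so shuffling the labels can only flatten the grid and the observed variance sits at the top of the null distribution.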