Bias mitigation

5C read counts are strongly influenced by bias factors that are determined by intrinsic properties of the restriction fragments and primers involved in the reactions. Moreover, the influence of these bias factors can vary from replicate to replicate. lib5c includes several algorithms for mitigating the effects of these bias factors.

Approaches

Three families of approaches are described below: explicit normalization with splines, simple matrix balancing, and advanced matrix balancing with Express.

Explicit normalization (spline)

This approach fits splines to the three-dimensional surfaces obtained by plotting each entry of the contact matrix on the z-axis, with its x- and y-axis positions set according to some property of the upstream and downstream fragments involved in the ligation junction that the entry represents. The fitted splines can then be subtracted from the experimentally observed data.

To perform this normalization on the command line, run

$ lib5c spline

Splines can be visualized by running

$ lib5c plot visualize-spline

The exposed function for performing the spline normalization is lib5c.algorithms.spline_normalization.iterative_spline_normalization().

The exposed function for visualizing the spline is lib5c.plotters.splines.visualize_spline().
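
The idea behind this approach (fit a bias surface over a pair of fragment properties, then subtract it) can be illustrated with a minimal sketch. The sketch is not lib5c's implementation: it replaces the spline with a much cruder binned-mean surface, assumes the counts have already been log-transformed, and uses hypothetical names (subtract_binned_surface, prop, edges) together with GC content as a purely illustrative fragment property.

```python
from bisect import bisect_right
from collections import defaultdict


def subtract_binned_surface(log_counts, prop, edges):
    """Subtract a binned-mean bias surface from a matrix of log counts.

    log_counts: square list of lists; entry [i][j] is the log count for
        the ligation between fragment i and fragment j.
    prop: one property value per fragment (illustrative, e.g. GC content).
    edges: sorted bin edges used to discretize the property.
    """
    # Assign every fragment to a property bin.
    bins = [bisect_right(edges, p) for p in prop]

    # Mean log count for each (bin, bin) cell of the surface.
    # This crude surface stands in for the fitted spline.
    totals = defaultdict(float)
    sizes = defaultdict(int)
    for i, row in enumerate(log_counts):
        for j, value in enumerate(row):
            totals[bins[i], bins[j]] += value
            sizes[bins[i], bins[j]] += 1
    surface = {cell: totals[cell] / sizes[cell] for cell in totals}

    # Subtract the surface value at each entry's (x, y) position.
    return [
        [value - surface[bins[i], bins[j]] for j, value in enumerate(row)]
        for i, row in enumerate(log_counts)
    ]


# Toy data: two low-GC and two high-GC fragments, where each log count
# is nothing but the sum of two per-bin bias terms.
gc = [0.30, 0.35, 0.60, 0.65]
bias = [0.0 if g < 0.5 else 1.0 for g in gc]
log_counts = [[bias[i] + bias[j] for j in range(4)] for i in range(4)]
residual = subtract_binned_surface(log_counts, gc, edges=[0.5])
```

Because the toy counts contain only bias, every entry of residual is zero after subtraction. With real data, the residual would retain whatever signal the surface does not explain.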

Simple matrix balancing approaches (kr and iced)

Matrix balancing approaches attempt to equalize the row sums of the contact matrix, without knowing anything about the intrinsic properties of the restriction fragments.
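
As a minimal sketch of what "equalizing the row sums" means, the example below balances a small matrix by simple iterative rescaling, in the spirit of iterative correction. It is not the Knight-Ruiz algorithm and not the code used by iced or lib5c. The function name balance_matrix and its parameters are hypothetical, and the sketch assumes a symmetric matrix with strictly positive entries.

```python
def balance_matrix(matrix, n_iter=200):
    """Rescale a symmetric, positive matrix so that every row sums to 1.

    Returns the balanced matrix and the per-row bias factors, such that
    balanced[i][j] == matrix[i][j] / (bias[i] * bias[j]).
    """
    n = len(matrix)
    bias = [1.0] * n
    for _ in range(n_iter):
        # Row sums of the matrix corrected by the current bias estimates.
        row_sums = [
            sum(matrix[i][j] / (bias[i] * bias[j]) for j in range(n))
            for i in range(n)
        ]
        # The square root splits each correction evenly between the
        # row factor and the column factor of the symmetric matrix.
        bias = [b * s ** 0.5 for b, s in zip(bias, row_sums)]
    balanced = [
        [matrix[i][j] / (bias[i] * bias[j]) for j in range(n)]
        for i in range(n)
    ]
    return balanced, bias


# Toy symmetric contact matrix with unequal row sums (16, 15 and 11).
contacts = [
    [10.0, 4.0, 2.0],
    [4.0, 8.0, 3.0],
    [2.0, 3.0, 6.0],
]
balanced, bias = balance_matrix(contacts)
balanced_row_sums = [sum(row) for row in balanced]
```

Note that the procedure uses only the matrix itself: no fragment length, GC content, or other intrinsic property enters the calculation.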

The Knight-Ruiz matrix balancing algorithm can be used by running

$ lib5c kr

The exposed function is lib5c.algorithms.knight_ruiz.kr_balance_matrix().

The ICED matrix balancing algorithm is implemented by the iced package, and lib5c exposes an easy-to-use interface to it.

It can be used on the command line by running

$ lib5c iced

This requires iced, which can be installed by running

$ pip install iced

The exposed function is lib5c.contrib.iced.balancing.iced_balance_matrix().

Advanced matrix balancing (express)

The Express matrix balancing algorithm takes into account a simple one-dimensional distance-dependent expected model when balancing, which can improve balancing performance given the wide dynamic range of interactions across any given row of the interaction matrix.
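
The general idea of balancing against a distance-dependent expected model can be sketched as follows. This is a simplified illustration, not the Express algorithm as implemented in lib5c. It assumes each count is modeled as expected(distance) * bias[i] * bias[j], uses the index separation |i - j| as a stand-in for genomic distance, and all names (balance_with_expected, expected_by_distance) are hypothetical.

```python
def balance_with_expected(counts, expected_by_distance, n_iter=200):
    """Estimate per-fragment bias factors relative to an expected model.

    counts: symmetric square list of lists of observed counts.
    expected_by_distance: expected_by_distance[d] is the expected count
        at separation d (here d = |i - j|, an illustrative simplification).
    """
    n = len(counts)
    bias = [1.0] * n
    observed = [sum(row) for row in counts]
    for _ in range(n_iter):
        # Row sums predicted by the model under the current biases.
        predicted = [
            sum(
                expected_by_distance[abs(i - j)] * bias[i] * bias[j]
                for j in range(n)
            )
            for i in range(n)
        ]
        # Move each bias toward the value that matches the observed
        # and predicted row sums.
        bias = [
            b * (o / p) ** 0.5
            for b, o, p in zip(bias, observed, predicted)
        ]
    return bias


# Toy data generated exactly from the model, so the true biases are known.
expected = [8.0, 4.0, 2.0, 1.0]
true_bias = [1.5, 0.8, 1.2, 0.6]
counts = [
    [expected[abs(i - j)] * true_bias[i] * true_bias[j] for j in range(4)]
    for i in range(4)
]
estimated_bias = balance_with_expected(counts, expected)
```

Comparing row sums against the expected model, rather than against a constant, keeps the strong short-range entries of each row from dominating the bias estimate.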

The Express algorithm can be used on the command line by running

$ lib5c express

The “Joint Express” variant first described in this library can be used by running

$ lib5c express -J

The exposed functions are:

Assessing bias factor profiles

Bias factor profiles can be visualized by running

$ lib5c plot bias-heatmap

The exposed function is lib5c.plotters.bias_heatmaps.plot_bias_heatmap().

The overall balance of a contact matrix can be visualized by running

$ lib5c plot boxplot

The exposed function is lib5c.plotters.boxplots.plot_regional_locus_boxplot().