Bias mitigation¶
5C read counts are strongly influenced by bias factors which are dictated by
intrinsic properties of the restriction fragments and primers involved in the
reactions. Moreover, the influence of these bias factors can vary from replicate
to replicate. lib5c includes a wide variety of algorithms for mitigating
the effects of these bias factors.
Approaches¶
A variety of approaches to bias mitigation are possible.
Explicit normalization (spline)¶
This approach fits splines to the three-dimensional surfaces obtained by plotting each entry of the contact matrix on the z-axis, with the x- and y-axis positions set by some property of the upstream and downstream fragments involved in the ligation junction that the entry represents. The fitted splines can then be subtracted from the experimentally observed data.
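The fit-then-subtract idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the lib5c implementation: a piecewise-constant surface of binned means stands in for the fitted spline, the values are assumed to be log-scale counts (so that subtraction is meaningful), and the function names `bin_index`, `fit_binned_surface` and `subtract_surface` as well as the toy data are hypothetical.

```python
def bin_index(value, lo, hi, n_bins):
    """Map a fragment property value to an equal-width bin index."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls in the last bin


def fit_binned_surface(xs, ys, zs, n_bins=2):
    """Fit a piecewise-constant surface z(x, y) from binned means.

    xs: property of the upstream fragment for each matrix entry
    ys: property of the downstream fragment for each matrix entry
    zs: log-scale contact value for each matrix entry
    Returns the surface (bin -> mean z) and the bin key of every entry.
    """
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    sums, counts, keys = {}, {}, []
    for x, y, z in zip(xs, ys, zs):
        key = (bin_index(x, x_lo, x_hi, n_bins),
               bin_index(y, y_lo, y_hi, n_bins))
        keys.append(key)
        sums[key] = sums.get(key, 0.0) + z
        counts[key] = counts.get(key, 0) + 1
    surface = {key: sums[key] / counts[key] for key in sums}
    return surface, keys


def subtract_surface(zs, surface, keys):
    """Subtract the fitted surface, re-centering on the overall mean."""
    grand_mean = sum(zs) / len(zs)
    return [z - surface[key] + grand_mean for z, key in zip(zs, keys)]


# Toy data: the value grows with both fragment properties (the "bias").
xs = [0.1, 0.2, 0.1, 0.2, 0.8, 0.9, 0.8, 0.9]
ys = [0.1, 0.2, 0.8, 0.9, 0.1, 0.2, 0.8, 0.9]
zs = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2, 4.0, 4.2]

surface, keys = fit_binned_surface(xs, ys, zs, n_bins=2)
corrected = subtract_surface(zs, surface, keys)
```

After the subtraction, the trend along both property axes is gone and only the variation within each bin remains. A real spline fit replaces the binned means with a smooth surface but follows the same two steps.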
To perform this normalization on the command line, run
$ lib5c spline
Splines can be visualized by running
$ lib5c plot visualize-spline
The exposed function for performing the spline normalization is lib5c.algorithms.spline_normalization.iterative_spline_normalization().
The exposed function for visualizing the spline is lib5c.plotters.splines.visualize_spline().
Simple matrix balancing approaches (kr and iced)¶
Matrix balancing approaches attempt to equalize the row sums of the contact matrix, without knowing anything about the intrinsic properties of the restriction fragments.
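As a rough illustration of what "equalizing the row sums" means, the sketch below balances a small symmetric matrix by iterative rescaling. It is a generic, simplified scheme with square-root damping and is not the Knight-Ruiz or ICED implementation. The function name `balance_matrix` and the toy counts are hypothetical, and the sketch assumes a square, symmetric matrix whose rows all have positive sums.

```python
import math


def balance_matrix(matrix, max_iter=1000, tol=1e-10):
    """Rescale a symmetric matrix so that all row sums become equal.

    Returns (balanced, bias), where
    balanced[i][j] == matrix[i][j] / (bias[i] * bias[j]).
    Assumes every row has a positive sum.
    """
    n = len(matrix)
    balanced = [list(row) for row in matrix]
    bias = [1.0] * n
    for _ in range(max_iter):
        row_sums = [sum(row) for row in balanced]
        mean_sum = sum(row_sums) / n
        # The square root damps the update; dividing by the full ratio
        # on both the row and the column side can oscillate.
        factors = [math.sqrt(s / mean_sum) for s in row_sums]
        if max(abs(f - 1.0) for f in factors) < tol:
            break
        for i in range(n):
            for j in range(n):
                balanced[i][j] /= factors[i] * factors[j]
        bias = [b * f for b, f in zip(bias, factors)]
    return balanced, bias


# Toy symmetric contact matrix with unequal row sums (16, 30, 13).
counts = [
    [10.0, 4.0, 2.0],
    [4.0, 20.0, 6.0],
    [2.0, 6.0, 5.0],
]
balanced, bias = balance_matrix(counts)
row_sums = [sum(row) for row in balanced]
```

Note that only the matrix itself is used: no fragment or primer properties enter the calculation, which is the defining feature of this family of approaches.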
The Knight-Ruiz matrix balancing algorithm can be used by running
$ lib5c kr
The exposed function is lib5c.algorithms.knight_ruiz.kr_balance_matrix().
The ICED matrix balancing algorithm is implemented by iced, and an easy-to-use interface to this package is exposed in lib5c.
It can be used on the command line by running
$ lib5c iced
if iced has been installed by running
$ pip install iced
The exposed function is lib5c.contrib.iced.balancing.iced_balance_matrix().
Advanced matrix balancing (express)¶
The Express matrix balancing algorithm takes into account a simple one-dimensional distance-dependent expected model when balancing, which can improve balancing performance given the wide dynamic range of interactions across any given row of the interaction matrix.
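To make the role of the expected model concrete, the sketch below fits per-fragment bias factors so that the row sums of the observed matrix match those of a bias-scaled, distance-dependent expected matrix. This is a simplified illustration of balancing against an expected model and is not the Express algorithm as implemented in lib5c. The function names `expected_from_distance` and `fit_bias_against_expected`, the inverse-distance decay, and the toy data are all assumptions.

```python
import math


def expected_from_distance(n, decay=1.0):
    """Toy one-dimensional expected model: signal decays with separation."""
    return [[1.0 / (1.0 + abs(i - j)) ** decay for j in range(n)]
            for i in range(n)]


def fit_bias_against_expected(observed, expected, max_iter=1000, tol=1e-12):
    """Find bias factors b such that, for every row i,
    sum_j observed[i][j] matches sum_j b[i] * b[j] * expected[i][j].
    """
    n = len(observed)
    observed_sums = [sum(row) for row in observed]
    bias = [1.0] * n
    for _ in range(max_iter):
        model_sums = [
            bias[i] * sum(bias[j] * expected[i][j] for j in range(n))
            for i in range(n)
        ]
        # Damped multiplicative update toward matching row sums.
        updates = [math.sqrt(o / m) for o, m in zip(observed_sums, model_sums)]
        bias = [b * u for b, u in zip(bias, updates)]
        if max(abs(u - 1.0) for u in updates) < tol:
            break
    return bias


# Simulate observed data as expected signal distorted by known biases.
true_bias = [0.5, 1.0, 2.0, 1.5]
n = len(true_bias)
expected = expected_from_distance(n)
observed = [[true_bias[i] * true_bias[j] * expected[i][j] for j in range(n)]
            for i in range(n)]

bias = fit_bias_against_expected(observed, expected)
corrected = [[observed[i][j] / (bias[i] * bias[j]) for j in range(n)]
             for i in range(n)]
```

Because the expected model absorbs the steep distance dependence, the fitted factors are not dominated by the few short-range, high-count entries in each row, which is the motivation given above for this kind of balancing.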
The Express algorithm can be used on the command line by running
$ lib5c express
The “Joint Express” variant first described in this library can be used by running
$ lib5c express -J
The exposed functions are:
Assessing bias factor profiles¶
Bias factor profiles can be visualized by running
$ lib5c plot bias-heatmap
The exposed function is lib5c.plotters.bias_heatmaps.plot_bias_heatmap().
The overall balance of a contact matrix can be visualized by running
$ lib5c plot boxplot
The exposed function is lib5c.plotters.boxplots.plot_regional_locus_boxplot().