lib5c.plotters.extendable.gene_extendable_heatmap module
Module for the GeneExtendableHeatmap class, which adds gene track plotting functionality for the extendable heatmap system.
-
class lib5c.plotters.extendable.gene_extendable_heatmap.GeneExtendableHeatmap(array, grange_x, grange_y=None, colorscale=None, colormap='obs', norm=None)
Bases: lib5c.plotters.extendable.base_extendable_heatmap.BaseExtendableHeatmap
ExtendableHeatmap mixin class providing gene track plotting functionality.
To deal with the fact that genes may overlap (e.g., where a gene has multiple isoforms), this class uses the concept of “gene stacks”. Each gene track in a gene stack represents a separate axis added to the ExtendableHeatmap. By packing a set of genes into separate “rows”, functions like add_gene_stack() can plot each row in the stack as a separate gene track via add_gene_track().
Most commonly, we will want to add reference gene tracks corresponding to a particular genome assembly. To make this easy, this class provides the add_refgene_stack() and add_refgene_stacks() functions.
-
add_gene_stack(genes, loc='bottom', size='3%', pad_before=0.0, pad_within=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, padding=1000, colors=None)
Adds one stack of gene tracks along either the x- or y-axis of the heatmap by packing one set of genes into rows and calling add_gene_track() once for every row.
- Parameters
genes (list of dict) –
Each dict should represent a gene and may have the following keys:
{ 'chrom' : str, 'start' : int, 'end' : int, 'name' : str, 'id' : str, 'strand': '+' or '-', 'blocks': list of dicts }
Each block is a dict representing one exon, with the following structure:
{ 'start': int, 'end' : int }
The ‘name’ and ‘id’ keys are optional and are only used when color-coding genes. See lib5c.parsers.genes.load_genes().
loc ({'top', 'bottom', 'left', 'right'}) – Which side of the heatmap to add the new gene tracks to.
size (str) – The size of each new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.
pad_before (float) – The padding to use between the existing parts of the figure and the newly added gene tracks.
pad_within (float) – The padding to use between each newly added gene track.
axis_limits (tuple of float) – Axis limits for the non-genomic axis of each new gene track.
intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.
exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.
padding (int) – The padding to use when packing genes into rows, in units of base pairs. Genes that are within this many base pairs of each other will get packed into different rows.
colors (dict, optional) – A dict mapping gene names or IDs to matplotlib colors, used to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.
- Returns
The newly added gene track axes, one for each row of genes.
- Return type
list of pyplot axes
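The row-packing step described above can be pictured with a small greedy sketch. This is not lib5c's actual implementation: the function name pack_genes_into_rows and the first-fit strategy are assumptions made for illustration, and the gene coordinates are made up. Each gene goes into the first row where it starts more than padding base pairs past the last gene already placed there.

```python
def pack_genes_into_rows(genes, padding=1000):
    """Greedy first-fit packing of gene dicts into rows.

    Hypothetical helper for illustration only; not part of lib5c.
    """
    rows = []      # each row is a list of gene dicts
    row_ends = []  # 'end' coordinate of the last gene placed in each row
    for gene in sorted(genes, key=lambda g: g['start']):
        for i, row_end in enumerate(row_ends):
            # the gene fits if it starts more than `padding` bp past the row's end
            if gene['start'] - row_end > padding:
                rows[i].append(gene)
                row_ends[i] = gene['end']
                break
        else:
            # no existing row has room, so start a new row
            rows.append([gene])
            row_ends.append(gene['end'])
    return rows


# made-up genes in the documented dict format ('name' and 'id' omitted)
genes = [
    {'chrom': 'chr1', 'start': 1000, 'end': 5000, 'strand': '+', 'blocks': []},
    {'chrom': 'chr1', 'start': 3000, 'end': 8000, 'strand': '-', 'blocks': []},
    {'chrom': 'chr1', 'start': 5500, 'end': 9000, 'strand': '+', 'blocks': []},
    {'chrom': 'chr1', 'start': 12000, 'end': 15000, 'strand': '+', 'blocks': []},
]
rows = pack_genes_into_rows(genes, padding=1000)
```

In this sketch, each element of rows would then be passed to add_gene_track() as one row of genes. A larger padding spreads nearby genes across more rows, while padding=0 only separates genes that actually overlap.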
-
add_gene_stacks(genes, size='3%', pad_before=0.0, pad_within=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, padding=1000, colors=None)
Adds a gene stack for a set of genes to both the bottom and left sides of the heatmap by calling add_gene_stack() twice.
- Parameters
genes (list of dict) –
Each dict should represent a gene and may have the following keys:
{ 'chrom' : str, 'start' : int, 'end' : int, 'name' : str, 'id' : str, 'strand': '+' or '-', 'blocks': list of dicts }
Each block is a dict representing one exon, with the following structure:
{ 'start': int, 'end' : int }
The ‘name’ and ‘id’ keys are optional and are only used when color-coding genes. See lib5c.parsers.genes.load_genes().
size (str) – The size of each new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.
pad_before (float) – The padding to use between the existing parts of the figure and the newly added gene tracks.
pad_within (float) – The padding to use between each newly added gene track.
axis_limits (tuple of float) – Axis limits for the non-genomic axis of each new gene track.
intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.
exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.
padding (int) – The padding to use when packing genes into rows, in units of base pairs. Genes that are within this many base pairs of each other will get packed into different rows.
colors (dict, optional) – A dict mapping gene names or IDs to matplotlib colors, used to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.
- Returns
The first element of the outer list is a list of the newly added horizontal gene track axes, one for each row of genes. The second element is the same but for the newly added vertical gene track axes.
- Return type
list of lists of pyplot axes
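The nested return value is easiest to see with a stand-in. The sketch below is not lib5c code: add_to_both_sides and fake_add_stack are hypothetical names, the stand-in returns string labels instead of real axes, and calling the bottom side first is an assumption. It only illustrates the "call the single-side method twice" pattern and the resulting [horizontal, vertical] structure.

```python
def add_to_both_sides(add_stack, genes, **kwargs):
    """Call a single-side `add_stack` callable for the bottom and the left.

    `add_stack` stands in for a bound method such as `h.add_gene_stack`.
    Hypothetical illustration only; not part of lib5c.
    """
    horizontal = add_stack(genes, loc='bottom', **kwargs)
    vertical = add_stack(genes, loc='left', **kwargs)
    return [horizontal, vertical]


def fake_add_stack(genes, loc='bottom', **kwargs):
    # stand-in that returns one label per gene row instead of real axes
    return ['%s_row_%d' % (loc, i) for i in range(2)]


# the outer list unpacks into the horizontal and vertical track lists
horizontal_axes, vertical_axes = add_to_both_sides(fake_add_stack, genes=[])
```

With a real heatmap instance, the same unpacking would apply to the result of add_gene_stacks(), giving one list of axes per side with one axis per gene row.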
-
add_gene_track(genes, loc='bottom', size='3%', pad=0.0, new_ax_name='genes', axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, colors=None)
Adds one gene track (for one row of genes) along either the x- or y-axis of the heatmap.
- Parameters
genes (list of dict) –
Each dict should represent a gene and may have the following keys:
{ 'chrom' : str, 'start' : int, 'end' : int, 'name' : str, 'id' : str, 'strand': '+' or '-', 'blocks': list of dicts }
Each block is a dict representing one exon, with the following structure:
{ 'start': int, 'end' : int }
The ‘name’ and ‘id’ keys are optional and are only used when color-coding genes. See lib5c.parsers.genes.load_genes().
loc ({'top', 'bottom', 'left', 'right'}) – Which side of the heatmap to add the new gene track to.
size (str) – The size of the new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.
pad (float) – The padding to use between the existing parts of the figure and the newly added axis.
new_ax_name (str) – The name for the new axis. You can access the new axis later at h[name] where h is this ExtendableHeatmap instance.
axis_limits (tuple of float) – Axis limits for the non-genomic axis of the gene track.
intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.
exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.
colors (dict, optional) – A dict mapping gene names or IDs to matplotlib colors, used to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.
- Returns
The newly added gene track axis.
- Return type
pyplot axis
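The genes format and the name-versus-ID behavior of colors can be illustrated with a small lookup sketch. The helper pick_gene_color is hypothetical and not part of lib5c, the gene names and IDs are made up, and checking 'id' before 'name' is an assumption (the documentation does not say which takes precedence when both match).

```python
def pick_gene_color(gene, colors=None, default='black'):
    """Resolve the color for one gene dict.

    Hypothetical helper, not part of lib5c. Checking 'id' before 'name'
    is an assumption so that an isoform-specific entry wins.
    """
    colors = colors or {}
    if gene.get('id') in colors:
        return colors[gene['id']]
    if gene.get('name') in colors:
        return colors[gene['name']]
    return default


# made-up genes: two isoforms of one gene plus an unrelated gene
genes = [
    {'chrom': 'chr1', 'start': 1000, 'end': 9000, 'name': 'GeneA',
     'id': 'GeneA_iso1', 'strand': '+',
     'blocks': [{'start': 1000, 'end': 2000}, {'start': 8000, 'end': 9000}]},
    {'chrom': 'chr1', 'start': 1000, 'end': 6000, 'name': 'GeneA',
     'id': 'GeneA_iso2', 'strand': '+',
     'blocks': [{'start': 1000, 'end': 2000}, {'start': 5000, 'end': 6000}]},
    {'chrom': 'chr1', 'start': 20000, 'end': 25000, 'name': 'GeneB',
     'id': 'GeneB_iso1', 'strand': '-',
     'blocks': [{'start': 20000, 'end': 25000}]},
]

# keying by name colors every isoform; keying by ID colors only one
by_name = [pick_gene_color(g, {'GeneA': 'red'}) for g in genes]
by_id = [pick_gene_color(g, {'GeneA_iso2': 'blue'}) for g in genes]
```

Genes lacking the optional 'name' and 'id' keys simply fall through to the default color in this sketch.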
-
add_gene_tracks(genes, size='3%', pad=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, colors=None)
Adds a gene track for a single row of genes to both the bottom and left sides of the heatmap by calling add_gene_track() twice.
- Parameters
genes (list of dict) –
Each dict should represent a gene and may have the following keys:
{ 'chrom' : str, 'start' : int, 'end' : int, 'name' : str, 'id' : str, 'strand': '+' or '-', 'blocks': list of dicts }
Each block is a dict representing one exon, with the following structure:
{ 'start': int, 'end' : int }
The ‘name’ and ‘id’ keys are optional and are only used when color-coding genes. See lib5c.parsers.genes.load_genes().
size (str) – The size of the new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.
pad (float) – The padding to use between the existing parts of the figure and the newly added axis.
axis_limits (tuple of float) – Axis limits for the non-genomic axis of the gene track.
intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.
exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.
colors (dict, optional) – A dict mapping gene names or IDs to matplotlib colors, used to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.
- Returns
The newly added gene track axes.
- Return type
list of pyplot axes
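One way to picture how axis_limits, intron_height, and exon_height relate is to compute the rectangles a single gene might occupy on a horizontal track. This is a sketch under assumptions, not lib5c's drawing code: the helper name gene_rectangles is hypothetical, and centering the gene on the midpoint of axis_limits, with heights measured in the same units, is a guess made for illustration.

```python
def gene_rectangles(gene, axis_limits=(0, 1), intron_height=0.05,
                    exon_height=0.5):
    """Return (x, y, width, height) tuples: one thin gene-body bar, then exons.

    Hypothetical sketch, not lib5c code. Centering on the midpoint of
    `axis_limits` is an assumption.
    """
    mid = (axis_limits[0] + axis_limits[1]) / 2.0
    # thin bar spanning the whole gene body (visible where introns are)
    rects = [(gene['start'], mid - intron_height / 2.0,
              gene['end'] - gene['start'], intron_height)]
    # thicker box for each exon block
    for block in gene['blocks']:
        rects.append((block['start'], mid - exon_height / 2.0,
                      block['end'] - block['start'], exon_height))
    return rects


# made-up gene with two exons
gene = {'chrom': 'chr1', 'start': 1000, 'end': 5000, 'strand': '+',
        'blocks': [{'start': 1000, 'end': 1500}, {'start': 4000, 'end': 5000}]}
rects = gene_rectangles(gene)
```

Under these assumptions, the default values make exons occupy half of the track's non-genomic extent and introns a twentieth of it, which matches the documented advice to pass larger numbers for thicker features.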
-
add_refgene_stack(assembly, loc='bottom', size='3%', pad_before=0.0, pad_within=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, padding=1000, colors=None)
Adds a gene stack to either the x- or y-axis of the heatmap by getting a set of reference genes for a specified genome assembly and then passing that set of genes to add_gene_stack().
- Parameters
assembly ({'hg18', 'hg19', 'hg38', 'mm9', 'mm10'}) – The genome assembly to load reference genes for.
loc ({'top', 'bottom', 'left', 'right'}) – Which side of the heatmap to add the new gene tracks to.
size (str) – The size of each new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.
pad_before (float) – The padding to use between the existing parts of the figure and the newly added gene tracks.
pad_within (float) – The padding to use between each newly added gene track.
axis_limits (tuple of float) – Axis limits for the non-genomic axis of each new gene track.
intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.
exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.
padding (int) – The padding to use when packing genes into rows, in units of base pairs. Genes that are within this many base pairs of each other will get packed into different rows.
colors (dict, optional) – A dict mapping gene names or IDs to matplotlib colors, used to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.
- Returns
The newly added gene track axes, one for each row of genes.
- Return type
list of pyplot axes
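Before reference genes are handed to add_gene_stack(), they presumably need to be narrowed down to the genomic range being plotted. The sketch below shows one plain-Python way to do that. It is an illustration only: genes_in_region is a hypothetical name, the region dict with 'chrom', 'start', and 'end' keys is an assumed format, and the coordinates are made up. How lib5c itself selects the genes is not stated here.

```python
def genes_in_region(genes, region):
    """Keep only the gene dicts that overlap `region`.

    Hypothetical helper, not part of lib5c. `region` is assumed to be a
    dict with 'chrom', 'start', and 'end' keys.
    """
    return [
        g for g in genes
        if g['chrom'] == region['chrom']
        and g['start'] <= region['end']
        and g['end'] >= region['start']
    ]


# made-up reference genes and plotting range
reference_genes = [
    {'chrom': 'chr1', 'start': 1000, 'end': 5000, 'strand': '+', 'blocks': []},
    {'chrom': 'chr1', 'start': 90000, 'end': 95000, 'strand': '-', 'blocks': []},
    {'chrom': 'chr2', 'start': 2000, 'end': 4000, 'strand': '+', 'blocks': []},
]
region = {'chrom': 'chr1', 'start': 0, 'end': 50000}
visible = genes_in_region(reference_genes, region)
```

The filtered list has the same format as the genes argument of add_gene_stack(), so it could be passed on unchanged.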
-
add_refgene_stacks(assembly, size='3%', pad_before=0.0, pad_within=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, padding=1000, colors=None)
Adds a stack of reference genes for the specified genome assembly to both the bottom and left sides of the heatmap by calling add_refgene_stack() twice.
- Parameters
assembly ({'hg18', 'hg19', 'hg38', 'mm9', 'mm10'}) – The genome assembly to load reference genes for.
size (str) – The size of each new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.
pad_before (float) – The padding to use between the existing parts of the figure and the newly added gene tracks.
pad_within (float) – The padding to use between each newly added gene track.
axis_limits (tuple of float) – Axis limits for the non-genomic axis of each new gene track.
intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.
exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.
padding (int) – The padding to use when packing genes into rows, in units of base pairs. Genes that are within this many base pairs of each other will get packed into different rows.
colors (dict, optional) – A dict mapping gene names or IDs to matplotlib colors, used to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.
- Returns
The first element of the outer list is a list of the newly added horizontal gene track axes, one for each row of genes. The second element is the same but for the newly added vertical gene track axes.
- Return type
list of lists of pyplot axes
-