lib5c.plotters.extendable.gene_extendable_heatmap module

Module for the GeneExtendableHeatmap class, which adds gene track plotting functionality for the extendable heatmap system.

class lib5c.plotters.extendable.gene_extendable_heatmap.GeneExtendableHeatmap(array, grange_x, grange_y=None, colorscale=None, colormap='obs', norm=None)

Bases: lib5c.plotters.extendable.base_extendable_heatmap.BaseExtendableHeatmap

ExtendableHeatmap mixin class providing gene track plotting functionality.

To deal with the fact that genes may overlap (e.g., where a gene has multiple isoforms), this class uses the concept of “gene stacks”. Each gene track in a gene stack represents a separate axis added to the ExtendableHeatmap. By packing a set of genes into separate “rows”, functions like add_gene_stack() can plot each row in the stack as a separate gene track via add_gene_track().

Most commonly, we will want to add reference gene tracks corresponding to a particular genome assembly. To make this easy, this class provides the add_refgene_stack() and add_refgene_stacks() functions.

add_gene_stack(genes, loc='bottom', size='3%', pad_before=0.0, pad_within=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, padding=1000, colors=None)

Adds one stack of gene tracks along either the x- or y-axis of the heatmap by packing one set of genes into rows and calling add_gene_track() once for every row.

Parameters
  • genes (list of dict) –

    Each dict should represent a gene and could have the following keys:

    {
        'chrom' : str,
        'start' : int,
        'end'   : int,
        'name'  : str,
        'id'    : str,
        'strand': '+' or '-',
        'blocks': list of dicts
    }
    

    Each block is a dict representing one exon, with the following structure:

    {
        'start': int,
        'end'  : int
    }
    

    The ‘name’ and ‘id’ keys are optional and are only used when color-coding genes. See lib5c.parsers.genes.load_genes().

  • loc ({'top', 'bottom', 'left', 'right'}) – Which side of the heatmap to add the new gene tracks to.

  • size (str) – The size of each new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.

  • pad_before (float) – The padding to use between the existing parts of the figure and the newly added gene tracks.

  • pad_within (float) – The padding to use between each newly added gene track.

  • axis_limits (tuple of float) – Axis limits for the non-genomic axis of each new gene track.

  • intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.

  • exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.

  • padding (int) – The padding to use when packing genes into rows, in units of base pairs. Genes that are within this many base pairs of each other will get packed into different rows.

  • colors (dict, optional) – Pass a dict mapping gene names or IDs to matplotlib colors to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.

Returns

The newly added gene track axes, one for each row of genes.

Return type

list of pyplot axes
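The row-packing step controlled by padding can be pictured with a small greedy sketch. This is an illustration only, not lib5c's actual packing code: the function name pack_genes_into_rows is hypothetical, all genes are assumed to lie on the same chromosome, and the choice to treat a gap of exactly padding base pairs as "too close" is an assumption.

```python
def pack_genes_into_rows(genes, padding=1000):
    """Greedily assign genes to rows (hypothetical helper, not part of lib5c).

    A gene joins the first row whose last gene ends more than `padding`
    base pairs before it starts; otherwise a new row is opened.
    Assumes every gene is on the same chromosome.
    """
    rows = []      # each row is a list of gene dicts
    row_ends = []  # 'end' coordinate of the last gene placed in each row
    for gene in sorted(genes, key=lambda g: g['start']):
        for i, last_end in enumerate(row_ends):
            if gene['start'] - last_end > padding:
                rows[i].append(gene)
                row_ends[i] = gene['end']
                break
        else:
            # no existing row has room, so start a new one
            rows.append([gene])
            row_ends.append(gene['end'])
    return rows


# Illustrative coordinates only
genes = [
    {'chrom': 'chr1', 'start': 1000, 'end': 5000, 'name': 'A'},
    {'chrom': 'chr1', 'start': 4000, 'end': 9000, 'name': 'B'},  # overlaps A
    {'chrom': 'chr1', 'start': 7000, 'end': 8000, 'name': 'C'},  # 2000 bp after A
]
rows = pack_genes_into_rows(genes, padding=1000)
row_names = [[g['name'] for g in row] for row in rows]
```

Under these assumptions, A and C share the first row and B is pushed to a second row, so a stack built from these genes would need two gene tracks.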

add_gene_stacks(genes, size='3%', pad_before=0.0, pad_within=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, padding=1000, colors=None)

Adds a gene stack for a set of genes to both the bottom and left sides of the heatmap by calling add_gene_stack() twice.

Parameters
  • genes (list of dict) –

    Each dict should represent a gene and could have the following keys:

    {
        'chrom' : str,
        'start' : int,
        'end'   : int,
        'name'  : str,
        'id'    : str,
        'strand': '+' or '-',
        'blocks': list of dicts
    }
    

    Each block is a dict representing one exon, with the following structure:

    {
        'start': int,
        'end'  : int
    }
    

    The ‘name’ and ‘id’ keys are optional and are only used when color-coding genes. See lib5c.parsers.genes.load_genes().

  • size (str) – The size of each new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.

  • pad_before (float) – The padding to use between the existing parts of the figure and the newly added gene tracks.

  • pad_within (float) – The padding to use between each newly added gene track.

  • axis_limits (tuple of float) – Axis limits for the non-genomic axis of each new gene track.

  • intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.

  • exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.

  • padding (int) – The padding to use when packing genes into rows, in units of base pairs. Genes that are within this many base pairs of each other will get packed into different rows.

  • colors (dict, optional) – Pass a dict mapping gene names or IDs to matplotlib colors to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.

Returns

The first element of the outer list is a list of the newly added horizontal gene track axes, one for each row of genes. The second element is the same but for the newly added vertical gene track axes.

Return type

list of lists of pyplot axes
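As a concrete picture of the genes argument, the sketch below builds one gene dict by hand and runs a few sanity checks on it. The helper check_gene and the constant REQUIRED_KEYS are hypothetical and not part of lib5c; treating every key other than ‘name’ and ‘id’ as required, and requiring exons to lie inside the gene, are assumptions. The gene name, ID, and coordinates are made up.

```python
# Keys assumed to be required; 'name' and 'id' are documented as optional.
REQUIRED_KEYS = ('chrom', 'start', 'end', 'strand', 'blocks')


def check_gene(gene):
    """Return a list of problems found in a gene dict (empty if none).

    Hypothetical helper for illustration; lib5c may validate differently
    or not at all.
    """
    problems = ['missing key: %s' % k for k in REQUIRED_KEYS if k not in gene]
    if problems:
        return problems
    if gene['strand'] not in ('+', '-'):
        problems.append("strand must be '+' or '-'")
    if gene['start'] > gene['end']:
        problems.append('start is after end')
    for block in gene['blocks']:
        inside = gene['start'] <= block['start'] <= block['end'] <= gene['end']
        if not inside:
            problems.append('exon %d-%d lies outside the gene'
                            % (block['start'], block['end']))
    return problems


# A made-up two-exon gene on the plus strand
gene = {
    'chrom': 'chr3',
    'start': 34000000,
    'end': 34020000,
    'name': 'example_gene',
    'id': 'example_isoform_1',
    'strand': '+',
    'blocks': [
        {'start': 34000000, 'end': 34001500},
        {'start': 34018000, 'end': 34020000},
    ],
}
problems = check_gene(gene)
```

In practice, a list of such dicts would more likely come from lib5c.parsers.genes.load_genes() than be written by hand.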

add_gene_track(genes, loc='bottom', size='3%', pad=0.0, new_ax_name='genes', axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, colors=None)

Adds one gene track (for one row of genes) along either the x- or y-axis of the heatmap.

Parameters
  • genes (list of dict) –

    Each dict should represent a gene and could have the following keys:

    {
        'chrom' : str,
        'start' : int,
        'end'   : int,
        'name'  : str,
        'id'    : str,
        'strand': '+' or '-',
        'blocks': list of dicts
    }
    

    Each block is a dict representing one exon, with the following structure:

    {
        'start': int,
        'end'  : int
    }
    

    The ‘name’ and ‘id’ keys are optional and are only used when color-coding genes. See lib5c.parsers.genes.load_genes().

  • loc ({'top', 'bottom', 'left', 'right'}) – Which side of the heatmap to add the new gene track to.

  • size (str) – The size of the new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.

  • pad (float) – The padding to use between the existing parts of the figure and the newly added axis.

  • new_ax_name (str) – The name for the new axis. You can access the new axis later at h[name] where h is this ExtendableHeatmap instance.

  • axis_limits (tuple of float) – Axis limits for the non-genomic axis of the gene track.

  • intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.

  • exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.

  • colors (dict, optional) – Pass a dict mapping gene names or IDs to matplotlib colors to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.

Returns

The newly added gene track axis.

Return type

pyplot axis
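The name-versus-ID behavior of colors can be sketched as a simple lookup. The function resolve_gene_color is hypothetical and not part of lib5c, and letting an ID match take precedence over a name match is an assumption; the documentation above only says which genes each kind of key should affect.

```python
def resolve_gene_color(gene, colors=None, default='black'):
    """Pick a color for one gene dict (hypothetical helper, not lib5c code).

    Assumption: a key matching the gene's 'id' wins over a key matching
    its 'name'. Genes matching neither fall back to `default`.
    """
    if not colors:
        return default
    if gene.get('id') in colors:
        return colors[gene['id']]
    if gene.get('name') in colors:
        return colors[gene['name']]
    return default


# Two made-up isoforms of one gene plus an unrelated gene
isoform_1 = {'name': 'GeneA', 'id': 'iso1'}
isoform_2 = {'name': 'GeneA', 'id': 'iso2'}
other = {'name': 'GeneB', 'id': 'iso3'}

colors = {'GeneA': 'red', 'iso2': 'blue'}
picked = [resolve_gene_color(g, colors) for g in (isoform_1, isoform_2, other)]
```

Here the name key colors both GeneA isoforms red unless an ID key overrides it, so picked is ['red', 'blue', 'black'].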

add_gene_tracks(genes, size='3%', pad=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, colors=None)

Adds a gene track for a single row of genes to both the bottom and left sides of the heatmap by calling add_gene_track() twice.

Parameters
  • genes (list of dict) –

    Each dict should represent a gene and could have the following keys:

    {
        'chrom' : str,
        'start' : int,
        'end'   : int,
        'name'  : str,
        'id'    : str,
        'strand': '+' or '-',
        'blocks': list of dicts
    }
    

    Each block is a dict representing one exon, with the following structure:

    {
        'start': int,
        'end'  : int
    }
    

    The ‘name’ and ‘id’ keys are optional and are only used when color-coding genes. See lib5c.parsers.genes.load_genes().

  • size (str) – The size of the new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.

  • pad (float) – The padding to use between the existing parts of the figure and the newly added axis.

  • axis_limits (tuple of float) – Axis limits for the non-genomic axis of the gene track.

  • intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.

  • exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.

  • colors (dict, optional) – Pass a dict mapping gene names or IDs to matplotlib colors to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.

Returns

The newly added gene track axes.

Return type

list of pyplot axes

add_refgene_stack(assembly, loc='bottom', size='3%', pad_before=0.0, pad_within=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, padding=1000, colors=None)

Adds a gene stack to either the x- or y-axis of the heatmap by loading the set of reference genes for the specified genome assembly and passing it to add_gene_stack().

Parameters
  • assembly ({'hg18', 'hg19', 'hg38', 'mm9', 'mm10'}) – The genome assembly to load reference genes for.

  • loc ({'top', 'bottom', 'left', 'right'}) – Which side of the heatmap to add the new gene tracks to.

  • size (str) – The size of each new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.

  • pad_before (float) – The padding to use between the existing parts of the figure and the newly added gene tracks.

  • pad_within (float) – The padding to use between each newly added gene track.

  • axis_limits (tuple of float) – Axis limits for the non-genomic axis of each new gene track.

  • intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.

  • exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.

  • padding (int) – The padding to use when packing genes into rows, in units of base pairs. Genes that are within this many base pairs of each other will get packed into different rows.

  • colors (dict, optional) – Pass a dict mapping gene names or IDs to matplotlib colors to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.

Returns

The newly added gene track axes, one for each row of genes.

Return type

list of pyplot axes

add_refgene_stacks(assembly, size='3%', pad_before=0.0, pad_within=0.0, axis_limits=(0, 1), intron_height=0.05, exon_height=0.5, padding=1000, colors=None)

Adds a stack of reference genes for the specified genome assembly to both the bottom and left sides of the heatmap by calling add_refgene_stack() twice.

Parameters
  • assembly ({'hg18', 'hg19', 'hg38', 'mm9', 'mm10'}) – The genome assembly to load reference genes for.

  • size (str) – The size of each new axis as a percentage of the main heatmap width. Should be passed as a string ending in ‘%’.

  • pad_before (float) – The padding to use between the existing parts of the figure and the newly added gene tracks.

  • pad_within (float) – The padding to use between each newly added gene track.

  • axis_limits (tuple of float) – Axis limits for the non-genomic axis of each new gene track.

  • intron_height (float) – Controls thickness of gene introns. Pass a larger number for thicker introns.

  • exon_height (float) – Controls thickness of gene exons. Pass a larger number for thicker exons.

  • padding (int) – The padding to use when packing genes into rows, in units of base pairs. Genes that are within this many base pairs of each other will get packed into different rows.

  • colors (dict, optional) – Pass a dict mapping gene names or IDs to matplotlib colors to color-code those genes. Genes not in the dict will be colored black by default. Using gene names as keys should color all isoforms, while using gene IDs as keys should color just the isoform matching the specified ID.

Returns

The first element of the outer list is a list of the newly added horizontal gene track axes, one for each row of genes. The second element is the same but for the newly added vertical gene track axes.

Return type

list of lists of pyplot axes
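A call to add_refgene_stacks() and the unpacking of its two-element return value might look like the sketch below. The wrapper add_reference_genes and the class _StubHeatmap are hypothetical: the stub only stands in for a real heatmap object so that the example runs without lib5c installed, and its return values are placeholder strings rather than real axes.

```python
def add_reference_genes(h, assembly='mm9', highlight=None):
    """Add reference gene stacks to the bottom and left of a heatmap.

    `h` is assumed to expose add_refgene_stacks() with the signature
    documented above. Hypothetical wrapper, not part of lib5c.
    """
    horizontal_axes, vertical_axes = h.add_refgene_stacks(
        assembly,
        size='3%',
        pad_before=0.0,
        pad_within=0.0,
        padding=1000,
        colors=highlight,
    )
    # First element: horizontal track axes; second: vertical track axes
    return {'horizontal': horizontal_axes, 'vertical': vertical_axes}


class _StubHeatmap:
    """Stand-in used only to exercise the wrapper without lib5c."""

    def add_refgene_stacks(self, assembly, **kwargs):
        self.last_call = (assembly, kwargs)
        # Pretend the genes were packed into two rows on each side
        return [['x_row0', 'x_row1'], ['y_row0', 'y_row1']]


stub = _StubHeatmap()
axes = add_reference_genes(stub, assembly='hg19', highlight={'GeneA': 'red'})
```

With a real GeneExtendableHeatmap instance in place of the stub, each entry of axes['horizontal'] and axes['vertical'] would be one gene track axis, one per packed row.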