lib5c.parsers.genes module

Module for parsing .bed files containing gene track information.

lib5c.parsers.genes.load_gene_table(tablefile)[source]

Similar to load_genes(), but reads a gzipped UCSC table file instead of a .bed file.

The main advantage of this approach is that genes parsed this way include human-readable gene symbols.

Parameters: tablefile (str) – Path to the gzipped table file to read.
Returns: A dict whose keys are chromosome names and whose values are lists of the genes on that chromosome. Each gene is a dict with the following structure:
{
    'start' : int,
    'end'   : int,
    'name'  : str,
    'id'    : str,
    'strand': '+' or '-',
    'blocks': list of dicts
}

Blocks typically represent exons. Each block is a dict with the following structure:

{
    'start': int,
    'end'  : int
}
Return type: dict of lists of dicts
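
The documented return structure can be re-indexed for lookup by gene. The following is a minimal sketch, not part of lib5c: index_genes is a hypothetical helper, and the example dict is hand-built with made-up coordinates to mimic the structure above rather than loaded from a real table file. Which of 'name' and 'id' carries the human-readable symbol is an assumption here, so the key is left as a parameter.

```python
from collections import defaultdict


def index_genes(genes, key="name"):
    """Map each value of gene[key] to a list of (chromosome, gene) pairs.

    Hypothetical helper; `genes` is assumed to have the documented
    structure: {chromosome: [gene dict, ...]}.
    """
    index = defaultdict(list)
    for chrom, gene_list in genes.items():
        for gene in gene_list:
            index[gene[key]].append((chrom, gene))
    return dict(index)


# Hand-built stand-in for the documented return value (made-up data).
example = {
    "chr1": [
        {"start": 100, "end": 900, "name": "GeneA", "id": "tx1", "strand": "+",
         "blocks": [{"start": 100, "end": 300}, {"start": 600, "end": 900}]},
        {"start": 100, "end": 700, "name": "GeneA", "id": "tx2", "strand": "+",
         "blocks": [{"start": 100, "end": 300}, {"start": 500, "end": 700}]},
    ],
    "chr2": [
        {"start": 50, "end": 400, "name": "GeneB", "id": "tx3", "strand": "-",
         "blocks": [{"start": 50, "end": 400}]},
    ],
}

by_name = index_genes(example)
print(sorted(by_name))
print([gene["id"] for _, gene in by_name["GeneA"]])
```

A list is kept per key because several entries may share the same value, as the two "GeneA" records do in this made-up example.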
lib5c.parsers.genes.load_genes(bedfile)[source]

Loads information for genes from a .bed file into dicts and returns them.

Parameters: bedfile (str) – Path to the .bed file to load genes from.
Returns: A dict whose keys are chromosome names and whose values are lists of the genes on that chromosome. Each gene is a dict with the following structure:
{
    'start' : int,
    'end'   : int,
    'name'  : str,
    'strand': '+' or '-',
    'blocks': list of dicts
}

Blocks typically represent exons. Each block is a dict with the following structure:

{
    'start': int,
    'end'  : int
}
Return type: dict of lists of dicts
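
To illustrate how the returned structure might be consumed, here is a minimal sketch that works on a hand-built dict with made-up coordinates. genes_in_window and total_block_length are hypothetical helpers, not lib5c functions. The overlap test assumes half-open intervals with start < end, as in the BED format; whether the parser preserves that convention is not stated here.

```python
def genes_in_window(genes, chrom, start, end):
    """Return genes on `chrom` that overlap the half-open window [start, end).

    Hypothetical helper; `genes` is assumed to have the documented
    structure: {chromosome: [gene dict, ...]}.
    """
    return [
        gene for gene in genes.get(chrom, [])
        if gene["start"] < end and gene["end"] > start
    ]


def total_block_length(gene):
    """Sum the lengths of a gene's blocks (typically its exons)."""
    return sum(block["end"] - block["start"] for block in gene["blocks"])


# Hand-built stand-in for the documented return value (made-up data).
example = {
    "chr1": [
        {"start": 100, "end": 900, "name": "GeneA", "strand": "+",
         "blocks": [{"start": 100, "end": 300}, {"start": 600, "end": 900}]},
        {"start": 1500, "end": 2000, "name": "GeneC", "strand": "-",
         "blocks": [{"start": 1500, "end": 2000}]},
    ],
}

hits = genes_in_window(example, "chr1", 800, 1600)
for gene in hits:
    print(gene["name"], gene["strand"], total_block_length(gene))
```

Using genes.get(chrom, []) means a chromosome absent from the dict yields an empty list instead of raising a KeyError.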
lib5c.parsers.genes.main()[source]