lib5c.core.loci module

class lib5c.core.loci.Locus(chrom, start, end, **kwargs)[source]

Bases: object, lib5c.core.mixins.Picklable, lib5c.core.mixins.Annotatable

A genomic interval: anything with a chromosome, a start, and an end. Can also carry arbitrary metadata.

chrom

The chromosome on which this locus resides (e.g., 'chr4').

Type:str
start

The start coordinate for the zero-indexed, half-open interval occupied by the locus.

Type:int
end

The end coordinate for the zero-indexed, half-open interval occupied by the locus.

Type:int
data

Arbitrary additional data about the locus. This attribute is filled in with any kwargs passed to the constructor.

Type:dict(str -> any)

Notes

Locus objects support comparison and ordering via the functools.total_ordering decorator. See this class’s implementations of __eq__() and __lt__() for more details.

Locus objects support the Annotatable mixin, which is the source of their data attribute and all its related functions.
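The ordering and the half-open interval convention described above can be sketched with a small stand-in class. SketchLocus and its _key() and length() helpers are hypothetical and are not part of lib5c; in particular, comparing by (chrom, start, end) is an assumption, so consult the real __eq__() and __lt__() for the actual rules.

```python
from functools import total_ordering


@total_ordering
class SketchLocus(object):
    """Hypothetical stand-in for Locus; not the lib5c implementation."""

    def __init__(self, chrom, start, end, **kwargs):
        self.chrom = chrom
        self.start = start
        self.end = end
        self.data = dict(kwargs)  # arbitrary metadata from the kwargs

    def _key(self):
        # Assumption: equality and order depend only on the coordinates.
        return (self.chrom, self.start, self.end)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())

    def length(self):
        # Half-open interval [start, end): the length is end - start.
        return self.end - self.start


a = SketchLocus('chr3', 34109023, 34113109, name='Sox2_FOR_2')
b = SketchLocus('chr3', 34113147, 34116141, name='Sox2_REV_4')
ordered = sorted([b, a])  # total_ordering supplies <=, >, >= from __lt__
```

With only __eq__() and __lt__() written by hand, total_ordering derives the remaining rich comparisons, which is enough for sorted() and for expressions such as a <= b.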

as_dict()[source]

Gets a dict representation of the Locus.

Returns:This dict is guaranteed to have at least the following keys:
{
    'chrom': str,
    'start': int,
    'end': int
}

Other keys may be present if included in the data attribute.

Return type:dict

Examples

>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, strand='+')
>>> locus.as_dict()
{'start': 34109023, 'end': 34113109, 'chrom': 'chr3', 'strand': '+'}
get_name()[source]

Gets the name of the locus, if present.

Returns:The value of data['name'] if it exists; None otherwise.
Return type:str or None

Examples

>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, name='5C_329_Sox2_FOR_2')
>>> locus.get_name()
'5C_329_Sox2_FOR_2'
get_region()[source]

Gets the region of the locus, if present.

Returns:The value of data['region'] if it exists; None otherwise.
Return type:str or None

Examples

>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, region='Sox2')
>>> locus.get_region()
'Sox2'
get_strand()[source]

Gets the strand of the locus, if present.

Returns:The value of data['strand'] if it exists; None otherwise.
Return type:'+' or '-' or None

Examples

>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, strand='+')
>>> locus.get_strand()
'+'
class lib5c.core.loci.LocusMap(locus_list)[source]

Bases: object, lib5c.core.mixins.Picklable, lib5c.core.mixins.Annotatable, lib5c.core.mixins.Loggable

Representation of an organized group of Locus objects.

locus_list

Ordered list of unique Locus objects in the LocusMap.

Type:list of Locus objects
regions

Ordered list of region names, as strings, present in the LocusMap. This list is filled in only when the Locus objects within the LocusMap have a 'region' key in their data attribute.

Type:list of str
name_dict

Dict mapping Locus names as strings to the Locus object with that name. This dict is filled in only when the Locus objects within the LocusMap have a 'name' key in their data attribute.

Type:dict(str -> Locus)
region_index_dict

Maps a region name and an index within the region to a Locus object within the specified region. This means that, for example,

locus_map.region_index_dict['Sox2'][3]

resolves to the Locus object that is the 4th Locus object of the Sox2 region in the LocusMap instance called locus_map.

Type:dict(str -> list of Locus objects)
hash_to_index_dict

Maps a Locus object’s hash to its index within locus_list.

Type:dict(int -> int)
name_to_index_dict

Dict mapping Locus names as strings to their index within the LocusMap. This dict is filled in only when the Locus objects within the LocusMap have a 'name' key in their data attribute.

Type:dict(str -> int)
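The relationship between locus_list and the lookup attributes above can be sketched with plain dicts standing in for Locus objects. This is an illustration only: the loop and the choice of hashing the (chrom, start, end) tuple are assumptions and do not reflect how lib5c actually builds these attributes.

```python
# Plain dicts stand in for Locus objects; assumed to be in sorted order.
loci = [
    {'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
     'name': 'Sox2_FOR_2', 'region': 'Sox2'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
     'name': 'Sox2_REV_4', 'region': 'Sox2'},
    {'chrom': 'chr3', 'start': 87282063, 'end': 87285636,
     'name': 'Nestin_REV_9', 'region': 'Nestin'},
]

regions = []
name_dict = {}
name_to_index_dict = {}
region_index_dict = {}
hash_to_index_dict = {}

for index, locus in enumerate(loci):
    # Assumption: the hash is derived from the coordinates only.
    key = (locus['chrom'], locus['start'], locus['end'])
    hash_to_index_dict[hash(key)] = index
    # Name-based lookups are filled in only when a 'name' key is present.
    if 'name' in locus:
        name_dict[locus['name']] = locus
        name_to_index_dict[locus['name']] = index
    # Region-based lookups are filled in only when a 'region' key is present.
    if 'region' in locus:
        region = locus['region']
        if region not in region_index_dict:
            regions.append(region)  # regions kept in first-seen order
            region_index_dict[region] = []
        region_index_dict[region].append(locus)
```

Here region_index_dict['Sox2'][1] is the second Sox2 entry, while name_to_index_dict gives positions within the whole list rather than within a region.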

Notes

Locus objects in a LocusMap are ordered by the total ordering implemented by the Locus class. See that class’s implementation of __eq__() and __lt__() for more details.

LocusMap objects support the Loggable and Annotatable mixins. See the Examples section for an example.

Examples

>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_primerfile('test/primers.bed')
>>> locus_map.size()
1551
>>> locus_map.print_log()
LocusMap created
source primerfile: test/primers.bed
>>> locus_map.set_value('test key', 'test value')
>>> locus_map.get_value('test key')
'test value'
as_dict_of_list_of_dict()[source]

Gets a primitive representation of the LocusMap, organized by region. Converts the locus_list attribute of this LocusMap from a list of Locus objects to a dict whose keys are region names as strings and whose values are lists of dicts representing the Locus objects in each region.

Returns:The primitive representation of the LocusMap.
Return type:dict(str -> list of dict)

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> print locus_map.as_dict_of_list_of_dict()
{'Sox2': [{'start': 34109023, 'region': 'Sox2', 'end': 34113109,
'chrom': 'chr3'}, {'start': 34113147, 'region': 'Sox2', 'end':
34116141, 'chrom': 'chr3'}], 'Nestin': [{'start': 87282063, 'region':
'Nestin', 'end': 87285636, 'chrom': 'chr3'}, {'start': 87285637,
'region': 'Nestin', 'end': 87295935, 'chrom': 'chr3'}]}
as_list_of_dict()[source]

Gets a primitive representation of the LocusMap. Converts the locus_list attribute of this LocusMap from a list of Locus objects to a list of dicts representing those Locus objects.

Returns:The primitive representation of the LocusMap.
Return type:list of dict

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> print locus_map.as_list_of_dict()
[{'start': 34109023, 'end': 34113109, 'chrom': 'chr3', 'name':
'Sox2_FOR_2'}, {'start': 34113147, 'end': 34116141, 'chrom': 'chr3',
'name': 'Sox2_REV_4'}]
by_index(index)[source]

Get the Locus object contained in this LocusMap with a specified index.

Parameters:index (int) – The index of the Locus to get.
Returns:The Locus object with the specified index.
Return type:Locus

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> print locus_map.by_index(1)
Locus chr3:34113147-34116141
    name: Sox2_REV_4
by_name(name)[source]

Get the Locus object contained in this LocusMap with a specified name.

Parameters:name (str) – The name of the Locus to get.
Returns:The Locus object with the specified name.
Return type:Locus

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> print locus_map.by_name('Sox2_REV_4')
Locus chr3:34113147-34116141
    name: Sox2_REV_4
by_region_index(region, index)[source]

Get the Locus object contained in this LocusMap at a specified index within a specified region. In other words, this is the Locus at position index (counting from zero) in the region named region.

Parameters:
  • region (str) – The name of the region to look for the Locus in.
  • index (int) – The index of the desired Locus within the specified region.
Returns:The specified Locus.
Return type:Locus

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> print locus_map.by_region_index('Nestin', 1)
Locus chr3:87285637-87295935
    region: Nestin
delete(index)[source]

Creates a new LocusMap object that excludes the Locus at a specified index.

Parameters:index (int) – The index of the Locus to exclude.
Returns:The new LocusMap.
Return type:LocusMap

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [
...     Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...     Locus('chr3', 34113147, 34116141, name='Sox2_REV_4'),
...     Locus('chr3', 87282063, 87285636, name='Nestin_REV_9'),
...     Locus('chr3', 87285637, 87295935, name='Nestin_FOR_10')
... ]
...
>>> locus_map = LocusMap(locus_list)
>>> deleted_locus_map = locus_map.delete(2)
>>> deleted_locus_map.print_log()
LocusMap created
deleted locus at index 2 with name Nestin_REV_9
>>> for locus in deleted_locus_map:
...     print locus
...
Locus chr3:34109023-34113109
    name: Sox2_FOR_2
Locus chr3:34113147-34116141
    name: Sox2_REV_4
Locus chr3:87285637-87295935
    name: Nestin_FOR_10
extract_region(region)[source]

Create a LocusMap representing the Locus objects in only one specified region of this LocusMap.

Parameters:region (str) – The name of the region to extract.
Returns:A new LocusMap restricted to the specified region.
Return type:LocusMap

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> extracted_locus_map = locus_map.extract_region('Sox2')
>>> extracted_locus_map.print_log()
LocusMap created
extracted region Sox2
>>> for locus in extracted_locus_map:
...     print locus
...
Locus chr3:34109023-34113109
    region: Sox2
Locus chr3:34113147-34116141
    region: Sox2
extract_slice(desired_slice)[source]

Gets a new LocusMap object representing a subset of the Locus objects in this LocusMap specified by a slice.

Parameters:desired_slice (slice) – The slice to use to subset this LocusMap.
Returns:The new LocusMap.
Return type:LocusMap

Notes

Since LocusMap objects are sorted, the returned LocusMap will always be sorted, regardless of the slice direction.
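The effect of this note can be sketched with integers standing in for sorted Locus objects. The helper extract_slice_sketch is hypothetical; it assumes the result is simply re-sorted after subsetting, which may differ from how lib5c enforces the ordering.

```python
# Integers stand in for Locus objects that are already in sorted order.
locus_list = [10, 20, 30, 40]


def extract_slice_sketch(items, desired_slice):
    # Hypothetical helper: subset with the slice, then restore sorted order.
    return sorted(items[desired_slice])


forward = extract_slice_sketch(locus_list, slice(1, 3))       # items 1 and 2
backward = extract_slice_sketch(locus_list, slice(2, 0, -1))  # items 2 and 1
```

Both calls select the same two elements, and both results come back in ascending order even though the second slice walks the list backwards.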

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [
...     Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...     Locus('chr3', 34113147, 34116141, name='Sox2_REV_4'),
...     Locus('chr3', 87282063, 87285636, name='Nestin_REV_9'),
...     Locus('chr3', 87285637, 87295935, name='Nestin_FOR_10')
... ]
...
>>> locus_map = LocusMap(locus_list)
>>> sliced_map = locus_map[1:3]
>>> sliced_map.print_log()
LocusMap created
sliced out slice(1, 3, None)
>>> for locus in sliced_map:
...     print locus
...
Locus chr3:34113147-34116141
    name: Sox2_REV_4
Locus chr3:87282063-87285636
    name: Nestin_REV_9
classmethod from_binfile(binfile)[source]

Factory method that creates a LocusMap object from a BED file containing bin information.

Parameters:binfile (str) – String reference to the bin file.
Returns:LocusMap object parsed from the bin file.
Return type:LocusMap

Examples

>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_binfile('test/bins_new.bed')
>>> locus_map.size()
1807
>>> locus_map.print_log()
LocusMap created
source binfile: test/bins_new.bed
classmethod from_list(list_of_locusmaps)[source]

Factory method that creates a new LocusMap object from a list of existing LocusMap objects by concatenation.

Parameters:list_of_locusmaps (list of LocusMap) – A list of LocusMap objects to be concatenated.
Returns:The concatenated LocusMap.
Return type:LocusMap

Notes

This function is more efficient than iterative addition; in the timing comparison below it runs roughly 3.6 times faster. Therefore, it is preferred to use

summed_locus_map = LocusMap.from_list(list_of_locus_maps)

over

summed_locus_map = sum(list_of_locus_maps, LocusMap([]))

as evidenced by

> python -mtimeit `
  -s'from lib5c.core.loci import LocusMap' `
  -s'lm = LocusMap.from_primerfile(\"test/primers.bed\")' `
  -s'lm_list = [lm.extract_region(r) for r in lm.regions]' `
  's = LocusMap.from_list(lm_list)'
10 loops, best of 3: 48.2 msec per loop

versus

> python -mtimeit `
  -s'from lib5c.core.loci import LocusMap' `
  -s'lm = LocusMap.from_primerfile(\"test/primers.bed\")' `
  -s'lm_list = [lm.extract_region(r) for r in lm.regions]' `
  's = sum(lm_list, LocusMap([]))'
10 loops, best of 3: 174 msec per loop
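One plausible reason for the gap, shown here with plain lists rather than LocusMap objects, is that sum() builds a fresh intermediate result at every addition, while a single-pass concatenation collects every element once. Whether LocusMap.from_list works this way internally is an assumption; the sketch only illustrates the general pattern.

```python
from itertools import chain

# Lists of integers stand in for the per-region LocusMap objects.
parts = [[1, 2], [3, 4], [5, 6]]

# Repeated addition: every step creates a new intermediate list.
by_sum = sum(parts, [])

# Single pass: visit each element once and build the result once.
by_chain = list(chain.from_iterable(parts))
```

Both approaches give the same result; only the amount of intermediate copying differs.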

Examples

>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_primerfile('test/primers.bed')
>>> sox2_locus_map = locus_map.extract_region('Sox2')
>>> sox2_locus_map.size()
265
>>> sox2_locus_map.get_regions()
['Sox2']
>>> klf4_locus_map = locus_map.extract_region('Klf4')
>>> klf4_locus_map.size()
251
>>> klf4_locus_map.get_regions()
['Klf4']
>>> summed_locus_map = LocusMap.from_list([sox2_locus_map,
...                                        klf4_locus_map])
...
>>> summed_locus_map.print_log()
LocusMap created
created from list
>>> summed_locus_map.size()
516
>>> summed_locus_map.get_regions()
['Sox2', 'Klf4']
>>> builtin_sum_result = sum([sox2_locus_map, klf4_locus_map],
...                          LocusMap([]))
...
>>> builtin_sum_result.size()
516
>>> builtin_sum_result.get_regions()
['Sox2', 'Klf4']
classmethod from_list_of_dict(list_of_dict)[source]

Factory method that creates a LocusMap object from a list of dicts, each of which represents one Locus that the LocusMap should contain.

Parameters:list_of_dict (list of dict) – A list of dicts, with each dict representing a Locus that should be created and put into the LocusMap.
Returns:A LocusMap whose Locus objects are equivalent to the dicts passed in list_of_dict
Return type:LocusMap

Examples

>>> from lib5c.core.loci import LocusMap
>>> list_of_dict = [{'chrom': 'chr3',
...                  'start': 34109023,
...                  'end': 34113109,
...                  'name': 'Sox2_FOR_2'},
...                 {'chrom': 'chr3',
...                  'start': 34113147,
...                  'end': 34116141,
...                  'name': 'Sox2_REV_4'}]
...
>>> locus_map = LocusMap.from_list_of_dict(list_of_dict)
>>> locus_map.print_log()
LocusMap created
created from list of dict
>>> for locus in locus_map:
...     print locus
...
Locus chr3:34109023-34113109
    name: Sox2_FOR_2
Locus chr3:34113147-34116141
    name: Sox2_REV_4
>>> list_of_dict_dup = [{'chrom': 'chr3',
...                      'start': 34109023,
...                      'end': 34113109,
...                      'name': 'Sox2_FOR_2'},
...                     {'chrom': 'chr3',
...                      'start': 34113147,
...                      'end': 34116141,
...                      'name': 'Sox2_REV_4'},
...                     {'chrom': 'chr3',
...                      'start': 34113147,
...                      'end': 34116141,
...                      'name': 'duplicate Locus!'}]
...
>>> locus_map_dup = LocusMap.from_list_of_dict(list_of_dict_dup)
Traceback (most recent call last):
    ...
ValueError: Locus objects in LocusMap must be unique
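The duplicate check in the example above can be sketched as follows. The helper check_unique is hypothetical and not part of lib5c. Following the example, it assumes that two entries collide when chrom, start, and end all match, even if other keys such as 'name' differ.

```python
def check_unique(list_of_dict):
    # Hypothetical helper, not part of lib5c.
    # Assumption: entries are duplicates when chrom, start, and end all
    # match, regardless of any other keys such as 'name'.
    seen = set()
    for d in list_of_dict:
        key = (d['chrom'], d['start'], d['end'])
        if key in seen:
            raise ValueError('duplicate locus at {}:{}-{}'.format(*key))
        seen.add(key)


unique_input = [
    {'chrom': 'chr3', 'start': 34109023, 'end': 34113109, 'name': 'a'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141, 'name': 'b'},
]
check_unique(unique_input)  # passes silently
```

Running such a check before constructing the map lets a caller report the offending coordinates rather than only learning that a duplicate exists.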
classmethod from_primerfile(primerfile)[source]

Factory method that creates a LocusMap object from a BED file containing primer information.

Parameters:primerfile (str) – String reference to the primer file.
Returns:LocusMap object parsed from the primer file.
Return type:LocusMap

Examples

>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_primerfile('test/primers.bed')
>>> locus_map.size()
1551
>>> locus_map.print_log()
LocusMap created
source primerfile: test/primers.bed
get_index(name)[source]

Get the index of the Locus object in this LocusMap with a specified name.

Parameters:name (str) – The name of the Locus to get the index for.
Returns:The index of the Locus object with the specified name.
Return type:int

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> print locus_map.get_index('Sox2_REV_4')
1
get_index_by_hash(hash_value)[source]

Get the index of the Locus object in this LocusMap with a specified hash.

Parameters:hash_value (int) – The hash of the Locus object to find.
Returns:The index of the desired Locus object within locus_list if it exists within this LocusMap object, or None if it doesn’t.
Return type:int or None

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109),
...               Locus('chr3', 34113147, 34116141)]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_hash = hash(Locus('chr3', 34113147, 34116141))
>>> locus_map.get_index_by_hash(locus_hash)
1
>>> locus_map.get_index_by_hash(123) is None
True
get_region_sizes()[source]

Gets information about the number of Locus objects in each region.

Returns:A dict mapping region names as strings to the number of Locus objects in that region.
Return type:dict(str -> int)

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> print locus_map.get_region_sizes()
{'Sox2': 2, 'Nestin': 1}
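The same tally can be sketched with the standard library, using a plain list of region names in place of the Locus objects. This is an illustration of the result, not a description of how lib5c computes it.

```python
from collections import Counter

# One region name per Locus, in LocusMap order.
regions_per_locus = ['Sox2', 'Sox2', 'Nestin']

# Count how many entries fall into each region.
region_sizes = dict(Counter(regions_per_locus))
```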
get_regions()[source]

Gets the regions spanned by the Locus objects in this LocusMap.

Returns:The ordered list of region names.
Return type:list of str

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> print locus_map.get_regions()
['Sox2', 'Nestin']
size()[source]

Get the number of Locus objects in the LocusMap.

Returns:The number of Locus objects in the LocusMap.
Return type:int

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109),
...               Locus('chr3', 34113147, 34116141)]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.size()
2
to_bedfile(filename, fields=('name', ))[source]

Write this LocusMap to disk as a BED-formatted file.

Parameters:
  • filename (str) – String reference to the file to write to.
  • fields (list of str, optional) – Additional columns to write after the standard chromosome, start, and end columns. Specify them in order, as strings matching keys in the data attribute of the Locus objects that make up this LocusMap. Defaults to ('name', ).
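
How fields controls the column layout can be sketched with a hypothetical helper that formats a single line. format_bed_line is not part of lib5c, a plain dict stands in for a Locus, and tab-separated output is assumed because BED files are conventionally tab-delimited.

```python
def format_bed_line(locus, fields=('name',)):
    # Hypothetical helper; `locus` is a plain dict standing in for a Locus.
    columns = [locus['chrom'], str(locus['start']), str(locus['end'])]
    # Extra columns follow in the order given by `fields`.
    columns.extend(str(locus[field]) for field in fields)
    return '\t'.join(columns)


example = {'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
           'name': '5C_329_Sox2_FOR_2', 'strand': '+'}
default_line = format_bed_line(example)
extended_line = format_bed_line(example, fields=('name', 'strand'))
```

With the default fields, only the name is appended as a fourth column; passing ('name', 'strand') adds the strand as a fifth.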

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> lm1 = LocusMap([
...     Locus('chr3', 34109023, 34113109, name='5C_329_Sox2_FOR_2'),
...     Locus('chr3', 34113147, 34116141, name='5C_329_Sox2_REV_4')
... ])
...
>>> lm1.to_bedfile('test/core_test_locusmap.bed')
>>> lm2 = LocusMap.from_primerfile('test/core_test_locusmap.bed')
>>> for locus in lm2:
...     print locus
...
Locus chr3:34109023-34113109
    orientation: 3'
    region: Sox2
    name: 5C_329_Sox2_FOR_2
    strand: +
    number: 2
Locus chr3:34113147-34116141
    orientation: 5'
    region: Sox2
    name: 5C_329_Sox2_REV_4
    strand: -
    number: 4