lib5c.core.loci module

class lib5c.core.loci.Locus(chrom, start, end, **kwargs)[source]

Bases: lib5c.core.mixins.Picklable, lib5c.core.mixins.Annotatable

Represents a genomic interval: anything with a chromosome, a start, and an end. Can also include arbitrary metadata.

chrom

The chromosome on which this locus resides (e.g., 'chr4').

Type

str

start

The start coordinate for the zero-indexed, half-open interval occupied by the locus.

Type

int

end

The end coordinate for the zero-indexed, half-open interval occupied by the locus.

Type

int
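
As a small worked example of the zero-indexed, half-open convention used by start and end, the sketch below computes an interval length and an overlap test on plain integers. The helper names interval_length and overlaps are hypothetical and are not part of lib5c.

```python
def interval_length(start, end):
    """Length of the half-open interval [start, end)."""
    return end - start


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one position."""
    return a_start < b_end and b_start < a_end


# The end coordinate itself is excluded, so no "+ 1" is needed.
length = interval_length(34109023, 34113109)

# Intervals that merely touch (one ends where the other starts) do not overlap.
touching = overlaps(0, 10, 10, 20)

# Intervals that share position 9 do overlap.
sharing = overlaps(0, 10, 9, 20)
```

With this convention, two adjacent loci can share a boundary coordinate without sharing any base.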

data

Arbitrary additional data about the locus. This attribute is filled in with any kwargs passed to the constructor.

Type

dict(str -> any)

Notes

Locus objects support comparison and ordering via the total_ordering decorator. See this class’s implementations of __eq__() and __lt__() for more details.

Locus objects support the Annotatable mixin, which is the source of their data attribute and all its related functions.
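
To illustrate how functools.total_ordering derives a full set of comparisons from __eq__() and __lt__(), here is a minimal sketch using a hypothetical stand-in class called SketchLocus. The comparison key (chrom, start, end) is an assumption made for illustration; the real key is whatever Locus.__eq__() and Locus.__lt__() implement.

```python
from functools import total_ordering


@total_ordering
class SketchLocus:
    """Hypothetical stand-in for Locus; not the lib5c implementation."""

    def __init__(self, chrom, start, end, **kwargs):
        self.chrom = chrom
        self.start = start
        self.end = end
        self.data = dict(kwargs)

    def _key(self):
        # Assumed comparison key: coordinates only, metadata ignored.
        return (self.chrom, self.start, self.end)

    def __eq__(self, other):
        if not isinstance(other, SketchLocus):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, SketchLocus):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Defining __eq__ removes the default hash, so restore one.
        return hash(self._key())


a = SketchLocus('chr3', 34113147, 34116141, name='Sox2_REV_4')
b = SketchLocus('chr3', 34109023, 34113109, name='Sox2_FOR_2')

# sorted() only needs __lt__; total_ordering fills in <=, >, and >=.
ordered = sorted([a, b])
```

Note that this sketch compares chromosome names as plain strings, so 'chr10' would sort before 'chr2'; whether Locus behaves the same way is not stated here.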

as_dict()[source]

Gets a dict representation of the Locus.

Returns

This dict is guaranteed to have at least the following keys:

{
    'chrom': str,
    'start': int,
    'end': int
}

Other keys may be present if included in the data attribute.

Return type

dict

Examples

>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, strand='+')
>>> locus.as_dict() == \
...     {'chrom': 'chr3',
...      'start': 34109023,
...      'end': 34113109,
...      'strand': '+'}
True
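
The merge described above can be sketched without lib5c as follows. The function sketch_as_dict is hypothetical, and letting the three coordinate keys win over any same-named metadata key is an assumption of this sketch, not documented behavior.

```python
def sketch_as_dict(chrom, start, end, data):
    """Combine the guaranteed keys with arbitrary metadata."""
    result = dict(data)  # copy so the caller's metadata is not mutated
    # Assumption: the coordinate keys take precedence on a key collision.
    result.update({'chrom': chrom, 'start': start, 'end': end})
    return result


as_dict_result = sketch_as_dict('chr3', 34109023, 34113109, {'strand': '+'})
```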
get_name()[source]

Gets the name of the locus, if present.

Returns

The value of data['name'] if it exists; None otherwise.

Return type

str or None

Examples

>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, name='5C_329_Sox2_FOR_2')
>>> locus.get_name()
'5C_329_Sox2_FOR_2'
get_region()[source]

Gets the region of the locus, if present.

Returns

The value of data['region'] if it exists; None otherwise.

Return type

str or None

Examples

>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, region='Sox2')
>>> locus.get_region()
'Sox2'
get_strand()[source]

Gets the strand of the locus, if present.

Returns

The value of data['strand'] if it exists; None otherwise.

Return type

'+' or '-' or None

Examples

>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, strand='+')
>>> locus.get_strand()
'+'
class lib5c.core.loci.LocusMap(locus_list)[source]

Bases: lib5c.core.mixins.Picklable, lib5c.core.mixins.Annotatable, lib5c.core.mixins.Loggable

Representation of an organized group of Locus objects.

locus_list

Ordered list of unique Locus objects in the LocusMap.

Type

list of Locus objects

regions

Ordered list of region names, as strings, present in the LocusMap. This list is filled in only when the Locus objects within the LocusMap have a 'region' key in their data attribute.

Type

list of str

name_dict

Dict mapping Locus names as strings to the Locus object with that name. This dict is filled in only when the Locus objects within the LocusMap have a 'name' key in their data attribute.

Type

dict(str -> Locus)

region_index_dict

Maps a region name and an index within the region to a Locus object within the specified region. This means that, for example,

locus_map.region_index_dict['Sox2'][3]

resolves to the Locus object that is the 4th Locus object of the Sox2 region in the LocusMap instance called locus_map.

Type

dict(str -> list of Locus objects)

hash_to_index_dict

Maps a Locus object’s hash to its index within locus_list.

Type

dict(int -> int)

name_to_index_dict

Dict mapping Locus names as strings to their index within the LocusMap. This dict is filled in only when the Locus objects within the LocusMap have a 'name' key in their data attribute.

Type

dict(str -> int)
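
The lookup attributes above can be pictured with a minimal, lib5c-free sketch that builds equivalent structures from plain dicts in a single pass. The records list stands in for an already sorted locus_list; how LocusMap actually populates these attributes is not shown in this reference, so treat the loop as an assumption.

```python
records = [
    {'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
     'name': 'Sox2_FOR_2', 'region': 'Sox2'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
     'name': 'Sox2_REV_4', 'region': 'Sox2'},
    {'chrom': 'chr3', 'start': 87282063, 'end': 87285636,
     'name': 'Nestin_REV_9', 'region': 'Nestin'},
    {'chrom': 'chr3', 'start': 87285637, 'end': 87295935,
     'name': 'Nestin_FOR_10', 'region': 'Nestin'},
]

regions = []
name_dict = {}
name_to_index_dict = {}
region_index_dict = {}

for index, record in enumerate(records):
    # Name-based lookups are filled in only when a 'name' key is present.
    name = record.get('name')
    if name is not None:
        name_dict[name] = record
        name_to_index_dict[name] = index
    # Region-based lookups are filled in only when a 'region' key is present.
    region = record.get('region')
    if region is not None:
        if region not in region_index_dict:
            region_index_dict[region] = []
            regions.append(region)  # keeps order of first appearance
        region_index_dict[region].append(record)
```

Here region_index_dict['Nestin'][1] is the second record of the Nestin region, mirroring the region_index_dict['Sox2'][3] example above.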

Notes

Locus objects in a LocusMap are ordered by the total ordering implemented by the Locus class. See that class’s implementation of __eq__() and __lt__() for more details.

LocusMap objects support the Loggable and Annotatable mixins. See the Examples section for an example.

Examples

>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_primerfile('test/primers.bed')
>>> locus_map.size()
1551
>>> locus_map.print_log()
LocusMap created
source primerfile: test/primers.bed
>>> locus_map.set_value('test key', 'test value')
>>> locus_map.get_value('test key')
'test value'
as_dict_of_list_of_dict()[source]

Gets a primitive representation of the LocusMap, organized by region. Converts the locus_list attribute of this LocusMap from a list of Locus objects to a dict whose keys are region names as strings and whose values are lists of dicts representing the Locus objects in each region.

Returns

The primitive representation of the LocusMap.

Return type

dict(str -> list of dict)

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nest'),
...               Locus('chr3', 87285637, 87295935, region='Nest')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.as_dict_of_list_of_dict() == \
...     {'Sox2': [{'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
...                'region': 'Sox2'},
...               {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
...                'region': 'Sox2'}],
...      'Nest': [{'chrom': 'chr3', 'start': 87282063, 'end': 87285636,
...                'region': 'Nest'},
...               {'chrom': 'chr3', 'start': 87285637, 'end': 87295935,
...                'region': 'Nest'}]}
True
as_list_of_dict()[source]

Gets a primitive representation of the LocusMap. Converts the locus_list attribute of this LocusMap from a list of Locus objects to a list of dicts representing those Locus objects.

Returns

The primitive representation of the LocusMap.

Return type

list of dict

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.as_list_of_dict() == \
...     [{'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
...       'name': 'Sox2_FOR_2'},
...      {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
...       'name': 'Sox2_REV_4'}]
True
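
One plausible use of a primitive representation (an assumption about intent, not something this reference states) is serialization, because lists and dicts of strings and integers can be written with the standard json module. A minimal sketch on the same data as the example above:

```python
import json

list_of_dict = [
    {'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
     'name': 'Sox2_FOR_2'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
     'name': 'Sox2_REV_4'},
]

# Serialize to text and read it back; no custom encoder is needed.
text = json.dumps(list_of_dict, sort_keys=True)
restored = json.loads(text)
```

The restored list could then be handed to a factory such as from_list_of_dict().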
by_index(index)[source]

Get the Locus object contained in this LocusMap with a specified index.

Parameters

index (int) – The index of the Locus to get.

Returns

The Locus object with the specified index.

Return type

Locus

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> print(locus_map.by_index(1))
Locus chr3:34113147-34116141
    name: Sox2_REV_4
by_name(name)[source]

Get the Locus object contained in this LocusMap with a specified name.

Parameters

name (str) – The name of the Locus to get.

Returns

The Locus object with the specified name.

Return type

Locus

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> print(locus_map.by_name('Sox2_REV_4'))
Locus chr3:34113147-34116141
    name: Sox2_REV_4
by_region_index(region, index)[source]

Get the Locus object contained in this LocusMap with a specified index within a specified region. In other words, the Locus at position index within the region named region.

Parameters
  • region (str) – The name of the region to look for the Locus in.

  • index (int) – The index of the desired Locus within the specified region.

Returns

The specified Locus.

Return type

Locus

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> print(locus_map.by_region_index('Nestin', 1))
Locus chr3:87285637-87295935
    region: Nestin
delete(index)[source]

Creates a new LocusMap object that excludes the Locus at a specified index.

Parameters

index (int) – The index of the Locus to exclude.

Returns

The new LocusMap.

Return type

LocusMap

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [
...     Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...     Locus('chr3', 34113147, 34116141, name='Sox2_REV_4'),
...     Locus('chr3', 87282063, 87285636, name='Nestin_REV_9'),
...     Locus('chr3', 87285637, 87295935, name='Nestin_FOR_10')
... ]
...
>>> locus_map = LocusMap(locus_list)
>>> deleted_locus_map = locus_map.delete(2)
>>> deleted_locus_map.print_log()
LocusMap created
deleted locus at index 2 with name Nestin_REV_9
>>> for locus in deleted_locus_map:
...     print(locus)
Locus chr3:34109023-34113109
    name: Sox2_FOR_2
Locus chr3:34113147-34116141
    name: Sox2_REV_4
Locus chr3:87285637-87295935
    name: Nestin_FOR_10
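
The key property of delete() is that it returns a new object and leaves the original untouched. A minimal sketch of that non-mutating pattern on a plain list, using a hypothetical helper named without_index:

```python
def without_index(items, index):
    """Return a new list that omits the element at a non-negative index."""
    if not 0 <= index < len(items):
        raise IndexError('index out of range')
    return [item for i, item in enumerate(items) if i != index]


names = ['Sox2_FOR_2', 'Sox2_REV_4', 'Nestin_REV_9', 'Nestin_FOR_10']
remaining = without_index(names, 2)
```

The range check is a choice made for this sketch; how delete() treats negative or out-of-range indices is not documented here.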
extract_region(region)[source]

Create a LocusMap representing the Locus objects in only one specified region of this LocusMap.

Parameters

region (str) – The name of the region to extract.

Returns

A new LocusMap restricted to the specified region.

Return type

LocusMap

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> extracted_locus_map = locus_map.extract_region('Sox2')
>>> extracted_locus_map.print_log()
LocusMap created
extracted region Sox2
>>> for locus in extracted_locus_map:
...     print(locus)
...
Locus chr3:34109023-34113109
    region: Sox2
Locus chr3:34113147-34116141
    region: Sox2
extract_slice(desired_slice)[source]

Gets a new LocusMap object representing a subset of the Locus objects in this LocusMap specified by a slice.

Parameters

desired_slice (slice) – The slice to use to subset this LocusMap.

Returns

The new LocusMap.

Return type

LocusMap

Notes

Since LocusMap objects are sorted, the returned LocusMap will always be sorted, regardless of the slice direction.

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [
...     Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...     Locus('chr3', 34113147, 34116141, name='Sox2_REV_4'),
...     Locus('chr3', 87282063, 87285636, name='Nestin_REV_9'),
...     Locus('chr3', 87285637, 87295935, name='Nestin_FOR_10')
... ]
...
>>> locus_map = LocusMap(locus_list)
>>> sliced_map = locus_map[1:3]
>>> sliced_map.print_log()
LocusMap created
sliced out slice(1, 3, None)
>>> for locus in sliced_map:
...     print(locus)
...
Locus chr3:34113147-34116141
    name: Sox2_REV_4
Locus chr3:87282063-87285636
    name: Nestin_REV_9
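
The note about slice direction can be illustrated on a plain sorted list: a reversed slice selects the same elements, and re-sorting restores the original order. This is only an analogy using tuples; it assumes nothing about how LocusMap performs the sort internally.

```python
loci = [
    (34109023, 'Sox2_FOR_2'),
    (34113147, 'Sox2_REV_4'),
    (87282063, 'Nestin_REV_9'),
    (87285637, 'Nestin_FOR_10'),
]

# A backwards slice picks indices 3, 2, 1 in that order.
picked = loci[3:0:-1]

# Sorting the selection yields the same result as the forward slice 1:4.
resorted = sorted(picked)
```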
classmethod from_binfile(binfile)[source]

Factory method that creates a LocusMap object from a BED file containing bin information.

Parameters

binfile (str) – String reference to the bin file.

Returns

LocusMap object parsed from the bin file.

Return type

LocusMap

Examples

>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_binfile('test/bins_new.bed')
>>> locus_map.size()
1807
>>> locus_map.print_log()
LocusMap created
source binfile: test/bins_new.bed
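
For readers unfamiliar with the BED format, the sketch below parses tab-separated BED lines into dicts of the shape accepted by from_list_of_dict(). The function parse_bed_lines, the four-column layout, and the bin names are all assumptions made for illustration; the exact columns expected by from_binfile() are not listed in this reference.

```python
def parse_bed_lines(lines):
    """Parse BED lines into dicts with chrom, start, end and optional name."""
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and common BED header or comment lines.
        if not line or line.startswith(('#', 'track', 'browser')):
            continue
        parts = line.split('\t')
        record = {
            'chrom': parts[0],
            'start': int(parts[1]),  # BED starts are zero-based
            'end': int(parts[2]),    # BED ends are exclusive
        }
        if len(parts) > 3:
            record['name'] = parts[3]
        records.append(record)
    return records


# Hypothetical file content; the names below are made up.
example_lines = [
    'track name=bins\n',
    'chr3\t34108879\t34112879\tSox2_BIN_000\n',
    'chr3\t34112879\t34116879\tSox2_BIN_001\n',
]
parsed = parse_bed_lines(example_lines)
```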
classmethod from_list(list_of_locusmaps)[source]

Factory method that creates a new LocusMap object from a list of existing LocusMap objects by concatenation.

Parameters

list_of_locusmaps (list of LocusMap) – A list of LocusMap objects to be concatenated.

Returns

The concatenated LocusMap.

Return type

LocusMap

Notes

This function is more efficient than iterative addition (about 3.6 times faster in the benchmark below). Therefore, it is preferred to use

summed_locus_map = LocusMap.from_list(list_of_locus_maps)

over

summed_locus_map = sum(list_of_locus_maps, LocusMap([]))

as evidenced by

> python -mtimeit `
  -s'from lib5c.core.loci import LocusMap' `
  -s'lm = LocusMap.from_primerfile(\"test/primers.bed\")' `
  -s'lm_list = [lm.extract_region(r) for r in lm.regions]' `
  's = LocusMap.from_list(lm_list)'
10 loops, best of 3: 48.2 msec per loop

versus

> python -mtimeit `
  -s'from lib5c.core.loci import LocusMap' `
  -s'lm = LocusMap.from_primerfile(\"test/primers.bed\")' `
  -s'lm_list = [lm.extract_region(r) for r in lm.regions]' `
  's = sum(lm_list, LocusMap([]))'
10 loops, best of 3: 174 msec per loop
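
The same trade-off exists for built-in lists, which may help explain the timings: sum() builds a new intermediate object for every addition, while a single-pass concatenation builds the result once. Whether LocusMap's overhead has exactly this cause is an assumption; the sketch only shows that the two approaches give the same result.

```python
from itertools import chain

chunks = [[1, 2], [3, 4], [5, 6]]

# Iterative addition: each "+" copies everything accumulated so far.
pairwise = sum(chunks, [])

# Single pass: every element is visited once and one list is built.
single_pass = list(chain.from_iterable(chunks))
```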

Examples

>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_primerfile('test/primers.bed')
>>> sox2_locus_map = locus_map.extract_region('Sox2')
>>> sox2_locus_map.size()
265
>>> sox2_locus_map.get_regions()
['Sox2']
>>> klf4_locus_map = locus_map.extract_region('Klf4')
>>> klf4_locus_map.size()
251
>>> klf4_locus_map.get_regions()
['Klf4']
>>> summed_locus_map = LocusMap.from_list([sox2_locus_map,
...                                        klf4_locus_map])
...
>>> summed_locus_map.print_log()
LocusMap created
created from list
>>> summed_locus_map.size()
516
>>> summed_locus_map.get_regions()
['Sox2', 'Klf4']
>>> builtin_sum_result = sum([sox2_locus_map, klf4_locus_map],
...                          LocusMap([]))
...
>>> builtin_sum_result.size()
516
>>> builtin_sum_result.get_regions()
['Sox2', 'Klf4']
classmethod from_list_of_dict(list_of_dict)[source]

Factory method that creates a LocusMap object from a list of dicts that represent the Loci that the LocusMap should be composed of.

Parameters

list_of_dict (list of dict) – A list of dicts, with each dict representing a Locus that should be created and put into the LocusMap.

Returns

A LocusMap whose Locus objects are equivalent to the dicts passed in list_of_dict.

Return type

LocusMap

Examples

>>> from lib5c.core.loci import LocusMap
>>> list_of_dict = [{'chrom': 'chr3',
...                  'start': 34109023,
...                  'end': 34113109,
...                  'name': 'Sox2_FOR_2'},
...                 {'chrom': 'chr3',
...                  'start': 34113147,
...                  'end': 34116141,
...                  'name': 'Sox2_REV_4'}]
...
>>> locus_map = LocusMap.from_list_of_dict(list_of_dict)
>>> locus_map.print_log()
LocusMap created
created from list of dict
>>> for locus in locus_map:
...     print(locus)
...
Locus chr3:34109023-34113109
    name: Sox2_FOR_2
Locus chr3:34113147-34116141
    name: Sox2_REV_4
>>> list_of_dict_dup = [{'chrom': 'chr3',
...                      'start': 34109023,
...                      'end': 34113109,
...                      'name': 'Sox2_FOR_2'},
...                     {'chrom': 'chr3',
...                      'start': 34113147,
...                      'end': 34116141,
...                      'name': 'Sox2_REV_4'},
...                     {'chrom': 'chr3',
...                      'start': 34113147,
...                      'end': 34116141,
...                      'name': 'duplicate Locus!'}]
...
>>> locus_map_dup = LocusMap.from_list_of_dict(list_of_dict_dup)
Traceback (most recent call last):
    ...
ValueError: Locus objects in LocusMap must be unique
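
In the example above, the rejected entry shares its coordinates with an earlier one but has a different name, which suggests that uniqueness is judged on chrom, start, and end. Under that assumption, a uniqueness check could be sketched as follows; check_unique is a hypothetical helper, not a lib5c function.

```python
def check_unique(list_of_dict):
    """Raise ValueError if two records share chrom, start, and end."""
    seen = set()
    for record in list_of_dict:
        # Assumed identity key: coordinates only, metadata ignored.
        key = (record['chrom'], record['start'], record['end'])
        if key in seen:
            raise ValueError('Locus objects in LocusMap must be unique')
        seen.add(key)


records_with_duplicate = [
    {'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
     'name': 'Sox2_FOR_2'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
     'name': 'Sox2_REV_4'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
     'name': 'duplicate Locus!'},
]

try:
    check_unique(records_with_duplicate)
    message = None
except ValueError as error:
    message = str(error)
```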
classmethod from_primerfile(primerfile)[source]

Factory method that creates a LocusMap object from a BED file containing primer information.

Parameters

primerfile (str) – String reference to the primer file.

Returns

LocusMap object parsed from the primer file.

Return type

LocusMap

Examples

>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_primerfile('test/primers.bed')
>>> locus_map.size()
1551
>>> locus_map.print_log()
LocusMap created
source primerfile: test/primers.bed
get_index(name)[source]

Get the index of the Locus object in this LocusMap with a specified name.

Parameters

name (str) – The name of the Locus to get the index for.

Returns

The index of the Locus object with the specified name.

Return type

int

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.get_index('Sox2_REV_4')
1
get_index_by_hash(hash_value)[source]

Get the index of the Locus object contained in this LocusMap with a specified hash.

Parameters

hash_value (int) – The hash of the Locus object to find.

Returns

The index of the desired Locus object within locus_list if it exists within this LocusMap object, or None if it doesn’t.

Return type

int or None

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109),
...               Locus('chr3', 34113147, 34116141)]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_hash = hash(Locus('chr3', 34113147, 34116141))
>>> locus_map.get_index_by_hash(locus_hash)
1
>>> locus_map.get_index_by_hash(123) is None
True
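
A lookup of this kind can be sketched with an ordinary dict whose get() method returns None for unknown keys. The sketch hashes coordinate tuples as a stand-in for Locus objects, which is an assumption; the example above only shows that two Locus objects with equal coordinates hash equally.

```python
coords = [
    ('chr3', 34109023, 34113109),
    ('chr3', 34113147, 34116141),
]

# Map each element's hash to its position in the list.
hash_to_index = {hash(c): i for i, c in enumerate(coords)}

# An equal tuple built separately has the same hash within one process.
found = hash_to_index.get(hash(('chr3', 34113147, 34116141)))

# One more than the largest key is guaranteed not to be a key.
absent_hash = max(hash_to_index) + 1
missing = hash_to_index.get(absent_hash)
```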
get_region_sizes()[source]

Gets information about the number of Locus objects in each region.

Returns

A dict mapping region names as strings to the number of Locus objects in that region.

Return type

dict(str -> int)

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.get_region_sizes() == {'Sox2': 2, 'Nestin': 1}
True
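
The same tally can be reproduced on plain data with collections.Counter, shown here as a minimal sketch on a list of region names rather than on Locus objects.

```python
from collections import Counter

# One region name per locus, in LocusMap order.
region_per_locus = ['Sox2', 'Sox2', 'Nestin']

# Count how many loci fall in each region.
region_sizes = dict(Counter(region_per_locus))
```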
get_regions()[source]

Gets the regions spanned by the Locus objects in this LocusMap.

Returns

The ordered list of region names.

Return type

list of str

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.get_regions()
['Sox2', 'Nestin']
size()[source]

Get the number of Locus objects in the LocusMap.

Returns

The number of Locus objects in the LocusMap.

Return type

int

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109),
...               Locus('chr3', 34113147, 34116141)]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.size()
2
to_bedfile(filename, fields=('name', ))[source]

Write this LocusMap to disk as a BED-formatted file.

Parameters
  • filename (str) – String reference to the file to write to.

  • fields (list of str, optional) – Additional columns to write to the BED file after the standard chromosome, start, and end columns. List them in the desired column order as strings matching keys in the data attributes of the Locus objects that make up this LocusMap.

Examples

>>> from lib5c.core.loci import Locus, LocusMap
>>> lm1 = LocusMap([
...     Locus('chr3', 34109023, 34113109, name='5C_329_Sox2_FOR_2'),
...     Locus('chr3', 34113147, 34116141, name='5C_329_Sox2_REV_4')
... ])
...
>>> lm1.to_bedfile('test/core_test_locusmap.bed')
>>> lm2 = LocusMap.from_primerfile('test/core_test_locusmap.bed')
>>> for locus in lm2:
...     print(locus)
Locus chr3:34109023-34113109
    name: 5C_329_Sox2_FOR_2
    number: 2
    orientation: 3'
    region: Sox2
    strand: +
Locus chr3:34113147-34116141
    name: 5C_329_Sox2_REV_4
    number: 4
    orientation: 5'
    region: Sox2
    strand: -
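
The column layout controlled by fields can be sketched without lib5c by writing plain dicts to an in-memory buffer. The helper write_bed is hypothetical, and writing '.' for a missing key is an assumption borrowed from common BED practice; what to_bedfile() does when a key is absent is not documented here.

```python
import io


def write_bed(records, handle, fields=('name',)):
    """Write records as tab-separated BED lines to an open text handle."""
    for record in records:
        # The first three columns are always chrom, start, end.
        columns = [record['chrom'], str(record['start']), str(record['end'])]
        # Extra columns follow in the order given by fields.
        columns.extend(str(record.get(field, '.')) for field in fields)
        handle.write('\t'.join(columns) + '\n')


bed_records = [
    {'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
     'name': '5C_329_Sox2_FOR_2'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
     'name': '5C_329_Sox2_REV_4'},
]

buffer = io.StringIO()
write_bed(bed_records, buffer)
bed_text = buffer.getvalue()
```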