lib5c.core.loci module
-
class lib5c.core.loci.Locus(chrom, start, end, **kwargs)
Bases: lib5c.core.mixins.Picklable, lib5c.core.mixins.Annotatable
Anything with a chromosome, start, and end. Can also include arbitrary metadata.
-
chrom
The chromosome on which this locus resides (e.g., 'chr4').
- Type
str
-
start
The start coordinate for the zero-indexed, half-open interval occupied by the locus.
- Type
int
-
end
The end coordinate for the zero-indexed, half-open interval occupied by the locus.
- Type
int
-
data
Arbitrary additional data about the locus. This attribute is filled in with any kwargs passed to the constructor.
- Type
dict(str -> any)
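The zero-indexed, half-open convention used by start and end means the interval is [start, end): the base at start is included and the base at end is not. The following is a minimal sketch of what that implies for length and overlap calculations. The helper names locus_length and loci_overlap are hypothetical and are not part of lib5c.

```python
def locus_length(start, end):
    """Number of bases covered by the zero-indexed, half-open interval [start, end)."""
    return end - start


def loci_overlap(a_start, a_end, b_start, b_end):
    """True if two half-open intervals on the same chromosome share at least one base."""
    return a_start < b_end and b_start < a_end


# Coordinates borrowed from the examples on this page.
print(locus_length(34109023, 34113109))  # 4086

# Half-open intervals that merely touch at an endpoint do not overlap.
print(loci_overlap(0, 10, 10, 20))  # False
print(loci_overlap(0, 10, 9, 20))   # True
```

No "+ 1" correction is needed for the length, which is one of the practical reasons half-open coordinates are common in BED-style data.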
Notes
Locus objects support comparison and ordering via the total_ordering decorator. See this class’s implementations of __eq__() and __lt__() for more details.
Locus objects support the Annotatable mixin, which is the source of their data attribute and all its related functions.
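As an illustration of how functools.total_ordering fills in the remaining comparison operators from __eq__() and __lt__(), here is a minimal sketch. SketchLocus is a hypothetical stand-in, not the lib5c class, and the comparison key used here (chrom, start, end, with metadata ignored) is an assumption; consult the Locus source for the actual rules.

```python
from functools import total_ordering


@total_ordering
class SketchLocus:
    """Hypothetical stand-in for Locus; not lib5c's implementation."""

    def __init__(self, chrom, start, end, **kwargs):
        self.chrom = chrom
        self.start = start
        self.end = end
        self.data = dict(kwargs)  # arbitrary metadata, analogous to Locus.data

    def _key(self):
        # Assumed comparison key: coordinates only, metadata ignored.
        return (self.chrom, self.start, self.end)

    def __eq__(self, other):
        if not isinstance(other, SketchLocus):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, SketchLocus):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Defining __eq__ removes the default hash, so provide one on the same key.
        return hash(self._key())


a = SketchLocus('chr3', 34113147, 34116141, name='Sox2_REV_4')
b = SketchLocus('chr3', 34109023, 34113109, name='Sox2_FOR_2')

print(a > b)  # True: __gt__ was derived by total_ordering
print([locus.data['name'] for locus in sorted([a, b])])
# ['Sox2_FOR_2', 'Sox2_REV_4']
```

Note that this sketch compares chromosome names as plain strings, so 'chr10' would sort before 'chr3'; whether lib5c handles that differently is not stated on this page.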
as_dict()
Gets a dict representation of the Locus.
- Returns
This dict is guaranteed to have at least the following keys:
{ 'chrom': str, 'start': int, 'end': int }
Other keys may be present if included in the data attribute.
- Return type
dict
Examples
>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, strand='+')
>>> locus.as_dict() == \
...     {'chrom': 'chr3',
...      'start': 34109023,
...      'end': 34113109,
...      'strand': '+'}
True
-
get_name()
Gets the name of the locus, if present.
- Returns
The value of data['name'] if it exists; None otherwise.
- Return type
str or None
Examples
>>> from lib5c.core.loci import Locus
>>> locus = Locus('chr3', 34109023, 34113109, name='5C_329_Sox2_FOR_2')
>>> locus.get_name()
'5C_329_Sox2_FOR_2'
-
-
class lib5c.core.loci.LocusMap(locus_list)
Bases: lib5c.core.mixins.Picklable, lib5c.core.mixins.Annotatable, lib5c.core.mixins.Loggable
Representation of an organized group of Locus objects.
-
locus_list
Ordered list of unique Locus objects in the LocusMap.
- Type
list of Locus objects
-
regions
Ordered list of region names, as strings, present in the LocusMap. This list is filled in only when the Locus objects within the LocusMap have a 'region' key in their data attribute.
- Type
list of str
-
name_dict
Dict mapping Locus names as strings to the Locus object with that name. This dict is filled in only when the Locus objects within the LocusMap have a 'name' key in their data attribute.
- Type
dict(str -> Locus)
-
region_index_dict
Dict mapping each region name to the ordered list of Locus objects in that region, so that a region name together with an index within the region identifies one Locus object. For example, locus_map.region_index_dict['Sox2'][3] resolves to the 4th Locus object of the Sox2 region in the LocusMap instance called locus_map.
- Type
dict(str -> list of Locus objects)
-
hash_to_index_dict
Maps a Locus object’s hash to its index within locus_list.
- Type
dict(int -> int)
-
name_to_index_dict
Dict mapping Locus names as strings to their index within the LocusMap. This dict is filled in only when the Locus objects within the LocusMap have a 'name' key in their data attribute.
- Type
dict(str -> int)
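To make the shape of these lookup attributes concrete, the following sketch builds analogous structures from plain dicts. It is an illustration only: the dicts stand in for Locus objects, the input list is assumed to be already sorted, and the variable names are hypothetical rather than taken from lib5c.

```python
# Plain dicts stand in for Locus objects; the list is assumed to be sorted already.
loci = [
    {'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
     'name': 'Sox2_FOR_2', 'region': 'Sox2'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
     'name': 'Sox2_REV_4', 'region': 'Sox2'},
    {'chrom': 'chr3', 'start': 87282063, 'end': 87285636,
     'name': 'Nestin_REV_9', 'region': 'Nestin'},
]

regions = []        # analogous to regions: region names in order of first appearance
name_to_index = {}  # analogous to name_to_index_dict: name -> position in the list
region_index = {}   # analogous to region_index_dict: region -> list of loci

for i, locus in enumerate(loci):
    name_to_index[locus['name']] = i
    if locus['region'] not in region_index:
        regions.append(locus['region'])
        region_index[locus['region']] = []
    region_index[locus['region']].append(locus)

print(regions)                          # ['Sox2', 'Nestin']
print(name_to_index['Nestin_REV_9'])    # 2
print(region_index['Sox2'][1]['name'])  # Sox2_REV_4
```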
Notes
Locus objects in a LocusMap are ordered by the total ordering implemented by the Locus class. See that class’s implementation of __eq__() and __lt__() for more details.
LocusMap objects support the Loggable and Annotatable mixins. See the Examples section for an example.
Examples
>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_primerfile('test/primers.bed')
>>> locus_map.size()
1551
>>> locus_map.print_log()
LocusMap created
source primerfile: test/primers.bed
>>> locus_map.set_value('test key', 'test value')
>>> locus_map.get_value('test key')
'test value'
-
as_dict_of_list_of_dict()
Gets a primitive representation of the LocusMap, organized by region. Converts the locus_list attribute of this LocusMap from a list of Locus objects to a dict whose keys are region names as strings and whose values are lists of dicts representing the Locus objects in each region.
- Returns
The primitive representation of the LocusMap.
- Return type
dict(str -> list of dict)
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nest'),
...               Locus('chr3', 87285637, 87295935, region='Nest')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.as_dict_of_list_of_dict() == \
...     {'Sox2': [{'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
...                'region': 'Sox2'},
...               {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
...                'region': 'Sox2'}],
...      'Nest': [{'chrom': 'chr3', 'start': 87282063, 'end': 87285636,
...                'region': 'Nest'},
...               {'chrom': 'chr3', 'start': 87285637, 'end': 87295935,
...                'region': 'Nest'}]}
True
-
as_list_of_dict()
Gets a primitive representation of the LocusMap. Converts the locus_list attribute of this LocusMap from a list of Locus objects to a list of dicts representing those Locus objects.
- Returns
The primitive representation of the LocusMap.
- Return type
list of dict
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.as_list_of_dict() == \
...     [{'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
...       'name': 'Sox2_FOR_2'},
...      {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
...       'name': 'Sox2_REV_4'}]
True
-
by_index(index)
Get the Locus object contained in this LocusMap with a specified index.
- Parameters
index (int) – The index of the Locus to get.
- Returns
The Locus object with the specified index.
- Return type
Locus
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> print(locus_map.by_index(1))
Locus chr3:34113147-34116141 name: Sox2_REV_4
-
by_name(name)
Get the Locus object contained in this LocusMap with a specified name.
- Parameters
name (str) – The name of the Locus to get.
- Returns
The Locus object with the specified name.
- Return type
Locus
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> print(locus_map.by_name('Sox2_REV_4'))
Locus chr3:34113147-34116141 name: Sox2_REV_4
-
by_region_index(region, index)
Get the Locus object contained in this LocusMap with a specified index within a specified region. In other words, the Locus at position index within the region named region.
- Parameters
region (str) – The name of the region to look for the Locus in.
index (int) – The index of the desired Locus within the specified region.
- Returns
The specified Locus.
- Return type
Locus
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> print(locus_map.by_region_index('Nestin', 1))
Locus chr3:87285637-87295935 region: Nestin
-
delete(index)
Creates a new LocusMap object that excludes the Locus at a specified index.
- Parameters
index (int) – The index of the Locus to exclude.
- Returns
The new LocusMap.
- Return type
LocusMap
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [
...     Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...     Locus('chr3', 34113147, 34116141, name='Sox2_REV_4'),
...     Locus('chr3', 87282063, 87285636, name='Nestin_REV_9'),
...     Locus('chr3', 87285637, 87295935, name='Nestin_FOR_10')
... ]
...
>>> locus_map = LocusMap(locus_list)
>>> deleted_locus_map = locus_map.delete(2)
>>> deleted_locus_map.print_log()
LocusMap created
deleted locus at index 2 with name Nestin_REV_9
>>> for locus in deleted_locus_map:
...     print(locus)
Locus chr3:34109023-34113109 name: Sox2_FOR_2
Locus chr3:34113147-34116141 name: Sox2_REV_4
Locus chr3:87285637-87295935 name: Nestin_FOR_10
-
extract_region(region)
Create a LocusMap representing the Locus objects in only one specified region of this LocusMap.
- Parameters
region (str) – The name of the region to extract.
- Returns
A new LocusMap restricted to the specified region.
- Return type
LocusMap
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> extracted_locus_map = locus_map.extract_region('Sox2')
>>> extracted_locus_map.print_log()
LocusMap created
extracted region Sox2
>>> for locus in extracted_locus_map:
...     print(locus)
...
Locus chr3:34109023-34113109 region: Sox2
Locus chr3:34113147-34116141 region: Sox2
-
extract_slice(desired_slice)
Gets a new LocusMap object representing a subset of the Locus objects in this LocusMap specified by a slice.
- Parameters
desired_slice (slice) – The slice to use to subset this LocusMap.
- Returns
The new LocusMap.
- Return type
LocusMap
Notes
Since LocusMap objects are sorted, the returned LocusMap will always be sorted, regardless of the slice direction.
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [
...     Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...     Locus('chr3', 34113147, 34116141, name='Sox2_REV_4'),
...     Locus('chr3', 87282063, 87285636, name='Nestin_REV_9'),
...     Locus('chr3', 87285637, 87295935, name='Nestin_FOR_10')
... ]
...
>>> locus_map = LocusMap(locus_list)
>>> sliced_map = locus_map[1:3]
>>> sliced_map.print_log()
LocusMap created
sliced out slice(1, 3, None)
>>> for locus in sliced_map:
...     print(locus)
...
Locus chr3:34113147-34116141 name: Sox2_REV_4
Locus chr3:87282063-87285636 name: Nestin_REV_9
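The note about slice direction can be illustrated with plain Python lists. This is a sketch under the assumption that a sorted container re-sorts whatever elements it is built from; the page does not describe how LocusMap achieves this internally. Tuples of (start, name) stand in for Locus objects.

```python
names = ['Sox2_FOR_2', 'Sox2_REV_4', 'Nestin_REV_9', 'Nestin_FOR_10']
starts = [34109023, 34113147, 87282063, 87285637]
loci = list(zip(starts, names))  # already sorted by start coordinate

reverse_subset = loci[2:0:-1]     # a slice taken in the reverse direction
rebuilt = sorted(reverse_subset)  # a sorted container re-sorts what it is given

print([name for _, name in reverse_subset])  # ['Nestin_REV_9', 'Sox2_REV_4']
print([name for _, name in rebuilt])         # ['Sox2_REV_4', 'Nestin_REV_9']
```

The reversed slice selects the same two elements as the forward slice 1:3, and after re-sorting the two results are identical.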
-
classmethod from_binfile(binfile)
Factory method that creates a LocusMap object from a BED file containing bin information.
- Parameters
binfile (str) – String reference to the bin file.
- Returns
LocusMap object parsed from the bin file.
- Return type
LocusMap
Examples
>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_binfile('test/bins_new.bed')
>>> locus_map.size()
1807
>>> locus_map.print_log()
LocusMap created
source binfile: test/bins_new.bed
-
classmethod from_list(list_of_locusmaps)
Factory method that creates a new LocusMap object from a list of existing LocusMap objects by concatenation.
- Parameters
list_of_locusmaps (list of LocusMap) – A list of LocusMap objects to be concatenated.
- Returns
The concatenated LocusMap.
- Return type
LocusMap
Notes
This function is more efficient than iterative addition (roughly 3.6 times faster in the timings below). Therefore, it is preferred to use
summed_locus_map = LocusMap.from_list(list_of_locus_maps)
over
summed_locus_map = sum(list_of_locus_maps, LocusMap([]))
as evidenced by
> python -mtimeit `
    -s'from lib5c.core.loci import LocusMap' `
    -s'lm = LocusMap.from_primerfile(\"test/primers.bed\")' `
    -s'lm_list = [lm.extract_region(r) for r in lm.regions]' `
    's = LocusMap.from_list(lm_list)'
10 loops, best of 3: 48.2 msec per loop
versus
> python -mtimeit `
    -s'from lib5c.core.loci import LocusMap' `
    -s'lm = LocusMap.from_primerfile(\"test/primers.bed\")' `
    -s'lm_list = [lm.extract_region(r) for r in lm.regions]' `
    's = sum(lm_list, LocusMap([]))'
10 loops, best of 3: 174 msec per loop
Examples
>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_primerfile('test/primers.bed')
>>> sox2_locus_map = locus_map.extract_region('Sox2')
>>> sox2_locus_map.size()
265
>>> sox2_locus_map.get_regions()
['Sox2']
>>> klf4_locus_map = locus_map.extract_region('Klf4')
>>> klf4_locus_map.size()
251
>>> klf4_locus_map.get_regions()
['Klf4']
>>> summed_locus_map = LocusMap.from_list([sox2_locus_map,
...                                        klf4_locus_map])
...
>>> summed_locus_map.print_log()
LocusMap created
created from list
>>> summed_locus_map.size()
516
>>> summed_locus_map.get_regions()
['Sox2', 'Klf4']
>>> builtin_sum_result = sum([sox2_locus_map, klf4_locus_map],
...                          LocusMap([]))
...
>>> builtin_sum_result.size()
516
>>> builtin_sum_result.get_regions()
['Sox2', 'Klf4']
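The general reason a single-pass concatenation tends to beat repeated addition can be sketched with ordinary lists. This is a generic Python illustration, not a description of how from_list is implemented; lists of (chrom, start, end) tuples stand in for per-region LocusMap objects.

```python
from itertools import chain

# Lists of (chrom, start, end) tuples stand in for per-region LocusMaps.
sox2 = [('chr3', 34109023, 34113109), ('chr3', 34113147, 34116141)]
nestin = [('chr3', 87282063, 87285636), ('chr3', 87285637, 87295935)]
parts = [sox2, nestin]

# Repeated addition builds a new intermediate container at every step.
summed = sum(parts, [])

# A single pass visits every element once and builds one container at the end.
concatenated = list(chain.from_iterable(parts))

print(summed == concatenated)  # True
print(len(concatenated))       # 4
```

Both approaches give the same result; the difference is only in how many intermediate containers are constructed along the way, which matters more as the number of parts grows.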
-
classmethod from_list_of_dict(list_of_dict)
Factory method that creates a LocusMap object from a list of dicts representing the Locus objects that should make up the LocusMap.
- Parameters
list_of_dict (list of dict) – A list of dicts, with each dict representing a Locus that should be created and put into the LocusMap.
- Returns
A LocusMap whose Locus objects are equivalent to the dicts passed in list_of_dict.
- Return type
LocusMap
Examples
>>> from lib5c.core.loci import LocusMap
>>> list_of_dict = [{'chrom': 'chr3',
...                  'start': 34109023,
...                  'end': 34113109,
...                  'name': 'Sox2_FOR_2'},
...                 {'chrom': 'chr3',
...                  'start': 34113147,
...                  'end': 34116141,
...                  'name': 'Sox2_REV_4'}]
...
>>> locus_map = LocusMap.from_list_of_dict(list_of_dict)
>>> locus_map.print_log()
LocusMap created
created from list of dict
>>> for locus in locus_map:
...     print(locus)
...
Locus chr3:34109023-34113109 name: Sox2_FOR_2
Locus chr3:34113147-34116141 name: Sox2_REV_4
>>> list_of_dict_dup = [{'chrom': 'chr3',
...                      'start': 34109023,
...                      'end': 34113109,
...                      'name': 'Sox2_FOR_2'},
...                     {'chrom': 'chr3',
...                      'start': 34113147,
...                      'end': 34116141,
...                      'name': 'Sox2_REV_4'},
...                     {'chrom': 'chr3',
...                      'start': 34113147,
...                      'end': 34116141,
...                      'name': 'duplicate Locus!'}]
...
>>> locus_map_dup = LocusMap.from_list_of_dict(list_of_dict_dup)
Traceback (most recent call last):
    ...
ValueError: Locus objects in LocusMap must be unique
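In the example above, the rejected entry differs from an earlier one only in its name, which suggests that uniqueness is judged on coordinates. A minimal sketch of such a check follows, assuming the identity of a locus is its (chrom, start, end) triple. The function check_unique and its error message are hypothetical and are not part of lib5c.

```python
def check_unique(list_of_dict):
    """Raise ValueError if two dicts share the same (chrom, start, end)."""
    seen = set()
    for d in list_of_dict:
        key = (d['chrom'], d['start'], d['end'])  # assumed identity of a locus
        if key in seen:
            raise ValueError('duplicate locus at %s:%d-%d' % key)
        seen.add(key)


records = [
    {'chrom': 'chr3', 'start': 34109023, 'end': 34113109, 'name': 'Sox2_FOR_2'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141, 'name': 'Sox2_REV_4'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141, 'name': 'duplicate Locus!'},
]

check_unique(records[:2])  # the first two records are distinct, so this passes

try:
    check_unique(records)
except ValueError as error:
    print(error)  # duplicate locus at chr3:34113147-34116141
```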
-
classmethod from_primerfile(primerfile)
Factory method that creates a LocusMap object from a BED file containing primer information.
- Parameters
primerfile (str) – String reference to the primer file.
- Returns
LocusMap object parsed from the primer file.
- Return type
LocusMap
Examples
>>> from lib5c.core.loci import LocusMap
>>> locus_map = LocusMap.from_primerfile('test/primers.bed')
>>> locus_map.size()
1551
>>> locus_map.print_log()
LocusMap created
source primerfile: test/primers.bed
-
get_index(name)
Get the index of the Locus object in this LocusMap with a specified name.
- Parameters
name (str) – The name of the Locus to get the index for.
- Returns
The index of the Locus object with the specified name.
- Return type
int
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, name='Sox2_FOR_2'),
...               Locus('chr3', 34113147, 34116141, name='Sox2_REV_4')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.get_index('Sox2_REV_4')
1
-
get_index_by_hash(hash_value)
Get the index of the Locus object contained in this LocusMap with a specified hash.
- Parameters
hash_value (int) – The hash of the Locus object to find.
- Returns
The index of the desired Locus object within locus_list if it exists within this LocusMap object, or None if it doesn’t.
- Return type
int or None
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109),
...               Locus('chr3', 34113147, 34116141)]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_hash = hash(Locus('chr3', 34113147, 34116141))
>>> locus_map.get_index_by_hash(locus_hash)
1
>>> locus_map.get_index_by_hash(123) is None
True
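A lookup of this kind can be sketched with a plain dict keyed on hashes, similar in shape to hash_to_index_dict. The tuples below stand in for Locus objects, and the module-level function is a hypothetical illustration rather than the lib5c method.

```python
# Tuples stand in for Locus objects; any hashable value behaves the same way.
loci = [('chr3', 34109023, 34113109), ('chr3', 34113147, 34116141)]

# Analogous to hash_to_index_dict: hash of each element -> its position.
hash_to_index = {hash(locus): i for i, locus in enumerate(loci)}


def get_index_by_hash(hash_value):
    # dict.get returns None for a missing key, matching the documented behavior.
    return hash_to_index.get(hash_value)


print(get_index_by_hash(hash(('chr3', 34113147, 34116141))))  # 1

# A value larger than every stored hash is guaranteed to be absent.
missing_hash = max(hash_to_index) + 1
print(get_index_by_hash(missing_hash))  # None
```

Because two equal objects must have equal hashes, a freshly constructed object with the same coordinates finds the stored entry, which is the pattern the doctest above relies on.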
-
get_region_sizes()
Gets information about the number of Locus objects in each region.
- Returns
A dict mapping region names as strings to the number of Locus objects in that region.
- Return type
dict(str -> int)
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.get_region_sizes() == {'Sox2': 2, 'Nestin': 1}
True
-
get_regions()
Gets the regions spanned by the Locus objects in this LocusMap.
- Returns
The ordered list of region names.
- Return type
list of str
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109, region='Sox2'),
...               Locus('chr3', 34113147, 34116141, region='Sox2'),
...               Locus('chr3', 87282063, 87285636, region='Nestin'),
...               Locus('chr3', 87285637, 87295935, region='Nestin')]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.get_regions()
['Sox2', 'Nestin']
-
size()
Get the number of Locus objects in the LocusMap.
- Returns
The number of Locus objects in the LocusMap.
- Return type
int
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> locus_list = [Locus('chr3', 34109023, 34113109),
...               Locus('chr3', 34113147, 34116141)]
...
>>> locus_map = LocusMap(locus_list)
>>> locus_map.size()
2
-
to_bedfile(filename, fields=('name',))
Write this LocusMap to disk as a BED-formatted file.
- Parameters
filename (str) – String reference to the file to write to.
fields (list of str, optional) – Additional columns to write to the BED file after the traditional chromosome, start, and end columns. Specify the columns in order, as strings corresponding to keys in the data attributes of the Locus objects that make up this LocusMap.
Examples
>>> from lib5c.core.loci import Locus, LocusMap
>>> lm1 = LocusMap([
...     Locus('chr3', 34109023, 34113109, name='5C_329_Sox2_FOR_2'),
...     Locus('chr3', 34113147, 34116141, name='5C_329_Sox2_REV_4')
... ])
...
>>> lm1.to_bedfile('test/core_test_locusmap.bed')
>>> lm2 = LocusMap.from_primerfile('test/core_test_locusmap.bed')
>>> for locus in lm2:
...     print(locus)
Locus chr3:34109023-34113109 name: 5C_329_Sox2_FOR_2 number: 2 orientation: 3' region: Sox2 strand: +
Locus chr3:34113147-34116141 name: 5C_329_Sox2_REV_4 number: 4 orientation: 5' region: Sox2 strand: -
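The column layout described by the fields parameter can be sketched as follows. This is an assumption-laden illustration, not lib5c's writer: write_bed is a hypothetical function, plain dicts stand in for Locus objects, and the output goes to an in-memory buffer instead of a file. It assumes tab-separated columns, as is conventional for BED files.

```python
import io


def write_bed(loci, handle, fields=('name',)):
    """Write one tab-separated line per locus: chrom, start, end, then extra fields."""
    for locus in loci:
        columns = [locus['chrom'], str(locus['start']), str(locus['end'])]
        # Extra columns follow in the order given by `fields`.
        columns.extend(str(locus[field]) for field in fields)
        handle.write('\t'.join(columns) + '\n')


loci = [
    {'chrom': 'chr3', 'start': 34109023, 'end': 34113109,
     'name': '5C_329_Sox2_FOR_2'},
    {'chrom': 'chr3', 'start': 34113147, 'end': 34116141,
     'name': '5C_329_Sox2_REV_4'},
]

buffer = io.StringIO()
write_bed(loci, buffer)
print(buffer.getvalue(), end='')
```

Passing a longer tuple for fields would append further columns in that order, provided every locus carries the corresponding keys.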
-