lib5c.contrib.luigi.tasks module
Provides luigi Task subclasses that wrap the lib5c command line functions.
-
class lib5c.contrib.luigi.tasks.BinTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.FilteringTask
Task class for binning fragment-level countsfiles into binned countsfiles.
Wraps the lib5c bin command line command.
- Input/output specification:
self.input()[0]: the bin .bed file
self.input()[1]: the primer .bed file
self.input()[2]: the input fragment-level countsfile
self.output(): the resulting countsfile of binned observed values
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.CmdTask(*args, **kwargs)
Bases: luigi.task.Task
Luigi Task parent class for Tasks whose run() behavior should be to execute a specific command on the command line.
Subclasses must implement _construct_cmd_string(), which should return a string corresponding to the command to be run on the command line.
If the bsub Python package is installed, the command will be executed using the bsub scheduling system, and the caller will wait for the job corresponding to the task to complete.
If the bsub Python package is not installed, the command will be simply executed via subprocess.
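The subclassing contract described above can be sketched without luigi or lib5c. This is a minimal illustration, not the library's implementation: SketchCmdTask and EchoTask are hypothetical names, and only the subprocess fallback path is shown because the bsub package's API is not described here.

```python
import shlex
import subprocess
import sys


class SketchCmdTask:
    """Hypothetical stand-in for a CmdTask-like parent class (no luigi)."""

    def _construct_cmd_string(self):
        # Subclasses return the full command line as a single string.
        raise NotImplementedError

    def run(self):
        cmd = self._construct_cmd_string()
        # Fallback path only: run the command in a local subprocess and
        # raise CalledProcessError if it exits with a nonzero status.
        return subprocess.run(shlex.split(cmd), check=True,
                              capture_output=True, text=True)


class EchoTask(SketchCmdTask):
    """Hypothetical subclass; a real one would build a lib5c command."""

    def _construct_cmd_string(self):
        # The Python interpreter is used only as a harmless demo command.
        return f'{shlex.quote(sys.executable)} -c "print(42)"'


result = EchoTask().run()
```

The parent class owns execution, so each subclass only has to describe its command as a string.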
-
class lib5c.contrib.luigi.tasks.CrossVarianceTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.VarianceTask
Task class for computing variance estimates using the cross-replicate variance method.
Wraps the lib5c variance command line command called with -s/--source cross_rep.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input expected countsfile
self.input()[2:]: the input observed countsfiles for each replicate
self.output(): the resulting countsfile of variance estimates
This class defines a conditions Parameter which should be used to ensure that the input observed countsfiles passed in self.input()[2:] all belong to the same condition. This logic is not implemented here.
-
conditions = <luigi.parameter.Parameter object>
-
source = <luigi.parameter.Parameter object>
-
class lib5c.contrib.luigi.tasks.DetermineBinsTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for determining bin locations.
Wraps the lib5c determine-bins command line command.
- Input/output specification:
self.input(): the input primer .bed file
self.output(): the resulting bin .bed file
-
bin_width = <luigi.parameter.IntParameter object>
-
class lib5c.contrib.luigi.tasks.DistributionTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
-
dist = <luigi.parameter.Parameter object>
-
log = <luigi.parameter.BoolParameter object>
-
mode = <luigi.parameter.Parameter object>
-
-
class lib5c.contrib.luigi.tasks.DivideTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for dividing one countsfile by another.
Wraps the lib5c divide command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the dividend (countsfile to divide)
self.input()[2]: the divisor (countsfile to divide by)
self.output(): the quotient (countsfile resulting from the division)
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.ExpectedTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for computing expected models.
Wraps the lib5c expected command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input observed countsfile
self.output(): the resulting countsfile of expected values
-
degree = <luigi.parameter.IntParameter object>
-
donut = <luigi.parameter.BoolParameter object>
-
donut_frac = <luigi.parameter.FloatParameter object>
-
exclude_near_diagonal = <luigi.parameter.BoolParameter object>
-
global_expected = <luigi.parameter.BoolParameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
log_donut = <luigi.parameter.BoolParameter object>
-
log_transform = <luigi.parameter.Parameter object>
-
lowess = <luigi.parameter.BoolParameter object>
-
lowess_frac = <luigi.parameter.FloatParameter object>
-
max_with_lower_left = <luigi.parameter.BoolParameter object>
-
min_exp = <luigi.parameter.FloatParameter object>
-
monotonic = <luigi.parameter.BoolParameter object>
-
p = <luigi.parameter.IntParameter object>
-
plot_outfile = <luigi.parameter.Parameter object>
-
plot_outfile_hexbin = <luigi.parameter.BoolParameter object>
-
plot_outfile_kde = <luigi.parameter.BoolParameter object>
-
powerlaw = <luigi.parameter.BoolParameter object>
-
regression = <luigi.parameter.BoolParameter object>
-
run()
-
w = <luigi.parameter.IntParameter object>
-
class lib5c.contrib.luigi.tasks.ExpressTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for applying Express bias correction to countsfiles.
Wraps the lib5c express command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input countsfile
self.output(): the resulting Express-normalized countsfile
-
bias = <luigi.parameter.BoolParameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.FilteringTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Parent Task class for Tasks related to binning and smoothing.
-
inverse_weights = <luigi.parameter.BoolParameter object>
-
threshold = <luigi.parameter.FloatParameter object>
-
window_function = <luigi.parameter.Parameter object>
-
window_width = <luigi.parameter.IntParameter object>
-
wipe_unsmoothable_columns = <luigi.parameter.BoolParameter object>
-
-
class lib5c.contrib.luigi.tasks.IcedTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for applying ICED bias correction to countsfiles.
Wraps the lib5c iced command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input countsfile
self.output(): the resulting ICED-normalized countsfile
-
bias = <luigi.parameter.BoolParameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
imputation_size = <luigi.parameter.IntParameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.InteractionScoreTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for converting p-values to interaction scores.
Wraps the lib5c interaction-score command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input countsfile of p-values
self.output(): the resulting countsfile of interaction scores
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.KnightRuizTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for applying KR bias correction to countsfiles.
Wraps the lib5c kr command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input countsfile
self.output(): the resulting KR-normalized countsfile
-
bias = <luigi.parameter.BoolParameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
imputation_size = <luigi.parameter.IntParameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.LegacyPvaluesOneTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.DistributionTask
-
bias = <luigi.parameter.BoolParameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
run()
-
-
class lib5c.contrib.luigi.tasks.LegacyPvaluesTwoTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
-
bias = <luigi.parameter.BoolParameter object>
-
dist = <luigi.parameter.Parameter object>
-
distance_tolerance = <luigi.parameter.IntParameter object>
-
fractional_tolerance = <luigi.parameter.FloatParameter object>
-
grouping = <luigi.parameter.Parameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
log = <luigi.parameter.BoolParameter object>
-
mode = <luigi.parameter.Parameter object>
-
run()
-
-
class lib5c.contrib.luigi.tasks.LegacyVisualizeFitTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.DistributionTask, lib5c.contrib.luigi.tasks.RegionalTaskMixin
-
distance_scale = <luigi.parameter.IntParameter object>
-
expected_value = <luigi.parameter.FloatParameter object>
-
tolerance = <luigi.parameter.FloatParameter object>
-
-
class lib5c.contrib.luigi.tasks.LegacyVisualizeVarianceTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.DistributionTask, lib5c.contrib.luigi.tasks.RegionalTaskMixin
-
class lib5c.contrib.luigi.tasks.LogTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for logging or unlogging a countsfile.
Wraps the lib5c log command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input countsfile (to be logged)
self.output(): the resulting countsfile (after logging)
-
log_base = <luigi.parameter.Parameter object>
-
pseudocount = <luigi.parameter.FloatParameter object>
-
unlog = <luigi.parameter.BoolParameter object>
-
class lib5c.contrib.luigi.tasks.OutliersTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for applying high outlier removal to countsfiles.
Wraps the lib5c outliers command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input countsfile
self.output(): the resulting outlier-filtered countsfile
-
fold_threshold = <luigi.parameter.FloatParameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
overwrite_value = <luigi.parameter.Parameter object>
-
run()
-
window_size = <luigi.parameter.IntParameter object>
-
class lib5c.contrib.luigi.tasks.PvalueTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for calling p-values.
Wraps the lib5c pvalues command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input observed countsfile
self.input()[2]: the input expected countsfile
self.input()[3]: the input variance countsfile
self.output(): the resulting countsfile of p-values
-
distribution = <luigi.parameter.Parameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
log = <luigi.parameter.BoolParameter object>
-
run()
-
vst = <luigi.parameter.BoolParameter object>
-
class lib5c.contrib.luigi.tasks.QnormTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for applying quantile normalization to countsfiles.
Wraps the lib5c qnorm command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1:]: the input countsfiles
self.output(): not specified explicitly, see below
Technically this class should specify a list of outputs, one for each input countsfile. In practice, this specification of outputs is left to whatever code strings together the pipeline. The lib5c qnorm command will produce output files on disk based on the outfile_pattern and the file names of the input countsfiles.
-
averaging = <luigi.parameter.BoolParameter object>
-
condition_on = <luigi.parameter.Parameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
outfile_pattern = <luigi.parameter.Parameter object>
-
reference = <luigi.parameter.Parameter object>
-
regional = <luigi.parameter.BoolParameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.QvaluesTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for converting p-values to q-values.
Wraps the lib5c qvalues command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input countsfile of p-values
self.output(): the resulting countsfile of q-values
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
method = <luigi.parameter.Parameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.RegionalTaskMixin
Bases: object
Mixin class for Tasks that write a separate output file per region.
-
region = <luigi.parameter.Parameter object>
-
-
class lib5c.contrib.luigi.tasks.SmoothTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.FilteringTask
Task class for smoothing countsfiles.
Wraps the lib5c smooth command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input observed countsfile
self.output(): the resulting countsfile of smooth observed values
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.SplineTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for applying explicit spline bias correction to countsfiles.
Wraps the lib5c spline command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input countsfile
self.output(): the resulting spline-normalized countsfile
-
bias_factors = <luigi.parameter.ListParameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
knots = <luigi.parameter.ListParameter object>
-
model_outfile = <luigi.parameter.Parameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.SubtractTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for subtracting one countsfile from another.
Wraps the lib5c subtract command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the minuend (countsfile to subtract from)
self.input()[2]: the subtrahend (countsfile to subtract)
self.output(): the difference (countsfile resulting from the subtraction)
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
run()
-
class lib5c.contrib.luigi.tasks.ThresholdTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for thresholding p-value countsfiles to call loops.
Wraps the lib5c threshold command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1:]: the input countsfiles of p-values
self.output()[0]: the output countsfile of called loops
self.output()[1]: the output text file summarizing the loop calls
self.output()[2]: the output .csv file containing the complete analysis results
-
background_threshold = <luigi.parameter.FloatParameter object>
-
bh_fdr = <luigi.parameter.BoolParameter object>
-
concordant = <luigi.parameter.BoolParameter object>
-
conditions = <luigi.parameter.Parameter object>
-
dataset_outfile = <luigi.parameter.Parameter object>
-
distance_threshold = <luigi.parameter.IntParameter object>
-
heatmap = <luigi.parameter.BoolParameter object>
-
heatmap_outdir = <luigi.parameter.Parameter object>
-
kappa_confusion_outfile = <luigi.parameter.Parameter object>
-
run()
-
significance_threshold = <luigi.parameter.FloatParameter object>
-
size_threshold = <luigi.parameter.IntParameter object>
-
two_tail = <luigi.parameter.BoolParameter object>
-
class lib5c.contrib.luigi.tasks.VarianceTask(*args, **kwargs)
Bases: lib5c.contrib.luigi.tasks.CmdTask
Task class for computing variance estimates.
Wraps the lib5c variance command line command.
- Input/output specification:
self.input()[0]: the primer or bin .bed file
self.input()[1]: the input observed countsfile
self.input()[2]: the input expected countsfile
self.output(): the resulting countsfile of variance estimates
-
agg_fn = <luigi.parameter.Parameter object>
-
fitter = <luigi.parameter.Parameter object>
-
logx = <luigi.parameter.BoolParameter object>
-
logy = <luigi.parameter.BoolParameter object>
-
min_disp = <luigi.parameter.Parameter object>
-
min_dist = <luigi.parameter.IntParameter object>
-
min_obs = <luigi.parameter.FloatParameter object>
-
model = <luigi.parameter.Parameter object>
-
regional = <luigi.parameter.BoolParameter object>
-
source = <luigi.parameter.Parameter object>
-
x_unit = <luigi.parameter.Parameter object>
-
y_unit = <luigi.parameter.Parameter object>
-
lib5c.contrib.luigi.tasks.add_visualization_hooks(f, pvalue=False, obs_over_exp=False, tetris=False)
Decorator intended to wrap the run() method of luigi Task subclasses to automatically visualize the result of the Task class after it completes.
- Parameters
f (function) – The function to add visualization hooks to. Intended to be the run() method of luigi Task subclasses.
pvalue (bool) – Pass True to denote that the visualized heatmaps should be drawn using the p-value colorscale.
obs_over_exp (bool) – Pass True to denote that the visualized heatmaps should be drawn using the obs_over_exp colorscale.
tetris (bool) – Pass True to denote that the visualized heatmaps should be drawn as tetris heatmaps.
- Returns
The hooked function.
- Return type
function
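The general shape of such a hook can be sketched in plain Python. This is an illustration of the decorator pattern only, not lib5c's implementation: add_hooks_sketch, its draw argument, and DemoTask are hypothetical, and the "drawing" step just records a message instead of rendering a heatmap.

```python
import functools


def add_hooks_sketch(f, draw=print):
    """Wrap a run()-style method so that a drawing step follows it."""
    @functools.wraps(f)
    def hooked(self, *args, **kwargs):
        result = f(self, *args, **kwargs)
        # Only visualize when the task has opted in via its heatmap flag.
        if getattr(self, 'heatmap', False):
            draw(f'heatmap for {type(self).__name__} -> {self.heatmap_outdir}')
        return result
    return hooked


drawn = []


class DemoTask:
    """Hypothetical task; plain attributes stand in for luigi parameters."""
    heatmap = True
    heatmap_outdir = 'heatmaps'

    def run(self):
        return 'done'


# Hook the method after the fact, recording draw calls in a list.
DemoTask.run = add_hooks_sketch(DemoTask.run, draw=drawn.append)
outcome = DemoTask().run()
```

Using functools.wraps keeps the wrapped method's name and docstring, so the hooked run() still looks like the original to introspection tools.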
-
lib5c.contrib.luigi.tasks.get_all_lines(filename)
Utility function for reading all lines from a file on disk.
- Parameters
filename (str) – The file to read from.
- Returns
The contents of the file.
- Return type
str
-
lib5c.contrib.luigi.tasks.parallelize_reps(task_class, reps, **kwargs)
Parallelizes any Task class whose constructor accepts a rep kwarg across a list of reps by creating a new WrapperTask.
- Parameters
task_class (luigi.Task subclass) – The Task to parallelize.
reps (list of str) – List of reps to parallelize over.
kwargs (kwargs) – Additional kwargs to pass through to the Task class.
- Returns
A WrapperTask which simply requires the original task_class to be run for every rep in reps.
- Return type
luigi.WrapperTask subclass
-
lib5c.contrib.luigi.tasks.parallelize_reps_regions(task_class, reps, regions, **kwargs)
Parallelizes any Task class whose constructor accepts rep and region kwargs across lists of reps and regions by creating a new WrapperTask.
- Parameters
task_class (luigi.Task subclass) – The Task to parallelize.
reps (list of str) – List of reps to parallelize over.
regions (list of str) – List of regions to parallelize over.
kwargs (kwargs) – Additional kwargs to pass through to the Task class.
- Returns
A WrapperTask which simply requires the original task_class to be run for every rep in reps and every region in regions.
- Return type
luigi.WrapperTask subclass
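The expansion that such a wrapper would have to require can be sketched without luigi. The real function returns a luigi.WrapperTask subclass; this sketch only builds the list of task instances, one per (rep, region) pair. expand_reps_regions, DemoTask, and the rep and region names are hypothetical.

```python
import itertools


def expand_reps_regions(task_class, reps, regions, **kwargs):
    """Return one task_class instance per (rep, region) combination."""
    return [task_class(rep=rep, region=region, **kwargs)
            for rep, region in itertools.product(reps, regions)]


class DemoTask:
    """Hypothetical stand-in for a Task accepting rep and region kwargs."""

    def __init__(self, rep, region, **kwargs):
        self.rep = rep
        self.region = region
        self.extra = kwargs


tasks = expand_reps_regions(DemoTask, ['rep1', 'rep2'], ['Sox2', 'Klf4'],
                            bin_width=4000)
pairs = [(t.rep, t.region) for t in tasks]
```

Dropping the regions argument and iterating over reps alone gives the corresponding sketch for parallelize_reps.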
-
lib5c.contrib.luigi.tasks.visualizable(pvalue=False, obs_over_exp=False, tetris=False)
Class decorator factory for luigi Task subclasses which allows the task to automatically visualize itself after completion by adding heatmap and heatmap_outdir parameters to the Task and decorating the Task’s run() method with add_visualization_hooks().
- Parameters
pvalue (bool) – Pass True to denote that the visualized heatmaps should be drawn using the p-value colorscale.
obs_over_exp (bool) – Pass True to denote that the visualized heatmaps should be drawn using the obs_over_exp colorscale.
tetris (bool) – Pass True to denote that the visualized heatmaps should be drawn as tetris heatmaps.
- Returns
The class decorator.
- Return type
function
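A class decorator factory of this kind can be sketched as follows. This is a luigi-free illustration under stated assumptions, not the library's code: visualizable_sketch and DemoPvalueTask are hypothetical, plain class attributes stand in for the luigi parameters, and the default values shown are placeholders.

```python
import functools


def visualizable_sketch(pvalue=False, obs_over_exp=False, tetris=False):
    """Return a class decorator configured with the given drawing options."""
    def decorate(cls):
        # Placeholder defaults; plain attributes stand in for parameters.
        cls.heatmap = False
        cls.heatmap_outdir = 'heatmaps'
        original_run = cls.run

        @functools.wraps(original_run)
        def run(self, *args, **kwargs):
            result = original_run(self, *args, **kwargs)
            if self.heatmap:
                # Record the options a real drawing step would receive.
                self.visualized = {'pvalue': pvalue,
                                   'obs_over_exp': obs_over_exp,
                                   'tetris': tetris}
            return result

        cls.run = run
        return cls
    return decorate


@visualizable_sketch(pvalue=True)
class DemoPvalueTask:
    """Hypothetical task whose results would use the p-value colorscale."""

    def run(self):
        return 'done'


task = DemoPvalueTask()
task.heatmap = True
outcome = task.run()
```

The factory layer exists so that the drawing options are fixed once per class, at decoration time, rather than passed on every call to run().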