'Assign and address doit task targets by a name
I am using doit for my bioinformatics pipeline. I do not want to repeat myself considering the output names of my tasks. Rather, I want to assign each target a name and then later refer to it.
Currently, I am solving this with a adapter layer, by defining an object that implements the create_doit_tasks(), that then creates a dictionary in the dictionary form that
doit expects.
(Besides functions starting with task_, doit considers each object that has the create_doit_tasks callable a task.)
It is similar to this sketch:
class MyTask:
def __init__(self, actions, targets = {}, **task_args):
self.actions = actions
self.targets = targets
self.task_args = task_args
# Only the generated object shall have an attribute of this name, not this class
# itself. Else, doit would execute this "class-object" itself as a task.
# Therefore, the class must not have the function named "create_doit_tasks"
# but the instantiated object has to. So, rename the function on object
# creation
self.create_doit_tasks = self._create_doit_tasks
def _create_doit_tasks(self):
target_list = list(self.targets.values())
return {
"actions" : self.actions,
"targets" : target_list,
**self.task_args
}
Then, I can create tasks that depend on each other like this:
step1 = MyTask(
actions = ["echo 'Hello world' > file1.txt"],
targets = dict(
output = "file1.txt")
)
step2 = MyTask(
actions = [f"cat '{step1.targets['output']}'"]
)
Now the file name of the output of step 1 is not repeated in step 2. But this seems somehow redundant to me, as I create objects that then create a dictionary that are then parsed into objects (doit.Task) again by doit. Is there a simpler way?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
