Assign and address doit task targets by name

I am using doit for my bioinformatics pipeline. I do not want to repeat the output file names of my tasks. Instead, I want to assign each target a name and refer to it by that name later.

Currently, I solve this with an adapter layer: I define a class whose objects implement create_doit_tasks(), which builds the task dictionary in the form that doit expects.

(Besides functions whose names start with task_, doit treats every object that has a callable create_doit_tasks attribute as a task.)

It is similar to this sketch:

class MyTask:
  def __init__(self, actions, targets=None, **task_args):
    self.actions = actions
    # Copy into a fresh dict so instances never share a mutable default.
    self.targets = dict(targets) if targets else {}
    self.task_args = task_args

    # Only instances should have an attribute with this name, not the class
    # itself. Otherwise, doit would execute the class object itself as a task.
    # Therefore, the class must not define a function named "create_doit_tasks",
    # but each instance has to. So the method is defined under a private name
    # and bound to the public name when the object is created.
    self.create_doit_tasks = self._create_doit_tasks

  def _create_doit_tasks(self):
    target_list = list(self.targets.values())
    return {
      "actions" : self.actions,
      "targets" : target_list,
      **self.task_args
    }

Then, I can create tasks that depend on each other like this:

step1 = MyTask(
  actions = ["echo 'Hello world' > file1.txt"],
  targets = dict(
    output = "file1.txt")
)

step2 = MyTask(
  actions = [f"cat '{step1.targets['output']}'"]
)
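
Reusing the name keeps the file name in one place, but on its own it does not declare that step2 reads the file. A minimal sketch, assuming doit's regular file_dep task key is forwarded through **task_args (the MyTask class is repeated from above only so that the snippet runs on its own; step2_dict is a hypothetical name used for inspection):

```python
class MyTask:
    """Adapter from the sketch above, repeated so this snippet is self-contained."""

    def __init__(self, actions, targets=None, **task_args):
        self.actions = actions
        self.targets = dict(targets or {})
        self.task_args = task_args
        # Bound on the instance only, as in the original sketch.
        self.create_doit_tasks = self._create_doit_tasks

    def _create_doit_tasks(self):
        return {
            "actions": self.actions,
            "targets": list(self.targets.values()),
            **self.task_args,
        }


step1 = MyTask(
    actions=["echo 'Hello world' > file1.txt"],
    targets=dict(output="file1.txt"),
)

step2 = MyTask(
    actions=[f"cat '{step1.targets['output']}'"],
    # file_dep ends up in task_args and is passed to doit unchanged.
    file_dep=[step1.targets["output"]],
)

# Inspect the dictionary that doit would receive for step2.
step2_dict = step2.create_doit_tasks()
```

With file_dep pointing at a path that is a target of step1, doit can work out that step1 has to run before step2.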

Now the output file name of step 1 is not repeated in step 2. But this seems somewhat redundant to me: I create objects that create dictionaries, which doit then parses into objects (doit.Task) again. Is there a simpler way?
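
One possible simplification, offered as a sketch rather than an established doit idiom: keep the named targets in a plain module-level dictionary and write ordinary task_ functions that look the names up. The registry name TARGETS and its key step1_output are hypothetical; actions, targets and file_dep are doit's regular task keys.

```python
# dodo.py (sketch)

# Hypothetical registry: logical name -> file path, defined exactly once.
TARGETS = {
    "step1_output": "file1.txt",
}


def task_step1():
    """Write the greeting to the file registered as step1_output."""
    out = TARGETS["step1_output"]
    return {
        "actions": [f"echo 'Hello world' > {out}"],
        "targets": [out],
    }


def task_step2():
    """Print the file produced by step1."""
    src = TARGETS["step1_output"]
    return {
        "actions": [f"cat '{src}'"],
        # Declares the input file, which is also step1's target.
        "file_dep": [src],
    }
```

This drops the adapter objects, at the cost of a shared registry that every task function has to agree on.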



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
