Handle environmental variables in config options

I have a snakemake command line with configuration options like this:

snakemake --config \
    f1=$PWD/file1.txt \
    f2=$PWD/file2.txt \
    f3=/path/to/file3.txt \
    ...more key-value pairs \
    --directory /path/to/output/dir

file1.txt and file2.txt are expected to be in the same directory as the Snakefile; file3.txt is somewhere else. I need the file paths to be absolute, hence the $PWD variable, so that Snakemake can still find the files after moving to /path/to/output/dir.

Because I am accumulating quite a few configuration options, I would like to move all the --config items to a separate yaml configuration file. The problem is: how do I carry the $PWD variable over into a configuration file?

I could have a dummy string in the yaml file indicating that that string is to be replaced by the directory where the Snakefile is (e.g. f1: <replace me>/file1.txt) but I feel it's awkward. Any better ideas? It may be that I should rethink how the files fileX.txt are passed to snakemake...
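For reference, the placeholder idea could be sketched roughly as below. This is only an illustration under assumptions: it uses $PWD itself as the placeholder and expands it with os.path.expandvars from the Python standard library, and the helper name expand_env_in_config is hypothetical, not part of Snakemake.

```python
import os


def expand_env_in_config(config):
    """Return a copy of config with $VAR / ${VAR} expanded in string values.

    Hypothetical helper for illustration; non-string values pass through.
    """
    return {
        key: os.path.expandvars(value) if isinstance(value, str) else value
        for key, value in config.items()
    }


# Values as they might look after loading a yaml config file.
raw_config = {"f1": "$PWD/file1.txt", "f3": "/path/to/file3.txt", "threads": 4}
expanded = expand_env_in_config(raw_config)
```

Note that os.path.expandvars leaves references to unset variables unchanged, so a missing $PWD would survive as literal text rather than raise an error.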



Solution 1:[1]

You can access the directory the Snakefile lives in with workflow.basedir. You might be able to get away with specifying relative paths in the config file and then building the absolute paths in your Snakefile, e.g.:

import pathlib
# workflow.basedir is the directory that contains the Snakefile
file1 = pathlib.Path(workflow.basedir) / config["f1"]
file2 = pathlib.Path(workflow.basedir) / config["f2"]
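A small sketch of how this joining behaves outside Snakemake, assuming a plain dict in place of config and a made-up base directory in place of workflow.basedir (the helper name resolve_paths is hypothetical). It also shows that an absolute entry such as f3 comes through unchanged, because pathlib discards the left-hand side when the right-hand side is absolute.

```python
import pathlib


def resolve_paths(config, basedir, keys):
    """Resolve the given config entries against basedir (hypothetical helper)."""
    base = pathlib.Path(basedir)
    resolved = dict(config)
    for key in keys:
        # Relative entries are anchored at base; absolute entries win as-is.
        resolved[key] = base / config[key]
    return resolved


# Stand-ins for Snakemake's config and workflow.basedir.
example_config = {"f1": "file1.txt", "f3": "/path/to/file3.txt"}
resolved = resolve_paths(example_config, "/home/user/workflow", ["f1", "f3"])
```

With this approach the yaml file can mix relative and absolute paths without any special marker.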

Solution 2:[2]

One option is to use an external module, intake, to handle the environment variable integration. There is a similar answer, but a more specific example for this question is as follows.

The yaml file follows the syntax expected by intake: a field called sources containing nested entries, each of which specifies at the very least a (possibly local) url at which the file can be accessed:

# config.yml
sources:
  file1:
    args:
      url: "{{env(PWD)}}/file1.txt"
  file2:
    args:
      url: "{{env(PWD)}}/file2.txt"

Inside the Snakefile, the relevant code would be:

import intake
cat = intake.open_catalog('config.yml')
f1 = cat['file1'].urlpath
f2 = cat['file2'].urlpath

Note that for less verbose yaml files, intake provides syntax for parameterization, see the docs or this example.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    KeyboardCat
Solution 2    SultanOrazbayev