Getting Snakemake to work nicely with datetime-based filenames
I'm working with a large dataset where each file (stored online) has a URL built from a datetime, something like this:
import datetime

def fname(dt: datetime.datetime):
    return "dataset-" + dt.strftime("%Y%m%d-%H%M%S") + ".gz"

def url(dt: datetime.datetime):
    return "url.com/" + fname(dt)
In my Snakefile I have a list of datetimes that I want to work with. For each I need to download the corresponding file from the internet (using wget or similar).
rule download:
    output: # would like to specify `fname(dt)`
    shell: # would like to `wget -O url(dt)`
What's the right way to do something like this?
Solution 1:[1]
I think the most straightforward way is to create a metadata table in which each row holds a file name and its download URL, and then have rule download look up the URL by file name.
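A minimal sketch of such a lookup table, assuming the table can simply be a dict built from the question's `fname` and `url` helpers. The datetimes in `dts` and the names `url_by_file` and `lookup_url` are made up for illustration:

```python
import datetime
from types import SimpleNamespace


def fname(dt: datetime.datetime) -> str:
    return "dataset-" + dt.strftime("%Y%m%d-%H%M%S") + ".gz"


def url(dt: datetime.datetime) -> str:
    return "url.com/" + fname(dt)


# Hypothetical example datetimes; replace with the real list.
dts = [
    datetime.datetime(2020, 1, 1, 0, 0, 0),
    datetime.datetime(2020, 1, 1, 6, 30, 0),
]

# Lookup table: file name -> download URL.
url_by_file = {fname(dt): url(dt) for dt in dts}


def lookup_url(wildcards):
    """Return the URL for the file named by the `file` wildcard."""
    return url_by_file[wildcards.file]


# Stand-in for the wildcards object that Snakemake would pass in.
example = SimpleNamespace(file="dataset-20200101-063000.gz")
print(lookup_url(example))
```

In a Snakefile, a function like this could be given to the rule as `params: url=lookup_url` and used in the shell command as `{params.url}`, since Snakemake calls params functions with the rule's wildcards. This only pays off when the URL cannot be derived directly from the file name.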
Or you can specify all files in rule all:
rule all:
    input:
        [fname(dt) for dt in dts]
Then rule download:
rule download:
    output: "{file}"
    shell: "wget -O {output} url.com/{output}"
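One caveat: a bare `{file}` wildcard matches any path, so this rule can also be picked for unrelated targets. Here is a rough sketch of a regular expression that would restrict it, assuming every file name follows the `fname` pattern from the question. `FILE_PATTERN` and `is_dataset_file` are hypothetical names:

```python
import re

# Matches names such as "dataset-20200101-063000.gz":
# 8 digits for the date, 6 digits for the time.
FILE_PATTERN = r"dataset-\d{8}-\d{6}\.gz"


def is_dataset_file(name: str) -> bool:
    """Check whether the whole name follows the dataset naming scheme."""
    return re.fullmatch(FILE_PATTERN, name) is not None


print(is_dataset_file("dataset-20200101-063000.gz"))
print(is_dataset_file("results/summary.txt"))
```

The same pattern could be handed to Snakemake's `wildcard_constraints` directive (for example `file=FILE_PATTERN`), so that rule download only applies to dataset files.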
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jianyu |
