How to use wildcards in the input of a Snakemake rule
I am having a really hard time understanding what I am doing wrong in the following code. Could someone kindly help me with this issue?
Partial Snakefile:
# load config file
configfile: "config.yaml"

import sys
import os

sample = config["Sample_Name"]
trainingset = config["trainingset"]

if config["batch_analysis"]:
    sample = []
    for name in os.listdir(config["base_dir"]):
        sample.append(name)
    sample.sort()
    input_dir = expand(config["base_dir"] + "/{sample}", sample=sample)
else:
    input_dir = config["FAST5_basedir"]

out_dir = config["output_dir"]
files_needed = ["modified_bases.5mC.bed", "basecalls.fastq",
                "mappings.sorted.bam", "mappings.sorted.bam.bai"]

rule all:
    input:
        expand("{out_dir}/{sample}/reports/{sample}_{trainingset}_WGS_report.pdf",
               out_dir=config["output_dir"], sample=sample,
               trainingset=config["trainingset"])

rule megalodon:
    input:
        fast5_dir = input_dir,
        ref = config["ref_index"],
    output:
        touch(expand("{out_dir}/{sample}/megalodon/{files}",
                     files=files_needed, sample=sample,
                     out_dir=config["output_dir"]))
    threads: 16
    conda:
        workflow.basedir + "/envs/megalodon.yaml"
    params:
        guppybasecallserver = config["megalodon_guppy_basecall_server"],
        guppy_params = config["megalodon_guppy_params"],
        additional = config["megalodon_additional_commands"],
        outputs = config["megalodon_outputs"],
        liveprocess = "--live-processing" if config["live_acquisition"] else "",
        overwrite = "--overwrite" if config["megalodon_overwrite"] else "",
    priority: 100
    shell:
        """
        megalodon '{input.fast5_dir}'
        --outputs {params.outputs}
        --reference {input.ref}
        --mod-motif m CG 0
        --devices 0
        --processes {threads}
        --sort-mappings
        --guppy-server-path {params.guppybasecallserver}
        --guppy-params "{params.guppy_params}"
        --output-directory "{out_dir}/{sample}/megalodon/"
        {params.liveprocess} {params.overwrite} {params.additional}
        """

Output from the terminal for snakemake --snakefile megalodon_pipeline -c -n:

[Tue Mar 29 22:40:17 2022]
rule megalodon:
    input: /home/monib/MagDx/Testing_Folder/data/Test1, /home/monib/MagDx/Testing_Folder/data/Test2, /home/monib/MagDx/Testing_Folder/data/Test3, /home/monib/MagDx/static/hg19.mmi
    output: /home/monib/MagDx_result/Test1/megalodon/modified_bases.5mC.bed, /home/monib/MagDx_result/Test2/megalodon/modified_bases.5mC.bed, /home/monib/MagDx_result/Test3/megalodon/modified_bases.5mC.bed, /home/monib/MagDx_result/Test1/megalodon/basecalls.fastq, /home/monib/MagDx_result/Test2/megalodon/basecalls.fastq, /home/monib/MagDx_result/Test3/megalodon/basecalls.fastq, /home/monib/MagDx_result/Test1/megalodon/mappings.sorted.bam, /home/monib/MagDx_result/Test2/megalodon/mappings.sorted.bam, /home/monib/MagDx_result/Test3/megalodon/mappings.sorted.bam, /home/monib/MagDx_result/Test1/megalodon/mappings.sorted.bam.bai, /home/monib/MagDx_result/Test2/megalodon/mappings.sorted.bam.bai, /home/monib/MagDx_result/Test3/megalodon/mappings.sorted.bam.bai
    jobid: 3
    priority: 100
    threads: 12
    resources: tmpdir=/tmp

The rule above does not iterate over the samples. Instead, all of the sample directories are passed to a single job as one concatenated list, rather than base_dir and sample being treated as wildcards.
The rest of the rules come up fine; it is only this rule where I cannot get base_dir and sample in the input to behave like wildcards.
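
To illustrate why the dry run shows one job owning every file, here is a minimal plain-Python sketch. It does not import Snakemake: expand_like is a hypothetical stand-in that mimics how expand() fills every combination of values, and infer_sample and fast5_dir_for are illustrative helpers, not Snakemake functions. The paths are the ones from the dry-run output above, and the [^/]+ match for the sample name is a simplification of real wildcard matching.

```python
import itertools
import re


def expand_like(pattern, **values):
    """Hypothetical stand-in for expand(): format with every combination."""
    keys = list(values)
    pools = [v if isinstance(v, list) else [v] for v in values.values()]
    return [pattern.format(**dict(zip(keys, combo)))
            for combo in itertools.product(*pools)]


samples = ["Test1", "Test2", "Test3"]
files_needed = ["modified_bases.5mC.bed", "basecalls.fastq",
                "mappings.sorted.bam", "mappings.sorted.bam.bai"]
out_dir = "/home/monib/MagDx_result"
base_dir = "/home/monib/MagDx/Testing_Folder/data"

# As in the question: {sample} is filled in for every sample, so the rule
# declares one flat list of concrete paths and nothing is left as a wildcard.
all_outputs = expand_like("{out_dir}/{sample}/megalodon/{files}",
                          files=files_needed, sample=samples, out_dir=out_dir)

# Alternative: escape sample with double braces so it survives formatting
# as a literal "{sample}" placeholder, one pattern per output file.
per_sample_patterns = expand_like("{out_dir}/{{sample}}/megalodon/{files}",
                                  files=files_needed, out_dir=out_dir)


def infer_sample(target, pattern):
    """Simplified wildcard matching: recover the sample from a requested file."""
    prefix, suffix = pattern.split("{sample}")
    regex = re.escape(prefix) + r"(?P<sample>[^/]+)" + re.escape(suffix)
    match = re.fullmatch(regex, target)
    return match.group("sample") if match else None


def fast5_dir_for(sample):
    """Plays the role of an input function: derive the input from the wildcard."""
    return f"{base_dir}/{sample}"


target = f"{out_dir}/Test2/megalodon/basecalls.fastq"
sample_name = infer_sample(target, per_sample_patterns[1])
print(len(all_outputs), "outputs declared by a single job")
print(per_sample_patterns[0])
print(sample_name, "->", fast5_dir_for(sample_name))
```

In Snakefile terms, this suggests (as an untested sketch, not a confirmed fix) leaving sample unexpanded in the output of rule megalodon, for example by writing {{sample}} inside the expand() call, and deriving the input per job with an input function such as lambda wildcards: config["base_dir"] + "/" + wildcards.sample. The shell command would then refer to {wildcards.sample} instead of {sample}, and rule all would keep its fully expanded targets so that one megalodon job is scheduled per sample.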
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
