Snakemake MissingInputException
We have always run our Snakemake pipelines through Amazon S3.
snakemake --default-remote-provider S3 --default-remote-prefix '$s3' --use-conda
However, we need to run our Snakemake pipeline locally for a new experiment.
snakemake --use-conda
The pipeline works great when running with --default-remote-provider S3 --default-remote-prefix '$s3'; however, when we try to run locally we get:
Building DAG of jobs...
MissingInputException in line 226 of /usr/local/eclipse/snakemake_eclip/rules/rep_element.smk:
Missing input files for rule compress_parsed:
output: rep_element_pipeline/IN1_BET_S35_R1_001.combined_w_uniquemap.rmDup.sam.parsed.gz
wildcards: sample=IN1_BET_S35_R1_001
affected files:
rep_element_pipeline/IN1_BET_S35_R1_001.combined_w_uniquemap.rmDup.sam.parsed
ERROR conda.cli.main_run:execute(41): `conda run snakemake --use-conda --cores 36` failed. (See above for error)
Below are the two rules that are causing the error. I'm fairly certain it has something to do with the local wrapper temp(local('{full_path}' + 'rep_element_pipeline/{sample}.combined_w_uniquemap.rmDup.sam.parsed')).
{full_path} is an absolute path where the files at the beginning of the pipeline are located.
# Create final rep element parsed file
rule merge_parsed:
    input:
        'rep_element_pipeline/AA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/AC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/AG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/AN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/AT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/CA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/CC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/CG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/CN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/CT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/GA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/GC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/GG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/GN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/GT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/NA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/NC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/NG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/NN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/NT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/TA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/TC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/TG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/TN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed',
        'rep_element_pipeline/TT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp.combined_w_uniquemap.rmDup.sam.parsed'
    output:
        temp(local('{full_path}' + 'rep_element_pipeline/{sample}.combined_w_uniquemap.rmDup.sam.parsed'))
    conda:
        '../envs/rep_element.yaml'
    params:
        fp=full_path
    shell:
        'perl ../scripts/merge_multiple_parsed_files.pl {output[0]} {input}'
# Compress sam.parsed
rule compress_parsed:
    input:
        'rep_element_pipeline/{sample}.combined_w_uniquemap.rmDup.sam.parsed'
    output:
        temp('rep_element_pipeline/{sample}.combined_w_uniquemap.rmDup.sam.parsed.gz')
    params:
        fp=full_path
    conda:
        '../envs/standard_eclip.yaml'
    shell:
        'pigz -c {input[0]} > {params.fp}rep_element_pipeline/{wildcards.sample}.combined_w_uniquemap.rmDup.sam.parsed.gz'
Solution 1:[1]
The output of rule merge_parsed is:
temp(local('{full_path}' + 'rep_element_pipeline/{sample}.combined_w_uniquemap.rmDup.sam.parsed'))
but the input of compress_parsed is:
'rep_element_pipeline/{sample}.combined_w_uniquemap.rmDup.sam.parsed'
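The string {full_path} is in the first but not the second, so the file that compress_parsed asks for is not the file that merge_parsed declares it will create. Isn't this the issue, or am I missing something?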
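To make the mismatch concrete, here is a minimal plain-Python sketch (not Snakemake itself). It assumes the prefix resolves to an absolute directory; the value /data/project/ is made up for illustration, and the real one comes from your pipeline. As far as I understand, Snakemake matches requested inputs against output patterns as path strings, so two different strings are treated as two different files.

```python
# Hypothetical values, for illustration only.
full_path = "/data/project/"
sample = "IN1_BET_S35_R1_001"

pattern = "rep_element_pipeline/{sample}.combined_w_uniquemap.rmDup.sam.parsed"

# Path that merge_parsed declares as its output (prefixed with full_path).
produced = (full_path + pattern).format(sample=sample)

# Path that compress_parsed requests as its input (no prefix).
requested = pattern.format(sample=sample)

print(produced)
print(requested)
print(produced == requested)  # False: the two rules name different files
```

Using the same path expression in the output of merge_parsed and in the input of compress_parsed (either both with the prefix or both without it) should let the two rules connect.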
Besides, I would avoid concatenating file paths with +. It is better to use os.path.join, which takes care of adding the correct separator.
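A small sketch of the difference, assuming a hypothetical prefix /data/project written without a trailing slash (the directory name is invented for the example):

```python
import os.path

# Hypothetical prefix without a trailing separator.
full_path = "/data/project"
file_name = "{sample}.combined_w_uniquemap.rmDup.sam.parsed"

# Plain concatenation silently glues the two directory names together.
concatenated = full_path + "rep_element_pipeline/" + file_name

# os.path.join inserts the separator where one is missing.
joined = os.path.join(full_path, "rep_element_pipeline", file_name)

# A trailing separator on the prefix does not produce a doubled one.
joined_trailing = os.path.join(full_path + "/", "rep_element_pipeline", file_name)

print(concatenated)
print(joined)
print(joined_trailing)
```

The {sample} wildcard is left untouched by os.path.join, so the resulting string can still be used as a Snakemake input or output pattern.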
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | dariober |
