Define SAMPLES for different directory names and sample names in Snakemake

I have written a Snakemake workflow to run bwa_map. The paired-end FASTQ files sit in folders whose names differ from the sample names. Running the workflow fails with the error that 'SAMPLES' is not defined. Please help.

Error:

$ snakemake --snakefile rnaseq.smk mapped_reads/EZ-123-B_IGO_08138_J_2_S101_R2_001.bam -np

NameError in line 2 of /Users/singhh5/Desktop/tutorial/rnaseq.smk:
name 'SAMPLES' is not defined
  File "/Users/singhh5/Desktop/tutorial/rnaseq.smk", line 2, in <module>

 #SAMPLE DIRECTORY
 
 fastq
   Sample_EZ-123-B_IGO_08138_J_2
          EZ-123-B_IGO_08138_J_2_S101_R1_001.fastq.gz
          EZ-123-B_IGO_08138_J_2_S101_R2_001.fastq.gz
 
   Sample_EZ-123-B_IGO_08138_J_4
          EZ-124-B_IGO_08138_J_4_S29_R1_001.fastq.gz
          EZ-124-B_IGO_08138_J_4_S29_R2_001.fastq.gz

#My Code
expand("~/Desktop/{sample}/{rep}.fastq.gz", sample=SAMPLES)

rule bwa_map:
        input:
                "data/genome.fa",
                "fastq/{sample}/{rep}.fastq"
        conda:
                "env.yaml"
        output:
                "mapped_reads/{rep}.bam"
        threads: 8
        shell:
                "bwa mem {input} | samtools view -Sb -> {output}"


Solution 1:[1]

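The specific error occurs because the variable SAMPLES is never assigned before it is used in expand on line 2 of the Snakefile. A Snakefile is Python, so the name has to be defined above that line, for example as a list of sample names.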
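One way to define SAMPLES is to build it from the directory layout shown in the question. The following is a minimal sketch in plain Python, which is also valid at the top of a Snakefile. The helper name find_samples is hypothetical, and the sketch assumes every sample folder holds exactly one read pair whose R1 file ends in _R1_001.fastq.gz.

```python
from pathlib import Path

# Assumed naming convention, taken from the listing in the question.
R1_SUFFIX = "_R1_001.fastq.gz"


def find_samples(fastq_dir="fastq"):
    """Map each sample directory name to the read-name prefix inside it.

    Hypothetical helper: assumes one R1/R2 pair per sample directory.
    If a directory holds several pairs, the last one found wins.
    """
    samples = {}
    for r1 in sorted(Path(fastq_dir).glob("*/*" + R1_SUFFIX)):
        prefix = r1.name[: -len(R1_SUFFIX)]  # strip the R1 suffix
        samples[r1.parent.name] = prefix
    return samples


# Defined before any rule or expand() call uses it.
SAMPLES = find_samples("fastq")
```

For the two folders in the question this would give a mapping from Sample_EZ-123-B_IGO_08138_J_2 to EZ-123-B_IGO_08138_J_2_S101 and from Sample_EZ-123-B_IGO_08138_J_4 to EZ-124-B_IGO_08138_J_4_S29. Snakemake's own glob_wildcards function can do a similar job inside a Snakefile.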

Some other issues you may run into:

  • The output path is missing the {sample} wildcard, so Snakemake cannot work out from a requested BAM file which input directory to read.
  • The value of threads is not passed to bwa or samtools, so both tools still run with their default thread settings.
  • The expand call should go in the input directive of the first rule in your Snakefile, typically called all, so that the files produced by bwa_map are actually requested.
  • The R1 and R2 reads are not paired in the bwa command.
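Taken together, the points above could look roughly like the Snakefile below. This is a sketch under stated assumptions, not a tested pipeline: SAMPLES is hard-coded from the listing in the question, the output layout mapped_reads/{sample}/{rep}.bam is my own choice, and data/genome.fa is assumed to be indexed with bwa index already.

```snakemake
# Assumption: sample directory -> read-name prefix, copied from the question's listing.
SAMPLES = {
    "Sample_EZ-123-B_IGO_08138_J_2": "EZ-123-B_IGO_08138_J_2_S101",
    "Sample_EZ-123-B_IGO_08138_J_4": "EZ-124-B_IGO_08138_J_4_S29",
}

# First rule: requests one BAM per sample.
rule all:
    input:
        # zip pairs each directory with its own prefix instead of
        # building every directory/prefix combination
        expand("mapped_reads/{sample}/{rep}.bam", zip,
               sample=list(SAMPLES), rep=list(SAMPLES.values()))

rule bwa_map:
    input:
        ref="data/genome.fa",
        r1="fastq/{sample}/{rep}_R1_001.fastq.gz",
        r2="fastq/{sample}/{rep}_R2_001.fastq.gz"
    output:
        # both input wildcards appear in the output path
        "mapped_reads/{sample}/{rep}.bam"
    conda:
        "env.yaml"
    threads: 8
    shell:
        # -t passes the thread count to bwa; R1 and R2 are given as a pair
        "bwa mem -t {threads} {input.ref} {input.r1} {input.r2} | "
        "samtools view -b -o {output} -"
```

With this layout a dry run needs no explicit target, for example snakemake --snakefile rnaseq.smk -np, because rule all requests the BAM files.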

Looking through Stack Overflow or GitHub projects for similar bwa mapping rules should give you working examples to adapt.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Troy Comi