Snakemake: first genotype of a VCF file as wildcard in output
In the second rule, I would like to select from the VCF file (which contains bob, clara and tim) only the first genotype listed in the dictionary (i.e. bob), in order to get bob.dn.vcf as the output of the second rule. Is this possible in Snakemake?
d = {"FAM1": ["bob.bam", "clara.bam", "tim.bam"]}
FAMILIES = list(d)

rule all:
    input:
        expand some outputs

wildcard_constraints:
    family = "|".join(FAMILIES)

rule somerulename:
    input:
        lambda w: d[w.family]
    output:
        vcf="{family}/{family}.vcf"
    shell:
        """
        some python command line which produces a single vcf file with bob, clara and tim
        """

rule somerulename:
    input:
        invcf="{family}/{family}.vcf"
    params:
        ref="someref.fasta"
    output:
        out="{family}/{bob}.dn.vcf"
    shell:
        """
        gatk --java-options "-Xms2G -Xmx2g -XX:ParallelGCThreads=2" SelectVariants -R {params.ref} -V {input.invcf} -O {output.out}
        """
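
As a minimal sketch of one way the first sample could be derived from the dictionary, the plain-Python helpers below compute the sample name and the resulting target paths. The helper names `first_sample` and `dn_vcf_targets` are hypothetical. The sketch also assumes that the sample name inside the VCF equals the BAM file name without its extension, which may not hold if the read-group sample tag differs.

```python
from pathlib import PurePosixPath

# Same dictionary as in the question: family -> ordered list of BAM files.
d = {"FAM1": ["bob.bam", "clara.bam", "tim.bam"]}


def first_sample(family, families=d):
    """Return the sample name of the first BAM listed for a family.

    Assumption: the sample name is the BAM file name without ".bam".
    """
    first_bam = families[family][0]
    return PurePosixPath(first_bam).stem


def dn_vcf_targets(families=d):
    """Build one '{family}/{sample}.dn.vcf' path per family (hypothetical helper)."""
    return [
        f"{family}/{first_sample(family, families)}.dn.vcf"
        for family in families
    ]


print(dn_vcf_targets())  # ['FAM1/bob.dn.vcf']
```

In a Snakefile, `rule all` could then request `dn_vcf_targets()` as input, the second rule's output could be written with an ordinary wildcard such as `"{family}/{sample}.dn.vcf"`, and the sample could presumably be passed to `SelectVariants` through its `--sample-name` option as `{wildcards.sample}`. This is only an outline of the idea, not a tested workflow.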
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
