Way to force Snakemake to re-evaluate the DAG before a checkpoint with --list-input-changes?

I wonder if anyone has ideas for a small problem I am having with checkpoints. I am trying to build a workflow that is robust to changes in the sample list, so that the necessary rules re-run if a sample is removed. However, if I remove some samples after a successful run, re-running with -R $(snakemake --list-input-changes) does not detect the input changes in rules that completed before the most recent DAG re-evaluation after a checkpoint. Does anyone know how to force Snakemake to also check for input changes in rules that run before a checkpoint output is produced? A small example of a use case might be:

import pandas as pd

# index the reference to get a list of all chromosomes
checkpoint index_ref:
    input:
        "reference.fa"
    output:
        "reference.fa.fai"
    shell:
        "samtools faidx {input}"

# input function: read the chromosome names (first column) from the checkpoint's .fai output
def get_contigs(wildcards):
    with checkpoints.index_ref.get().output[0].open() as index:
        return pd.read_table(index, header=None, usecols=[0]).squeeze("columns")

# calculate depth per chromosome for a set of samples
rule samtools_depth:
    input:
        expand("{sample}.bam", sample=sample_list)
    output:
        "dataset.chr{chrom}.depth"
    shell:
        "samtools depth -r {wildcards.chrom} {input} > {output}"

# combine depths into single output file
rule combine_depths:
    input:
        lambda wildcards: expand("dataset.chr{chrom}.depth", chrom=get_contigs(wildcards))
    output:
        "dataset.genome.depth"
    shell:
        "cat {input} > {output}"

rule all:
    input:
        "dataset.genome.depth"

In this case, if the workflow runs successfully and a sample is then removed from sample_list, snakemake -R $(snakemake --list-input-changes) reports that the workflow is already complete. If dataset.genome.depth is deleted, the same command reports that both samtools_depth and combine_depths need to be re-run because the input files of samtools_depth have changed. That is the behaviour I would prefer even when the downstream outputs already exist.
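
To make the desired check concrete, here is a minimal plain-Python sketch of the comparison I have in mind. It is only an illustration and not how Snakemake implements --list-input-changes; the function names, the sample names A, B and C, and the two contig files are hypothetical.

```python
# Illustration only: NOT Snakemake's implementation of --list-input-changes.
# All names and sample/contig values below are hypothetical.

def expected_inputs(sample_list):
    """Mirror expand("{sample}.bam", sample=sample_list) from samtools_depth."""
    return {f"{sample}.bam" for sample in sample_list}


def rules_with_input_changes(recorded, current):
    """Return the names of rules whose input set differs between two runs."""
    return sorted(
        rule for rule in current
        if recorded.get(rule, set()) != current[rule]
    )


# Inputs as they were when the workflow last completed (three samples).
recorded = {
    "samtools_depth": expected_inputs(["A", "B", "C"]),
    "combine_depths": {"dataset.chr1.depth", "dataset.chr2.depth"},
}

# Inputs after sample "C" is removed from sample_list.
current = {
    "samtools_depth": expected_inputs(["A", "B"]),
    "combine_depths": {"dataset.chr1.depth", "dataset.chr2.depth"},
}

changed = rules_with_input_changes(recorded, current)
print(changed)  # ['samtools_depth']
```

Only samtools_depth is flagged directly, since the input file names of combine_depths are unchanged. combine_depths would then need to re-run simply because it is downstream of samtools_depth.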



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
