Skip list of filenames from txt file with os.walk

I would like to upload files that users dump into a shared folder to an FTP site. Only files whose names match a certain pattern should be uploaded, and that part works. I would also like to avoid uploading files that have already been uploaded. A simple solution would be to move the files to a subdirectory once uploaded, but the users want the files to remain where they are.

I was thinking of writing each filename to a text file as soon as the loop has uploaded it. Populating the text file works.

Excluding directories with os.walk is covered in many articles, and I can get that to work fine, but excluding a list of filenames seems to be a bit more obscure.

This is what I have so far:

import ftplib
import os
import os.path
import fnmatch

## set local path variables
dir = 'c:/Temp'
hist_path = 'C:/Temp/hist.txt'

pattern = '*SomePattern*'

## make the ftp connection and set appropriate working directory
ftp = ftplib.FTP('ftp.someserver.com')
ftp.login('someuser', 'somepassword')
ftp.cwd('somedirectory')

## make a list of previously uploaded files
with open(hist_path, 'r') as hist_list:
    hist_content = hist_list.read()
# print(hist_content)

## loop through the files and upload those that match the pattern
for root, dirs, files in os.walk(dir):
    for fname in fnmatch.filter(files, pattern): # this filters for filenames that include the pattern

        ## upload each file to the ftp (root already holds the full path, so no chdir is needed)
        full_fname = os.path.join(root, fname)
        with open(full_fname, 'rb') as upload_file:
            ftp.storbinary('STOR ' + fname, upload_file)

        ## add an entry for each file into the historical uploads log
        with open(hist_path, 'a') as f:
            f.write(fname + '\n')

Any help would be appreciated
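
One possible way to do the skipping, sketched under a few assumptions: the history file holds one bare filename per line (as the loop above writes it), and comparing bare filenames is good enough, meaning two files with the same name in different subdirectories count as the same file. The helper names load_history and find_new_files are made up for this illustration and are not part of os, fnmatch, or ftplib.

```python
import fnmatch
import os


def load_history(hist_path):
    """Return the set of filenames already logged (hypothetical helper).

    Returns an empty set if the history file does not exist yet.
    """
    if not os.path.exists(hist_path):
        return set()
    with open(hist_path, 'r') as hist_file:
        # One filename per line; blank lines are ignored.
        return {line.strip() for line in hist_file if line.strip()}


def find_new_files(top_dir, pattern, uploaded):
    """Yield (full_path, filename) for matching files not in `uploaded`.

    Hypothetical helper: walks top_dir, keeps names matching `pattern`,
    and skips any name already present in the `uploaded` set.
    """
    for root, dirs, files in os.walk(top_dir):
        for fname in fnmatch.filter(files, pattern):
            if fname in uploaded:
                continue  # logged by an earlier run, so skip it
            yield os.path.join(root, fname), fname
```

In the upload loop this would replace the os.walk/fnmatch lines: build the set once with load_history(hist_path), iterate over find_new_files(dir, pattern, uploaded), and after each successful ftp.storbinary call append fname to the history file as before. A set is used rather than the raw file contents so that a name is only skipped on an exact match, not when it merely appears as part of a longer logged name.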



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
