Merge PDFs from different subfolders with the same name

In the MAIN folder there are about 8 subfolders, each containing PDF files, and some PDFs share a name across subfolders. For example, folder 1 contains SamplePDF.pdf and folder 5 contains another PDF that is also named SamplePDF.pdf. How can I merge these two same-named PDFs into a single file, also named SamplePDF.pdf, in the OUTPUT folder?

PDFs whose name is unique should simply be copied to the OUTPUT folder. This is my attempt so far:

from pathlib import Path
import os
import shutil  # needed for shutil.rmtree below

def list_files(directory):
    # Collect the full path of every file below directory, recursively.
    r = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            r.append(os.path.join(root, name))
    return r

BASE_DIR = Path.cwd()
MAIN_DIR = BASE_DIR / 'MAIN'
OUTPUT_DIR = BASE_DIR / 'OUTPUT'

# Remove any previous OUTPUT folder; ignore the error if it does not exist.
shutil.rmtree(OUTPUT_DIR, ignore_errors=True)
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

mylist = list_files(MAIN_DIR)
print(mylist)

I have an idea of how to merge PDF files that are stored in a Python list. But how can I create a list of all the files that share the same name?

Looping through the list

for f in mylist:
    print(f)

I got this as output:

C:\Users\Future\Desktop\MAIN\1\1.pdf
C:\Users\Future\Desktop\MAIN\1\2.pdf
C:\Users\Future\Desktop\MAIN\1\3.pdf
C:\Users\Future\Desktop\MAIN\1\4.pdf
C:\Users\Future\Desktop\MAIN\1\5.pdf
C:\Users\Future\Desktop\MAIN\1\6.pdf
C:\Users\Future\Desktop\MAIN\2\1.pdf
C:\Users\Future\Desktop\MAIN\2\2.pdf
C:\Users\Future\Desktop\MAIN\2\3.pdf
C:\Users\Future\Desktop\MAIN\2\4.pdf
C:\Users\Future\Desktop\MAIN\2\5.pdf
C:\Users\Future\Desktop\MAIN\2\6.pdf
C:\Users\Future\Desktop\MAIN\3\1.pdf
C:\Users\Future\Desktop\MAIN\3\2.pdf
C:\Users\Future\Desktop\MAIN\3\3.pdf
C:\Users\Future\Desktop\MAIN\3\4.pdf
C:\Users\Future\Desktop\MAIN\3\5.pdf
C:\Users\Future\Desktop\MAIN\3\6.pdf
C:\Users\Future\Desktop\MAIN\4\1.pdf
C:\Users\Future\Desktop\MAIN\4\2.pdf
C:\Users\Future\Desktop\MAIN\4\3.pdf
C:\Users\Future\Desktop\MAIN\4\4.pdf
C:\Users\Future\Desktop\MAIN\4\5.pdf
C:\Users\Future\Desktop\MAIN\4\6.pdf
C:\Users\Future\Desktop\MAIN\5\1.pdf
C:\Users\Future\Desktop\MAIN\5\2.pdf
C:\Users\Future\Desktop\MAIN\5\3.pdf
C:\Users\Future\Desktop\MAIN\5\4.pdf
C:\Users\Future\Desktop\MAIN\5\5.pdf
C:\Users\Future\Desktop\MAIN\5\6.pdf
C:\Users\Future\Desktop\MAIN\6\1.pdf
C:\Users\Future\Desktop\MAIN\6\2.pdf
C:\Users\Future\Desktop\MAIN\6\3.pdf
C:\Users\Future\Desktop\MAIN\6\4.pdf
C:\Users\Future\Desktop\MAIN\6\5.pdf
C:\Users\Future\Desktop\MAIN\6\6.pdf
C:\Users\Future\Desktop\MAIN\7\1.pdf
C:\Users\Future\Desktop\MAIN\7\2.pdf
C:\Users\Future\Desktop\MAIN\7\3.pdf
C:\Users\Future\Desktop\MAIN\7\4.pdf
C:\Users\Future\Desktop\MAIN\7\5.pdf
C:\Users\Future\Desktop\MAIN\7\6.pdf
C:\Users\Future\Desktop\MAIN\8\1.pdf
C:\Users\Future\Desktop\MAIN\8\2.pdf
C:\Users\Future\Desktop\MAIN\8\3.pdf
C:\Users\Future\Desktop\MAIN\8\4.pdf
C:\Users\Future\Desktop\MAIN\8\5.pdf
C:\Users\Future\Desktop\MAIN\8\6.pdf

I need an idea of how to build a Python list for each group of files that share the same PDF name.



Solution 1:[1]

1- Create a function to detect duplicate file names:

import os

def duplicates(files):
    # Compare file names only: the full paths in mylist are always unique.
    names = [os.path.basename(f) for f in files]
    store = []
    for i in range(len(names)):
        # Skip names that were already recorded as duplicates.
        if names[i] in store:
            continue
        for j in range(i + 1, len(names)):
            if names[j] == names[i]:
                store.append(names[i])
                break
    return store

2- Pass mylist to it:

mylist = list_files(MAIN_DIR)
duplicated = duplicates(mylist)
print(mylist, duplicated)
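Knowing which names are duplicated is only half of the task; the files still have to be grouped, merged, and copied. One possible way to do that is sketched below. It is not part of the original answer. The helper names group_by_name and merge_or_copy are hypothetical, and the merge step assumes the third-party pypdf package (version 3 or later) is installed. Files are merged in sorted path order, which is an assumption about the desired page order.

```python
import shutil
from collections import defaultdict
from pathlib import Path


def group_by_name(main_dir):
    """Map each PDF file name to the sorted list of paths that share it."""
    groups = defaultdict(list)
    # rglob walks all subfolders; sorting makes the merge order predictable.
    for path in sorted(Path(main_dir).rglob("*.pdf")):
        groups[path.name].append(path)
    return dict(groups)


def merge_or_copy(groups, output_dir):
    """Write one PDF per name: copy unique files, merge same-named ones."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for name, paths in groups.items():
        target = output_dir / name
        if len(paths) == 1:
            # The name occurs once, so a plain copy is enough.
            shutil.copy2(paths[0], target)
            continue
        # Assumption: pypdf is installed. It is imported lazily so that
        # grouping and copying still work without it.
        from pypdf import PdfWriter

        writer = PdfWriter()
        for path in paths:
            writer.append(str(path))  # append every page of this file
        with open(target, "wb") as fh:
            writer.write(fh)
        writer.close()
```

With the folder layout from the question, this could be called as merge_or_copy(group_by_name(MAIN_DIR), OUTPUT_DIR). Note that sorting is lexicographic, so a subfolder named 10 would come before one named 2.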

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Milad Elyasi