'Bash : Find and Remove duplicate files from different folders
I have two folders with some common files, I want to delete duplicate files from xyz folder.
folder1:
/abc/file1.csv
/abc/file2.csv
/abc/file3.csv
/abc/file4.csv
folder2:
/xyz/file1.csv
/xyz/file5.csv
I want to compare both folders and remove duplicate from /xyz folder. Output should be: file5.csv
For now I am using :
find "/xyz" "/abc" "/abc" -printf '%P\n' | sort | uniq -u | -exec rm {} \;
But it failing with reason : if -exec is not a typo you can run the following command to lookup the package that contains the binary:
command-not-found -exec
-bash: -exec: command not found
Solution 1:[1]
-exec is an option to find, you've already exited the command find when you started the pipes.
Try xargs instead, it take all the data from stdin and appends to the program.
UNTESTED
find "/xyz" "/abc" "/abc" -printf '%P\n' | sort | uniq -u | xargs rm
Solution 2:[2]
Find every file in 234 and 123 directory get filename by -printf, sort them, uniq -d give list of duplications, give back path by sed, using 123 directory to delete the duplications from, and pass files to xargs rm
Command:
find ./234 ./123 -type f -printf '%P\n' | sort | uniq -d | sed 's/^/.\/123\//g' | xargs rm
sed don't needed if you are in the ./123 directory and using full path for folders in find.
Solution 3:[3]
Another approach: just find the files in abc and attempt to remove them from xyz:
UNTESTED
find /abc -type f -printf 'rm -f /xyz/%P' | sh
Solution 4:[4]
Remove Duplicate Files From Particular Directory
FileList=$(ls)
for D1 in $FileList ;do
if [[ -f $D1 ]]; then
for D2 in $FileList ;do
if [[ -f $D2 ]]; then
if [[ $D1 == $D2 ]]; then
: 'Skip Orignal File'
else
if [[ $(md5sum $D1 | cut -d'=' -f 2 | cut -d ' ' -f 1 ) == $(md5sum $D2 | cut -d'=' -f 2 | cut -d ' ' -f 1 ) ]]; then
echo "Duplicate File Found : $D2"
rm -rf $D2
fi #Detect Duplicate Using MD5
fi #Skip Orginal File
fi #D2 File available Then Next
done
fi #D1 File available Then Next
done
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | L. Scott Johnson |
| Solution 4 | IAMX |
