How to compress only uncompressed files?

I came up with the following command to brotli compress all files in a directory:

find "$PROJ_DIR/services/webpack/dist/" -type f -size +1000c -regextype posix-extended -iregex '.*\.(css|html|js|xml|svg)' -exec brotli -f -q 10 {} \+

But I'd like to run it only on files that don't have a corresponding .br file. Is there a nice way to do this? Maybe some way to get find to not return files that have this counterpart? Or, if not, a quick way to filter them out?


For example, given:

- foo.js
- foo.js.br
- bar.js
- charlie.css
- charlie.css.br

I only want to find bar.js because it doesn't have a matching .br file.



Solution 1:[1]

You can loop over the files returned by find and, inside the loop, test whether a compressed version already exists.

while IFS= read -r -d '' file; do
    [[ $file -nt ${file}.br ]] && brotli -f -q 10 "$file"
done < <(find "$PROJ_DIR/services/webpack/dist/" -type f -size +1000c -regextype posix-extended -iregex '.*\.(css|html|js|xml|svg)' -print0)

(file1 -nt file2 is true if file1 has a newer modification time than file2, or if file1 exists and file2 does not. Recompressing when the source is newer seems useful, because its contents may have changed since it was last compressed.)
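
To see how the -nt test behaves in each case, here is a small sketch that can be run in a scratch directory. The helper name needs_compress and the temporary files are made up for illustration, and it assumes bash and GNU touch (for -d). No actual compression happens; the .br files are just empty placeholders.

```shell
#!/usr/bin/env bash
# Illustration only: needs_compress is a hypothetical helper, not part of brotli.
tmp=$(mktemp -d)

needs_compress() {
    # True when "$1" is newer than "$1.br", or when "$1.br" does not exist.
    [[ $1 -nt $1.br ]]
}

# Case 1: no .br counterpart at all.
touch "$tmp/bar.js"

# Case 2: the .br file is newer than the source (already up to date).
touch -d '2020-01-01 00:00:00' "$tmp/foo.js"
touch -d '2020-01-02 00:00:00' "$tmp/foo.js.br"

# Case 3: the source was modified after the .br file was written (stale).
touch -d '2020-01-02 00:00:00' "$tmp/charlie.css"
touch -d '2020-01-01 00:00:00' "$tmp/charlie.css.br"

stale=()
for f in bar.js foo.js charlie.css; do
    if needs_compress "$tmp/$f"; then
        stale+=("$f")
    fi
done

# Lists the files that would be (re)compressed.
echo "${stale[*]}"
rm -rf "$tmp"
```

Only the file with no counterpart and the file with a stale counterpart are selected; the up-to-date one is skipped.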

Solution 2:[2]

This is the best I could come up with:

uncompressed_files=()
while IFS= read -r -d '' file; do
    [[ $file -nt ${file}.br ]] && uncompressed_files+=("$file")
done < <(find "$PROJ_DIR/services/webpack/dist/" -type f -size +1000c -regextype posix-extended -iregex '.*\.(css|html|js|xml|svg)' -print0)
if [[ ${#uncompressed_files[@]} -gt 0 ]]; then
    printf "%s\0" "${uncompressed_files[@]}" | xargs -0 -n1 -t -P"$(nproc)" -- brotli -f -q 10 -n
fi

It expands on Shawn's answer by pushing the uncompressed files into an array so that I can compress them in parallel via xargs.
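
If the intermediate array isn't needed, one possible variant is to let find do the filtering itself: an -exec ... \; clause acts as a test, so later actions only run for files where the command exits 0. The sketch below assumes GNU find (for -regextype and -printf) and uses a plain existence check on the .br file rather than -nt. It only lists the matches in a throwaway directory, so brotli isn't required to try it.

```shell
#!/usr/bin/env bash
# Illustration only: the directory and file names mirror the example in the question.
tmp=$(mktemp -d)
touch "$tmp/foo.js" "$tmp/foo.js.br" "$tmp/bar.js" \
      "$tmp/charlie.css" "$tmp/charlie.css.br"

# The -exec sh -c ... \; clause is used as a predicate:
# -printf is reached only for files whose .br counterpart does not exist.
missing=$(find "$tmp" -type f -regextype posix-extended \
    -iregex '.*\.(css|html|js|xml|svg)' \
    -exec sh -c '[ ! -e "$1.br" ]' sh {} \; \
    -printf '%f\n' | sort)

echo "$missing"
rm -rf "$tmp"
```

For real use, replacing -printf '%f\n' with -print0 and piping into xargs -0 -n1 -P"$(nproc)" brotli -f -q 10 should give the same parallel compression as above. Note that this spawns one sh process per candidate file, so the while-read loop may well be faster on large trees.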

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 mpen