How to create symlinks in a single directory with the lowest number of forks?
How to create symlinks in a single directory when:
- The common way fails:
ln -s /readonlyShare/mydataset/*.mrc .
-bash: /bin/ln: Argument list too long
- The find command doesn't allow the following syntax:
find /readonlyShare/mydataset -maxdepth 1 -name '*.mrc' -exec ln -s {} . +
- Forking once per file takes hours to complete:
find /readonlyShare/mydataset -maxdepth 1 -name '*.mrc' -exec ln -s {} . ';'
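For context, the first failure comes from the kernel rather than from ln: the shell expands the glob into one argument per file, and the combined size of the arguments and environment passed to a single exec call is capped. A minimal way to inspect that cap, assuming a system that provides getconf (the variable name arg_max is only for illustration):

```shell
# Ask the system for the limit, in bytes, on arguments plus environment
# that can be handed to one exec call.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"
```

Any approach that works here has to split the file list into batches that each stay under this limit.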
Solution 1:[1]
I was in a rush when I needed this, so I didn't explore all the possibilities, but I have since worked something out.
Thanks to @WeihangJian's answer, I now know that find ... | xargs -I {} ... is as bad as find ... -exec ... {} ';'.
A correct answer to my question would be:
find /readonlyShare/mydataset -maxdepth 1 -name '*.mrc' \
-exec sh -c 'ln -s "$0" "$@" .' {} +
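The batching pattern can be tried safely on scratch directories before pointing it at the real share. The sketch below is an illustration, not the exact command above: the directories, file names and the link_count variable are made up, and it passes a placeholder word (sh) as $0 so that every path arrives in "$@".

```shell
# Scratch directories standing in for the read-only share and the link directory.
src=$(mktemp -d)
dst=$(mktemp -d)
for i in 1 2 3 4 5; do : > "$src/file$i.mrc"; done
: > "$src/notes.txt"   # does not match *.mrc, so it must not be linked

# find starts one sh (and therefore one ln) per batch of paths, not per file.
# The literal word "sh" fills $0, so all paths end up in "$@".
(cd "$dst" && find "$src" -maxdepth 1 -name '*.mrc' -exec sh -c 'ln -s "$@" .' sh {} +)

# Count the symlinks that were created.
link_count=$(find "$dst" -maxdepth 1 -type l | wc -l | tr -d ' ')
echo "created $link_count symlinks"

rm -rf "$src" "$dst"
```

mktemp -d returns absolute paths, so the link targets stay valid regardless of where the links live.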
Solution 2:[2]
find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -exec sh -c 'ln -s "$@" .' sh '{}' '+'
or if you prefer xargs:
find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -print0 |
xargs -0 -P0 sh -c 'ln -s "$@" .' sh
If you are using BSD xargs instead of GNU xargs, it can be simpler:
find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -print0 |
xargs -0 -J@ -P0 ln -s @ .
Why '{}' '+'?
Quoted from man find:
-exec utility [argument ...] {} +
Same as -exec, except that “{}” is replaced with as many pathnames as possible for each invocation of utility. This behaviour is similar
to that of xargs(1). The primary always returns true; if at least one invocation of utility returns a non-zero exit status, find will
return a non-zero exit status.
find is good at splitting a large number of arguments into batches:
find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -exec ruby -e 'pp ARGV.size' '{}' '+'
15925
15924
15925
15927
1835
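If ruby is not at hand, the difference in the number of forks can be checked with plain sh. This is a rough sketch on a scratch directory of 200 empty files (the file names and the per_file and batched variables are invented for the demonstration); each invocation of the utility prints one line, so counting lines counts invocations.

```shell
# Build a scratch directory with 200 empty files.
src=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
    : > "$src/f$i.mrc"
    i=$((i + 1))
done

# ';' form: the utility runs once per file.
per_file=$(find "$src" -type f -exec sh -c 'echo run' sh {} ';' | wc -l | tr -d ' ')

# '+' form: the utility runs once per batch; 200 short paths fit in one batch.
batched=$(find "$src" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

echo "per file: $per_file invocations, batched: $batched invocation(s)"
rm -rf "$src"
```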
Why not xargs -I?
It is slow and inefficient because -I runs the utility once per argument, for example:
printf 'foo\0bar' | xargs -0 -I@ ruby -e 'pp ARGV' @
["foo"]
["bar"]
printf 'foo\0bar' | xargs -0 ruby -e 'pp ARGV'
["foo", "bar"]
xargs is also good at splitting a large number of arguments into batches:
seq 65536 | tr '\n' '\0' | xargs -0 ruby -e 'pp ARGV.size'
5000
5000
5000
5000
5000
5000
5000
5000
5000
5000
5000
5000
5000
536
Why sh -c?
Only BSD xargs has the -J flag to put the arguments in the middle of a command. For GNU xargs, we need the combination of sh -c and "$@" to do the same thing.
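A small sketch of how that combination works: the first word after the script (sh here) becomes $0, and everything xargs appends lands in "$@", so a fixed argument can follow the batch. DEST is only a stand-in for the real target directory, and result is an illustrative variable name.

```shell
# xargs appends a, b and c after the placeholder "sh";
# inside the script they are "$@", and DEST is placed after them.
result=$(printf 'a\0b\0c' | xargs -0 sh -c 'echo "$@" DEST' sh)
echo "$result"
```

Without the trailing sh placeholder, the first item would be consumed as $0 and dropped from "$@".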
find -exec vs find | xargs
It depends, but I would suggest xargs when you want to use all your CPUs: xargs can run the utility in parallel with -P, while find can't.
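To see the effect of -P in isolation, here is a rough timing sketch, assuming an xargs that supports -P (both GNU and BSD xargs do, although it is not in POSIX). Four one-second sleeps are used as a stand-in workload; the start and elapsed variables are just for the demonstration.

```shell
# Run four "sleep 1" commands, up to four at a time.
start=$(date +%s)
printf '1\n1\n1\n1\n' | xargs -n 1 -P 4 sleep
elapsed=$(( $(date +%s) - start ))

# Run serially this would take about 4 seconds; in parallel, about 1.
echo "elapsed: ${elapsed}s"
```

Note that -P only helps when xargs has more than one batch to run, so it is usually paired with -n or with an input large enough to be split.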
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
