Linux: create random directory/file hierarchy
For testing a tool I need a directory with a whole bunch of different Office files in a deeply nested structure. I already have the files in a directory, but now need to create some random nested subdirectories and spread the files out among them.
I could sit down and write a proper program in a programming language of my choice, but I wonder if there might be a clever combination of Linux command line tools + Bash to achieve what I want.
Edit: to clarify, my input is a directory with about 200 files. The output should be a directory hierarchy containing these files, more or less evenly spread. Directory names should be more than single letters, vary randomly in length and use various allowed characters (UTF-8 filesystem).
Solution 1:[1]
You can use bash brace-expansion:
mkdir -p {a,b}/{e,f,g}/{h,i,j}
├── a
│   ├── e
│   │   ├── h
│   │   ├── i
│   │   └── j
│   ├── f
│   │   ├── h
│   │   ├── i
│   │   └── j
│   └── g
│       ├── h
│       ├── i
│       └── j
└── b
    ├── e
    │   ├── h
    │   ├── i
    │   └── j
    ├── f
    │   ├── h
    │   ├── i
    │   └── j
    └── g
        ├── h
        ├── i
        └── j
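To check what a brace expression will produce before creating anything, it can be expanded into an array first. The following is only a sketch: the array name `leaves` and the scratch directory from `mktemp -d` are assumptions for illustration, not part of the answer. It shows that the expression expands to the Cartesian product of its groups, which is why every branch ends up with the same child names.

```shell
#!/usr/bin/env bash
# Expand the brace expression into an array instead of passing it to mkdir.
# Brace expansion yields every combination: 2 * 3 * 3 = 18 leaf paths.
leaves=( {a,b}/{e,f,g}/{h,i,j} )

# Preview the paths, one per line, and report how many there are.
printf '%s\n' "${leaves[@]}"
echo "leaf count: ${#leaves[@]}"

# Create them inside a scratch directory rather than the current one.
scratch=$(mktemp -d)
( cd "$scratch" && mkdir -p "${leaves[@]}" )
echo "created under: $scratch"
```

Note that brace expansion happens before variable expansion, so a list stored in a variable is not expanded unless the command is passed through `eval`.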
Solution 2:[2]
This script generates a random directory structure:
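#!/bin/bash
# Decimal ASCII codes (see man ascii): 0-9, A-Z, a-z
ARR=( {48..57} {65..90} {97..122} )
# Array count
arrcount=${#ARR[@]}

# print a random string of 1 to 29 characters
get_rand_dir(){
    # draw the length once, so the name is never empty
    local i len=$((1 + RANDOM % 29))
    for ((i=0; i<len; i++)) {
        printf "\\$(printf '%03o' "${ARR[RANDOM%arrcount]}")"
    }
}

dir=/tmp/
# append random names to build one deep path
depth=$((1 + RANDOM % 99))
for ((i=0; i<depth; i++)) {
    dir+="$(get_rand_dir)/"
}
echo "$dir"
mkdir -p "$dir"
# top of the new tree, i.e. /tmp/<first random name>
oldir=$(echo "$dir" | cut -d '/' -f1-3)

# walk back up the path and add random siblings at each level;
# stop at $oldir so the walk never leaves the new tree
while [[ $dir == "$oldir"/* ]]; do
    dir=${dir%/*}
    cd "$dir" || exit 1
    count=$((RANDOM % 100))
    for ((i=0; i<count; i++)) {
        mkdir -p "$(get_rand_dir)" &>/dev/null
    }
done
tree "$oldir"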
Sample output (truncated):
/tmp/x
├── egeDVPW
├── iOkr
├── l
├── o1gye8uF
├── q
│   ├── 4Dlrfagv
│   ├── 4Yxmoqf
│   ├── 8LkyIrXA
│   ├── 8m9kse8s
│   ├── aV
│   ├── in
│   │   ├── 12zdLso68HWlPK
│   │   │   ├── C
│   │   │   ├── DOYt8wUW
│   │   │   ├── FXP
│   │   │   ├── hFLem8
│   │   │   ├── hhHIv
│   │   │   ├── iD87kxs54x04
│   │   │   ├── oFM
│   │   │   ├── OjFT
Now you can collect the directories in an array:
shopt -s globstar # requires Bash 4 or later
dirs=( /tmp/x/**/ ) # the trailing slash restricts the match to directories
printf '%s\n' "${dirs[@]}"
and populate the directories with files at random. There are plenty of examples for that part; I've done the hardest part.
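The remaining step, spreading the existing files over the generated directories, could look roughly like the sketch below. It is not from the answer: the function name `spread_files` and its two arguments (a flat source directory and the root of an existing tree) are assumptions for illustration. It copies rather than moves, so the source directory stays intact.

```shell
#!/usr/bin/env bash
# Sketch: copy every regular file from a flat directory into a randomly
# chosen directory of an existing tree.
# spread_files is a hypothetical helper name used only for this example.
spread_files() {
    local src=$1 root=$2 f d
    local -a dirs=()

    # Collect all directories under root; NUL delimiters keep names with
    # spaces or other special characters intact.
    while IFS= read -r -d '' d; do
        dirs+=("$d")
    done < <(find "$root" -type d -print0)
    (( ${#dirs[@]} > 0 )) || return 1

    for f in "$src"/*; do
        [[ -f $f ]] || continue
        # RANDOM is at most 32767, so this only reaches every directory
        # when the tree has fewer directories than that.
        cp -- "$f" "${dirs[RANDOM % ${#dirs[@]}]}/"
    done
}
```

A call such as `spread_files ./office-files /tmp/x` (both paths are placeholders) picks each target directory uniformly at random, so with about 200 files the spread is only roughly even. Replacing `cp` with `mv` would move the files instead.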
Solution 3:[3]
Thanks to all who posted here. It turned out that escaping filenames with special characters wasn't really trivial, so I built my own script based on the ones here. This is how it behaves with special-character filenames:
$ ~/rndtree.sh ./rndpath 0 3 1
Warning: will create random tree at: ./rndpath
Proceed (y/n)? y
Removing old outdir ./rndpath
mkdir -p ./rndpath/";"/{")?DxVBBJ{w2","L,|+","^VC)Vn.6!"}/"D+,IFJ( LN"
> > > > > > > > > > >
./rndpath
└── [ 4096] ;
    ├── [ 4096] )?DxVBBJ{w2
    │   ├── [ 4096] D+,IFJ( LN
    │   │   └── [ 929] r2.bin
    │   ├── [ 8557] %3fsaG# Rl;ffXf.bin
    │   └── [ 19945] Dzk .bin
    ├── [ 4096] L,|+
    │   ├── [ 4096] D+,IFJ( LN
    │   │   ├── [ 2325] 6Qg#pe5j'&ji49oqrO.bin
    │   │   ├── [ 16345] #?.bin
    │   │   └── [ 2057] Uz-0XtLVWz#}0lI.bin
    │   ├── [ 2543] bbtA-^s22vdTu.bin
    │   └── [ 10848] K46+kh7L9.bin
    ├── [ 4096] ^VC)Vn.6!
    │   ├── [ 4096] D+,IFJ( LN
    │   ├── [ 10502] 8yY,MqZ ^5+_SA^.r4{.bin
    │   └── [ 17628] ipO"|69.bin
    └── [ 12376] a2Y% }G1.qDir.bin
7 directories, 11 files
total bytes: 136823 ./rndpath
and here with a safe subset of ASCII:
$ ~/rndtree.sh ./rndpath 1 3 1
Warning: will create random tree at: ./rndpath
Proceed (y/n)? y
Removing old outdir ./rndpath
mkdir -p ./rndpath/"48SLS"/{"nyG","jIC6vj"}/{"PSLd5tMn","V R"}
> > > > > > >
./rndpath
├── [ 4096] 48SLS
│   ├── [ 4096] jIC6vj
│   │   ├── [ 4096] PSLd5tMn
│   │   ├── [ 4096] V R
│   │   │   ├── [ 922] lg.bin
│   │   │   └── [ 9600] VVYG.bin
│   │   ├── [ 10883] B7nt06p.bin
│   │   └── [ 19339] g5uT i.bin
│   ├── [ 4096] nyG
│   │   ├── [ 4096] PSLd5tMn
│   │   ├── [ 4096] V R
│   │   └── [ 6128] 1tfLR.bin
│   └── [ 5448] Jda.bin
└── [ 18196] KwEXu2O2H9s.bin
Spaces should be handled in both cases - however, note that subdirectory names repeat (while filenames do not).
The script rndtree.sh:
#!/usr/bin/env bash
# http://stackoverflow.com/questions/13400312/linux-create-random-directory-file-hierarchy
# Decimal ASCII codes (see man ascii); added space
AARR=( 32 {48..57} {65..90} {97..122} )
# Array count
aarrcount=${#AARR[@]}
# positional arguments; each falls back to its default when missing or empty
OUTDIR="${1:-./rndpath}"
ASCIIONLY="${2:-1}"
DIRDEPTH="${3:-3}"
MAXFIRSTLEVELDIRS="${4:-2}"
MAXDIRCHILDREN="${5:-4}"
MAXDIRNAMELEN="${6:-12}"
MAXFILECHILDREN="${7:-4}"
MAXFILENAMELEN="${8:-20}"
MAXFILESIZE="${9:-20000}"
MINDIRNAMELEN=1
MINFILENAMELEN=1
MINDIRCHILDREN=1
MINFILECHILDREN=0
MINFILESIZE=1
FILEEXT=".bin"
VERBOSE=0 #1
get_rand_dirname() {
    # draw the length once; RANDOM in the loop condition would be re-rolled on every pass
    local i len=$((MINDIRNAMELEN + RANDOM % MAXDIRNAMELEN))
    if [ "$ASCIIONLY" == "1" ]; then
        for ((i=0; i<len; i++)) {
            printf "\\$(printf '%03o' "${AARR[RANDOM%aarrcount]}")"
        }
    else
        # printable ASCII minus characters that would break the eval'd mkdir; double quotes get escaped
        tr -dc '[ -~]' < /dev/urandom | tr -d '[$></~:`\\]' | head -c"$len" | sed 's/\(["]\)/\\\1/g'
    fi
    #echo -e " " # debug last dirname space
}
get_rand_filename() {
    local i len=$((MINFILENAMELEN + RANDOM % MAXFILENAMELEN))
    if [ "$ASCIIONLY" == "1" ]; then
        for ((i=0; i<len; i++)) {
            printf "\\$(printf '%03o' "${AARR[RANDOM%aarrcount]}")"
        }
    else
        # no need to escape double quotes for filename
        tr -dc '[ -~]' < /dev/urandom | tr -d '[$></~:`\\]' | head -c"$len"
    fi
    printf "%s" "$FILEEXT"
}
echo "Warning: will create random tree at: $OUTDIR"
[ "$VERBOSE" == "1" ] && echo " MAXFIRSTLEVELDIRS $MAXFIRSTLEVELDIRS ASCIIONLY $ASCIIONLY DIRDEPTH $DIRDEPTH MAXDIRCHILDREN $MAXDIRCHILDREN MAXDIRNAMELEN $MAXDIRNAMELEN MAXFILECHILDREN $MAXFILECHILDREN MAXFILENAMELEN $MAXFILENAMELEN MAXFILESIZE $MAXFILESIZE"
read -r -p "Proceed (y/n)? " READANS
if [ "$READANS" != "y" ]; then
    exit
fi
if [ -d "$OUTDIR" ]; then
    echo "Removing old outdir $OUTDIR"
    rm -rf "$OUTDIR"
fi
mkdir "$OUTDIR"
if [ "$MAXFIRSTLEVELDIRS" -gt 0 ]; then
    NUMFIRSTLEVELDIRS=$((1+RANDOM%MAXFIRSTLEVELDIRS))
else
    NUMFIRSTLEVELDIRS=0
fi
# create directories
for (( ifl=0; ifl<NUMFIRSTLEVELDIRS; ifl++ )) {
    FLDIR="$(get_rand_dirname)"
    FLCHILDREN=""
    for (( ird=0; ird<DIRDEPTH-1; ird++ )) {
        DIRCHILDREN=""; MOREDC=0
        # draw the child count once instead of re-rolling RANDOM on every pass
        NUMDC=$((MINDIRCHILDREN + RANDOM % MAXDIRCHILDREN))
        for (( idc=0; idc<NUMDC; idc++ )) {
            CDIR="$(get_rand_dirname)"
            # build a comma-separated list of quoted names; braces are added below
            # only when there is more than one name, since a single-element
            # {"x"} is not brace-expanded
            if [ "$DIRCHILDREN" == "" ]; then DIRCHILDREN="\"$CDIR\""
            else DIRCHILDREN="$DIRCHILDREN,\"$CDIR\"" ; MOREDC=1 ; fi
        }
        if [ "$MOREDC" == "1" ] ; then
            if [ "$FLCHILDREN" == "" ]; then FLCHILDREN="{$DIRCHILDREN}"
            else FLCHILDREN="$FLCHILDREN/{$DIRCHILDREN}" ; fi
        else
            if [ "$FLCHILDREN" == "" ]; then FLCHILDREN="$DIRCHILDREN"
            else FLCHILDREN="$FLCHILDREN/$DIRCHILDREN" ; fi
        fi
    }
    # brace expansion only happens when the command string is re-parsed, hence eval
    DIRCMD="mkdir -p $OUTDIR/\"$FLDIR\"/$FLCHILDREN"
    eval "$DIRCMD"
    echo "$DIRCMD"
}
# now loop through all directories, create random files inside
# note: an earlier approach escaped each path with printf '%q' (which
# preserves spaces but still needs eval, and chokes on backslashes);
# quoting "$D" works directly, so that is no longer needed
find "$OUTDIR" -type d | while IFS= read -r D ; do
    [ "$VERBOSE" == "1" ] && echo "$D"
    # draw the file count once instead of re-rolling RANDOM on every pass
    NUMFC=$((MINFILECHILDREN + RANDOM % MAXFILECHILDREN))
    for (( ifc=0; ifc<NUMFC; ifc++ )) {
        CFILE="$(get_rand_filename)"
        echo -n '> '
        [ "$VERBOSE" == "1" ] && echo "$D"/"$CFILE"
        head -c$((MINFILESIZE + RANDOM % MAXFILESIZE)) /dev/urandom > "$D"/"$CFILE"
    }
done
echo
tree -a --dirsfirst -s "$OUTDIR"
echo "total bytes: $(du -bs "$OUTDIR")"
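The repeated subdirectory names noted above follow from building the tree with one brace expression per first-level directory. One possible alternative is to create each directory with its own `mkdir` call, which gives every directory its own name and needs neither `eval` nor escaping. The sketch below is an assumption-laden illustration, not part of rndtree.sh: `rand_name` and `grow_tree` are hypothetical helpers, and the alphabet, the 12-character limit and the 1 to 3 children per level are arbitrary choices.

```shell
#!/usr/bin/env bash
# Sketch: grow a random tree recursively with plain mkdir calls.
# rand_name and grow_tree are hypothetical helper names.

# Print a random alphanumeric name of 1 to 12 characters.
rand_name() {
    local chars='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
    local len=$(( 1 + RANDOM % 12 )) name='' i
    for (( i = 0; i < len; i++ )); do
        name+=${chars:RANDOM % ${#chars}:1}
    done
    printf '%s' "$name"
}

# grow_tree PARENT DEPTH: create 1 to 3 random children under PARENT,
# then recurse into each child until DEPTH levels have been added.
grow_tree() {
    local parent=$1 depth=$2 n i child
    (( depth > 0 )) || return 0
    n=$(( 1 + RANDOM % 3 ))
    for (( i = 0; i < n; i++ )); do
        child="$parent/$(rand_name)"
        mkdir -p -- "$child"
        grow_tree "$child" $(( depth - 1 ))
    done
}
```

For example, `grow_tree ./rndpath 3` would add three levels below `./rndpath`. Because names are passed to `mkdir` as quoted arguments, a wider character set could be used in `rand_name` without extra escaping.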
Solution 4:[4]
None of these solutions were fast enough since they rely on bash, so I created a Rust crate that generates pseudo-random directory hierarchies: https://crates.io/crates/ftzz.
Note that I only cared about the hierarchy itself, not its contents, so this program creates empty files or files filled with random data.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | sdaau |
| Solution 4 | |
