How can I grep/sed/awk in multiple files and get lines before and after the match in AIX (ksh) using ssh and find commands?

I have multiple files on a remote server. I can grep/sed/awk the files, but I have not been able to get the context lines before and after the match for each file and append them all to one file.

With sed I managed to get the lines from the pattern to the end of the file, but not the lines before the pattern. With awk I got the line number of the match for each file, but I got errors with find and -exec. I am a beginner in Linux and regex. What am I doing wrong?

First attempt:

sshpass -p password ssh user@server "find /data/ -name "*.txt" -type f  -exec ksh -c "grep -n $KEY $1 | cut -d':' -f1  | xargs -n1 -I% awk 'NR<=%+5 && NR>=%-5' $1 " ksh {} \; -print" > output.txt

It seems to work fine up to the xargs command. I got this error: find: 0652-018 An expression term lacks a required parameter.

Second attempt:

sshpass -p password ssh user@server "find /data/ -name "*.txt" -type f -exec grep -n $KEY {} \; | cut -d':' -f1  | xargs -n1 -I% -exec awk 'NR<=%+5 && NR>=%-5' {} \; -print" > output.txt

I got:

awk: 0602-533 Cannot find or open file {}. The source line number is 1.
awk: 0602-533 Cannot find or open file {}. The source line number is 1.
awk: 0602-533 Cannot find or open file {}. The source line number is 1.

Third attempt:

sshpass -p password ssh user@server "find /data/ -name "*.txt" -type f  -exec sed -n '/$KEY/,$ p' {} \;" > output.txt

sed seems to work fine with simple words, and I can get the lines from the pattern to the end of each file. But I can't get expressions like "word1.*word2" (both words on the same line) to work.

$KEY is my variable with the pattern to match.



Solution 1:[1]

Your second attempt failed because you appear to misunderstand what runs where. The -exec ends at \; and that's where the remote command ends, too; the rest of the pipeline runs on your local computer.

So in a way, the first attempt was closer, but the quoting there was wrong, and using grep to find line numbers just so you can pass them back to Awk is weird and inefficient; perhaps see also "Counting lines or enumerating line numbers so I can loop over them - why is this an anti-pattern?"

Because you want to use both double and single quotes, perhaps the least annoying solution is to pass the script to ssh as a here document. (You still can't nest double quotes, though, and if you need variable interpolation in the here document, you will need to escape any dollar signs or backticks which should not be evaluated by your local shell.) See also "What is the cleanest way to ssh and run multiple commands in Bash?"

ssh user@server <<___EOF >output.txt
    find /data/ -name "*.txt" -type f  -exec \
      awk -v key="$KEY" '
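        FNR == 1 { p = 0; last = 0 }    # new file: reset state
        \$0 ~ key {                     # match: print up to 5 buffered lines not yet printed
          for (i = FNR - 5; i < FNR; ++i) if (i > 0 && i > last) print lines[i % 5]
          p = 6                         # the match itself plus 5 lines after it
        }
        p { print; last = FNR; --p }
        { lines[FNR % 5] = \$0 }' {} \;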
___EOF
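
To see which dollar signs the local shell expands inside an unquoted here document, a small local sketch may help. It uses cat as a stand-in for ssh (an assumption made only so that it runs without a server), and the value of KEY and the name file.txt are made up:

```shell
#!/bin/sh
# Hypothetical pattern; in the real command this is your search key.
KEY='word1.*word2'

# cat stands in for ssh: it shows exactly the text the remote shell would receive.
sent=$(cat <<___EOF
awk -v key="$KEY" '\$0 ~ key' file.txt
___EOF
)

# $KEY was expanded locally; the escaped \$0 arrives as a literal $0.
printf '%s\n' "$sent"
```

The unescaped $KEY is replaced before anything is sent, while the backslash keeps $0 intact for Awk on the other side.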

If you can't use a here document for some reason, the alternative is unfortunately rather depressing.

ssh user@server \
    find /data/ -name '"*.txt"' -type f  -exec \
      awk -v key="'$KEY'" '"\
        FNR == 1 { p = 0; last = 0 }
        \$0 ~ key {
          for (i = FNR - 5; i < FNR; ++i) if (i > 0 && i > last) print lines[i % 5]
          p = 6
        }
        p { print; last = FNR; --p }
        { lines[FNR % 5] = \$0 }"' {} '\;' >output.txt

The weird double quoting is because ssh eats one level of quotes. We use one (outer) layer of quotes (single quotes where it makes sense, otherwise double) to quote expressions from the local shell, and another level (generally double) to still have quotes in the remote shell ... and then you still need to backslash the dollar sign in $0.
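
The "one level of quotes" effect can be reproduced locally. In this sketch, fake_ssh is a hypothetical stand-in that mimics how ssh joins its arguments with spaces and lets a second shell parse the result; it is not part of ssh itself:

```shell
#!/bin/sh
# fake_ssh is a hypothetical helper: like ssh, it joins its arguments
# with spaces and hands the resulting string to another shell.
fake_ssh() {
    sh -c "$*"
}

# One level of quotes: the local shell removes them, so the second shell
# sees two separate words and the run of spaces collapses.
one_level=$(fake_ssh echo "a   b")

# Two levels: the outer single quotes are removed locally, the inner
# double quotes survive and protect the spaces in the second shell.
two_levels=$(fake_ssh echo '"a   b"')

printf 'one level:  [%s]\ntwo levels: [%s]\n' "$one_level" "$two_levels"
```

This is why the file name pattern has to be written as '"*.txt"' when it is passed on the ssh command line.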

The Awk script attempts to keep a memory of recent lines it has seen in the array lines so that it can recall them and print them when it finds a match on $KEY. This is probably a duplicate of an existing question (and the duplicate is probably better tested than this one; I am not in a place where I can properly check the corner cases); see e.g. "How to print 5 lines before and after the match regex with awk command"
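
Before running anything remotely, the buffering idea can be checked on the local machine. The following is only a sketch under assumptions: the sample input (the numbers 1 to 20), the pattern, and the variable names n, after and last are made up for illustration.

```shell
#!/bin/sh
KEY='^10$'   # hypothetical pattern that matches only the line "10"

context=$(
    # Made-up sample input: the numbers 1 to 20, one per line.
    awk 'BEGIN { for (i = 1; i <= 20; i++) print i }' |
    awk -v key="$KEY" -v n=5 '
        FNR == 1 { after = 0; last = 0 }      # reset state for each new file
        $0 ~ key {
            # print up to n remembered lines that have not been printed yet
            for (i = FNR - n; i < FNR; i++)
                if (i > 0 && i > last) print lines[i % n]
            after = n + 1                     # the match plus n lines after it
        }
        after { print; last = FNR; after-- }
        { lines[FNR % n] = $0 }               # remember the current line
    '
)

printf '%s\n' "$context"
```

With this input the output should be the lines 5 through 15. Tracking the last printed line number keeps overlapping matches from printing the same line twice.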

If your find supports it, replacing {} \; with {} + will improve efficiency by passing multiple files to Awk in one go.
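
The difference between the two forms can be observed locally with a rough sketch like the one below. The scratch directory and file names are made up, and mktemp -d is assumed to be available on the machine where you try it:

```shell
#!/bin/sh
# Scratch directory with three made-up files.
dir=$(mktemp -d)
for f in a b c; do
    printf 'x\n' > "$dir/$f.txt"
done

# With \; the command runs once per file: three lines, each saying 1.
per_file=$(find "$dir" -name '*.txt' -type f -exec sh -c 'echo "$#"' sh {} \;)

# With + the file names are batched: a single line saying 3.
batched=$(find "$dir" -name '*.txt' -type f -exec sh -c 'echo "$#"' sh {} +)

rm -rf "$dir"
printf 'per file:\n%s\nbatched:\n%s\n' "$per_file" "$batched"
```

When several files are passed to one Awk process, per-file line numbers are in FNR, while NR keeps counting across files, so a script that buffers context lines should use FNR and reset its state at the start of each file.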

Incidentally, none of your shell scripts contain any syntax specific to ksh so (unless the sh in AIX is severely broken) you might replace those with sh.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
