grep not counting occurrences that span more than one line
I am using grep to count the occurrences of a particular string in my code, but grep is not counting the occurrences which span more than one line.
I am trying to find occurrences of (` including the ones which look like
(
`
Basically, the backtick is on the next line.
So far I have tried:
grep -roh -E "\(\s*\`" . | wc -l
But it doesn't count the occurrences that span two lines. Even
grep -roh -E "\(\n" . | wc -l
gives 0.
What would be the solution to this?
Solution 1:[1]
find -type f -exec cat {} + | tr -d '[:space:]' | grep -oF '(`' | wc -l
- find concatenates the contents of all files into a single stream
- tr reads the stream and strips all whitespace, including newlines
- grep outputs each occurrence of the string on its own line (-o is a GNU extension)
- wc counts them
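The pipeline can be tried on a small throwaway directory. This is only a minimal sketch: the temporary directory, the file name sample.js and its contents are made up for illustration, and it assumes a grep that supports -o.

```shell
# Build a throwaway directory with one sample file (hypothetical content).
tmpdir=$(mktemp -d)
printf 'foo(`a`)\nbar(\n`b`)\nbaz( `c`)\n' > "$tmpdir/sample.js"

# Concatenate every file, delete all whitespace, then count the literal string (`
count=$(find "$tmpdir" -type f -exec cat {} + |
  tr -d '[:space:]' |
  grep -oF '(`' |
  wc -l)
echo "$count"

# Clean up the sample directory.
rm -rf "$tmpdir"
```

Because every whitespace character is deleted, a ( and a ` separated by any number of spaces or newlines are counted as well, not only those separated by a single line break.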
Solution 2:[2]
The following assumes that the strings you want to count start with an opening parenthesis, followed by optional blanks (spaces or tabs), and end with a backtick, with at most one newline among the blanks. We can use sed (tested with GNU sed) to remove those newlines before passing everything to grep and wc:
$ s='abc
text (
`
def
text (
`
ghi ( ` (` jkl
'
$ sed -Ez ':a;s/(.*)\([[:blank:]]*\n[[:blank:]]*`(.*)/\1\(`\2/g;ta' <<< "$s"
abc
text (`
def
text (`
ghi ( ` (` jkl
$ sed -Ez ':a;s/(.*)\([[:blank:]]*\n[[:blank:]]*`(.*)/\1\(`\2/g;ta' <<< "$s" |
grep -Eo '\(\s*`'
(`
(`
( `
(`
$ sed -Ez ':a;s/(.*)\([[:blank:]]*\n[[:blank:]]*`(.*)/\1\(`\2/g;ta' <<< "$s" |
grep -Eo '\(\s*`' | wc -l
4
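The sed script uses the -z option, which makes sed split its input on NUL characters instead of newlines, so the whole text is processed as a single record and \n can be matched like any other character. Each opening parenthesis that is followed by optional blanks, a newline, optional blanks and a backtick is replaced by just an opening parenthesis and a backtick. The :a label and the ta command repeat the substitution for as long as one succeeds.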
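To isolate the effect of -z, the following sketch runs the same kind of substitution with and without it. It uses a reduced form of the pattern (no capture groups and no loop), chosen here only for illustration; the variable names are hypothetical, and -z requires GNU sed.

```shell
# Two-line sample: "(" ends the first line, the backtick starts the second.
sample=$(printf 'a (\n`b')

# Without -z, sed handles one line at a time, so \n in the pattern never matches.
without_z=$(printf '%s\n' "$sample" |
  sed -E 's/\([[:blank:]]*\n[[:blank:]]*`/(`/g')

# With -z (GNU sed), the whole input is one record, so the match can span the newline.
with_z=$(printf '%s\n' "$sample" |
  sed -Ez 's/\([[:blank:]]*\n[[:blank:]]*`/(`/g')

printf 'without -z:\n%s\n' "$without_z"
printf 'with -z:\n%s\n' "$with_z"
```

Without -z the text comes back unchanged on two lines; with -z the parenthesis and the backtick end up next to each other on one line.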
To apply this to all files under the current directory, use find to concatenate them and pipe the result to sed:
$ find . -type f -exec cat {} \; |
sed -Ez ':a;s/(.*)\([[:blank:]]*\n[[:blank:]]*`(.*)/\1\(`\2/g;ta' |
grep -Eo '\(\s*`' | wc -l
1257
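For repeated use, the same pipeline could be wrapped in a small shell function. This is a sketch under assumptions: the function name count_paren_backtick, the temporary directory and the sample files are hypothetical, -exec cat {} + is used in place of \; only to start fewer cat processes, and GNU sed and GNU grep are assumed.

```shell
# count_paren_backtick DIR: hypothetical helper, not part of the original answer.
# Counts "(" followed by optional whitespace and a backtick in all files under DIR.
count_paren_backtick() {
  find "${1:-.}" -type f -exec cat {} + |
    sed -Ez ':a;s/(.*)\([[:blank:]]*\n[[:blank:]]*`(.*)/\1\(`\2/g;ta' |
    grep -Eo '\(\s*`' |
    wc -l
}

# Try it on a throwaway directory with two small sample files.
tmpdir=$(mktemp -d)
printf 'text (\n`\n' > "$tmpdir/a.txt"
printf 'ghi ( ` (` jkl\n' > "$tmpdir/b.txt"

total=$(count_paren_backtick "$tmpdir")
echo "$total"

# Clean up the sample directory.
rm -rf "$tmpdir"
```

Calling the function without an argument scans the current directory, like the command above.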
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jhnc |
| Solution 2 | |
