How do I use awk to print starting at one pattern, ending at another pattern, then exit?
I have a text file that has the following format:
50000
55000
60000
65000
150000
160000
I want to print everything starting at 50000 and ending at 60000. What I tried was:
awk "/50000/,/60000/ {print}"
But this also prints 150000 and 160000. How should I modify this?
Solution 1:[1]
A robust and efficient way to do this is:
awk '$1==50000{f=1} f{print; if ($1==60000) exit}' file
The exit is so awk doesn't continue wasting time reading the input long after the last line you want to process.
The above assumes that if 60000 didn't exist in the input but 50000 did, you'd want to print the lines from 50000 to the end of the file. If that's not the case, buffer the lines and print them only once 60000 is seen:
awk '$1==50000{f=1} f{ buf=buf $1 ORS; if ($1==60000) {printf "%s", buf; exit} }' file
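As a quick check, here is a minimal sketch that runs both commands against the sample data from the question. The temporary file and the variable names (`tmpfile`, `result`, `buffered`) are only for illustration and are not part of the original answer.

```shell
# Sample data from the question, written to a temporary file (path is arbitrary).
tmpfile=$(mktemp)
printf '%s\n' 50000 55000 60000 65000 150000 160000 > "$tmpfile"

# Set a flag at the start line, print while it is set, exit after the end line.
result=$(awk '$1==50000{f=1} f{print; if ($1==60000) exit}' "$tmpfile")
printf '%s\n' "$result"
# prints:
# 50000
# 55000
# 60000

# Buffered variant on input that lacks the end line: nothing is printed.
printf '%s\n' 50000 55000 65000 > "$tmpfile"
buffered=$(awk '$1==50000{f=1} f{ buf=buf $1 ORS; if ($1==60000) {printf "%s", buf; exit} }' "$tmpfile")
[ -z "$buffered" ] && echo "no complete range found"

rm -f "$tmpfile"
```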
Solution 2:[2]
Best practice with awk is to avoid a sed-style regex range.
Instead, set a flag to start printing, then clear it (or simply exit) when you reach the end pattern.
Example:
seq 100 | awk '
/^22$/{f=1}
/^29$/{exit}
f'
Prints:
22
23
24
25
26
27
28
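Note that the example above stops before printing 29, because the exit rule runs ahead of the printing rule. If you want the closing line included, one possible variation (a sketch, not part of the original answer) is to move the exit rule after the bare `f` pattern. The variable name `inclusive` is only for illustration.

```shell
# Print from 22 through 29 inclusive: the bare pattern f prints the line
# first, and only then does the exit rule fire.
inclusive=$(seq 100 | awk '
/^22$/{f=1}
f
/^29$/{exit}')
printf '%s\n' "$inclusive"
```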
Solution 3:[3]
If you're not matching a regular expression, you can use equality comparisons as the range criteria instead:
$ awk '$0==50000,$0==60000' file
This gives you the desired range.
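To see why the equality test helps, compare it with the unanchored regex range from the question. /50000/ also matches 150000 and /60000/ also matches 160000, so a second range opens later in the file. A small sketch, with `data`, `regex_range` and `exact_range` as illustrative variable names:

```shell
# Sample data from the question.
data=$(printf '%s\n' 50000 55000 60000 65000 150000 160000)

# Unanchored regex range: 150000 reopens the range and 160000 closes it again.
regex_range=$(printf '%s\n' "$data" | awk '/50000/,/60000/')
printf '%s\n' "$regex_range"
# prints 50000, 55000, 60000, 150000, 160000 (one per line)

# Equality range: the whole record must equal the start or end value.
exact_range=$(printf '%s\n' "$data" | awk '$0==50000,$0==60000')
printf '%s\n' "$exact_range"
# prints 50000, 55000, 60000 (one per line)
```

Unlike the flag-and-exit approach, a range pattern keeps reading to the end of the file, so it would also print any later exact 50000 to 60000 block.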
Solution 4:[4]
Also, numeric comparison works:
awk '50000 <= $1 && $1 <= 60000' file
The print is implicit here.
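One thing to keep in mind: this selects lines by value rather than by position, so it reads the whole file and also prints in-range values that show up later. A small sketch (the second input is made up for illustration, as are the variable names):

```shell
# On the question's data the result is the expected three lines.
in_range=$(printf '%s\n' 50000 55000 60000 65000 150000 160000 |
    awk '50000 <= $1 && $1 <= 60000')
printf '%s\n' "$in_range"

# Hypothetical unordered input: 52000 appears after 65000 but is still printed.
unordered=$(printf '%s\n' 50000 60000 65000 52000 |
    awk '50000 <= $1 && $1 <= 60000')
printf '%s\n' "$unordered"
```

That is fine when the values are sorted, as in the question, but it is not the same thing as "from the first marker line to the second".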
Solution 5:[5]
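You can also go for a string-based approach that matches the values with a regex.
With gawk or nawk, using interval expressions:
awk '/^(5[0-9]{4}|6[0]{4})$/' file
With mawk or mawk2, spelling the repetition out:
awk '/^(5[0-9][0-9][0-9][0-9]|60000)$/' file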
I'd recommend against [[:digit:]] in place of [0-9] since non-C/POSIX locales may result in matching multi-byte "digits", such as those in Unicode.
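A quick check of the spelled-out form against the question's data. It avoids interval expressions, so it should behave the same in any POSIX awk. The variable name `matched` is only for illustration.

```shell
# Matches any line that is exactly 50000-59999 or 60000.
matched=$(printf '%s\n' 50000 55000 60000 65000 150000 160000 |
    awk '/^(5[0-9][0-9][0-9][0-9]|60000)$/')
printf '%s\n' "$matched"
# prints:
# 50000
# 55000
# 60000
```

The ^ and $ anchors are what keep 150000 and 160000 out. This also selects by value rather than by position in the file.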
Solution 6:[6]
If you are interested solely in the first range, just exit at the first occurrence of the closing pattern. Let file.txt contain
50000
55000
60000
65000
150000
160000
then
awk '/50000/,/60000/{print}/60000/{exit}' file.txt
output
50000
55000
60000
Note that this code stops processing as soon as it encounters the first /60000/, which is useful if you have a huge file and the range you want is near the start.
(tested in gawk 4.2.1)
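One caveat, as far as I can tell: the unanchored /60000/ also matches 160000, so if such a line comes before the range, awk exits without printing anything. Below is a sketch of the problem and an anchored variant. The reordered input is hypothetical, and the anchoring is a suggestion rather than part of the original answer.

```shell
# Hypothetical input where 160000 precedes the range.
input=$(printf '%s\n' 160000 50000 55000 60000 65000)

# Unanchored: /60000/ matches 160000 on the first line, so awk exits at once.
unanchored=$(printf '%s\n' "$input" | awk '/50000/,/60000/{print}/60000/{exit}')
[ -z "$unanchored" ] && echo "unanchored version printed nothing"

# Anchored: only whole-line matches start the range, end it, or trigger exit.
anchored=$(printf '%s\n' "$input" | awk '/^50000$/,/^60000$/{print} /^60000$/{exit}')
printf '%s\n' "$anchored"
# prints:
# 50000
# 55000
# 60000
```

Even the anchored form exits early if a line that is exactly 60000 appears before the first 50000, so the flag-based approach is safer when that can happen.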
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ed Morton |
| Solution 2 | dawg |
| Solution 3 | karakfa |
| Solution 4 | glenn jackman |
| Solution 5 | RARE Kpop Manifesto |
| Solution 6 | Daweo |
