Grep only the numbers from linux shell

I have the curl output shown below, and I need to grep only the numbers from it.

Curl Output

<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">

Grep Command

 grep -o -i 'Max Memory:.*'  | awk  '{ print $3 }'

Output

MB</p><table

Expected Output : 3072.00

Similarly for Free Memory and Total Memory.

Please help



Solution 1:[1]

Here is a GNU grep command that gets all the memory numbers in one go:

s='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'
grep -oP '\w+ Memory:\K[\d.]+' <<< "$s"

2144.78
3072.00
3072.00
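
The question also asks for each value separately. A minimal sketch, assuming GNU grep built with PCRE support (needed for -P and \K) and using the variable names free, total and max purely for illustration, is to spell out the label instead of \w+:

```shell
s='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'

# \K drops everything matched so far, so -o prints only the number.
free=$(printf '%s\n' "$s" | grep -oP 'Free Memory:\K[\d.]+')
total=$(printf '%s\n' "$s" | grep -oP 'Total Memory:\K[\d.]+')
max=$(printf '%s\n' "$s" | grep -oP 'Max Memory:\K[\d.]+')

echo "free=$free total=$total max=$max"
# => free=2144.78 total=3072.00 max=3072.00
```

In real use the printf would be replaced by the curl command that produces the page.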

Solution 2:[2]

You can also use sed to extract all the necessary details from the string:

fm=$(sed -n 's/.*Free Memory:\([^ ]*\).*/\1/p' file)
tm=$(sed -n 's/.*Total Memory:\([^ ]*\).*/\1/p' file)
mm=$(sed -n 's/.*Max Memory:\([^ ]*\).*/\1/p' file)

A complete demo script:

#!/bin/bash
s='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'
fm=$(sed -n 's/.*Free Memory:\([^ ]*\).*/\1/p' <<< "$s")
tm=$(sed -n 's/.*Total Memory:\([^ ]*\).*/\1/p' <<< "$s")
mm=$(sed -n 's/.*Max Memory:\([^ ]*\).*/\1/p' <<< "$s")
echo "fm=$fm, tm=$tm, mm=$mm"
# => fm=2144.78, tm=3072.00, mm=3072.00

Details:

  • -n suppresses default line output
  • .*Free Memory:\([^ ]*\).* - matches the whole line, which consists of:
    • .* - any zero or more chars
    • Free Memory: - a fixed string
    • \([^ ]*\) - Group 1 (\1): any zero or more non-space chars
    • .* - any zero or more chars
  • /\1/ - replaces the line matched with Group 1 value
  • p - prints the result of the successful substitution.
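
Since the three sed commands differ only in the label, the pattern described above could be wrapped in a small function. This is only a sketch: mem_value is a hypothetical helper name, and it assumes the label is a plain word with no regex metacharacters.

```shell
# mem_value (hypothetical helper): reads the page from stdin and prints
# the value that follows "<label> Memory:", where $1 is Free, Total or Max.
mem_value() {
  sed -n "s/.*$1 Memory:\([^ ]*\).*/\1/p"
}

s='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'
fm=$(printf '%s\n' "$s" | mem_value Free)
tm=$(printf '%s\n' "$s" | mem_value Total)
mm=$(printf '%s\n' "$s" | mem_value Max)
echo "fm=$fm, tm=$tm, mm=$mm"
# => fm=2144.78, tm=3072.00, mm=3072.00
```

The sed expression is in double quotes so that $1 expands; the backslashes before the parentheses and the 1 are passed through to sed unchanged.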

Solution 3:[3]

Try this (pipe the curl output into it):

grep -o -i 'Max Memory:.*'  | cut -d ':' -f 2 |awk  '{ print $1 }'
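
To see why the command in the question printed MB</p><table, and why the cut step helps, the pipeline can be run on the sample line. This is a sketch with the sample stored in a variable s; tail_part and max are illustrative names only.

```shell
s='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'

# grep -o keeps everything from "Max Memory:" to the end of the line.
tail_part=$(printf '%s\n' "$s" | grep -o -i 'Max Memory:.*')

# awk splits on whitespace, so the fields are:
#   $1=Max  $2=Memory:3072.00  $3=MB</p><table  $4=border="0">
third=$(printf '%s\n' "$tail_part" | awk '{ print $3 }')
echo "$third"
# => MB</p><table

# Cutting at the colon first leaves '3072.00 MB</p>...', so the number is $1.
max=$(printf '%s\n' "$tail_part" | cut -d ':' -f 2 | awk '{ print $1 }')
echo "$max"
# => 3072.00
```

Replacing Max with Free or Total in the grep pattern gives the other two values, because cut keeps only the text between the first and second colon.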

Solution 4:[4]

I would use GNU AWK for this task in the following way. Let the content of file.txt be

<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">

then

awk 'BEGIN{FPAT="[0-9]+[.][0-9]+"}{print $1,$2,$3}' file.txt

output

2144.78 3072.00 3072.00

Explanation: FPAT tells GNU AWK that a field is one or more digits, followed by a literal ., followed by one or more digits. I then print the 1st, 2nd and 3rd fields. Disclaimer: I assume that you are interested only in numbers that contain a single . character. Note that the 0 in <table border="0"> is not detected, because it has no decimal part. Feel free to adjust this to your needs.

(tested in gawk 4.2.1)
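
FPAT is a GNU AWK extension, so other awk implementations silently ignore it. Where gawk is not available, a different technique gives the same result: a loop over the standard match() function with the same regular expression. This is a sketch of that alternative, not the method above; nums, rest and out are illustrative names.

```shell
s='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'

nums=$(printf '%s\n' "$s" | awk '{
  rest = $0
  out = ""
  # match() sets RSTART and RLENGTH for the leftmost match in rest.
  while (match(rest, /[0-9]+[.][0-9]+/)) {
    out = out (out == "" ? "" : " ") substr(rest, RSTART, RLENGTH)
    rest = substr(rest, RSTART + RLENGTH)   # continue after this match
  }
  print out
}')
echo "$nums"
# => 2144.78 3072.00 3072.00
```

As with the FPAT version, the 0 in border="0" is skipped because the pattern requires a decimal point.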

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 anubhava
Solution 2 Wiktor Stribiżew
Solution 3 xirehat
Solution 4 Daweo