Count the number of XML elements from the Linux shell
My XML looks something like this:
<elements>
<elem>
....bunch of other elements
</elem>
</elements>
Is there a way to count the number of occurrences of the elem tag in an XML file through the Linux shell, for example with Perl, Python, or anything else that works as a one-liner?
I could run something like grep -c "elem" myfile.xml and divide the result by 2, but is there something similar that works as a single one-liner?
EDIT:
I'm looking for an alternative to the grep solution.
Solution 1:[1]
The xml_grep tool does what you want - try the following:
xml_grep --count //elem example.xml
That utility is in the xml-twig-tools package on Debian / Ubuntu, and the documentation is here.
Solution 2:[2]
You can also use xmllint:
xmllint --xpath "count(//elem)" myfile.xml
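For use in a script, the command above can be wrapped in a small function. This is only a minimal sketch: it assumes xmllint is installed and recent enough to support --xpath, and the function name count_elems and the sample file are made up for illustration.

```shell
# Hypothetical helper (the name is illustrative): count elements named $2 in file $1.
count_elems() {
    xmllint --xpath "count(//$2)" "$1"
}

# Small sample file so the sketch is self-contained.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<elements>
<elem>a</elem><elem>b</elem>
<elem>
c
</elem>
</elements>
EOF

if command -v xmllint >/dev/null 2>&1; then
    n=$(count_elems "$sample" elem)
    echo "elem count: $n"
else
    echo "xmllint not found" >&2
fi
rm -f "$sample"
```

Because the count comes from an XPath expression rather than from matching text, it does not depend on how the tags are laid out across lines.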
Solution 3:[3]
DO NOT USE REGULAR EXPRESSIONS TO PARSE OR SCAN XML FILES
With the mandatory disclaimer out of the way, here's my solution:
xmllint --nocdata --format myfile.xml | grep -c '</elem>'
xmllint is part of libxml2, which is fairly common on many Linux distros. This solution passes the following regex/XML traps:
- spurious spaces (--format)
- several closing tags on single line (--format)
- CDATA sections (--nocdata)
However, you can still be caught out by nasty namespace declarations and defaults.
Solution 4:[4]
London,
Try fgrep -c '</elem>' "$filename"
fgrep is a standard Unix utility and is also available on Linux (GNU grep provides the same behavior as grep -F). The -c switch counts matching lines.
Cheers. Keith.
PS: It's almost always easier to count CLOSING tags, because they don't have attributes ;-)
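One caveat worth noting: -c counts matching lines, not individual matches, so two closing tags on the same line are counted once. A possible workaround, sketched here under the assumption that your grep supports -o (GNU and BSD grep do, but it is not required by POSIX), is to print each match on its own line and count those. The sample file is made up for illustration.

```shell
# Sample file with two closing tags on one line.
sample=$(mktemp)
printf '%s\n' \
    '<elements>' \
    '<elem>a</elem><elem>b</elem>' \
    '<elem>' \
    'c' \
    '</elem>' \
    '</elements>' > "$sample"

# -c counts lines that contain at least one match.
lines=$(grep -cF '</elem>' "$sample")

# -o prints every match on its own line; wc -l then counts the matches.
# tr strips the padding some wc implementations add.
matches=$(grep -oF '</elem>' "$sample" | wc -l | tr -d ' ')

echo "matching lines: $lines, matches: $matches"
rm -f "$sample"
```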
Solution 5:[5]
grep alone won't help in all cases, but this is an easy case for XMLStarlet. You can match elem with XMLStarlet and then count the newlines with wc -l. The number of newlines minus 1 is the number of elements.
Example YOURFILE.xml:
<elements>
<elem>....bunch of other elements</elem><elem>....bunch of other elements</elem>
<elem>
....bunch of other elements
....bunch of other elements
</elem>
</elements>
Use XMLStarlet and wc -l:
echo $(($(xmlstarlet sel -t -m //elem -n YOURFILE.xml | wc -l)-1))
Output: 3
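As a possible alternative to counting newlines, XMLStarlet can evaluate the XPath count() function itself, which avoids the subtraction. The following is a sketch, assuming the binary is installed under the name xmlstarlet; the sample file is illustrative.

```shell
# Sample file with three elem elements, two of them on the same line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<elements>
<elem>....bunch of other elements</elem><elem>....bunch of other elements</elem>
<elem>
....bunch of other elements
</elem>
</elements>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
    # -v prints the value of the XPath expression.
    n=$(xmlstarlet sel -t -v "count(//elem)" "$sample")
    echo "$n"
else
    echo "xmlstarlet not found" >&2
fi
rm -f "$sample"
```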
Solution 6:[6]
Here's a refinement to @bluenote10's xmllint answer that also works for arbitrary namespace prefixes:
xmllint --xpath "count(//*[local-name()='elem'])" myfile.xml
(Already tried to add this as a response to @Ryan_Pelletier's question below the original answer, but kept running into formatting issues so created a separate answer instead).
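To illustrate the difference, here is a small sketch that runs both expressions against a namespaced file. The sample document and its namespace URIs are invented for the example, and xmllint with --xpath support is assumed to be installed.

```shell
# Sample file whose elem elements carry namespace prefixes (URIs are made up).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<ns:elements xmlns:ns="http://example.com/ns">
<ns:elem>a</ns:elem>
<other:elem xmlns:other="http://example.com/other">b</other:elem>
</ns:elements>
EOF

if command -v xmllint >/dev/null 2>&1; then
    # //elem only matches elements named elem that are in no namespace.
    plain=$(xmllint --xpath "count(//elem)" "$sample")
    # local-name() ignores the prefix and namespace entirely.
    any_ns=$(xmllint --xpath "count(//*[local-name()='elem'])" "$sample")
    echo "//elem: $plain, local-name(): $any_ns"
else
    echo "xmllint not found" >&2
fi
rm -f "$sample"
```

The trade-off is that local-name() also counts elem elements from unrelated namespaces, which may or may not be what you want.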
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Mark Longair |
| Solution 2 | bluenote10 |
| Solution 3 | Robert Bossy |
| Solution 4 | corlettk |
| Solution 5 | |
| Solution 6 | |
