Python multiple same name tags selection
Please Help....
I am rather new to Python and cannot find a way to target only the last matching tag in the XML document. I am trying to get the last j151 tag in the XML, the one containing "172.00".
Code being used:
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
import pyodbc
import mysql.connector
import os

# Raw string so the backslashes in the Windows path are not treated as escapes
tree = ET.parse(r'C:\OnixFile\Pearson_SA_20211027-160725999_onix.xml')
root = tree.getroot()
count = 0
for product in root.findall('product'):
    # Variables
    SurAndInit = pubType = language = subject = othertext = shortDesc = longDesc = currencyPrice = publisher = ""
    count = count + 1
    for sd in product.findall('supplydetail'):
        supplydetail = sd.find('j137').text
        for pc in sd.findall('price'):
            price = pc.find('j151').text
            print(price)  # prints every price, not just the last one
XML being used is below.
<product>
<a001>9780636155343</a001>
<a002>03</a002>
<a194>01</a194>
<a197>jimmy South Africa</a197>
<productidentifier>
<b221>03</b221>
<b244>9780636155343</b244>
</productidentifier>
<productidentifier>
<b221>15</b221>
<b244>9780636155343</b244>
</productidentifier>
<b012>DG</b012>
<b211>029</b211>
<b212>EPUB 2.0.1</b212>
<series>
<title>
<b202>01</b202>
<b203>Today</b203>
</title>
</series>
<title>
<b202>01</b202>
<b203>Life Orientation Today Grade 8 Learner's Book ePUB (perpetual licence)</b203>
</title>
<workidentifier>
<b201>01</b201>
<b233>GCOI</b233>
<b244>20157105570300</b244>
</workidentifier>
<workidentifier>
<b201>15</b201>
<b244>9780636155343</b244>
</workidentifier>
<contributor>
<b034>1</b034>
<b035>A01</b035>
<b036>G Euvrard</b036>
<b037>Euvrard, G</b037>
<b039>G</b039>
<b040>Euvrard</b040>
<personnameidentifier>
<b390>01</b390>
<b233>Onixsuite Contributor ID</b233>
<b244>3108</b244>
</personnameidentifier>
</contributor>
<contributor>
<b034>2</b034>
<b035>A01</b035>
<b036>H Findlay</b036>
<b037>Findlay, H</b037>
<b039>H</b039>
<b040>Findlay</b040>
<personnameidentifier>
<b390>01</b390>
<b233>Onixsuite Contributor ID</b233>
<b244>3109</b244>
</personnameidentifier>
</contributor>
<contributor>
<b034>3</b034>
<b035>A01</b035>
<b036>C Normand</b036>
<b037>Normand, C</b037>
<b039>C</b039>
<b040>Normand</b040>
<personnameidentifier>
<b390>01</b390>
<b233>Onixsuite Contributor ID</b233>
<b244>3110</b244>
</personnameidentifier>
</contributor>
<b057>1</b057>
<language>
<b253>01</b253>
<b252>eng</b252>
</language>
<language>
<b253>02</b253>
<b252>eng</b252>
</language>
<b061>168</b061>
<extent>
<b218>00</b218>
<b219>168</b219>
<b220>03</b220>
</extent>
<b064>FAM000000</b064>
<b065>YQN</b065>
<subject>
<b067>12</b067>
<b069>YQW</b069>
</subject>
<subject>
<b067>12</b067>
<b069>YQX</b069>
</subject>
<audience>
<b204>01</b204>
<b206>04</b206>
</audience>
<audience>
<b204>22</b204>
<b206>00</b206>
</audience>
<audiencerange>
<b074>17</b074>
<b075>03</b075>
<b076>13 T</b076>
<b075>04</b075>
<b076>15</b076>
</audiencerange>
<othertext>
<d102>01</d102>
<d104 language="eng" textformat="02"><strong>T</strong>rust <strong><em>TODAY</em></strong> to be up-to-date and fresh for the classroom.<br> <strong>O</strong>pportunities for revision, exam practice and assessment throughout.<br> <strong>D</strong>evelops language skills alongside subject knowledge.<br> <strong>A</strong>ll content is fully CAPS-compliant.<br> <strong>Y</strong>our easy-to-use complete classroom solution!<br> <strong><em>TODAY</em></strong>, for successful teaching tomorrow.<br><br> This eBook is a digital version of the printed, CAPS-approved book. Benefits of the ePUB formatinclude:<br> <ul> <li>The ability to view on a desktop computer, notebook or tablet;<br> </li> <li>As learners adjust fonts, rotate and flip pages, content reflows to fit the device's screen giving the user a more flexible experience; and<br> </li> <li>Learners can take notes, highlight and bookmark, and access video and audio for visuallearning. </li></ul></d104>
</othertext>
<othertext>
<d102>03</d102>
<d104 language="eng" textformat="02"><strong>T</strong>rust <strong><em>TODAY</em></strong> to be up-to-date and fresh for the classroom.<br> <strong>O</strong>pportunities for revision, exam practice and assessment throughout.<br> <strong>D</strong>evelops language skills alongside subject knowledge.<br> <strong>A</strong>ll content is fully CAPS-compliant.<br> <strong>Y</strong>our easy-to-use complete classroom solution!<br> <strong><em>TODAY</em></strong>, for successful teaching tomorrow.<br><br> This eBook is a digital version of the printed, CAPS-approved book. Benefits of the ePUB formatinclude:<br> <ul> <li>The ability to view on a desktop computer, notebook or tablet;<br> </li> <li>As learners adjust fonts, rotate and flip pages, content reflows to fit the device's screen giving the user a more flexible experience; and<br> </li> <li>Learners can take notes, highlight and bookmark, and access video and audio for visuallearning. </li></ul></d104>
</othertext>
<othertext>
<d102>02</d102>
<d104 language="eng">Trust TODAY to be up-to-date and fresh for the classroom.</d104>
</othertext>
<productwebsite>
<b367>02</b367>
<f123>http://jimmysa.app.onixsuite.com/book/?GCOI=20157105570300</f123>
</productwebsite>
<imprint>
<b079>Maskew Miller Longman</b079>
</imprint>
<publisher>
<b291>01</b291>
<b081>jimmy South Africa</b081>
</publisher>
<b209>Cape Town, South Africa</b209>
<b083>ZA</b083>
<b394>04</b394>
<b003>20140630</b003>
<copyrightstatement>
<b087>2013</b087>
<copyrightowner>
<b047>jimmy South Africa</b047>
</copyrightowner>
</copyrightstatement>
<salesrights>
<b089>01</b089>
<b090>ZA</b090>
</salesrights>
<relatedproduct>
<h208>13</h208>
<productidentifier>
<b221>15</b221>
<b244>9780636115651</b244>
</productidentifier>
</relatedproduct>
<supplydetail>
<j137>jimmy SA</j137>
<j292>01</j292>
<j138>ZA</j138>
<j396>20</j396>
<j143>20140630</j143>
<price>
<j148>01</j148>
<j151>10.72</j151>
<j152>USD</j152>
<b251>ZA</b251>
</price>
<price>
<j148>02</j148>
<j151>12.33</j151>
<j152>USD</j152>
<b251>ZA</b251>
</price>
<price>
<j148>01</j148>
<j151>149.57</j151>
<j152>ZAR</j152>
<b251>ZA</b251>
</price>
<price>
<j148>02</j148>
<j151>172.00</j151>
<j152>ZAR</j152>
<b251>ZA</b251>
</price>
</supplydetail>
</product>
I am trying to return only the last value found, e.g. <j151>172.00</j151>. At the moment it finds all four values; I need just the last one.
Solution 1:[1]
Before we start: I had to turn the provided XML into a valid XML file by inserting a <?xml version="1.0"?> declaration and a <data> tag that surrounds your data.
Assuming the XML file is located in a files directory next to the Python script, this is the minimal code to run it:
import xml.etree.ElementTree as ET
tree = ET.parse('files/text.xml')
root = tree.getroot()
listing = root.findall("product/supplydetail/price/j151")
end = len(listing) - 1
print(listing[end].tag)
print(listing[end].text)
Since I did not need anything from the intermediate tags, I could jump straight to the path where the j151 elements reside.
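If the real file contains more than one product, the same kind of path can also be applied relative to each product element, which gives the last price per product instead of the last price in the whole file. The following is only a sketch under that assumption: the SAMPLE document is a hypothetical, trimmed-down stand-in for the ONIX file, the second ISBN is made up, and prices[-1] is simply Python's shorthand for the last list entry.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-product sample, trimmed to the tags that matter here.
SAMPLE = """<data>
  <product>
    <a001>9780636155343</a001>
    <supplydetail>
      <price><j151>149.57</j151><j152>ZAR</j152></price>
      <price><j151>172.00</j151><j152>ZAR</j152></price>
    </supplydetail>
  </product>
  <product>
    <a001>0000000000000</a001>
    <supplydetail>
      <price><j151>99.00</j151><j152>ZAR</j152></price>
    </supplydetail>
  </product>
</data>"""

root = ET.fromstring(SAMPLE)

last_prices = {}
for product in root.findall("product"):
    # The path is relative to the current product element.
    prices = product.findall("supplydetail/price/j151")
    if prices:  # skip products that have no price at all
        last_prices[product.findtext("a001")] = prices[-1].text

print(last_prices)
```

Keying the result by the a001 value is just one convenient choice for the sketch; any per-product identifier would work.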
According to the documentation, root.findall("...") returns a list of all matching elements in document order.
As you mentioned, you always want the last of the prices, which corresponds to the last entry of that list. We therefore take the number of entries in the list and, since list indices in Python start at 0, subtract 1 to get the index of the last entry.
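As a side note, Python lists also accept negative indices, so the length calculation can be replaced by index -1. Here is a minimal sketch, assuming a small inline document (hypothetical, reduced to two price entries) in place of the real file, with a guard for the case where no j151 element is found:

```python
import xml.etree.ElementTree as ET

# Hypothetical inline document standing in for the real ONIX file.
root = ET.fromstring(
    "<data><product><supplydetail>"
    "<price><j151>10.72</j151></price>"
    "<price><j151>172.00</j151></price>"
    "</supplydetail></product></data>"
)

listing = root.findall("product/supplydetail/price/j151")

# listing[-1] refers to the same element as listing[len(listing) - 1];
# the guard avoids an IndexError when the list is empty.
last = listing[-1] if listing else None
print(last.text if last is not None else "no j151 found")
```

Both spellings select the same element; the negative index is just the more common idiom.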
You should only use this code on trusted XML, as the library you are using (xml.etree.ElementTree) is not secure against maliciously constructed data.
Hope that gave you some hints, Cheerio.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Alexander Lahn |
