Can't read field in XML section

Using Python, I got to the correct iteration of the XML (forecast) section, but there is one child field I can't seem to read. Here is the section from the XML:

<forecast>
<fcst_time_from>2022-05-04T16:00:00Z</fcst_time_from>
<fcst_time_to>2022-05-04T20:00:00Z</fcst_time_to>
<change_indicator>FM</change_indicator>
<wind_dir_degrees>110</wind_dir_degrees>
<wind_speed_kt>6</wind_speed_kt>
<visibility_statute_mi>6.21</visibility_statute_mi>
<sky_condition sky_cover="SCT" cloud_base_ft_agl="4500"/>
<sky_condition sky_cover="SCT" cloud_base_ft_agl="25000"/>
</forecast>

I can get every field except the sky_condition elements, e.g. <sky_condition sky_cover="SCT" cloud_base_ft_agl="25000"/>

Here is the code where I pull the fields:

if startTime_new <= Cur_Date_UTC <= EndTime_new:
    #Cil2 = (l.find('sky_cover')).text
    wDir = (l.find('wind_dir_degrees')).text
    wSpd = (l.find('wind_speed_kt')).text
    vis = (l.find('visibility_statute_mi')).text
    Cil = (l.find('sky_condition')).text
    print(wDir)
    print(wSpd)
    print(vis)
    print(Cil)
    print(l)

The answer given did work in a command window, but I am using BeautifulSoup to get the XML. I tried adding from datetime import datetime, but Cil = [x.get('sky_cover') for x in elt.findall('sky_condition')] still did not work.

Here is my full code:

from bs4 import BeautifulSoup as bs
import requests
import pytz
from datetime import datetime

#pirip=False

def SetDateTimes():
    utc_time = datetime.now(pytz.utc)
    Cur_Date_UTC = utc_time.strftime("%d/%m/%y %H:%M:%S")
    return Cur_Date_UTC

def FixDateTime(workingTime):
    workingTime = workingTime.replace('T', ' ')
    workingTime = workingTime.replace('Z', '')
    format = "%Y-%m-%d %H:%M:%S"
    dt_object = datetime.strptime(workingTime, format)
    day_st = dt_object.strftime("%d")
    month_st = dt_object.strftime("%m")
    year_st = dt_object.strftime("%y")
    time_st = dt_object.strftime("%H:%M:%S")
    workingTime = day_st + '/' + month_st + '/' + year_st + ' ' + time_st
    return workingTime

def GetTAF3():
    USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"
    url = 'https://www.aviationweather.gov/adds/dataserver_current/httpparam?dataSource=tafs&requestType=retrieve&format=xml&hoursBeforeNow=3&timeType=issue&mostRecent=true&stationString=KEWR'
    session = requests.Session()
    session.headers['User-Agent'] = USER_AGENT
    html = session.get(url)
    soup = bs(html.text, 'html.parser')
    taf = soup.find_all("forecast")
    for l in soup.findAll('forecast'):
            startTime = l.find('fcst_time_from').text
            EndTime = (l.find('fcst_time_to')).text
            
            #print(startTime)
            startTime_new = FixDateTime(startTime)
            #print(startTime_new)
            
            #print(EndTime)
            EndTime_new = FixDateTime(EndTime)
            #print(EndTime_new)
            #break
           
            if startTime_new <= Cur_Date_UTC <= EndTime_new:
                #Cil2 = (l.find('sky_cover')).text

                
                print(l)
               
                wDir = (l.find('wind_dir_degrees')).text
                wSpd = (l.find('wind_speed_kt')).text
                vis = (l.find('visibility_statute_mi')).text
                Cil = (l.find('sky_condition'))
                
                
                print(wDir)
                print(wSpd)
                print(vis)
                print(Cil)
                print(l)

                if 'OVC' in Cil:
                    print("OVC Found")
                    pirip = True
                else:
                    print('No PIRIP')

                if float(vis) < 5:
                    print("pirip for vis")
                    pirip = True
                else:
                    print('No PIRIP')


                        

Cur_Date_UTC = SetDateTimes()   
GetTAF3()
#print(pirip)


Solution 1:[1]

You are asking for (l.find('sky_condition')).text, but the sky_condition tag has no text content. Take a close look at the source:

<sky_condition sky_cover="SCT" cloud_base_ft_agl="4500"/>
<sky_condition sky_cover="SCT" cloud_base_ft_agl="25000"/>

The values you want are attributes, not text nodes. Additionally, there are multiple sky_condition elements, so you'll need to handle multiple matches.

You could do something like this:

>>> Cil = [x.get('cloud_base_ft_agl') for x in l.findall('sky_condition')]
>>> Cil
['4500', '25000']
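
The line above assumes l is an ElementTree element. If l is a BeautifulSoup tag instead, as in the question's full code, note that BeautifulSoup tags have no findall method; the equivalent is find_all, and attributes are still read with .get(). A minimal sketch under that assumption, using a shortened copy of the sample data from the question (the variable names layers, sky_cover and cloud_base are illustrative, not from the original code):

```python
from bs4 import BeautifulSoup

# Shortened sample copied from the question's XML.
data = """<forecast>
<visibility_statute_mi>6.21</visibility_statute_mi>
<sky_condition sky_cover="SCT" cloud_base_ft_agl="4500"/>
<sky_condition sky_cover="SCT" cloud_base_ft_agl="25000"/>
</forecast>
"""

# Same parser choice as the question's code.
soup = BeautifulSoup(data, "html.parser")
l = soup.find("forecast")

# find_all returns every matching tag; .get reads an attribute value.
layers = l.find_all("sky_condition")
sky_cover = [x.get("sky_cover") for x in layers]
cloud_base = [x.get("cloud_base_ft_agl") for x in layers]

print(sky_cover)
print(cloud_base)
```

Both lists hold strings, so convert cloud_base values with int() before comparing them numerically.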

Edit: here's a complete, working example:

from xml.etree import ElementTree

data = """<forecast>
<fcst_time_from>2022-05-04T16:00:00Z</fcst_time_from>
<fcst_time_to>2022-05-04T20:00:00Z</fcst_time_to>
<change_indicator>FM</change_indicator>
<wind_dir_degrees>110</wind_dir_degrees>
<wind_speed_kt>6</wind_speed_kt>
<visibility_statute_mi>6.21</visibility_statute_mi>
<sky_condition sky_cover="SCT" cloud_base_ft_agl="4500"/>
<sky_condition sky_cover="SCT" cloud_base_ft_agl="25000"/>
</forecast>
"""

l = ElementTree.fromstring(data)

wDir = (l.find("wind_dir_degrees")).text
wSpd = (l.find("wind_speed_kt")).text
vis = (l.find("visibility_statute_mi")).text
Cil = [x.get("cloud_base_ft_agl") for x in l.findall("sky_condition")]

print(wDir)
print(wSpd)
print(vis)
print(Cil)
print(l)

Running the above code produces the following output:

110
6
6.21
['4500', '25000']
<Element 'forecast' at 0x7f8770c2fd80>
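
The question's full code goes on to test 'OVC' in Cil, which checks the tag itself rather than its sky_cover values. Once the attributes are collected into a list, that test becomes a plain membership check. The sketch below is one possible way to combine it with the visibility check; the function name needs_pirep and its min_visibility_mi parameter are hypothetical, and the 5-mile threshold is taken from the question's code.

```python
from xml.etree import ElementTree


def needs_pirep(forecast, min_visibility_mi=5.0):
    """Return True if any layer is overcast or visibility is below the threshold."""
    # Collect the sky_cover attribute of every sky_condition child.
    covers = [x.get("sky_cover") for x in forecast.findall("sky_condition")]
    # findtext returns None when the element is missing.
    vis_text = forecast.findtext("visibility_statute_mi")
    low_vis = vis_text is not None and float(vis_text) < min_visibility_mi
    return "OVC" in covers or low_vis


sample = ElementTree.fromstring(
    """<forecast>
<visibility_statute_mi>6.21</visibility_statute_mi>
<sky_condition sky_cover="SCT" cloud_base_ft_agl="4500"/>
<sky_condition sky_cover="SCT" cloud_base_ft_agl="25000"/>
</forecast>"""
)

print(needs_pirep(sample))  # False: no OVC layer and visibility is above 5
```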

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1