I'm trying to find and select the phone number from a Google Places search result.
This is the HTML block where the phone number is located. I need to get the text value of the inner span, (304) 746-1139:

    <span class="LrzXr zdqRlf kno-fv"><a data-dtype="d3ph" data-local-attribute="d3ph" jscontroller="LWZElb" href="#" jsdata="QKGTRc;_;AZo4o4" jsaction="rcuQ6b:npT2md;F75qrd" data-ved="2ahUKEwiWvdDy8d71AhUTHcAKHU5FC4EQkAgoAHoECCgQAw"><span aria-label="Call Phone Number (304) 746-1139">(304) 746-1139</span></a></span>

Here is the Python I'm using:

    import pandas as pd
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    def find_text_element(html_element, element, selector_type=None):
        """
        Extract the needed data from a specific HTML element. This function
        also serves as the error handling for when the HTML element doesn't exist.
        """
        try:
            if not selector_type:
                # Default: treat `element` as a class name.
                return html_element.find_element(By.CLASS_NAME, element).text
            elif selector_type == "attri":
                # Treat `element` as an XPath and return the link target.
                return html_element.find_element(By.XPATH, element).get_attribute('href')
            else:
                # Otherwise treat `element` as an XPath and return its text.
                return html_element.find_element(By.XPATH, element).text
        except NoSuchElementException:
            pass
        return None

    data = {}

    def single_page_extract(html_element):
        """
        Take a single HTML page for a single company, extract all of its
        information at once, and return it as a DataFrame built from `data`.
        """
        data['Company_name'] = [find_text_element(html_element, 'SPZz6b')]
        data['Activit_description'] = [find_text_element(html_element, 'YhemCb')]
        data['Full_Address'] = [find_text_element(html_element, 'LrzXr')]
        data['Phone'] = [find_text_element(html_element, 'LrzXr zdqRlf kno-fv', "xpath")]  # <-- the phone lookup in question
        # data['Phone'] = [find_text_element(html_element, "//a[@data-dtype='d3ifr']", "xpath")]
        data['Website'] = [find_text_element(html_element, "//a[@class='ab_button']", "attri")]
        data['Status'] = [find_text_element(html_element, "//*[@id='Shyhc']", "xpath")]
        return pd.DataFrame.from_dict(data)

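
For what it's worth, the string `'LrzXr zdqRlf kno-fv'` is a list of class names, not an XPath expression, so it may be worth trying a locator built on an attribute that appears in the HTML above, such as `data-dtype="d3ph"`. The following is only a minimal sketch of that idea. It uses the standard-library `xml.etree.ElementTree` parser instead of Selenium so that it runs without a browser, the markup is trimmed to the attributes it needs, and the names `PHONE_XPATH` and `extract_phone` are hypothetical ones made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The phone markup from the question, trimmed to the attributes used below.
SNIPPET = (
    '<span class="LrzXr zdqRlf kno-fv">'
    '<a data-dtype="d3ph" data-local-attribute="d3ph" href="#">'
    '<span aria-label="Call Phone Number (304) 746-1139">(304) 746-1139</span>'
    '</a></span>'
)

# Hypothetical locator: match the link by its data-dtype attribute
# rather than by the compound class name.
PHONE_XPATH = ".//a[@data-dtype='d3ph']"


def extract_phone(root):
    """Return the text inside the phone link, or None if no link matches."""
    link = root.find(PHONE_XPATH)
    if link is None:
        return None
    # itertext() walks the nested <span>, so the inner text is included.
    return "".join(link.itertext()).strip()


root = ET.fromstring(SNIPPET)
phone = extract_phone(root)
print(phone)
```

With Selenium, the same expression could presumably be passed through the helper from the question, for example `find_text_element(html_element, ".//a[@data-dtype='d3ph']", "xpath")`. Whether Google keeps that attribute value stable across result pages is an assumption the question does not confirm.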
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
