Error in UseMethod("xml_find_first") : no applicable method for 'xml_find_first' applied to an object of class "character"
I am trying to get coordinates from the following webpage: https://nominatim.openstreetmap.org/ui/search.html?q=
However, when I try to find the <p> element, I get the above error.
Yet we can see that a <p> element exists in the HTML code.

Code I am using to find the <p> element:
geocode <- function(record_id, address, city, state, zipcode){
  # NOMINATIM SEARCH API URL
  src_url <- "https://nominatim.openstreetmap.org/ui/search.html?q="
  ### INPUTS PREPARATION ###
  city <- str_replace_all(string = city,
                          pattern = "\\s|,",
                          replacement = "+")
  # CREATE A FULL ADDRESS
  addr <- paste(address, city, state, zipcode, sep = "%2C")
  # CREATE A SEARCH URL BASED ON NOMINATIM API TO RETURN GEOJSON
  requests <- paste0(src_url, addr, "&format=geojson")
  # ITERATE OVER THE URLS AND MAKE REQUEST TO THE SEARCH API
  for (i in 1:length(requests)) {
    # MAKE HTML REQUEST TO API AND TRANSFORM HTML RESPONSE TO JSON
    response <- read_html(requests[i]) %>%
      html_node("p") %>%
      html_text() %>%
      fromJSON()
    # FROM THE RESPONSE EXTRACT LATITUDE AND LONGITUDE COORDINATES
    lon <- response$features$geometry$coordinates[[1]][1]
    lat <- response$features$geometry$coordinates[[1]][2]
    # CREATE A COORDINATES DATAFRAME
    if (TRUE && i == 1) {
      loc <- tibble(record_id = record_id[i],
                    address = str_replace_all(addr[i], "%2C", ","),
                    latitude = lat, longitude = lon)
    }else{
      df <- tibble(record_id = record_id[i],
                   address = str_replace_all(addr[i], "%2C", ","),
                   latitude = lat, longitude = lon)
      loc <- bind_rows(loc, df)
    }
  }
  return(loc)
}
Recreating the problem through minimal code:
geocode <- function(record_id, address, city, state, zipcode){
  src_url <- "https://nominatim.openstreetmap.org/ui/search.html?q="
  city <- str_replace_all(string = city,
                          pattern = "\\s|,",
                          replacement = "+")
  addr <- paste(address, city, state, zipcode, sep = "%2C")
  requests <- paste0(src_url, addr, "&format=geojson")
  return(requests)
}
geocode(record_id = 1,
        address = 123,
        city = "New York",
        state = "NY", zipcode = "1006")
Output: "https://nominatim.openstreetmap.org/ui/search.html?q=123%2CNew+York%2CNY%2C1006&format=geojson"
request <- "https://nominatim.openstreetmap.org/ui/search.html?q=123%2CNew+York%2CNY%2C1006&format=geojson"
read_html(request)
Output:
{html_document}
<html lang="en">
[1] <head>\n<meta http-equiv="Content-Type" content="text/h ...
[2] <body>\n</body>
read_html(request) %>%
  html_nodes('p')
Which results in the above output. What seems to be the problem?
Solution 1:[1]
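You are not requesting the endpoint that the browser, running JavaScript, actually calls. The ui/search.html page is only a front end: as your read_html() output shows, the static HTML it returns has an empty <body>, so there is no <p> element to select. You can confirm which endpoint is called by watching the network tab of the browser's developer tools while refreshing the target webpage.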
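As a small offline illustration of why the selector finds nothing, the sketch below parses a stand-in HTML string with an empty body. The string is an assumption that mimics the read_html() output shown in the question; it is not the real page, and no request is sent.

```r
library(rvest)

# Stand-in for the static response: an HTML document with an empty body,
# mimicking the read_html() output in the question (not the real page).
static_page <- read_html(
  "<html><head><title>Nominatim</title></head><body>\n</body></html>"
)

# No <p> elements exist before JavaScript runs, so the node set is empty.
p_nodes <- html_nodes(static_page, "p")
length(p_nodes)

# html_node() returns a missing node and html_text() on it gives NA,
# which is not usable input for a JSON parser.
p_text <- static_page %>%
  html_node("p") %>%
  html_text()
is.na(p_text)
```

The same check against the live URL should behave similarly for as long as the page keeps rendering its results client-side.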
Below, I show an amended function to generate the correct endpoint URI, as well as an example call.
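library(httr2)
library(stringr) # provides str_replace_all()

# Build the URL of the JSON endpoint that the search page itself calls
geocode <- function(record_id, address, city, state, zipcode) {
  src_url <- "https://nominatim.openstreetmap.org/search.php?q="
  # Replace whitespace and commas in the city name with "+"
  city <- str_replace_all(
    string = city,
    pattern = "\\s|,",
    replacement = "+"
  )
  # Join the address parts with an encoded comma (%2C)
  addr <- paste(address, city, state, zipcode, sep = "%2C")
  requests <- paste0(src_url, addr, "&polygon_geojson=1&format=jsonv2")
  return(requests)
}

url <- geocode(
  record_id = 1,
  address = 123,
  city = "New York",
  state = "NY", zipcode = "1006"
)

# Send a User-Agent header with the request
headers <- c("User-Agent" = "Mozilla/5.0")

# Perform the request and parse the JSON body into a list of matches
data <- request(url) |>
  (\(x) req_headers(x, !!!headers))() |>
  req_perform() |>
  resp_body_json()

# Latitude, longitude and GeoJSON coordinates of the first match
print(data[[1]]$lat)
print(data[[1]]$lon)
print(data[[1]]$geojson$coordinates)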
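As a possible variation on the code above, the sketch below lets httr2 percent-encode the query instead of pasting "%2C" and "+" by hand, and returns the first match as a one-row data frame. The helper names build_geocode_request and geocode_one and the user-agent string "example-geocoder" are hypothetical, introduced only for illustration. Only the request builder is called here, so nothing is sent until geocode_one() is used.

```r
library(httr2)

# Hypothetical helper: build (but do not send) a Nominatim search request.
build_geocode_request <- function(address, city, state, zipcode) {
  # Join the parts into one free-form query; req_url_query() encodes it.
  query <- paste(address, city, state, zipcode, sep = ", ")
  request("https://nominatim.openstreetmap.org/search.php") |>
    req_url_query(q = query, format = "jsonv2") |>
    # Placeholder identifier (assumption); use one that names your application.
    req_user_agent("example-geocoder")
}

# Hypothetical helper: perform the request and keep only the first match.
geocode_one <- function(record_id, address, city, state, zipcode) {
  results <- build_geocode_request(address, city, state, zipcode) |>
    req_perform() |>
    resp_body_json()
  # No match: return NA coordinates rather than failing on results[[1]].
  if (length(results) == 0) {
    return(data.frame(record_id = record_id,
                      latitude = NA_real_, longitude = NA_real_))
  }
  data.frame(
    record_id = record_id,
    latitude  = as.numeric(results[[1]]$lat),
    longitude = as.numeric(results[[1]]$lon)
  )
}

# Inspect the generated URL without contacting the server.
req <- build_geocode_request(123, "New York", "NY", "1006")
req$url
```

To geocode several records, geocode_one() could be called once per row and the results combined with rbind(). Pausing between calls (for example with Sys.sleep()) is advisable on the public server.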
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | QHarr |
