Using geocoder.osm to fetch state, county and country and appending each to a pandas DataFrame column
I have a pandas DataFrame with a location column, one location per row. Some entries contain only a country, some only a state, and some are fake, e.g. "Your Mom's Basement". I am interested only in US locations, specifically the county attribute returned by the Nominatim (OSM) geocoder that I use for the search. Not every search returns a county: some locations are only states, e.g. "Texas, US", and fake locations do not provide one either.
I tried to filter the results with the code below and append the state, county and country values to new columns in the DataFrame. However, many of the values I get are clearly wrong: for the entry "Kansas City, MO", for example, I get Puerto Rico as the state. It seems that some values have shifted up relative to the rows they belong to. In this example, a location called "Bayamon, Puerto Rico" appears 4 rows above. The cells seem to have shifted, but I cannot find a clear pattern.
I would very much appreciate any help.
import geocoder
import numpy as np
import pandas as pd
from geopy.geocoders import Nominatim

states = []
counties = []
countries = []

for locat in df["user_location"][0:50]:
    #print(locat)
    try:
        g = geocoder.osm(locat)
        if g.accuracy >= 0.8:
            #try extracting county, state and country objects
            try:
                counties.append(g.county)
                print(g.county)
                if g.county == None:
                    counties.append(np.nan)
            except:
                counties.append(np.nan)
            try:
                states.append(g.state)
                print(g.state)
                if g.state == None:
                    g.state = np.nan
                    states.append(np.nan)
            except:
                states.append(np.nan)
            try:
                countries.append(g.country)
                print(g.country)
                if g.country == None:
                    g.country = np.nan
                    countries.append(np.nan)
            except:
                countries.append(np.nan)
    #Catching fake user_location names, e.g. "Your Mom's Basement"
    except:
        counties.append(np.nan)
        states.append(np.nan)
        countries.append(np.nan)

df["state"] = pd.Series(states)
df["county"] = pd.Series(counties)
df["country"] = pd.Series(countries)
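One possible explanation for the shifted cells is that the three lists do not grow by exactly one element per row. In the loop above, a result whose `g.accuracy` is below 0.8 appends nothing, while an attribute that is `None` appends both `None` and `np.nan`. The lists can therefore drift out of step with the DataFrame rows and with each other. Below is a minimal sketch, not a definitive fix, that produces exactly one (state, county, country) tuple per location and keeps the original index. The helper names `extract_parts` and `geocode_column` and the `lookup` parameter are hypothetical, introduced only for illustration; `lookup` is assumed to be any callable, such as `geocoder.osm`, whose result exposes `accuracy`, `state`, `county` and `country`.

```python
import numpy as np
import pandas as pd


def extract_parts(result, min_accuracy=0.8):
    """Return exactly one (state, county, country) tuple for one result.

    Hypothetical helper: missing or low-accuracy results become NaN
    instead of being skipped, so every row yields one tuple.
    """
    missing = (np.nan, np.nan, np.nan)
    if result is None:
        return missing
    accuracy = getattr(result, "accuracy", None)
    if accuracy is None or accuracy < min_accuracy:
        return missing
    parts = []
    for name in ("state", "county", "country"):
        value = getattr(result, name, None)
        # Append once per attribute: either the value or NaN, never both.
        parts.append(np.nan if value is None else value)
    return tuple(parts)


def geocode_column(locations, lookup, min_accuracy=0.8):
    """Geocode a Series of location strings.

    `lookup` is assumed to be a callable such as geocoder.osm.
    The returned DataFrame reuses the index of `locations`, so it
    lines up with the source rows even if that index is not 0..n-1.
    """
    rows = []
    for loc in locations:
        try:
            result = lookup(loc)
        except Exception:
            # Fake or unparseable locations end up as an all-NaN row.
            result = None
        rows.append(extract_parts(result, min_accuracy))
    return pd.DataFrame(
        rows, columns=["state", "county", "country"], index=locations.index
    )
```

Assuming `df` does not already contain these three columns, the result could be attached with `df = df.join(geocode_column(df["user_location"].iloc[:50], geocoder.osm))`. Because `join` aligns on the index, rows outside the first 50 are simply left as NaN.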
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
