`np.select` with `str.extract` when `str.contains` matches a regex is not working as expected
I am trying to extract the state from an address string. Some of the addresses are Canadian and some are American. I think the regex is correct, but the result is an array of shape (29999, 29999) and I don't understand why.
Here is a sample of the output of `data['Address']`:
19 6349 IN-45, Bloomington, IN 47403
20 ~
21 370 Canyon Meadows Dr SE, Calgary, AB T2J 7C6,...
22 3600 Genesee St, Buffalo, NY 14225
Here is my code:
data['state'] = np.select(
    [
        data['Address'].str.contains(r',(\s.*\s[0-9])'),
        data['Address'].str.contains(r',(\s.*\s[A-Za-z][0-9])'),
    ],
    [
        data['Address'].str.extract(r',(\s.*\s[0-9])'),
        data['Address'].str.extract(r',(\s.*\s[A-Za-z][0-9])'),
    ],
)
Any help appreciated.
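A likely explanation for the (29999, 29999) shape, offered as a sketch and not a confirmed diagnosis: with a single capture group, `str.extract` returns a one-column DataFrame by default (`expand=True`), so each choice has shape (n, 1) while each `str.contains` condition has shape (n,). `np.select` broadcasts the two to (n, n). The example below reproduces this on the sample rows above (row 21 is cut off in the sample, so only its visible part is used), then passes `expand=False` so that conditions and choices are both one-dimensional. The names `us_pat`, `ca_pat`, and `state_pat`, and the stricter `state_pat` regex itself, are assumptions for illustration and are not part of the original code.

```python
import numpy as np
import pandas as pd

# Sample rows from the question; the Calgary row is truncated there,
# so only its visible part is used here.
data = pd.DataFrame({
    "Address": [
        "6349 IN-45, Bloomington, IN 47403",
        "~",
        "370 Canyon Meadows Dr SE, Calgary, AB T2J 7C6",
        "3600 Genesee St, Buffalo, NY 14225",
    ]
})

# The two patterns from the question (variable names are illustrative).
us_pat = r",(\s.*\s[0-9])"
ca_pat = r",(\s.*\s[A-Za-z][0-9])"

# Reproduce the problem: conditions are 1-D, choices are (n, 1) DataFrames.
# pandas may warn here that the pattern contains a match group.
conds = [
    data["Address"].str.contains(us_pat, na=False),
    data["Address"].str.contains(ca_pat, na=False),
]
choices_2d = [
    data["Address"].str.extract(us_pat),  # expand=True by default
    data["Address"].str.extract(ca_pat),
]
broken = np.select(conds, choices_2d)
print(broken.shape)  # (4, 4): (n,) broadcast against (n, 1)

# Fix for the shape: expand=False returns a Series, so everything is 1-D.
# Using notna() on the extracted values also avoids running each regex twice.
us = data["Address"].str.extract(us_pat, expand=False)
ca = data["Address"].str.extract(ca_pat, expand=False)
fixed = np.select([us.notna(), ca.notna()], [us, ca], default=None)
print(fixed.shape)  # (4,)

# Assumed alternative pattern (not from the question): capture a two-letter
# code that is followed by a 5-digit ZIP or a Canadian postal code.
state_pat = r",\s*([A-Z]{2})\s+(?:\d{5}|[A-Z]\d[A-Z]\s?\d[A-Z]\d)"
data["state"] = data["Address"].str.extract(state_pat, expand=False)
print(data["state"].tolist())
```

Fixing the shape does not by itself make the original patterns return only the state, since `.*` is greedy and captures everything up to the last match of the trailing part. With one pattern covering both formats, as in the assumed `state_pat`, `np.select` is not needed at all; whether that pattern fits the full data set depends on how regular the addresses are.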
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
