How to find values repeated more than n times using only numpy?
I am new to numpy and python so please be gentle.
I am working with a CSV file, popularnames.csv, which has several columns. I only want to load column number 3, which is titled 'Popular names in India', and find the names in that column that are repeated more than 10 times. I also want to use only numpy for this and can't find a solution yet.
My code is:
Baby_names=np.genfromtxt('popularnames.csv', delimiter=',', usecols=(3), skip_header=1, dtype=str)
for Baby_names:
if np.unique(Baby_names)>10:
print(Baby_names)
I do understand that this code is wrong, but it is all I could think of with the limited knowledge I have. Any help would be appreciated.
Thanks in advance!
Solution 1:[1]
The syntax for the for loop is wrong.
Try the following code:
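import numpy as np

# usecols is zero-based, so 3 selects the fourth column of the file
baby_names = np.genfromtxt('popularnames.csv', delimiter=',', usecols=(3), skip_header=1, dtype=str)
# np.unique returns (unique_names, counts); zip pairs each name with its count
for name, count in zip(*np.unique(baby_names, return_counts=True)):
    if count > 10:
        print(name)
return_counts=True tells numpy to also return the count for each unique name.
zip pairs each name with its count, which lets us iterate over the two together.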
If you're new to Python, I suggest you continue learning it before using numpy.
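The loop above can also be written without explicit iteration, by using the counts as a boolean mask over the unique names. The following is only a minimal sketch: the small in-memory array and the threshold of 2 are hypothetical stand-ins for the CSV column and the limit of 10 from the question.

```python
import numpy as np

# Hypothetical stand-in for the column loaded from popularnames.csv.
baby_names = np.array(["Aarav", "Diya", "Aarav", "Isha", "Aarav", "Diya"])

# Hypothetical threshold for this tiny sample; the question uses 10.
threshold = 2

# names holds the sorted unique values; counts[i] is how often names[i] occurs.
names, counts = np.unique(baby_names, return_counts=True)

# The comparison yields a boolean array that selects the matching names.
frequent_names = names[counts > threshold]
print(frequent_names)
```

With the real data, baby_names would come from np.genfromtxt as in the answer and threshold would be 10. The result is a numpy array, so it can be used in further array operations instead of only being printed.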
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
