Count word frequency with groupby
I have a CSV file with only one column, tag:
tag
A
B
B
C
C
C
C
When I run groupby to count the word frequency, the output does not include the frequency number:
#!/usr/bin/env python3
import pandas as pd

def count(fname):
    df = pd.read_csv(fname)
    print(df)
    dfg = df.groupby('tag').count().reset_index()
    print(dfg)
    return

count("save.txt")
The output has no frequency column:
tag
0 A
1 B
2 B
3 C
4 C
5 C
6 C
tag
0 A
1 B
2 C
Expected output:
tag freq
0 A 1
1 B 2
2 C 4
Solution 1:[1]
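Use value_counts() instead of count(). count() reports the number of non-null values in each non-grouping column, and this DataFrame has no column other than tag, so the result has nothing to show.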
df.groupby("tag").value_counts().reset_index().rename(columns={0: "freq"})
outputs:
tag freq
0 A 1
1 B 2
2 C 4
To sort in descending order:
df.groupby("tag").value_counts().reset_index().rename(columns={0: "freq"}).sort_values(
by="freq", ascending=False
)
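As a cross-check, here is a minimal sketch of the same idea using size(), which counts rows per group and lets you name the result column directly. The column label that value_counts() produces has changed between pandas versions, so the rename to "freq" above may need adjusting for your install. The in-memory CSV text is a hypothetical stand-in for the save.txt file in the question, and the variable names are only for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the question's save.txt (same seven rows).
csv_text = "tag\nA\nB\nB\nC\nC\nC\nC\n"
df = pd.read_csv(io.StringIO(csv_text))

# count() tallies non-null values in the non-grouping columns.
# There are none here, so the result has three rows and zero columns.
empty = df.groupby("tag").count()
print(empty.shape)

# size() counts the rows in each group; reset_index(name=...) turns the
# resulting Series into a DataFrame and names the count column.
freq = df.groupby("tag").size().reset_index(name="freq")
print(freq)

# Optional: most frequent tag first, with a fresh 0..n-1 index.
freq_sorted = freq.sort_values(by="freq", ascending=False).reset_index(drop=True)
print(freq_sorted)
```

The size() call does not depend on which other columns the DataFrame has, which is why it still works when tag is the only column.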
Solution 2:[2]
Looks close to me, per my comment:
df = pd.DataFrame({'tag': ['A', 'B', 'B', 'C', 'C', 'C', 'C']})
df.groupby(['tag'], as_index=False).agg(freq=('tag', 'count'))
Solution 3:[3]
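You could also count the values directly with value_counts(). It returns a Series sorted by frequency, so name the index and reset it to get a DataFrame (a helper column such as df['freq'] = 1 is not needed for this):
Input:
df = df['tag'].value_counts().rename_axis('tag').reset_index(name='freq')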
Output:
tag freq
0 C 4
1 B 2
2 A 1
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Simon |
| Solution 3 | tylerjames |
