'Append column of arrays in Pandas
I have a dataframe of arrays such as:
| A | B | C |
|:---- |:------:| -----:|
| [0,1,2,3] | [1,2,5,6] | [0,1,4,5] |
| [0,0,6,3] | [0,2,0,4] | [3,8,7,1] |
| [1,0,2,1] | [0,0,2,0] | [1,2,1,1] |
I want to convert to get the unique and count values of the whole dataframe.
| value | count |
|------ |------|
|0| 9|
|1| 9|
|2| 6|
|3| 3|
|4| 2|
|5| 2|
|6| 2|
|7| 1|
|8| 1|
Any idea? Thank you!
Solution 1:[1]
One option is to use pd.DataFrame.explode to expand arrays, combine columns into one with pd.concat and then count the values with pd.Series.value_counts.
import pandas as pd
import numpy as np
df = pd.DataFrame(
{
'A': [np.random.randint(9, size=5)] * 5,
'B': [np.random.randint(9, size=5)] * 5,
'C': [np.random.randint(9, size=5)] * 5
}
)
df = df.explode(df.columns)
pd.concat([df[col] for col in df.columns]).value_counts()
Note: you can only run df.explode with a list of columns if you're using pandas >= 1.3.0.
Solution 2:[2]
You can flatten the dataframe values to 1D array and then use np.unique to count value count
val, cnt = np.unique(np.array(df.values.tolist()).flatten(), return_counts=True)
out = pd.DataFrame({'value': val, 'count': cnt})
print(out)
value count
0 0 10
1 1 9
2 2 6
3 3 3
4 4 2
5 5 2
6 6 2
7 7 1
8 8 1
Solution 3:[3]
import pandas as pd
import numpy as np
a = [[0,1,2,3],[0,0,6,3],[1,0,2,1]]
b = [[1,2,5,6],[0,2,0,4],[0,0,2,0]]
c = [[0,1,4,5],[3,8,7,1],[1,2,1,1]]
df = pd.DataFrame(
{'a': a,
'b': b,
'c': c
})
pd.DataFrame(np.concatenate(df.values)).\
explode(0)[0].\
value_counts().\
to_frame().\
sort_index().\
reset_index().\
rename(columns={"index": "value", 0: "count"})
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | user3471881 |
| Solution 2 | Ynjxsjmh |
| Solution 3 |

