Create new column by applying a function (TypeError: unhashable type: 'list')
df['identities']
0        [93, 94, 127, 112, 93, 94, 127, 112, 20, 68, 6...
1        [30, 30, 30, 30, 30, 30, 96, 30, 30, 30, 30, 3...
2        [13, 15, 16, 13, 15, 16, 78, 13, 15, 16, 13, 1...
3        [70, 90, 70, 90, 70, 90, 70, 90, 25, 92, 49, 5...
4        [62, 13, 15, 16, 13, 15, 16, 13, 15, 16, 13, 1...
                               ...
10695    [37, 39, 78, 29, 67, 74, 119, 36, 36, 78, 35, ...
10696    [13, 15, 16, 70, 90, 13, 15, 16, 13, 15, 16, 1...
10697    [37, 39, 37, 39, 95, 95, 37, 39, 37, 39, 37, 3...
10698    [36, 36, 35, 132, 17, 133, 109, 29, 67, 74, 11...
10699    [35, 132, 17, 133, 109, 35, 132, 17, 133, 109,...
Name: identities, Length: 10700, dtype: object
def top_k_frequent(nums, k):
    cnt = {}
    for n in nums:
        cnt[n] = cnt.get(n, 0) + 1

    bucket = [[] for _ in range(len(nums)+1)]
    for key, val in cnt.items():
        bucket[val].append(key)

    res = []
    for i in reversed(range(len(bucket))):
        if bucket[i]:
            res.extend(bucket[i])
            if len(res) >= k:
                break
    return res[:k]
df['identities']=df['identities'].apply(top_k_frequent(nums = df['identities'],k= 4))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10636/1247276605.py in <module>
----> 1 df['identities']=df['identities'].apply(top_k_frequent(nums = df['identities'],k= 4))
~\AppData\Local\Temp/ipykernel_10636/1426526573.py in top_k_frequent(nums, k)
2 cnt = {}
3 for n in nums:
----> 4 cnt[n] = cnt.get(n, 0) + 1
5
6 bucket = [[] for _ in range(len(nums)+1)]
TypeError: unhashable type: 'list'
Solution 1:[1]
The function works when you give it a single row of the dataframe, because each key added to the dictionary is then an int rather than a list of ints. Python raises TypeError: unhashable type: 'list' whenever you try to use a list as a dictionary key. In your case, top_k_frequent(nums = df['identities'], k = 4) runs on the whole column before apply is ever called, so every n in the loop is one row's list, and cnt[n] fails. I'm not totally sure what you're trying to do in your code, but if you want to use the whole sequence of numbers in each row of the 'identities' column as a key in cnt, you'll have to convert it with Python's tuple function first. If you want each int in that list to be a key instead, you'll have to iterate over the list contained in each row, for example by passing the function itself to apply so that it is called once per row.
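Both options can be sketched as follows. This is only an illustration under some assumptions: the two-row dataframe is a made-up stand-in for the real data, the column name top_identities and the dictionary sequence_counts are hypothetical, and k is 2 rather than 4 so the result stays readable.

```python
import pandas as pd


def top_k_frequent(nums, k):
    """Return the k most frequent values in nums, most frequent first."""
    cnt = {}
    for n in nums:
        cnt[n] = cnt.get(n, 0) + 1

    # bucket[c] holds every value that occurs exactly c times.
    bucket = [[] for _ in range(len(nums) + 1)]
    for key, val in cnt.items():
        bucket[val].append(key)

    res = []
    for i in reversed(range(len(bucket))):
        if bucket[i]:
            res.extend(bucket[i])
            if len(res) >= k:
                break
    return res[:k]


# Hypothetical sample data standing in for the real 'identities' column.
df = pd.DataFrame(
    {"identities": [[30, 30, 96, 30, 13, 13], [70, 90, 70, 90, 70, 25]]}
)

# Option 1: pass the function itself (not its result) to apply.
# apply calls it once per row, so nums is a single list of ints.
df["top_identities"] = df["identities"].apply(top_k_frequent, k=2)

# Option 2: use each whole row as a dictionary key by converting it
# to a tuple, which is hashable (a list is not).
sequence_counts = {}
for row in df["identities"]:
    key = tuple(row)
    sequence_counts[key] = sequence_counts.get(key, 0) + 1

print(df["top_identities"].tolist())
```

Writing the result to a new column, rather than overwriting 'identities', keeps the original lists available if the call needs to be rerun with a different k.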
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
