Group answers & index based on Questions in python
I have a dataframe for analysis purposes, and I need to create a list of dictionaries like this:
TARGET OUTPUT
[
{ 'is my anti hiv test conclusive or--Bla bla': [0, 1, 2] },
{'I have some hip pain 9 weeks--bla bla': [3, 4, 5, 6]}
]
Here each list holds the indices of the answers, not the actual answers.
The obvious method is to use groupby, but I am running into some errors.
I tried printing the result before converting it to a list, and it seems fine.
Can you please help me figure out the correct syntax so I can get my target output?
Dataset link. If somebody needs the shared notebook link, let me know in the comments.
Solution 1:[1]
You need to actually select the column ("index") whose values you want to appear in the list:
df_ans = data.groupby(["question_text"])["index"].apply(list).to_dict()
instead of
df_ans = data.groupby(["question_text"]).apply(list).to_dict()
Otherwise you get a list of the column names for each group, as in your example. That's what happens when you convert a DataFrame to a list: list(data) gives you the same as list(data.columns).
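As a minimal sketch of the difference, assuming a small made-up frame in place of the real dataset (the rows, the short question labels, and the "answer_text" column are hypothetical; only "question_text" and "index" come from the code above):

```python
import pandas as pd

# Hypothetical stand-in for the asker's data; only the column names
# "question_text" and "index" come from the answer above.
data = pd.DataFrame({
    "index": [0, 1, 2, 3, 4, 5, 6],
    "question_text": ["q_hiv"] * 3 + ["q_hip"] * 4,
    "answer_text": [f"answer {i}" for i in range(7)],
})

# Selecting "index" before apply(list) collects that column's values per group.
# sort=False keeps the questions in order of first appearance.
grouped = data.groupby("question_text", sort=False)["index"].apply(list)
as_dict = grouped.to_dict()

# to_dict() returns one dictionary; the target output is a list of
# single-key dictionaries, which a comprehension can build from the Series.
as_list_of_dicts = [{question: indices} for question, indices in grouped.items()]

# Converting a DataFrame to a list iterates over its column labels,
# which is why the version without ["index"] produced column names.
columns_only = list(data)

print(as_list_of_dicts)
print(columns_only)
```

Note that to_dict() on its own yields a single dictionary keyed by question; the list comprehension is one possible way to reach the list-of-dictionaries shape shown in the question.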
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow