Collapse duplicate index values and collect the affected values into a list per cell (Python pandas)

This is a bit awkward to describe. Basically, I have this initial dataframe:

test_df
Out[149]: 
                           value
timestamp                       
2019-01-01 00:00:00+00:00  0.640
2019-01-01 01:00:00+00:00  0.224
2019-01-01 02:00:00+00:00  0.320
2019-01-01 03:00:00+00:00  0.304
2019-01-01 04:00:00+00:00  0.736
                         ...
2019-12-30 19:00:00+00:00  0.704
2019-12-30 20:00:00+00:00  0.272
2019-12-30 21:00:00+00:00  0.288
2019-12-30 22:00:00+00:00  0.272
2019-12-30 23:00:00+00:00  0.496

[8736 rows x 1 columns]

Then, based on the timestamp index, I create a new column (timestamp_type) that holds these attributes (hour, weekday, month):

                           value timestamp_type
timestamp                                      
2019-01-01 00:00:00+00:00  0.640          0,1,1
2019-01-01 01:00:00+00:00  0.224          1,1,1
2019-01-01 02:00:00+00:00  0.320          2,1,1
2019-01-01 03:00:00+00:00  0.304          3,1,1
2019-01-01 04:00:00+00:00  0.736          4,1,1
                         ...            ...
2019-12-30 19:00:00+00:00  0.704        19,0,12
2019-12-30 20:00:00+00:00  0.272        20,0,12
2019-12-30 21:00:00+00:00  0.288        21,0,12
2019-12-30 22:00:00+00:00  0.272        22,0,12
2019-12-30 23:00:00+00:00  0.496        23,0,12
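
For reference, one way such a column could be derived from the index is sketched below. This is only an assumption about how the column was built: the random values are placeholders for the real data, and the weekday is taken from `Timestamp.weekday()` (Monday = 0), which matches the sample rows above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test_df: hourly UTC timestamps with placeholder values.
idx = pd.date_range("2019-01-01", "2019-12-30 23:00", freq="h", tz="UTC", name="timestamp")
rng = np.random.default_rng(0)
test_df = pd.DataFrame({"value": rng.random(len(idx))}, index=idx)

# Build "hour,weekday,month" strings from each timestamp in the index.
test_df["timestamp_type"] = [
    f"{ts.hour},{ts.weekday()},{ts.month}" for ts in test_df.index
]

print(test_df.head())
```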

Now I would like the timestamp_type column to be the index. Since a year usually contains four (or five) data points with the same (hour, weekday, month) attributes, I don't need the same index value to appear four times. Instead, I want to put these four or five values in a list that becomes the value of the cell in that dataframe.

So the goal is to get something that looks like this:

                  values 
timestamp_type
0,1,1             [somevalue, somevalue, somevalue, somevalue]       
1,1,1             [somevalue, somevalue, somevalue, somevalue, somevalue]         
2,1,1             [somevalue, somevalue, somevalue, somevalue, somevalue]          
  ...

I hope I explained the issue well enough. I have gone through the pandas docs but couldn't find anything on this. Any input is greatly appreciated!



Solution 1:[1]

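Figured it out: group by timestamp_type and collect each group's values into a list. Note that the result is a Series indexed by timestamp_type, not a DataFrame: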

df2 = test_df.groupby('timestamp_type')['value'].apply(list)

df2
Out[33]: 
timestamp_type
0,0,1              [0.784, 0.8, 0.352, 0.784]
0,0,10           [0.336, 0.608, 0.624, 0.336]
0,0,11            [0.752, 0.32, 0.736, 0.512]
0,0,12     [0.72, 0.768, 0.752, 0.624, 0.608]
0,0,2              [0.368, 0.352, 0.8, 0.352]
                
9,6,5            [2.432, 0.272, 2.528, 2.432]
9,6,6      [2.432, 0.224, 2.256, 2.256, 2.64]
9,6,7              [2.336, 0.24, 0.144, 0.56]
9,6,8            [0.784, 0.688, 0.736, 0.704]
9,6,9     [0.576, 2.784, 0.672, 0.576, 2.992]
Name: value, Length: 2016, dtype: object
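
To get the DataFrame shape shown in the question (a single `values` column), the grouped Series can be converted with `to_frame`. Below is a minimal, self-contained sketch; the four-row frame is made-up sample data, not the real dataset.

```python
import pandas as pd

# Made-up sample data: two timestamp_type keys, each occurring twice.
test_df = pd.DataFrame(
    {
        "value": [0.640, 0.224, 0.320, 0.304],
        "timestamp_type": ["0,1,1", "1,1,1", "0,1,1", "1,1,1"],
    }
)

# Collect the values of each group into a list (yields a Series of lists).
grouped = test_df.groupby("timestamp_type")["value"].apply(list)

# Turn the Series into a one-column DataFrame named "values".
result = grouped.to_frame(name="values")

print(result)
```

Because the keys are strings, `groupby` sorts them lexicographically, which is why `0,0,10` appears before `0,0,2` in the output above.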

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Stefan 44