Store series with different length in for loop
The original df is like below:
Hour Count
0 15
0 0
0 0
0 17
0 18
0 12
1 55
1 0
1 0
1 0
1 53
1 51
...
I was looping through this df hour by hour, removing the rows where Count=0 in that hour, and then drawing a boxplot of Count for that hour. I ended up with 24 graphs.
Can I put those 24 boxplots on the same graph while looping? For example, by building an output df2 like the one below and calling plt.boxplot(df2), though I'm not sure whether the NaN will cause an error.
Hour=0 Hour=1 ...
0 15 55
1 17 53
2 18 51
3 12 NaN
Another thing is that after removing the zeros, each hour will have a different number of Count values. How can I append this data to get a df2 like the one above?
You can use the code below to create the original df:
import pandas as pd

df = pd.DataFrame({
    'Hour': {0:1, 1:1, 2:1, 3:1, 4:1, 5:1, 6:2, 7:2, 8:2, 9:2, 10:2, 11:2},
    'Count': {0:15, 1:0, 2:0, 3:17, 4:18, 5:12, 6:55, 7:0, 8:0, 9:0, 10:53, 11:51}})
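Here is the code for making hourly boxplots:
import matplotlib.pyplot as plt

# Loop over the hours actually present in df (1 and 2 in the sample above)
for i in df['Hour'].unique():
    table1 = df[df['Hour'] == i]            # rows for this hour
    table2 = table1[table1['Count'] != 0]   # drop the zero counts
    fig = plt.figure(1, figsize=(9, 6))
    plt.boxplot(table2['Count'])
    plt.show()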
Solution 1:[1]
One option is to pivot the filtered DataFrame and plot the boxplot:
df.query('Count != 0').assign(i=lambda x: x.groupby('Hour').cumcount()).pivot(index='i', columns='Hour', values='Count').boxplot()
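The one-liner above can be unpacked into separate steps, which also shows where the NaN padding comes from. This is only a sketch built on the sample df from the question; the helper names to_wide and plot_hourly_boxplots are made up for illustration and are not part of pandas or matplotlib.

```python
import pandas as pd

# Sample data from the question (hours 1 and 2).
df = pd.DataFrame({
    'Hour':  [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'Count': [15, 0, 0, 17, 18, 12, 55, 0, 0, 0, 53, 51],
})


def to_wide(df):
    """Hypothetical helper: one column per hour with that hour's non-zero counts."""
    nonzero = df[df['Count'] != 0]
    # Number the rows within each hour (0, 1, 2, ...) to get a shared row index.
    nonzero = nonzero.assign(i=nonzero.groupby('Hour').cumcount())
    # Hours with fewer non-zero rows are padded with NaN by pivot.
    return nonzero.pivot(index='i', columns='Hour', values='Count')


def plot_hourly_boxplots(wide):
    """Hypothetical helper: draw one box per hour on a single set of axes."""
    import matplotlib.pyplot as plt
    ax = wide.boxplot()  # DataFrame.boxplot skips the NaN padding in each column
    ax.set_xlabel('Hour')
    ax.set_ylabel('Count')
    plt.show()


wide = to_wide(df)
print(wide)
```

Calling plot_hourly_boxplots(wide) then puts all hours on one graph. On the NaN concern: DataFrame.boxplot ignores missing values per column, whereas passing a NaN-padded table straight to plt.boxplot may leave the affected boxes empty. If you prefer plt.boxplot, one option is to pass a list with one NaN-free array per hour, since the arrays in that list may have different lengths.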
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Stack Overflow |

