Splitting dataframe with a specific rule at a specific row, in a loop
Given any df with only 3 columns and n rows, I'm trying to split it horizontally, in a loop, at the row where the value in a given column is at its maximum.
Something close to what np.array_split() does, but not necessarily into equal sizes. The cut would have to be at the row holding the value determined by the max rule at that point in the loop. I imagine whether the cut lands just above or just below that row is not the hard part.
An example (sorry, it's my first time actually asking a question, and I don't know how to format code here yet):
df = pd.DataFrame({'a': [3,1,5,5,4,4], 'b': [1,7,1,2,5,5], 'c': [2,4,1,3,2,2]})
This df, with the max value condition applied on column b (7), would be cut into one df with 2 rows and another with 4 rows.
Solution 1:[1]
Perhaps this might help you. Assume our n by 3 dataframe is as follows:
import pandas as pd
df = pd.DataFrame({'a': [1,2,3,4], 'b': [4,3,2,1], 'c': [2,4,4,3]})
>>> df
   a  b  c
0  1  4  2
1  2  3  4
2  3  2  4
3  4  1  3
We can build a list that holds, for each column, the rows where that column's max value occurs.
rows = [df[df[col] == df[col].max()] for col in df.columns]
>>> rows[0]
   a  b  c
3  4  1  3
>>> rows[2]
   a  b  c
1  2  3  4
2  3  2  4
This can also be written as a list of indexes if preferred.
indexes = [i.index for i in rows]
>>> indexes
[Int64Index([3], dtype='int64'), Int64Index([0], dtype='int64'), Int64Index([1, 2], dtype='int64')]
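If the goal is to actually cut the dataframe at that row, a minimal sketch could look like the following. It uses the question's example df and makes a few assumptions: the cut is placed just after the first row holding the max (so that row stays in the upper piece), and "on loop" is read as splitting the remaining lower piece again until nothing is left. The function names split_at_max and split_repeatedly are made up for illustration.

```python
import pandas as pd

def split_at_max(df, col):
    """Split df into (head, tail); head ends at the first row where col is max."""
    # Positional index of the first occurrence of the column maximum.
    pos = df[col].to_numpy().argmax()
    # Assumption: the max row belongs to the upper piece.
    return df.iloc[:pos + 1], df.iloc[pos + 1:]

def split_repeatedly(df, col):
    """Keep splitting the remaining tail at its own max until it is empty."""
    pieces = []
    rest = df
    while not rest.empty:
        head, rest = split_at_max(rest, col)
        pieces.append(head)
    return pieces

df = pd.DataFrame({'a': [3,1,5,5,4,4], 'b': [1,7,1,2,5,5], 'c': [2,4,1,3,2,2]})

# Single cut on column b: a 2-row piece and a 4-row piece.
head, tail = split_at_max(df, 'b')

# Repeated cuts: each piece ends at the max of what was left at that step.
pieces = split_repeatedly(df, 'b')
```

Working on positions (iloc) rather than labels keeps this usable when the index is not a plain 0..n-1 range. If the max row should go to the lower piece instead, slice with pos rather than pos + 1, but the loop then needs a guard so it does not stall when the max sits in the first row.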
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | user16376004 |
