Splitting a dataframe with a specific rule at a specific row, in a loop

Given any df with only 3 columns and n rows, I'm trying to split it horizontally, in a loop, at the position where the value in a column is at its max.

Something close to what np.array_split() does, but not necessarily into equal sizes. The cut would have to be at the row holding the value determined by the max rule at that point in the loop. I imagine whether the cut falls just over or just under that row is not the harder part.

An example (sorry, it's my first time actually asking a question, and I don't know how to format code here yet):

    df = pd.DataFrame({'a': [3,1,5,5,4,4], 'b': [1,7,1,2,5,5], 'c': [2,4,1,3,2,2]})

This df, with the max value condition applied to column b (7), would be cut into a 2-row df and another with 4 rows.



Solution 1:[1]

Perhaps this might help you. Assume our n by 3 dataframe is as follows:

    import pandas as pd

    df = pd.DataFrame({'a': [1,2,3,4], 'b': [4,3,2,1], 'c': [2,4,4,3]})
    >>> df
       a  b  c
    0  1  4  2
    1  2  3  4
    2  3  2  4
    3  4  1  3

We can create a list of rows where max values occur for each column.

    rows = [df[df[col] == df[col].max()] for col in df.columns]

    >>> rows[0]
       a  b  c
    3  4  1  3

    >>> rows[2]
       a  b  c
    1  2  3  4
    2  3  2  4
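
If only the first row holding each column's maximum is needed, `DataFrame.idxmax` gives its index label directly. A small sketch, assuming the values from the printed table above (the variable names `first_max` and `c_label` are just for illustration):

```python
import pandas as pd

# Same values as the printed table above.
df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [4, 3, 2, 1], 'c': [2, 4, 4, 3]})

# Index label of the first maximum in each column; ties resolve to the first row.
first_max = df.idxmax()

# Label for a single column, here 'c'.
c_label = df['c'].idxmax()
```

Note that, unlike the boolean filter above, this keeps only one row per column when the maximum is tied.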

The `rows` list can also be written as a list of indexes if preferred.

    indexes = [i.index for i in rows]
    >>> indexes
    [Int64Index([3], dtype='int64'), Int64Index([0], dtype='int64'), Int64Index([1, 2], dtype='int64')]
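
To get from the position of the max to the split asked about in the question, one option is to cut positionally with `iloc` right after the row holding the maximum. This is only a minimal sketch under a few assumptions: the helper names `split_at_max` and `split_repeatedly` are hypothetical, the cut is assumed to fall just after the max row (which gives the 2-row and 4-row pieces from the question), and "on loop" is assumed to mean cutting the remaining rows again at their own max until nothing is left.

```python
import pandas as pd


def split_at_max(df, col):
    """Cut df in two right after the first row where df[col] is at its max.

    Hypothetical helper; the cut is positional, so any index works.
    """
    # Position (not label) of the first maximum in the column.
    pos = df[col].to_numpy().argmax()
    return df.iloc[: pos + 1], df.iloc[pos + 1:]


def split_repeatedly(df, col):
    """Keep cutting the remainder at its own max until no rows are left.

    This is one assumed reading of "on loop"; change the stop
    condition if a different rule is wanted.
    """
    pieces = []
    rest = df
    while len(rest) > 0:
        head, rest = split_at_max(rest, col)
        pieces.append(head)
    return pieces


# The dataframe from the question.
df = pd.DataFrame({'a': [3, 1, 5, 5, 4, 4],
                   'b': [1, 7, 1, 2, 5, 5],
                   'c': [2, 4, 1, 3, 2, 2]})

top, bottom = split_at_max(df, 'b')    # single cut at the max of column b
pieces = split_repeatedly(df, 'b')     # repeated cuts on what remains
```

For the question's df, the single cut gives pieces of 2 and 4 rows, and the repeated version gives pieces of 2, 3 and 1 rows. To cut before the max row instead, use `pos` rather than `pos + 1` in the two slices, but then guard against an empty first piece when the max sits in the first row.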

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 user16376004