Select only records where the cumulative sum of VolumePred is less than 450 [duplicate]

I have a DataFrame:

                        ConversionPred  VolumePred
    OSBrowser PageId
    (11, 16)  955764           88273.0    125110.0
              955761           78408.0    104703.0
              1184903          57702.0    118085.0
              955767           49224.0     68942.0
              1149586          36405.0     53582.0
    ...       ...                  ...         ...
    (32, 16)  899748               0.0         4.0
    (11, 15)  835198               0.0         4.0
    (32, 16)  955761               0.0       151.0

For each OSBrowser group, I need to select only the records where the cumulative sum of VolumePred is less than 450.

I tried this code:

    subdata.loc[subdata['VolumePred'].cumsum() < 450, :]

But it didn't work. I got this result:

                  ConversionPred  VolumePred
OSBrowser PageId                            
(11, 11)  789615            15.0        20.0
          923645             8.0        36.0

I don't understand why only these two rows are selected. Why are the following rows not selected as well?

    (32, 16)    899748  0.0 4.0
    (11, 15)    835198  0.0 4.0
    (32, 16)    955761  0.0 151.0



Solution 1:[1]

IIUC, try:

output = subdata[subdata.groupby(level=0).transform("cumsum").lt(450)]
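To illustrate the difference, here is a minimal sketch on a small hypothetical sample built from a few of the rows in the question. The OSBrowser labels are plain strings here, which is an assumption about the real data. Assuming VolumePred is never negative, a cumulative sum over the whole frame can only grow, so once it passes 450 every later row is excluded, which would explain the two-row result. Note also that indexing with a boolean DataFrame keeps every row and fills the non-matching cells with NaN; to drop rows instead, one option is to build a boolean Series from the per-group cumulative sum of VolumePred alone:

```python
import pandas as pd

# Hypothetical sample modelled on the question. The OSBrowser labels are
# plain strings here (an assumption); only a few rows are reproduced.
index = pd.MultiIndex.from_tuples(
    [
        ("(11, 11)", 789615),
        ("(11, 11)", 923645),
        ("(11, 16)", 955764),
        ("(32, 16)", 899748),
        ("(11, 15)", 835198),
        ("(32, 16)", 955761),
    ],
    names=["OSBrowser", "PageId"],
)
subdata = pd.DataFrame(
    {
        "ConversionPred": [15.0, 8.0, 88273.0, 0.0, 0.0, 0.0],
        "VolumePred": [20.0, 36.0, 125110.0, 4.0, 4.0, 151.0],
    },
    index=index,
)

# Attempt from the question: a single running total over the whole frame.
# It passes 450 at the third row and never drops back below it.
global_pick = subdata.loc[subdata["VolumePred"].cumsum() < 450]

# Running total restarted for each OSBrowser group, on VolumePred only.
group_cumsum = subdata.groupby(level="OSBrowser")["VolumePred"].cumsum()

# A boolean Series selects whole rows instead of masking single cells.
per_group_pick = subdata[group_cumsum.lt(450)]

print(global_pick)
print(per_group_pick)
```

On this sample, the per-group version keeps every row except the (11, 16) one, whose VolumePred alone already exceeds 450.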

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 not_speshal