'How to create a new column of ranges (about 50 ranges) based on another column

I have a data frame which essentially looks like this:

number value
200 0
201 1
202 2
.. ..
399 3
400 4

What I want to do is to create a new column which has the range of 3 consecutive numbers:

number value range
200 0 200 - 202
201 1 200 - 202
202 2 200 - 202
.. .. ..
399 3 398 - 400
400 4 398 - 400

One thing I can do is to create my own function and write if statements like this:

def function(number):

    if number < 203 & number > 199:
        return "200-202"
    elif number < 206 & number > 202:
        return "203-205"
    ....
    and so on

But this would require I write about 70 if statements. I'm sure there is an easier way to do this. Can someone please guide me?



Solution 1:[1]

You can determine the range from the number itself.

Assuming you want to start on the first value and use ranges of n=3, you can use:

n = 3
first = df['number'].iloc[0]   # initial value (could be set to 0 to have fixed ranges)
start = (df['number']
         .sub(first).floordiv(n)
         .mul(n).add(first)
         )

df['range'] = start.astype(str)+'-'+start.add(n-1).astype(str)

Output:

   number  value    range
0     200      0  200-202
1     201      1  200-202
2     202      2  200-202
3     399      3  398-400
4     400      4  398-400

Solution 2:[2]

You should use pd.cut

import pandas as pd
import numpy as np
x = pd.DataFrame({
    'a': range(200,400),
    'b': np.random.randint(0,10,200)
})
wks = 2
x.loc[:,'range'] = pd.cut(x.a, bins=range(x.a.min(), x.a.max()+wks, wks), right=False)
with pd.option_context('display.max_rows',5):
    display(x)

Output:

    a   b   range
0   200 7   [200, 202)
1   201 6   [200, 202)
... ... ... ...
198 398 1   [398, 400)
199 399 3   [398, 400)

After which I presume you want to do something like:

with pd.option_context('display.max_rows',5):
    display(x.groupby('range').b.sum())

Output:

range
[200, 202)    13
[202, 204)     6
              ..
[396, 398)     6
[398, 400)     4
Name: b, Length: 100, dtype: int32

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway
Solution 2 scign