Using row index number to calculate values

I'm having trouble using the row number as an index. For example, I want a new column, sales_next4, that gives the total sales over a 4-day window: each day plus the 3 days that follow it.

The problem with my code is that I don't know how to use row_number() as the index; what I'm doing just fetches the actual value of the column.

# Here is the code to create the data
df <- read.table(text = "day    price   price_change    sales   High_sales_ind
1 5 0   12  1
2   5   0   6   0
3   5   0   5   0
4   5   0   4   0
5   5   0   10  1
6   5   0   10  1
7   5   0   10  1
8   5   0   12  1
9   5   0   14  1
10  7   2   3   0
11  7   0   2   0", header = TRUE)


# My code

df<- df %>% mutate(sales_next4 = sales[row_number():sales_rownumber()+4)

What I need:

day	price	price_change	sales	High_sales_ind	sales_next4
1	5	0	12	1	27
2	5	0	6	0	25
3	5	0	5	0	29
4	5	0	4	0	34
5	5	0	10	1	42
6	5	0	10	1	46
7	5	0	10	1	39
8	5	0	12	1	31
9	5	0	14	1	19
10	7	2	3	0	5
11	7	0	2	0	2

Any help would be appreciated.

r dataframe dplyr tidyverse

Solution 1:^[1]

You can use rollapply from the zoo package for cases like this, assuming that the days are consecutive as in the example data provided.

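You'll need the partial = and align = arguments to fill the column correctly: align = "left" starts each window at the current row, and partial = TRUE allows the shorter windows at the end of the data. See ?rollapply for the details.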

library(dplyr)
library(zoo)

df <- df %>% 
  mutate(sales_next4 = rollapply(sales, 4, sum, partial = TRUE, align = "left"))

Result:

   day price price_change sales High_sales_ind sales_next4
1    1     5            0    12              1          27
2    2     5            0     6              0          25
3    3     5            0     5              0          29
4    4     5            0     4              0          34
5    5     5            0    10              1          42
6    6     5            0    10              1          46
7    7     5            0    10              1          39
8    8     5            0    12              1          31
9    9     5            0    14              1          19
10  10     7            2     3              0           5
11  11     7            0     2              0           2
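
To see what the two arguments contribute, here is a small sketch on the sales vector from the question alone. It assumes zoo is installed; the names strict_sums and partial_sums are illustrative only and not part of the original answer.

```r
library(zoo)

# Sales values copied from the question's data
sales <- c(12, 6, 5, 4, 10, 10, 10, 12, 14, 3, 2)

# Full windows only: the last 3 rows have fewer than 4 values left,
# so fill = NA marks them as missing.
strict_sums <- rollapply(sales, 4, sum, align = "left", fill = NA)

# partial = TRUE sums whatever values remain in the shrinking windows.
partial_sums <- rollapply(sales, 4, sum, align = "left", partial = TRUE)
```

The two results should agree on the first 8 rows and differ only in the last 3, where strict_sums is NA.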

Solution 2:^[2]

You can use map_dbl() from purrr to compute a rolling sum based on the values in the day column rather than on row positions.

library(dplyr)
library(purrr)

df %>%
  mutate(sales_next4 = map_dbl(day, ~ sum(sales[between(day, .x, .x+3)])))

#    day price price_change sales High_sales_ind sales_next4
# 1    1     5            0    12              1          27
# 2    2     5            0     6              0          25
# 3    3     5            0     5              0          29
# 4    4     5            0     4              0          34
# 5    5     5            0    10              1          42
# 6    6     5            0    10              1          46
# 7    7     5            0    10              1          39
# 8    8     5            0    12              1          31
# 9    9     5            0    14              1          19
# 10  10     7            2     3              0           5
# 11  11     7            0     2              0           2
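
Because this approach matches on day values, it should also cope with missing days. A minimal base R sketch of the same idea, using vapply() in place of map_dbl() and a made-up data set with gaps (the data frame gappy and its values are hypothetical, not from the question):

```r
# Hypothetical data: days 3, 6, 7 and 8 are missing
gappy <- data.frame(
  day   = c(1, 2, 4, 5, 9),
  sales = c(10, 20, 30, 40, 50)
)

# For each day d, sum the sales of every row whose day lies in [d, d + 3].
gappy$sales_next4 <- vapply(
  gappy$day,
  function(d) sum(gappy$sales[gappy$day >= d & gappy$day <= d + 3]),
  numeric(1)
)
```

For day 1 the window covers days 1 to 4, so only the rows for days 1, 2 and 4 contribute. A row-based rolling window would instead pull in day 5.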

Solution 3:^[3]

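Using slide_dbl() from the slider package. Note that this indexes sales by the values in day, so it relies on day matching the row number, as it does in the example data.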

library(dplyr)
library(slider)

df %>% 
  mutate(sales_next4 = slide_dbl(day, ~ sum(sales[.x]), .after = 3))

Result:

   day price price_change sales High_sales_ind sales_next4
1    1     5            0    12              1          27
2    2     5            0     6              0          25
3    3     5            0     5              0          29
4    4     5            0     4              0          34
5    5     5            0    10              1          42
6    6     5            0    10              1          46
7    7     5            0    10              1          39
8    8     5            0    12              1          31
9    9     5            0    14              1          19
10  10     7            2     3              0           5
11  11     7            0     2              0           2
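
As a variation on the slider call above, here is a sketch that slides over the sales values directly and also shows what each window contains. It assumes slider is installed; the name windows is illustrative only.

```r
library(slider)

# Sales values copied from the question's data
sales <- c(12, 6, 5, 4, 10, 10, 10, 12, 14, 3, 2)

# Each window holds the current element plus up to 3 elements after it.
windows <- slide(sales, ~ .x, .after = 3)
windows[[1]]    # first window: elements 1 to 4
windows[[11]]   # last window: only element 11 remains

# Sum each window directly, without indexing sales by day.
sales_next4 <- slide_dbl(sales, sum, .after = 3)
```

Sliding over sales itself means the result no longer depends on day being equal to the row number.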

Solution 4:^[4]

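You can use Reduce() and data.table::shift(). Negative shifts act as leads, and padding sales with three zeros keeps the shorter windows at the end from becoming NA.

library(data.table)
setDT(df)[, sales_next4 := Reduce(`+`, shift(c(sales, 0, 0, 0), -3:0))[1:.N]]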

Output:

   day price price_change sales High_sales_ind sales_next4
1    1     5            0    12              1          27
2    2     5            0     6              0          25
3    3     5            0     5              0          29
4    4     5            0     4              0          34
5    5     5            0    10              1          42
6    6     5            0    10              1          46
7    7     5            0    10              1          39
8    8     5            0    12              1          31
9    9     5            0    14              1          19
10  10     7            2     3              0           5
11  11     7            0     2              0           2

Alternatively, you could use this as part of a dplyr mutate() pipeline:

library(dplyr)

mutate(df, sales_next4 = Reduce(`+`, data.table::shift(c(sales, 0, 0, 0), 0:-3))[1:nrow(df)])
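
The one-liner is fairly dense, so here is a step-by-step sketch of what it does on the sales vector alone. It assumes data.table is installed, uses type = "lead" with positive shifts instead of negative lags, and the names padded and leads are illustrative only.

```r
library(data.table)

# Sales values copied from the question's data
sales <- c(12, 6, 5, 4, 10, 10, 10, 12, 14, 3, 2)

# Pad with three zeros so the leads stay defined for the original rows.
padded <- c(sales, 0, 0, 0)

# A list of four vectors: the series itself and its leads by 1, 2 and 3.
leads <- shift(padded, n = 0:3, type = "lead")

# Add the four vectors element-wise, then drop the padding rows.
sales_next4 <- Reduce(`+`, leads)[seq_along(sales)]
```

Without the padding, the leads would contain NA for the last three rows, and so would the sums.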

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1
Solution 2
Solution 3	akrun
Solution 4
