'Find the last non blank value of a column, but keep the rest blank in R

I have a dataframe as such, I would like to change the column UR by keeping only the last non blank value of this column, leaving the rest of the cells blank.

| Date| GEO| AGE|UR |:---- |:------:| -----:| -----:| | 2019-01-01 | can| 15-24| | 2019-01-01 | ont| 15-24|5.2 | 2019-02-01 | can| 15-24|6.4 | 2019-02-01 | ont| 15-24| | ----| ----| ----|----- | ----| ----| ----|----- | 2021-01-01 | can| 15-24| | 2021-01-01 | ont| 15-24|3.8 | 2021-02-01 | can| 15-24|12.3 | 2021-02-01 | ont| 15-24| | 2020-03-01 | can| 15-24| | 2020-03-01 | on| 15-24| So my results should be

| Date| GEO| AGE|UR |:---- |:------:| -----:| -----:| | 2019-01-01 | can| 15-24| | 2019-01-01 | ont| 15-24| | 2019-02-01 | can| 15-24| | 2019-02-01 | ont| 15-24| | ----| ----| ----| | ----| ----| ----| | 2021-01-01 | can| 15-24| | 2021-01-01 | ont| 15-24|3.8 | 2021-02-01 | can| 15-24|12.3 | 2021-02-01 | ont| 15-24| | 2020-03-01 | can| 15-24| | 2020-03-01 | on| 15-24| So my results should be

my codes is wrong

ur %>%
  group_by(GEO,SEX, AGE) %>%
  mutate(smaller = ifelse(!smaller==" ", row_number() == n(),
      smaller, " "))->ur  #create the last data point
r


Solution 1:[1]

Grouped by the 'GEO', 'AGE' (SEX was not showed in the example), find the last index of non-NA, create a condition with != with row_number() and replace the values to NA

library(dplyr)
df1 %>%
   group_by(GEO, AGE) %>%
   mutate(ind = replace(UR, row_number() != last(which(!is.na(UR))), NA)) %>%
   ungroup

-output

# A tibble: 6 × 5
  Date       GEO   AGE      UR   ind
  <chr>      <chr> <chr> <dbl> <dbl>
1 2019-01-01 can   15-24   9.3  NA  
2 2019-01-01 ont   15-24   5.2   5.2
3 2019-02-01 can   15-24   6.4  NA  
4 2019-02-01 ont   15-24  NA    NA  
5 2019-03-01 can   15-24  10.2  10.2
6 2019-03-01 ont   15-24  NA    NA  

Or using case_when

df1 %>% 
  group_by(GEO, AGE) %>%
  mutate(UR = case_when(row_number() == last(which(!is.na(UR))) ~ UR)) %>%
  ungroup

data

df1 <- structure(list(Date = c("2019-01-01", "2019-01-01", "2019-02-01", 
"2019-02-01", "2019-03-01", "2019-03-01"), GEO = c("can", "ont", 
"can", "ont", "can", "ont"), AGE = c("15-24", "15-24", "15-24", 
"15-24", "15-24", "15-24"), UR = c(9.3, 5.2, 6.4, NA, 10.2, NA
)), class = "data.frame", row.names = c(NA, -6L))

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1