'Replace missing values in R dataframe
I have data:
| Date | Price |
|---|---|
| "2021-01-01" | 1 |
| "2021-01-02" | NA |
| "2021-01-03" | NA |
| "2021-01-04" | NA |
| "2021-01-05" | NA |
| "2021-01-06" | 6 |
| "2021-01-07" | NA |
| "2021-01-08" | NA |
| "2021-01-09" | 3 |
And I would like to replace missing values with means, so that the end result would look like this:
| Date | Price |
|---|---|
| "2021-01-01" | 1 |
| "2021-01-02" | 2 |
| "2021-01-03" | 3 |
| "2021-01-04" | 4 |
| "2021-01-05" | 5 |
| "2021-01-06" | 6 |
| "2021-01-07" | 5 |
| "2021-01-08" | 4 |
| "2021-01-09" | 3 |
Solution 1:[1]
One way would be to use na_interpolation from imputeTS library:
imputeTS::na_interpolation(c(1, NA, NA, 4))
# 1 2 3 4
imputeTS::na_interpolation(c(6, NA, NA, 3))
# 6 5 4 3
Solution 2:[2]
I consider that you have multiple price cols, where you got the price. Then you want to create a new column named Price which is the mean and without NA values.
library(tidyverse)
library(dplyr)
Date <- c("2021-01-01","2021-01-02","2021-01-03","2021-01-04","2021-01-05",
"2021-01-06", "2021-01-07", "2021-01-08","2021-01-09", "2021-01-08","2021-01-09")
your.price.col1 <- c(floor(runif(9,0,100)),NA,NA)
your.price.col2 <- c(floor(runif(9,0,100)),33,44)
df <- data.frame(Date, your.price.col1,your.price.col2)
# slice your price cols, which you want to include in the mean with [2:3] for col1 and col2
df %>%
mutate(Price = rowMeans(df[2:3], na.rm=T))
Date your.price.col1 your.price.col2 Price
1 2021-01-01 96 55 75.5
2 2021-01-02 22 43 32.5
3 2021-01-03 68 62 65.0
4 2021-01-04 18 51 34.5
5 2021-01-05 27 6 16.5
6 2021-01-06 26 30 28.0
7 2021-01-07 32 22 27.0
8 2021-01-08 53 95 74.0
9 2021-01-09 74 78 76.0
10 2021-01-08 NA 33 33.0
11 2021-01-09 NA 44 44.0
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | AlexB |
| Solution 2 |
