How to avoid gaps due to missing values in matplot in R?

I have a function that uses matplot to plot some data. Data structure is like this:

test = data.frame(x = 1:10, a = 1:10, b = 11:20)
matplot(test[,-1])
matlines(test[,1], test[,-1])

So far so good. However, if there are missing values in the data set, then there are gaps in the resulting plot, and I would like to avoid those by connecting the edges of the gaps.

test$a[3:4] = NA
test$b[7] = NA
matplot(test[,-1])
matlines(test[,1], test[,-1]) 


In the real situation this is inside a function, the matrix is bigger, and the number of rows and columns and the positions of the (non-overlapping) missing values may change between calls, so I'd like a solution that handles this flexibly. I also need to use matlines.

I was thinking of filling in the gaps with interpolated data, but maybe there is a better solution.



Solution 1:[1]

I came across this exact situation today, but I didn't want to interpolate values - I just wanted the lines to "span the gaps", so to speak. I came up with a solution that, in my opinion, is more elegant than interpolating, so I thought I'd post it even though the question is rather old.

The problem causing the gaps is that there are NAs between consecutive values. So my solution is to 'shift' the column values so that there are no NA gaps. For example, a column consisting of c(1,2,NA,NA,5) would become c(1,2,5,NA,NA). I do this with a function called shift_vec_na() in an apply() loop. The x values also need to be adjusted, so we can make the x values into a matrix using the same principle, but using the columns of the y matrix to determine which values to shift.
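
As a quick illustration of the shifting idea on the example vector above, here is a minimal base R sketch. The variable names (y, x, keep, y_shifted, x_shifted) are made up for this illustration and are not part of the functions that follow:

```r
# Example vector with a gap of two NAs in the middle
y <- c(1, 2, NA, NA, 5)
x <- seq_along(y)          # matching x positions 1..5
keep <- !is.na(y)          # TRUE where y has a value

# Move the observed values to the front and pad the tail with NA
y_shifted <- c(y[keep], rep(NA, sum(!keep)))
# Shift x the same way so each y value keeps its original x position
x_shifted <- c(x[keep], rep(NA, sum(!keep)))

print(y_shifted)  # values: 1 2 5 NA NA
print(x_shifted)  # values: 1 2 5 NA NA
```

Because the remaining NAs all sit at the end, a line drawn through (x_shifted, y_shifted) goes straight from the point at x = 2 to the point at x = 5.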

Here's the code for the functions:

# x -> vector
# bool -> boolean vector; must be same length as x. The values of x where bool 
#   is TRUE will be 'shifted' to the front of the vector, and the back of the
#   vector will be all NA (i.e. the number of NAs in the resulting vector is
#   sum(!bool))
# returns the 'shifted' vector (will be the same length as x)
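shift_vec_na <- function(x, bool){
  n <- sum(bool)
  # Build the result in one step: kept values first, then NA padding.
  # This also handles n == 0 (an all-NA column), where x[1:n] would fail
  # because 1:0 evaluates to c(1, 0).
  c(x[bool], rep(NA, length(x) - n))
}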

# x -> vector
# y -> matrix, where nrow(y) == length(x)
# returns a list of two elements ('x' and 'y') that contain the 'adjusted'
# values that can be used with 'matplot()'
adj_data_matplot <- function(x, y){
  y2 <- apply(y, 2, function(col_i){
    return(shift_vec_na(col_i, !is.na(col_i)))
  })
  
  x2 <- apply(y, 2, function(col_i){
    return(shift_vec_na(x, !is.na(col_i)))
  })
  return(list(x = x2, y = y2))
}

Then, using the sample data:

test <- data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] <- NA
test$b[7] <- NA
lst <- adj_data_matplot(test[,1], test[,-1])

matplot(lst$x, lst$y, type = "b")


Solution 2:[2]

You could use the na.interpolation function from the imputeTS package (named na_interpolation in imputeTS 3.0 and later):

test = data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] = NA
test$b[7] = NA
matplot(test[,-1])
matlines(test[,1], test[,-1])

library('imputeTS')

test <- na.interpolation(test, option = "linear")
matplot(test[,-1])
matlines(test[,1], test[,-1])
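
If adding a package is not an option, roughly the same linear fill can be sketched with approx() from base R. The helper name fill_linear and the variable filled are hypothetical, and the sketch assumes every column has at least two non-missing values; NAs at the very start or end of a column stay NA under approx()'s default rule:

```r
# Sample data from the question, with the same gaps
test <- data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] <- NA
test$b[7] <- NA

# Hypothetical helper: linearly interpolate y over its NA gaps along x.
# approx() ignores pairs with missing values and evaluates at every xout.
fill_linear <- function(x, y) {
  approx(x, y, xout = x)$y
}

filled <- test
filled[-1] <- lapply(test[-1], function(col) fill_linear(test$x, col))

matplot(filled$x, filled[, -1], type = "n")  # set up the axes only
matlines(filled$x, filled[, -1])             # draw the gap-free lines
```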


Solution 3:[3]

I had the same issue today, and in my context I was not permitted to interpolate. Here is a minimal but sufficiently general working example of what I did. I hope it helps someone:

mymatplot <- function(data, main="", xlab="", ylab="",...){
    # data: numeric matrix with one series per row; columns are x positions
    data <- as.matrix(data)
    #graphical set up of the window
    plot.new()
    plot.window(xlim=c(1,ncol(data)), ylim=range(data, na.rm=TRUE))
    # defaults are "" rather than NULL because mtext() rejects zero-length text
    mtext(text = xlab,side = 1, line = 3)
    mtext(text = ylab,side = 2, line = 3)
    mtext(text = main,side = 3, line = 0)
    axis(1L)
    axis(2L)
    #plot the data
    for(i in seq_len(nrow(data))){
        nin.na <- !is.na(data[i,])
        lines(x=which(nin.na), y=data[i,nin.na], col = i,...)
    }
}

The core 'trick' is in x=which(nin.na). It aligns the data points of the line consistently with the indices of the x axis.
The lines

plot.new()
plot.window(xlim=c(1,ncol(data)), ylim=range(data, na.rm=TRUE))
mtext(text = xlab,side = 1, line = 3)
mtext(text = ylab,side = 2, line = 3)
mtext(text = main,side = 3, line = 0)
axis(1L)
axis(2L)

set up the plotting window. range(data, na.rm=TRUE) sizes the y axis so that all data points fit. mtext(...) labels the axes and adds the main title. The axes themselves are drawn by the axis(...) calls.
The for-loop that follows plots the data, one line per row.
The function head of mymatplot provides the ... argument so that typical plot parameters such as lty, lwd, cex etc. can optionally be supplied; they are passed on to lines().
A last word on the choice of colors: they are up to you.
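
The same idea can be applied to the layout used in the question, with one series per column and an explicit x vector. The following is only a sketch under that assumption; the helper name lines_skip_na is hypothetical and not part of the answer above:

```r
# Hypothetical helper: draw each column of y against x, dropping the NA
# points of that column so lines() connects the neighbours of each gap.
lines_skip_na <- function(x, y, col = seq_len(ncol(y)), ...) {
  y <- as.matrix(y)
  col <- rep_len(col, ncol(y))      # one colour per column
  for (j in seq_len(ncol(y))) {
    ok <- !is.na(y[, j])            # rows where this series has a value
    lines(x[ok], y[ok, j], col = col[j], ...)
  }
  invisible(NULL)
}

# Sample data from the question
test <- data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] <- NA
test$b[7] <- NA

matplot(test$x, test[, -1], type = "p", pch = 1)  # points only; sets up axes
lines_skip_na(test$x, test[, -1])                 # lines that span the gaps
```

Unlike the shifting approach, this leaves the data untouched and decides per column which points to draw, at the cost of calling lines() in a loop instead of matlines().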

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Oriol Mirosa
Solution 3