Custom rounding to reference values (nDimensional)

I would like to find, for each test coordinate, the closest coordinate in a set of reference coordinates.

The task is very similar to a previously posted question (Find the approximate value in the vector), but adapted to n-dimensional cases and with multiple inputs.

In other words, given:

 test=t(data.frame(
      c(0.9,1.1,1),
      c(7.5,7.4,7.3),
      c(11,11,11.2)
    ))
    
 reference=t(data.frame(
      c(1,0,0.5),
      c(2,2,2),
      c(3.3,3.3,3.3),
      c(9,9,9),
      c(10,11,12)
    )) 

result <- approximate(test,reference)

  1    0    0.5
  9    9    9
 10   11   12

I wrote a function that uses Euclidean distances and plain loops, but when the input data frames are large it takes a very long time to run.

Can anyone suggest a more efficient way of doing this? Thank you in advance.

PS: This is the function I wrote. It works but is slow (in case someone finds it useful):

approximate_function<- function(approximate,reference){
  # Returns, for each entry of approximate, the closest entry of reference.
  # It uses the Euclidean distance.
  # Each entry must be a row in the data frame;
  # the number of columns of the data frame is the dimension of the points.
  
  
# Sub function to calculate euclidean distance

  distance_function<- function(a,b){
    
    squaresum<-0
    for(id in 1:length(a)){
      squaresum=squaresum+(a[id]-b[id])^2
    }
    
    result=sqrt(squaresum)
    
    return(result)
  }  

    
  result<-data.frame()
  
  # Take one point to approximate at a time
  for(id_approximate in 1:nrow(approximate)){
    
    distance=c()
    
    # Compare the point to approximate with every reference point and choose the one at the smallest distance
    for(id_reference in 1:nrow(reference)){
      distance[id_reference]<-distance_function(approximate[id_approximate,],reference[id_reference,])
      
      }
    
    result<-rbind(
      result,
      reference[which.min(distance),]
    )
    
  }
  
  return(result)
    

}

Tags: r, rounding

Solution 1:^[1]

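This version replaces the inner loop over reference rows with vectorized arithmetic over each dimension, so the calculation finishes almost instantly.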

approximate_function<- function(approximate,reference){

  # Returns, for each entry of approximate, the closest entry of reference.
  # It uses the Euclidean distance.
  # Each entry must be a row in the data frame;
  # the number of columns of the data frame is the dimension of the points.
  
  
  results=data.frame()
  
  # Take one point to approximate at a time
  for(id in 1:nrow(approximate)){
    
    
   # Calculate the Euclidean distances to all reference points, regardless of the dimension
   sumsquares=rep(0,nrow(reference))
    
   for(dim in 1:ncol(approximate)){
     sumsquares = sumsquares + (approximate[id,dim]-reference[,dim])^2
   }
    
   distances=sqrt(sumsquares)
    
    
    results<- rbind(
      results,
      reference[which.min(distances),]
    )
    
  
  }
  
  return(results)

}

Solution 2:^[2]

You've got a few calculations that will be slow.

First:

 test=t(data.frame(
    c(0.9,1.1,1),         
    c(7.5,7.4,7.3),
    c(11,11,11.2)
  ))

This one probably doesn't matter, but it would be better as

test=rbind(
      c(0.9,1.1,1),
      c(7.5,7.4,7.3),
      c(11,11,11.2)
    )

Same for setting up reference.

Second and third: You set up result as a dataframe, then add rows to it one at a time. Dataframes are much slower for row operations than matrices, and gradually growing structures in R is slow. So set it up as a matrix from the beginning at the right size, and assign results into specific rows.
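As a minimal sketch of the preallocation advice above: the sizes `n_points` and `n_dims` and the placeholder values written into each row are hypothetical and chosen only for illustration, not taken from the question.

```r
# Hypothetical sizes, for illustration only
n_points <- 3
n_dims <- 3

# Preallocate a numeric matrix of the final size, filled with NA
result <- matrix(NA_real_, nrow = n_points, ncol = n_dims)

# Assign into specific rows instead of growing a data frame with rbind()
for (i in seq_len(n_points)) {
  result[i, ] <- rep(i, n_dims)  # placeholder values standing in for a real row
}
```

The matrix never changes size inside the loop, so R does not have to copy an ever-growing object on each iteration.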

EDITED to add:

Fourth: there's no need for the inner loop. You can calculate all the squared differences in one big matrix, then use rowSums or colSums to get the squared distances. This is easiest if you're working with matrix columns instead of rows, because vectors will be properly replicated automatically.
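To illustrate why columns are convenient here, the following sketch takes the first test point and three of the reference points from the question. The names `point` and `ref_t` are introduced only for this example. Subtracting a matrix from a vector recycles the vector down each column, so each column of the difference belongs to one reference point.

```r
# Three reference points from the question, one per row
reference <- rbind(
  c(1, 0, 0.5),
  c(2, 2, 2),
  c(10, 11, 12)
)
point <- c(0.9, 1.1, 1)   # first test point from the question

ref_t <- t(reference)     # now each column is one reference point
diffs <- point - ref_t    # point is recycled down every column
squareddist <- colSums(diffs^2)

squareddist
# 1.47 3.02 301.82
```

If the reference points were left as rows, the same subtraction would recycle `point` across the wrong dimension and give misleading differences.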

Fifth: There's no need to take the square root; if the squared distance is minimized, so is the distance.
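A quick check of this point, using squared distances worked out by hand for the first test point against three of the reference points (the variable `same_index` is just for this illustration):

```r
# Squared distances from (0.9, 1.1, 1) to (1, 0, 0.5), (2, 2, 2) and (10, 11, 12)
squareddist <- c(1.47, 3.02, 301.82)

# sqrt() is monotonically increasing, so the position of the minimum is unchanged
same_index <- which.min(squareddist) == which.min(sqrt(squareddist))
same_index
# TRUE
```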

Here's the result:

approximate <- function(test, reference){

  # transpose the reference
  reference <- t(reference)
  
  # set up the result, not transposed
  result <- test*NA
  
  # Take one test point at a time
  for(id in seq_len(nrow(test))){
    
    squareddist <- colSums((test[id,] - reference)^2)
    
    result[id,] <- reference[, which.min(squareddist)]
  
  }
  return(result)
  
}
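
If the remaining loop over test points is still too slow, one possible further step (not part of the answer above) is to compute all squared distances at once using the expansion |a - b|^2 = |a|^2 - 2 a.b + |b|^2. This is only a sketch: the name `approximate_vec` is hypothetical, and it assumes both inputs are numeric matrices with one point per row.

```r
approximate_vec <- function(test, reference) {
  test <- as.matrix(test)
  reference <- as.matrix(reference)

  # d2[i, j] is the squared distance from test row i to reference row j,
  # built from |a|^2 + |b|^2 - 2 * a.b
  d2 <- outer(rowSums(test^2), rowSums(reference^2), "+") -
    2 * tcrossprod(test, reference)

  # max.col() on the negated matrix gives the column of the minimum in each row
  nearest <- max.col(-d2, ties.method = "first")

  reference[nearest, , drop = FALSE]
}

# Data from the question, built with rbind() as suggested above
test <- rbind(
  c(0.9, 1.1, 1),
  c(7.5, 7.4, 7.3),
  c(11, 11, 11.2)
)
reference <- rbind(
  c(1, 0, 0.5),
  c(2, 2, 2),
  c(3.3, 3.3, 3.3),
  c(9, 9, 9),
  c(10, 11, 12)
)

result <- approximate_vec(test, reference)
result
# rows: (1, 0, 0.5), (9, 9, 9), (10, 11, 12)
```

This trades memory for speed, since it builds a full matrix with one row per test point and one column per reference point. The expansion can also lose some floating-point precision when points are very close together, so the looped version may be the safer choice in that case.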

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Ossan
Solution 2
