Assign a value if a number is in between two numbers

I'm trying to assign the value -1 to every number in my vector that is between 2 and 5.

I thought an if-then statement would work, but I'm having some trouble. I don't think (2<x<5) is right, but I'm not sure how to write "in between" in R. Can anyone help? Thanks.

x <- c(3.2,6,7.8,1,3,2.5)
if (2<x<5){
    cat(-1)
} else {
    cat (x)
}


Solution 1:[1]

There are a number of syntax errors in your code.

Try using findInterval

x[findInterval(x, c(2,5)) == 1L] <- -1
x
## [1]  -1.0  6.0  7.8  1.0 -1.0 -1.0

Read ?findInterval for more details on its use.
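One detail worth checking before relying on findInterval here: with its default arguments the intervals are closed on the left, so a value exactly equal to 2 is treated as inside the range, unlike the strict 2 < x < 5 in the question. The sketch below uses made-up boundary values (the vector `edge` is hypothetical, not from the question) to compare the two tests:

```r
# Hypothetical values chosen to sit on and around the boundaries 2 and 5.
edge <- c(1.9, 2, 3.5, 5, 5.1)

# findInterval() returns 1 for values in [2, 5): 2 is included, 5 is not.
closed_left <- findInterval(edge, c(2, 5)) == 1L

# Strict comparison: neither 2 nor 5 is included.
strict <- edge > 2 & edge < 5

closed_left
strict
```

The two results differ only at the value 2, so for the data in the question (which contains no exact 2) both approaches give the same answer.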

You could also use replace

replace(x, x > 2 & x < 5, -1)

Note that

  • for 2<x<5 you need to write x > 2 & x < 5
  • cat will output to the console or a file / connection. It won't assign anything.
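To make the two notes above concrete, here is a minimal sketch that inspects the intermediate logical vector and then assigns through it. `if` expects a single TRUE or FALSE rather than a vector, which is why the indexing form is used; the names `in_range` and `y` are only illustrative:

```r
x <- c(3.2, 6, 7.8, 1, 3, 2.5)

# Each comparison is vectorised; & combines them element by element.
in_range <- x > 2 & x < 5
in_range

# Assign through the logical index instead of printing with cat().
y <- x
y[in_range] <- -1
y
```

Working on a copy (`y`) leaves the original `x` untouched, which can be handy while experimenting.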

Solution 2:[2]

You probably just want to replace those elements with -1.

> x[x > 2 & x < 5] <- -1; x
[1] -1.0  6.0  7.8  1.0 -1.0 -1.0

You could also use ifelse.

> ifelse(x > 2 & x < 5, -1, x)
[1] -1.0  6.0  7.8  1.0 -1.0 -1.0

Solution 3:[3]

I compared the solutions with microbenchmark:

library(microbenchmark)
library(TeachingDemos)

x = runif(100000) * 1000
microbenchmark(200 %<% x %<% 500
               , x > 200 & x < 500
               , findInterval(x, c(200, 500)) == 1
               , findInterval(x, c(200, 500)) == 1L
               , times = 1000L
               )

Here are the results:

                               expr       min        lq      mean    median        uq       max neval
                  200 %<% x %<% 500 17.089646 17.747136 20.477348 18.910708 21.302945 113.71473  1000
                  x > 200 & x < 500  6.774338  7.092153  8.746814  7.233512  8.284603 103.64097  1000
  findInterval(x, c(200, 500)) == 1  3.578305  3.734023  5.724540  3.933615  6.777687  91.09649  1000
 findInterval(x, c(200, 500)) == 1L  2.042831  2.115266  2.920081  2.227426  2.434677  85.99866  1000

findInterval was the fastest option in this benchmark. Compare its result to 1L rather than 1: the integer comparison was nearly twice as fast (median 2.23 vs. 3.93).
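A plausible explanation for the gap, offered as an assumption rather than something measured here, is that findInterval returns an integer vector, so comparing it with the double constant 1 involves a type conversion that comparing with 1L avoids. The sketch below only checks the types and confirms that both comparisons select the same elements; the seed and vector length are arbitrary:

```r
set.seed(1)                  # arbitrary seed, only for reproducibility
x <- runif(1000) * 1000

idx <- findInterval(x, c(200, 500))
typeof(idx)                  # type of the vector returned by findInterval()
typeof(1L)                   # integer literal
typeof(1)                    # double literal

# Both comparisons pick the same elements; only the constant's type differs.
identical(idx == 1, idx == 1L)
```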

Solution 4:[4]

My preference for assigning a value to a variable based on a clearly defined numeric interval is to use base R syntax:

 DF$NewVar[DF$LowerLimit <= DF$OriginalVar & DF$OriginalVar < DF$UpperLimit] = "Normal"
 DF$NewVar[DF$OriginalVar < DF$LowerLimit] = "Low"
 DF$NewVar[DF$OriginalVar >= DF$UpperLimit] = "High"

I think this syntax is clearer than any number of R functions, largely because the code can be quickly customized to specify inclusive vs. exclusive intervals. In practice, it's quite common to encounter situations where an interval can be defined as inclusive (e.g., [-x, +x]), exclusive (e.g., (-x, +x)), or a combination of the two (e.g., [-x, +x)).

Additionally, base syntax provides clarity to the code if somebody else is reviewing it later. Each unique library of functions seems to have its own peculiar and slightly different syntax to achieve the same level of specificity as clearly defining the intervals using base R syntax.
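As a rough illustration of this style, here is a small worked example. The column names follow the answer, but the data values are made up, and it assumes "Low" is meant for values below the lower limit, with the interval [LowerLimit, UpperLimit) counted as "Normal":

```r
# Hypothetical data: column names follow the answer, values are invented.
DF <- data.frame(
  OriginalVar = c(1, 2, 3.5, 5, 7),
  LowerLimit  = 2,
  UpperLimit  = 5
)

DF$NewVar <- NA_character_
DF$NewVar[DF$OriginalVar <  DF$LowerLimit] <- "Low"
DF$NewVar[DF$OriginalVar >= DF$LowerLimit & DF$OriginalVar < DF$UpperLimit] <- "Normal"
DF$NewVar[DF$OriginalVar >= DF$UpperLimit] <- "High"

DF$NewVar
```

Switching a boundary from inclusive to exclusive is then a matter of changing `>=` to `>` (or `<` to `<=`) on the relevant line.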

Solution 5:[5]

Here is another approach that is a little more similar to the original:

library(TeachingDemos)

x <- c(3.2,6,7.8,1,3,2.5)

(x <- ifelse( 2 %<% x %<% 5, -1, x ) )
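If adding a package dependency just for this is not desirable, a similar readability gain might be had from a tiny helper in base R. The function name `between_open` is hypothetical and is not part of base R or TeachingDemos:

```r
# Hypothetical helper: TRUE where lower < x < upper (both bounds excluded).
between_open <- function(x, lower, upper) {
  x > lower & x < upper
}

x <- c(3.2, 6, 7.8, 1, 3, 2.5)
result <- ifelse(between_open(x, 2, 5), -1, x)
result
```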

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Aaron left Stack Overflow
Solution 3 Dirk
Solution 4
Solution 5 Greg Snow