Implementation of standard recycling rules

One nice feature of R, related to its inherently vectorized nature, is the recycling rule described in Section 2.2 of An Introduction to R:

Vectors occurring in the same expression need not all be of the same length. If they are not, the value of the expression is a vector with the same length as the longest vector which occurs in the expression. Shorter vectors in the expression are recycled as often as need be (perhaps fractionally) until they match the length of the longest vector. In particular a constant is simply repeated.
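
To make the quoted rule concrete, here is a small illustration using ordinary arithmetic. The variable names are mine and only there for the example; the behaviour shown is base R's built-in recycling.

```r
x <- 1:6
y <- c(10, 20)

# y is recycled three times to match the length of x
whole <- x + y          # 11 22 13 24 15 26

# a constant is simply repeated
doubled <- x * 2        # 2 4 6 8 10 12

# "fractional" recycling: 5 is not a multiple of 2, so R recycles y
# two and a half times and emits a warning (suppressed here)
partial <- suppressWarnings(1:5 + y)   # 11 22 13 24 15
```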

Most standard functions follow this rule, but the code that implements it is buried in the underlying C code.

Is there a canonical way to implement the standard recycling rules for a function entirely in R code? That is, given a function like

mock <- function(a, b, c) {
    # turn a, b, and c into appropriate recycled versions

    # do something with recycled a, b, and c in some appropriately vectorized way
}

where a, b, and c are vectors, possibly of different lengths and unknown types/classes, is there a canonical way to get a new set of vectors which are recycled according to the standard recycling rules? In particular, I can't assume that the "do something" step will do the proper recycling itself, so I need to do it myself beforehand.


Solution 1:[1]

I'd likely use the length.out argument of rep() to do most of the real work.

Here's an example that creates a better.data.frame() function (it should really be called "better".data.frame()), which places no restrictions on the lengths of the vectors it's handed as arguments. In this case, I recycle all of the vectors to the length of the longest one, but you can obviously adapt this to serve your own recycling needs!

better.data.frame <- function(...) {
    cols <- list(...)
    names(cols) <- sapply(as.list(match.call()), deparse)[-1]

    # Find the length of the longest vector
    # and then recycle all columns to that length.
    n <- max(lengths(cols))
    cols <- lapply(cols, rep, length.out = n)

    as.data.frame(cols)
}

# Try it out
a <- Sys.Date() + 0:9
b <- 1:3
c <- letters[1:4]

data.frame(a,b,c)
# Error in data.frame(a, b, c) : 
#   arguments imply differing number of rows: 10, 3, 4

better.data.frame(a,b,c)
#             a b c
# 1  2012-02-17 1 a
# 2  2012-02-18 2 b
# 3  2012-02-19 3 c
# 4  2012-02-20 1 d
# 5  2012-02-21 2 a
# 6  2012-02-22 3 b
# 7  2012-02-23 1 c
# 8  2012-02-24 2 d
# 9  2012-02-25 3 a
# 10 2012-02-26 1 b
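
The same rep(length.out = ) idea can be pulled out into a small general-purpose helper and applied to the mock() function from the question. This is only a sketch: recycle_args() is a hypothetical name, not a base R function, and the zero-length handling and the warning are my assumptions about what "standard" recycling should mean here (they mimic what arithmetic operators do).

```r
# Hypothetical helper (not part of base R): recycle all arguments
# to a common length and return them as a list.
recycle_args <- function(...) {
    args <- list(...)
    if (length(args) == 0L) return(args)

    lens <- lengths(args)
    # Assumption: like arithmetic operators, any zero-length input
    # makes every result zero-length.
    n <- if (any(lens == 0L)) 0L else max(lens)

    # Mimic base R's warning for "fractional" recycling
    if (n > 0L && any(n %% lens != 0L)) {
        warning("longer object length is not a multiple of shorter object length",
                call. = FALSE)
    }
    lapply(args, rep, length.out = n)
}

mock <- function(a, b, c) {
    r <- recycle_args(a = a, b = b, c = c)
    # stand-in for the "do something" step
    paste(r$a, r$b, r$c)
}

mock(1:4, c("x", "y"), TRUE)
# [1] "1 x TRUE" "2 y TRUE" "3 x TRUE" "4 y TRUE"
```

Because rep() is generic, classed vectors such as Dates and factors should keep their class, as the Date column in the example above does.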

Solution 2:[2]

One quick-and-dirty route for numeric arguments is to rely on cbind's automatic recycling. For example:

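f.abc <- function(a, b, c) {
    # cbind() recycles shorter vectors to the longest length; it warns when a
    # length is not a multiple of the longest, hence suppressWarnings()
    df.abc <- as.data.frame(suppressWarnings(cbind(a = a, b = b, c = c)))

    # Then use, for example, with() to use a, b and c inside the data frame,
    # or apply(df.abc, 1, ...)
    df.abc
}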

It does rely heavily on there being no other legitimate cause for warnings, though.
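
As a quick check of what cbind() does here, the sketch below uses made-up inputs of lengths 6, 3 and 4. It also shows why the approach is best kept to numeric arguments: cbind() builds a matrix, so mixed types get coerced to a common type.

```r
# lengths 6, 3 and 4: the result has 6 rows, and cbind() warns
# because 6 is not a multiple of 4 (warning suppressed here)
m <- suppressWarnings(cbind(a = 1:6, b = 1:3, c = 1:4))

dim(m)                    # 6 3
col_b <- unname(m[, "b"]) # 1 2 3 1 2 3
col_c <- unname(m[, "c"]) # 1 2 3 4 1 2

# mixing numbers and strings coerces everything to character
mixed <- cbind(a = 1:2, b = c("x", "y"))
is.character(mixed)       # TRUE
```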

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 amo-ej1