R make.unique starting in 1
I have a data frame with columns that are in groups of 4 like so:
a b c d a b c d a b c d a b c d...
Then, I use the function rep to create tags for the columns:
rep(c("a", "b", "c", "d"), len=ncol)
Finally, I use the function make.unique to make the tags unique:
a b c d a1 b1 c1 d1 a2 b2 c2 d2 a3 b3 c3 d3...
However, I would like to get:
a1 b1 c1 d1 a2 b2 c2 d2 a3 b3 c3 d3 a4 b4 c4 d4...
Is there an easy way to accomplish this? The make.unique documentation does not mention any parameter that produces this behaviour.
Solution 1:[1]
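n <- 4       # number of distinct tags
ncol <- 16   # total number of columns
# ceiling() keeps the numbering correct when ncol is not a multiple of n
paste(letters[seq_len(n)], rep(seq_len(ceiling(ncol / n)), each = n, length.out = ncol), sep = "")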
Solution 2:[2]
make.unique.2 = function(x, sep='.'){
ave(x, x, FUN=function(a){if(length(a) > 1){paste(a, 1:length(a), sep=sep)} else {a}})
}
Testing against your example:
> u = rep(c("a", "b", "c", "d"), 4)
> make.unique.2(u)
[1] "a.1" "b.1" "c.1" "d.1" "a.2" "b.2" "c.2" "d.2" "a.3" "b.3" "c.3" "d.3"
[13] "a.4" "b.4" "c.4" "d.4"
If an element is not duplicated, it is left alone:
> u = c('a', 'a', 'b', 'c', 'c', 'c', 'd')
> make.unique.2(u)
[1] "a.1" "a.2" "b" "c.1" "c.2" "c.3" "d"
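To get exactly the format asked for in the question (a1 b1 c1 d1 a2 ...), the sep argument can be set to an empty string. A minimal sketch, with the function repeated so the snippet runs on its own; the length of 10 is just an illustrative value:

```r
# make.unique.2 as defined above, repeated here so the snippet is self-contained
make.unique.2 <- function(x, sep = ".") {
  ave(x, x, FUN = function(a) {
    # number every element of a group that occurs more than once
    if (length(a) > 1) paste(a, seq_along(a), sep = sep) else a
  })
}

# 10 columns: the last group of four is incomplete
u <- rep(c("a", "b", "c", "d"), length.out = 10)
tags <- make.unique.2(u, sep = "")
print(tags)
# [1] "a1" "b1" "c1" "d1" "a2" "b2" "c2" "d2" "a3" "b3"
```

Note that a tag occurring only once would still be left without a number, as in the example above.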
Solution 3:[3]
Wouldn't call this pretty, but it does the job:
> ncol <- 10
> apply(expand.grid(c("a","b","c","d"),1:((ncol+3)/4)), 1,
+ function(x)paste(x,collapse=""))[1:ncol]
[1] "a1" "b1" "c1" "d1" "a2" "b2" "c2" "d2" "a3" "b3"
where ncol is the number of tags to generate.
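The same result can probably be had without building the full grid, by computing each tag's letter and group number from its position. This is only a sketch of that idea, not the answer's own method; numbered_tags, tags and n_tags are names made up for the illustration:

```r
# Hypothetical helper: build numbered tags from positions alone
numbered_tags <- function(tags, n_tags) {
  i <- seq_len(n_tags) - 1L                 # zero-based positions
  paste0(tags[i %% length(tags) + 1L],      # cycle through the base tags
         i %/% length(tags) + 1L)           # group number, starting at 1
}

res <- numbered_tags(c("a", "b", "c", "d"), 10)
print(res)
# [1] "a1" "b1" "c1" "d1" "a2" "b2" "c2" "d2" "a3" "b3"
```

Since it works on positions, n_tags does not have to be a multiple of the number of base tags.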
Solution 4:[4]
Here is a further variant. The function make.unique.2 by @adn.bps can still produce duplicates when an input element already looks like a generated name:
> u = c("a", "a", "b", "c", "c", "d", "c", "a.1")
> make.unique.2(u)
[1] "a.1" "a.2" "b" "c.1" "c.2" "d" "c.3" "a.1"
To avoid that, I've done:
dotify <- function(x, avoid){
l <- length(x)
if(l == 1L){
return(x)
}
numbers <- 1L:l
out <- paste0(x, ".", numbers)
ndots <- 1L
while(any(out %in% avoid)){
ndots <- ndots + 1L
out <- paste0(x, paste0(rep(".", ndots), collapse = ""), numbers)
}
out
}
make.unique2 <- function(x){
if(anyDuplicated(x)){
splt <- split(x, x)
u <- names(splt)
for(i in 1L:length(splt)){
splt_i <- splt[[i]]
j <- match(splt_i[1L], u)
avoid <- u[-j]
splt_i_new <- dotify(splt_i, avoid)
u <- c(avoid, splt_i_new)
splt[[i]] <- splt_i_new
}
x <- unsplit(splt, x)
}
x
}
make.unique2(u)
# [1] "a..1" "a..2" "b" "c.1" "c.2" "d" "c.3" "a.1"
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mdsumner |
| Solution 2 | adn bps |
| Solution 3 | NPE |
| Solution 4 | Stéphane Laurent |
