How to turn whole dataframe from string variables into numbers?
I have a dataframe full of answers to a survey, so each column is filled with Never, Sometimes and Always. I need to change Never to the numeric 0, Sometimes to the numeric 1 and Always to the numeric 2. Is there a way to apply this change to the whole dataframe instead of to individual columns?
Solution 1:[1]
Suppose your data frame looks like this:
df
#> Q1 Q2 Q3
#> 1 Never Always Always
#> 2 Always Never Never
#> 3 Never Never Never
#> 4 Sometimes Never Never
#> 5 Never Sometimes Never
#> 6 Always Sometimes Sometimes
#> 7 Always Sometimes Never
#> 8 Sometimes Sometimes Never
#> 9 Sometimes Always Sometimes
#> 10 Always Never Sometimes
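Then you can use match(), which returns the position of each answer in the vector of possible answers (1, 2 or 3), and subtract 1 to get 0, 1 or 2: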
df[] <- sapply(df, function(x) match(x, c("Never", "Sometimes", "Always")) - 1)
Which results in
df
#> Q1 Q2 Q3
#> 1 0 2 2
#> 2 2 0 0
#> 3 0 0 0
#> 4 1 0 0
#> 5 0 1 0
#> 6 2 1 1
#> 7 2 1 0
#> 8 1 1 0
#> 9 1 2 1
#> 10 2 0 1
Reproducible data frame
set.seed(1)
df <- replicate(3, sample(c("Never", "Sometimes", "Always"), 10, TRUE))
df <- setNames(as.data.frame(df), c("Q1", "Q2", "Q3"))
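One thing to watch with the match() approach is that any answer not found in the vector of expected values becomes NA rather than raising an error. The following is a minimal sketch of how that could be checked, assuming a small made-up data frame (`responses`, its column names and the lower-case "never" typo are hypothetical and not part of the original data). It uses lapply() instead of sapply(), which is an equivalent choice here.

```r
# Hypothetical toy data; the names and values are made up for illustration.
responses <- data.frame(
  Q1 = c("Never", "Sometimes", "Always"),
  Q2 = c("Always", "never", "Sometimes"),  # note the lower-case "never"
  stringsAsFactors = FALSE
)

levels_in_order <- c("Never", "Sometimes", "Always")

# match() returns the 1-based position of each value in levels_in_order,
# or NA when the value is not found; subtracting 1 gives 0, 1, 2.
coded <- responses
coded[] <- lapply(responses, function(x) match(x, levels_in_order) - 1)

# Count the values per column that did not match any expected answer.
unmatched <- colSums(is.na(coded))
print(coded)
print(unmatched)
```

A non-zero count in `unmatched` suggests the column contains spelling or capitalisation variants that should be cleaned before recoding.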
Solution 2:[2]
Another approach is to use a named vector as a lookup table, which is probably more appropriate if you want more flexibility in how answers are mapped to numbers.
set.seed(1)
df <- replicate(3, sample(c("Never", "Sometimes", "Always"), 10, TRUE))
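df <- setNames(as.data.frame(df, stringsAsFactors = FALSE), c("Q1", "Q2", "Q3"))
# Named lookup vector: names are the survey answers, values are the numeric codes
# (called lookup rather than t to avoid masking base::t)
lookup <- setNames(0:2, c("Never", "Sometimes", "Always"))
# Index the lookup by each column's answers; unname() drops the answer labels
as.data.frame(lapply(df, function(x) unname(lookup[x])))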
# Q1 Q2 Q3
# 1 0 2 2
# 2 2 0 0
# 3 0 0 0
# 4 1 0 0
# 5 0 1 0
# 6 2 1 1
# 7 2 1 0
# 8 1 1 0
# 9 1 2 1
# 10 2 0 1
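To illustrate the flexibility of a named lookup vector, here is a minimal sketch in which two different answers map to the same code. The extra answer "Rarely", its code, and the `answers` data frame are assumptions made up for this example and do not come from the original survey.

```r
# Hypothetical lookup table: the extra label "Rarely" and its code are made up.
lookup <- c(Never = 0, Rarely = 1, Sometimes = 1, Always = 2)

answers <- data.frame(
  Q1 = c("Never", "Rarely", "Always"),
  Q2 = c("Sometimes", "Always", "Never"),
  stringsAsFactors = FALSE
)

recoded <- answers
# as.character() guards against factor columns, which would otherwise index
# the lookup by integer level code instead of by label; unname() drops the
# answer labels that indexing by name carries over.
recoded[] <- lapply(answers, function(x) unname(lookup[as.character(x)]))
print(recoded)
```

Assigning into `recoded[]` keeps the data frame's shape and column names, so the result can be used in place of the original.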
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Allan Cameron |
| Solution 2 | Merijn van Tilborg |
