How to remove '.' from column names in a dataframe?
My dataframe which I read from a csv file has column names like this
abc.def, ewf.asd.fkl, qqit.vsf.addw.coil
I want to remove the '.' from all the names and convert them to
abcdef, ewfasdfkl, qqitvsfaddwcoil.
I tried using the sub command sub(".","",colnames(dataframe))
but this command took out the first letter of each column name and the column names changed to
bc.def, wf.asd.fkl, qit.vsf.addw.coil
Does anyone know another command to do this? I could change the column names one by one, but I have a lot of files with 30 or more columns in each file.
Again, I want to remove the "." from all the colnames. I am trying to do this so I can use "sqldf" commands, which don't deal well with "."
Thank you for your help
Solution 1:[1]
1) sqldf can deal with names having dots in them if you quote the names:
library(sqldf)
d0 <- read.csv(text = "A.B,C.D\n1,2")
sqldf('select "A.B", "C.D" from d0')
giving:
A.B C.D
1 1 2
2) When reading the data using read.table or read.csv, use the check.names = FALSE argument.
Compare:
Lines <- "A B,C D
1,2
3,4"
read.csv(text = Lines)
## A.B C.D
## 1 1 2
## 2 3 4
read.csv(text = Lines, check.names = FALSE)
## A B C D
## 1 1 2
## 2 3 4
However, in this example that still leaves names that would have to be quoted in sqldf, since they contain embedded spaces.
3) To simply remove the periods, if DF is a data frame:
names(DF) <- gsub(".", "", names(DF), fixed = TRUE)
or it might be nicer to convert the periods to underscores so that it is reversible:
names(DF) <- gsub(".", "_", names(DF), fixed = TRUE)
This last line could be alternatively done like this:
names(DF) <- chartr(".", "_", names(DF))
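Since the question mentions many files, the renaming in 3) can be wrapped in a small function and applied to a list of data frames. The following is only a minimal sketch: strip_dots is a hypothetical helper name, and the two inline data frames stand in for files that would normally be read with read.csv.

```r
# Hypothetical helper: remove literal periods from a data frame's column names.
strip_dots <- function(df) {
  names(df) <- gsub(".", "", names(df), fixed = TRUE)
  df
}

# Stand-in data; in practice each element would come from read.csv(<file>).
dfs <- list(
  a = read.csv(text = "abc.def,ewf.asd.fkl\n1,2"),
  b = read.csv(text = "qqit.vsf.addw.coil\n3")
)

# Apply the same renaming to every data frame in the list.
cleaned <- lapply(dfs, strip_dots)
names(cleaned$a)
names(cleaned$b)
```

Swapping the replacement "" for "_" inside the helper gives the reversible underscore variant instead.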
Solution 2:[2]
To replace all the dots in the names you'll need to use gsub rather than sub, which only replaces the first occurrence. Passing fixed = TRUE makes "." match a literal dot instead of acting as a regex wildcard.
This should work.
test <- data.frame(abc.def = NA, ewf.asd.fkl = NA, qqit.vsf.addw.coil = NA)
names(test) <- gsub( ".", "", names(test), fixed = TRUE)
test
abcdef ewfasdfkl qqitvsfaddwcoil
1 NA NA NA
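To see why the original sub call dropped the first letter, it may help to compare the variants side by side. This is a small sketch using the column names from the question; nm is just an illustrative variable name.

```r
nm <- c("abc.def", "ewf.asd.fkl", "qqit.vsf.addw.coil")

# Regex mode: "." matches any character, and sub() replaces only the first
# match, so the first letter of each name is removed.
first_char_gone <- sub(".", "", nm)

# Literal mode, first occurrence only: one dot per name is removed.
first_dot_gone <- sub(".", "", nm, fixed = TRUE)

# Literal mode, all occurrences: every dot is removed.
all_dots_gone <- gsub(".", "", nm, fixed = TRUE)

print(first_char_gone)  # "bc.def" "wf.asd.fkl" "qit.vsf.addw.coil"
print(first_dot_gone)   # "abcdef" "ewfasd.fkl" "qqitvsf.addw.coil"
print(all_dots_gone)    # "abcdef" "ewfasdfkl" "qqitvsfaddwcoil"
```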
Solution 3:[3]
UPDATE dplyr 0.8.0: As of dplyr 0.8, funs() is soft deprecated; use formula notation instead.
A dplyr way to do this using stringr:
library(dplyr)
library(stringr)
data <- data.frame(abc.def = 1, ewf.asd.fkl = 2, qqit.vsf.addw.coil = 3)
renamed_data <- data %>%
rename_all(~str_replace_all(.,"\\.","_")) # note we have to escape the '.' character with \\
Make sure you install the packages with install.packages().
Remember that you have to escape the . character as \\. in regex, which functions like str_replace_all use; an unescaped . is a wildcard.
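The same escaping rule applies to base R's gsub when it runs in its default regex mode. As a rough sketch with no extra packages (nm is an illustrative variable holding the names from the question), the escaped pattern, a character class, and fixed = TRUE should all give the same result:

```r
nm <- c("abc.def", "ewf.asd.fkl", "qqit.vsf.addw.coil")

# Escaped dot: "\\." in R source is the regex \. (a literal dot).
escaped <- gsub("\\.", "_", nm)

# Character class: inside [ ] the dot is literal, so no backslashes are needed.
bracketed <- gsub("[.]", "_", nm)

# No regex at all: fixed = TRUE treats the pattern as a plain string.
literal <- gsub(".", "_", nm, fixed = TRUE)

print(escaped)  # "abc_def" "ewf_asd_fkl" "qqit_vsf_addw_coil"
```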
Solution 4:[4]
You can also try:
# fixed = TRUE treats "." as a literal dot rather than the regex wildcard
names(df) = gsub(pattern = ".", replacement = "", x = names(df), fixed = TRUE)
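It is worth checking how the pattern is interpreted before running this on real data. As a quick sketch on the names from the question (nm is an illustrative variable), gsub with a bare "." and no fixed = TRUE matches every character, so the names come back empty:

```r
nm <- c("abc.def", "ewf.asd.fkl", "qqit.vsf.addw.coil")

# Default regex mode: "." matches any character, so everything is deleted.
wiped <- gsub(pattern = ".", replacement = "", x = nm)

# Literal mode: only the dots are deleted.
kept <- gsub(pattern = ".", replacement = "", x = nm, fixed = TRUE)

print(wiped)  # "" "" ""
print(kept)   # "abcdef" "ewfasdfkl" "qqitvsfaddwcoil"
```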
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | r.bot |
Solution 3 | |
Solution 4 | dare_devils |