Editing a lot of variable names in an R data frame

Someone I work with asked how to easily update lots of variable names in an R data frame after importing a CSV file. Apparently the column headers in the CSV file are long and unwieldy, and opening the CSV file beforehand to edit the headers by hand is not desirable. So what to do? Well, you’re going to have to type the new variable names at least once. There’s no getting around that. But it would be nice to type them only once and never again. This is useful if you need to import the same CSV file over and over as it gets updated over time, such as a quarterly report or a weekly data dump. In that case we can do something like the following:

# read in data set and save variable names into a vector
ds <- read.csv("dataset.csv", header = TRUE)
old_names <- names(ds)

# write variable names to text file
writeLines(old_names, "names.txt")

We imported the CSV file, saved the variable names to a vector, and then wrote that vector to a text file with each name on its own line. Now we can open the text file in an editor and edit the variable names. I find this easier than doing it in the context of R code. No need for quotes, functions, assignment operators, etc. Keep one name per line and leave the lines in their original order, because the names are matched to columns by position. When you’re done, save and close your text file. Now you’re ready to change your variable names:

names(ds) <- readLines("names.txt")
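Hand editing can go wrong in small ways: a deleted line, a blank line, or the same name typed twice. Below is a minimal sketch of a check to run before the assignment. The toy data frame, its column names, and the temporary file are made up for illustration and stand in for the imported CSV and names.txt.

```r
# Toy stand-in for the imported data; these column names are invented
ds <- data.frame(Long.Unwieldy.Name.One = 1:3,
                 Long.Unwieldy.Name.Two = c("a", "b", "c"))

# A temporary file plays the role of the hand-edited names.txt
names_file <- tempfile(fileext = ".txt")
writeLines(c("id", "grp"), names_file)

new_names <- readLines(names_file)

# Stop early if the edited file no longer lines up with the columns:
# one name per column, no blank lines, no duplicates
stopifnot(length(new_names) == ncol(ds),
          all(nzchar(new_names)),
          !anyDuplicated(new_names))

names(ds) <- new_names
```

If any condition fails, stopifnot() halts with an error before the names are touched, so a bad edit to the text file is caught at import time rather than later in the analysis.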

Done! The next time you need to import the same CSV file, you can just run these two lines:

ds <- read.csv("dataset.csv", header = TRUE)
names(ds) <- readLines("names.txt")
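Because the names are applied by position, this only works while the CSV keeps the same columns in the same order. If the file gets updated over time, one possible safeguard is to keep a copy of the original header and compare against it on each import. The sketch below assumes a second text file holding the original names, which is not part of the workflow above; temporary files and invented column names stand in for dataset.csv and the two names files.

```r
# Temporary files stand in for dataset.csv and the two names files
csv_file <- tempfile(fileext = ".csv")
old_file <- tempfile(fileext = ".txt")  # hypothetical: header as first imported
new_file <- tempfile(fileext = ".txt")  # the hand-edited names

# Write a small example CSV with invented long headers
write.csv(data.frame(Quarterly.Revenue.USD = c(10, 20),
                     Region.Of.Sale = c("N", "S")),
          csv_file, row.names = FALSE)

# First import: record the original header, then "edit" a copy of it
ds <- read.csv(csv_file, header = TRUE)
writeLines(names(ds), old_file)
writeLines(c("revenue", "region"), new_file)

# Later import: rename only if the header is unchanged
ds <- read.csv(csv_file, header = TRUE)
if (!identical(names(ds), readLines(old_file))) {
  stop("CSV header has changed; review the names file before renaming")
}
names(ds) <- readLines(new_file)
```

The comparison with identical() fails on an added, dropped, renamed, or reordered column, which are the cases where positional renaming would attach the wrong name to a column.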
