How to read data when some numbers contain commas as a thousands separator?
Question
I have a CSV file where some of the numerical values are expressed as strings with commas as the thousands separator, e.g. "1,513" instead of 1513. What is the simplest way to read the data into R?
I can use read.csv(..., colClasses="character"), but then I have to strip the commas from the relevant elements before converting those columns to numeric, and I can't find a neat way to do that.
Recommended answer
I want to use R rather than pre-processing the data, as that makes it easier when the data are revised. Following Shane's suggestion of using gsub, I think this is about as neat as I can do:
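# Read every column as character so values like "1,513" are kept intact
x <- read.csv("file.csv", header = TRUE, colClasses = "character")
# Columns that hold comma-formatted numbers (adjust to your data)
col2cvt <- 15:41
# Strip the commas, then convert each selected column to numeric
x[, col2cvt] <- lapply(x[, col2cvt], function(v) as.numeric(gsub(",", "", v)))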
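To try the same approach without the original file, here is a minimal self-contained sketch. The inline sample data, the column names, and the helper name strip_commas are all made up for illustration; only read.csv, gsub, lapply and as.numeric come from base R.

```r
# Hypothetical sample data standing in for file.csv
csv_text <- 'id,sales,region
1,"1,513",north
2,"987",south
3,"12,004,500",east'

# Read everything as character so "1,513" survives as text
df <- read.csv(text = csv_text, header = TRUE, colClasses = "character")

# Hypothetical helper: drop thousands separators, then convert to numeric
strip_commas <- function(v) as.numeric(gsub(",", "", v, fixed = TRUE))

# Columns assumed to hold numbers (chosen by name here instead of by position)
cols <- c("id", "sales")
df[cols] <- lapply(df[cols], strip_commas)

str(df)
```

Selecting columns by name rather than by index (15:41 in the answer) is just a design choice for the sketch; it should keep working if columns are reordered when the data are revised. Any value that is still not a valid number after the commas are removed becomes NA with a warning, which is worth checking for.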