How to read data when some numbers contain commas as thousand separator?
Problem description
I have a csv file where some of the numerical values are expressed as strings with commas as thousand separator, e.g. "1,513" instead of 1513. What is the simplest way to read the data into R?
I can use read.csv(..., colClasses="character"), but then I have to strip out the commas from the relevant elements before converting those columns to numeric, and I can't find a neat way to do that.
Recommended answer
I want to use R rather than pre-processing the data, as that makes it easier when the data are revised. Following Shane's suggestion of using gsub, I think this is about as neat as I can make it:
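# Read every column as character so values such as "1,513" survive intact.
x <- read.csv("file.csv", header = TRUE, colClasses = "character")
# Columns that hold comma-formatted numbers.
col2cvt <- 15:41
# Strip the commas and convert each selected column to numeric.
x[, col2cvt] <- lapply(x[, col2cvt], function(col) as.numeric(gsub(",", "", col)))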
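For anyone who wants to try the approach without the original file, here is a minimal self-contained sketch of the same idea. The inline data, the column names (id, sales, units) and the choice to select columns by name rather than by position are all assumptions made for illustration; they are not from the original question.

```r
# Hypothetical sample data standing in for file.csv. The comma-formatted
# numbers are quoted so their commas are not treated as field separators.
csv_text <- 'id,sales,units
a,"1,513",7
b,"12,000",15
c,"950",3'

# Read every column as character so nothing is coerced prematurely.
df <- read.csv(text = csv_text, header = TRUE, colClasses = "character")

# Columns to convert (illustrative names; the answer above uses 15:41).
cols_to_convert <- c("sales", "units")

# Remove the thousands separators, then convert each column to numeric.
df[cols_to_convert] <- lapply(df[cols_to_convert], function(col) {
  as.numeric(gsub(",", "", col, fixed = TRUE))
})

str(df)
```

Selecting by name is only a convenience here; positional indices such as 15:41 work the same way. Using fixed = TRUE treats the comma as a literal string rather than a regular expression, which makes no difference for a plain comma but states the intent more clearly.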