Read csv file in R with currency column as numeric
Problem description
I'm trying to read into R a csv file that contains information on political contributions. From what I understand, the columns are imported as factors by default, but I need the amount column ('CTRIB_AMT' in the dataset) to be imported as a numeric column so I can run a variety of functions that won't work on factors. The column is formatted as currency, with a "$" prefix.
I used a simple read command to import the file initially:
contribs <- read.csv('path/to/file')
And then tried to convert the CTRIB_AMT from currency to numeric:
as.numeric(as.character(sub("$","",contribs$CTRIB_AMT, fixed=TRUE)))
But that didn't work. The functions I'm trying to use for the CTRIB_AMT columns are:
vals<-sort(unique(dfr$CTRIB_AMT))
sums<-tapply( dfr$CTRIB_AMT, dfr$CTRIB_AMT, sum)
counts<-tapply( dfr$CTRIB_AMT, dfr$CTRIB_AMT, length)
See the related question here.
Any thoughts on how to import the file so that the column is numeric from the start, or how to convert it after importing?
Recommended answer
I'm not sure how to read it in directly, but you can modify it once it's in:
> A <- read.csv("~/Desktop/data.csv")
> A
id desc price
1 0 apple $1.00
2 1 banana $2.25
3 2 grapes $1.97
> A$price <- as.numeric(sub("\\$","", A$price))
> A
id desc price
1 0 apple 1.00
2 1 banana 2.25
3 2 grapes 1.97
> str(A)
'data.frame': 3 obs. of 3 variables:
$ id : int 0 1 2
$ desc : Factor w/ 3 levels "apple","banana",..: 1 2 3
$ price: num 1 2.25 1.97
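The same conversion can be sketched as a small reusable helper. This is only an illustration under assumptions: the inline sample data and the helper name to_numeric_currency are made up here, and stripping commas assumes they appear only as thousands separators.

```r
# Hypothetical sample data standing in for the real file.
csv_text <- 'id,desc,price
0,apple,$1.00
1,banana,"$1,250.50"
2,grapes,$1.97'

A <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# to_numeric_currency is a made-up helper name, not part of base R.
to_numeric_currency <- function(x) {
  # as.character() guards against the column having been read as a factor.
  # Inside a bracket expression, "$" is a literal character, so no escape is needed.
  as.numeric(gsub("[$,]", "", as.character(x)))
}

A$price <- to_numeric_currency(A$price)
str(A)
```

Using gsub rather than sub removes every matching character, which matters once a value contains both a dollar sign and a comma.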
I think it might just have been a missing escape in your sub. In a regular expression, $ matches the end of a line, while \$ matches a literal dollar sign. Inside an R string the backslash itself has to be escaped as well, so the pattern is written "\\$".
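The difference between the patterns can be checked on a small vector. This is a minimal sketch; the vector x and the variable names are made up for illustration.

```r
# Made-up example values.
x <- c("$1.00", "$2.25", "$1.97")

# Unescaped: "$" is the end-of-string anchor, so nothing visible is removed.
unescaped <- sub("$", "", x)

# Escaped: "\\$" in R source is the regex \$, a literal dollar sign.
escaped <- sub("\\$", "", x)

# fixed = TRUE turns off regex interpretation, so "$" is matched literally.
literal <- sub("$", "", x, fixed = TRUE)

print(unescaped)
print(as.numeric(escaped))
print(as.numeric(literal))
```

Since fixed = TRUE also matches the dollar sign literally, a call of that form that still produces NA values may point to something else in the data, such as thousands separators, or to the result not being assigned back to the column.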