How to read a CSV file in R where some values contain the percent symbol (%)


Problem description


Is there a clean/automatic way to convert CSV values formatted as percentages (with a trailing % symbol) in R?

Here is some example data:

actual,simulated,percent error
2.1496,8.6066,-300%
0.9170,8.0266,-775%
7.9406,0.2152,97%
4.9637,3.5237,29%

Which can be read using:

junk = read.csv("Example.csv")

But all of the % columns are read as strings and converted to factors:

> str(junk)
 'data.frame':  4 obs. of  3 variables:
 $ actual       : num  2.15 0.917 7.941 4.964
 $ simulated    : num  8.607 8.027 0.215 3.524
 $ percent.error: Factor w/ 4 levels "-300%","-775%",..: 1 2 4 3

But I would like them to be numeric values.

Is there an additional parameter for read.csv? Is there a way to easily post process the needed columns to convert to numeric values? Other solutions?

Note: of course in this example I could simply recompute the values, but in my real application with a larger data file this is not practical.

Solution

There is no "percentage" type in R. So you need to do some post-processing:

DF <- read.table(text="actual,simulated,percent error
2.1496,8.6066,-300%
0.9170,8.0266,-775%
7.9406,0.2152,97%
4.9637,3.5237,29%", sep=",", header=TRUE)

# Strip the "%" sign, convert to numeric, and rescale to a fraction
DF[,3] <- as.numeric(gsub("%", "", DF[,3]))/100

#  actual simulated percent.error
#1 2.1496    8.6066         -3.00
#2 0.9170    8.0266         -7.75
#3 7.9406    0.2152          0.97
#4 4.9637    3.5237          0.29
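
For a larger file with several percent columns, the same post-processing could be applied column by column instead of by position. The following is only a minimal sketch, assuming that any column whose non-missing values all end in % should be converted to a fraction; `percent_to_numeric` is a hypothetical helper name, not part of base R or of the answer above.

```r
# Hypothetical helper (an assumption, not from the original answer):
# convert every column whose non-missing values all end in "%"
# to numeric fractions, leaving all other columns untouched.
percent_to_numeric <- function(df) {
  for (col in names(df)) {
    # as.character() handles both factor and character columns
    values <- as.character(df[[col]])
    present <- !is.na(values)
    if (any(present) && all(grepl("%\\s*$", values[present]))) {
      df[[col]] <- as.numeric(sub("%\\s*$", "", values)) / 100
    }
  }
  df
}

# Same sample data as in the question
DF <- read.table(text = "actual,simulated,percent error
2.1496,8.6066,-300%
0.9170,8.0266,-775%
7.9406,0.2152,97%
4.9637,3.5237,29%", sep = ",", header = TRUE, stringsAsFactors = FALSE)

DF <- percent_to_numeric(DF)
str(DF)
```

Detecting the columns by content rather than by index means the call does not need to change when the file gains or loses columns, at the cost of scanning every column once.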
