Handling NAs in the aggregate function in R


Question

I am trying to get the daily sum from a CSV file using the aggregate function, but I am encountering the following error:

Error in Summary.factor(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), na.rm = FALSE) : ‘sum’ not meaningful for factors
Calls: aggregate ... aggregate.data.frame -> lapply -> FUN -> lapply ->          Summary.factor 
Execution halted

Here is the link to the data.

Here is my code:

dat<-read.csv("Laoag_tc_induced.csv",header=TRUE,sep=",")
dat[dat == -999] <- NA
dat[dat == -888] <- 0
dat$Date <- as.Date(strptime(dat$key, '%Y_%m_%d_%H'))

df <- data.frame(dat$Date,dat$RR,dat$dist)
df <- aggregate(RR ~ Date, dat,sum)

names(df)[1] <- "Date"
names(df)[2] <- "Rain"

write.table(df,file="test.csv",sep=",")

I tried using:

df <- aggregate(RR ~ Date, dat,sum,na.rm=TRUE)

df <- aggregate(RR ~ Date,dat,sum,na.rm=TRUE,na.action=na.pass)

The error is still the same:

‘sum’ not meaningful for factors

Recommended answer

Some elements in the 'RR' column are "NA" padded with whitespace (e.g. "   NA"). These are not recognized as missing values, which changes the class of the column to factor. Read the file with stringsAsFactors = FALSE, and specify the padded NA string in na.strings so that it is read as NA:

dat <- read.csv(file, header = TRUE, stringsAsFactors = FALSE, 
          na.strings = "   NA", strip.white = TRUE)
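If re-reading the file is not convenient, one possible alternative (not part of the original answer) is to check the column class and convert it in place. The sketch below uses a small made-up data frame in place of the OP's CSV; only the column names key and RR come from the question.

```r
# Made-up data mimicking the problem: RR ended up as a factor
# because one entry is the padded string "   NA"
dat <- data.frame(
  key = c("1994_8_9_0", "1994_8_9_6", "1994_8_10_0", "1994_8_10_6"),
  RR  = factor(c("0.0", "   NA", "0.3", "1.2")),
  stringsAsFactors = FALSE
)

class(dat$RR)  # "factor", which is why sum() is rejected

# Go through character first: as.numeric() applied directly to a factor
# would return the internal level codes, not the values
dat$RR <- as.numeric(trimws(as.character(dat$RR)))

# Same date conversion as in the question
dat$Date <- as.Date(strptime(dat$key, "%Y_%m_%d_%H"))

# RR is numeric now, so the daily sum works
res <- aggregate(RR ~ Date, dat, sum)
res
```

Running str(dat) right after read.csv is a quick way to spot columns that were read as factor or character when numbers were expected.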

After applying the OP's transformations/replacements,

res <- aggregate(RR ~ Date, dat,sum)
head(res, 5)
#        Date  RR
#1 1994-08-09 0.0
#2 1994-08-10 0.0
#3 1994-08-11 0.0
#4 1994-08-12 0.3
#5 1994-08-13 0.0
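Once RR is numeric, the two variants the OP tried differ in how they treat dates whose values are all NA. A minimal sketch with made-up data (not the OP's file) illustrates the difference:

```r
# Made-up data: every RR value on 1994-08-10 is missing
dat <- data.frame(
  Date = as.Date(c("1994-08-09", "1994-08-09", "1994-08-10", "1994-08-10")),
  RR   = c(0.3, NA, NA, NA)
)

# The default na.action = na.omit drops NA rows before grouping,
# so 1994-08-10 does not appear in the result at all
res_omit <- aggregate(RR ~ Date, dat, sum)

# na.pass keeps the NA rows; na.rm = TRUE is then passed on to sum(),
# so 1994-08-10 is kept with a sum of 0
res_pass <- aggregate(RR ~ Date, dat, sum, na.rm = TRUE, na.action = na.pass)

res_omit
res_pass
```

Which behavior is preferable depends on whether a day with no valid observations should be reported as 0 or left out of the output.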

The OP stated that the dates are getting changed, but based on the data provided the conversion works fine:

dat[78:81,]
#   X.1          key     SN CY     Lat.x    Lon.x     X   RR     Lat.y    Lon.y     dist       Date
#78  78  1994_8_19_0 199419 19 0.3700098 2.230531 49133 28.8 0.3176499 2.104727 824.8680 1994-08-19
#79  79  1994_8_19_6 199419 19 0.3787364 2.214823 49134 28.8 0.3176499 2.104727 765.4631 1994-08-19
#80  80 1994_8_19_12 199419 19 0.3857178 2.200860 49135 28.8 0.3176499 2.104727 720.0335 1994-08-19
#81  81 1994_8_19_18 199419 19 0.3926991 2.190388 49136 28.8 0.3176499 2.104727 700.1729 1994-08-19

This is the same as in the CSV data.
