mean( ,na.rm=TRUE) still returns NA

Question

I'm very new to R (moving over from SPSS). I'm using RStudio on a Mac running Mavericks. Please answer my question in words of 2 syllables as this is my first real attempt at anything like this. I've worked through some basic tutorials and can make things work on all the sample data.

I have a data set with 64,000-ish rows and about 20 columns. I want to get the mean of the variable "hold_time", but whatever I try I get either NA, or NA and a warning message.

I have tried all of the following:

> summary(data_Apr_Jun$hold_time,na.rm=TRUE)
      5       6       7       4       8       2       1       3      10 
   9596    9191    3192    1346    1145     977     940     655     534 
     11       9      12       0      13      15      14      16      17 
    490     444     249     128     106      86      73      68      40 
     98     118     121     128     125      97     101     188      86 
     31      29      28      28      27      27      26      26      26 
    102     105     113      81     119     139     127     134     152 
     25      25      25      25      24      24      23      23      23 
     18      69      96     106     110     111     120     190      76 
     23      23      23      22      22      22      22      22      22 
     82     132     135     156     166      94     115     116     117 
     22      21      21      21      21      21      20      20      20 
    142     153     165      19      93     100     104     112     126 
     20      20      20      20      20      19      19      19      19 
    131     138     143     157     177     189      61      87     103 
     19      19      19      19      19      19      19      19      18 
    108     148     176     212      54      56      64      74      79 
     18      18      18      18      18      18      18      18      18 
     99     107     129     163     168     171     178     226     236 
     18      17      17      17      17      17      17      17      17 
     59      71      78      95     114     122     123     130 (Other) 
     17      17      17      17      16      16      16      16    2739 
   NA's 
  29807 
> mean(as.numeric(data_Apr_Jun$hold_time,NA.rm=TRUE))
[1] NA
> data_Apr_Jun$hold_time[data_Apr_Jun$hold_time=="NA"]<-0
> mean(as.numeric(data_Apr_Jun$hold_time))
[1] NA
> mean(data_Apr_Jun$hold_time)
[1] NA
Warning message:
In mean.default(data_Apr_Jun$hold_time) :
  argument is not numeric or logical: returning NA
> mean(as.numeric(data_Apr_Jun$hold_time,na.rm=TRUE))
[1] NA
> colMeans(data_Apr_Jun$hold_time)
Error in colMeans(data_Apr_Jun$hold_time) : 
  'x' must be an array of at least two dimensions
> colMeans(data_Apr_Jun)
Error in colMeans(data_Apr_Jun) : 'x' must be numeric
> mean(data_Apr_Jun$hold_time,na.omit)
[1] NA
Warning message:
In mean.default(data_Apr_Jun$hold_time, na.omit) :
  argument is not numeric or logical: returning NA

So even though I am removing the NAs they don't seem to be being removed. I am flummoxed.

Recommended answer

Hello Rnovice, unfortunately there are several errors. Let's resolve them one by one:

> mean(as.numeric(data_Apr_Jun$hold_time,NA.rm=TRUE))
[1] NA

This is because you use na.rm in the wrong way. It should be:

mean(as.numeric(data_Apr_Jun$hold_time),na.rm=TRUE)

  1. na.rm is an argument of mean, not of as.numeric (watch the brackets).
  2. The argument is spelled na.rm, not NA.rm: R is case sensitive.
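The two points above can be tried on a small vector. This is only a sketch: `hold_toy` and `hold_factor` are made-up stand-ins, not the real `data_Apr_Jun$hold_time`. The second half rests on an assumption: the `summary()` output in the question looks like the per-level counts of a factor, and if the column really is a factor, `as.numeric()` alone returns the internal level codes rather than the hold times, so converting through `as.character()` first is probably what is wanted.

```r
# Hypothetical stand-in for the real column (values are invented)
hold_toy <- c(5, 6, NA, 7, 4, NA)

# na.rm placed inside as.numeric(): it is silently ignored there,
# so the NAs still reach mean() and the result is NA
wrong_place <- mean(as.numeric(hold_toy, na.rm = TRUE))

# na.rm passed to mean(): the NAs are dropped before averaging
right_place <- mean(as.numeric(hold_toy), na.rm = TRUE)   # 5.5

# Assumption: the real column is a factor, as the summary() output suggests
hold_factor <- factor(c("5", "10", NA, "5", "0"))

# as.numeric() on a factor gives level codes, not the original numbers
mean_of_codes <- mean(as.numeric(hold_factor), na.rm = TRUE)

# Convert to character first to recover the actual values
mean_of_values <- mean(as.numeric(as.character(hold_factor)), na.rm = TRUE)   # 5
```

Running `class(data_Apr_Jun$hold_time)` on the real data would show whether the factor assumption holds.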

====================================================================================

> data_Apr_Jun$hold_time[data_Apr_Jun$hold_time=="NA"]<-0

R does not allow comparison with NA using ==, because the result of such a comparison is itself NA, as I pointed out here: Something weird about returning NAs
What you mean is

data_Apr_Jun$hold_time[which(is.na(data_Apr_Jun$hold_time))] <- 0

One more remark: =="NA" compares with the string "NA". Try is.na("NA") and is.na(NA) to see the difference.
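A short sketch of the difference, using an invented vector `na_toy` rather than the real data:

```r
# Hypothetical numeric vector with one missing value
na_toy <- c(3, NA, 8)

# Comparing with == never yields TRUE for a missing value;
# every element of this result is NA
cmp_result <- na_toy == NA

# is.na() is the test for missing values
missing_flags <- is.na(na_toy)     # FALSE TRUE FALSE

# The string "NA" is ordinary text, not a missing value
string_is_na <- is.na("NA")        # FALSE
real_is_na <- is.na(NA)            # TRUE

# Replace the missing values, indexing with is.na()
na_filled <- na_toy
na_filled[is.na(na_filled)] <- 0   # 3 0 8
```

Note that filling with 0 changes the mean (the zeros are counted), whereas `na.rm = TRUE` leaves the missing values out of the calculation.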

====================================================================================

colMeans(data_Apr_Jun$hold_time)
Error in colMeans(data_Apr_Jun$hold_time) : 
  'x' must be an array of at least two dimensions

Try data_Apr_Jun$hold_time and you will see that it returns a vector. This is why a column-wise mean (computed by colMeans) makes no sense.
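To illustrate, here is a sketch on a tiny invented data frame (`toy_df` and its columns are hypothetical, not the real data). Selecting with `$` gives a plain vector with no dimensions, while selecting with `[` and a column name keeps a one-column data frame that colMeans can handle:

```r
# Hypothetical two-column data frame
toy_df <- data.frame(hold_time = c(5, 6, NA, 7),
                     wait_time = c(1, 2, 3, 4))

# $ extracts a vector: it has no dimensions, so colMeans() rejects it
vec_dims <- dim(toy_df$hold_time)          # NULL

# Single-bracket selection keeps a one-column data frame
one_col_mean <- colMeans(toy_df["hold_time"], na.rm = TRUE)

# With only numeric columns, colMeans() also works on the whole data frame
all_col_means <- colMeans(toy_df, na.rm = TRUE)
```

For a single column, `mean(toy_df$hold_time, na.rm = TRUE)` is the simpler call; colMeans is mainly useful for several numeric columns at once.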

I hope the rest is understandable and solvable with these hints. One very important thing that you have already realized:
Use R! You are on the right track!
