R: NA on mean for a numerical variable


Problem description

I am trying to compute some values per cellular carrier. I have a main data frame that contains data from all carriers, and I have created three individual data frames from it, one per provider:

verizondf <- maindata[maindata$network == "Verizon", ]
attdf <- maindata[maindata$network == "ATT", ]
tmobiledf <- maindata[maindata$network == "TMobile", ]

I want to get the average of one of the variables, "download", which is numeric.

On the verizondf data frame, it works fine:

> mean(verizondf$download)
[1] 462004.4

For the other two, I get NA:

> mean(attdf$download)
[1] NA

I wondered whether the data type had changed at some point, but I checked and it is still numeric:

> str(attdf$download)
 num [1:5516] 321585 50722 400085 287968 138301 ...

What could be causing this issue?

Recommended answer

Others have already pointed this out in the comments; here is a fuller explanation.

When you open the help page with ?mean, you get the description of the function, including this:

Usage

mean(x, ...)

## Default S3 method:
mean(x, trim = 0, na.rm = FALSE, ...)

Under the "Arguments" section, you will see this:

na.rm
a logical value indicating whether NA values should be stripped before the computation proceeds.

This tells you that, by default (na.rm = FALSE), mean does not strip out NAs, so the result is NA whenever the data contain at least one NA. Note that str only prints the first few values, so NAs further down the column are easy to miss.
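To confirm that missing values are the cause, you can count them directly. The following is a minimal sketch; the vector `download` is a made-up stand-in for attdf$download, and its values are invented for illustration only.

```r
# Hypothetical stand-in for attdf$download; the values are invented.
download <- c(321585, 50722, NA, 287968, 138301)

has_na   <- anyNA(download)          # TRUE if at least one NA is present
na_count <- sum(is.na(download))     # how many values are NA
na_pos   <- which(is.na(download))   # positions of the NA values
plain    <- mean(download)           # NA, because na.rm defaults to FALSE

print(has_na)
print(na_count)
print(na_pos)
print(plain)
```

If the count is greater than zero for attdf$download, the NA result from mean is expected behaviour rather than a type problem.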

If you want a numeric mean even though the data contain NA values (and ignoring the NAs is acceptable for your data, which is not always the case), call mean with the argument na.rm = TRUE.
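As an illustration, here is a small sketch of the call with na.rm = TRUE. The data frame below is a hypothetical miniature of maindata with invented values; only the column names network and download come from the question. The tapply call at the end is one possible way to get all per-carrier means at once, not something the original answer prescribes.

```r
# Hypothetical miniature of maindata; all values are invented.
maindata <- data.frame(
  network  = c("Verizon", "Verizon", "ATT", "ATT", "TMobile", "TMobile"),
  download = c(400000, 500000, 321585, NA, NA, 138301)
)

attdf <- maindata[maindata$network == "ATT", ]

att_default <- mean(attdf$download)                # NA: one value is missing
att_clean   <- mean(attdf$download, na.rm = TRUE)  # mean of the non-missing values

# Per-carrier means in one call; na.rm = TRUE is passed through to mean().
carrier_means <- tapply(maindata$download, maindata$network, mean, na.rm = TRUE)

print(att_default)
print(att_clean)
print(carrier_means)
```

Keep in mind that na.rm = TRUE silently drops the missing rows, so it is worth checking how many values were excluded before reporting the result.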
