Range on a field containing NAs
Question
I'm using a data set where the 11th column of a CSV file has numeric data. It contains some NA values too. Here is the str of the object:
str(dataheart)
num [1:4706] 14.3 18.5 18.1 NA NA NA 17.7 18 15.9 NA ...
So, as a new student of R, I had expected the result of range(dataheart) to be the min and max values. From looking at the CSV file with the data, I know that the min and max are 10.1 and 21.9.
But the call above returns the vector
[1] NA NA
Is my understanding of this function incorrect?
Recommended answer
You need
range(x,na.rm=TRUE)
See ?range. By default na.rm is FALSE, so any NA in the input makes both the minimum and the maximum NA.
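As a small illustration, here is a sketch using only the first ten values shown in the str output above (the vector name x is just a stand-in for dataheart, and these ten values are not the full data set, so the result differs from the 10.1 and 21.9 expected for the whole column):

```r
# First ten values from the str() output in the question; x stands in for dataheart
x <- c(14.3, 18.5, 18.1, NA, NA, NA, 17.7, 18, 15.9, NA)

# With the default na.rm = FALSE, the NAs propagate into both results
r_default <- range(x)
print(r_default)   # NA NA

# With na.rm = TRUE, the NAs are dropped before min and max are computed
r_clean <- range(x, na.rm = TRUE)
print(r_clean)     # 14.3 18.5
```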
For extra credit, here's a list of the functions in the base and stats packages that use na.rm:
uses_na_rm <- function(x) is.function(fx <- get(x)) &&
"na.rm" %in% names(formals(fx))
basevals <- ls(pos="package:base")
basevals[sapply(basevals,uses_na_rm)]
## [1] "colMeans" "colSums"
## [3] "is.unsorted" "mean.default"
## [5] "pmax" "pmax.int"
## [7] "pmin" "pmin.int"
## [9] "range.default" "rowMeans"
## [11] "rowsum.data.frame" "rowsum.default"
## [13] "rowSums" "Summary.data.frame"
## [15] "Summary.Date" "Summary.difftime"
## [17] "Summary.factor" "Summary.numeric_version"
## [19] "Summary.ordered" "Summary.POSIXct"
## [21] "Summary.POSIXlt"
statvals <- ls(pos="package:stats")
statvals[sapply(statvals,uses_na_rm)]
## [1] "density.default" "fivenum" "heatmap" "IQR"
## [5] "mad" "median" "median.default" "medpolish"
## [9] "quantile.default" "sd" "var"
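One caveat about the search above: formals() returns NULL for primitive functions, so primitives such as sum, min, max and range itself are skipped even though they accept na.rm. Below is a minimal sketch of a variant that uses args() to recover a primitive's argument list. The helper name uses_na_rm_prim is hypothetical and is not part of the original answer:

```r
# Hypothetical variant of uses_na_rm that also handles primitive functions.
uses_na_rm_prim <- function(x) {
  fx <- get(x)
  if (!is.function(fx)) return(FALSE)
  # args() returns a closure carrying the argument list of a primitive
  # (or NULL for language constructs such as `if`)
  a <- if (is.primitive(fx)) args(fx) else fx
  !is.null(a) && "na.rm" %in% names(formals(a))
}

# formals() alone sees nothing for a primitive
no_formals <- is.null(formals(sum))

# Check a few well-known primitives
prim_check <- sapply(c("sum", "prod", "min", "max", "range"), uses_na_rm_prim)
print(prim_check)
```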
For further consideration of which functions in R deal with NAs and how, one could do an analogous search for functions with an na.action argument (lm and friends).
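A sketch of what that analogous search might look like for the stats package. The helper name uses_arg is hypothetical, and the exact list returned will depend on the R version:

```r
library(stats)  # attached by default; loaded here so "package:stats" is on the search path

# Hypothetical generalisation of uses_na_rm: does the function have a formal named `arg`?
uses_arg <- function(name, arg, pos) {
  fx <- get(name, pos = pos)
  is.function(fx) && arg %in% names(formals(fx))
}

statvals <- ls(pos = "package:stats")
has_na_action <- vapply(
  statvals,
  function(nm) uses_arg(nm, arg = "na.action", pos = "package:stats"),
  logical(1)
)
na_action_funs <- statvals[has_na_action]
print(na_action_funs)
```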