Error - replacement has [x] rows, data has [y]
Problem description
I have a numeric column ("value") in a dataframe ("df"), and I would like to generate a new column ("valueBin") based on "value". I have the following conditional code to define df$valueBin:
df$valueBin[which(df$value<=250)] <- "<=250"
df$valueBin[which(df$value>250 & df$value<=500)] <- "250-500"
df$valueBin[which(df$value>500 & df$value<=1000)] <- "500-1,000"
df$valueBin[which(df$value>1000 & df$value<=2000)] <- "1,000 - 2,000"
df$valueBin[which(df$value>2000)] <- ">2,000"
I receive the following error:
"Error in `$<-.data.frame`(`*tmp*`, "valueBin", value = c(NA, NA, NA, : replacement has 6530 rows, data has 6532"
Every element of df$value should fit into one of my which() statements, and there are no missing values in df$value. Yet even if I run just the first conditional statement (<=250), I get the exact same error, with "...replacement has 6530 rows...", although there are far fewer than 6530 records with value<=250, and value is never NA.
This SO link notes that a similar error when using aggregate() was a bug, but it recommends installing the version of R I already have. Plus, the bug report says it's fixed. R aggregate error: "replacement has <foo> rows, data has <bar>"
This SO link seems more related to my issue. There, the problem was the poster's conditional logic, which caused fewer elements of the replacement array to be generated. I guessed that must be my issue as well, and figured at first that I must have a "<=" instead of a "<" or vice versa, but after checking I'm pretty sure the conditions are all correct and cover every value of "value" without overlaps. R error in '[<-.data.frame'... replacement has # items, need #
Recommended answer
You can use cut:
df$valueBin <- cut(df$value, c(-Inf, 250, 500, 1000, 2000, Inf),
labels=c('<=250', '250-500', '500-1,000', '1,000-2,000', '>2,000'))
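By default cut() builds right-closed intervals (right = TRUE), so with these breaks a value of exactly 250 lands in '<=250' and 251 lands in '250-500', which matches the conditions in the question. A small check of the boundaries, using a made-up vector x that is not part of the original data:

```r
# Hypothetical boundary values, chosen only to illustrate the interval edges
x <- c(0, 250, 251, 500, 1000, 1001, 2000, 2001)

# Same breaks and labels as in the answer above
bins <- cut(x, c(-Inf, 250, 500, 1000, 2000, Inf),
            labels = c('<=250', '250-500', '500-1,000', '1,000-2,000', '>2,000'))

# cut() returns a factor; convert to character to inspect the labels
as.character(bins)
```

Note that the result is a factor whose levels follow the order of the breaks, which is usually convenient for tables and plots; wrap it in as.character() if a plain character column is preferred.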
Data
set.seed(24)
df <- data.frame(value= sample(0:2500, 100, replace=TRUE))
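As for why the original code fails, a likely explanation (assuming df$valueBin does not exist before the first assignment) is that df$valueBin is NULL at that point. An indexed assignment into NULL produces a vector only as long as the largest matching index, so if the last rows of the data frame do not satisfy the condition, the replacement is shorter than nrow(df). The sketch below reproduces this on a hypothetical toy data frame, df2, and shows that creating the column first avoids the error:

```r
# Hypothetical toy data: the last rows do not satisfy the condition
df2 <- data.frame(value = c(100, 300, 200, 900, 2500))

# df2$valueBin is NULL here, so the indexed assignment builds a vector of
# length 3 (the largest matching index), which cannot fill 5 rows
attempt <- try(df2$valueBin[which(df2$value <= 250)] <- "<=250", silent = TRUE)
failed <- inherits(attempt, "try-error")

# Creating the column first gives the assignment a full-length vector to index into
df2$valueBin <- NA_character_
df2$valueBin[which(df2$value <= 250)] <- "<=250"
df2$valueBin
```

This would be consistent with the message in the question: if the last row with value<=250 were row 6530 of 6532, the first statement alone would produce a replacement of 6530 rows.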