Error - replacement has [x] rows, data has [y]


Problem description

I have a numeric column ("value") in a data frame ("df"), and I would like to generate a new column ("valueBin") based on "value". I have the following conditional code to define df$valueBin:

df$valueBin[which(df$value<=250)] <- "<=250"
df$valueBin[which(df$value>250 & df$value<=500)] <- "250-500"
df$valueBin[which(df$value>500 & df$value<=1000)] <- "500-1,000"
df$valueBin[which(df$value>1000 & df$value<=2000)] <- "1,000 - 2,000"
df$valueBin[which(df$value>2000)] <- ">2,000"

I get the following error:

"Error in $<-.data.frame(*tmp*, "valueBin", value = c(NA, NA, NA, : replacement has 6530 rows, data has 6532"

Every element of df$value should fit into one of my which() statements, and there are no missing values in df$value. Yet even if I run just the first conditional statement (<=250), I get exactly the same error, "...replacement has 6530 rows...", although far fewer than 6530 records have value<=250, and value is never NA.

This SO link notes that a similar error when using aggregate() was a bug, but it recommends installing the version of R I already have. Plus, the bug report says it's fixed. Linked question: R aggregate error: "replacement has <foo> rows, data has <bar>"

This SO link seems more related to my issue. There, the problem was in the poster's conditional logic, which caused fewer elements of the replacement array to be generated. I guessed that must be my problem as well, and at first figured I must have a "<=" instead of a "<" or vice versa, but after checking I'm pretty sure the conditions are all correct and cover every value of "value" without overlaps. Linked question: R error in '[<-.data.frame'... replacement has # items, need #

Recommended answer

You can use cut:

 df$valueBin <- cut(df$value, c(-Inf, 250, 500, 1000, 2000, Inf), 
    labels=c('<=250', '250-500', '500-1,000', '1,000-2,000', '>2,000'))
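
To see why this reproduces the original conditions, it may help to check the boundary values. By default cut() uses right-closed intervals (right = TRUE), so 250 falls into '<=250' and 500 into '250-500', matching the "<=" comparisons in the question. The vector x below is made up for illustration and is not part of the original data:

```r
# Hypothetical boundary values, chosen only to illustrate the bin edges
x <- c(250, 251, 500, 1000, 2000, 2001)

# Same breaks and labels as in the answer; intervals are (lower, upper]
bins <- cut(x, c(-Inf, 250, 500, 1000, 2000, Inf),
            labels = c('<=250', '250-500', '500-1,000', '1,000-2,000', '>2,000'))

# cut() returns a factor; convert it if a character column is preferred
binsChr <- as.character(bins)
print(binsChr)
```

Note that the result is a factor with the levels in bin order, which is usually convenient for tables and plots. Wrap the call in as.character() if a plain character column is needed.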

Data

 set.seed(24)
 df <- data.frame(value= sample(0:2500, 100, replace=TRUE))
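
As for the error itself, a likely explanation (assuming df$valueBin did not exist before the first assignment) is that indexed assignment into a missing column starts from NULL and only grows the vector up to the largest index used. If the last rows of the data frame do not satisfy the condition, the new vector is shorter than nrow(df), and assigning it back fails. That would also explain the leading NAs in the error message and why the first statement alone triggers it. The small data frame below is made up to reproduce this; it is a sketch of the mechanism, not the asker's data:

```r
# Hypothetical 5-row data frame; the last two rows do not satisfy value <= 250
df <- data.frame(value = c(100, 300, 200, 700, 900))
idx <- which(df$value <= 250)   # rows 1 and 3

# Indexed assignment into NULL only extends to the largest index (3 here)
partial <- NULL
partial[idx] <- "<=250"
print(length(partial))

# The same thing happens inside df$valueBin[idx] <- ..., so the
# 3-element vector cannot be stored in a 5-row data frame
err <- tryCatch({
  df$valueBin[idx] <- "<=250"
  NULL
}, error = function(e) e)
print(inherits(err, "error"))

# Creating the column first avoids the length mismatch
df$valueBin <- NA_character_
df$valueBin[idx] <- "<=250"
print(df)
```

With the column initialized to NA_character_, the original five which() statements should run as written, although cut() remains the more compact option.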
