Create bins dynamically in a dataframe by using breaks & quantiles fails?


Question

Edit:

I made a mistake in the code I shared previously. I replaced "bins" with "b" but missed one...

I also use the correct data.frame now (y instead of the original df.score).

New code:

# some data
x <- runif(1000)
x2 <- rnorm(1000)
y <- data.frame(x,x2)
# we want to bin the dataframe y according to values in x into b bins
b = 10
bins=10

# we create breaks in several ways
breaks=unique(quantile(x, probs=seq.int(0,1, by=1/b)))
breaks=unique(quantile(y$x, probs=seq.int(0,1, length.out=b+1)))

# now to the question
# this works
y$b <- with(y, cut(x, breaks=unique(quantile(x, probs=seq.int(0,1, length.out=11))), include.lowest=TRUE))
table(y$b)
# this works too
y$b2 <- with(y, cut(x, breaks=unique(quantile(x, probs=seq.int(0,1, length.out=(bins+1)))), include.lowest=TRUE))
table(y$b2)
# this does not work
y$b3 <- with(y, cut(x, breaks=unique(quantile(x, probs=seq.int(0,1, length.out=(b+1)))), include.lowest=TRUE))

Error in seq.int(0, 1, length.out = (b + 1)) : 
  'length.out' must be a non-negative number
In addition: Warning message:
In Ops.factor(b, 1) : + not meaningful for factors

Now if I split the code up, there is no issue!

brks=unique(quantile(x, probs=seq.int(0,1, length.out=(b + 1))))
y$b3 <- with(y, cut(x, breaks=brks, include.lowest=TRUE))
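
A plausible reason the split version works is that brks is computed outside with(y, ...), so b is looked up in the workspace and not among the columns of y. The same idea can be wrapped in a small function. This is only a sketch: add_quantile_bins and its arguments (df, col, n_bins, out) are hypothetical names that do not appear in the original post.

```r
# Hypothetical helper (not from the original post): bins a column into
# quantile-based intervals without using with(), so the bin count can never
# be confused with a column of the data frame.
add_quantile_bins <- function(df, col, n_bins, out = "bin") {
  # n_bins is a function argument, so it always refers to the number passed in
  brks <- unique(quantile(df[[col]],
                          probs = seq.int(0, 1, length.out = n_bins + 1)))
  df[[out]] <- cut(df[[col]], breaks = brks, include.lowest = TRUE)
  df
}

# same kind of data as in the question
y <- data.frame(x = runif(1000), x2 = rnorm(1000))

y <- add_quantile_bins(y, "x", n_bins = 10, out = "b")
# a second call still works although y now has a factor column named b
y <- add_quantile_bins(y, "x", n_bins = 10, out = "b3")
table(y$b3)
```

Passing the column name as a string and indexing with df[[col]] keeps every lookup explicit, which matters when the code is generated dynamically.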

I am lost here...

This is part of more dynamic code, knitted together based on details in the data set.

So I want to create bins on the fly and report on them. The code works now, but I do not understand why it works when I use the name "bins" and fails when I use "b"...?

OLD from here:

I need to add bins dynamically to a dataframe so I can report on them later.

# some data
x <- runif(1000)
x2 <- rnorm(1000)
y <- data.frame(x,x2)
# we want to bin the dataframe y according to values in x into b bins
b = 10

# we create breaks in several ways
breaks=unique(quantile(x, probs=seq.int(0,1, by=1/b)))
breaks=unique(quantile(y$x, probs=seq.int(0,1, length.out=b+1)))

# now to the question
# this works

y$bins <- with(df.score, cut(x, breaks=unique(quantile(Pchurn, probs=seq.int(0,1, length.out=11))), include.lowest=TRUE))
table(y$bins)

So if I want to do exactly the same using the bins variable directly, it fails:

# this does not work
y$bins <- with(df.score, cut(x, breaks=unique(quantile(Pchurn, probs=seq.int(0,1, length.out=bins+1))), include.lowest=TRUE))


Error in seq.int(0, 1, length.out = (bins + 1)) : 
  'length.out' must be a non-negative number
In addition: Warning message:
In Ops.factor(bins, 1) : + not meaningful for factors

What am I missing here?

Recommended answer

I think you want this (substituting b for bins in the length.out calculation just below "# this does not work"):

y$bins <- with(df.score, cut(x, 
                    breaks=unique(quantile(Pchurn, 
                                         probs=seq.int(0,1, length.out=b+1))), 
                    include.lowest=TRUE))

Hard to test without a score variable and a more complete description of the goals, but at least the code does not throw an error with this in the workspace:

 df.score=data.frame(Pchurn=rnorm(100), x=rnorm(100))
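
The answer does not say why "b" fails where "bins" works. A likely explanation, consistent with the warning from Ops.factor(b, 1), is name masking: once the first call has created the factor column y$b, any later b inside with(y, ...) refers to that column and not to the number 10 in the workspace. The name bins keeps working in the edited code because y has no column with that name. The sketch below reproduces this under that assumption; set.seed and the name n_bins are additions that are not in the original post.

```r
set.seed(1)                      # added only to make the sketch reproducible
x  <- runif(1000)
x2 <- rnorm(1000)
y  <- data.frame(x, x2)
b  <- 10                         # number of bins, defined in the workspace

# First call: y has no column named b yet, so b + 1 uses the workspace value
y$b <- with(y, cut(x,
                   breaks = unique(quantile(x, probs = seq.int(0, 1, length.out = b + 1))),
                   include.lowest = TRUE))

class(b)                         # "numeric": the workspace variable
class(with(y, b))                # "factor": inside with(), b is now the column y$b

# Workaround (n_bins is an assumed name): use a name that is not a column of y
n_bins <- 10
y$b3 <- with(y, cut(x,
                    breaks = unique(quantile(x, probs = seq.int(0, 1, length.out = n_bins + 1))),
                    include.lowest = TRUE))
table(y$b3)
```

Computing the breaks before calling with(), as in the split version in the question, avoids the clash for the same reason.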
