How does cut with breaks work in R?


Question

I am trying to understand how cut divides data and creates intervals. I read ?cut but could not figure out how cut works in R.
Here is my problem:

set.seed(111)
data1 <- seq(1,10, by=1)
data1 
[1]  1  2  3  4  5  6  7  8  9 10
data1cut<- cut(data1, breaks = c(0,1,2,3,5,7,8,10), labels = FALSE)
data1cut
[1] 1 2 3 4 4 5 5 6 7 7

1. Why are 8, 9, and 10 not included in the data1cut result?
2. Why do summary(data1) and summary(data1cut) produce different results?

summary(data1)
Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
1.00    3.25    5.50    5.50    7.75   10.00 

summary(data1cut)
Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
1.00    3.25    4.50    4.40    5.75    7.00  

How should I use cut to create, say, 4 bins based on the results of summary(data1)?

bin1 [1, 3.25]
bin2 (3.25, 5.50]
bin3 (5.50, 7.75]
bin4 (7.75, 10]

Thanks.

Recommended answer

In your example, cut splits the vector into the following intervals: (0,1] (1); (1,2] (2); (2,3] (3); (3,5] (4); (5,7] (5); (7,8] (6); (8,10] (7)

The numbers in parentheses are the integer codes that cut assigns to each bin, in the order defined by the breaks values provided.

By default, cut builds intervals that are open on the left and closed on the right (right = TRUE), so a value equal to the lowest break falls outside every bin. To include it, set include.lowest = TRUE.
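To make the boundary rule concrete, here is a minimal sketch. The vectors x and b are made up for illustration and are not part of the question's data; the expected values in the comments follow from the default right = TRUE behaviour described above.

```r
# Hypothetical toy data: 0 sits exactly on the lowest break
x <- c(0, 1, 2, 3)
b <- c(0, 1, 3)

# Default (right = TRUE, include.lowest = FALSE): intervals (0,1] and (1,3]
# 0 is not inside any interval, so it becomes NA
default_codes <- cut(x, breaks = b, labels = FALSE)

# include.lowest = TRUE closes the first interval: [0,1] and (1,3]
closed_codes <- cut(x, breaks = b, labels = FALSE, include.lowest = TRUE)

print(default_codes)  # NA 1 2 2
print(closed_codes)   # 1 1 2 2
```

In the question's data the lowest break is 0 and the smallest value is 1, so nothing is lost there; the issue only appears once the lowest break equals the minimum, as in the quartile-based bins.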

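  1. You passed labels = FALSE, so cut returns an integer vector of level codes (the numbers in parentheses above) instead of a factor with interval labels. The values 8, 9, and 10 are not missing: data1cut holds bin codes rather than the original values, and those three values fall into bins 6, 7, and 7.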

  2. summary(data1) summarizes the raw data, whereas summary(data1cut) summarizes the integer bin codes (1 to 7) returned by cut, so the two results describe different things.
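One way to see both points is to put each original value next to the bin it was assigned to. The sketch below reuses the question's data and breaks; the names brks, codes, bins, and mapping are introduced here only for illustration.

```r
data1 <- seq(1, 10, by = 1)
brks  <- c(0, 1, 2, 3, 5, 7, 8, 10)

codes <- cut(data1, breaks = brks, labels = FALSE)  # integer bin codes
bins  <- cut(data1, breaks = brks)                  # factor with interval labels

# Pair each original value with the bin it landed in
mapping <- data.frame(value = data1, code = codes, bin = bins)
print(mapping)

# For binned data, counts per bin are usually more informative than summary()
print(table(bins))
```

The rows for 8, 9, and 10 show codes 6, 7, and 7, which are the last three entries of data1cut in the question.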

You can get the split you need using:

data2cut<- 
  cut(data1, breaks = c(1, 3.25, 5.50, 7.75, 10),
      labels = c("1-3.25", "3.25-5.50", "5.50-7.75", "7.75-10"),
      include.lowest = TRUE)

The result is:

> data2cut

 [1] 1-3.25    1-3.25    1-3.25    3.25-5.50 3.25-5.50 5.50-7.75 5.50-7.75 7.75-10   7.75-10  
[10] 7.75-10  
Levels: 1-3.25 3.25-5.50 5.50-7.75 7.75-10
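As a variation, the break points do not have to be typed in by hand. The sketch below assumes the goal is quartile bins and takes the breaks from quantile(), whose default probabilities give the minimum, the three quartiles, and the maximum. The names brks and quartile_bins are illustrative only.

```r
data1 <- seq(1, 10, by = 1)

# Default probs are 0%, 25%, 50%, 75%, 100%, matching Min/Qu./Median/Max in summary()
brks <- quantile(data1)

# include.lowest = TRUE keeps the minimum value in the first bin
quartile_bins <- cut(data1, breaks = brks, include.lowest = TRUE)

print(table(quartile_bins))
```

This assigns the ten values to the same four bins as the hand-written breaks, with labels generated by cut. One caveat: if the data contain many ties, two quantiles can coincide and cut will stop with an error because the breaks are not unique.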

I hope this is clear now.
