How do I split a data frame based on range of column values in R?
Question
I have a data set like this:
Users Age
1 2
2 7
3 10
4 3
5 8
6 20
How do I split this data set into 3 data sets where the first consists of all users with ages between 0–5, second is 6–10 and third is 11–15?
Recommended answer
You can combine split() with cut() to do this in a single line of code, which avoids writing a separate subsetting expression for each age range:
split(dat, cut(dat$Age, c(0, 5, 10, 15), include.lowest=TRUE))
# $`[0,5]`
# Users Age
# 1 1 2
# 4 4 3
#
# $`(5,10]`
# Users Age
# 2 2 7
# 3 3 10
# 5 5 8
#
# $`(10,15]`
# [1] Users Age
# <0 rows> (or 0-length row.names)
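The one-liner above assumes a data frame named dat that holds the question's data. The sketch below rebuilds it (the data.frame() call is an assumption based on the table in the question) and looks at the intermediate result of cut(). It also suggests why user 6 (age 20) appears in none of the pieces: a value outside the break points becomes NA, and split() drops rows whose group is NA.

```r
# Rebuild the example data from the question (assumed construction).
dat <- data.frame(Users = 1:6, Age = c(2, 7, 10, 3, 8, 20))

# cut() maps each age to an interval label; 20 lies outside [0, 15], so it becomes NA.
groups <- cut(dat$Age, c(0, 5, 10, 15), include.lowest = TRUE)
groups

# split() skips rows whose group is NA, so user 6 lands in none of the pieces.
l <- split(dat, groups)
sapply(l, nrow)
```

If out-of-range ages should be kept, the break points passed to cut() would need to cover them.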
cut splits up data based on the specified break points, and split splits up a data frame based on the provided categories. If you stored the result of this computation into a list called l, you could access the smaller data frames with l[[1]], l[[2]], and l[[3]], or the more verbose:
l$`[0,5]`
l$`(5,10]`
l$`(10,15]`
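Name-based access only works when the name matches the label generated by cut() exactly. A small sketch, again assuming dat is built from the question's table, that lists the labels and compares the two access styles:

```r
# Assumed reconstruction of the question's data and the split from the answer.
dat <- data.frame(Users = 1:6, Age = c(2, 7, 10, 3, 8, 20))
l <- split(dat, cut(dat$Age, c(0, 5, 10, 15), include.lowest = TRUE))

# The list names are the interval labels produced by cut(); they contain no spaces.
names(l)

# Positional and name-based access return the same data frame.
identical(l[[2]], l$`(5,10]`)

# A name that differs from the label (here, a space after the comma) returns NULL.
is.null(l$`(10, 15]`)
```

Checking names(l) first is a cheap way to avoid a silent NULL from a mistyped label.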