Subset Data Based On Elements In List

Views: 65 | Published: 2020/10/17 0:23:52 | Tags: r, dataframe, subset


Problem description

In R, I am trying to subset the data.frame named Data by using the elements stored in a list.

Data

Data <- read.table(text = "  Data_x  Data_y  Column_X 
                                -34      12       A
                                -36      20       D
                                -36      12       E
                                -34      18       F
                                -34      10       B
                                -35      24       A
                                -35      16       B
                                -33      22       B
                                -33      14       C
                                -35      22       D", header = TRUE)

Code

variableData <- list("A", "B")
subsetData_1 <- subset(Data, Column_X == variableData[1])
subsetData_2 <- subset(Data, Column_X == variableData[2])
subsetData <- rbind(subsetData_1, subsetData_2)

Question

First, the list can contain more than two elements and its length is not fixed. It can even have more than 100 elements.

Second, I want to keep only one data.frame that stores all the rows extracted using every element in the list. If there are more elements, let's say 100, I don't want to repeat subset() for each of them.

Is there a better way to approach this than the code above? My approach is not good enough and will take a performance hit.

Any suggestion will be helpful, thanks.

Recommended answer

Classic lapply.

x <- lapply(variableData, function(x){subset(Data, Column_X == x)})
x
# [[1]]
# Data_x Data_y Column_X
# 1    -34     12        A
# 6    -35     24        A
# 
# [[2]]
# Data_x Data_y Column_X
# 5    -34     10        B
# 7    -35     16        B
# 8    -33     22        B

This returns a list of all the subsets. To rbind all these list elements into one data.frame, just use

do.call(rbind, x)
#   Data_x Data_y Column_X
# 1    -34     12        A
# 6    -35     24        A
# 5    -34     10        B
# 7    -35     16        B
# 8    -33     22        B

However, as @Frank pointed out, you could use basic subsetting with [ and %in% instead:

Data[Data$Column_X %in% variableData,]
#   Data_x Data_y Column_X
# 1    -34     12        A
# 5    -34     10        B
# 6    -35     24        A
# 7    -35     16        B
# 8    -33     22        B
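
The %in% approach can be wrapped in a small helper so that the list may hold any number of values. This is only a sketch: the function name subset_by_values is hypothetical, and the unlist() call is an added step that flattens the list into a plain vector before matching.

```r
# Rebuild the example data so the snippet runs on its own.
Data <- data.frame(
  Data_x   = c(-34, -36, -36, -34, -34, -35, -35, -33, -33, -35),
  Data_y   = c(12, 20, 12, 18, 10, 24, 16, 22, 14, 22),
  Column_X = c("A", "D", "E", "F", "B", "A", "B", "B", "C", "D"),
  stringsAsFactors = FALSE
)

variableData <- list("A", "B")

# Hypothetical helper: keep the rows of `df` whose value in `column`
# appears anywhere in `values` (a list or a vector of any length).
subset_by_values <- function(df, column, values) {
  wanted <- unlist(values)                       # flatten the list to a vector
  df[df[[column]] %in% wanted, , drop = FALSE]   # one vectorised lookup
}

subsetData <- subset_by_values(Data, "Column_X", variableData)
```

Because the match is done in a single vectorised call, the amount of code does not grow with the number of elements in the list.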

Warning

"This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences." (?subset)
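
One way the non-standard evaluation mentioned in the warning can bite is when a variable in your workspace shares its name with a column. The sketch below is an illustration only; the variable names wanted, nse_result and std_result are made up for the example.

```r
# Rebuild the example data so the snippet runs on its own.
Data <- data.frame(
  Data_x   = c(-34, -36, -36, -34, -34, -35, -35, -33, -33, -35),
  Data_y   = c(12, 20, 12, 18, 10, 24, 16, 22, 14, 22),
  Column_X = c("A", "D", "E", "F", "B", "A", "B", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# A workspace variable that happens to share the column's name.
Column_X <- "A"

# Inside subset(), both sides of == resolve to the data frame column,
# so the condition is TRUE for every row and nothing is filtered out.
nse_result <- subset(Data, Column_X == Column_X)

# With [, every name is spelled out, so the comparison is unambiguous.
wanted <- "A"
std_result <- Data[Data$Column_X == wanted, ]

nrow(nse_result)  # 10: all rows kept
nrow(std_result)  # 2: only the "A" rows
```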

Furthermore, with [ and %in% the original order of your rows is kept.
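
The difference in row order can be checked directly by comparing the row names produced by the two approaches. This is a small sketch; the names by_group and in_place are chosen only for the example.

```r
# Rebuild the example data so the snippet runs on its own.
Data <- data.frame(
  Data_x   = c(-34, -36, -36, -34, -34, -35, -35, -33, -33, -35),
  Data_y   = c(12, 20, 12, 18, 10, 24, 16, 22, 14, 22),
  Column_X = c("A", "D", "E", "F", "B", "A", "B", "B", "C", "D"),
  stringsAsFactors = FALSE
)

variableData <- list("A", "B")

# One subset per list element, stacked: rows come out grouped by element.
by_group <- do.call(
  rbind,
  lapply(variableData, function(v) Data[Data$Column_X == v, ])
)

# A single %in% lookup: rows stay in their original order.
in_place <- Data[Data$Column_X %in% unlist(variableData), ]

rownames(by_group)  # "1" "6" "5" "7" "8"
rownames(in_place)  # "1" "5" "6" "7" "8"
```

Both results contain the same five rows; only the ordering differs.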
