How can I create a new column in a dataframe based on permutations of other columns?

Views: 72 · Posted: 2020/10/17 1:04:18 · Tags: r, dataframe

Problem description

Suppose I have a dataframe which looks like this:

    var1   var2   var3   var4  
a   TRUE   FALSE  TRUE   FALSE
b   TRUE   TRUE   TRUE   FALSE
c   FALSE  TRUE   FALSE  TRUE
d   TRUE   FALSE  FALSE  FALSE
e   TRUE   FALSE  TRUE   FALSE
f   FALSE  TRUE   FALSE  TRUE

I want to create a new column which assigns rows a to f to categories, based on which permutation of TRUE and FALSE each row has across the variables along the top.

In this simplified example, the result would look like:

    var1   var2   var3   var4    category
a   TRUE   FALSE  TRUE   FALSE      A
b   TRUE   TRUE   TRUE   FALSE      B
c   FALSE  TRUE   FALSE  TRUE       C
d   TRUE   FALSE  FALSE  FALSE      D
e   TRUE   FALSE  TRUE   FALSE      A
f   FALSE  TRUE   FALSE  TRUE       C

Notice that each unique permutation of TRUE and FALSE becomes a different category, and since a and e have the same permutation, they end up in the same category (A).

Is there an easy way to do this that also works when there is a large number of variables along the top, and when the dataframe is filled with categories or numbers rather than just TRUE and FALSE?

Recommended answer

You can do the following:

## paste the rows together, creating a character vector
x <- do.call(paste, df)
## match it against itself and apply to 'LETTERS', and assign as new column
df$category <- LETTERS[match(x, x)]
df
#    var1  var2  var3  var4 category
# a  TRUE FALSE  TRUE FALSE        A
# b  TRUE  TRUE  TRUE FALSE        B
# c FALSE  TRUE FALSE  TRUE        C
# d  TRUE FALSE FALSE FALSE        D
# e  TRUE FALSE  TRUE FALSE        A
# f FALSE  TRUE FALSE  TRUE        C
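
To make the `match(x, x)` step more concrete: it returns, for each row, the position of the first row with the same pasted string. The sketch below rebuilds the example data so that it runs on its own, then shows a small made-up vector `y` (not from the original question) where first-occurrence positions skip a number. In such a case `match(y, unique(y))` is one possible way to get consecutive ids instead; treat it as a suggestion, not as part of the original answer.

```r
# Example data from the question, rebuilt here so the block is self-contained
df <- data.frame(
  var1 = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  var2 = c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE),
  var3 = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE),
  var4 = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE),
  row.names = c("a", "b", "c", "d", "e", "f")
)

# One string per row, e.g. "TRUE FALSE TRUE FALSE" for row a
x <- do.call(paste, df)

# Position of the first row sharing the same string
first_seen <- match(x, x)

# Hypothetical vector where the second distinct value first appears at position 3
y <- c("p", "p", "q")
gappy_ids       <- match(y, y)          # first-occurrence positions
consecutive_ids <- match(y, unique(y))  # ids numbered in order of first appearance
```

With the question's data both approaches give the same ids, because every new permutation appears before any repeat. With `y`, `LETTERS[gappy_ids]` would jump from A to C, while `LETTERS[consecutive_ids]` would use A and B.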

The above code can be written as a one-liner if we use a named list as an environment. This avoids making any new assignments to the global environment.
df$category <- LETTERS[with(list(x = do.call(paste, df)), match(x, x))]
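
As a rough check of the claim that no `x` is left behind, the sketch below wraps the one-liner in a function and then asks whether a variable named `x` exists in that function's own environment. The function name `label_rows` and the `x_leaked` field are made up for this illustration.

```r
# Example data from the question, rebuilt here so the block is self-contained
df <- data.frame(
  var1 = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  var2 = c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE),
  var3 = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE),
  var4 = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE),
  row.names = c("a", "b", "c", "d", "e", "f")
)

# Hypothetical wrapper around the one-liner from the answer
label_rows <- function(d) {
  # with() evaluates match(x, x) using the temporary list as its data,
  # so `x` is only visible inside that expression
  d$category <- LETTERS[with(list(x = do.call(paste, d)), match(x, x))]
  # Check this function's own environment only (no lookup in parent environments)
  x_leaked <- exists("x", envir = environment(), inherits = FALSE)
  list(data = d, x_leaked = x_leaked)
}

res <- label_rows(df)
res$data
res$x_leaked
```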

Data:
df <- structure(list(var1 = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE), 
    var2 = c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE), var3 = c(TRUE, 
    TRUE, FALSE, FALSE, TRUE, FALSE), var4 = c(FALSE, FALSE, 
    TRUE, FALSE, FALSE, TRUE)), .Names = c("var1", "var2", "var3", 
"var4"), row.names = c("a", "b", "c", "d", "e", "f"), class = "data.frame")
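
On the last part of the question (many variables, and values that are categories or numbers rather than TRUE/FALSE), the same paste-and-match idea appears to carry over, since `paste()` converts each column to character. Below is a minimal sketch under some assumptions: the helper `combo_id`, the data frame `mixed`, and the separator choice are all made up for illustration and are not part of the original answer. It returns integer ids rather than letters, because `LETTERS` only has 26 entries and indexing past that gives `NA`.

```r
# Hypothetical helper: one consecutive integer id per distinct combination of values.
# Assumption: the separator character does not occur in the data itself.
combo_id <- function(d, sep = "\x1f") {
  # unname() so that a column called "sep" or "collapse" is not
  # mistaken for an argument of paste()
  key <- do.call(paste, c(unname(as.list(d)), sep = sep))
  match(key, unique(key))
}

# Made-up data frame mixing character, numeric, and logical columns
mixed <- data.frame(
  colour = c("red", "blue", "red", "blue"),
  size   = c(1, 2, 1, 3),
  flag   = c(TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

mixed$category <- combo_id(mixed)
mixed
```

Rows 1 and 3 share the same combination and so receive the same id. One caveat: numeric columns are compared through their character form, so two numbers that print identically are treated as the same value.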