Row-wise sort then concatenate across specific columns of data frame
Question
(There is a related question that does not include sorting. It's easy to just use paste when you don't need to sort.)
I have a less-than-ideally-structured table with character columns that have generic names: "item1", "item2", etc. I would like to create a new character variable that is the alphabetized, comma-separated concatenation of these columns. For example, in row 5, if item1 = "milk", item2 = "eggs", and item3 = "butter", the new variable in row 5 should be "butter, eggs, milk".
I wrote a function f() below that works on two character variables. However, I am having trouble with:
- Using mapply or other "vectorization" (I know it's really just a for loop)
- Generalizing the function to an arbitrary number of columns
Any help is much appreciated.
df <- data.frame(a =c("foo","bar"),
b= c("baz","qux"))
paste(df$a,df$b, sep=", ")
# returns [1] "foo, baz" "bar, qux" ... but I want [1] "baz, foo" "bar, qux"
f <- function(a,b) paste(c(a,b)[order(c(a,b))],collapse=", ")
f("foo","baz")
# returns [1] "baz, foo" ... which is what I want ... how to vectorize?
df$new_var <- mapply(f, df$a, df$b)
df
#     a   b new_var   <- new_var is not what I want
# 1 foo baz    1, 2
# 2 bar qux    1, 2
# Interestingly, data.table is smart enough to fix my bad mapply
library(data.table)
dt <- data.table(a =c("foo","bar"),
b= c("baz","qux"))
dt[,new_var:=mapply(f, a, b)]
dt
#      a   b  new_var   <- new_var IS what I want
# 1: foo baz baz, foo
# 2: bar qux bar, qux
Recommended answer
My first thought would've been to do this:
dt[, new_var := paste(sort(.SD), collapse = ", "), by = 1:nrow(dt)]
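For readers not using data.table, the same row-wise idea (sort the values in each row, then collapse them into one string) can be sketched in base R with apply(). This is a swapped-in alternative, not the code from this answer. The item1 to item3 columns and the cols vector are made-up example names, and stringsAsFactors = FALSE is set explicitly so the columns stay character.

```r
# Hypothetical example data, following the "item1", "item2", ... naming
# described in the question
df <- data.frame(item1 = c("milk", "bar"),
                 item2 = c("eggs", "qux"),
                 item3 = c("butter", "foo"),
                 stringsAsFactors = FALSE)

# Columns to combine (assumed names; adjust to the real table)
cols <- c("item1", "item2", "item3")

# apply(..., 1, ...) walks over rows: sort each row's values alphabetically,
# then collapse them into a single comma-separated string
df$new_var <- apply(df[cols], 1, function(x) paste(sort(x), collapse = ", "))

print(df)
```

Because apply() first converts the selected columns to a character matrix, this sketch fits best when all of those columns are character to begin with.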
But you could make your function work with a couple of simple modifications:
f = function(...) paste(c(...)[order(c(...))],collapse=", ")
dt[, new_var := do.call(function(...) mapply(f, ...), .SD)]
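A note on the "1, 2" result in the question: in R versions before 4.0, data.frame() converted strings to factors by default, so f() was most likely pasting the integer factor codes rather than the strings. Below is a minimal sketch, assuming plain character columns, of the variadic f() applied to a regular data frame. The cols vector is a hypothetical helper, and sort(x) is used in place of x[order(x)], which gives the same result when there are no missing values.

```r
# Variadic version of f(): accepts any number of values, sorts them,
# and collapses them into one comma-separated string
f <- function(...) paste(sort(c(...)), collapse = ", ")

# stringsAsFactors = FALSE keeps the columns as character on older R versions
df <- data.frame(a = c("foo", "bar"),
                 b = c("baz", "qux"),
                 stringsAsFactors = FALSE)

# Columns to combine (assumed names; any number of columns works)
cols <- c("a", "b")

# do.call() hands each selected column to mapply() as a separate argument;
# unname() keeps column names from being matched to mapply()'s own arguments
df$new_var <- do.call(
  mapply,
  c(list(FUN = f), unname(as.list(df[cols])), list(USE.NAMES = FALSE))
)

print(df)
```

Adding more column names to cols is enough to generalize this to an arbitrary number of columns.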