apply-strsplit-rowwise including sort and nested paste
Question
I suspect I am just overlooking something, but none of the similar questions I found online, in the mailing list archives, or in the FAQ really resolved my issue.
The closest I have found is this: apply strsplit rowwise
I have a data frame (df) with two character columns and one numeric column, filled like this:
df <- data.frame(name1 = c("A", "B", "C", "D"),
                 name2 = c("B", "A", "D", "C"),
                 nums = c(1, 1, 4, 4),
                 stringsAsFactors = FALSE)
Now I would like to find the unique rows in this data frame, based only on the two name columns. The order of the two names within a row has no significance (A/B counts the same as B/A), so, if I understand it correctly, I cannot use duplicated directly.
So I thought about combining the two name columns row-wise, sorting the names within each row, and pasting the resulting length-2 vector into a single string, using something like sapply.
However, I did not get it to work.
So far I have used a for loop, but this takes ages on the original data.
for (i in seq_len(nrow(df))) {
  # sort the two names so their order within the row does not matter
  mysort <- sort(c(df$name1[i], df$name2[i]))
  df$combname[i] <- paste(mysort[1], mysort[2])
}
Any suggestions are welcome. Maybe I just misunderstand how unique and sapply work.
Recommended answer
A solution without a for loop:
df$combname <- apply(df[1:2], 1, function(x) paste(sort(x), collapse=""))
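Once each row has an order-independent key, the duplicate rows can be dropped with duplicated. The following is only a sketch under a few assumptions, not part of the original answer: it builds the key with the vectorized pmin and pmax instead of apply, which may be faster on large data, and it uses "|" as a separator (a choice made for illustration) so that keys built from longer names cannot run together. The variable unique_rows is likewise a name introduced here.

```r
# Example data copied from the question.
df <- data.frame(name1 = c("A", "B", "C", "D"),
                 name2 = c("B", "A", "D", "C"),
                 nums = c(1, 1, 4, 4),
                 stringsAsFactors = FALSE)

# Order-independent key: pmin()/pmax() compare the two columns element-wise,
# so the name that sorts first always ends up on the left.
# The "|" separator is an assumption; choose one that cannot occur in the names.
df$combname <- paste(pmin(df$name1, df$name2),
                     pmax(df$name1, df$name2),
                     sep = "|")

# Keep only the first row seen for each unordered pair of names.
unique_rows <- df[!duplicated(df$combname), ]
print(unique_rows)
```

Note that string comparison in pmin, pmax, and sort follows the current locale's collation, so the key is consistent within one session but the left/right order of mixed-case or accented names may differ between locales.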