apply-strsplit-rowwise including sort and nested paste


Question

I guess I just don't see it, but none of the similar things I found on the Net, in the mailing list archives, or in the FAQ really cleared up my issue.

The closest I have found was this: apply strsplit rowwise

I have a df with two character columns and one numeric column, filled like this:

df <- data.frame(name1 = c("A", "B", "C", "D"),
                 name2 = c("B", "A", "D", "C"),
                 nums = c(1, 1, 4, 4),
                 stringsAsFactors = FALSE)

Now I would like to find the unique rows in this, but based only on the two name columns. For those columns the order has no significance (a row with A, B should count as the same as a row with B, A), so I cannot use duplicated directly, if I understood it correctly.
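As a quick check of that claim, here is a minimal sketch using the sample data above. The variable name dup_flags is only illustrative and is not part of the original question.

```r
# Sample data from the question
df <- data.frame(name1 = c("A", "B", "C", "D"),
                 name2 = c("B", "A", "D", "C"),
                 nums = c(1, 1, 4, 4),
                 stringsAsFactors = FALSE)

# duplicated() compares rows column by column, so ("A", "B") and ("B", "A")
# are treated as different rows and nothing is flagged.
dup_flags <- duplicated(df[c("name1", "name2")])
print(dup_flags)
```

No row is flagged here, which is why an order-independent key is needed first.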

So I thought about combining the two name columns row-wise, sorting each row, and then pasting the resulting length-2 vector into a single string (with something like sapply).

However, I could not get it to work.

So far I have used a for loop, but this takes ages on the original data.

for (i in seq_len(nrow(df))) {
    # sort the two names of row i so that their order no longer matters
    mysort <- sort(c(df$name1[i], df$name2[i]))
    df$combname[i] <- paste(mysort[1], mysort[2])
}

Any suggestions are welcome. Maybe I just misunderstand how unique and sapply work.

Recommended answer

A solution without a for loop:

df$combname <- apply(df[1:2], 1, function(x) paste(sort(x), collapse=""))
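To get from the combined key to the unique rows the question asks for, one possible follow-up is sketched below. It assumes the first row of each unordered pair should be kept. The names key and unique_rows are illustrative, and the pmin()/pmax() variant is an alternative technique, not part of the answer above; it avoids the per-row function call, which may help on larger data.

```r
# Sample data from the question
df <- data.frame(name1 = c("A", "B", "C", "D"),
                 name2 = c("B", "A", "D", "C"),
                 nums = c(1, 1, 4, 4),
                 stringsAsFactors = FALSE)

# Key from the answer: sort each row's two names and paste them together
df$combname <- apply(df[1:2], 1, function(x) paste(sort(x), collapse = ""))

# Alternative (assumption, not from the answer): vectorised key, where
# pmin()/pmax() pick the smaller and larger name of each row
key <- paste(pmin(df$name1, df$name2), pmax(df$name1, df$name2))

# Keep only the first row of each unordered name pair
unique_rows <- df[!duplicated(key), ]
print(unique_rows)
```

Filtering with !duplicated(df$combname) instead of !duplicated(key) should select the same rows, since both keys depend only on the unordered pair of names.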
