R: pmatch for a more difficult task


Question

Thanks @nullglob,

I tried running it again, but my output is different. Would you mind telling me whether I have misused your code? Sorry, I may have misunderstood how it works. I would appreciate any further advice.

df1 <- data.frame(
  A=c("x01","x02","y03","z02","x04", "x33", "z03"),
  B=c("A01BB01","A02BB02","C02AA05","B04CC10","C01GX02", "yyy", "zzz"))

df2 <- data.frame(
  X=c("a","b","c","d","e", "f"),
  Y=c("A01BB","A02","C02A","B04","C01GX", "xxx"))

with(c(df1,df2),{
  i <- pmatch(Y,B)
  iunmatched <- which(is.na(i))
  nunmatched <- length(iunmatched)
  nexcess <- length(B) - length(X)
  data.frame(A = c(A,rep(NA,nunmatched)),
             B = c(B,rep(NA,nunmatched)),
             X = c(X[i],rep(NA,nexcess),X[iunmatched]),
             Y = c(Y[i],rep(NA,nexcess),Y[iunmatched]))
})

       A  B  X  Y
    1  1  1  1  1
    2  2  2  2  2
    3  5  5  3  5
    4  6  3  4  3
    5  3  4  5  4
    6  4  6 NA NA
    7  7  7 NA NA
    8 NA NA  6  6
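
A plausible explanation for the integer output above, assuming an R version older than 4.0.0 (where data.frame() converted strings to factors by default): the columns are factors, and in those versions combining a factor with NA through c() returns the underlying integer level codes instead of the labels. The sketch below only illustrates that assumption; passing stringsAsFactors explicitly makes it behave the same way on any R version.

```r
# Reproduce the factor storage explicitly (the default before R 4.0.0).
df1_fct <- data.frame(
  A = c("x01", "x02", "y03", "z02", "x04", "x33", "z03"),
  stringsAsFactors = TRUE)

# Integer level codes behind the factor; levels are sorted alphabetically,
# so these are the numbers seen in column A of the output above.
codes <- as.integer(df1_fct$A)

# Keeping the columns as character vectors avoids the problem:
df1_chr <- data.frame(
  A = c("x01", "x02", "y03", "z02", "x04", "x33", "z03"),
  stringsAsFactors = FALSE)

# Padding a character vector with NA keeps the labels.
padded <- c(df1_chr$A, rep(NA, 1))
```

If this is the cause, creating both df1 and df2 with stringsAsFactors = FALSE (or converting the columns with as.character()) should yield the labelled output.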

======================ORIGINAL Question=====

Thanks for the answers to my previous question. (http://stackoverflow.com/q/6592214/602276)

Building on that answer, I want to use pmatch for a more difficult task.

df1 <- data.frame(
  A=c("x01","x02","y03","z02","x04", "x33", "z03"),
  B=c("A01BB01","A02BB02","C02AA05","B04CC10","C01GX02", "yyy", "zzz")
)

    A       B
1 x01 A01BB01
2 x02 A02BB02
3 y03 C02AA05
4 z02 B04CC10
5 x04 C01GX02
6 x33     yyy
7 z03     zzz

My df2 is modified as follows:

df2 <- data.frame(
  X=c("a","b","c","d","e", "f"),
  Y=c("A01BB","A02","C02A","B04","C01GX", "xxx")
)

  X     Y
1 a A01BB
2 b   A02
3 c  C02A
4 d   B04
5 e C01GX
6 f   xxx

The difficulty is that df1 and df2 have different numbers of rows, so I cannot simply cbind them at the outset.

Moreover, there are some mismatches between df1 and df2, and the corresponding rows should contain NA accordingly.

The expected output is as follows:

    A       B    X     Y
1 x01 A01BB01    a A01BB
2 x02 A02BB02    b   A02
3 y03 C02AA05    c  C02A
4 z02 B04CC10    d   B04
5 x04 C01GX02    e C01GX
6 x33     yyy   NA    NA
7 z03     zzz   NA    NA
8  NA      NA    f   xxx

Would you mind showing me how to do this in R? Thanks a lot.
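
For reference, the partial matching this question relies on can be sketched on the two key columns alone. pmatch(x, table) returns, for each element of x, the position of the single element of table that it matches exactly or abbreviates uniquely, and NA when there is no match or the match is ambiguous. The vector names B and Y below simply mirror the columns of df1 and df2.

```r
# Key columns from df1 (B) and df2 (Y), as plain character vectors.
B <- c("A01BB01", "A02BB02", "C02AA05", "B04CC10", "C01GX02", "yyy", "zzz")
Y <- c("A01BB", "A02", "C02A", "B04", "C01GX", "xxx")

# i[k] is the position in B that Y[k] partially matches, or NA if none.
i <- pmatch(Y, B)

# The matched B values, aligned with Y; "xxx" has no partner, so it maps to NA.
matched_B <- B[i]
```

Here the first five entries of Y each abbreviate exactly one entry of B, while "xxx" matches nothing, which is why the expected output needs an extra row for it.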

Recommended answer

This is not exactly an elegant solution, but it seems to do the trick:

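# Assumes the columns of df1 and df2 are character vectors, not factors.
with(c(df1,df2),{
  # For each Y, the position of the B it partially matches (NA if none)
  i <- pmatch(Y,B)
  # Rows of df2 that found no partner in df1
  iunmatched <- which(is.na(i))
  nunmatched <- length(iunmatched)
  # How many more rows df1 has than df2
  nexcess <- length(B) - length(X)
  # Pad df1's columns with NA, then append the unmatched df2 rows at the end
  data.frame(A = c(A,rep(NA,nunmatched)),
             B = c(B,rep(NA,nunmatched)),
             X = c(X[i],rep(NA,nexcess),X[iunmatched]),
             Y = c(Y[i],rep(NA,nexcess),Y[iunmatched]))
})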

The output should be:

     A       B    X     Y
1  x01 A01BB01    a A01BB
2  x02 A02BB02    b   A02
3  y03 C02AA05    c  C02A
4  z02 B04CC10    d   B04
5  x04 C01GX02    e C01GX
6  x33     yyy <NA>  <NA>
7  z03     zzz <NA>  <NA>
8 <NA>    <NA>    f   xxx
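
The code in this answer indexes X and Y by positions in B, which lines up here because the matched rows appear in the same order in both data frames. A variant that does not depend on row order might look like the sketch below. It is not part of the original answer; the helper names j, extra, rows1 and rows2 are made up for illustration, and stringsAsFactors = FALSE is set explicitly so the columns stay character on any R version.

```r
df1 <- data.frame(
  A = c("x01", "x02", "y03", "z02", "x04", "x33", "z03"),
  B = c("A01BB01", "A02BB02", "C02AA05", "B04CC10", "C01GX02", "yyy", "zzz"),
  stringsAsFactors = FALSE)

df2 <- data.frame(
  X = c("a", "b", "c", "d", "e", "f"),
  Y = c("A01BB", "A02", "C02A", "B04", "C01GX", "xxx"),
  stringsAsFactors = FALSE)

# For each df2 row: the df1 row whose B it partially matches (NA if none).
i <- pmatch(df2$Y, df1$B)

# Invert the mapping: for each df1 row, the df2 row that matched it (NA if none).
j <- match(seq_len(nrow(df1)), i)

# df2 rows without any partner in df1; they go at the bottom.
extra <- which(is.na(i))

# Row indices into df1 and df2 for the combined result; NA yields an NA cell.
rows1 <- c(seq_len(nrow(df1)), rep(NA_integer_, length(extra)))
rows2 <- c(j, extra)

result <- data.frame(
  A = df1$A[rows1],
  B = df1$B[rows1],
  X = df2$X[rows2],
  Y = df2$Y[rows2],
  stringsAsFactors = FALSE)

print(result)
```

Because every column is built by indexing with a row vector that may contain NA, unmatched rows on either side are filled with NA automatically and no separate padding arithmetic is needed.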
