How to fuzzy join 2 dataframes on 2 variables with differing "fuzzy logic"?


Problem description

# example
a <- data.frame(name=c("A","B","C"), KW=c(201902,201904,201905),price=c(1.99,3.02,5.00))
b <- data.frame(KW=c(201903,201904,201904),price=c(1.98,3.00,5.00),name=c("a","b","c"))

I want to match a and b with fuzzy logic, using the variables KW and price. I want to allow a tolerance of +/- 1 for KW and a tolerance of +/- 0.02 for price.

The desired outcome should look like this:

  name.x   KW.x price.x   KW.y price.y name.y
1      A 201902    1.99 201903    1.98      a
2      B 201904    3.02 201904    3.00      b
3      C 201905    5.00 201904    5.00      c

I would prefer a solution using the fuzzyjoin package. So far I have tried the fuzzy_inner_join function, specifying my desired tolerances for KW and price through the match_fun argument, but I couldn't get it to work.

I am looking for help on how to solve this problem.

Recommended answer

You can create a Cartesian product of the two data frames with merge (using by = NULL) and then subset the rows that satisfy both required conditions.

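# The 1e-9 slack keeps boundary cases (e.g. 3.02 vs 3.00) from being lost to floating-point rounding
subset(merge(a, b, by = NULL), abs(KW.x - KW.y) <= 1 &
                               abs(price.x - price.y) <= 0.02 + 1e-9)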

#  name.x   KW.x price.x   KW.y price.y name.y
#1      A 201902    1.99 201903    1.98      a
#5      B 201904    3.02 201904    3.00      b
#9      C 201905    5.00 201904    5.00      c
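
Since the question asks specifically for fuzzyjoin, the same two conditions can probably also be expressed with fuzzy_inner_join by passing one predicate per join column as a list to match_fun. The following is only a sketch under some assumptions: the helper names kw_close and price_close are made up for illustration, and the small 1e-9 slack on the price tolerance is an added assumption so that a difference sitting exactly on the 0.02 boundary is not dropped by floating-point rounding.

```r
library(fuzzyjoin)

# Example data from the question
a <- data.frame(name = c("A", "B", "C"),
                KW = c(201902, 201904, 201905),
                price = c(1.99, 3.02, 5.00))
b <- data.frame(KW = c(201903, 201904, 201904),
                price = c(1.98, 3.00, 5.00),
                name = c("a", "b", "c"))

# Hypothetical helpers: one vectorized predicate per join column.
# KW values may differ by at most 1.
kw_close <- function(x, y) abs(x - y) <= 1

# Prices may differ by at most 0.02; the 1e-9 slack is an assumption
# added to absorb floating-point rounding at the boundary.
price_close <- function(x, y) abs(x - y) <= 0.02 + 1e-9

# The i-th function in match_fun is applied to the i-th column in `by`.
result <- fuzzy_inner_join(
  a, b,
  by = c("KW", "price"),
  match_fun = list(kw_close, price_close)
)

result
```

Because name, KW and price exist in both data frames, the joined columns should come back with .x and .y suffixes, as in the desired outcome above. Unlike the Cartesian-product approach, this keeps each tolerance in its own small function, which may be easier to adjust if the two rules diverge further.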
