Filling a column in a dataframe based on a column in another dataframe in R
Question
I have a dataframe of comments (df1) that looks like this:
Comments
Apple laptops are really good for work,we should buy them
Apple Iphones are too costly,we can resort to some other brands
Google search is the best search engine
Android phones are great these days
I lost my visa card today
I have another dataframe of merchant names (df2) that looks like this:
Merchant_Name
Google
Android
Geoni
Visa
Apple
MC
WallMart
If a Merchant_Name in df2 appears in a comment in df1, I want to put that merchant name in a second column of df1. The match need not be exact; an approximate match is what is required. Also, df1 contains around 500K rows! My final output df might look like this:
Comments                                                          Merchant
Apple laptops are really good for work,we should buy them         Apple
Apple Iphones are too costly,we can resort to some other brands   Apple
Google search is the best search engine                           Google
Android phones are great these days                               Android
I lost my visa card today                                         Visa
How can I do this efficiently in R? Thanks.
Recommended answer
This is a job for regex. Check out the grepl command inside the lapply call below: for each brand, it does a case-insensitive search over all comments and keeps the ones that contain the brand.
comments = c(
  'Apple laptops are really good for work,we should buy them',
  'Apple Iphones are too costly,we can resort to some other brands',
  'Google search is the best search engine ',
  'Android phones are great these days',
  'I lost my visa card today'
)
brands = c(
  'Google',
  'Android',
  'Geoni',
  'Visa',
  'Apple',
  'MC',
  'WallMart'
)
# For each brand, collect the comments that mention it (case-insensitive).
brandinpattern = lapply(
  brands,
  function(brand) {
    commentswithbrand = grepl(x = tolower(comments), pattern = tolower(brand))
    if (sum(commentswithbrand) > 0) {
      data.frame(
        comment = comments[commentswithbrand],
        brand = brand
      )
    } else {
      # No comment mentions this brand: contribute no rows.
      data.frame()
    }
  }
)
# Stack the per-brand data frames into a single data frame.
brandinpattern = do.call(rbind, brandinpattern)
> brandinpattern
                                                          comment   brand
1                        Google search is the best search engine   Google
2                             Android phones are great these days Android
3                                       I lost my visa card today    Visa
4       Apple laptops are really good for work,we should buy them   Apple
5 Apple Iphones are too costly,we can resort to some other brands   Apple
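The result above has one row per comment/brand pair and is ordered by brand, whereas the question asks for a Merchant column on df1 itself. A minimal sketch of one way to get that layout is shown here, assuming the column names Comments and Merchant_Name from the question and assuming the merchant names contain no regex metacharacters. It is not part of the original answer: it joins all names into a single alternation pattern so that each comment is scanned once, and keeps only the first merchant found in each comment.

```r
# Sample data from the question; df1/df2 and their column names follow the question.
df1 <- data.frame(
  Comments = c(
    "Apple laptops are really good for work,we should buy them",
    "Apple Iphones are too costly,we can resort to some other brands",
    "Google search is the best search engine",
    "Android phones are great these days",
    "I lost my visa card today"
  ),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Merchant_Name = c("Google", "Android", "Geoni", "Visa", "Apple", "MC", "WallMart"),
  stringsAsFactors = FALSE
)

# Build one pattern such as "google|android|...".
# Assumption: names have no regex metacharacters; escape them first otherwise.
merchants_lower <- tolower(df2$Merchant_Name)
pattern <- paste(merchants_lower, collapse = "|")

# regexpr() gives the position of the first match in each comment (-1 if none).
comments_lower <- tolower(df1$Comments)
hits <- regexpr(pattern, comments_lower)
found <- as.integer(hits) != -1L

# regmatches() returns the matched text for the matching comments only;
# match() maps it back to the original spelling in df2.
matched <- regmatches(comments_lower, hits)
df1$Merchant <- NA_character_
df1$Merchant[found] <- df2$Merchant_Name[match(matched, merchants_lower)]

print(df1)
```

Comments that mention no merchant are left as NA. Short names such as MC can match inside longer words, so anchoring the pattern on word boundaries may be worth considering for real data. This sketch does substring matching only; for misspelled names, base R's agrepl() performs approximate matching, but it takes one pattern at a time and so would need a per-merchant loop like the one in the answer.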