Find a word near another using stringr


Problem description

I have a simple problem. Consider this example:

library(dplyr)
library(stringr)
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
dataframe <- tibble(mytext = c('stackoverflow is pretty good my friend',
                               'but sometimes pretty bad as well'))

# A tibble: 2 x 1
                                  mytext
                                   <chr>
1 stackoverflow is pretty good my friend
2       but sometimes pretty bad as well

I want to count the number of times stackoverflow appears near good. I am using the following regex, but it does not work.

dataframe %>%  mutate(mycount = str_count(mytext, 
 regex('stackoverflow(?:\\w+){0,5}good', ignore_case = TRUE)))
# A tibble: 2 x 2
                                  mytext mycount
                                   <chr>   <int>
1 stackoverflow is pretty good my friend       0
2       but sometimes pretty bad as well       0

Can someone tell me what I am missing here?

Thanks!

Recommended answer

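I think I got it. Note that the output below includes two extra test rows (rows 3 and 4) that are not in the two-row data frame defined above.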

dataframe %>%  
mutate(mycount = str_count(mytext, 
                 regex('stackoverflow\\W+(?:\\w+ ){0,5}good', ignore_case = TRUE)))

# A tibble: 4 x 2
                                  mytext mycount
                                   <chr>   <int>
1 stackoverflow is pretty good my friend       1
2       but sometimes pretty bad as well       0
3  stackoverflow good good stackoverflow       1
4                      stackoverflowgood       0

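The key was adding \W+, which matches one or more non-word characters (such as spaces or punctuation) after stackoverflow, and putting a space inside the group (?:\w+ ) so that each intervening word is matched together with the space that follows it. The original group (?:\w+){0,5} can only match word characters, so the match can never get past the space after stackoverflow.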
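To make the difference concrete, here is a minimal sketch that runs the question's pattern and the answer's pattern side by side on the four strings shown in the output above. The helper count_near() is hypothetical and not part of stringr or the original answer. It is one possible variant that uses \W+ inside the group, so that punctuation between words is also accepted, and it assumes both words are plain words with no regex metacharacters. Like the answer's pattern, it only counts cases where the first word comes before the second.

```r
library(stringr)

# The four strings that appear in the answer's output.
texts <- c("stackoverflow is pretty good my friend",
           "but sometimes pretty bad as well",
           "stackoverflow good good stackoverflow",
           "stackoverflowgood")

# Pattern from the question: \w+ cannot match the spaces between words.
original <- regex("stackoverflow(?:\\w+){0,5}good", ignore_case = TRUE)

# Pattern from the answer: \W+ consumes the separator after "stackoverflow",
# and the space inside the group consumes the separator after each word.
fixed <- regex("stackoverflow\\W+(?:\\w+ ){0,5}good", ignore_case = TRUE)

count_original <- str_count(texts, original)
count_fixed <- str_count(texts, fixed)

# Hypothetical helper (an assumption, not from the answer): counts how often
# `first` is followed by `second` with at most `max_gap` words in between.
# Assumes `first` and `second` contain no regex metacharacters.
count_near <- function(text, first, second, max_gap = 5) {
  pattern <- paste0("\\b", first,
                    "\\W+(?:\\w+\\W+){0,", max_gap, "}",
                    second, "\\b")
  str_count(text, regex(pattern, ignore_case = TRUE))
}

print(count_fixed)
print(count_near(texts, "stackoverflow", "good"))
print(count_near("Stackoverflow is, frankly, good", "stackoverflow", "good"))
```

Lowering max_gap tightens what counts as "near". To count matches in either order, one option is to add count_near(text, second, first) to the result.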
