str_extract: Extracting exactly the nth word from a string

Views: 66 Published: 2021/7/6 20:09:33 Tags: r regex string stringr


Problem description

I know this question has been asked in several places, but I didn't see a precise answer to it.

I am trying to extract exactly the 2nd word after a given string ("trying to") in R with the help of regex. I do not want to use unlist(strsplit).

library(stringr)

sen <- "I am trying to substring here something, but I am not able to"

str_extract(sen, "trying to\\W*\\s+((?:\\S+\\s*){2})")

Ideally I want to get "here" as the output, but I am getting "trying to substring here".

Recommended answer

You can capture the word you need with str_match:

str_match(sen, "trying to\\W+\\S+\\W+(\\S+)")[,2]

or

str_match(sen, "trying to\\s+\\S+\\s+(\\S+)")[,2]

Here, \S+ matches one or more non-whitespace characters, \W+ matches one or more non-word characters, and \s+ matches one or more whitespace characters.
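To make the three classes concrete, here is a small sketch that lists everything each one matches in a short string. The input `x` is a made-up example, not taken from the question. Note that \S+ keeps trailing punctuation attached to a word, because a comma is not whitespace.

```r
library(stringr)

# Hypothetical input, chosen so that one word carries trailing punctuation.
x <- "substring here, now"

# Runs of non-whitespace characters: punctuation stays attached to the word.
non_space <- str_extract_all(x, "\\S+")[[1]]  # "substring" "here," "now"

# Runs of non-word characters: whitespace and punctuation together.
non_word <- str_extract_all(x, "\\W+")[[1]]   # " " ", "

# Runs of whitespace only.
spaces <- str_extract_all(x, "\\s+")[[1]]     # " " " "
```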

Note that if your "words" are separated by more than just whitespace (punctuation, for example), use \W+. Otherwise, if there is only whitespace, use \s+.
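The difference between the two patterns only shows up when a separator contains something other than whitespace. The sketch below compares them on two inputs; `with_dash` is an invented variant of the question's sentence in which a dash sits between two words.

```r
library(stringr)

# Hypothetical inputs: one separated by spaces only, one with a dash separator.
spaces_only <- "I am trying to substring here something"
with_dash   <- "I am trying to substring - here something"

pat_W <- "trying to\\W+\\S+\\W+(\\S+)"  # separators: any non-word characters
pat_s <- "trying to\\s+\\S+\\s+(\\S+)"  # separators: whitespace only

# With plain spaces, both patterns capture the same word.
w_spaces <- str_match(spaces_only, pat_W)[, 2]  # "here"
s_spaces <- str_match(spaces_only, pat_s)[, 2]  # "here"

# With " - " as a separator, \W+ skips the dash, while \s+ treats it as a word.
w_dash <- str_match(with_dash, pat_W)[, 2]      # "here"
s_dash <- str_match(with_dash, pat_s)[, 2]      # "-"
```

Because the captured group is \S+, any punctuation glued directly to the target word (as in "here,") would still be part of the result under either pattern.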

str_match returns a character matrix whose first column holds the whole match, so [,2] accesses the first captured value (the text matched by the part of the pattern inside the first unescaped pair of parentheses).
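The following sketch shows the two matrix columns side by side and then generalizes the whitespace pattern to the nth word. The helper `nth_word_after` is hypothetical (it is not part of stringr), and it assumes that `prefix` contains no regex metacharacters and that words are separated by whitespace only.

```r
library(stringr)

sen <- "I am trying to substring here something, but I am not able to"

# Column 1 is the whole match, column 2 is the first capture group.
m <- str_match(sen, "trying to\\s+\\S+\\s+(\\S+)")
whole <- m[, 1]  # "trying to substring here"
word  <- m[, 2]  # "here"

# Hypothetical helper: return the n-th whitespace-separated word that follows
# `prefix`, or NA when there is no such word. Assumes `prefix` is plain text.
nth_word_after <- function(text, prefix, n) {
  n <- as.integer(n)
  stopifnot(n >= 1L)
  # Skip n - 1 words with a non-capturing group, then capture the n-th one.
  pattern <- paste0(prefix, "(?:\\s+\\S+){", n - 1L, "}\\s+(\\S+)")
  str_match(text, pattern)[, 2]
}

first   <- nth_word_after(sen, "trying to", 1)  # "substring"
second  <- nth_word_after(sen, "trying to", 2)  # "here"
missing <- nth_word_after(sen, "able to", 1)    # NA: nothing follows "able to"
```

Building the quantifier with paste0 keeps the pattern identical to the answer's when n is 2. A prefix that may contain special characters would need escaping first, which this sketch leaves out.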
