How to extract everything after a specific string?
Question
I'd like to extract everything after "-" in a vector of strings in R.
For example:
test = c("Pierre-Pomme","Jean-Poire","Michel-Fraise")
I would like to get
c("Pomme","Poire","Fraise")
Thanks!
Recommended answer
With str_extract. \\b is a zero-length token that matches a word boundary, that is, a position between a word character and a non-word character such as -. The pattern \\b\\w+$ therefore matches the last run of word characters in each string:
library(stringr)
str_extract(test, '\\b\\w+$')
# [1] "Pomme" "Poire" "Fraise"
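One caveat worth illustrating: \\b\\w+$ returns only the last word, which equals "everything after -" only when that part is a single word. The sketch below uses a made-up input containing a space (not from the question) and compares it with a lookbehind pattern, (?<=-).*, which is an alternative not used in the original answer.

```r
library(stringr)

# Hypothetical input: the part after "-" contains a space
x <- c("Pierre-Pomme", "Jean-Poire rouge")

# Last run of word characters only
last_word <- str_extract(x, '\\b\\w+$')

# Everything that follows the first "-" (lookbehind, assumed alternative)
after_dash <- str_extract(x, '(?<=-).*')

print(last_word)   # "Pomme" "rouge"
print(after_dash)  # "Pomme" "Poire rouge"
```

For the vector in the question both patterns give the same result, since each suffix is a single word.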
We can also use a back reference with sub. \\1 refers to the string matched by the first capture group (.+), which is one or more of any character following a - up to the end of the string. Because the leading .+ is greedy, a string with several - keeps only what follows the last one:
sub('.+-(.+)', '\\1', test)
# [1] "Pomme" "Poire" "Fraise"
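To make the greedy behaviour concrete, here is a minimal base R sketch with a hypothetical input that contains two hyphens (not from the question). The second pattern, ^[^-]*-, is an assumed alternative for keeping everything after the first hyphen instead of the last.

```r
# Hypothetical input with more than one "-"
x <- c("Pierre-Pomme", "Jean-Paul-Poire")

# Greedy '.+' consumes up to the last "-", so only the final piece is kept
after_last <- sub('.+-(.+)', '\\1', x)

# '[^-]*' cannot cross a "-", so only the prefix up to the first "-" is removed
after_first <- sub('^[^-]*-', '', x)

print(after_last)   # "Pomme" "Poire"
print(after_first)  # "Pomme" "Paul-Poire"
```

Which of the two is wanted depends on whether names such as "Jean-Paul" can appear before the separator.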
This also works with str_replace if stringr is already loaded:
library(stringr)
str_replace(test, '.+-(.+)', '\\1')
# [1] "Pomme" "Poire" "Fraise"
A third option is to use strsplit and extract the second piece from each element of the resulting list (similar to word from @akrun's answer):
sapply(strsplit(test, '-'), `[`, 2)
# [1] "Pomme" "Poire" "Fraise"
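A small sketch of how this approach behaves when an element has no hyphen at all. The input is hypothetical, and fixed = TRUE is an optional addition that treats the separator as a literal string rather than a regular expression.

```r
# Hypothetical input: the second element has no "-"
x <- c("Pierre-Pomme", "Michel")

# Split on a literal "-" and take the second piece of each element
pieces <- strsplit(x, '-', fixed = TRUE)
second <- sapply(pieces, `[`, 2)

print(second)  # "Pomme" NA
```

Indexing past the end of a character vector yields NA, so elements without the separator come back as NA rather than raising an error.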
stringr also has a str_split variant of this:
str_split(test, '-', simplify = TRUE)[,2]
# [1] "Pomme" "Poire" "Fraise"
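Taking column 2 returns only the second piece, so anything after a second hyphen is dropped. As a sketch with a hypothetical input, the n argument of str_split can cap the number of pieces at two so that the remainder stays together:

```r
library(stringr)

# Hypothetical input with more than one "-"
x <- c("Pierre-Pomme", "Jean-Paul-Poire")

# n = 2 splits at the first "-" only; the rest is kept as one piece
rest <- str_split(x, '-', n = 2, simplify = TRUE)[, 2]

print(rest)  # "Pomme" "Paul-Poire"
```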