Splitting string between capital and lowercase character in R?
Question
I have a vector of strings:
v1 <- c("Firstname LastnameFirstname Lastname",
        "Firstname Lastname",
        "Firstname Lastname",
        "Firstname LastnameFirstname Lastname")
I'd like to split each string at the point where a lowercase letter is immediately followed by a capital letter, keeping both letters.
The desired output is:
[1] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
Following examples on StackExchange, I tried the strsplit function together with gsub:
unlist(strsplit( gsub("([a-z][A-Z])","\\1~",v1), "~" ))
But this does not split between the two characters. The marker is inserted after the whole regex match, so the capital letter ends up on the wrong side of the split:
[1] "Firstname LastnameF" "irstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname LastnameF" "irstname Lastname"
How do I split between the two characters while still retaining both of them?
Recommended answer
We can use regex lookarounds to match the zero-width position that is preceded by a lowercase letter (positive lookbehind, (?<=[a-z])) and followed by an uppercase letter (positive lookahead, (?=[A-Z])). Lookarounds need perl = TRUE in strsplit:
unlist(strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl = TRUE))
#[1] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
#[4] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
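For comparison, the original gsub attempt can also be repaired rather than replaced. The following is a minimal sketch, not part of the answer above: it assumes the marker character "~" never occurs in the data, and the variable names marked, res_gsub and res_look are only illustrative. The idea is to capture the lowercase and the uppercase letter in two separate groups so the marker lands between them instead of after both.

```r
v1 <- c("Firstname LastnameFirstname Lastname",
        "Firstname Lastname",
        "Firstname Lastname",
        "Firstname LastnameFirstname Lastname")

# Two capture groups: \\1 is the lowercase letter, \\2 the capital letter.
# The marker "~" is written between them (assumption: "~" is not in the data).
marked <- gsub("([a-z])([A-Z])", "\\1~\\2", v1)

# Split on the literal marker; fixed = TRUE avoids regex interpretation.
res_gsub <- unlist(strsplit(marked, "~", fixed = TRUE))

# The lookaround version from the answer, for comparison.
res_look <- unlist(strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl = TRUE))

# Both approaches give the same six elements.
print(identical(res_gsub, res_look))
```

The lookaround version is usually the more robust choice because it needs no placeholder character. The gsub variant may still be handy when the same marked string is reused for other processing.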