Splitting string between capital and lowercase character in R?


Question

I have a vector of strings:

v1 <- c("Firstname LastnameFirstname Lastname", 
"Firstname Lastname", 
"Firstname Lastname", 
"Firstname LastnameFirstname Lastname")

I'd like to split each string at the point where a lowercase letter is followed by a capital letter, retaining both letters.

The desired output is:

[1] "Firstname Lastname" "Firstname Lastname"   "Firstname Lastname"  "Firstname Lastname"  "Firstname Lastname" "Firstname Lastname"

Following examples on StackExchange, I've tried the strsplit function together with gsub:

unlist(strsplit( gsub("([a-z][A-Z])","\\1~",v1), "~" ))

But this does not split between the two characters. It splits after the regex match, so the capital letter stays attached to the preceding string:

[1] "Firstname LastnameF" "irstname Lastname"   "Firstname Lastname"  "Firstname Lastname"  "Firstname LastnameF" "irstname Lastname"  

How do I split between the characters while still retaining both of them?

Recommended answer

We can use regex lookarounds to split at the zero-width position that is preceded by a lowercase letter (positive lookbehind, (?<=[a-z])) and followed by an uppercase letter (positive lookahead, (?=[A-Z])). Lookarounds consume no characters, so both letters are retained. perl = TRUE is required because R's default regex engine does not support lookarounds.

unlist(strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl = TRUE))
#[1] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" 
#[4] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
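
As a possible alternative, the gsub attempt from the question can be repaired by capturing the two letters in separate groups and placing the marker between them. This is only a sketch: the names marked, res_gsub and res_look are illustrative, and it assumes the marker character "~" never occurs in the data.

```r
v1 <- c("Firstname LastnameFirstname Lastname",
        "Firstname Lastname",
        "Firstname Lastname",
        "Firstname LastnameFirstname Lastname")

# Capture the lowercase and the uppercase letter separately and write the
# marker between them, so neither letter ends up on the wrong side.
# Assumption: "~" does not appear anywhere in v1.
marked <- gsub("([a-z])([A-Z])", "\\1~\\2", v1)

# Split on the literal marker; fixed = TRUE avoids regex interpretation.
res_gsub <- unlist(strsplit(marked, "~", fixed = TRUE))

# Lookaround version from the answer, for comparison.
res_look <- unlist(strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl = TRUE))

# Check that both approaches agree on this input.
identical(res_gsub, res_look)
```

The lookaround version is usually preferable because it needs no marker character and therefore cannot collide with the contents of the strings.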
